I've got some data (basic personal details) which I need to export from an old legacy system to a newer system (MySQL DB). It's approximately 200MB split across 6 files.
The system I'm exporting from can export all the required data as an Excel file. However, its method of creating an Excel file is to create an HTML table and save it with a .xls extension. When viewed in a text editor, the files look like this (except 600,000 lines long):
<table>
<tr>
<td class="tableh">
<b>Birth Date</b>
</td>
<td class="tableh">
<b>First Name</b>
</td>
Much to my surprise Excel can actually open such a document. Unfortunately it takes about half an hour to do so (per file), and almost as long to save it into a proper format.
Does anyone have any suggestions for converting these files faster than the 45-60 minutes it takes Excel?
open it in a browser and copy/paste from there to excel maybe? – mcalex – 2013-07-10T07:58:51.307
@mcalex Just gave it a shot, Chrome crashed when trying to copy it all. – Dracs – 2013-07-10T08:16:11.830
Could you specify "some data" and "one system" and "another"? If the data is in an HTML file, you could use the import tools to show it in Excel and then save as CSV. You would not normally muck around with HTML source in Excel. – teylyn – 2013-07-10T09:09:42.943
@teylyn I've clarified my question a bit. I just tried using Excel's Data import tools. It took ~20 minutes for one file. It's a small improvement, but it did seem more stable doing it. – Dracs – 2013-07-10T10:35:55.827
I would try changing the extension to .html, open it in Excel, then Save As a csv file. If this didn't behave I would use a good editor (Notepad++ or Sublime Text) to clean up (remove stuff!) in the HTML file before opening it in Excel again. – Andy G – 2013-07-10T18:41:29.350
@AndyG Sorry should have specified, I renamed the files to .html when doing the Excel data import. I just had a go at using some regex with Sublime to convert it to CSV. Had to force close it though after about half an hour of it processing. – Dracs – 2013-07-10T23:15:01.780
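Since both Excel and a regex pass in an editor struggle with files this size, one alternative is to stream the HTML through a small script instead of loading it all at once. The following is only a minimal sketch, assuming the files contain a single flat table of `<tr>`/`<td>` elements like the excerpt above, with no nested tables. The names `TableToCsv` and `html_table_to_csv` are made up for illustration; only Python's standard `html.parser` and `csv` modules are used.

```python
import csv
from html.parser import HTMLParser


class TableToCsv(HTMLParser):
    """Write one CSV row per <tr> and one field per <td>/<th>."""

    def __init__(self, writer):
        super().__init__(convert_charrefs=True)
        self.writer = writer
        self.row = None   # cells of the row being read, None outside a row
        self.cell = None  # text pieces of the cell being read, None outside a cell

    def flush_cell(self):
        if self.cell is not None:
            # Join the pieces and collapse runs of whitespace/newlines.
            self.row.append(" ".join("".join(self.cell).split()))
            self.cell = None

    def flush_row(self):
        self.flush_cell()
        if self.row is not None:
            self.writer.writerow(self.row)
            self.row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.flush_row()  # tolerate a missing </tr>
            self.row = []
        elif tag in ("td", "th"):
            if self.row is None:  # tolerate a cell without <tr>
                self.row = []
            self.flush_cell()
            self.cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.flush_cell()
        elif tag in ("tr", "table"):
            self.flush_row()

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)


def html_table_to_csv(src, dst, chunk_size=1 << 20):
    """Read HTML text from src in chunks and write CSV rows to dst."""
    parser = TableToCsv(csv.writer(dst))
    for chunk in iter(lambda: src.read(chunk_size), ""):
        parser.feed(chunk)
    parser.close()
    parser.flush_row()  # emit a trailing row if the file was cut off
```

A possible way to call it (the file names and the encoding are assumptions, adjust them to the actual export):

```python
with open("export1.xls", encoding="utf-8", errors="replace") as src, \
        open("export1.csv", "w", newline="", encoding="utf-8") as dst:
    html_table_to_csv(src, dst)
```

Because the input is read in fixed-size chunks and each row is written as soon as it is complete, memory use stays small regardless of file size. The resulting CSV could then be loaded into MySQL, for example with `LOAD DATA INFILE`.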