Application to open / edit a very large CSV file (500 MB, 4 million records)?

3

5

Possible Duplicates:
Text Editor for very big file - Windows
What editor/viewer to use to inspect large text based files?

I have a CSV file which has about 4 million rows and is about 500 MB in size. Can you recommend any editor that can open the file without making the system crawl? I tried EmEditor but it is complaining that there are too many characters in a single line.

Giorgi

Posted 2010-05-06T19:42:24.737

Reputation: 601

Question was closed 2010-05-26T17:36:04.810

Both the links to the possible duplicate questions seem to be dead. – chrisfs – 2015-07-21T15:46:25.997

Related Stack Overflow question: http://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files – dag729 – 2010-05-06T21:58:17.037

@dag: It's not exactly a duplicate if it's not on Super User. We can't close questions as duplicates of cross-site posts. – quack quixote – 2010-05-06T22:08:16.503

I see, thanks for the info (and for the edit) :D – dag729 – 2010-05-06T23:49:27.557

Answers

3

I tested all the editors suggested there and the only one that managed to open it (in a reasonable time) was UltraEdit.


Update

It turned out that the file did not contain any line breaks (it used some other character as the record separator), which is why it was so difficult to open. I wrote a program that replaced all of these characters with line breaks, and EmEditor then opened the file in seconds.
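A program like the one described could look roughly like the sketch below. The thread never says which character the file actually used, so the default separator here (the ASCII record separator, `0x1E`) is purely an assumption, as are the function name and the chunk size. Reading in fixed-size chunks keeps memory use flat regardless of file size, and restricting the sketch to a single-byte separator means a separator can never be split across two chunks.

```python
def replace_separator(src_path, dst_path, separator=b"\x1e", chunk_size=1 << 20):
    """Copy src_path to dst_path, turning every `separator` byte into a newline.

    `separator` is a guess: substitute whatever byte the file really uses.
    Returns the number of separators replaced.
    """
    if len(separator) != 1:
        raise ValueError("this sketch only handles single-byte separators")
    replaced = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)  # at most chunk_size bytes in memory
            if not chunk:
                break
            replaced += chunk.count(separator)
            dst.write(chunk.replace(separator, b"\n"))
    return replaced
```

Writing to a separate output file rather than editing in place leaves the original untouched if the guess about the separator turns out to be wrong.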

Giorgi

Posted 2010-05-06T19:42:24.737

Reputation: 601

7

Vim should work pretty well if you change a couple settings to turn off swap space and undo functionality: http://vim.wikia.com/wiki/VimTip611

Herbert Sitz

Posted 2010-05-06T19:42:24.737

Reputation: 336

3

I've used JujuEdit (free) to work with 2GB CSVs in the past. Handles 'em pretty well. What are you doing with the contents? Another option might be to make an MS Access linked table pointed to the file if you need to do some querying/sorting/filtering.

Chris_K

Posted 2010-05-06T19:42:24.737

Reputation: 7 943

3

You could give CSVed a try. It claims to load parts of the file on demand, so it might work for you.

afrazier

Posted 2010-05-06T19:42:24.737

Reputation: 21 316

3

Define "edit". How do you want to edit the file? Are you just viewing it, or do you want to make changes? Are they changes that you'd be making programmatically? If so, you may want to look at using awk, sed or Perl to do it.
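If the changes can be made programmatically, the same streaming idea behind awk, sed or Perl can be sketched in Python. This is only an illustration: the function name `transform_csv`, the `transform` callback and the UTF-8 encoding are assumptions, not anything the question specifies. Rows are read and written one at a time, so the whole file is never held in memory.

```python
import csv

def transform_csv(src_path, dst_path, transform, encoding="utf-8"):
    """Stream rows from src_path to dst_path through `transform`.

    `transform` receives a row (list of strings) and returns the row to
    write, or None to drop it. Returns the number of rows written.
    """
    written = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):  # one row at a time
            new_row = transform(row)
            if new_row is not None:
                writer.writerow(new_row)
                written += 1
    return written
```

For example, `transform_csv("in.csv", "out.csv", lambda row: row if row[0] != "2" else None)` would copy the file while dropping every row whose first field is `2`.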

As to "it is complaining that there are too many characters in a single line," it's probably that it's got line endings it's not able to handle. It would help if you'd tell us what platform you're on.
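One way to check the line-ending theory without opening the file in an editor is to count the terminators directly. The sketch below is a rough diagnostic under my own assumptions (the function name and chunk size are illustrative); it reads the file in binary chunks and reports how many CRLF, lone LF and lone CR sequences it finds, taking care of a CRLF pair that happens to straddle two chunks.

```python
def count_line_endings(path, chunk_size=1 << 20):
    """Count CRLF, lone LF and lone CR line endings in a file."""
    total_cr = total_lf = pairs = 0
    prev_ended_with_cr = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_cr += chunk.count(b"\r")
            total_lf += chunk.count(b"\n")
            pairs += chunk.count(b"\r\n")
            # A CR at the end of one chunk followed by an LF at the start
            # of the next is still one CRLF pair.
            if prev_ended_with_cr and chunk.startswith(b"\n"):
                pairs += 1
            prev_ended_with_cr = chunk.endswith(b"\r")
    return {"CRLF": pairs, "LF": total_lf - pairs, "CR": total_cr - pairs}
```

If all three counts come back as zero (or close to it) for a multi-million-record file, the records are probably separated by some other character, which would explain an editor treating the file as a single enormous line.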

Andy Lester

Posted 2010-05-06T19:42:24.737

Reputation: 1 121

+1 for recommending good UNIX tools, and for the line ending issue troubleshooting suggestion. – Chris W. Rea – 2010-05-07T00:45:05.197

2

I believe that UltraEdit probably could. I've used it on huge, gigabyte-plus files and it didn't crash, but locating whatever data you are looking for is likely to be a pain. Can you do whatever you need to do programmatically? Since you said CSV, I assume it is text data, so Perl might be a good choice thanks to its string handling and decent regex support.

Blackbeagle

Posted 2010-05-06T19:42:24.737

Reputation: 6 424

Thanks, UltraEdit was the only one that managed to open the file. – Giorgi – 2010-05-08T13:05:47.707

@Giorgi: "the only one" out of which? People might be curious to hear the other editors you tried that didn't work. – Herbert Sitz – 2010-05-08T16:59:08.327

I tried all suggested in this thread. – Giorgi – 2010-05-08T18:47:29.320

@Giorgi: There must be something extra-unusual about your file. On my wimpy netbook gVim (with no swap file and disabled undo) opens a 750MB test file in about 10 seconds, and jumps from the beginning to the end of 10 million line file are instantaneous. – Herbert Sitz – 2010-05-08T21:52:57.213

@Herbert: See my answer below. – Giorgi – 2010-05-09T09:11:58.377

With regex it is difficult to be 100% certain of the result if the file may contain junk data. http://stackoverflow.com/a/3268683 I had several issues with a very large (34GB) file I had to parse. – GibralterTop – 2017-02-23T20:34:52.193

1

Have you tried Excel? It handles CSV files.

I'm not sure of Excel 2007's file size limit, but 64-bit Excel 2010 now supports spreadsheets up to 4GB in size.

Mike Fitzpatrick

Posted 2010-05-06T19:42:24.737

Reputation: 15 062

Excel is terrible at round-tripping CSV files. Getting Excel to save a CSV file back out without mangling it somehow in the process (by Excel making incorrect assumptions about the data formats) is not trivial. For instance, I routinely find fields like employee IDs or SSNs are getting leading zeros removed on saving out from Excel because it assumed they're numeric columns. – Chris W. Rea – 2010-05-07T00:42:30.283

Yes, a fair point. You can mitigate the problem by specifying the column type (eg Text for a column containing IDs/SSNs with leading zeros) but it must be done manually when opening the file, which is a pain. But if other CSV editors choke on the file size then the round-tripping through Excel may still be worth the additional vigilance required. – Mike Fitzpatrick – 2010-05-07T01:08:41.660
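For comparison, a tool that never guesses column types avoids the leading-zero problem altogether. The short sketch below uses Python's standard `csv` module, which treats every field as a string; the function name `round_trip` and the sample values in the usage note are made up for illustration.

```python
import csv
import io

def round_trip(csv_text):
    """Parse CSV text and write it straight back out.

    Fields stay as strings throughout, so values such as IDs with
    leading zeros are not reinterpreted as numbers.
    """
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Passing in `"id,ssn\n00123,001-23-4567\n"` gives back the same text, zeros included. This sketch works on an in-memory string only to keep it short; for a file of the size discussed here the same reader and writer would be applied row by row to open file objects.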

I was hoping the 64-bit version of Excel would be the solution, but sadly it has the same row limit as the 32-bit version: 1,048,576. The only difference between the two is that the 64-bit version can handle files bigger than 2GB. – atraudes – 2012-08-14T14:45:08.397