7

I have a really big csv ~260GB and I want to import it into MySQL. I use the following mysql script in MacOS

DROP TABLE IF EXISTS tmp_catpath_5;

create table tmp_catpath_5( a1 BIGINT(20),a2 BIGINT(20),a3 BIGINT(20),a4 BIGINT(20),a5 BIGINT(20), c1 BIGINT(20),c2 BIGINT(20),c3 BIGINT(20),c4 BIGINT(20),c5 BIGINT(20), INDEX (a5) );

load data local infile '/Volumes/Time Machine Backups 1/tmp_catpath_5.csv' into table tmp_catpath_5 FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' ;

It works for smaller files e.g. 150GB, but for this size the database disappears immediately after the load command

ERROR 1049 (42000): Unknown database

Is there a MySQL size limit? Is it something else? I use the MySQL from MAMP application (LAMP for MacOS), could it be different from the official MySQL application?

Starfish
  • 2,716
  • 24
  • 28
jimkont
  • 173
  • 1
  • 6

2 Answers2

5

I don't know if there are limits for particular DB engines but I've met with some Department of Defense admins who manage a MySQL system which imports 10TB every day and they are not experiencing this kind of problem.

Rather than use macros I recommend writing a script to parse the input file and create insert statements from it, which can be fed straight to MySQL. That way you are not dealing with the entire file in one hit, just one line at a time. I like to use Perl for such things.

You may also like to consider using truncate instead of dropping he table and recreating it each time.

John Gardeniers
  • 27,262
  • 12
  • 53
  • 108
  • This is a one-time script, every table is created once and deleted after processing, the first drop was a left from when i was testing the script with a small database. i 'll try to use the official mysql app first and then the perl script... Thanks again!!! – jimkont Dec 13 '10 at 23:53
1

While this doesn't really answer your question, is the possibility of splitting the file into smaller chunks something you've considered? The unix utility "split" can handle this easily.

When you say "crashes", do you mean the MySQL client or the server? Is any error displayed?

David
  • 1,012
  • 6
  • 9
  • To be more accurate neither, the error is :ERROR 1049 (42000): Unknown database 'xxx'. the database diasapears immediately after the load command – jimkont Dec 13 '10 at 23:04
  • Now that i think of it, splitting is not convinient, because i have a series of 30 operations, that each one depends on the previous. now everything is on one script that will run (for a number of days). This way i will have to break every step to split / load – jimkont Dec 13 '10 at 23:32