4

I just setup a mysql cluster on a fairly decent baby (IBM x3650 M3) with 24GB memory, xeon 6core, SAS 6Gbps HDD. Running Debian Lenny 5. 64bits.

Ndb version is 7.1.9a. Our database size on MyISAM is around 3.2 GB. Ndb_size estimation is 58GB for ndbengine.

A little info about my database is as follows. 150 common tables for global purpose. 130 tables for each clients. So it goes like this, 130 x 115(clients) = 14950 tables.

Is it normal or usual to have 14000 tables on one database? The reasons why we did this was, Easy maintenance and per client based customization.

Now, the problem is, ndb cluster can only support, 20320 tables. But it can support 5,000,000,000 rows in one table if I'm not wrong.

My real head ache is my cluster data node takes less than two minutes to startup with out any data. But as soon as convert my tables into ndb, that too only 2000 tables, data node takes at least 30 to 40 mins to start up. Is it normal? If I convertt all my tables into ndb, will it take even longer?

Or let's say if consolidate my 14000 table's data into one, which is 130 tables, will it help?

Or is there anything idiotically wrong which I'm doing? I'll attach my config.ini file soon. here's the simple overview of my config

Datamemory = 14G Indexmemory = 3GB Maxnooftable = 14000 Maxnoofattributes = 78000

I'm just testing these values with 2000 tables first.

Please advise, how to increase the start up speed. Please point out where I'm going wrong. Thanks in advance guys!

Arafat
  • 53
  • 5
  • I can't paste my whole config file here. Please follow this link for the config.ini http://arafath.com/config.txt – Arafat Dec 30 '10 at 01:46
  • This looks interesting. I can't really contribute much but I would be interested to see how this pans out. Rethinking the database structure is probably a good idea. I assume the delay in starting the node is due to the size of the database rather than the number of tables (just a guess). I assume the all data is normalized? Maybe looking at some different software solutions for indexing and recalling that amount of data? Make sure you ask this at stackoverflow as the question easily crosses over into database design and programming. – Ablue Dec 30 '10 at 11:06

1 Answers1

1

If you are able to change the database schema, normalize till it hurts, denormalize till it works. Partition your data per data node till it suits your performance requirements, but don't neglect replica nodes. If you are concerned about the node boot and start time, look at disk access/read speeds. Look at your raid config. Mirrored raid reading is superior to writing.

Maybe solid state hdd will offer better read speeds?

Ablue
  • 1,140
  • 1
  • 12
  • 32
  • Thanks Matthew, for the reply. Right now, our database is Normalized to the maximum. One of a techy from oracle also suggested the same as you did. Denormalize!. – Arafat Dec 31 '10 at 04:34
  • Did any of that make any sense? I did some quick reading on mysql clustering. – Ablue Dec 31 '10 at 04:36
  • We are now looking into combine all the clients table sets into one, since, NDB is really good at handling huge data per table. As for the HDD I/O speed, Im using SAS 6Gbps. Is that good enough or still I need to go for RAID 0 with two or more HDD? – Arafat Dec 31 '10 at 04:41
  • Since the DDB is stored in memory, I imagine the faster the better. My guess is that could contribute to the startup delay. Indexing a single table may be good, but doing complicated queries on it may cause you headaches. Normalizing that single table may save you space and improve performance drastically. – Ablue Dec 31 '10 at 04:42
  • And, I did an interesting test. I just created 2 databases and created 20,000 tables with no data! no indexing! It took 4 hours for my nodes to be ready. So I believe node startup time also affected by the number of tables. I will post my second result, which I'm testing now. Second test is with few tables with huge data in it. Please stay in touch... – Arafat Dec 31 '10 at 04:43
  • I think I should also do a test with SSD to compare with SAS. – Arafat Dec 31 '10 at 04:47
  • Awesome. That sounds promising. – Ablue Dec 31 '10 at 04:48
  • @SSD, sure? mirrored 10/15k rpm sas drives sounds pretty fast already. http://www.servethehome.com/ocz-vertex-v-sas-15k-rpm-raid-5/ – Ablue Dec 31 '10 at 04:52
  • Ya, SAS is faster than the SSD. 2 SAS RAID0 gives me two times faster than single SAS. I just tested with 350 tables and 20,000 records in total. I get 4 minutes node startup time. Man! where is 4 hours and 4 Minutes? Now I'm gonna proceed with atleast 8 GB of data. Let me see what happens. – Arafat Dec 31 '10 at 05:02