I'm an administrator of a social game that uses MySQL (Percona 5.1.56, to be precise) for data storage (all tables are InnoDB). The game has about 2 million players, and the database is about 100 GB and growing gradually. A few tables already have more than 500 million records.
The game DB runs pretty smoothly, even unsharded, on a single, sufficiently powerful non-virtualized Debian 6 Linux server (24 GB RAM, hardware Adaptec RAID-10) with a couple of read-only slaves. The problem is that from time to time (once every month or two) MySQL crashes with data corruption like the following:
InnoDB: Database page corruption on disk or a failed InnoDB: file read of page XXXX.
InnoDB: You may have to recover from a backup.
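A minimal sketch (Python) of how the page numbers from these messages could be tallied across an error log, to see which pages are reported and how often. The function names and the log path argument are hypothetical, and the regex assumes only the message wording quoted above.

```python
import re
from collections import Counter

# Matches the tail of the InnoDB corruption message quoted above and
# captures the page number. Works whether or not the message is wrapped
# across two log lines, because only the second half is matched.
PAGE_RE = re.compile(r"InnoDB: file read of page (\d+)")


def corrupt_pages(log_text):
    """Return a Counter mapping page number -> number of times it was reported."""
    return Counter(int(m.group(1)) for m in PAGE_RE.finditer(log_text))


def report(path):
    """Print reported page numbers from the log file at `path`, most frequent first.

    `path` is a placeholder for wherever the MySQL error log lives on the server.
    """
    with open(path, errors="replace") as f:
        for page, hits in corrupt_pages(f.read()).most_common():
            print(page, hits)
```

Calling `report()` with the path to the MySQL error log prints one line per reported page, followed by how many times it appeared.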
Recovering from such errors is quite a painful process, which usually requires promoting one of the slaves to be the new master, directing the traffic to this new master, and creating a backup slave for it. There is some downtime, which makes players really mad...
The Percona folks told me it was the hardware's fault, and at first I also thought the hardware was to blame, but after changing servers several times I really don't know what to think.
Is there any chance it's MySQL corrupting the data? I've already started looking at alternatives (e.g. PostgreSQL, or even something radical like Cassandra). But of course I know that every new product has its own baggage of bugs and quirks, not to mention the costs of migration...
I'm pulling my hair out (I faced another crash today), so if you have any ideas, please share...