Cassandra nodetool repair - how to schedule properly?

Question

I'm putting together a 16-node Cassandra cluster (replication factor 2) and want to set up a schedule for nodetool repair. gc_grace_seconds is at the default.

Two questions:

1. My first impulse is to set up a cron job on each machine and manually randomize the timing around a one-week schedule. Is there a better way?
2. Does nodetool repair have to be run on every node, or only on (number of nodes / replication factor) nodes? (I.e., for my 16 nodes with replication factor 2, that would be 8 nodes: one of each pair.)

Jon Haddad · Accepted Answer · 2015-03-09T21:27:01.093


I would not randomize it. Your best bet is to schedule the repairs so they don't stomp on each other.
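One way to picture a non-overlapping schedule is to divide the repair cycle into equal slots, one per node. The sketch below is illustrative only: the host names, the 7-day cycle (taken from the one-week plan in the question), and the `stagger_schedule` helper are assumptions, not part of Cassandra or nodetool. It also assumes each repair finishes within its slot and that the whole cycle stays inside gc_grace_seconds (10 days by default).

```python
from datetime import timedelta

GC_GRACE = timedelta(days=10)  # Cassandra's default gc_grace_seconds (864000 s)


def stagger_schedule(nodes, cycle=timedelta(days=7)):
    """Give each node an equally spaced start offset within one repair cycle."""
    if cycle >= GC_GRACE:
        raise ValueError("repair cycle should be shorter than gc_grace_seconds")
    slot = cycle / len(nodes)  # time budget per node before the next one starts
    return {node: i * slot for i, node in enumerate(nodes)}


# Hypothetical host names for the 16-node cluster in the question.
nodes = [f"cass{i:02d}" for i in range(1, 17)]
schedule = stagger_schedule(nodes)

for node, offset in schedule.items():
    print(f"{node}: start {offset} after the cycle begins")
```

With 16 nodes and a one-week cycle, each node gets a 10.5-hour slot. If a repair routinely runs longer than that, the slots would overlap and the cycle length needs rethinking.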

You should use the -pr option on each node when running repair.
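As a rough illustration, a cron-driven wrapper on each node might build and run the command as follows. This is a minimal sketch: `build_repair_command` and `run_repair` are hypothetical helper names, `my_keyspace` is a placeholder, and it assumes `nodetool` is on the PATH of the user running the job.

```python
import subprocess


def build_repair_command(keyspace=None):
    """Build a primary-range repair command, optionally limited to one keyspace."""
    cmd = ["nodetool", "repair", "-pr"]
    if keyspace:
        cmd.append(keyspace)
    return cmd


def run_repair(keyspace=None):
    """Run the repair on the local node and return nodetool's exit code."""
    return subprocess.run(build_repair_command(keyspace)).returncode


# Show the commands only; call run_repair() from the scheduled job itself.
print(" ".join(build_repair_command()))
print(" ".join(build_repair_command("my_keyspace")))
```

Because -pr repairs only the node's primary range, the job has to run on every node in the cluster, not on one node per replica pair.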

If you're using Cassandra 2.1, you have the option of incremental repair, which will speed things up considerably.

RF=2 is also a recipe for disaster: with two replicas, QUORUM needs both of them, so quorum queries will fail if a single node is unavailable. I recommend RF=3.
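The arithmetic behind that recommendation can be checked quickly. The sketch below uses the standard quorum formula, floor(RF / 2) + 1; the function names are made up for illustration.

```python
def quorum(rf):
    """Number of replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """Replicas that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_failures(rf)} replica(s)")
```

At RF=2 the quorum is 2, so no replica can be lost; at RF=3 the quorum is still 2, which leaves room for one replica to be down.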

edited Mar 09 '15 at 21:27

answered Dec 17 '14 at 20:49


On a small cluster (12-16 nodes) on new, reasonable hardware, will there truly be failures that often? – ethrbunny Dec 17 '14 at 22:49
It's not just about node failures. It's about cluster configuration changes, restarts, network partitions, power failure, rack failure. Additionally, as I mentioned, if you're using QUORUM - your queries will fail if only one node goes down. – Jon Haddad Dec 18 '14 at 04:38
@JonHaddad what is the purpose of using the -pr option? – Selvam Palanimalai Jan 27 '15 at 09:45
From "nodetool help repair": -pr, --partitioner-range. Use -pr to repair only the first range returned by the partitioner. Otherwise you initiate a repair on the entire cluster. – Jon Haddad Jan 27 '15 at 23:00
In the last line of your answer, should that say "RF=2 is a recipe for disaster..." as indicated by OP's RF? I've personally been burned by setting RF=2 and QUORUM CL. – BeepBoop Mar 06 '15 at 21:37
