
On a Cassandra cluster running Apache Cassandra 3.11.4 that is repaired with Cassandra Reaper 1.4.1, the snapshots created by the repair process are sometimes not deleted.

This means that over time, more and more such snapshots (having names in the form of a UUID like xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) accumulate and I eventually run out of disk space unless I delete those snapshots manually.

Of course, I could create a cron job that periodically deletes those snapshots, but this might interfere with running repair sessions because there is no good way to tell which snapshots are stale and which ones are associated with active repair sessions.
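For illustration, here is a minimal sketch of what such a cron job could look like. It assumes the default data directory /var/lib/cassandra/data and uses a multi-day age threshold as a crude heuristic so that snapshots belonging to repair sessions that are still running are left alone; both the path and the threshold are assumptions to adjust, and since nodetool clearsnapshot only acts on the local node, the script would have to run on every node.

```python
#!/usr/bin/env python3
"""Sketch: clear UUID-named repair snapshots older than a safety margin."""
import os
import re
import subprocess
import time

DATA_DIR = "/var/lib/cassandra/data"   # assumed default data_file_directories
MAX_AGE_SECONDS = 3 * 24 * 3600        # assumed safety margin: 3 days

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")

def stale_snapshot_tags():
    """Walk <data_dir>/<keyspace>/<table>/snapshots and collect UUID-named
    snapshot tags whose directory mtime is older than the safety margin."""
    tags = set()
    now = time.time()
    for keyspace in os.listdir(DATA_DIR):
        ks_path = os.path.join(DATA_DIR, keyspace)
        if not os.path.isdir(ks_path):
            continue
        for table in os.listdir(ks_path):
            snap_dir = os.path.join(ks_path, table, "snapshots")
            if not os.path.isdir(snap_dir):
                continue
            for tag in os.listdir(snap_dir):
                tag_path = os.path.join(snap_dir, tag)
                if (UUID_RE.match(tag)
                        and now - os.path.getmtime(tag_path) > MAX_AGE_SECONDS):
                    tags.add(tag)
    return tags

if __name__ == "__main__":
    for tag in sorted(stale_snapshot_tags()):
        # clearsnapshot -t removes the given snapshot tag across all
        # keyspaces on this node
        subprocess.run(["nodetool", "clearsnapshot", "-t", tag], check=True)
```

This is only a workaround, though, and it still relies on the age heuristic rather than on knowing which repair sessions are active.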

Does anyone know why these snapshots are sometimes not deleted automatically (as they are supposed to be), and is there a better solution than the cron job mentioned above?

BTW: I also saw this problem with older versions of Apache Cassandra and Cassandra Reaper, so I do not think it is specific to the versions mentioned. On a different cluster that stores only a small amount of data, I have not seen this problem yet, so it might be related to failed repair sessions.

1 Answer


What setting are you using for Repair parallelism in your schedule?

We had an issue where we had this set to DATACENTER_AWARE and snapshots were not being cleaned up.

We are now using PARALLEL, which seems to handle snapshot cleanup correctly.
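For reference, the parallelism can also be set when creating a schedule through Reaper's REST API rather than the web UI. The sketch below assumes a Reaper instance at reaper-host:8080 and parameter names such as repairParallelism and scheduleDaysBetween; verify the exact parameter names against the API documentation for your Reaper version.

```python
"""Sketch: create a Reaper repair schedule with PARALLEL parallelism."""
import requests

REAPER = "http://reaper-host:8080"   # assumed Reaper address

params = {
    "clusterName": "my-cluster",       # assumed cluster name
    "keyspace": "my_keyspace",         # assumed keyspace
    "owner": "ops",
    "repairParallelism": "PARALLEL",   # instead of DATACENTER_AWARE
    "scheduleDaysBetween": 7,
}

resp = requests.post(f"{REAPER}/repair_schedule", params=params)
resp.raise_for_status()
print(resp.json())
```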

Scolari