I have a table in Cassandra where I write data with a client-side TTL of 1 month (the table's default TTL is 0). The table is configured with the time window compaction strategy (TWCS).
Every day Cassandra dropped a single SSTable containing data that had expired one month earlier. Recently I changed the client's TTL to 15 days. I expected Cassandra to start dropping two SSTables a day at some point and release the space, but it keeps dropping one SSTable a day and holding on to 15 days of dead data.
How do I know?
for f in /data/cassandra/data/keyspace/table-*/*Data.db; do meta=$(sudo sstablemetadata $f); echo -e "Max:" $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y') "Min:" $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y') $(echo "$meta" | grep droppable) ' \t ' $(ls -lh $f | awk '{print $5" "$6" "$7" "$8" "$9}'); done | sort
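This command lists every SSTable with its maximum and minimum timestamps, its estimated droppable tombstone ratio, and its size on disk: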
Max: 05/19/2018 Min: 05/18/2018 Estimated droppable tombstones: 0.9876591095477787 84G May 21 02:59 /data/cassandra/data/pcc/data_history-c46a3220980211e7991e7d12377f9342/mc-218473-big-Data.db
Max: 05/20/2018 Min: 05/19/2018 Estimated droppable tombstones: 0.9875830312750179 84G May 22 15:25 /data/cassandra/data/pcc/data_history-c46a3220980211e7991e7d12377f9342/mc-221915-big-Data.db
Max: 05/21/2018 Min: 05/20/2018 Estimated droppable tombstones: 0.9876636061230402 85G May 23 13:56 /data/cassandra/data/pcc/data_history-c46a3220980211e7991e7d12377f9342/mc-224302-big-Data.db
...
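To make my expectation concrete, here is a rough sketch of the date arithmetic behind it. It assumes GNU `date` and whole-day granularity, and it only adds the TTL to the write date; `expected_drop_date` is a hypothetical helper I made up for illustration, not a Cassandra tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expected_drop_date WRITE_DATE TTL_DAYS
# Prints the day on which data written on WRITE_DATE (YYYY-MM-DD)
# reaches its TTL. Simplification: looks at the TTL only.
expected_drop_date() {
  date -u --date="$1 +$2 days" '+%m/%d/%Y'
}

# Data written under the old 30-day TTL
expected_drop_date 2018-05-19 30   # prints 06/18/2018

# Data written 15 days later under the new 15-day TTL (example date)
expected_drop_date 2018-06-03 15   # prints 06/18/2018
```

Both writes reach their TTL on the same day, which is why I expected two SSTables to become droppable per day for a while after the change.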
For now I have been triggering the compactions manually through JMX, but I want the expired SSTables to be dropped automatically, as they normally would be.
run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction /data/cassandra/data/keyspace/sstable_path