We are seeing timeouts on our cluster while repairs or massive deletes are running. One piece of advice I received was to upgrade our Cassandra version from 2.0.17 to 2.2. I am draining one of the nodes to start the upgrade, and the drain has now been running for two days. In the logs I only see entries like these from time to time:

INFO [ScheduledTasks:1] 2016-04-06 08:17:10,987 ColumnFamilyStore.java (line 808) Enqueuing flush of Memtable-sstable_activity@1382334976(15653/226669 serialized/live bytes, 6023 ops)
 INFO [FlushWriter:1468] 2016-04-06 08:17:10,988 Memtable.java (line 362) Writing Memtable-sstable_activity@1382334976(15653/226669 serialized/live bytes, 6023 ops)
 INFO [ScheduledTasks:1] 2016-04-06 08:17:11,004 ColumnFamilyStore.java (line 808) Enqueuing flush of Memtable-compaction_history@1425848386(1599/15990 serialized/live bytes, 51 ops)
 INFO [FlushWriter:1468] 2016-04-06 08:17:11,012 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/sstable_activity/system-sstable_activity-jb-4826-Data.db (6348 bytes) for commitlog position ReplayPosition(segmentId=1458540068021, position=1198022)
 INFO [FlushWriter:1468] 2016-04-06 08:17:11,012 Memtable.java (line 362) Writing Memtable-compaction_history@1425848386(1599/15990 serialized/live bytes, 51 ops)
 INFO [FlushWriter:1468] 2016-04-06 08:17:11,039 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/compaction_history/system-compaction_history-jb-3491-Data.db (730 bytes) for commitlog position ReplayPosition(segmentId=1458540068021, position=1202850)

Should I wait or just stop the node and start the migration?

ftrujillo

2 Answers

The problem is related to a bug in versions prior to 2.1 (https://issues.apache.org/jira/browse/CASSANDRA-5911): commit logs are not removed after a flush.
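
One rough way to see whether commit logs are piling up on a node is to count the segment files and add up their sizes. Below is a minimal Python sketch, not an official tool. It assumes the segments live in a directory such as /var/lib/cassandra/commitlog (check commitlog_directory in cassandra.yaml for the real location) and that they are named CommitLog-*.log. The helper name commitlog_usage is made up for illustration.

```python
import os


def commitlog_usage(commitlog_dir):
    """Return (segment_count, total_bytes) for commit log segments in a directory.

    Assumption: segment files are named like CommitLog-<something>.log.
    """
    count = 0
    total = 0
    for entry in os.scandir(commitlog_dir):
        name = entry.name
        if entry.is_file() and name.startswith("CommitLog-") and name.endswith(".log"):
            count += 1
            total += entry.stat().st_size
    return count, total
```

For example, `commitlog_usage("/var/lib/cassandra/commitlog")` (path assumed) could be run before and after a flush; if the count keeps growing and never drops, that would be consistent with segments not being removed.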

ftrujillo

Check whether any other nodetool processes are running. I've had drain hang when snapshot processes were backing up. I stopped them all and restarted Cassandra to make sure it was healthy, and then drain worked.
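
A quick way to do that check is to scan the process list for nodetool. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes a Linux-style `ps -eo pid,args`. Depending on the Cassandra version, a running nodetool command may appear in the list as a java process whose command line contains a class name rather than the word "nodetool", so the patterns are configurable.

```python
import subprocess


def find_nodetool_processes(ps_output, patterns=("nodetool",)):
    """Return (pid, command) pairs whose command line contains any pattern.

    Matching is case-insensitive. ps_output is expected to have lines of
    the form "<pid> <command line>"; header lines are skipped.
    """
    lowered = [p.lower() for p in patterns]
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # header or malformed line
        pid, command = int(parts[0]), parts[1]
        if any(p in command.lower() for p in lowered):
            matches.append((pid, command))
    return matches


def list_nodetool_processes(patterns=("nodetool",)):
    """Run ps (assumed to be available) and filter its output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_nodetool_processes(result.stdout, patterns)
```

Calling `list_nodetool_processes()` on the node returns an empty list when nothing matches; extra patterns (for example the tool's Java class name, if you know it for your version) can be passed in as a tuple.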

Dave
  • There were no nodetool processes running. In any case, I think it is related to a problem flushing memtables to sstables in versions prior to 2.1 (https://issues.apache.org/jira/browse/CASSANDRA-5911), because I also saw that restarting the node took a very long time to replay the commit logs. – ftrujillo May 09 '16 at 07:45