We're trying to verify the state of replication in our Cassandra cluster. My colleague has found that only a small number of sstable files exist on more than one node. All the others are unique to a single node.
To me, this makes sense. As I understand it, each node should be responsible for a unique set of ranges, and should have sstables that reflect those ranges. But now I'm not sure.
Should we find at least n copies of each sstable with a replication factor of n? Or are the duplicate sstables a result of the bootstrap that simply haven't been compacted yet?
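
To make my mental model concrete, here is a rough Python sketch of how I picture replica placement. It is a toy, not Cassandra's code: the four-node ring, the node names, the integer tokens, and the helpers `replicas_for` and `ranges_stored_by` are all made up for illustration, and I'm assuming SimpleStrategy-style placement (the owner of a token plus the next nodes clockwise). It only models which nodes should hold a given partition, not which sstable files end up on disk.

```python
from bisect import bisect_left

# Hypothetical ring: token -> node name (toy values, not real Murmur3 tokens).
RING = {0: "node_a", 25: "node_b", 50: "node_c", 75: "node_d"}


def replicas_for(token, ring=RING, rf=3):
    """Nodes that should hold a partition with this token (SimpleStrategy-style)."""
    tokens = sorted(ring)
    # The owner is the first node whose token is >= the partition token,
    # wrapping around to the start of the ring if we run past the end.
    start = bisect_left(tokens, token) % len(tokens)
    count = min(rf, len(tokens))
    # The remaining replicas are the next nodes clockwise on the ring.
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


def ranges_stored_by(node, ring=RING, rf=3):
    """(start, end] token ranges the node holds, as owner or as replica."""
    tokens = sorted(ring)
    held = []
    for i, end in enumerate(tokens):
        start = tokens[i - 1]  # index -1 wraps to the last token for i == 0
        if node in replicas_for(end, ring, rf):
            held.append((start, end))
    return held


if __name__ == "__main__":
    for name in sorted(set(RING.values())):
        print(name, ranges_stored_by(name))
```

In this toy model, with `rf=1` each node holds exactly one range, which matches my "unique set of ranges" picture. With `rf=3` each node holds three ranges that overlap with its neighbours, so the same partitions live on several nodes. What I can't tell from this is whether that overlap should show up as identical sstable files.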