
Context:

We have a Cassandra cluster with 3 nodes deployed as a StatefulSet in OpenShift. The three nodes are configured in the same datacenter and the same rack.

I also wrote a script to test Cassandra consistency level errors. It runs as a pod within OpenShift, connects to the cluster, and runs a SELECT query in a loop. It knows the IP addresses of all Cassandra nodes.

Problem:

If I reduce the replica count from 3 to 2 in the StatefulSet (which also runs nodetool drain on the terminated node), the script can no longer connect to the cluster. I get the following error:

cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'172.17.0.10': OSError(None, "Tried connecting to [('172.17.0.10', 9042)]. Last error: timed out"), '172.17.0.9': AuthenticationFailed('Failed to authenticate to 172.17.0.9: Error from server: code=0100 [Bad credentials] message="Error during authentication of user admin : org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE"',), '172.17.0.8': ConnectionRefusedError(111, "Tried connecting to [('172.17.0.8', 9042)]. Last error: Connection refused"), '172.17.0.11': AuthenticationFailed('Failed to authenticate to 172.17.0.11: Error from server: code=0100 [Bad credentials] message="Error during authentication of user admin : org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE"',)})

Question:

Since two nodes are still available, why can't authentication be performed at the LOCAL_ONE consistency level, and how can I solve this?

aspyct

1 Answer


When you created the cluster, did you change the replication factor for the system_auth keyspace? If not, it defaults to a replication factor of 1, so the single replica of a user's credentials may sit on the node you removed. Bring that node back, then change the replication factor of system_auth to 3.

See detailed instructions here.
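As a minimal sketch of that fix, run something like the following in cqlsh once the drained node is back. The datacenter name "datacenter1" is an assumption here — check yours with `nodetool status` and substitute it:

```sql
-- Inspect the current replication settings of the auth keyspace:
DESCRIBE KEYSPACE system_auth;

-- system_auth typically defaults to replication_factor = 1, meaning each
-- user's credentials exist on a single node. Raise it to match the cluster
-- size ('datacenter1' is an assumed datacenter name -- verify with
-- `nodetool status`):
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
```

After the ALTER, run `nodetool repair system_auth` on each node so the existing auth data is actually streamed to the new replicas; only then is it safe to take a node down again.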

Alex Ott