
My question relates to "Opscenter 4.0.2 and Cassandra 2.0.4 with SSL and Auth: agent can't connect". The answer provided there does help, but it creates another problem.

To summarize, everything runs fine until I enable SSL between OpsCenter and the datastax-agents. I am using DSE 4.0, my configuration is similar to the one in the other question, and I know the truststore gets picked up. However, the agent sometimes throws the following exception in the log:

INFO [thrift-init] 2014-03-12 12:52:08,283 Registering JMX me.prettyprint.cassandra.service_Agent Cluster:ServiceType=hector,MonitorType=hector
INFO [StompConnection receiver] 2014-03-12 12:52:08,352 Starting OS metric collectors (Linux)
INFO [StompConnection receiver] 2014-03-12 12:52:08,444 Starting Cassandra JMX metric collectors
ERROR [thrift-init] 2014-03-12 12:52:09,022 Exception in thread "thrift-init" 
ERROR [thrift-init] 2014-03-12 12:52:09,023 java.lang.OutOfMemoryError: Java heap space
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:140)
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
ERROR [thrift-init] 2014-03-12 12:52:09,023     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_cluster_name(Cassandra.java:1101)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at org.apache.cassandra.thrift.Cassandra$Client.describe_cluster_name(Cassandra.java:1089)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at me.prettyprint.cassandra.service.AbstractCluster$2.execute(AbstractCluster.java:149)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at me.prettyprint.cassandra.service.AbstractCluster$2.execute(AbstractCluster.java:145)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:253)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at me.prettyprint.cassandra.service.AbstractCluster.describeClusterName(AbstractCluster.java:155)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at java.lang.reflect.Method.invoke(Method.java:606)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
ERROR [thrift-init] 2014-03-12 12:52:09,024     at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:298)
ERROR [thrift-init] 2014-03-12 12:52:09,025     at clj_hector.core$cluster_name.invoke(core.clj:40)
ERROR [thrift-init] 2014-03-12 12:52:09,025     at opsagent.cassandra$setup_cassandra$f__376__auto____929$fn__949.invoke(cassandra.clj:360)
ERROR [thrift-init] 2014-03-12 12:52:09,025     at opsagent.cassandra$setup_cassandra$f__376__auto____929.invoke(cassandra.clj:358)
ERROR [thrift-init] 2014-03-12 12:52:09,025     at clojure.lang.AFn.run(AFn.java:24)
ERROR [thrift-init] 2014-03-12 12:52:09,025     at java.lang.Thread.run(Thread.java:744)

As described in the other question, I have to give the VM much more memory (I had to set -Xmx1024M, as 256MB was not sufficient) in order to get the actual exception:

me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportException
        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:39)
        at me.prettyprint.cassandra.service.AbstractCluster$2.execute(AbstractCluster.java:151)
        at me.prettyprint.cassandra.service.AbstractCluster$2.execute(AbstractCluster.java:145)
        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:253)
        at me.prettyprint.cassandra.service.AbstractCluster.describeClusterName(AbstractCluster.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
        at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:298)
        at clj_hector.core$cluster_name.invoke(core.clj:40)
        at opsagent.cassandra$setup_cassandra$f__376__auto____929$fn__949.invoke(cassandra.clj:360)
        at opsagent.cassandra$setup_cassandra$f__376__auto____929.invoke(cassandra.clj:358)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_cluster_name(Cassandra.java:1101)
        at org.apache.cassandra.thrift.Cassandra$Client.describe_cluster_name(Cassandra.java:1089)
        at me.prettyprint.cassandra.service.AbstractCluster$2.execute(AbstractCluster.java:149)
        ... 15 more

Sometimes, however, everything works, so it does look like a race condition. I followed the instructions in the only answer to the other question and manually configured the agent's SSL settings in address.yaml:

thrift_ssl_truststore: /etc/dse/conf/.truststore 
thrift_ssl_truststore_password: XYZ

Now the SSL part works, but Hector returns an error when it tries to run a request:

ERROR [thrift-processor-1] 2014-03-12 03:34:42,420 Error when proccessing thrift callme.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:You have not logged in)

I do have internal authentication enabled. However, it seems that once the SSL connection is manually configured on the agent, the credentials sent by OpsCenter are disregarded.

Is there a proper way to get SSL communication working with the agents while internal authentication/authorization is enabled in Cassandra?

jlemire

1 Answer


Unfortunately, the same bug that caused the first issue can also prevent the authentication details from being set correctly. You can work around it by specifying the credentials in address.yaml as well:

thrift_user: <username>
thrift_pass: <password>
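
Combined with the SSL settings from the question, the agent's address.yaml would then presumably contain all four keys. This is only a sketch assembled from the values quoted in this thread: the truststore path and the XYZ password are the asker's examples, and <username> and <password> are placeholders for the Cassandra credentials.

```yaml
# address.yaml on each agent node (sketch; every value is a placeholder)
# SSL settings, as quoted in the question
thrift_ssl_truststore: /etc/dse/conf/.truststore
thrift_ssl_truststore_password: XYZ
# Cassandra internal-authentication credentials, as suggested in this answer
thrift_user: <username>
thrift_pass: <password>
```

If a real password contains characters that are special in YAML, such as # or :, it likely needs to be quoted. The agent will also most likely have to be restarted before it rereads the file.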
nickmbailey
  • Thanks! I am going to try that asap and come back with feedback. Are you aware of documentation of any kind concerning the agent configuration file address.yaml? It seems to me that ssl security between opscenter and its agents pretty much requires manual configuration through that file. – jlemire Mar 17 '14 at 19:20
  • I have applied the setting cluster-wide and authentication always works now, whereas it would only work once in a while beforehand (i.e. whenever it would receive its configuration from opscenter before setting up hector). – jlemire Mar 17 '14 at 20:16
  • We are actually working on additional documentation for address.yaml; however, it does not exist at the moment. In general, configuring through address.yaml shouldn't be needed, but the race condition bug here does require it. That should hopefully be fixed soon. – nickmbailey Mar 18 '14 at 05:49