We recently upgraded our company servers from DataStax Enterprise 4.5.3 to DSE 4.6.0. The only problem we have run into is with the new Backup Service: we are unable to create a backup for "All Keyspaces", although backing up keyspaces one by one works like a charm. The error seems to come from the datastax-agents installed on the nodes. I am enclosing as many details as I can recall below.

OpsCenter Event Log:

Backup of all keyspaces failed: Backup of all keyspaces failed for the following destinations: snapshot

Snapshot of all keyspaces on node < node-IP > failed: clojure.lang.Compiler$CompilerException: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0) (< node-IP >)

Snapshot of all keyspaces on node < node-IP > failed: clojure.lang.Compiler$CompilerException: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0) (< node-IP >)

The actual output is longer: the "Snapshot of all keyspaces..." error appears once for every node in the cluster, and at the end the "Backup of all keyspaces failed:..." error is presented.

At the same time, all the datastax-agents present the following error message:

 ERROR [qtp1549990111-47] 2015-02-13 18:35:50,887 Unhandled route
 Exception: clojure.lang.Compiler$CompilerException:
 java.lang.ClassFormatError: Invalid method Code length 96939 in class
 file clojure/core$eval87, compiling:(NO_SOURCE_PATH:0:0)
                      Compiler.java:6567 clojure.lang.Compiler.analyzeSeq
                      Compiler.java:6361 clojure.lang.Compiler.analyze
                      Compiler.java:6616 clojure.lang.Compiler.eval
                      Compiler.java:6608 clojure.lang.Compiler.eval
                      Compiler.java:6582 clojure.lang.Compiler.eval
                           core.clj:2852 clojure.core/eval
                           routes.clj:58 opsagent.http.routes/fn
                             core.clj:94 compojure.core/make-route[fn]
                             core.clj:40 compojure.core/if-route[fn]
                             core.clj:25 compojure.core/if-method[fn]
                            core.clj:107 compojure.core/routing[fn]
                           core.clj:2443 clojure.core/some
                            core.clj:107 compojure.core/routing
                         RestFn.java:139 clojure.lang.RestFn.applyTo
                            core.clj:619 clojure.core/apply
                            core.clj:112 compojure.core/routes[fn]
                            Var.java:415 clojure.lang.Var.invoke
                       middleware.clj:93 opsagent.http.middleware/wrap-application-error[fn]
                       middleware.clj:75 opsagent.http.middleware/wrap-content-type[fn]
                      middleware.clj:112 opsagent.http.middleware/wrap-content-error[fn]
                       middleware.clj:31 opsagent.http.middleware/wrap-request-logging[fn]
                       middleware.clj:17 opsagent.http.middleware/wrap-opscenter-id-check[fn]
                      middleware.clj:123 opsagent.http.middleware/wrap-version-header[fn]
                   keyword_params.clj:32 ring.middleware.keyword-params/wrap-keyword-params[fn]
                           params.clj:58 ring.middleware.params/wrap-params[fn]
                            jetty.clj:19 opsagent.http.jetty/proxy-handler[fn]
                        (Unknown Source) opsagent.http.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$0.handle
                 HandlerWrapper.java:111 org.eclipse.jetty.server.handler.HandlerWrapper.handle
                         Server.java:349 org.eclipse.jetty.server.Server.handle
         AbstractHttpConnection.java:452 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest
         AbstractHttpConnection.java:894 org.eclipse.jetty.server.AbstractHttpConnection.content
         AbstractHttpConnection.java:948 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content
                     HttpParser.java:857 org.eclipse.jetty.http.HttpParser.parseNext
                     HttpParser.java:235 org.eclipse.jetty.http.HttpParser.parseAvailable
             AsyncHttpConnection.java:76 org.eclipse.jetty.server.AsyncHttpConnection.handle
          SelectChannelEndPoint.java:609 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle
           SelectChannelEndPoint.java:45 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run
               QueuedThreadPool.java:599 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
               QueuedThreadPool.java:534 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run
                        (Unknown Source) java.lang.Thread.run
 Caused by: java.lang.ClassFormatError: Invalid method Code length 96939 in class file clojure/core$eval87
                        (Unknown Source) java.lang.ClassLoader.defineClass1
                        (Unknown Source) java.lang.ClassLoader.defineClass
                        (Unknown Source) java.lang.ClassLoader.defineClass
              DynamicClassLoader.java:46 clojure.lang.DynamicClassLoader.defineClass
                      Compiler.java:4663 clojure.lang.Compiler$ObjExpr.getCompiledClass
                      Compiler.java:3819 clojure.lang.Compiler$FnExpr.parse
                      Compiler.java:6558 clojure.lang.Compiler.analyzeSeq

  INFO [qtp1549990111-47] 2015-02-13 18:35:50,888 HTTP: :post
 /ops/take-snapshot {:req-id "c13bb101-2f9e-4880-8b1f-efc178f49b3e"} -
 500

The above applies to a production cluster of 5 nodes in 2 data centers (DataStax defaults: Cassandra and Analytics DCs with DseSimpleSnitch). The Analytics DC runs Spark and CFS. I tried the same procedure (upgrade 4.5.3 -> 4.6.0, then back up all keyspaces) on my local 2-machine cluster (one Cassandra node, one Analytics node) with a much smaller dataset, and there it works like a charm.

Arribah

1 Answer

There’s a known bug in OpsCenter 5.1 that causes backups to fail in specific scenarios. Unfortunately, it looks like you have hit it. The fix will be in OpsCenter 5.1.1, which is to be released soon.

The workaround that you discovered (per-keyspace backup) should work reliably.
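
If the per-keyspace route has to be scripted, one option is to take the snapshots strictly one after another so that two of them never overlap on a node. The following is only a rough sketch under some assumptions: it calls `nodetool snapshot -t <tag> <keyspace>` directly on the local node rather than going through the OpsCenter Backup Service (so OpsCenter will not track these snapshots), and the helper names `snapshot_commands` and `run_serially` are made up for illustration.

```python
import subprocess


def snapshot_commands(keyspaces, tag):
    """Build one `nodetool snapshot` command per keyspace, all sharing one tag.

    Assumption: `nodetool` is on PATH and snapshots the local node only,
    so this would have to be run on every node in the cluster.
    """
    return [["nodetool", "snapshot", "-t", tag, ks] for ks in keyspaces]


def run_serially(commands, runner=subprocess.run):
    """Run the commands one at a time and return those that failed.

    Each command finishes before the next one starts, so no two
    snapshots issued by this script run concurrently on the node.
    """
    failed = []
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```

A call such as `run_serially(snapshot_commands(["ks_one", "ks_two"], "nightly"))` (keyspace names and tag are placeholders) returns an empty list when every snapshot succeeded. Sharing one tag keeps the per-keyspace snapshots grouped, although they are still taken at slightly different times.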

arre
  • I'm adding some info just to let you know. It seems that the per-keyspace backup is not working _very_ reliably either. There are problems when two backups collide with each other (the datastax-agents generate timeouts and OpsCenter considers the whole operation to have failed), and it's not easy to avoid backup collisions when you have a datacenter used for Analytics. The different workloads slow backups down, while at the same time you want to keep different keyspaces backed up as closely in time as possible. – Arribah Mar 10 '15 at 18:05
  • Good point. Good news is, 5.1.1 is almost done, so there won’t be a need to use that workaround soon. Hope that helps, and I’m sorry this problem caused you so much pain. – arre Mar 11 '15 at 19:41