3

I'm using percona-clustercheck (which comes with Percona's XtraDB Cluster packages) with xinetd and I'm getting an error when trying to curl the clustercheck service.

/usr/bin/clustercheck:

#!/bin/bash 
#
# Script to make a proxy (ie HAProxy) capable of monitoring Percona XtraDB Cluster nodes properly
#
# Author: Olaf van Zandwijk <olaf.vanzandwijk@nedap.com>
# Documentation and download: https://github.com/olafz/percona-clustercheck
#
# Based on the original script from Unai Rodriguez 
#

MYSQL_USERNAME="clustercheckuser" 
MYSQL_PASSWORD="clustercheckpassword!" 
ERR_FILE="/dev/null" 
AVAILABLE_WHEN_DONOR=0

#
# Perform the query to check the wsrep_local_state
#
WSREP_STATUS=`mysql --user=${MYSQL_USERNAME} --password=${MYSQL_PASSWORD} -e "SHOW STATUS LIKE 'wsrep_local_state';" 2>${ERR_FILE} | awk '{if (NR!=1){print $2}}' 2>${ERR_FILE}` 

if [[ "${WSREP_STATUS}" == "4" ]] || [[ "${WSREP_STATUS}" == "2" && ${AVAILABLE_WHEN_DONOR} == 1 ]]
then 
    # Percona XtraDB Cluster node local state is 'Synced' => return HTTP 200
    /bin/echo -en "HTTP/1.1 200 OK\r\n" 
    /bin/echo -en "Content-Type: text/plain\r\n" 
    /bin/echo -en "\r\n" 
    /bin/echo -en "Percona XtraDB Cluster Node is synced.\r\n" 
    /bin/echo -en "\r\n" 
    exit 0
else 
    # Percona XtraDB Cluster node local state is not 'Synced' => return HTTP 503
    /bin/echo -en "HTTP/1.1 503 Service Unavailable\r\n" 
    /bin/echo -en "Content-Type: text/plain\r\n" 
    /bin/echo -en "\r\n" 
    /bin/echo -en "Percona XtraDB Cluster Node is not synced.\r\n" 
    /bin/echo -en "\r\n"
    exit 1 
fi

/etc/xinetd.mysqlchk:

# default: on
# description: mysqlchk
service mysqlchk
{
# this is a config for xinetd, place it in /etc/xinetd.d/
  disable = no
  flags = REUSE
  socket_type = stream
  port = 9200
  wait = no
  user = nobody
  server = /usr/bin/clustercheck
  log_on_failure += USERID
  only_from = 10.0.0.0/8 127.0.0.1
  # recommended to put the IPs that need
  # to connect exclusively (security purposes)
  per_source = UNLIMITED
}

When attempting to curl the service, I get a valid response (HTTP 200, text) but a 'connection reset by peer' notice at the end:

HTTP/1.1 200 OK
Content-Type: text/plain

Percona XtraDB Cluster Node is synced.

curl: (56) Recv failure: Connection reset by peer

Unfortunately, Amazon ELB appears to see this as a failed check, not a succeeded one.

How can I get clustercheck to exit gracefully in a manner that curl doesn't see a connection failure?

ceejayoz

4 Answers

6

Adding Content-Length: 0 makes the client ignore the body, even if there is content, as there is in this case, so it might break other check software. Instead, set the header to the actual body length: 42 bytes for a synced node (so add Content-Length: 42) and 46 bytes for an unsynced node (Content-Length: 46).

# curl localhost:9200
Percona XtraDB Cluster Node is synced.

I will push the updated script so it will be fixed in the new version of the Percona XtraDB Cluster package as well.
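
As a minimal sketch of the idea above, the length can be computed from the message rather than hard-coded. The function name send_response is hypothetical and not part of the packaged script; it assumes an ASCII-only message and mirrors the two trailing CRLF pairs that the original script sends after the text.

```shell
#!/bin/bash
# Hypothetical helper, not part of the packaged clustercheck script.
# Prints an HTTP response whose Content-Length matches the bytes sent.
send_response() {
    local status="$1"   # e.g. "200 OK"
    local body="$2"     # message text without line endings
    # The body on the wire is the text plus two CRLF pairs (4 bytes),
    # mirroring the echo lines in the original script.
    # ${#body} counts characters, so this assumes an ASCII-only message.
    local length=$(( ${#body} + 4 ))

    printf 'HTTP/1.1 %s\r\n' "$status"
    printf 'Content-Type: text/plain\r\n'
    printf 'Content-Length: %d\r\n' "$length"
    printf '\r\n'
    printf '%s\r\n\r\n' "$body"
}

send_response "200 OK" "Percona XtraDB Cluster Node is synced."
```

With the two messages from the script, this yields Content-Length: 42 for the synced case and Content-Length: 46 for the unsynced case, matching the figures given above.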

2

I added a Content-Length: 0 header to the clustercheck response, which appears to do the trick for Amazon's health checking. If anyone knows of a better practice, please let me know.

ceejayoz

1

I solved a similar problem by using

echo -en "HTTP/1.1 200 OK\r\n" | cat

instead of

echo -en "HTTP/1.1 200 OK\r\n"

0

I also noticed that you are calling the script on localhost. You can also run clustercheck directly on the command line and check its return value.

In case of a synced node:

# /usr/bin/clustercheck > /dev/null
# echo $?
0

In case of an un-synced node:

# /usr/bin/clustercheck > /dev/null
# echo $?
1
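
The same exit-status check can be wrapped in a small function for use in scripts. This is only a sketch: report_node_state is a hypothetical name, and the command to run is passed as an argument (defaulting to /usr/bin/clustercheck) so it can be tried without a cluster.

```shell
#!/bin/bash
# Hypothetical wrapper: map a check command's exit status to a label.
# Defaults to /usr/bin/clustercheck; any other command can be passed in.
report_node_state() {
    local check_cmd="${1:-/usr/bin/clustercheck}"
    # Exit status 0 means synced; anything else (including a missing
    # command) is reported as not synced.
    if "$check_cmd" > /dev/null 2>&1; then
        echo "synced"
    else
        echo "not synced"
    fi
}
```

Calling report_node_state with no argument runs the packaged script; passing true or false as the argument exercises both branches.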
  • `localhost` is just for my testing, the actual checks will be from AWS's load balancer instances. Thanks again! – ceejayoz Mar 21 '13 at 19:43