We run a Ceph cluster (Ubuntu 18.04, Luminous) for OpenStack images and volumes. While taking it into production I ran into many performance issues: slow OSDs and throughput down to a trickle. This turned out to be caused by the iptables rules.

As is common, one of the first rules allows traffic from RELATED or ESTABLISHED connections. NEW connections to any port in the Ceph range of listening ports are allowed as well. All other traffic gets dropped (I am keeping it simple here; there are some more details, but this is the gist of it).

By adding a LOG rule before the DROP, I found that packets were sometimes dropped that should have been allowed through, because netstat -tanp reported the connection as established.

The remedy was to change the rule for NEW connections: it now allows any traffic on the Ceph port range, matching on either the source or the destination port.
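To make the effect of that change concrete, here is a small Python model of the two rule sets. It is only a sketch of my reading of the description above, not the actual iptables configuration. The port range 6800-7300 is the Ceph default for OSDs and is an assumption here, as is the INVALID state of the example packet; the names Packet, old_rules and new_rules are made up for illustration.

```python
from dataclasses import dataclass

# Assumed port range (Ceph's default for OSDs); the real range is not given here.
CEPH_PORTS = range(6800, 7301)


@dataclass
class Packet:
    state: str  # conntrack state seen by iptables: NEW, ESTABLISHED, RELATED, INVALID
    sport: int
    dport: int


def old_rules(pkt):
    """Original rule set: the Ceph ports are only opened for NEW connections."""
    if pkt.state in ("RELATED", "ESTABLISHED"):
        return "ACCEPT"
    if pkt.state == "NEW" and pkt.dport in CEPH_PORTS:
        return "ACCEPT"
    return "DROP"


def new_rules(pkt):
    """Changed rule set: any packet with a Ceph source or destination port passes."""
    if pkt.state in ("RELATED", "ESTABLISHED"):
        return "ACCEPT"
    if pkt.sport in CEPH_PORTS or pkt.dport in CEPH_PORTS:
        return "ACCEPT"
    return "DROP"


# Hypothetical packet from an OSD back to a client, on a connection that
# conntrack no longer recognises as ESTABLISHED.
reply = Packet(state="INVALID", sport=6800, dport=45678)
print(old_rules(reply), new_rules(reply))
```

In this model the packet is dropped by the old rules and accepted by the new ones, which matches what the LOG rule showed. The trade-off is that the new rule no longer depends on connection tracking at all for the Ceph ports.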

But the discrepancy between the connections reported by netstat and the tracked connections reported by conntrack -L has me worried. I wrote a little script to compare the two.
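For readers who want to try the same kind of comparison, the following is a minimal Python sketch of the idea; it is not the script mentioned above. It assumes IPv4 only, no NAT, and the usual text layouts of netstat -tan and conntrack -L. The function names and the sample addresses are hypothetical.

```python
import re

# One src/dst/sport/dport tuple as printed by `conntrack -L`.
TUPLE_RE = re.compile(r"src=(\S+) dst=(\S+) sport=(\d+) dport=(\d+)")


def parse_netstat(text):
    """Return ESTABLISHED IPv4 TCP connections as (local_ip, lport, remote_ip, rport)."""
    conns = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "tcp" or fields[5] != "ESTABLISHED":
            continue
        local_ip, local_port = fields[3].rsplit(":", 1)
        remote_ip, remote_port = fields[4].rsplit(":", 1)
        conns.add((local_ip, int(local_port), remote_ip, int(remote_port)))
    return conns


def parse_conntrack(text):
    """Return both directions of every TCP entry that conntrack lists as ESTABLISHED."""
    tracked = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "tcp" or fields[3] != "ESTABLISHED":
            continue
        # Each entry carries an original and a reply tuple; keep both.
        for src, dst, sport, dport in TUPLE_RE.findall(line):
            tracked.add((src, int(sport), dst, int(dport)))
    return tracked


def untracked(netstat_text, conntrack_text):
    """Connections netstat calls ESTABLISHED that conntrack does not."""
    return sorted(parse_netstat(netstat_text) - parse_conntrack(conntrack_text))


# Made-up sample data: two sockets in netstat, only one of them tracked.
NETSTAT = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.0.0.2:6800           10.0.0.1:45678          ESTABLISHED 1234/ceph-osd
tcp        0      0 10.0.0.2:6801           10.0.0.3:51000          ESTABLISHED 1235/ceph-osd
"""
CONNTRACK = (
    "tcp      6 431999 ESTABLISHED src=10.0.0.1 dst=10.0.0.2 sport=45678 dport=6800 "
    "src=10.0.0.2 dst=10.0.0.1 sport=6800 dport=45678 [ASSURED] mark=0 use=1\n"
)

for conn in untracked(NETSTAT, CONNTRACK):
    print("not tracked as ESTABLISHED:", conn)
```

On a real node the two input strings could be captured with subprocess.run from netstat -tan and conntrack -L (both typically need root). Because the two commands run at slightly different moments, a few short-lived differences are to be expected; connections that stay in the list across repeated runs are the interesting ones.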

If anybody has more information on what might be going on, please let me know. Also, if you try my script on your Ceph cluster, let me know how it goes.

Dennis