elasticsearch server is unreachable every 2 hours

Question

This question is related to this one. We now know that the errors come from elasticsearch. The problems are still not resolved despite the modifications and optimizations made to the es instance. Every 2 hours the es server becomes unreachable: we get timeout or "connection reset by peer" errors.

We think that is related to this:

I don't really understand this graph because during the day there is no indexing at all. The index process is only launched once a day at 2 AM and it runs without problem.

I have other Grafana reports, where should I look?

Some data:

Versions:

elasticsearch: 1.7.5

I'm not sure if it's related or not, but isn't 30 shards _way_ too many for 439MB? — virullius, Nov 19 '17 at 00:18
Nothing in the elasticsearch logs? Could you check the kernel ring log (dmesg) or /var/log/syslog to see if it hits a certain limit? Perhaps the maximum number of open file descriptors is reached, too many open connections, not enough HEAP size... — Nils, Nov 22 '17 at 16:43
Please consider getting off 1.7. There are some major issues with stability and data reliability with that version. You may never be able to fix it. — TheFiddlerWins, Oct 08 '18 at 13:08

score 0 · Accepted Answer · answered Oct 08 '18 at 12:48

I forgot to answer the question. The issue came from the F5 load balancer we were using. After a major upgrade the problem disappeared by itself. We were pretty sure those errors didn't come from the "code". If it can help someone having the same kind of error... Overall, this issue was beneficial for the application because we:

Cleaned up a lot of code
Removed an elasticsearch index that was useless
And, more importantly, upgraded elasticsearch from 1.7 to 5.6
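
For anyone chasing a similar symptom, here is a minimal sketch (not what we actually ran) of one way to tell a load balancer problem from a node problem: probe the elasticsearch HTTP port directly and through the load balancer, and compare the results. The function names `probe` and `compare` are made up for this illustration, and the verdict strings are only a rough heuristic.

```python
import socket


def probe(host, port, timeout=2.0):
    """Send one plain HTTP GET and classify the outcome.

    Returns 'ok', 'timeout', 'reset', 'refused' or 'error'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            request = b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
            sock.sendall(request)
            data = sock.recv(1024)
            # Any HTTP status line counts as "reachable" for this check.
            return "ok" if data.startswith(b"HTTP/") else "error"
    except socket.timeout:
        return "timeout"
    except ConnectionResetError:
        return "reset"      # "connection reset by peer"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "error"


def compare(direct, via_lb, timeout=2.0):
    """Probe both paths; each argument is a (host, port) tuple."""
    direct_result = probe(direct[0], direct[1], timeout=timeout)
    lb_result = probe(via_lb[0], via_lb[1], timeout=timeout)
    if direct_result == "ok" and lb_result != "ok":
        verdict = "suspect the load balancer"
    elif direct_result != "ok":
        verdict = "suspect the node itself"
    else:
        verdict = "both paths healthy"
    return direct_result, lb_result, verdict
```

Calling `compare(("es-node.example", 9200), ("es-vip.example", 9200))` every few minutes and logging the result would show whether the failures line up with one path only. Both host names are placeholders, and 9200 is just the default elasticsearch HTTP port.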
