This question is related to this one. We now know that the errors come from elasticsearch. The problems are still not resolved despite the modifications and optimizations made on the es instance. Every 2 hours the es server becomes unreachable: we have timeout or connection reset by peer errors.
We think that is related to this:
I don't really understand this graph because during the day there is no indexing at all. The index process is only launched once a day at 2 AM and it runs without problem.
I have other Grafana reports, where should I look?
Some data:
Versions:
- elasticsearch: 1.7.5