I have a 5-node Elasticsearch cluster. One host has had consistently high iowait (40%+) for several weeks. The others seem fine (<10%).
Notable:
- Host in question is not the cluster master
- New indexes are randomly distributed among all 5 hosts
- iotop shows that all of the processes with high I/O wait are Elasticsearch (i.e., no bots, viruses, etc.)
- All data is stored on a SAN with a 10G bonded connection
- All hosts are configured identically
Running:
- CentOS 7.x
- 128G RAM
Any suggestions on other tools I could use to ferret out the problem(s)?
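
For reference, here is a minimal sketch of how the iowait percentage can be sampled the same way on each host, so the five nodes are compared like for like. It assumes the standard Linux `/proc/stat` layout (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names and the 5-second interval are just illustrative choices, not part of any existing tool.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    # Field order: user nice system idle iowait irq softirq steal [guest ...]
    # Guest time is already counted inside user, so only the first 8 are summed.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate CPU line."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(5)  # hypothetical sampling interval
    second = parse_cpu_line(read_cpu_line())
    print("iowait: {:.1f}%".format(iowait_percent(first, second)))
```

Running this on all five hosts at the same time should show whether the 40%+ figure is specific to the one node over the same window.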