I'm a project manager working on a complex web application that is deployed on three clusters (servers) in different parts of the world. The code is the same on every cluster.
However, the page load time reported by Google Analytics (GA) varies widely from cluster to cluster: from 2.5 seconds on cluster A (which is acceptable) to 6 seconds on cluster B (which is far beyond the company's SLA).
We set up Nagios HTTP checks as a second measurement, and their numbers are close to the GA page load times.
Our admins troubleshot the issue with regular tcptraceroute runs, which showed times from 0.5 to 1.8 seconds, and concluded that there are no issues with the network or the servers.
My questions are:
1) Is a tcptraceroute check relevant for troubleshooting this kind of issue?
2) Is there any other way to troubleshoot page load time from the admins' side?
3) My main argument for the admins to keep investigating is that the page load time is 2.5 seconds on one cluster and 6 seconds on another (in both the GA and the Nagios checks). Isn't that enough reason to keep investigating?
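To make question 2 concrete: the kind of check I have in mind would split a single request into phases (DNS lookup, TCP connect, TLS handshake, time to first byte, download), so that the clusters can be compared phase by phase instead of by one total. Below is a minimal Python sketch using only the standard library. The function name `time_phases` and the User-Agent string are made up for illustration, and it times one raw HTTP GET without following redirects, which is not the same as a full browser page load.

```python
import socket
import ssl
import time
from urllib.parse import urlsplit


def time_phases(url, timeout=10.0):
    """Time one GET request to `url` and return per-phase durations in seconds.

    Hypothetical helper for illustration; it does not follow redirects
    and does not fetch images, scripts, or stylesheets.
    """
    parts = urlsplit(url)
    host = parts.hostname
    use_tls = parts.scheme == "https"
    port = parts.port or (443 if use_tls else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_header = host if parts.port is None else f"{host}:{port}"

    t_start = time.perf_counter()
    # Phase 1: DNS resolution (use the first address returned).
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    t_dns = time.perf_counter()

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # Phase 2: TCP three-way handshake.
        sock.connect(addr)
        t_connect = time.perf_counter()

        # Phase 3: TLS handshake (skipped, so ~0 seconds, for plain http).
        if use_tls:
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
        t_tls = time.perf_counter()

        # Phase 4: send the request and wait for the first response bytes.
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host_header}\r\n"
            "User-Agent: phase-timer\r\n"
            "Connection: close\r\n\r\n"
        )
        sock.sendall(request.encode("ascii"))
        data = sock.recv(65536)
        t_first_byte = time.perf_counter()

        # Phase 5: read until the server closes the connection.
        size = len(data)
        while data:
            data = sock.recv(65536)
            size += len(data)
        t_done = time.perf_counter()
    finally:
        sock.close()

    return {
        "dns": t_dns - t_start,
        "connect": t_connect - t_dns,
        "tls": t_tls - t_connect,
        "ttfb": t_first_byte - t_tls,
        "download": t_done - t_first_byte,
        "total": t_done - t_start,
        "bytes": size,
    }
```

Calling `time_phases()` with the same page URL on each cluster, several times and from the same monitoring host, would show whether the extra seconds on cluster B sit in the network phases (`dns`, `connect`, `tls`) or in `ttfb`, which is mostly the time the server spends producing the response.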
Thanks, and sorry if I've offended anybody with such vague questions.