
I have a t2.nano instance that has been running in the N. Virginia region (us-east-1) for almost a year.
Hoping to reduce latency, I have just deployed an AMI created from that instance to a t3.micro instance in the Singapore region (ap-southeast-1). An RDS instance (in us-east-1) is attached to the Apache server on the instance.

But the t3.micro (in Singapore) is responding far more slowly than the old t2.nano (in N. Virginia, USA), even though N. Virginia is more than 4 times farther from my location (Dhaka, Bangladesh).

As evidence of the slowness, Google's PageSpeed scores the old & new servers 100 & 71 respectively, Pingdom scores them 100 & 81, and GTmetrix scores them 100 & 79. Screenshot from the GTmetrix comparison of the two sites: [screenshot]

EDIT: Those scores were mistakenly generated from unequal requests, but the following screenshot now shows that there really is a long waiting time on the t3.micro instance: [screenshot]

This server also hosts a number of other REST APIs (developed with the Laravel framework, for both the web front-end & mobile apps), all of which show the same long delays.

I have not added any further configuration to this system, & all other configuration (security group, AMI, IAM, RDS, S3, etc.) is exactly the same for both instances.

I understand that the RDS connection might add some milliseconds of delay (& probably some delay due to caching?), but an average delay of more than 10 s feels intolerable.

What can cause such a difference, & what more should be done to avoid it?

Touhid
    *"I understand that the RDS connection might occur some more milliseconds of delay"* You are probably substantially underestimating the impact of this. *Every query* your application performs against the database is delayed by this round-trip time, including `USE` and `SET` statements, and the absolute best case RTT from Singapore to Virginia is still > 200ms. A poor querying strategy, for example multiple queries selecting one item each in a `for` loop, goes undetected in a low-latency environment and gets out of hand across distance. What do your application's *internal* benchmarks show? – Michael - sqlbot Dec 04 '18 at 20:48
  • And if you want to rule out a network issue, just measure TCP latency (there are several TCP ping tools out there; see the sketch below). – Pedro Perez Dec 14 '18 at 19:33
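To illustrate the pattern Michael describes, here is a minimal sketch of an N+1 query loop versus a batched query inside a Laravel app (the `orders` table and the `$orderIds` variable are hypothetical). With a ~220 ms round trip per query, 50 iterations of the loop already cost ~11 s, while the batched form pays for a single round trip:

```php
<?php
// Runs inside a Laravel app (e.g. via `php artisan tinker`).
use Illuminate\Support\Facades\DB;

// N+1 pattern: one query -- and one Singapore -> us-east-1
// round trip -- per item.
$orders = [];
foreach ($orderIds as $id) {
    $orders[] = DB::table('orders')->find($id); // ~220 ms each
}

// Batched alternative: a single query, a single round trip.
$orders = DB::table('orders')->whereIn('id', $orderIds)->get();
```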
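And a TCP "ping" needs nothing beyond plain PHP: the sketch below times the TCP handshake to the database endpoint a few times (the hostname is a placeholder; substitute your actual RDS endpoint, and note the security group must allow the connection):

```php
<?php
// Time the TCP handshake to the (placeholder) RDS endpoint.
$host = 'mydb.xxxxxxxx.us-east-1.rds.amazonaws.com';
$port = 3306;

for ($i = 0; $i < 5; $i++) {
    $start = microtime(true);
    $sock  = @fsockopen($host, $port, $errno, $errstr, 10); // 10 s timeout
    $ms    = (microtime(true) - $start) * 1000;
    if ($sock === false) {
        echo "connect failed: $errstr ($errno)\n";
        continue;
    }
    fclose($sock);
    printf("TCP handshake took %.1f ms\n", $ms);
}
```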

1 Answer


The pages these two instances serve are not the same.

Screenshot

The total page size is nearly 140× bigger (386 kB vs 2.8 kB), and the number of requests is over 8× higher (17 vs 2).

First of all, align the two setups so that they serve your pages the same way; then we can look at the actual instance performance.

Also, having the DB in a different region doesn't help - a typical website may make quite a lot of DB queries, and if each query adds half a second or so of latency, it quickly adds up: just 20 such queries already account for your 10-second delay.
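A quick way to see the per-query cost from the Singapore box is to time a trivial statement over PDO (the host, database name, and credentials below are placeholders):

```php
<?php
// Time one trivial query against the remote RDS instance.
$pdo = new PDO(
    'mysql:host=mydb.xxxxxxxx.us-east-1.rds.amazonaws.com;dbname=app',
    'user',
    'password'
);

$start = microtime(true);
$pdo->query('SELECT 1')->fetchAll();
printf("SELECT 1 took %.1f ms\n", (microtime(true) - $start) * 1000);
```

Anything much above your measured network round trip points at the queries themselves rather than the instance.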

And lastly, T2 and T3 instances use so-called CPU Credits - once they are depleted, the performance drops sharply. It may be that you did your performance testing right after e.g. installing some software or otherwise using up the credits; in that case the performance would be really poor. Give it some time to accumulate more credits and retry.
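To verify this, check the instance's `CPUCreditBalance` metric in CloudWatch - a rough sketch using the AWS SDK for PHP follows (the instance ID is a placeholder):

```php
<?php
// Print the CPUCreditBalance of the instance over the last 3 hours.
// Requires aws/aws-sdk-php; the instance ID is a placeholder.
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;

$cw = new CloudWatchClient(['version' => 'latest', 'region' => 'ap-southeast-1']);

$result = $cw->getMetricStatistics([
    'Namespace'  => 'AWS/EC2',
    'MetricName' => 'CPUCreditBalance',
    'Dimensions' => [['Name' => 'InstanceId', 'Value' => 'i-0123456789abcdef0']],
    'StartTime'  => strtotime('-3 hours'),
    'EndTime'    => time(),
    'Period'     => 300,
    'Statistics' => ['Average'],
]);

foreach ($result['Datapoints'] as $dp) {
    printf("%s  balance: %.1f\n", $dp['Timestamp']->format('H:i'), $dp['Average']);
}
```

A balance near zero while you were testing would explain the poor numbers.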

Hope that helps :)

MLu
  • Sorry that I didn't notice the response size difference, but actually the difference is mainly due to the long waiting time of the `t3.micro`, which can be seen in the updated question. Also, no CPU credit was used on either of these instances, & I have been seeing the delay ever since the `t3.micro` deployment (almost 30 hours so far). – Touhid Dec 04 '18 at 11:55
  • That still does not explain why Site A would be 2.77 KB and Site B would be 386 KB. I don't think the two sites are the same. – HawkEye Dec 04 '18 at 13:53