
I've been trying to find details on this, with no luck. I've observed that if an EC2 instance has been running for many days (say 30-40 days), it becomes degraded. Terminating that instance fixes the problem.

But,

  1. Why do EC2 instances get degraded? Is it because of the hardware, or the software that we are running on it?

  2. Is there any way to avoid it?

  • Can you be more specific with your definition of 'degraded' please? Certainly the HW doesn't change or degrade as such but you may find the host your VM is created on will get filled up with other people's VMs over time, which could easily explain the issues you're seeing. – Chopper3 Sep 23 '20 at 10:28
  • @Chopper3 other tenants on the same host do not degrade your instance on EC2. That's an annoyingly persistent myth, nothing more. – Michael - sqlbot Sep 23 '20 at 10:52
  • @Michael-sqlbot - are you sure? How does that work then? I'll be honest, I only have expertise with vSphere/ESXi and I know EC2 is Xen (I think). Is every Xen VM just pinned to use a set amount of CPU/memory/disk resources? – Chopper3 Sep 23 '20 at 11:00
  • @Chopper3 yes, they're pinned, with some possible exceptions on the burstable `t`-families of instances (though the details of this aren't public). Looking for a good resource, I found [this answer on SO](https://stackoverflow.com/a/40444587/1695906) which was written by an AWS insider. – Michael - sqlbot Sep 23 '20 at 16:56
  • @Michael-sqlbot so are you saying that AWS's Xen is never overcommitted in terms of vCPU to pCPU and the same for memory? – Chopper3 Sep 24 '20 at 08:27
  • When you start an instance, AWS hardware (CPU and RAM) is dedicated, not shared, and not oversold. The only slight variation is T series where they're still not oversold, but you only get a fraction of a CPU core. RAM is still dedicated on a T series. AWS isn't a budget shared hosting provider. – Tim Sep 24 '20 at 19:30
  • @Chopper3 yes, that is what I'm saying. – Michael - sqlbot Sep 24 '20 at 20:59
  • @Michael-sqlbot Have you any documentation to show that? I can see that happening with memory, as Xen's not good at that anyway, but I genuinely don't believe that to be even close to the case for CPU; it'd be commercial suicide for one thing. – Chopper3 Sep 25 '20 at 10:23
  • @Chopper3 There's documentation cited in the answer I've already linked to: *"Each vCPU is a hyperthread of an Intel Xeon core..."* There's no oversubscription. What would be suicide would be CPU sharing and the inconsistent performance that accompanies it. You get what you pay for with EC2. I'm sorry if you don't believe it. – Michael - sqlbot Sep 26 '20 at 00:48
  • That statement "Each vCPU is a hyperthread of an Intel Xeon core" has nothing to do with overcommitment; they just mean they class a hyperthread as a full core. I have a friend who's an architect at AWS. I'll ask him; he probably won't know himself, but I imagine he'll be able to dig out the details, and I'll get back to you. – Chopper3 Sep 26 '20 at 14:06
  • Did you get an email stating that your instance was degraded? Or do you "feel" it has degraded? These are completely separate things, as the former means your instance stops working entirely and has to do with the underlying HW your instance happens to be running on. – Gaia Dec 14 '21 at 05:07

1 Answer

  1. If it's a t2.something instance class you may be running out of CPU credits. See "On clarifying t2 and t3 working conditions?" for more details on that.

    You can monitor your CPU credit balance in the Monitoring tab in the instance details.

  2. Other than that it's probably your software. Perhaps it has a memory leak that causes it to slowly run out of memory and spill over to swap, which makes it slow.

    You'll need to investigate what's going on at the time of "degradation": does it have high I/O, memory pressure, high swap use, etc.?
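
To look at the memory and swap side of point 2, here is a minimal sketch that assumes a Linux instance exposing `/proc/meminfo`. The helper names (`parse_meminfo`, `memory_summary`, `read_meminfo`) are made up for illustration and not part of any tool:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Summarise memory pressure and swap use from parsed meminfo values."""
    total = info["MemTotal"]
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "mem_used_pct": round(100 * (total - available) / total, 1),
        "swap_used_kb": swap_used,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and summarise the live values (Linux only)."""
    with open(path) as f:
        return memory_summary(parse_meminfo(f.read()))
```

Calling `read_meminfo()` from cron every few minutes and logging the result should show whether used memory and swap creep up over the 30-40 days before the slowdown, which would point at a leak rather than at EC2.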

EC2 alone doesn't degrade. I bet it's your app that's causing the issue, and rebooting it clears it.
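
If you'd rather rule out point 1 from a script than from the Monitoring tab, the credit balance is also published as the CloudWatch metric `CPUCreditBalance` in the `AWS/EC2` namespace. A rough sketch, assuming `boto3` is installed and credentials are configured; the function names and the 24-hour window are my own choices for illustration:

```python
from datetime import datetime, timedelta, timezone


def lowest_credit_balance(datapoints):
    """Return the smallest 'Minimum' value among CloudWatch datapoints, or None."""
    values = [p["Minimum"] for p in datapoints if "Minimum" in p]
    return min(values) if values else None


def fetch_credit_datapoints(instance_id, hours=24, region_name=None):
    """Fetch CPUCreditBalance datapoints for one instance (needs AWS access)."""
    import boto3  # third-party; imported here so the helper above works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUCreditBalance",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=300,  # the credit metrics are reported every 5 minutes
        Statistics=["Minimum"],
    )
    return response["Datapoints"]
```

Something like `lowest_credit_balance(fetch_credit_datapoints("i-..."))` with your own instance ID would then tell you whether the balance got close to zero in the chosen window. If it never does, credits are unlikely to be the cause.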

Hope that helps :)

MLu