I'm running a t2.micro instance on Amazon Linux AMI 2018.03 (4.14.59-64.43.amzn1.x86_64). It hosts a PHP website on Apache/2.4.33 and connects to an RDS MySQL database.
From time to time, the server completely "disappears". Loading the website, connecting over FTP, and even connecting over SSH with PuTTY all time out. It doesn't come back on its own: I have to manually stop the server via the AWS console and start it again, after which everything is back to normal. (Interestingly, the "reboot" command does nothing and seems to be ignored by the server. Only stopping it and starting it again works.)
The problem is that I checked every log file I could find, and there doesn't seem to be anything at all around the time the server stopped responding, so I have no idea how to troubleshoot. In the CloudWatch metrics, CPU and network usage also look normal while the server is not responding.
This seems to happen when I run a particular memory-heavy PHP script several times (though not consistently: I can also run the script without any issue), so I suspect it might be related to the RAM filling up. But if the system were killing something to free up memory, wouldn't that show up in the logs?
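To test the memory theory, I could leave something like the following running so that there is a record of available memory right up to the moment the server freezes. This is only a minimal sketch: the function names, the log line format and the 10-second interval are my own assumptions, and it relies on `/proc/meminfo` exposing `MemAvailable` (it falls back to `MemFree` otherwise).

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Most fields look like "MemTotal:  1009140 kB"; a few have no unit.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_sample(info, now=None):
    """Build one log line from a parsed meminfo dict (log format is an assumption)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return "{} total_kb={} avail_kb={} swap_free_kb={}".format(
        stamp, info.get("MemTotal", 0), available, info.get("SwapFree", 0)
    )


def log_memory(path, interval_seconds=10, samples=None):
    """Append a memory sample to `path` every `interval_seconds`.

    Each line is flushed and fsync'ed so the last sample has a chance of
    surviving a hard freeze. `samples=None` means run until interrupted.
    """
    taken = 0
    with open(path, "a") as log:
        while samples is None or taken < samples:
            with open("/proc/meminfo") as meminfo:
                info = parse_meminfo(meminfo.read())
            log.write(format_sample(info) + "\n")
            log.flush()
            os.fsync(log.fileno())
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval_seconds)
```

Calling `log_memory("/var/log/memwatch.log")` in the background (the path is hypothetical) and reading the last few lines after the next freeze should show whether available memory was close to zero just before the server stopped responding.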
How would one go about debugging in a situation like this?
Thanks
Here is the only thing in the messages log around the last occurrence:
Sep 6 15:11:34 compta dhclient[2266]: PRC: Renewing lease on eth0.
Sep 6 15:11:34 compta dhclient[2266]: XMT: Renew on eth0, interval 10970ms.
Sep 6 15:11:34 compta dhclient[2266]: RCV: Reply message on eth0 from ****::***:****:****:****.
Sep 6 15:11:34 compta ec2net: [get_meta] Trying to get http://***.***.***.***/latest/meta-data/network/interfaces/macs/**:**:**:**:**:**/local-ipv4s
Sep 6 15:11:34 compta ec2net: [rewrite_aliases] Rewriting aliases of eth0
Sep 6 15:11:34 compta ec2net: [get_meta] Trying to get http://***.***.***.***/latest/meta-data/network/interfaces/macs/**:**:**:**:**:**/subnet-ipv4-cidr-block
Sep 6 15:22:13 compta kernel: imklog 5.8.10, log source = /proc/kmsg started.
Sep 6 15:22:13 compta rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="2356" x-info="http://www.rsyslog.com"] start
Sep 6 15:22:13 compta kernel: [ 0.000000] Linux version 4.14.59-64.43.amzn1.x86_64 (mockbuild@gobi-build-64010) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Thu Aug 2 21:29:33 UTC 2018
Sep 6 15:22:13 compta kernel: [ 0.000000] Command line: root=LABEL=/ console=tty1 console=ttyS0 selinux=0 LANG=en_US.UTF-8 KEYTABLE=us
Sep 6 15:22:13 compta kernel: [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
15:22 is when I restarted the server.
Just realised something: the eth0 lease usually renews about every minute, but the renewals stop once the server stops responding.
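If the last lease renewal really marks the moment of the freeze, scanning the messages log for unusually long gaps between renewals could pin down when each past freeze began. A rough sketch, assuming the default syslog timestamp format shown above (which has no year, so it has to be passed in) and an English locale for the month names; the function names and the threshold are my own choices:

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp of a log line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(
        "{} {} {} {}".format(year, month, day, clock), "%Y %b %d %H:%M:%S"
    )


def find_gaps(lines, needle, max_gap_seconds, year):
    """Return (previous, current) timestamp pairs where consecutive lines
    containing `needle` are more than `max_gap_seconds` apart."""
    gaps = []
    previous = None
    for line in lines:
        if needle not in line:
            continue
        current = parse_stamp(line, year)
        if previous is not None:
            if (current - previous).total_seconds() > max_gap_seconds:
                gaps.append((previous, current))
        previous = current
    return gaps
```

For example, `find_gaps(open("/var/log/messages"), "Renewing lease on eth0", 180, 2018)` would list every stretch longer than three minutes without a renewal. The first timestamp of each pair is the last renewal before the freeze, and the second is the first renewal after the restart.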