I have an Ubuntu 12.04 server, hosted on AWS, that powers a dozen WordPress installations using Nginx and php5-fpm.
The problem just began one day. My server was down. Panic! Whenever I turned it on again, after a few minutes it always died again. Postfix was taking 100% of CPU until the server ran out of breath. I was shocked to find out something is sending emails at an inhumane rate.
This is my postfix configuration (/etc/postfix/main.cf
):
smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
biff = no
# appending .domain is the MUA's job.
append_dot_mydomain = no
# Uncomment the next line to generate "delayed mail" warnings
#delay_warning_time = 4h
readme_directory = no
# TLS parameters
smtpd_tls_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file=/etc/ssl/private/ssl-cert-snakeoil.key
smtpd_use_tls=yes
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
# See /usr/share/doc/postfix/TLS_README.gz in the postfix-doc package for
# information on enabling SSL in the smtp client.
myhostname = ip-172-31-3-94.eu-west-1.compute.internal alias_maps = hash:/etc/aliases alias_database = hash:/etc/aliases myorigin = /etc/mailname mydestination = creatroopers.com., ip-172-31-3-94.eu-west-1.compute.internal, localhost.eu-west-1.compute.internal, localhost relayhost = mynetworks
= 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 mailbox_size_limit = 0 recipient_delimiter = + inet_interfaces = all
Here's a one second worth of logs from /var/log/mail.log
(creatroopers.com is my server):
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 0092B47C77: message-id= <20140317111547.0092B47C77@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 0092B47C77: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12531]: connect to mailin-02.mx.aol.com[64.12.88.164]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: connect to usamail.com[209.15.13.134]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: A9E6947A48: to=<fredaklanro@grammail.com>, relay=none, delay=151, delays=121/0.51/30/0, dsn=4.4.1, status=deferred (connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: F223B47A8D: to=<frbbbjx@usamail.com>, relay=none, delay=149, delays=119/0.55/30/0, dsn=4.4.1, status=deferred (connect to usamail.com[209.15.13.134]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: connect to sakha.net[146.185.211.229]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: D4CDE47A63: to=<fred2@sakha.net>, relay=none, delay=148, delays=118/0.6/30/0, dsn=4.4.1, status=deferred (connect to sakha.net[146.185.211.229]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12543]: connect to smtp-in.sto-hy.se.stejtech.net[79.136.102.100]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 4862B48363: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 4862B48363: message-id=<20140317111547.4862B48363@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12544]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: connect to ebmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: connect to searchmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 4862B48363: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5CAC44829F: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 5CAC44829F: message-id=OrJkZNj-UTgCbNO-oX@creatopix.com
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: 2F1FD47A5F: to=<fredamacclip@searchmail.com>, relay=none, delay=150, delays=119/0.63/30/0, dsn=4.4.1, status=deferred (connect to searchmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: B5E6747AE6: to=<fredadketa@ebmail.com>, relay=none, delay=153, delays=122/0.5/30/0, dsn=4.4.1, status=deferred (connect to ebmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5FDC647D02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 5FDC647D02: message-id=<20140317111547.5FDC647D02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5FDC647D02: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5CAC44829F: from=<www-data@creatroopers.com>, size=2108, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12548]: connect to mx01.gmx.com[74.208.5.27]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12547]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: connect to relay.verizon.net[206.46.232.11]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12561]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: 8F26947A53: to=<frederikahallam9933@verizon.net>, relay=none, delay=149, delays=118/0.79/30/0, dsn=4.4.1, status=deferred (connect to relay.verizon.net[206.46.232.11]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to ASPMX.L.GOOGLE.COM[2a00:1450:400c:c05::1b]:25: Network is unreachable
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8206C4811D: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8206C4811D: message-id=<20140317111547.8206C4811D@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8206C4811D: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8BAFB47E02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 8BAFB47E02: message-id=<20140317111547.8BAFB47E02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8E05B48185: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8E05B48185: message-id=<20140317111547.8E05B48185@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8E05B48185: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8BAFB47E02: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 905C1480E6: uid=33 from=<www-data>
See? It's horrible. If I reinitialize the server and run service postfix stop
, in the log I get Mar 18 09:49:10 ip-172-31-3-94 postfix/master[25358]: terminating on signal 15
and the server becomes healthy again. But somehow every morning at 8:55 postfix wakes up on its own and shuts down the server again.
I see that the user that initiated all of these operations is "www-data" so running ps auxww|grep ^www-data
gives us:
www-data 870 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 871 0.0 0.1 78208 3356 ? S 09:41 0:03 nginx: worker process
www-data 872 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 875 0.0 0.1 78208 3356 ? S 09:41 0:02 nginx: worker process
www-data 8066 0.2 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8343 1.0 0.0 25088 1408 ? D 16:10 0:00 /usr/sbin/postdrop -r
www-data 8380 1.2 0.0 25096 1464 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8616 0.0 0.2 263332 4924 ? S 16:10 0:00 php-fpm: pool www
www-data 8617 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8638 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8640 0.0 0.0 25096 1464 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8649 2.0 0.0 18816 1044 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8654 2.0 0.0 20920 1320 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8664 2.0 0.0 2632 156 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8673 2.0 0.0 25092 1404 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8681 2.0 0.0 18696 520 ? R 16:10 0:00 /usr/sbin/postdrop -r
www-data 8682 0.0 0.0 25096 1468 ? S 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8713 1.0 0.0 18704 520 ? R 16:10 0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data 8720 0.0 0.0 25088 1408 ? S 16:10 0:00 /usr/sbin/postdrop -r
www-data 19310 0.5 1.6 271260 28300 ? S 16:05 0:01 php-fpm: pool www
www-data 22166 0.5 1.7 272196 30404 ? S 16:05 0:01 php-fpm: pool www
www-data 22240 0.6 1.1 270468 18880 ? S 16:05 0:01 php-fpm: pool www
When I run postdrop -d ALL
, after 2 minutes it prints "Killed" on the screen, logs out of the system and forces me to restart the server.
Needless to say, I can't send email through any of my websites because, even in the 10 minutes where postfix is running, the email queue is so full nothing can be added to it.
I need help pinpointing the cause of this endless stream of spam.
Thank you!