-1

I have an Ubuntu 12.04 server, hosted on AWS, that powers a dozen WordPress installations using Nginx and php5-fpm.

The problem just began one day. My server was down. Panic! Whenever I turned it on again, after a few minutes it always died again. Postfix was taking 100% of CPU until the server ran out of breath. I was shocked to find out something is sending emails at an inhumane rate.

This is my postfix configuration (/etc/postfix/main.cf):

smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu)
biff = no

# appending .domain is the MUA's job.
append_dot_mydomain = no

# Uncomment the next line to generate "delayed mail" warnings
#delay_warning_time = 4h

readme_directory = no

# TLS parameters
smtpd_tls_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
smtpd_tls_key_file=/etc/ssl/private/ssl-cert-snakeoil.key
smtpd_use_tls=yes
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache

# See /usr/share/doc/postfix/TLS_README.gz in the postfix-doc package for
# information on enabling SSL in the smtp client.

myhostname = ip-172-31-3-94.eu-west-1.compute.internal alias_maps = hash:/etc/aliases alias_database = hash:/etc/aliases myorigin = /etc/mailname mydestination = creatroopers.com., ip-172-31-3-94.eu-west-1.compute.internal, localhost.eu-west-1.compute.internal, localhost relayhost = mynetworks
= 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 mailbox_size_limit = 0 recipient_delimiter = + inet_interfaces = all

Here's a one second worth of logs from /var/log/mail.log (creatroopers.com is my server):

Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 0092B47C77: message-id=    <20140317111547.0092B47C77@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 0092B47C77: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12531]: connect to mailin-02.mx.aol.com[64.12.88.164]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: connect to usamail.com[209.15.13.134]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: A9E6947A48: to=<fredaklanro@grammail.com>, relay=none, delay=151, delays=121/0.51/30/0, dsn=4.4.1, status=deferred (connect to mx.hover.com.cust.hostedemail.com[216.40.42.4]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12535]: F223B47A8D: to=<frbbbjx@usamail.com>, relay=none, delay=149, delays=119/0.55/30/0, dsn=4.4.1, status=deferred (connect to usamail.com[209.15.13.134]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: connect to sakha.net[146.185.211.229]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12545]: D4CDE47A63: to=<fred2@sakha.net>, relay=none, delay=148, delays=118/0.6/30/0, dsn=4.4.1, status=deferred (connect to sakha.net[146.185.211.229]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12543]: connect to smtp-in.sto-hy.se.stejtech.net[79.136.102.100]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 4862B48363: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 4862B48363: message-id=<20140317111547.4862B48363@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12544]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: connect to ebmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: connect to searchmail.com[72.52.4.90]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 4862B48363: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5CAC44829F: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 5CAC44829F: message-id=OrJkZNj-UTgCbNO-oX@creatopix.com
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12546]: 2F1FD47A5F: to=<fredamacclip@searchmail.com>, relay=none, delay=150, delays=119/0.63/30/0, dsn=4.4.1, status=deferred (connect to searchmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12533]: B5E6747AE6: to=<fredadketa@ebmail.com>, relay=none, delay=153, delays=122/0.5/30/0, dsn=4.4.1, status=deferred (connect to ebmail.com[72.52.4.90]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 5FDC647D02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 5FDC647D02: message-id=<20140317111547.5FDC647D02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5FDC647D02: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 5CAC44829F: from=<www-data@creatroopers.com>, size=2108, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12548]: connect to mx01.gmx.com[74.208.5.27]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12547]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: connect to relay.verizon.net[206.46.232.11]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12561]: connect to mxs.mail.ru[94.100.180.150]:25: Connection timed out
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12549]: 8F26947A53: to=<frederikahallam9933@verizon.net>, relay=none, delay=149, delays=118/0.79/30/0, dsn=4.4.1, status=deferred (connect to relay.verizon.net[206.46.232.11]:25: Connection timed out)
Mar 17 11:15:47 ip-172-31-3-94 postfix/smtp[12534]: connect to ASPMX.L.GOOGLE.COM[2a00:1450:400c:c05::1b]:25: Network is unreachable
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8206C4811D: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8206C4811D: message-id=<20140317111547.8206C4811D@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8206C4811D: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8BAFB47E02: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[15130]: 8BAFB47E02: message-id=<20140317111547.8BAFB47E02@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 8E05B48185: uid=33 from=<www-data>
Mar 17 11:15:47 ip-172-31-3-94 postfix/cleanup[14534]: 8E05B48185: message-id=<20140317111547.8E05B48185@ip-172-31-3-94.eu-west-1.compute.internal>
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8E05B48185: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/qmgr[12501]: 8BAFB47E02: from=<www-data@creatroopers.com>, size=636, nrcpt=1 (queue active)
Mar 17 11:15:47 ip-172-31-3-94 postfix/pickup[12500]: 905C1480E6: uid=33 from=<www-data>

See? It's horrible. If I reinitialize the server and run service postfix stop, in the log I get Mar 18 09:49:10 ip-172-31-3-94 postfix/master[25358]: terminating on signal 15 and the server becomes healthy again. But somehow every morning at 8:55 postfix wakes up on its own and shuts down the server again.

I see that the user that initiated all of these operations is "www-data" so running ps auxww|grep ^www-data gives us:

www-data   870  0.0  0.1  78208  3356 ?        S    09:41   0:02 nginx: worker process
www-data   871  0.0  0.1  78208  3356 ?        S    09:41   0:03 nginx: worker process
www-data   872  0.0  0.1  78208  3356 ?        S    09:41   0:02 nginx: worker process
www-data   875  0.0  0.1  78208  3356 ?        S    09:41   0:02 nginx: worker process
www-data  8066  0.2  0.0  25096  1464 ?        S    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8343  1.0  0.0  25088  1408 ?        D    16:10   0:00 /usr/sbin/postdrop -r
www-data  8380  1.2  0.0  25096  1464 ?        R    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8616  0.0  0.2 263332  4924 ?        S    16:10   0:00 php-fpm: pool www                                                       
www-data  8617  0.0  0.0  25096  1464 ?        S    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8638  0.0  0.0  25096  1464 ?        S    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8640  0.0  0.0  25096  1464 ?        S    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8649  2.0  0.0  18816  1044 ?        R    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8654  2.0  0.0  20920  1320 ?        R    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8664  2.0  0.0   2632   156 ?        R    16:10   0:00 /usr/sbin/postdrop -r
www-data  8673  2.0  0.0  25092  1404 ?        R    16:10   0:00 /usr/sbin/postdrop -r
www-data  8681  2.0  0.0  18696   520 ?        R    16:10   0:00 /usr/sbin/postdrop -r
www-data  8682  0.0  0.0  25096  1468 ?        S    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8713  1.0  0.0  18704   520 ?        R    16:10   0:00 /usr/sbin/sendmail -i -FCronDaemon -oem www-data
www-data  8720  0.0  0.0  25088  1408 ?        S    16:10   0:00 /usr/sbin/postdrop -r
www-data 19310  0.5  1.6 271260 28300 ?        S    16:05   0:01 php-fpm: pool www                                                       
www-data 22166  0.5  1.7 272196 30404 ?        S    16:05   0:01 php-fpm: pool www                                                       
www-data 22240  0.6  1.1 270468 18880 ?        S    16:05   0:01 php-fpm: pool www    

When I run postdrop -d ALL, after 2 minutes it prints "Killed" on the screen, logs out of the system and forces me to restart the server.

Needless to say, I can't send email through any of my websites because, even in the 10 minutes where postfix is running, the email queue is so full nothing can be added to it.

I need help pinpointing the cause of this endless stream of spam.

Thank you!

AnFi
  • 5,883
  • 1
  • 12
  • 26
Ore4444
  • 127
  • 1
  • 5
  • This process `www-data@creatroopers.com`. – NickW Mar 18 '14 at 12:28
  • 1
    Might worth reading http://serverfault.com/questions/419407/fighting-spam-what-can-i-do-as-an-email-administrator-domain-owner-or-user –  Mar 18 '14 at 12:31
  • What else is running as user `www-data` on your server? – MadHatter Mar 18 '14 at 12:40
  • @MadHatter, how do I find that out? – Ore4444 Mar 18 '14 at 12:43
  • `ps auxww|grep ^www-data`? You'll want to cut-and-paste the output into your question; don't try to fit it in a comment. – MadHatter Mar 18 '14 at 12:49
  • 1
    You either have an user account compromised or your postfix is configured to relay email. – Havenard Mar 18 '14 at 15:14
  • I'm not expert in postfix but the presence of the name `www-data` suggests that some of your websites is sending this spam, probably the spammer is using some sort of 'sendmail.php' to relay messages through your server. Check your HTTP server logs after suspecious activities. – Havenard Mar 18 '14 at 15:19
  • I agree with Havenard, at least in terms of what's most likely. Out of curiosity, if you stop the web server does the onslaught of sent mail stop (or very significantly slow)? – MadHatter Mar 18 '14 at 17:22
  • Turn off sending (and receiving) mail, clean out the outbound mail queue. A message with thousands of recipients might have snuck in there. Then start hunting for the origin of the junk. – vonbrand Mar 18 '14 at 17:31

2 Answers2

2

The answer was much simpler than I expected. Somebody hacked into the server and added 5000 identical lines to the www-data user's crontab, executing sendmail.

All I have to do was delete them.

Pikkaachu's answer also helped. Mails stopped being sent, but CPU was still at 100%.

Ore4444
  • 127
  • 1
  • 5
  • 1
    +1 for answering your own question, but -1 for the question itself : Duplicate of generic questions, even if you asked the question really well. –  Mar 19 '14 at 09:33
  • If you're server's been comprimised, the odds are that they also did something so that they can get back in later. I'd suggest blowing that server out and re-installing, as mentioned in the ServerFault question that I linked to in the question comments. – Magellan Mar 19 '14 at 19:08
1

Sounds like you have an open relay.

Put this line in your main.cf

smtpd_client_restrictions = permit_mynetworks, reject

This will stop your postfix been an open relay.

Also check that smtpd_recipient_restrictions is not permit

Hope this helps.

Also, Update all your wordpress installations, get some security plugins to monitor their behaviour, wordpress is easily compromised if not kept up to date.

Pikkaachu
  • 46
  • 4