I just found a huge number of emails under my user account on CentOS, in /home/user/mail/new.

I opened some of them and noticed they were sent by a particular cron job. The From address is root@hostname.com. I need to find out whether all of these emails were sent by root@hostname.com. There is about 30 GB of email. Is there a way to grep out a list of unique From addresses? Each email looks something like this:

Return-path:
Envelope-to: user@hostname.com
Delivery-date: Thu, 11 Aug 2011 04:34:02 -0400
Received: from user by hostname.com with local (Exim 4.69)
    (envelope-from )
    id 1QrQiI-0004qM-6V
    for user@hostname.com; Thu, 11 Aug 2011 04:34:02 -0400
From: root@hostname.com (Cron Daemon)
To: user@hostname.com
Subject: Cron /opt/gsn/reports/pr.sh
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Message-Id:
Date: Thu, 11 Aug 2011 04:34:02 -0400

[MESSAGE CONTENT]

gAMBOOKa

3 Answers

$ grep -E '^From:' /some/file | sort -u
quanta
dmourati

To get the total number of emails, run:

grep -rh '^From:' /home/user/mail/new | wc -l

To get the number of emails from root, run:

grep -rh '^From:' /home/user/mail/new | grep -F 'root@hostname.com' | wc -l

Then (total emails) - (emails from root) = the number of emails that were not sent by root.
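The subtraction can also be done in a single pass over the mail directory. The following is only a sketch: `count_non_root` is a hypothetical helper name, and it assumes each message contains exactly one line starting with `From:` (a body line that happens to start with `From:` would be counted too).

```shell
# Hypothetical helper: print total, matching, and non-matching From: counts.
# Assumes one line starting with "From:" per message file.
count_non_root() {
    maildir="$1"   # directory holding one message per file
    sender="$2"    # address to look for, e.g. root@hostname.com
    # -r recurses into the directory, -h suppresses file name prefixes
    grep -rh '^From:' "$maildir" | awk -v sender="$sender" '
        { total++ }                          # every From: line
        index($0, sender) { from_sender++ }  # plain substring match, no regex
        END {
            printf "total=%d sender=%d other=%d\n",
                   total, from_sender, total - from_sender
        }
    '
}

# Usage: count_non_root /home/user/mail/new root@hostname.com
```

If `other` comes out as 0, every message in the directory carries the given address in its From: line.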

Farhan

Try this:

awk '/^From: / { print $2 }' /home/user/mail/new | sort | uniq -c | sort -rn

It's not one file; each email is about 20 KB, and together they amount to 30 GB.

awk '/^From: / { print $2 }' /home/user/mail/new/* | sort | uniq -c | sort -rn
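With about 30 GB of roughly 20 KB files, a shell glob may expand to more arguments than the system allows ("Argument list too long"). A possible variant, sketched below, lets `find` batch the file names and only looks at the header block of each message, so a body line starting with `From:` is not counted. `count_from_addresses` is a hypothetical helper name, and the header/body split assumes the usual blank line after the headers.

```shell
# Hypothetical helper: unique From: addresses with counts, most frequent first.
count_from_addresses() {
    maildir="$1"   # directory holding one message per file
    # find ... -exec awk ... {} + passes file names to awk in batches,
    # which avoids the argument-list limit a shell glob can hit.
    find "$maildir" -type f -exec awk '
        FNR == 1 { in_header = 1 }              # a new file starts in the headers
        /^$/     { in_header = 0 }              # first blank line ends the headers
        in_header && /^From: / { print $2 }     # address is the second field
    ' {} + | sort | uniq -c | sort -rn
}

# Usage: count_from_addresses /home/user/mail/new
```

If only root@hostname.com appears in the output, all messages came from the cron job's account. On gawk, a `nextfile` statement at the blank line would skip the rest of each message instead of reading it, which may matter at this volume.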
quanta