
I'm in the cur folder of a maildir store. I want to cat a message, pipe it to a command, and have the body of the message spit out. Simple.

Example: if it's a MIME message and there is a plaintext version, show me the plaintext. If it's an HTML message with no plaintext, render the HTML and give me some semblance of the message text. If it's just an image, display nothing, or maybe an [image] placeholder.

Why do I want this? I'm trying to train SpamAssassin, and I want to print key headers and an excerpt from the email body so I can quickly skim through all the messages and decide which ones are legit (ham) and which are spam. I am already extracting a list of messages from the maildir that match a given X-Spam score and displaying the headers I want. I just need to append the body of the message, but I've hit a roadblock.

Some other questions here suggested using mutt. I installed it and looked at it, but from what I could see I'd have to point it at the specific maildir, which is going to complicate the process. Ideally I'd like something that just "interprets" an email message from a file and displays it.

Your help is appreciated. Thank You

DHW

1 Answer


I've managed to come up with the following script, but it's still a bit lacking. I was still refining it when I noticed Andrew suggested munpack from the mpack package.

I found the tool reformime to extract the text/plain portion of the MIME message. I was using GNU recode too, but found that it was stripping stuff that wasn't quoted-printable (QP), so I elected to use sed, probably quite inefficiently, to remove the QP codes and substitute common characters that were QP-escaped.

Here's the script I came up with. I can now go into a maildir folder, run the script, and get a summary of messages. Supplying an argument will match specific scores using a regexp.
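As a side note on the QP handling: instead of substituting individual escapes one at a time, a small generic decoder can be sketched in bash. This is only a sketch under assumptions: the name qp_decode is made up for illustration, it reads QP text on stdin, joins soft line breaks (a trailing "="), and turns every =XX into the corresponding byte. It does no charset conversion, so the output is whatever bytes the message contained.

```shell
#!/bin/bash
# qp_decode: hypothetical helper, not part of reformime or recode.
# Reads quoted-printable text on stdin and writes the decoded bytes to stdout.
qp_decode() {
    local line nl
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%$'\r'}              # tolerate CRLF line endings
        if [[ $line == *= ]]; then
            line=${line%=}              # soft line break: drop '=' and join lines
            nl=''
        else
            nl='\n'
        fi
        line=${line//\\/\\\\}           # protect literal backslashes from printf %b
        # Rewrite =XX as \xXX so that printf %b emits the raw byte
        line=$(printf '%s' "$line" | sed -E 's/=([0-9A-Fa-f]{2})/\\x\1/g')
        printf '%b' "${line}${nl}"
    done
}
```

It could be used on a part that is still QP-encoded, for example `qp_decode < part.txt`, where part.txt is a placeholder file name.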

#!/bin/bash

DEFSC="3[0-9]"
SPAMSCORE=${1-$DEFSC}

echo "Scanning for messages with a Spam Score filter of ${SPAMSCORE}"

# Get a list of messages with the desired spam score.
# grep -l prints only the matching file names, so no sed cleanup is needed.
TMPFILE=$(mktemp) || exit 1
grep -l "^X-Spam-Score: ${SPAMSCORE}\$" * > "${TMPFILE}"

while IFS= read -r MSG; do
    # Extract Message ID (leading digits of the file name) for easy reading
    MSGID=$(echo "${MSG}" | grep -oe '^[0-9]*')
    echo "================= ${MSGID} ================="
    # Find the headers that we are looking for and undo the most common
    # encoded-word / QP escapes
    grep -e "^X-Spam-Status" -e "^Subject:" -e "^From:" "${MSG}" | sed -r 's/=\?[^?]*\?[^?]*\?([^?]*)\?=/\1/g;s/=20/ /g;s/=2C/,/g;s/=3A/:/g'
    # Use reformime to find which MIME section is text/plain
    MIMESEC=$(~/reformime -i < "${MSG}" | grep -B 1 '^content-type: text/plain' | head -n 1 | grep -oe "[0-9\.]*$")
    echo '- - - - - - - - - - - - - - - - - - - - - - '
    if [ -n "${MIMESEC}" ]; then
        # Display the first 10 non-empty lines of that MIME section, with URLs masked
        ~/reformime -e -s "${MIMESEC}" < "${MSG}" | awk '/./{a=a+1;if(a<=10){print $0;}}' | sed -r 's/https?:\/\/[A-Za-z0-9.?%+_@&;=\/-]*/<<url>>/g'
    else
        echo '[no text/plain part]'
    fi
    echo '============================================'
done < "${TMPFILE}"

# Delete Temp File
rm -f "${TMPFILE}"

As an example:

skim_msgs.sh '4[0-9]'

OUTPUT: (finds one message)

Scanning for messages with a Spam Score filter of 4[0-9]
================= 1518851309 ================= 
From: John Doe <jdoe@gmail.com> 
Subject: Watch "If Cops Talked Like Pilots" on YouTube 
X-Spam-Status: No, score=4.1
- - - - - - - - - - - - - - - - - - - - - - 
<<url>>
============================================
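The script only looks for a text/plain part, so an HTML-only message yields no section at all. One possible refinement, as a rough sketch: pick the first text/plain section and fall back to the first text/html section. The helper name pick_text_section is made up for illustration, and it assumes the "section:" / "content-type:" line format that the grep in the script already relies on.

```shell
#!/bin/bash
# pick_text_section: hypothetical helper.
# Reads `reformime -i` style output on stdin and prints the section number of
# the first text/plain part, else the first text/html part, else nothing.
pick_text_section() {
    awk '
        /^section: /      { sec = $2 }
        /^content-type: / {
            type = tolower($2)
            sub(/;.*/, "", type)        # drop any trailing parameters
            if (type == "text/plain" && plain == "") plain = sec
            if (type == "text/html"  && html  == "") html  = sec
        }
        END {
            if (plain != "")     print plain
            else if (html != "") print html
        }
    '
}

# Possible use inside the loop (assumes reformime is installed at ~/reformime):
#   MIMESEC=$(~/reformime -i < "${MSG}" | pick_text_section)
```

When the chosen section is HTML, the extracted part would still need to go through something that renders HTML to text (a text-mode browser, for instance) before the excerpt step; an empty result could be shown as an [image] or [no text] placeholder.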
DHW