getmail does not keep track of downloaded emails if the process is killed

0

I am using getmail to archive my Gmail account's inbox. Every once in a while the process hangs, probably because of a bad connection to Gmail's IMAP servers, and the only way I have found to recover is to kill it with CTRL+C and start it again. Sometimes, after being killed, getmail has not updated the oldmail file, which it reads on startup to determine which emails were already downloaded. As a result, the next run downloads those messages again, which takes more time (and gives it another chance to hang) and bloats the mbox file that stores the backup.

After poking around, it appears that getmail only updates the oldmail file when it completes, so if it is killed unexpectedly, the record of which emails have already been downloaded is lost. Is there a way to force getmail to update the oldmail file as it goes, rather than only at the end of the run?

Jason

Posted 2014-04-21T10:06:10.723

Reputation: 255

Answers

1

Regarding getmail

The getmail FAQ suggests that this behaviour is, to some degree, known:

Use the max_messages_per_session option to limit the number of messages getmail will process in a single session. Some users with flaky servers use this option to reduce the chances of seeing messages more than once if the server dies in mid-session.
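As a rough illustration of that option, the [options] section of a getmailrc file might look like the fragment below. The value 100 is an arbitrary assumption, not a figure from the FAQ, and read_all = false is included on the assumption that only messages getmail has not seen before should be fetched.

```ini
[options]
# Stop each session after at most this many messages (100 is an arbitrary example value).
max_messages_per_session = 100
# Only retrieve messages that getmail has not already recorded as seen.
read_all = false
```

Since each session then ends normally after a bounded number of messages, getmail should get the chance to save its oldmail state between batches, so a hang would cost at most one batch. The trade-off is that getmail has to be run repeatedly (by hand or from cron, for example) until the whole inbox has been fetched.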

Possible workaround

A possible Gmail-specific workaround, which removes the need for the oldmail file entirely, is to introduce a kind of "archive bit":

  • Create a custom label, e.g. one named archive
  • Create a filter that applies this label to every new message (received or sent), using a search such as:
    • (from:(you@gmail.com) OR to:(you@gmail.com))
  • Using an IMAP retriever, fetch mail only from this label's folder (mailboxes option)
  • Set the delete option in the config (in Gmail, deleting a message from a label's folder only removes the label)
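Put together, the steps above might translate into a getmailrc along these lines. This is only a sketch: the username, password, and mbox path are placeholders, the label name archive follows the example above, and the choice of an Mboxrd destination is an assumption based on the mbox backup mentioned in the question.

```ini
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
# Placeholder credentials; replace with your own.
username = you@gmail.com
password = your-password
# Fetch only from the folder that corresponds to the "archive" label.
mailboxes = ("archive",)

[destination]
type = Mboxrd
# Placeholder path; the mbox file has to exist before getmail is run.
path = ~/gmail-backup.mbox

[options]
# Delete each message from the label's folder once it has been retrieved.
delete = true
```

With this setup the label itself acts as the record of what still needs to be fetched, so a killed session should matter much less. It is worth checking on a few test messages that, with your Gmail IMAP settings, deleting from the label's folder really does leave the message in place elsewhere.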

toubsen

Posted 2014-04-21T10:06:10.723

Reputation: 31