Process files in a folder that haven't previously been processed

6

2

I have a series of files in a directory that I need to run an action on from a script. Once the action is done, I want to log that the file has been processed, so that the next time the script runs it does not attempt the action again.

So let's say I can find all the files that should be processed like this:

for i in $(find /logfolder -name '20*.log') ; do
    process_log "$i"
    echo "$i" >> processedlogfiles
done

So I have a file containing the logs I have processed, and my goal would be to modify the for loop such that these processed logs are not processed a second time.

Doing a manual scan each time seems inefficient, particularly as the processedlogfiles gets bigger:

 if grep -iq "$i" processedlogfiles ; then continue; fi

It would be good if these files could be excluded when setting up the for loop.

Note that the OS in question is a Linux derivative, a management appliance, with a limited toolset (no attr command, for example), and installing additional utilities is not an option (it is technically possible, but not allowed). Most common bash shell commands are available though.

Also, the processed files must keep their names and locations. They can't be renamed or moved to reflect their processed status.

Paul

Posted 2012-07-11T01:01:26.297

Reputation: 52 173

Answers

1

Add | fgrep -vf processedlogfiles to your find command, so that any path already listed in processedlogfiles is filtered out before the loop sees it.
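
As a rough sketch of how that filter could fit into the loop from the question: the wrapper function process_new_logs and the stub process_log below are placeholders of mine, not part of the question. I have also written fgrep as grep -F and added -x so that only whole-line matches count, which is an assumption about what is wanted (it avoids a path being skipped because it is a substring of another). It assumes paths contain no newlines and that your grep treats an empty pattern file as matching nothing, as GNU grep does.

```shell
# Stand-in for the real action (hypothetical; the question does not show it).
process_log() { echo "processing $1"; }

# Process only the logs under directory $1 that are not yet listed in file $2.
process_new_logs() {
    local logdir=$1 processed=$2
    # grep -f fails if the list is missing, so make sure it exists.
    touch "$processed"
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    find "$logdir" -name '20*.log' |
        grep -Fxv -f "$processed" |
        while IFS= read -r i; do
            # Record the file only if the action succeeded.
            process_log "$i" && echo "$i" >> "$processed"
        done
}
```

With the names from the question this would be called as process_new_logs /logfolder processedlogfiles. The list is read once per run rather than once per file, which is the main difference from the grep inside the loop.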

johnshen64

Posted 2012-07-11T01:01:26.297

Reputation: 4 399

0

What about splitting things up by folder: new files in one and processed files in the other? The "processing" would then include moving each file.

Brian Adkins

Posted 2012-07-11T01:01:26.297

Reputation: 1 721

Thanks Brian, but unfortunately the files cannot be moved or renamed. I have updated the question – Paul – 2012-07-11T01:43:55.993