How can I find the oldest file in a directory tree

76

I'm looking for a shell one-liner to find the oldest file in a directory tree.

Marius Gedminas

Posted 2013-02-15T16:06:04.533

Reputation: 1 770

Answers

74

This works (updated to incorporate Daniel Andersson's suggestion):

find -type f -printf '%T+ %p\n' | sort | head -n 1
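The %T+ directive prints an ISO-style timestamp of the form YYYY-MM-DD+HH:MM:SS.NNNNNNNNNN, which is why a plain lexicographic sort happens to order the lines chronologically. A hypothetical line of output:

2013-02-15+16:06:04.5330000000 ./some/old/file.txt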

Marius Gedminas

Posted 2013-02-15T16:06:04.533

Reputation: 1 770

I get an empty line, because the first line from this find is empty: I have a filename that contains a newline. – 林果皞 – 2016-04-19T12:50:44.697

Can I ask if this uses the creation or modification date? – MrMesees – 2016-11-27T13:07:38.733

Linux doesn't store the file creation date anywhere[*]. This uses the modification date.

[*] This is actually not quite true: ext4 stores the inode creation date, but it's not exposed via any system call, and you need to use debugfs to see it. – Marius Gedminas – 2016-11-28T07:14:46.713
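For the curious, here is a sketch of peeking at that ext4 creation time with debugfs (assuming the filesystem lives on /dev/sda1, a hypothetical device; requires root):

sudo debugfs -R 'stat /path/inside/filesystem' /dev/sda1 | grep crtime

The path is relative to the root of that filesystem, and the crtime line holds the inode creation time. Newer kernels and coreutils may also expose it as a "Birth:" line in stat's output via the statx() system call.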

With the head limit set to 1 I get a useless answer, because a whole bunch of flatpak files have their modification date set to zero, so I'm getting tens of thousands of files dated Jan 1, 1970. Then a few thousand more dated 1980-01-01 from the Google Cloud SDK and Docker, thousands more dated 1985-10-26 from Node.js, then a bunch from over a decade ago from various git repos and Steam files (cloning source/build files takes the remote timestamps). My actual files only start around line 150k. – theferrit32 – 2020-01-03T22:13:23.340
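One way around such bogus timestamps is to ignore everything older than some plausible cut-off, for example with GNU find's -newermt (a sketch; adjust the date to whatever predates your real files):

find -type f -newermt '1990-01-01' -printf '%T+ %p\n' | sort | head -n 1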

Less typing: find -type f -printf '%T+ %p\n' | sort | head -1 – Daniel Andersson – 2013-02-15T19:37:05.793

14

This one's a little more portable because it doesn't rely on the GNU find extension -printf, so it works on BSD / OS X as well:

find . -type f -print0 | xargs -0 ls -ltr | head -n 1

The only downside here is that it's limited by the size of ARG_MAX (which should be irrelevant for most newer kernels). So, if the filenames returned add up to more than getconf ARG_MAX characters (262,144 on my system), xargs splits them across multiple ls invocations and you don't get the correct result. It's also not POSIX-compliant, because -print0 and xargs -0 aren't.

Some more solutions to this problem are outlined here: How can I find the latest (newest, earliest, oldest) file in a directory? – Greg's Wiki
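On BSD/macOS, an alternative sketch that sidesteps the multiple-invocation problem, because sort sees every line regardless of how many times stat gets run (this assumes the BSD stat(1), where %m is the epoch mtime and %N the file name):

find . -type f -exec stat -f '%m %N' {} + | sort -n | head -n 1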

slhck

Posted 2013-02-15T16:06:04.533

Reputation: 182 472

And an even more portable approach is find . -type f -exec ls -ltr {} + | head -n1. No need for xargs and -print0. This is supported in POSIX, unlike -print0.

– Ruslan – 2018-09-09T17:30:18.140

This works too, but it also emits an xargs: ls: terminated by signal 13 error as a side effect. I'm guessing that's SIGPIPE. I've no idea why I don't get a similar error when I pipe sort's output to head in my solution. – Marius Gedminas – 2013-02-15T16:29:03.707

Your version is also easier to type from memory. :-) – Marius Gedminas – 2013-02-15T16:29:33.380

Yes, that's a broken pipe. I don't get this with either the GNU or BSD versions of all those commands, but it's the head command that quits once it has read a line and thus "breaks" the pipe, I think. You don't get the error because sort doesn't seem to complain about it, but ls does in the other case. – slhck – 2013-02-15T16:32:16.473

This breaks if there are so many filenames that xargs needs to invoke ls more than once. In that case, the sorted outputs of those multiple invocations end up concatenated when they should be merged. – Nicole Hamilton – 2013-02-15T17:00:36.303

@Nicole You're right, and I was implying this with my hint to ARG_MAX, because this is the number of files that can be passed, e.g. 262144 on my OS X. (Maybe I should have been more explicit on this?) – slhck – 2013-02-15T17:09:22.163

I think this is worse than posting a script that assumes filenames never contain spaces. A lot of the time, those will work because the filenames don't have spaces. And when they fail, you get an error. But this is unlikely to work in real cases and failure will go undiscovered. On any directory tree big enough that you can't just ls it and eyeball the oldest file, your solution probably will overrun the command line length limit, causing ls to be invoked multiple times. You'll get the wrong answer but you'll never know. – Nicole Hamilton – 2013-02-15T17:15:18.017

Well, if in the real case there are fewer than ARG_MAX files, it works. If not, then not, and I explicitly mentioned this as a drawback when posting the answer. I made that point a little clearer though. (Also upvoted your comment.) The main reason I posted this is that -printf doesn't exist in non-GNU find, and therefore an alternative is needed. – slhck – 2013-02-15T17:22:56.707

If it even failed with an error, I would be okay with this. But it fails silently. That's unacceptable. – Nicole Hamilton – 2013-02-15T17:27:24.010

It's also worth clarifying that this fails if the sum of the lengths of the filenames in characters is greater than ARG_MAX, not if there are more than ARG_MAX files. – Nicole Hamilton – 2013-02-15T18:00:13.000

@Nicole You're right, I missed that. – slhck – 2013-02-15T19:21:05.383

I just found out that I never fully understood how xargs works. ARG_MAX is 2,097,152 on Ubuntu 12.10, by the way. – Dennis – 2013-02-16T04:12:09.920

On reflection, I don't actually want to seem harsh, so I'm undoing my downvote. But I do think it's important that software should never be designed to fail silently. – Nicole Hamilton – 2013-02-17T18:11:32.740

11

The following commands are guaranteed to work with any kind of strange file names:

find -type f -printf "%T+ %p\0" | sort -z | grep -zom 1 ".*" | cat

find -type f -printf "%T@ %T+ %p\0" | \
    sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //'

stat -c "%y %n" "$(find -type f -printf "%T@ %p\0" | \
    sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //')"

Using a null byte (\0) instead of a linefeed character (\n) makes sure the output of find will still be understandable in case one of the file names contains a linefeed character.

The -z switch makes both sort and grep interpret only null bytes as end-of-line characters. Since there's no such switch for head, we use grep -m 1 instead (only one occurrence).
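Note: GNU coreutils 8.25 and later do add a -z switch to head, so on a newer system the first pipeline could be shortened to the sketch below (the trailing tr turns the NUL back into a newline for display); the grep form has the advantage of also working on older systems:

find -type f -printf "%T+ %p\0" | sort -z | head -zn 1 | tr '\0' '\n'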

The commands are ordered by execution time (measured on my machine).

  • The first command will be the slowest since it has to convert every file's mtime into a human readable format first and then sort those strings. Piping to cat avoids coloring the output.

  • The second command is slightly faster. While it still performs the date conversion, numerically sorting (sort -n) the seconds elapsed since Unix epoch is a little quicker. sed deletes the seconds since Unix epoch.

  • The last command does no conversion at all and should be significantly faster than the first two. The find command itself will not display the mtime of the oldest file, so stat is needed.
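To reproduce the comparison on your own tree, something along these lines works (a sketch; output is discarded so printing doesn't skew the numbers):

time sh -c 'find -type f -printf "%T@ %p\0" | sort -nz | grep -zom 1 ".*" > /dev/null'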

Related man pages: find, grep, sed, sort, stat

Dennis

Posted 2013-02-15T16:06:04.533

Reputation: 42 934

5

Although the accepted answer and others here do the job, if you have a very large tree, all of them will sort the entire list of files.

It would be better if we could just list them and keep track of the oldest, with no need to sort at all.

That's why I came up with this alternative solution:

ls -lRU $PWD/* | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($1,0,1)=="/") { pat=substr($1,0,length($0)-1)"/"; }; if( $6 != "") {if ( $6 < oldd ) { oldd=$6; oldf=pat$8; }; print $6, pat$8; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

I hope it might be of some help, even though the question is a bit old.


Edit 1: these changes allow parsing files and directories with spaces. It is fast enough to run on the root / and find the oldest file ever.

ls -lRU --time-style=long-iso "$PWD"/* | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($0,0,1)=="/") { pat=substr($0,0,length($0)-1)"/"; $6="" }; if( $6 ~ /^[0-9]+$/) {if ( $6 < oldd ) { oldd=$6; oldf=$8; for(i=9; i<=NF; i++) oldf=oldf " " $i; oldf=pat oldf; }; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

Command explained:

  • ls -lRU --time-style=long-iso "$PWD"/* lists all files (*), in long format (l), recursively (R), without sorting (U) to be fast, and pipes the result to awk
  • awk then BEGINs by zeroing the counter (optional for this question) and setting the oldest date oldd to today, in YearMonthDay format.
  • The main loop first
    • Grabs the 6th field, the date, in Year-Month-Day format, and changes it to YearMonthDay (if your ls doesn't output dates this way, you may need to fine-tune it).
    • Because of the recursive listing, there is a header line for each directory, of the form /directory/here:. That line is grabbed into the pat variable (substituting the trailing ":" with a "/"), and $6 is set to empty to avoid treating the header line as a valid file line.
    • If field $6 holds a valid number, it's a date, and it is compared with the oldest date oldd.
    • Is it older? Then the new values for the oldest date oldd and the oldest filename oldf are saved. Note that oldf is not just the 8th field, but everything from the 8th field to the end; hence the loop concatenating fields 8 through NF.
    • The counter count advances by one.
  • END by printing the result

Running it:

~$ time ls -lRU "$PWD"/* | awk etc.

Oldest date: 19691231

File: /home/.../.../backupold/.../EXAMPLES/how-to-program.txt

Total compared: 111438

real 0m1.135s

user 0m0.872s

sys 0m0.760s


Edit 2: Same concept, a better solution using find to look at the access time (use %T instead of %A in the printf for modification time, or %C for status change):

find . -wholename "*" -type f -printf "%AY%Am%Ad %h/%f\n" | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'

Edit 3: The command below uses the modification time and also prints incremental progress as it finds older and older files, which is useful when you have some incorrect timestamps (like 1970-01-01):

find . -wholename "*" -type f -printf "%TY%Tm%Td %h/%f\n" | awk 'BEGIN {count=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; print oldd " " oldf; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
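For comparison, the same keep-the-minimum idea can be written more compactly with the raw epoch seconds, so no date formatting is needed at all (a sketch; like every newline-based pipeline here, it mishandles filenames that contain newlines):

find . -type f -printf '%T@ %p\n' | awk 'NR==1 || $1 < min { min=$1; line=$0 } END { sub(/^[^ ]+ /, "", line); print line }'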

Dr Beco

Posted 2013-02-15T16:06:04.533

Reputation: 1 277

It still needs tweaking to accept files with spaces. I'll do that soon. – Dr Beco – 2015-06-19T15:50:23.880

I think parsing ls for files with spaces isn't a good idea. Maybe using find. – Dr Beco – 2015-06-19T16:35:50.777

Just ran it on the entire tree "/". Time spent: Total compared: 585744; real 2m14.017s, user 0m8.181s, sys 0m8.473s – Dr Beco – 2015-06-19T18:26:37.937

Using ls is bad for scripting, as its output is not meant for machines and its formatting varies across implementations. As you already stated, find is good for scripting, but it might also be good to add that info before presenting the ls solutions. – Sampo Sarrala - codidact.org – 2016-01-06T12:34:24.653

4

Just use ls - the man page tells you how to order the directory listing.

ls -cltr | head -n 2

The -n 2 is there so you don't get the "total" line in the output. If you only want the name of the file:

ls -tr | head -n 1

And if you need the reverse order (getting the newest file):

ls -t | head -n 1

Much easier than using find, much faster, and more robust - you don't have to worry about file naming formats. It should work on nearly all systems too.

user1363990

Posted 2013-02-15T16:06:04.533

Reputation: 165

This works only if the files are in a single directory, while my question was about a directory tree. – Marius Gedminas – 2014-09-02T06:02:36.953

2

find ! -type d -printf "%T@ %p\n" | sort -n | head -n1
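If you also want to see how old the file is without running a second command, a variant that keeps a human-readable timestamp next to the epoch seconds used for sorting (a sketch):

find ! -type d -printf '%T@ %T+ %p\n' | sort -n | head -n 1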

Okki

Posted 2013-02-15T16:06:04.533

Reputation: 21

This won't work properly if there are files older than 9 Sep 2001 (1000000000 seconds since Unix epoch). To enable numeric sorting, use sort -n. – Dennis – 2013-02-16T03:11:55.260

This helps find me the file, but it's hard to see how old it is without running a second command :) – Marius Gedminas – 2013-02-16T09:38:00.087

0

It seems that by "oldest" most people have assumed you meant "oldest modification time". That's probably correct, according to the strictest interpretation of "oldest", but in case you wanted the one with the oldest access time, I would modify the best answer thus:

find -type f -printf '%A+ %p\n' | sort | head -n 1

Notice the %A+.
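Keep in mind that many filesystems are mounted with relatime or noatime these days, so access times may be stale or never updated. To eyeball how the three timestamps differ on your own files, a quick sketch:

find . -type f -printf 'atime=%A+ mtime=%T+ ctime=%C+ %p\n' | head -n 3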

PenguinLust

Posted 2013-02-15T16:06:04.533

Reputation: 125

-1

set $(find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1) && echo $2

  • find /search/dirname -type f -printf '%T+ %h/%f\n' prints dates and file names in two columns.
  • sort | head -n1 keeps the line corresponding to the oldest file.
  • echo $2 displays the second column, i.e. the file name.
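Since set relies on word splitting, the echo $2 step breaks on filenames containing spaces. A sketch that avoids this (the %T+ timestamp itself contains no spaces, so everything from the second field on is the path):

find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1 | cut -d' ' -f2-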

Dima

Posted 2013-02-15T16:06:04.533

Reputation: 11

Welcome to Super User! While this may answer the question, it would be a better answer if you could provide some explanation why it does so. – DavidPostill – 2015-06-08T13:10:32.080

Note, several people also asked for some explanation of your previous (identical) deleted answer. – DavidPostill – 2015-06-08T13:12:13.277

What is difficult to understand? find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1 shows two columns, the time and the path of the file. It is then necessary to remove the first column, using set and echo $2. – Dima – 2015-06-08T13:16:55.433

You should provide explanations instead of just pasting a command line, as requested by several other users. – Ob1lan – 2015-06-08T13:20:51.770

How is this different than the accepted answer? – Ramhound – 2015-06-08T14:35:44.293

@Ramhound this answer caters for only keeping the file name instead of a string containing both timestamp and filename. 'Twas useful for me anyway... – Geert – 2017-07-11T20:02:44.527