I'm looking for a shell one-liner to find the oldest file in a directory tree.
Answer (score 74):
This works (updated to incorporate Daniel Andersson's suggestion):
find -type f -printf '%T+ %p\n' | sort | head -n 1
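A small follow-on sketch: because the %T+ timestamp sorts correctly as plain text, the path alone can be recovered by cutting off the first field (the timestamp itself contains no spaces, so this is safe even for file names with spaces):

```shell
# Print only the path of the oldest file: %T+ contains no spaces, so
# everything after the first space is the file name, spaces included.
find . -type f -printf '%T+ %p\n' | sort | head -n 1 | cut -d' ' -f2-
```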
Answer (score 14):
This one's a little more portable because it doesn't rely on the GNU find extension -printf, so it works on BSD / OS X as well:
find . -type f -print0 | xargs -0 ls -ltr | head -n 1
The only downside here is that it's somewhat limited by the size of ARG_MAX (which should be irrelevant for most newer kernels). So, if there are more than getconf ARG_MAX characters in the expanded argument list (262,144 on my system), it doesn't give you the correct result. It's also not POSIX-compliant, because -print0 and xargs -0 aren't.
Some more solutions to this problem are outlined here: How can I find the latest (newest, earliest, oldest) file in a directory? – Greg's Wiki
And an even more portable approach is find . -type f -exec ls -ltr {} + | head -n 1. No need for xargs and -print0. This is supported by POSIX, unlike -print0.
This works too, but it also emits an xargs: ls: terminated by signal 13 error as a side effect. I'm guessing that's SIGPIPE. I've no idea why I don't get a similar error when I pipe sort's output to head in my solution. – Marius Gedminas – 2013-02-15T16:29:03.707
Your version is also easier to type from memory. :-) – Marius Gedminas – 2013-02-15T16:29:33.380
Yes, that's a broken pipe. I don't get this with both GNU and BSD versions of all those commands, but it's the head command that quits once it has read a line and thus "breaks" the pipe, I think. You don't get the error because sort doesn't seem to complain about it, but ls does in the other case. – slhck – 2013-02-15T16:32:16.473
This breaks if there are so many filenames that xargs needs to invoke ls more than once. In that case, the sorted outputs of those multiple invocations end up concatenated when they should be merged. – Nicole Hamilton – 2013-02-15T17:00:36.303
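This failure mode can be reproduced deterministically by forcing xargs to split its input with -n, which mimics overflowing ARG_MAX (a sketch; the file names and dates are made up for the demonstration):

```shell
# Three files; b (2010) is the oldest overall.
cd "$(mktemp -d)"
touch -t 202001010000 a   # 2020
touch -t 201501010000 c   # 2015
touch -t 201001010000 b   # 2010, the true oldest
# Force ls to run twice, as if the argument list had overflowed:
# the first batch (a, c) is sorted on its own, so head -n 1 reports
# c rather than b, and no error is raised.
printf 'a\0c\0b\0' | xargs -0 -n 2 ls -ltr | head -n 1
```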
@Nicole You're right, and I was implying this with my hint to ARG_MAX, because this is the number of files that can be passed, e.g. 262144 on my OS X. (Maybe I should have been more explicit on this?) – slhck – 2013-02-15T17:09:22.163
I think this is worse than posting a script that assumes filenames never contain spaces. A lot of the time, those will work because the filenames don't have spaces, and when they fail, you get an error. But this is unlikely to work in real cases, and failure will go undiscovered. On any directory tree big enough that you can't just ls it and eyeball the oldest file, your solution probably will overrun the command line length limit, causing ls to be invoked multiple times. You'll get the wrong answer but you'll never know. – Nicole Hamilton – 2013-02-15T17:15:18.017
Well, if in the real case there are fewer than ARG_MAX files, it works. If not, then not, and I explicitly mentioned this as a drawback when posting the answer. I made that point a little clearer though. (Also upvoted your comment.) The main reason I posted this is that printf doesn't exist in non-GNU find, and therefore an alternative is needed. – slhck – 2013-02-15T17:22:56.707
If it even failed with an error, I would be okay with this. But it fails silently. That's unacceptable. – Nicole Hamilton – 2013-02-15T17:27:24.010
It's also worth clarifying that this fails if the sum of the lengths of the filenames in characters is greater than ARG_MAX, not if there are more than ARG_MAX files. – Nicole Hamilton – 2013-02-15T18:00:13.000
@Nicole You're right, I missed that. – slhck – 2013-02-15T19:21:05.383
I just found out that I never fully understood how xargs works. ARG_MAX is 2,097,152 on Ubuntu 12.10, by the way. – Dennis – 2013-02-16T04:12:09.920
On reflection, I don't actually want to seem harsh, so I'm undoing my downvote. But I do think it's important that software should never be designed to fail silently. – Nicole Hamilton – 2013-02-17T18:11:32.740
Answer (score 11):
The following commands are guaranteed to work with any kind of strange file names:
find -type f -printf "%T+ %p\0" | sort -z | grep -zom 1 ".*" | cat
find -type f -printf "%T@ %T+ %p\0" | \
sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //'
stat -c "%y %n" "$(find -type f -printf "%T@ %p\0" | \
sort -nz | grep -zom 1 ".*" | sed 's/[^ ]* //')"
Using a null byte (\0) instead of a linefeed character (\n) makes sure the output of find will still be understandable in case one of the file names contains a linefeed character.
The -z switch makes both sort and grep interpret only null bytes as end-of-line characters. Since there's no such switch for head, we use grep -m 1 instead (only one occurrence).
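As an aside: newer GNU coreutils (head gained --zero-terminated / -z around version 8.25, after this answer was written) make the grep workaround unnecessary. A GNU-only sketch:

```shell
# head -z reads NUL-terminated records directly; tr restores a
# newline at the end for display.
find . -type f -printf '%T+ %p\0' | sort -z | head -zn 1 | tr '\0' '\n'
```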
The commands are ordered by execution time (measured on my machine).
The first command will be the slowest since it has to convert every file's mtime into a human readable format first and then sort those strings. Piping to cat avoids coloring the output.
The second command is slightly faster. While it still performs the date conversion, numerically sorting (sort -n
) the seconds elapsed since Unix epoch is a little quicker. sed deletes the seconds since Unix epoch.
The last command does no conversion at all and should be significantly faster than the first two. The find command itself will not display the mtime of the oldest file, so stat is needed.
Answer (score 5):
Although the accepted answer and others here do the job, if you have a very large tree, all of them will sort the whole bunch of files.
It would be better if we could just list the files and keep track of the oldest, without needing to sort at all.
That's why I came up with this alternative solution:
ls -lRU $PWD/* | awk 'BEGIN {cont=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($1,0,1)=="/") { pat=substr($1,0,length($0)-1)"/"; }; if( $6 != "") {if ( $6 < oldd ) { oldd=$6; oldf=pat$8; }; print $6, pat$8; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
I hope this might be of some help, even if the question is a bit old.
Edit 1: these changes allow parsing files and directories with spaces. It is fast enough to run it on the root / and find the oldest file ever.
ls -lRU --time-style=long-iso "$PWD"/* | awk 'BEGIN {cont=0; oldd=strftime("%Y%m%d"); } { gsub(/-/,"",$6); if (substr($0,0,1)=="/") { pat=substr($0,0,length($0)-1)"/"; $6="" }; if( $6 ~ /^[0-9]+$/) {if ( $6 < oldd ) { oldd=$6; oldf=$8; for(i=9; i<=NF; i++) oldf=oldf $i; oldf=pat oldf; }; count++;}} END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
Running it:
~$ time ls -lRU "$PWD"/* | awk etc.
Oldest date: 19691231
File: /home/.../.../backupold/.../EXAMPLES/how-to-program.txt
Total compared: 111438
real 0m1.135s
user 0m0.872s
sys 0m0.760s
EDIT 2: Same concept, better solution using find to look at the access time (use %T with the first printf for modification time, or %C for status change, instead).
find . -wholename "*" -type f -printf "%AY%Am%Ad %h/%f\n" | awk 'BEGIN {cont=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
EDIT 3: The command below uses modification time and also prints incremental progress as it finds older and older files, which is useful when you have some incorrect timestamps (like 1970-01-01):
find . -wholename "*" -type f -printf "%TY%Tm%Td %h/%f\n" | awk 'BEGIN {cont=0; oldd=strftime("%Y%m%d"); } { if ($1 < oldd) { oldd=$1; oldf=$2; for(i=3; i<=NF; i++) oldf=oldf " " $i; print oldd " " oldf; }; count++; } END { print "Oldest date: ", oldd, "\nFile:", oldf, "\nTotal compared: ", count}'
It still needs tweaking to accept files with spaces. I'll do that soon. – Dr Beco – 2015-06-19T15:50:23.880
I think parsing ls for files with spaces isn't a good idea. Maybe using find. – Dr Beco – 2015-06-19T16:35:50.777
Just ran it on the entire tree "/". Time spent: Total compared: 585744; real 2m14.017s, user 0m8.181s, sys 0m8.473s – Dr Beco – 2015-06-19T18:26:37.937
Using ls is bad for scripting, as its output is not meant for machines and its formatting varies across implementations. As you already stated, find is good for scripting, but it might also be good to add that info before presenting the ls solutions. – Sampo Sarrala - codidact.org – 2016-01-06T12:34:24.653
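Building on Sampo's point, the same no-sort idea can be written with find output alone, tracking the minimum in awk (a sketch; gawk is assumed, since RS = "\0" is a GNU extension, and %T@ epoch seconds avoid any date parsing):

```shell
# Single pass, no sort: keep the smallest epoch mtime seen so far.
# NUL separators keep file names with spaces or newlines intact.
find . -type f -printf '%T@ %p\0' |
  awk 'BEGIN { RS = "\0" }
       {
         t = $1                      # epoch seconds (first field)
         sub(/^[^ ]* /, "")          # $0 is now the bare file name
         if (min == "" || t + 0 < min + 0) { min = t; file = $0 }
       }
       END { if (file != "") print min, file }'
```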
Answer (score 4):
Please use ls - the man page tells you how to order the directory listing.
ls -clt | head -n 2
The -n 2 is so you don't get the "total" line in the output. If you only want the name of the newest file:
ls -t | head -n 1
And if you need the list in reverse order, to get the oldest file:
ls -tr | head -n 1
Much easier than using find, much faster, and more robust - you don't have to worry about file naming formats. It should work on nearly all systems too.
This works only if the files are in a single directory, while my question was about a directory tree. – Marius Gedminas – 2014-09-02T06:02:36.953
Answer (score 2):
find ! -type d -printf "%T@ %p\n" | sort -n | head -n1
This won't work properly if there are files older than 9 Sep 2001 (1000000000 seconds since Unix epoch). To enable numeric sorting, use sort -n. – Dennis – 2013-02-16T03:11:55.260
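Dennis's point is easy to demonstrate in isolation (a sketch with two made-up epoch timestamps straddling 9 Sep 2001):

```shell
# Lexically, "1000000000.0" sorts before "999999999.0" because
# '1' < '9', so plain sort picks the *newer* file:
printf '999999999.0 a\n1000000000.0 b\n' | sort | head -n 1
# -> 1000000000.0 b   (wrong: b is the newer file)

# Numeric sort compares the values instead:
printf '999999999.0 a\n1000000000.0 b\n' | sort -n | head -n 1
# -> 999999999.0 a    (correct: a is the older file)
```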
This helps find me the file, but it's hard to see how old it is without running a second command :) – Marius Gedminas – 2013-02-16T09:38:00.087
Answer (score 0):
It seems that by "oldest" most people have assumed that you meant "oldest modification time." That's probably correct, according to the strictest interpretation of "oldest", but in case you wanted the one with the oldest access time, I would modify the best answer thus:
find -type f -printf '%A+ %p\n' | sort | head -n 1
Notice the %A+.
Answer (score -1):
set $(find /search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1) && echo $2
find ./search/dirname -type f -printf '%T+ %h/%f\n' prints dates and file names in two columns.
sort | head -n 1 keeps the line corresponding to the oldest file.
echo $2 displays the second column, i.e. the file name.
Welcome to Super User! While this may answer the question, it would be a better answer if you could provide some explanation why it does so. – DavidPostill – 2015-06-08T13:10:32.080
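A hedged variant of the same idea that survives spaces in file names: instead of word-splitting the whole line with set, let read split off only the first (timestamp) field, which itself contains no spaces:

```shell
# The remainder of the line after the first space is the path,
# spaces and all.
find . -type f -printf '%T+ %p\n' | sort | head -n 1 |
  { IFS=' ' read -r _ path; printf '%s\n' "$path"; }
```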
Note, several people also asked for some explanation of your previous (identical) deleted answer. – DavidPostill – 2015-06-08T13:12:13.277
What is difficult to answer? find ./search/dirname -type f -printf '%T+ %h/%f\n' | sort | head -n 1 shows two columns: the time and the path of the file. It is necessary to remove the first column, using set and echo $2. – Dima – 2015-06-08T13:16:55.433
You should provide explanations instead of just pasting a command line, as requested by several other users. – Ob1lan – 2015-06-08T13:20:51.770
How is this different from the accepted answer? – Ramhound – 2015-06-08T14:35:44.293
@Ramhound this answer caters for only keeping the file name instead of a string containing both timestamp and filename. 'Twas useful for me anyway... – Geert – 2017-07-11T20:02:44.527
I get empty space because my first line from this find is empty, due to the fact that I have a filename that contains a newline. – 林果皞 – 2016-04-19T12:50:44.697
Can I ask if this uses the created or modification date? – MrMesees – 2016-11-27T13:07:38.733
Linux doesn't store the file creation date anywhere[*]. This uses the modification date.
[*] This is actually not true; ext4 stores the inode creation date, but it's not exposed via any system calls and you need to use debugfs to see it. – Marius Gedminas – 2016-11-28T07:14:46.713
I get a useless answer with this one with the head limit set to 1 because a whole bunch of flatpak files have the modification date set to zero, so I'm getting tens of thousands of files with Jan 1, 1970. Then I get a few thousand more for 1980-01-01 from google cloud sdk and docker. Then thousands more from 1985-10-26 from nodejs. Then a bunch from over a decade ago from various git repos and steam files (cloning source/build files takes remote timestamps). Then I get my actual files around line 150k – theferrit32 – 2020-01-03T22:13:23.340
Less typing:
find -type f -printf '%T+ %p\n' | sort | head -1
– Daniel Andersson – 2013-02-15T19:37:05.793