I know how to retrieve the last modification date of a single file in a Git repository:
git log -1 --format="%ad" -- path/to/file
Is there a simple and efficient way to do the same for all the files currently present in the repository?
A simple answer would be to iterate through each file and display its modification time, i.e.:
git ls-tree -r --name-only HEAD | while read filename; do
echo "$(git log -1 --format="%ad" -- $filename) $filename"
done
This will yield output like so:
Fri Dec 23 19:01:01 2011 +0000 Config
Fri Dec 23 19:01:01 2011 +0000 Makefile
Obviously, you can control this since it's just a bash script at this point, so feel free to customize to your heart's content!
This approach also works with filenames that contain spaces:
git ls-files -z | xargs -0 -n1 -I{} -- git log -1 --format="%ai {}" {}
Example output:
2015-11-03 10:51:16 -0500 .gitignore
2016-03-30 11:50:05 -0400 .htaccess
2015-02-18 12:20:26 -0500 .travis.yml
2016-04-29 09:19:24 +0800 2016-01-13-Atlanta.md
2016-04-29 09:29:10 +0800 2016-03-03-Elmherst.md
2016-04-29 09:41:20 +0800 2016-03-03-Milford.md
2016-04-29 08:15:19 +0800 2016-03-06-Clayton.md
2016-04-29 01:20:01 +0800 2016-03-14-Richmond.md
2016-04-29 09:49:06 +0800 3/8/2016-Clayton.md
2015-08-26 16:19:56 -0400 404.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-bradycardia-algorithm.htm
2015-12-23 17:03:51 -0500 _algorithms/acls-pulseless-arrest-algorithm-asystole.htm
2016-04-11 15:00:42 -0400 _algorithms/acls-pulseless-arrest-algorithm-pea.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-secondary-survey.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-suspected-stroke-algorithm.htm
2016-03-31 11:54:19 -0400 _algorithms/acls-tachycardia-algorithm-stable.htm
...
The output can be sorted by modification timestamp by adding | sort to the end:
git ls-files -z | xargs -0 -n1 -I{} -- git log -1 --format="%ai {}" {} | sort
Here's another way:
git ls-tree -r --name-only HEAD -z | TZ=UTC xargs -0n1 -I_ git --no-pager log -1 --date=iso-local --format="%ad _" -- _
Changes to previously given answers:
It uses ls-tree instead of ls-files and as such can be used with bare repositories.
It prints every timestamp in UTC and in ISO format (TZ=UTC together with --date=iso-local), so the output can be sorted by appending | sort to the command.
Note that this doesn't correctly handle filenames with the % character. See below for a more elaborate command to correctly handle all characters in filenames.
Note that this command is still really slow because Git doesn't actually store the information we're looking for. Technically, it goes through all the files and, for each one, filters the whole project history down to the changes that touched that file, takes the latest such commit and prints its author timestamp. As a result, the displayed times match the last commit that changed each file. If the file had a different timestamp on disk at the time the original commit was made, that timestamp was never stored anywhere in the Git repository, so it cannot be restored without an external data source.
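The cost described above comes from repeating a history walk for every file. One possible way around it, sketched here rather than taken from any of the answers, is to walk history once with --name-only and keep the first (newest) date seen for each path. The helper name newest_date_per_path and the '@' marker are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: walk history once instead of running `git log` once per file.
#
# newest_date_per_path (hypothetical name) reads, on stdin, the output of
#   git log --format='@%ai' --name-only
# and prints "<date><TAB><path>" for the newest commit that mentions each path.
# The '@' prefix is an arbitrary marker that separates commit lines from path
# lines; a path that itself starts with '@' would be misread.
newest_date_per_path() {
  awk '
    /^@/          { date = substr($0, 2); next }  # commit line: remember its date
    $0 == ""      { next }                        # blank separator line
    !($0 in seen) { seen[$0] = 1; print date "\t" $0 }
  '
}

# Typical use inside a repository (not run here):
#   git log --format='@%ai' --name-only | newest_date_per_path
```

The result can differ from the per-file loop: paths that were deleted later are listed too, merge commits contribute no paths by default, and Git may quote unusual path names in this output. Filtering the result against git ls-tree -r --name-only HEAD would restrict it to files that currently exist.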
The timestamps that these commands emit are only an emulation based on commit times, not the real timestamps the files had, because Git doesn't treat file timestamps as data. This part of Git was designed by Linus Torvalds, who strongly believes that a file's timestamp on disk should reflect when it was modified on that disk, not the timestamp it had on somebody else's disk when it was historically modified. Git stores only two timestamps per commit: the author date, for when the change was originally made, and the committer date, for when the commit was included in the DAG. These may differ when the commit author and the person who applied the commit to the version history are two different people, as often happens in Linux kernel development. (Also consider that you can commit only selected lines from each file using the index / staging area. In that case there isn't even a theoretical concept of a "file timestamp", because the committed version doesn't match any file on disk.)
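To see the two per-commit timestamps side by side, a small helper along these lines can be used. It is only a sketch: show_commit_dates is a hypothetical name, while %ai, %ci and %n are standard git log format placeholders.

```shell
#!/usr/bin/env bash
# Sketch: print both timestamps Git records for the last commit touching a path.
# show_commit_dates is a hypothetical helper name, not a Git command.
show_commit_dates() {
  # %ai = author date, %ci = committer date (both ISO-like), %n = newline
  git log -1 --format='author:    %ai%ncommitter: %ci' -- "$1"
}

# Usage inside a repository (not run here):
#   show_commit_dates path/to/file
```

For commits that were rebased, cherry-picked or applied from a patch by someone else, the two lines will usually show different times.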
If you want to set filesystem modification times to the last author commit time of each file, you can do something like this to deal with special characters in filenames (add | bash to automatically execute all emitted commands):
git ls-tree -r --name-only HEAD -z | TZ=UTC xargs -0n1 git --no-pager log -1 --date=iso-local --name-only -z --format="format:%ad" | perl -npe "INIT {\$/ = \"\\0\"} s@^(.*? .*?) .*?\n(.*)\$@\$date=\$1; \$name=\$2; \$name =~ s/'/'\"'\"'/sg; \"TZ=UTC touch -m --date '\$date' '\$name';\n\"@se"
Even though this is much more complex than the command above, its performance should be about equal to the first one, because the run time is dominated by searching for the last modification time of each file rather than by actually setting the modification time. Note that this converts times to UTC, uses NUL-separated filenames, and sets the timestamp of each file on the filesystem using the UTC timezone.
If the order of output is not strictly important, you can improve the performance of this command by adding -P $(nproc) to the xargs flags to scale Git to all CPUs, making the command look like ...TZ=UTC xargs -0n1 -P $(nproc) git...
If you prefer committer time instead of author date, use %cd instead of %ad in the above command line.
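For readers who find the perl one-liner above hard to follow, the same idea can be sketched as a plain loop. This is an alternative under stated assumptions, not the command from this answer: it needs bash (for read -d '') and GNU touch (for -d @<epoch>), and restore_mtimes is a hypothetical name. It is no faster, since it still runs one git log per file.

```shell
#!/usr/bin/env bash
# Sketch: set each tracked file's mtime to the author time of its last commit.
# Assumes bash (read -d '') and GNU touch (-d @<epoch>).
# restore_mtimes is a hypothetical helper name.
restore_mtimes() {
  git ls-tree -r -z --name-only HEAD |
  while IFS= read -r -d '' path; do
    # %at = author time as seconds since the Unix epoch, so no timezone
    # conversion is involved. Use %ct instead for the committer time.
    epoch=$(git log -1 --format=%at -- "$path")
    if [ -n "$epoch" ]; then
      touch -m -d "@$epoch" -- "$path"
    fi
  done
}

# Usage from the top of a work tree (not run here):
#   restore_mtimes
```

Because filenames are read NUL-delimited and passed to touch as arguments, no shell quoting of the names is needed.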
This is a small tweak of Andrew M.'s answer. (I was unable to comment on his answer.)
Wrap the first $filename in double quotes, in order to support filenames with embedded spaces.
git ls-tree -r --name-only HEAD | while read filename; do
echo "$(git log -1 --format="%ad" -- "$filename") $filename"
done
Sample output:
Tue Jun 21 11:38:43 2016 -0600 subdir/this is a filename with spaces.txt
I appreciate that Andrew's solution (based on ls-tree) works with bare repositories! (This isn't true of solutions using ls-files.)
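The quoted loop above still has two limits that come from plain read: it strips leading and trailing whitespace from each name and treats backslashes as escapes. A stricter variant might look like the following sketch, which assumes bash (read -d '' is not POSIX) and uses the hypothetical name list_last_commit_dates.

```shell
#!/usr/bin/env bash
# Sketch: NUL-delimited variant of the loop above (requires bash).
# list_last_commit_dates is a hypothetical helper name.
list_last_commit_dates() {
  # -z makes ls-tree end each path with a NUL byte and print it unquoted.
  git ls-tree -r -z --name-only HEAD |
  while IFS= read -r -d '' filename; do
    # The filename is passed to printf as data, never as part of a format
    # string, so characters such as % need no special handling.
    printf '%s %s\n' "$(git log -1 --format=%ad -- "$filename")" "$filename"
  done
}

# Usage inside a repository (not run here):
#   list_last_commit_dates
```

Since it is still built on ls-tree, it should keep working with bare repositories.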
If you're trying to set the file modification times on a big repository, look at Git Tools. It’s already a package.
sudo apt install git-restore-mtime
cd repo
git restore-mtime
It uses git whatchanged rather than git log, which is much quicker on big repositories.
For those of us using Windows and PowerShell, here is Andrew M's answer, tweaked to print a machine-readable date (%ai):
git ls-tree -r --name-only HEAD | ForEach-Object { "$(git log -1 --format="%ai" -- "$_")`t$_" }
Example output:
2019-05-07 12:00:37 -0500 .editorconfig
2016-07-13 14:03:49 -0500 .gitattributes
2019-05-07 12:00:37 -0500 .gitignore
2018-02-03 22:01:17 -0600 .mailmap
Here is the Fish shell version of Andrew M's answer, for those that use Fish.
git ls-tree -r --name-only HEAD | while read -l filename
printf '%s %s\n' (git log -1 --format="%ai" -- $filename) $filename
end
I store this as a Fish function for easy access.