Rewrite git history to replace all CRLF with LF?

32

10

I'm going to transfer a private Git repository from a win32 box to Ubuntu. I could do a final dos2unix commit, but I'd rather rewrite the whole history so that Git GUIs display the log/diff correctly. E.g., gitg inserts an empty line for each CR/LF.

Xiè Jìléi

Posted 2011-06-07T09:08:30.177

Reputation: 14 766

Answers

25

You can use git filter-branch for that, with the --tree-filter option, and specifying --all for the branch.

Here's an example (started in an empty directory with a DOS-type text file):

Preparation:

$ hexdump -C testfile 
00000000  61 0d 0a 62 0d 0a 63 0d  0a                       |a..b..c..|
00000009

$ git init
Initialized empty Git repository in /home/seigneur/tmp/a/.git/

$ git add testfile && git commit -m "dos file checked in"
[master (root-commit) df4970f] dos file checked in
 1 files changed, 3 insertions(+), 0 deletions(-)
 create mode 100644 testfile

The command:

$ git filter-branch --tree-filter 'git ls-files -z | xargs -0 dos2unix' -- --all

Output:

Rewrite df4970f63e3196216d5986463f239e51eebb4014 (1/1)dos2unix: converting file testfile to Unix format ...

Ref 'refs/heads/master' was rewritten

$ hexdump -C testfile 
00000000  61 0a 62 0a 63 0a                                 |a.b.c.|
00000006

I strongly recommend doing a full backup beforehand. Running this from your Linux machine is probably easier, unless you've got a good shell set up in your Windows environment.
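To check the result after the rewrite, one option is to list the text files in the checked-out tree that still contain a carriage return. This is only a minimal sketch and not part of the answer above: list_crlf_files is a hypothetical helper name, and it assumes a grep that supports -r, -I and --exclude-dir (GNU grep does).

```shell
# Hypothetical helper: print the text files under a directory (default: .)
# that still contain a CR byte. Assumes GNU grep or a compatible grep.
list_crlf_files() {
    # -r recurse, -I treat binary files as non-matching, -l print file names only
    grep -rIl --exclude-dir=.git "$(printf '\r')" "${1:-.}"
}
```

It only inspects the working tree of the commit you have checked out, so it is a spot check rather than proof that every commit in the history is clean. It prints nothing and returns a non-zero status when no file contains a CR.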

Edit: had the conversion reversed the first time around.

Mat

Posted 2011-06-07T09:08:30.177

Reputation: 6 193

An alternative to the dos2unix command is to rely on git itself: git filter-branch --prune-empty --tree-filter 'git add --renormalize .' -- --all – Vilmantas Baranauskas – 2018-12-10T06:45:46.237

1 Thank you, this post helped me a lot. I had a few files with spaces in their names; a small change to the original command fixed it: git filter-branch --tree-filter 'git ls-files -z | xargs -0 dos2unix' -- --all. The -z and -0 flags tell git ls-files and xargs to emit and expect NUL-terminated file names. – Ivan – 2013-11-08T14:04:49.940

6

Mat's answer hits the nail on the head. Unfortunately, on Ubuntu Linux starting with version 10.04 (Lucid Lynx), the dos2unix/unix2dos commands are no longer available and have been replaced by fromdos/todos. Furthermore, both sets of conversion commands are, to varying degrees, unaware of binary files, so if your repository contains images, fonts, etc., they can be corrupted by this process.

I found a workaround for the binary file corruption issue that uses the Linux file command to identify and process only text files, as shown below. The command uses the --tag-name-filter option to preserve existing tags by moving them to the rewritten commits. It also uses the --force flag so that the command still works if you have already run a tree-filter on your repository before.

git filter-branch --force --tree-filter 'git ls-files | xargs file | sed -n -e "/.*: .*text.*/s/\(.*\): .*/\1/p" | xargs fromdos' --tag-name-filter cat -- --all
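The same idea (convert only text files, leave binaries alone) can be sketched without fromdos and in a way that tolerates spaces in file names. Note that this swaps in grep -I for the file command as the text/binary test, so it is a variation rather than the command above. strip_cr_text_files is a hypothetical name, and the sketch assumes a grep that supports -I.

```shell
# Hypothetical helper: strip a trailing CR from every line of each file
# passed as an argument, skipping symlinks and anything grep sees as binary.
strip_cr_text_files() {
    cr=$(printf '\r')
    for f in "$@"; do
        # Only touch regular files that are not symlinks
        if [ -L "$f" ] || [ ! -f "$f" ]; then continue; fi
        # With -I, grep treats binary files as non-matching, so this
        # succeeds only for non-empty text files
        grep -Iq '' "$f" || continue
        tmp=$(mktemp) || return 1
        # Write back with cat (not mv) so the original file mode is kept
        sed "s/${cr}\$//" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

Because filter-branch runs the filter in its own shell, the function would have to live in a script that the filter calls, fed by something like git ls-files -z | xargs -0 so that file names are passed as separate arguments.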

mgorovoy

Posted 2011-06-07T09:08:30.177

Reputation: 61

3

And without any additional tools (like 'fromdos', 'dos2unix', etc.):

git filter-branch --force --tree-filter 'git ls-files | xargs file | sed -n -e "/.*: .*text.*/s/\(.*\): .*/\1/p" | xargs sed -i"" -e "s/"$(printf "\015")"$//"' --tag-name-filter cat -- --all

A cross-platform (OS X, FreeBSD, Linux) equivalent of 'fromdos' / 'dos2unix':

sed -i'' -e 's/'"$(printf '\015')"'$//'

A possibly useful equivalent of 'unix2dos':

sed -i '' -e 's|$|'"`printf '\015'`"'|' file.name

If you are absolutely sure of what you are doing, you can use this simple one-liner to delete "\r" from all files under the current directory "." (note that it also descends into .git and does not skip binary files):

find . -type f -exec sed -i'' -e 's/'"$(printf '\015')"'$//' {} \;
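A slightly more cautious variant of the find one-liner prunes the .git directory so that repository internals are left alone. This is a sketch under assumptions: strip_cr_outside_git is a hypothetical name, and it relies on a sed that accepts -i without a suffix argument (GNU sed does; BSD sed handles -i differently). Like the one-liner above, it does not skip binary files.

```shell
# Hypothetical helper: strip a trailing CR from every line of every regular
# file under a directory (default: .), without descending into .git.
# Assumes GNU-style "sed -i".
strip_cr_outside_git() {
    cr=$(printf '\r')
    # -name .git -prune skips the repository metadata; everything else that
    # is a regular file is edited in place
    find "${1:-.}" -name .git -prune -o -type f -exec sed -i -e "s/${cr}\$//" {} +
}
```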

METAJIJI

Posted 2011-06-07T09:08:30.177

Reputation: 31

1 Rather change \r\n to \n instead of removing \r only – xdevs23 – 2016-09-07T08:35:33.427

I think that corresponding sed invocation can be replaced with shorter one: sed -n -e "s/\(.*\): .*text.*/\1/p" – dma_k – 2018-11-15T14:38:57.890