Why is my patch coming out in binary format?


I'm diffing two directories, 220 and sue, as follows:

diff -r 220 sue > diff.txt

The directory looks as follows afterwards:

$ ls -al
total 20
drwxrwx---+ 1 Administrators Domain Users     0 Jun 24 10:44 .
drwxrwx---+ 1 SYSTEM         SYSTEM           0 Jun 24 09:52 ..
drwxrwx---+ 1 Administrators Domain Users     0 Jun 24 09:54 220
-rw-rwxr--+ 1 jempty         Domain Users 15463 Jun 24 10:44 diff.txt
drwxrwx---+ 1 Administrators Domain Users     0 Jun 24 09:55 sue

Confirming diff.txt is text as follows:

$ file diff.txt
diff.txt: HTML document, ASCII text, with very long lines, with CRLF, LF line terminators

The above is primarily to demonstrate that I can use diff and see that there are not many differences.

Then creating a patch file as suggested by https://docs.moodle.org/dev/How_to_create_a_patch:

$ diff -Naur 220 sue > patch.txt

Results in the directory looking as follows:

$ ls -al
total 133836
drwxrwx---+ 1 Administrators Domain Users         0 Jun 24 10:57 .
drwxrwx---+ 1 SYSTEM         SYSTEM               0 Jun 24 09:52 ..
drwxrwx---+ 1 Administrators Domain Users         0 Jun 24 09:54 220
-rw-rwxr--+ 1 jempty         Domain Users     15463 Jun 24 10:44 diff.txt
-rw-rwxr--+ 1 jempty         Domain Users 137024100 Jun 24 10:57 patch.txt
drwxrwx---+ 1 Administrators Domain Users         0 Jun 24 09:55 sue

As you can see, the patch.txt file is enormous, and as it turns out, it's binary:

$ file patch.txt
patch.txt: data

Should I be using the patch command instead of diff?

Dexygen

Posted 2015-06-24T16:03:59.783

Reputation: 304

Answers


You have binary files (programs, DLLs, data files, etc.) in either 220 or sue, or both.

The first command (diff -r) recognizes that some files are binary, and when this is the case, diff will simply print a message that they differ. For example, if both 220 and sue have a binary file foo.dat in them, you would expect the output to be something like this:

Binary files 220/foo.dat and sue/foo.dat differ

The second command has the -a flag, which tells diff to unconditionally treat all files as plaintext, so it will compare the binary files and print their raw content wherever 220/foo.dat and sue/foo.dat differ. Since diff compares line-by-line, and binary files typically have few line breaks, the "lines" compared and shown in the output can be very long, even for relatively small files. In addition, -N treats a file that exists in only one directory as empty in the other, so with -a any binary file present on just one side is written out in full.

To reduce the size of diff's output, don't use the -a flag:

$ diff -Nur 220 sue > patch.txt
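To see the effect of -a on a small scale, here is a minimal sketch. The directories a and b and the file foo.dat are made up for illustration (they are not from your setup), and it assumes a diff that detects binary files by looking for NUL bytes, as GNU diff does:

```shell
# Hypothetical scratch directories a/ and b/, each holding a small file with a NUL byte.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
printf 'abc\0def\n' > "$tmp/a/foo.dat"
printf 'abc\0xyz\n' > "$tmp/b/foo.dat"
cd "$tmp"

# Without -a: diff only reports that the binary files differ.
# diff exits with status 1 when it finds differences, hence the "|| true".
without_a=$(diff -Nur a b) || true
echo "$without_a"

# With -a: the raw bytes are written into the output as if they were text lines.
diff -Naur a b > with_a.patch || true
file with_a.patch
```

With real programs or DLLs in place of the tiny foo.dat, the second output grows to roughly the size of the binaries themselves, which is most likely where your 137 MB comes from.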

If you don't care about the differences between binary files, you can filter the output to exclude them:

$ diff -Nur 220 sue | grep -v '^Binary files.*differ' > patch.txt

To answer your last question, patch is the opposite of diff, so you wouldn't use patch here. You use diff to find the differences between files, and you use patch to apply the differences from diff's output to those files. As nouns, though, the two words overlap: the output of diff is commonly called either "a diff" or "a patch".
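As a rough illustration of how the two commands fit together, the following sketch creates a patch with diff and applies it with patch. The directory names old, new, and patched and the file greeting.txt are hypothetical, chosen only for this example:

```shell
# Hypothetical trees old/ and new/ that differ in one text file.
tmp=$(mktemp -d)
cd "$tmp"
mkdir old new
printf 'hello\nworld\n' > old/greeting.txt
printf 'hello\nthere\n' > new/greeting.txt

# Create the patch. diff exits with status 1 when it finds differences,
# hence the "|| true".
diff -Nur old new > changes.patch || true

# Apply the patch to a copy of the old tree.
# -p1 strips the leading "old/" or "new/" component from the paths in the patch.
cp -r old patched
(cd patched && patch -p1 < ../changes.patch)

# The patched copy of the file should now be identical to the one in new/.
cmp -s patched/greeting.txt new/greeting.txt && echo "patched copy matches new"
```

In your case the first half (diff) is all you need; whoever receives patch.txt would run the second half against their own copy of 220.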

James Sneeringer

Posted 2015-06-24T16:03:59.783

Reputation: 448