Checking a file contains only null bytes

12

Your goal is to write a program or function that takes as input a string representing the path to a file, and outputs a truthy value if that file is non-empty and contains only null bytes (i.e., all bits are 0), and a falsey value otherwise.

I realize it is a very simple problem and I guess I could hack something, but I suspect there must be some short and elegant way of doing it, and that gave me the idea to make a challenge out of it.

This is code-golf, so the shortest code in bytes wins. (My own preference would go to the fastest solution, but that is too implementation-dependent.)

Related questions: Pad a file with zeros

Motivation: This only explains where the problem comes from, in case you are interested. You do not need to read it.

ISO images of CDs and DVDs, copied with "dd" or other means, often end with a sequence of useless blocks containing only null bytes. Standard techniques to remove these blocks are known and simple (see https://unix.stackexchange.com/questions/74827/ ), but they may sometimes remove useful non-null data because the medium can lie about its own size. So I want to check that the removed blocks contain only null bytes. Removing these blocks is important for defining a normalised version of ISO images.

babou

Posted 2018-08-19T14:19:02.480

Reputation: 269

Answers

5

Pyth, 6 5 bytes

!sCM'

Try it online!

Takes a filename from STDIN, opens and reads the file, converts it to a list of ints (think Python ord), sums the list (the sum is 0 iff the file is all null bytes), and negates the result, printing it.


Hey,

This looks a lot like a general programming question. These belong on Stack Overflow. However, from the comments under the main post, I can see that this was not your intent. That said, I feel that the discussion has been unnecessarily hostile on both sides, so I've decided to pick up the slack and give you the proper PPCG welcome!

Generally, we ask that any challenges are first posted to our sandbox for proper feedback. You can take a look at the current submissions in there to see what format we prefer for challenges. Please give it a try next time!

Just in case we've all misunderstood you and you are looking for a general solution, here's a solution in Python 3:

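def main(string):
    # Binary mode: read raw bytes so arbitrary (non-text) data does not fail to decode.
    with open(string, 'rb') as file:
        # Iterating over bytes yields ints, so any() is true iff some byte is non-zero.
        return not any(file.read())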

hakr14

Posted 2018-08-19T14:19:02.480

Reputation: 1 295

1This will not work with a grayscale image consisting of only black pixels (zeroes), due to how powerful ' is. – user202729 – 2018-08-22T07:47:58.920

Besides: OP requires taking file name as input using command-line argument, and return as status code. – user202729 – 2018-08-22T07:51:02.773

2

GNU sed -zn, 5 bytes

The input file is passed to sed as a command-line parameter. Output as a standard shell return code - i.e. 0 is TRUE, 1 is FALSE.

/./q1

Normally sed works on newline-delimited input records (AKA "lines"). -z changes this to nul-delimited input records. If any input records match the . regex, then quit with exit code 1.
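The same record-based idea can be sketched in Python. This is only an illustration, not what sed does internally: the function name `nul_records_exit_code` is made up here, and the sketch ignores any locale-dependent behaviour of the `.` regex. It splits the file's bytes on NUL and fails as soon as any record is non-empty.

```python
def nul_records_exit_code(path):
    """Return 0 if every NUL-delimited record is empty, else 1 (shell-style)."""
    with open(path, "rb") as f:
        data = f.read()
    # Splitting on NUL leaves only empty records when the file is all NULs.
    for record in data.split(b"\0"):
        if record:  # a record holding at least one byte would match `.`
            return 1
    return 0
```

Like the sed version, this sketch reports success (0) for an empty file, since no record contains a byte.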

Try it online!

Digital Trauma

Posted 2018-08-19T14:19:02.480

Reputation: 64 644

2

DOS, 37 bytes


100:BE 80 00 MOV SI, 0080
103:AD       LODSW ;get command-line length
104:98       CBW ;only a byte
105:93       XCHG BX,AX
106:88 40 FF MOV [BX+SI-01], AL ;zero end of name
109:B4 3D    MOV AH, 3D
10B:89 F2    MOV DX, SI
10D:CD 21    INT 21 ;open file
10F:93       XCHG BX, AX ;handle into BX
110:AF       SCASW ;DI=0
111:B4 3F    MOV AH, 3F
113:B1 01    MOV CL, 01
115:CD 21    INT 21 ;read 1 byte
117:91       XCHG CX, AX
118:E3 06    JCXZ 0120 ;quit on EOF
11A:97       XCHG DI, AX ;set true for later
11B:38 2C    CMP [SI], CH
11D:74 F2    JZ 0111 ;loop while zero
11F:4F       DEC DI ;set false
120:97       XCHG DI, AX
121:B4 4C    MOV AH, 4C ;return
123:CD 21    INT 21

It opens the file named on the command line and returns 0 if the file is empty or contains a non-zero byte; otherwise it returns 1.
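For readers who do not read x86 assembly, the control flow can be approximated in Python. This is a hedged sketch of the same byte-at-a-time loop, not a translation of the machine code; the function name `only_nulls_bytewise` is invented for the illustration.

```python
def only_nulls_bytewise(path):
    """Return 1 if the file is non-empty and every byte is zero, else 0."""
    result = 0  # stays 0 if the file turns out to be empty
    with open(path, "rb") as f:
        while True:
            chunk = f.read(1)  # read a single byte
            if not chunk:      # end of file: report the current result
                return result
            result = 1         # at least one byte has been read
            if chunk != b"\0":
                return 0       # non-zero byte: fail immediately
```

Stopping at the first non-zero byte means a large file with early non-null data is rejected without reading the rest.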

peter ferrie

Posted 2018-08-19T14:19:02.480

Reputation: 804

1

Attache, 24 bytes

Zero@Max&0@Ords@FileRead

Try it online!

Explanation

This is a composition of 4 functions, executed one after the other:

  • FileRead - takes a file name as input, returns the contents of that file
  • Ords - returns the ASCII code points of each character in a list
  • Max&0 - this is equivalent to, for argument x, Max[x, 0]; this in turn computes the maximum of all entries in x and 0 (yielding 0 for the empty list)
  • Zero - this is a predicate that checks if this number is in fact 0, and returns that boolean.
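The four-stage composition above can be mirrored in Python as a rough sketch. All names here (`file_read`, `ords`, `max_with_zero`, `is_zero`, `compose`) are hypothetical stand-ins for the Attache built-ins, and the file is read as raw bytes rather than characters.

```python
from functools import reduce

def file_read(path):
    """Stand-in for FileRead: return the file's contents as bytes."""
    with open(path, "rb") as f:
        return f.read()

def ords(data):
    """Stand-in for Ords: bytes iterate as integer code values."""
    return list(data)

def max_with_zero(values):
    """Stand-in for Max&0: maximum of all entries and 0 (0 for an empty list)."""
    return max([*values, 0])

def is_zero(n):
    """Stand-in for Zero: is the number 0?"""
    return n == 0

def compose(*funcs):
    """Compose functions right to left, like the chain of @ in the answer."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(funcs), x)

only_nulls = compose(is_zero, max_with_zero, ords, file_read)
```

As in the answer, an empty file yields `True`, because the maximum of an empty list and 0 is 0.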

Conor O'Brien

Posted 2018-08-19T14:19:02.480

Reputation: 36 228

OP requires full program, call from command line, take file name as input using argument, and return as status code. (@_@) – user202729 – 2018-08-22T07:50:13.293

Doesn't this give a false positive for an empty file? – ngenisis – 2018-08-27T20:05:41.790

1@ngenisis the original problem stated the following: "That means the empty file is considered OK" -- check the revision history, it seems that a certain user edited that point out of the question. – Conor O'Brien – 2018-08-27T20:30:49.213

1

C (32bit platform), 65 bytes

main(x,v)int*v;{for(v=fopen(v[1],"r");!(x=fgetc(v)););return++x;}

Assumes that all pointer types have the same size as int (sizeof(void*) == sizeof(int)), which holds on typical 32-bit platforms. Returns with a 0 exit code on success (file contains only NUL characters), some other value otherwise.

Behavior is undefined if command line argument isn't a path to a readable file.

Felix Palmen

Posted 2018-08-19T14:19:02.480

Reputation: 3 866

I think you need to write int**v? I can't find a compiler where this doesn't segfault without doing that. Also, you can save a bit by intentionally erroring, but I don't know if this is the best approach.

– FryAmTheEggman – 2018-08-22T17:09:24.963

Huh? I tried this with gcc on mingw32, works flawlessly. I probably should add the constraint sizeof(void*) == sizeof(int) (or more generally "32-bit platform") then ... on an amd64 platform, try compiling with -m32 ;) – Felix Palmen – 2018-08-22T20:06:40.450

@FryAmTheEggman also works on TIO when compiled as 32bit code (-m32): Try it online!

– Felix Palmen – 2018-08-23T06:51:06.587

Ah, of course. Nice work, then! Feel free to use my suggestion to save the couple bytes :) – FryAmTheEggman – 2018-08-23T12:56:11.003

0

Bash + GNU utilities, 26 bytes

od -An $1|grep -qv [^0\ *]

Input filename is given as a command-line parameter. Output as a standard shell return code - i.e. 0 is TRUE, 1 is FALSE.

Try it online!

Digital Trauma

Posted 2018-08-19T14:19:02.480

Reputation: 64 644

0

Wolfram Language (Mathematica), 30 bytes

BinaryReadList@#~MatchQ~{0..}&

Try it online!

Explanation

                             & (* Function which returns whether *)
BinaryReadList                 (* the list of bytes *)
              @                (* of *)
               #               (* the input *)
                ~MatchQ~       (* matches *)
                        {0..}  (* a list of one or more zeros *)
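A Python analogue of the `{0..}` pattern match, given only as a sketch: the function name `is_one_or_more_zeros` is an assumption for the illustration. The key point is that "one or more zeros" rejects the empty file.

```python
def is_one_or_more_zeros(path):
    """True iff the file's byte list is non-empty and consists only of zeros."""
    with open(path, "rb") as f:
        data = f.read()
    # len(data) > 0 mirrors "one or more"; all(...) mirrors "every element is 0".
    return len(data) > 0 and all(b == 0 for b in data)
```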

Alternate solution, 22 bytes

If empty files are supposed to pass, then this can be shortened to:

Tr@BinaryReadList@#<1&

Try it online!

ngenisis

Posted 2018-08-19T14:19:02.480

Reputation: 4 600

0

Java, 149 bytes

boolean b(String f)throws Exception{java.io.InputStream s=new java.io.FileInputStream(f);int i=Math.abs(s.read());while(i==0)i+=s.read();return i<0;}

SuperJedi224

Posted 2018-08-19T14:19:02.480

Reputation: 11 342

0

Perl 5, 20 bytes

$\=0;exit<>=~/^\0+$/

Takes a file name from the command-line arguments and returns the result as the exit code of the program.

faubi

Posted 2018-08-19T14:19:02.480

Reputation: 2 599

0

Haskell, 49 bytes

import Data.ByteString
f=(all(<1)<$>).getContents

Obviously if the import is not included, then it is 26 bytes.

Izaak Weiss

Posted 2018-08-19T14:19:02.480

Reputation: 101

I guess you meant readFile instead of getContents. I think you can read the file as a normal String, compare to =='\0' (or better <'\1') and get rid of the import. As you can use an anonymous function, you can drop the f x= and go pointfree: (all(<'\1')<$>).readFile. – nimi – 2018-08-28T22:21:01.870

If it's a binary file, you cannot use readFile, which will throw an exception upon encountering an invalid Unicode sequence. Good point regarding the pointfree. – Izaak Weiss – 2018-08-29T15:07:41.560

0

Python 3, 59 bytes

f=lambda s:any(open(s,'rb').read())+not len(open(s).read())

Returns 0 for success (all bytes zero).

Returns 1 for failure (at least one nonzero byte, or zero length file).

pizzapants184

Posted 2018-08-19T14:19:02.480

Reputation: 3 174

If the file is empty, you must return Failure. – Adám – 2018-08-28T22:25:00.013

0

APL (Dyalog Unicode), 14 bytes

Full program. Prompts for filename from stdin.

0=⌈/11 ¯1⎕MAP⍞

Try it online!

⍞ prompt for filename

11 ¯1⎕MAP map that file to a packed bit array

⌈/ maximum (reduction); smallest float if empty, otherwise 0 or 1

0= is zero equal to that?

Adám

Posted 2018-08-19T14:19:02.480

Reputation: 37 779

0

JavaScript (ES8), 52 bytes

Takes a URL as an argument and returns a promise that resolves to true if the file isn't empty and contains no null bytes.

async p=>!/\0|^$/.test(await(await fetch(p)).text())

kamoroso94

Posted 2018-08-19T14:19:02.480

Reputation: 739

0

Zsh, 35 bytes

! for c (${(s::)"$(<$1)"})((i|=#c))

Try it online! Outputs via exit code.

Read in, split on characters, and bitwise-or each codepoint together.

If the file is empty, the body of the loop is never run and so the loop returns truthy. If the truthy-falsy values can be swapped, the leading ! can be removed for a 2-byte save.
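The bitwise-or accumulation can be sketched in Python as follows. This is an illustration under assumptions, not a port of the Zsh code: `or_of_bytes` is a made-up name, and the sketch works on raw bytes, whereas the shell version reads through a command substitution, which strips trailing newlines.

```python
from functools import reduce
from operator import or_

def or_of_bytes(path):
    """Bitwise-or of every byte in the file; 0 for an empty file."""
    with open(path, "rb") as f:
        data = f.read()
    # The accumulator starts at 0 and can only stay 0 if every byte is 0.
    return reduce(or_, data, 0)
```

A result of 0 means no bit was set anywhere in the file; as in the answer, an empty file also produces 0 because the accumulation never runs.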

GammaFunction

Posted 2018-08-19T14:19:02.480

Reputation: 2 838