How do I create a 1GB random file in Linux?

91

39

I am using the bash shell and would like to pipe the output of the command openssl rand -base64 1000 to the command dd, such as dd if={output of openssl} of=sample.txt bs=1G count=1. I think I can use variables, but I am unsure how best to do so. I want to create the file because I would like a 1GB file with random text.

PeanutsMonkey

Posted 2012-09-06T18:03:06.640

Reputation: 7 780

Answers

139

if= is not required, you can pipe something into dd instead:

something... | dd of=sample.txt bs=1G count=1

That wouldn't be useful here, though, since openssl rand requires specifying the number of bytes anyway. So you don't actually need dd; this would work:

openssl rand -out sample.txt -base64 $(( 2**30 * 3/4 ))

1 gigabyte is usually 2^30 bytes (though you can use 10**9 for 10^9 bytes instead). The * 3/4 part accounts for Base64 overhead, making the encoded output about 1 GB.
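The size arithmetic can be checked without generating any data. The sketch below is not part of the original answer; it assumes, as noted in the comments further down this thread, that OpenSSL's Base64 encoder wraps its output at 64 characters per line and ends every line with a newline. The helper name b64_wrapped_size is hypothetical.

```shell
# Hypothetical helper: estimate how many bytes a Base64 encoder writes for
# $1 raw bytes, assuming 64-character lines that each end with a newline.
b64_wrapped_size() {
    raw=$1
    chars=$(( (raw + 2) / 3 * 4 ))   # 4 output characters per 3 input bytes, padded
    lines=$(( (chars + 63) / 64 ))   # one newline per started 64-character line
    echo $(( chars + lines ))
}

# 2^30 * 3/4 raw bytes: the newlines push the result slightly past 2^30.
b64_wrapped_size $(( 1073741824 * 3 / 4 ))

# The raw byte count used in another answer in this thread.
b64_wrapped_size 792917038
```

Under that assumption, the first call reports a size somewhat larger than 2^30 bytes, which is why scaling by 3/4 alone gives only roughly 1 GB.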

Alternatively, you could use /dev/urandom, but it would be a little slower than OpenSSL:

dd if=/dev/urandom of=sample.txt bs=1G count=1

Personally, I would use bs=64M count=16 or similar:

dd if=/dev/urandom of=sample.txt bs=64M count=16
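The total written is bs multiplied by count, so bs=64M count=16 yields the same 1 GiB as bs=1G count=1 while dd only has to hold a 64 MiB buffer. A small-scale sketch of the same idea follows; the helper name make_random_file and the file name demo.bin are made up for illustration, and iflag=fullblock is the GNU dd flag mentioned in the comments below, so this assumes GNU coreutils.

```shell
# Hypothetical helper: write (block size in MiB) * (count) MiB of random bytes.
make_random_file() {
    out=$1
    bs_mib=$2
    count=$3
    # iflag=fullblock (GNU dd) keeps reading until each block is full,
    # so a short read does not silently shrink the output.
    dd if=/dev/urandom of="$out" bs="${bs_mib}M" count="$count" iflag=fullblock 2>/dev/null
}

# Demo with 4 blocks of 1 MiB; for 1 GiB the arguments would be 64 and 16.
make_random_file demo.bin 1 4

# Print the resulting size in bytes (4 * 1048576 if nothing was truncated).
wc -c < demo.bin
```

Comparing the reported size with bs * count is a quick way to spot a truncated file.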

user1686

Posted 2012-09-06T18:03:06.640

Reputation: 283 655

3

Note: if dd says "dd: warning: partial read (33554431 bytes); suggest iflag=fullblock", it will create a truncated file, so add iflag=fullblock and then it works. – rogerdpack – 2018-09-27T20:21:42.443

Thanks. A few questions, does using the command openssl rand -base64 $(( 2**30 * 3/4 )) > sample.txt give you a true text file? Secondly I don't quite follow the use of bs=64M count=16. Can you elaborate further? – PeanutsMonkey – 2012-09-06T19:06:05.513

2

I posted a question regarding compressing large files at http://superuser.com/questions/467697/why-does-a-zip-file-appear-larger-than-the-source-file-especially-when-it-is-tex and was advised that using /dev/urandom generates a binary file and not a true text file.

– PeanutsMonkey – 2012-09-06T19:10:03.823

@PeanutsMonkey: What do you mean by a "true text file"? A file that only contains printable characters, I'm guessing? Then yes, the -base64 option tells OpenSSL to output a "text" file. – user1686 – 2012-09-06T19:23:21.150

@PeanutsMonkey: But beware that random data does not compress well, regardless of whether it is "binary" or "true text". – user1686 – 2012-09-06T19:23:52.223

2

@PeanutsMonkey: Right; you would need something like dd if=/dev/urandom bs=750M count=1 | uuencode my_sample > sample.txt. – Scott – 2012-09-06T19:33:41.873

@Scott - Can you elaborate what that does exactly as well as why you are using a byte size of 750M and a count of 1? – PeanutsMonkey – 2012-09-06T19:52:58.103

@grawity - Well people keep bouncing the term "true text file" and based on my previous post it was suggested that /dev/urandom generates binary files. My understanding is that a text file is one with printable characters although am unsure whether ASCII characters would count. I thought -base64 is used to convert binary data to text? – PeanutsMonkey – 2012-09-06T19:56:09.253

@grawity - If random data does not compress well, how can I create a file that mimics real world scenarios? – PeanutsMonkey – 2012-09-06T19:56:46.023

3

@PeanutsMonkey: There's no single "real world scenario", some scenarios might be dealing with gigabytes of text, others – with gigabytes of JPEGs, or gigabytes of compiled software... If you want a lot of text, download a Wikipedia dump for example.

– user1686 – 2012-09-06T20:06:27.393

2

@PeanutsMonkey: The dd reads 750 MiB (786,432,000 bytes, since dd's M suffix means 1024×1024) from /dev/urandom and pipes them into uuencode. uuencode encodes its input into a form of base64 encoding (not necessarily consistent with other programs). In other words, this converts binary data to text. I used 750M because I trusted grawity's statement that base64 encoding expands data by 33⅓%, so you need to ask for ¾ as much binary data as you want in your text file. – Scott – 2012-09-06T20:07:33.557

@Scott: Pure Base64 always encodes 3 bytes to 4 (33.(3)%). OpenSSL's encoder splits output into 64-character lines (so about 35.4% overhead; I forgot to account for this – would be *48/65). UUencode uses even shorter lines and adds length prefixes, header & footer, resulting in ~40% overhead. – user1686 – 2012-09-06T20:15:09.953

@Scott - That makes sense although am curious to understand why limit the count to 1? – PeanutsMonkey – 2012-09-06T20:41:56.887

@grawity - I am astounded by the depth of knowledge. Where are you learning all of this stuff? – PeanutsMonkey – 2012-09-06T20:42:24.870

1

@leighmcc: FYI: using > redirection does not make the writes pass through bash – it is equivalent to having the program open the file directly. – user1686 – 2013-05-10T14:03:14.433

26

Create a 1GB.bin random content file:

dd if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock

anneb

Posted 2012-09-06T18:03:06.640

Reputation: 724

4

For me, iflag=fullblock was the necessary addition compared to the other answers. – dojuba – 2018-09-18T14:55:11.643

2

If you want EXACTLY 1GB, then you can use the following:

openssl rand -out "$testfile" -base64 792917038; truncate -s-1 "$testfile"

The openssl command makes a file exactly 1 byte too big. The truncate command trims that byte off.

Joel Jacobs

Posted 2012-09-06T18:03:06.640

Reputation: 21

That extra byte is probably because of the -base64. Removing it will result in a file with the correct size. – Daniel – 2019-10-10T11:34:04.930

-1

Try this script.

#!/bin/bash
openssl rand -base64 1000 | dd of=sample.txt bs=1G count=1

This script might work as long as you don't mind using /dev/random.

#!/bin/bash
dd if=/dev/random of=sample.txt bs=1G count=1

Jonathan Reno

Posted 2012-09-06T18:03:06.640

Reputation: 318

8

I wouldn't recommend wasting /dev/random on this unless there's a very good reason to do so. /dev/urandom is much cheaper. – Ansgar Wiechers – 2012-09-06T18:22:23.267

1

Also, $var=(command) isn't valid syntax in this context. – user1686 – 2012-09-06T18:58:47.580

@grawity - When you say it isn't valid, what do you mean? – PeanutsMonkey – 2012-09-06T19:08:43.980

I mean exactly that – it's incorrect. – user1686 – 2012-09-06T19:22:17.797

3

@grawity, @PeanutsMonkey: He made a typo; he meant random=$(openssl rand -base64 1000). Although I would question whether bash would let you assign a gigabyte-long value to a variable. And even if you do say random=$(openssl rand -base64 1000), the subsequent if=$random doesn't make sense. – Scott – 2012-09-06T19:28:21.520

Right. random=<(openssl ...) would almost work (if not for bash's poorly-thought-out implementation of the feature). And dd if=<(openssl ...) would definitely work, but then it's just the exact same thing as openssl ... | dd – user1686 – 2012-09-06T19:37:55.743
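The pipe form discussed in this thread can be sketched at a small scale as follows. This is an illustration only: printf stands in for whatever command generates the data, and the file name piped.txt is made up.

```shell
# dd reads standard input when no if= is given, so any command's output
# can be piped into it; no intermediate variable is needed.
printf 'sample text from a pipe\n' | dd of=piped.txt 2>/dev/null

# Show what was written.
cat piped.txt
```

Without bs and count, dd simply copies until the pipe closes. When count is used on a pipe, short reads can truncate the output, which is the case the iflag=fullblock comments above address.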