Issue details
I am trying to invoke wget from Java to download a file, but I keep hitting a weird issue where the downloaded file is truncated.
For example, when issuing "wget https://speed.hetzner.de/1GB.bin" from a shell, I correctly get 1GB.bin with a file size of 1,048,576,000 bytes (1000 MiB). But when invoking the same command from Java, I consistently end up with a file of approximately 40 MB.
Debugging
Assuming you have JDK installed, here is an MCVE that reproduces this behavior:
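echo 'class RunCommand {
    public static void main(String[] args) throws Exception {
        // Join the command-line arguments into a single command string.
        String s = "";
        for (int i = 0; i < args.length; i++)
            s += (i > 0 ? " " : "") + args[i];
        // Run the command and print its exit code once it terminates.
        System.out.println(Runtime.getRuntime().exec(s).waitFor());
    }
}' > RunCommand.java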
javac RunCommand.java
java RunCommand wget https://speed.hetzner.de/1GB.bin
I have tried this on a clean AWS CentOS 7.6 machine with all of:
- OpenJDK 7
- OpenJDK 8
- Oracle JDK 8
I always end up with the same result: Java hangs and the file size is around 40 MB.
I have also tried increasing the heap size with -Xms1024m -Xmx1024m, to no avail, so heap size does not appear to be the problem.
Now, running the exact same thing again with curl instead:
java RunCommand curl https://speed.hetzner.de/1GB.bin -o 1GB.bin
This surprisingly works and I successfully end up with a 1GB file!
Questions
So there are many questions here:
- Why is Java hanging after 40 MB?
- Why always exactly 40 MB? (Grepping for 40 in the output of -XX:+PrintFlagsFinal gives no clue.)
- What difference is there between the wget and curl commands that could lead to one failing and the other succeeding?
wget -q does the job.
So it seems that wget's output (the progress bar) is causing this issue. The question still remains, however, about what exactly in Java is trying to read stdin and where the limit on this is coming from. It seems to be almost exactly 40 MB across different Linux distributions and different versions of Java, which is quite intriguing to say the least. – Jbezos – 2019-02-13T15:26:01.843
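One likely explanation, offered here as an assumption rather than something confirmed in this thread: Runtime.exec connects the child's stdout and stderr to pipes, the MCVE never reads them, and wget writes its progress to stderr, so wget blocks once the pipe is full. Below is a minimal sketch that merges stderr into stdout and reads it until EOF before waiting for the exit code. The class name DrainedCommand and its run helper are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

class DrainedCommand {
    /**
     * Runs the command, discarding everything it prints.
     * Returns {exit code, number of output bytes read}.
     */
    public static long[] run(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Merge stderr into stdout so a single read loop covers both streams.
        pb.redirectErrorStream(true);
        Process p = pb.start();

        long total = 0;
        byte[] buf = new byte[8192];
        try (InputStream in = p.getInputStream()) {
            int n;
            // Keep reading until the child closes its end of the pipe.
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        }
        int exit = p.waitFor();
        return new long[] {exit, total};
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java DrainedCommand <command> [args...]");
            return;
        }
        long[] result = run(Arrays.asList(args));
        System.out.println("exit=" + result[0] + ", bytes drained=" + result[1]);
    }
}
```

It would be invoked the same way as the MCVE, for example "java DrainedCommand wget https://speed.hetzner.de/1GB.bin". Passing the arguments as a list also avoids re-joining and re-splitting the command line. If the output is not needed at all, ProcessBuilder.inheritIO() is another option that avoids the pipes altogether.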
Another question I guess is why this problem is not happening with curl since curl also has a progress output... – Jbezos – 2019-02-13T15:33:49.417
You can capture the output in a file using wget https://speed.hetzner.de/1GB.bin 2>&1 | tee wget_output.log. With wget the file grows larger than 1.5 MB, while doing the same thing with curl, the file remains under 30k. – Jbezos – 2019-02-22T18:18:04.817
I couldn't verify this, but at around 40M with wget, the file was about 64k. – Jbezos – 2019-02-22T18:19:38.663
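The figure of about 64k matches the default capacity of a Linux pipe (65,536 bytes), and curl's smaller progress output would stay under that limit. Whether this is the actual cause here is an assumption, not something verified in this thread. The sketch below tries to reproduce the stall without any download: a child process writes a given number of bytes to stderr while the parent never reads them. It assumes a Unix-like system with sh and head, and Java 8 or later for waitFor with a timeout. PipeStallDemo and finishesWithoutDraining are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

class PipeStallDemo {
    /**
     * Starts a child that writes `bytes` zero bytes to stderr and never reads them.
     * Returns true if the child exits within the timeout, false if it stalled.
     */
    public static boolean finishesWithoutDraining(int bytes, long timeoutSeconds)
            throws Exception {
        // "exec" replaces the shell with head, so destroying the process stops the writer itself.
        Process p = new ProcessBuilder(
                "sh", "-c", "exec head -c " + bytes + " /dev/zero 1>&2").start();
        boolean finished = p.waitFor(timeoutSeconds, TimeUnit.SECONDS);
        if (!finished) {
            // The child is still blocked on a full pipe: clean it up.
            p.destroyForcibly();
            p.waitFor();
        }
        return finished;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("1,000 bytes finished:   " + finishesWithoutDraining(1000, 5));
        System.out.println("200,000 bytes finished: " + finishesWithoutDraining(200000, 2));
    }
}
```

If the pipe explanation holds, the small write should finish and the large one should time out, which is the same pattern as curl succeeding and wget hanging.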