What would you expect to happen after this:
for /l %i in (1,1,100) do @more some.bbl | grep a | md5sum
Most probably, not this:
ec3ecb76408d4225ff23a25d0596e00f *-
13cfd899b90b9cd7aedb406a785e8eac *-
737e8898a65657f1a2ce8012ff1ffe82 *-
d4095243e56a7da3b31a352423a5417a *-
319db7810e677414ca1609238bdeba6f *-
31e626a8ce0732fda1fa7499c8b13dfa *-
006fe390f923d50348d65d0bbefa64d8 *-
77708f62cb2d61a45788a656d0979aee *-
cda10a9ab71c2bce4df069c479241349 *-
b01b71dc7dca11808ca989c4985513ca *-
c22a6f8b1cac9a93c4fe10b07a9f483a *-
0b04f4b24f3f183270eb7414f4f86e3d *-
5a2f8b8ad482ae8f70b7ce3384a7c9e2 *-
beccdbe737b48c02b48c4524cd89eede *-
a16fec5238cfe8dfff6b403ff943a8ca *-
ec0cd2edc0009abd14119915a8b563f4 *-
1e78f0012ca09aeade169f815415da40 *-
...
I was worried, too, so I ran a couple of sanity checks:
for /l %i in (1,1,100) do @more some.bbl | md5sum
yields 100 times
ace4f37f3a1433e29696a535c0b79f2c *-
Same for
for /l %i in (1,1,100) do @grep a some.bbl | md5sum
which yields 100 times
d8753d755025a1119cd2910c6f5cb0de *-
So more, grep, and md5sum work fine by themselves. Also, the pipe before md5sum is not a problem, since
for /l %i in (1,1,100) do @more some.bbl | grep a > out%i
md5sum out*
confirms the issue. Comparing the outputs with fc, I find no difference. Running diff on them reveals invisible differences, confirmed by a hex editor to be differences in line endings in seemingly random places (and different from file to file).
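The line-ending differences can also be tallied without a hex editor. This is a minimal sketch, assuming Python is available (it is not one of the tools used above); the function name count_line_endings is made up for illustration.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR terminators in raw bytes."""
    counts = {"CRLF": 0, "LF": 0, "CR": 0}
    # Alternation order matters: try the two-byte CRLF before the single bytes.
    for match in re.finditer(rb"\r\n|\n|\r", data):
        token = match.group()
        if token == b"\r\n":
            counts["CRLF"] += 1
        elif token == b"\n":
            counts["LF"] += 1
        else:
            counts["CR"] += 1
    return counts
```

Reading each out* file in binary mode (open(path, "rb")) and passing its contents to this function should show whether the files differ only in their mix of terminators.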
The issue still occurs, but less often, in this example:
for /l %i in (1,1,100) do @more some.bbl | grep "[a-z]" | md5sum
yielding
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
ef23817185d41987c11cb1fc4371bb76 *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
e398e63b60cee3e271967f01350068f1 *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
b135bcfe0bcfb7f1c43fe1905164c31e *-
...
Now I am running out of ideas as to what the reason could be. I would not care much about this if I were not losing valid lines in cases like this:
for /l %i in (1,1,100) do @more "some.bbl" | grep "\}$" | wc -l
This gives
249
249
249
248
255
253
252
248
251
...
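One way that inconsistent line endings could change such a count is sketched below. This is only an assumption, not something confirmed for this grep build: if a stray carriage return ends up between the } and the newline, a pattern anchored with $ no longer sees } as the last character. The sketch uses Python's re module rather than grep, and count_brace_lines is a hypothetical helper.

```python
import re

# Same idea as grep "\}$": a closing brace as the very last character of a line.
BRACE_AT_EOL = re.compile(rb"\}$")


def count_brace_lines(data: bytes) -> int:
    """Count lines ending in '}' when lines are split on LF only."""
    lines = data.split(b"\n")
    # A trailing newline leaves an empty last element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return sum(1 for line in lines if BRACE_AT_EOL.search(line))
```

With this splitting rule, the input b"{a}\n{b}\n" counts two lines, while b"{a}\r\n{b}\n" counts only one, because the first line is seen as ending in a carriage return.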
To reproduce similar issues, you can generate a test file like this:
for /l %i in (1,1,200) do @echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX{Something1999a}>> some.bbl
Some more information
C:\>ver
Microsoft Windows [Version 6.1.7601]
C:\>more /h
Displays output one screen at a time.
MORE [/E [/C] [/P] [/S] [/Tn] [+n]] < [drive:][path]filename
...
C:\>grep --ver
GNU grep 2.6.3
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
C:\>md5sum --ver
md5sum (GNU coreutils) 8.15
Packaged by Cygwin (8.15-1)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Ulrich Drepper, Scott Miller, and David Madore.
Why is this happening?
Update: The problem also goes away when replacing more with this cat:
C:\>cat --ver
cat (GNU coreutils) 8.15
Packaged by Cygwin (8.15-1)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Torbjörn Granlund and Richard M. Stallman.
Why are you using more in this way and what exactly are you trying to achieve? – qasdfdsaq – 2015-06-12T14:39:57.703
I don't recall why I initially used more. I did notice, though, that in some cases more | grep is faster than grep. Anyway, isn't it completely irrelevant why I am using more (and not cat, or grep, or whatever) with respect to my question, WHY the combination of command-line tools behaves in a non-deterministic way? – bers – 2015-06-12T16:28:31.497
I think it's relevant because more is not intended to be used in this way. My guess is that, because you are using more in a way it was not designed for, it is forking or returning at the wrong time, or mis-detecting your terminal parameters because there is no terminal. I wouldn't be surprised at a program behaving non-deterministically when its documented behaviour is "undefined" – qasdfdsaq – 2015-06-12T16:32:17.107
Oh. So you think the fact that more expects terminal input at the end of a screen (when output is not redirected) might interfere with the output? And the apparent randomness basically results from the position on a virtual terminal window? This might actually make sense. – bers – 2015-06-12T16:38:16.073