11
I have the URL of an HTML page and I want to grep it. How can I do that with wget someArgs | grep keyword?
My first idea was wget -q -O - url | grep keyword, but wget's output bypasses grep and appears on the terminal in its original form.
11
The easiest way is to use curl with the -s option for silent mode:
curl -s http://somepage.com | grep whatever
1 Question asks for wget, not curl. This also won't work with multiple redirects unless you add the -L option. – Ligemer – 2016-10-07T23:07:24.800
@slhck: Both commands do exactly the same for me. – Dennis – 2012-06-01T21:39:36.603
@Dennis Try curling http://superuser.com/questions/431581. For whatever reason I tested it with this particular URL and got no output. Dunno what I'm missing. – slhck – 2012-06-01T21:45:44.173
@slhck: Curl doesn't follow redirects by default. It does with the -L switch. – Dennis – 2013-03-31T20:15:45.353
@Dennis Didn't know what you were talking about without seeing the deleted comments – but yeah, that makes sense. Thanks for clearing it up. – slhck – 2013-03-31T20:27:25.127
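Putting the pieces of this comment thread together, a sketch for a URL that redirects (the superuser.com link above is just the example discussed here, and the search term is a placeholder) would combine -s with -L:
curl -sL http://superuser.com/questions/431581 | grep wget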
11
Keeping this around for the sake of completeness.
Your example should actually work. The syntax is correct, and here's a screencast I just took demonstrating it, with a good old GNU wget 1.13.4.
wget -q some-url -O - | grep something
So I assume your pattern is wrong and grep simply filters out everything it receives.
It could also be a typo in the URL. With -q, there is no error message. – Dennis – 2012-06-02T00:44:55.670
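As a quick sanity check along the lines of that last comment, dropping -q keeps the page content flowing to grep through the pipe while wget's error and progress messages stay visible on the terminal via stderr; the URL and keyword here are placeholders:
wget -O - http://example.com/ | grep keyword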
3
If you are looking to grep or pipe the headers, they are sent to stderr by default, so you need to redirect them. E.g.:
wget -O - http://example.com/page.php 2>&1 > /dev/null | grep HTTP
2 This is the correct way of doing it, thanks! – Udayraj Deshmukh – 2018-03-23T06:38:58.183
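A related sketch, assuming the full response headers are what you want: wget's -S (--server-response) flag prints them to stderr, so sending the body to /dev/null and redirecting stderr into the pipe lets grep see them; example.com is a placeholder:
wget -S -O /dev/null http://example.com/ 2>&1 | grep 'HTTP/'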
3
This bug was present in v1.12.1 and was fixed in a later version. Currently I use v1.15 and it works as expected.
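If you want to check which build you are running before deciding whether this applies, wget can report its own version:
wget --version | head -n 1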
0
wget writes its status and progress output to stderr, not to stdout (the page content requested with -O - goes to stdout), so to grep that output one needs to redirect stderr to stdout:
wget -q -O - url 2>&1 | grep keyword
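To see that split in practice, a sketch that keeps the page on stdout for grep and saves the status messages to a file instead (wget.log is just an illustrative name):
wget -O - url 2> wget.log | grep keyword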
grep selects lines delimited by (e.g.) carriage return and linefeed characters. An HTML response doesn't have to have lines; it has text with markup like <br> or <p>, so the whole web page could look like one line to grep. – RedGrittyBrick – 2012-06-01T19:44:45.927
1 @RedGrittyBrick The OP's command works flawlessly for me. – slhck – 2012-06-01T19:47:30.527
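If the page really does arrive as one long line, as RedGrittyBrick describes, two rough workarounds are to break it up at tag boundaries before grepping, or to print only the matched text; the URL and keyword are placeholders:
wget -q -O - http://example.com/ | tr '>' '\n' | grep keyword
wget -q -O - http://example.com/ | grep -o keyword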