Get the number of images on a webpage

-5

Recreational languages win at code golf too often. This challenge is simple when using higher-level functions.

The task is as follows:

Given any valid url, which is guaranteed to result in a valid HTML document (for example https://en.wikipedia.org/wiki/Ruby_on_Rails, which has 9 images), count the number of images displayed from the resultant webpage.

Only images visible in the HTML code are counted, marked by the HTML <img> tag.
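
For reference, the counting rule above can be sketched in ungolfed Python 3 with only the standard library. The names `ImgCounter` and `count_img_tags` are assumptions made for illustration, and the sketch takes the HTML as a string rather than downloading it from a URL.

```python
from html.parser import HTMLParser


class ImgCounter(HTMLParser):
    """Counts <img> start tags seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <IMG> is matched as well.
        # Self-closing <img/> tags also arrive here by default.
        if tag == "img":
            self.count += 1


def count_img_tags(html: str) -> int:
    """Return the number of <img> tags in the given HTML source."""
    parser = ImgCounter()
    parser.feed(html)
    parser.close()
    return parser.count
```

Because this goes through a parser, an `<img>` that appears only inside an HTML comment is not counted, which a plain substring search would not distinguish.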

Websites that can continually load images (Google Images) are not valid tests.

0liveradam8

Posted 2018-04-09T18:45:15.597

Reputation: 93

Question was closed 2018-04-10T17:49:04.283

Comments are not for extended discussion; this conversation has been moved to chat.

– Mego – 2018-04-12T12:47:24.497

2

This is [tag:code-golf], right? The shortest code wins, right? Then why does a 61-byte answer beat a 10-byte answer? If my answer is invalid in some way, just comment on my post.

– NoOneIsHere – 2018-04-13T22:55:26.343

Answers

2

JavaScript, 61 bytes

As per this consensus, the code needs to run under the same domain as the page being requested, to avoid CORS issues. Returns a Promise containing the count.

u=>fetch(u).then(r=>r.text()).then(t=>t.split`<img`.length-1)

Thanks to tsh for pointing out some code I forgot to update before posting, saving 3 bytes.
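
The split-and-subtract idea in this answer can be illustrated in Python as a rough sketch. The function name `count_by_split` and its `needle` parameter are hypothetical, and the sketch works on a string that has already been downloaded.

```python
def count_by_split(html: str, needle: str = "<img") -> int:
    # Splitting on the needle yields one more piece than there are matches,
    # so an input with no matches gives a single piece and a count of 0.
    return len(html.split(needle)) - 1
```

Like the golfed version, this is a case-sensitive text match, so it would miss `<IMG` and would also count the text `<img` wherever it appears, including inside comments or scripts.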


Try it

(f=
u=>fetch(`https://crossorigin.me/`+u).then(r=>r.text()).then(t=>t.split`<img`.length-1)
)(i.value=`https://codegolf.stackexchange.com/questions/161669/`).then(x=>o.innerText=x);b.addEventListener(`click`,_=>f(i.value).then(x=>o.innerText=x),0)
<input id=i><button id=b>Count</button><pre id=o></pre>

Shaggy

Posted 2018-04-09T18:45:15.597

Reputation: 24 623

Why not .match? or .split"<img"? – tsh – 2018-04-10T11:05:53.757

.match works out longer in order to handle the case of their being no images. You're right on the second count, though; forgot to change it before posting. Thanks. – Shaggy – 2018-04-10T11:07:48.407

*there! Dunno how I did that! – Shaggy – 2018-04-10T22:12:23.340

2

CJam, 10 bytes

lg"<img"e=

Does not work online.

Description:

l          # read line
 g         # open url and get html
  "<img"   # this string
        e= # count arg2 in arg1
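
The steps described above (read a URL, fetch its HTML, count a substring) might look like this in ungolfed Python 3. This is only a sketch: the function names are made up for illustration, and decoding the response as UTF-8 is an assumption that will not hold for every page.

```python
from urllib.request import urlopen


def count_substring(html: str, needle: str = "<img") -> int:
    # Number of times the needle occurs in the page source.
    return html.count(needle)


def count_from_url(url: str) -> int:
    # Download the page; undecodable bytes are replaced rather than raising.
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return count_substring(html)
```

Calling `count_from_url` with a page address requires network access; `count_substring` can be tried on any string.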

NoOneIsHere

Posted 2018-04-09T18:45:15.597

Reputation: 1 916

1@downvoters what did I do wrong? – NoOneIsHere – 2018-04-10T04:13:14.327

1No idea. / Anyone who hates golfing languages is likely to know none of them. (otherwise they would know how hard it is to golf in those languages) Bash is not even a golfing language... – user202729 – 2018-04-10T04:53:54.457

1Can you add a TIO and/or an explanation? I don't know CJam and can't figure out how this is reading in the HTML. – Shaggy – 2018-04-10T09:39:49.390

2

Python 3 + Selenium + Firefox, 124 bytes

from selenium.webdriver import*
d=Firefox()
d.get(input())
c=len(d.find_elements_by_xpath('//img'))
d.quit()
print(c)

Note that https://en.wikipedia.org/wiki/Ruby_on_Rails reports 8 instead of 9. This is because one of the <img> tags is inside a <noscript> element, whose content a script-enabled browser treats much like a comment rather than as markup.
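
The difference between the two counts can be reproduced without a browser. The following is a minimal sketch using Python's standard `html.parser`, which with its default settings parses the content of `<noscript>` as ordinary markup; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class NoscriptAwareCounter(HTMLParser):
    """Counts all <img> tags and, separately, those outside <noscript>."""

    def __init__(self):
        super().__init__()
        self.noscript_depth = 0
        self.total = 0
        self.outside_noscript = 0

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.noscript_depth += 1
        elif tag == "img":
            self.total += 1
            if self.noscript_depth == 0:
                self.outside_noscript += 1

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth > 0:
            self.noscript_depth -= 1


def count_imgs(html: str) -> tuple:
    """Return (all img tags, img tags not nested in noscript)."""
    parser = NoscriptAwareCounter()
    parser.feed(html)
    parser.close()
    return parser.total, parser.outside_noscript
```

For a page with one `<img>` inside `<noscript>`, the first number corresponds to counting tags in the source and the second to what a script-enabled browser exposes in the DOM.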

Changing webdriver.Firefox to webdriver.Ie may save 5 bytes... But I just dislike IE.

How to run:

  • Install Python 3 and Firefox
  • Install Selenium with pip install selenium
  • Download geckodriver and put it on your PATH
  • Run this script
  • A Firefox window will open; ignore it and enter the URL on stdin
  • The result will be printed to stdout

tsh

Posted 2018-04-09T18:45:15.597

Reputation: 13 072

A manual count confirms that there are 9 img tags on that page. – Shaggy – 2018-04-10T09:38:14.993

@Shaggy One of them is in a <noscript> tag, which will not be rendered if the browser supports JavaScript; 3 of them are invisible to the user. So there are 5 images displayed. – tsh – 2018-04-10T09:55:07.020

"*Only images visible in the HTML code are counted, marked by the HTML <img> tag.*" – Shaggy – 2018-04-10T09:56:41.290

Yes, and also "count the number of images displayed from the resultant webpage". I read this as "only displayed <img> tags are counted". – tsh – 2018-04-10T09:59:44.410

You're right, there's a contradiction there. I suggest asking for clarification. – Shaggy – 2018-04-10T10:01:13.393

@tsh @0liveradam8 So we just need to count <img> tags? – NoOneIsHere 19 hours ago @NoOneIsHere Yes! But you have to download the HTML code from a given URL first – NoOneIsHere – 2018-04-10T15:49:12.707

For some reason my laptop will not let me install Selenium, so I can't test, but can't you just do import selenium as S and then S.webdriver.Firefox() to save 7? – ElPedro – 2018-04-10T20:35:43.440

@Shaggy @NoOneIsHere Just updated to fit OP's comment (ignored is_displayed test). – tsh – 2018-04-11T01:53:01.410

@ElPedro it seems that from selenium.webdriver import* works, updated. – tsh – 2018-04-11T01:53:35.337

1

Python 2 + BeautifulSoup, 103 bytes

import bs4 as B
import urllib as U
print len(B.BeautifulSoup(U.urlopen(input()).read()).findAll('img'))

Can't do a TIO as BeautifulSoup is not available there. This reports 9 for https://en.wikipedia.org/wiki/Ruby_on_Rails.

Note: A warning is displayed before the number, as a parser is not specified and lxml is taken as the default. Adding the lxml parser explicitly suppresses the warning but costs another 7 bytes.

import bs4 as B
import urllib as U
print len(B.BeautifulSoup(U.urlopen(input()).read(),'lxml').findAll('img'))

ElPedro

Posted 2018-04-09T18:45:15.597

Reputation: 5 301

-1

Bash, grep, and wc, 45 28 bytes

curl $1|grep -o "<img"|wc -l

-17 bytes thanks to tsh

Explanation:

curl $1                      # download html from first parameter
       |grep -o "<img"       # find all <img strings
                      |wc -l # count lines
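
The role of `-o` in the pipeline above is that grep prints each match on its own output line, so `wc -l` ends up counting matches rather than matching input lines. The difference can be sketched in Python; both function names are hypothetical.

```python
def count_matching_lines(html: str, needle: str = "<img") -> int:
    # Counts input lines containing at least one match,
    # which is what grep without -o would pass on to wc -l.
    return sum(1 for line in html.splitlines() if needle in line)


def count_matches(html: str, needle: str = "<img") -> int:
    # Counts every occurrence, one per match, as with grep -o.
    return sum(line.count(needle) for line in html.splitlines())
```

On HTML where several `<img>` tags share a line, such as minified output, the first function undercounts while the second does not.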

NoOneIsHere

Posted 2018-04-09T18:45:15.597

Reputation: 1 916

3What if there are multiple <img> tags on a line? – nimi – 2018-04-09T23:06:33.277

@downvoters what did I do wrong? – NoOneIsHere – 2018-04-10T04:13:22.930

grep -o '<img' should work if you just want the <img substring. – tsh – 2018-04-10T07:00:51.983

@nimi - if that's the case then the person who made the web site should be a code golfer and not a web developer :-) – ElPedro – 2018-04-10T20:25:11.990

1@ElPedro No need for a code golfer; a minify tool should be enough. – tsh – 2018-04-11T04:51:09.727