A Python solution
You can use Scrapy, which will do most of the work for you. You would then just need collections.Counter
to get the top words (if you are looking for frequency counts).
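As a minimal sketch of the counting step on its own (the word list here is made up for illustration and stands in for whatever text your scraper returns):

```python
from collections import Counter

# Hypothetical word list standing in for text scraped from a page.
words = ["le", "monde", "de", "le", "monde", "le"]

# most_common(n) returns the n most frequent items as (item, count) pairs.
print(Counter(words).most_common(2))  # [('le', 3), ('monde', 2)]
```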
You could also use a lower-level approach with Beautiful Soup to get the top 5 words:
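# coding=utf-8
import collections
import requests
from bs4 import BeautifulSoup
# Download the page and parse the HTML
thesite = requests.get("http://www.lemonde.fr").text
soup = BeautifulSoup(thesite, 'html.parser')
# Extract the page text and split it on whitespace
thewords = soup.get_text().split()
# Print the 5 most frequent tokens with their counts
print(collections.Counter(thewords).most_common(5))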
Since the output is
[('de', 223), ('la', 154), (':', 123), ('{', 115), ('à', 84)]
you could set a minimum length for a "word" (3 perhaps?) to filter out punctuation and very short words.
UPDATE: here is the code for a list of the words with 3 or more letters, sorted from most to least common:
# coding=utf-8
import requests
import collections
from bs4 import BeautifulSoup
import operator
thesite = requests.get("http://www.lemonde.fr").text
soup = BeautifulSoup(thesite, 'html.parser')
thewords = soup.get_text().split()
# keep only words of 3 or more chars
thewords = {w: f for w, f in collections.Counter(thewords).items() if len(w) >= 3}
topwords = sorted(thewords.items(), key=operator.itemgetter(1), reverse=True)
print(topwords)
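If punctuation tokens are still a problem, one option is to extract words with a regular expression instead of split(). The following is only a sketch: the top_words helper, its min_length and n parameters, and the sample string are my own assumptions, not part of the code above. It works on any text, so you could pass it soup.get_text().

```python
import re
from collections import Counter


def top_words(text, min_length=3, n=5):
    """Return the n most common words of at least min_length characters."""
    # \w+ matches runs of letters, digits and underscores (Unicode-aware in
    # Python 3), so tokens such as ':' or '{' are never produced.
    words = re.findall(r"\w+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(n)


# Made-up sample text, used here instead of a live page.
sample = "La vie : le monde, le Monde et la vie du monde { à Paris }"
print(top_words(sample, n=2))  # [('monde', 3), ('vie', 2)]
```

Lowercasing merges "Monde" and "monde" into one entry; drop the .lower() call if you want case-sensitive counts.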