Display and analyze the graph of a website to check its connectivity and compute properties

0

I would like to check several properties of the graph behind a website I own. The properties I'm interested in are: diameter, shortest path from the index page, min/max/average degree of vertices, and other properties that do not require solving an NP-complete problem. I'm also interested in tags that are not links, in order to check the programming style.

This is a WordPress site, but I would like something generic. I've tried Gephi, but the UI is not very good and I didn't find plugins to import or generate that kind of data. It should run under Linux or Mac OS X; I'm not looking for an online application.
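
To make the properties above concrete, here is a minimal sketch of how they could be computed once the link graph exists. It assumes the graph is a plain dict mapping each page to the set of pages it links to (a representation chosen for illustration, not something any particular tool produces), treats links as directed, and measures the diameter only over pairs of pages where one can actually be reached from the other. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, start):
    """Return {page: number of link hops from start} for every reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def diameter(graph):
    """Longest shortest path over all ordered pairs (source, reachable target).

    Assumes the graph has at least one page.
    """
    return max(max(bfs_distances(graph, page).values()) for page in graph)


def out_degree_stats(graph):
    """Return (min, max, average) number of distinct outgoing links per page."""
    degrees = [len(targets) for targets in graph.values()]
    return min(degrees), max(degrees), sum(degrees) / len(degrees)
```

Calling `bfs_distances(graph, index_page)` gives the shortest path length from the index page to every page it can reach. Running one breadth-first search per page is quadratic in the worst case, which should still be fine for a single site.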

alecail

Posted 2012-04-27T11:39:38.247

Reputation: 453

In what format is the data available? Do you have to interpret the graph, or do you have access to its raw data? Can you post a sample? – dav – 2012-04-27T17:29:31.550

The graph must be built from the HTML pages. What do you mean by "interpret the graph"? Basically, I want to iterate through the links on the first page of the website, adding one vertex for each page and one edge for each link, then repeat for each newly found page until no new pages turn up. Then I will compute the properties I mentioned above on this graph. But I don't want to reinvent the wheel. – alecail – 2012-04-27T21:30:22.357
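
The crawl described in this comment could be sketched roughly as follows, using only the Python standard library. This is an illustration under assumptions, not a tested crawler: `fetch` is a hypothetical callable that takes a URL and returns the page's HTML as a string (for instance a thin wrapper around `urllib.request.urlopen`), only links on the same host as the start page are followed, and `#fragment` parts are dropped so that anchors within a page do not create extra vertices.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.links.append(href)


def crawl(start_url, fetch):
    """Breadth-first crawl of same-host pages.

    `fetch` is a hypothetical callable: URL in, HTML text out.
    Returns {url: set of same-host urls that page links to}.
    """
    host = urlparse(start_url).netloc
    graph = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in graph:  # already visited
            continue
        parser = LinkCollector()
        parser.feed(fetch(url))
        targets = set()
        for href in parser.links:
            # Resolve relative links and strip the fragment.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host:
                targets.add(target)
        graph[url] = targets  # one vertex per page, one edge per distinct link
        queue.extend(t for t in targets if t not in graph)
    return graph
```

Passing `fetch` in as a parameter keeps the traversal separate from the network code, so it can be tried on a dict of saved pages before pointing it at the live site. A real run would also need error handling for failed requests and non-HTML responses.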

Sorry, I misunderstood the question initially, so my follow-up didn't make any sense. It sounds like this is beyond what I could help with. Good luck. – dav – 2012-05-01T14:09:23.290

I have found a sample program that does a part of the job: Qt4 example 'Dom Traversal'. With a bit of work I should be able to implement what I have in mind. – alecail – 2012-05-01T14:13:51.270

Answers

0

Try NodeXL with the VOSON Web crawl Spigot (free account required).

edallme

Posted 2012-04-27T11:39:38.247

Reputation: 216