Linked data
In computing, linked data (often capitalized as Linked Data) is structured data that is interlinked with other data so that it becomes more useful through semantic queries.[1] It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them only to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. Part of the vision of linked data is for the Internet to become a global database.
Tim Berners-Lee, director of the World Wide Web Consortium (W3C), coined the term in a 2006 design note about the Semantic Web project.[2]
Linked data may also be open data, in which case it is usually described as linked open data (LOD).[3]
Principles
Tim Berners-Lee outlined four principles of linked data in his "Linked Data" note of 2006,[2] paraphrased along the following lines:
- Use URIs to name (identify) things.
- Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced").
- Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc.
- Refer to other things using their HTTP URI-based names when publishing data on the Web.
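The four principles above can be illustrated with a minimal Python sketch. It is not taken from Berners-Lee's note: the `example.org` URIs, the function names and the sample triples are assumptions made for illustration, and the request is only constructed, never sent.

```python
from urllib.request import Request

# Principles 1 and 2: name the thing with an HTTP URI (hypothetical example URI).
ALICE = "http://example.org/people/alice"


def build_lookup_request(uri: str) -> Request:
    """Build, without sending, a GET request that asks for RDF in Turtle."""
    return Request(uri, headers={"Accept": "text/turtle"}, method="GET")


# Principle 3: a lookup returns useful, standard-format data about the thing.
# Modelled here as (subject, predicate, object) triples; the data is made up.
EXAMPLE_RESPONSE = [
    (ALICE, "http://xmlns.com/foaf/0.1/name", "Alice"),
    (ALICE, "http://xmlns.com/foaf/0.1/knows", "http://example.org/people/bob"),
]


def outgoing_links(triples):
    """Principle 4: collect objects that are HTTP URIs and can be looked up in turn."""
    return [obj for _, _, obj in triples if obj.startswith(("http://", "https://"))]


request = build_lookup_request(ALICE)
print(request.get_method(), request.full_url, request.get_header("Accept"))
print(outgoing_links(EXAMPLE_RESPONSE))
```

The `Accept` header stands in for content negotiation, one common way for a client to ask for a machine-readable representation instead of a web page.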
Tim Berners-Lee gave a presentation on linked data at the TED 2009 conference.[4] In it, he restated the linked data principles as three "extremely simple" rules:
- All kinds of conceptual things, they have names now that start with HTTP.
- If I take one of these HTTP names and I look it up...I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
- When I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.
Components
- URIs
- HTTP
- Structured data using controlled vocabulary terms and dataset definitions expressed in Resource Description Framework serialization formats such as RDFa, RDF/XML, N3, Turtle, or JSON-LD
- Linked Data Platform
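As a rough illustration of how these components fit together, the following Python sketch builds a small JSON-LD document that describes a person with terms from the FOAF vocabulary. The `example.org` URIs and the helper name `person_as_jsonld` are hypothetical; only the FOAF namespace and the JSON-LD keywords (`@context`, `@id`, `@type`) come from published specifications.

```python
import json

FOAF = "http://xmlns.com/foaf/0.1/"


def person_as_jsonld(uri: str, name: str, knows_uri: str) -> dict:
    """Describe a person as a JSON-LD document using FOAF vocabulary terms."""
    return {
        "@context": {
            # Map short keys to full vocabulary URIs.
            "name": FOAF + "name",
            # Values of "knows" are URIs of other things, not plain strings.
            "knows": {"@id": FOAF + "knows", "@type": "@id"},
        },
        "@id": uri,                # the HTTP URI that names this person
        "@type": FOAF + "Person",  # class taken from the controlled vocabulary
        "name": name,
        "knows": knows_uri,
    }


doc = person_as_jsonld(
    "http://example.org/people/alice",  # hypothetical URIs
    "Alice",
    "http://example.org/people/bob",
)
serialized = json.dumps(doc, indent=2)
print(serialized)
```

Because JSON-LD is ordinary JSON, the standard `json` module is enough to serialize it; interpreting it as RDF would require a JSON-LD processor, which this sketch does not include.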
Linked open data
Linked open data is linked data that is open data.[5][6][7] Tim Berners-Lee distinguishes linked open data from linked data with the following definition:
Linked Open Data (LOD) is Linked Data which is released under an open license, which does not impede its reuse for free.
Large linked open data sets include DBpedia and Wikidata.
History
The term "linked open data" has been in use since at least February 2007, when the "Linking Open Data" mailing list[1] was created.[9] The mailing list was initially hosted by the SIMILE project[10] at the Massachusetts Institute of Technology.
Linking Open Data community project
The goal of the W3C Semantic Web Education and Outreach group's Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, the datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links.[12][13] By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. A detailed statistical breakdown was published in 2014.[14]
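The distinction between RDF triples and RDF links can be sketched in Python as follows. This is a simplified illustration, not the method used to produce the project's statistics: it assumes that each data source lives on its own host name, and the hosts, predicates and values are all invented.

```python
from urllib.parse import urlsplit

# Two hypothetical data sources, distinguished here only by host name.
A = "http://a.example.org/"
B = "http://b.example.org/"

TRIPLES = [
    (A + "berlin", "http://www.w3.org/2000/01/rdf-schema#label", "Berlin"),
    (A + "berlin", "http://www.w3.org/2002/07/owl#sameAs", B + "place/42"),
    (A + "berlin", A + "vocab#country", A + "germany"),
    (B + "place/42", B + "vocab#kind", "city"),
]


def is_uri(value: str) -> bool:
    """Treat anything starting with an HTTP(S) scheme as a URI, else a literal."""
    return value.startswith(("http://", "https://"))


def host(uri: str) -> str:
    """Return the host part of a URI, used as a stand-in for its data source."""
    return urlsplit(uri).netloc


def count_rdf_links(triples) -> int:
    """Count triples whose subject and object are URIs on different hosts."""
    return sum(1 for s, _, o in triples if is_uri(o) and host(s) != host(o))


print(len(TRIPLES), "triples,", count_rdf_links(TRIPLES), "cross-dataset link(s)")
```

Only the `owl:sameAs` triple counts as a link here, because it is the only one whose object belongs to the other source.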
European Union projects
Several European Union projects involve linked data. These include the Linked Open Data Around the Clock (LATC) project,[15] the PlanetData project,[16] the DaPaaS (Data-and-Platform-as-a-Service) project,[17] and the Linked Open Data 2 (LOD2) project.[18][19][20] Data linking is one of the main goals of the EU Open Data Portal, which makes thousands of datasets available for anyone to reuse and link.
Ontologies
Ontologies are formal descriptions of data structures. Some of the better-known ontologies are:
- FOAF – an ontology describing persons, their properties and relationships
- UMBEL – a lightweight reference structure of 20,000 subject concept classes and their relationships derived from OpenCyc, which can act as binding classes to external data; also has links to 1.5 million named entities from DBpedia and YAGO
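One way to see what an ontology adds to plain data is to follow its class hierarchy. The Python sketch below walks `rdfs:subClassOf` statements to find every superclass of a class. The FOAF statement that `Person` is a subclass of `Agent` comes from the FOAF vocabulary; the `Researcher` class and the `example.org` namespace are hypothetical additions for illustration.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/vocab#"  # hypothetical namespace

ONTOLOGY = [
    (EX + "Researcher", RDFS_SUBCLASS_OF, FOAF + "Person"),  # invented class
    (FOAF + "Person", RDFS_SUBCLASS_OF, FOAF + "Agent"),     # defined by FOAF
]


def superclasses(cls: str, ontology) -> set:
    """Return all direct and indirect superclasses of cls."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for subject, predicate, obj in ontology:
            is_parent = subject == current and predicate == RDFS_SUBCLASS_OF
            if is_parent and obj not in found:
                found.add(obj)       # remember it so cycles cannot loop forever
                pending.append(obj)  # then look for its own superclasses
    return found


print(sorted(superclasses(EX + "Researcher", ONTOLOGY)))
```

Data typed with the narrower class can then be found by a query for the broader one, which is roughly the role that binding classes play when they connect external datasets.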
Datasets
- DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts described by 1 billion triples, including abstracts in 11 different languages
- GeoNames – provides RDF descriptions of more than 7,500,000 geographical features worldwide.
- Wikidata – a collaboratively-created linked dataset that acts as central storage for the structured data of its Wikimedia Foundation sister projects
- Global Research Identifier Database (GRID) – an international database of 89,506 institutions engaged in academic research, with 14,401 relationships; it models two types of relationships: a parent-child relationship that defines a subordinate association, and a related relationship that describes other associations[21][22]
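Datasets like these are typically explored with queries over their triples. The Python sketch below shows the basic idea with a triple-pattern matcher over a few in-memory triples modelled on GRID's two relationship types. The institution identifiers and the predicate names `hasParent` and `hasRelated` are assumptions for illustration and are not GRID's actual vocabulary or data.

```python
EX = "http://example.org/grid/"       # hypothetical namespace
PARENT = EX + "vocab#hasParent"       # subordinate (parent-child) association
RELATED = EX + "vocab#hasRelated"     # any other association

TRIPLES = [
    (EX + "lab-1", PARENT, EX + "university-1"),
    (EX + "institute-2", PARENT, EX + "university-1"),
    (EX + "university-1", RELATED, EX + "hospital-3"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Which institutions have university-1 as their parent?
children = [s for s, _, _ in match(TRIPLES, predicate=PARENT, obj=EX + "university-1")]
print(children)
```

A query language such as SPARQL generalizes this pattern matching by combining several patterns and binding variables across them.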
Dataset instance and class relationships
Clickable diagrams that show the individual datasets and their relationships within the DBpedia-spawned LOD cloud are available.[23][24]
See also
- American Art Collaborative – consortium of US art museums committed to establishing a critical mass of linked open data on American art
- Authority control – about controlled headings in library catalogs
- Citation analysis – for citations between scholarly articles
- Hyperdata
- Network model – an older type of database management system
- Schema.org
- VoID – Vocabulary of Interlinked Datasets
- Web Ontology Language
References
- "public-lod@w3.org Mail Archives".
- Tim Berners-Lee (2006-07-27). "Linked Data". Design Issues. W3C. Retrieved 2010-12-18.
- "What are Linked Data and Linked Open Data?". Ontotext. Retrieved 2019-05-08.
- "Tim Berners-Lee on the next Web".
- "Frequently Asked Questions (FAQs) - Linked Data - Connect Distributed Data across the Web".
- "COAR » 7 things you should know about…Linked Data". Archived from the original on 2015-11-18. Retrieved 2015-12-29.
- "Linked Data Basics for Techies".
- "5 Star Open Data".
- "SweoIG/TaskForces/CommunityProjects/LinkingOpenData/NewsArchive".
- "SIMILE Project - Mailing Lists".
- Linking open data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
- "SweoIG/TaskForces/CommunityProjects/LinkingOpenData - W3C Wiki". esw.w3.org. Retrieved 22 March 2018.
- Fensel, Dieter; Facca, Federico Michele; Simperl, Elena; Ioan, Toma (2011). Semantic Web Services. Springer. p. 99. ISBN 978-3642191923.
- Max. "State of the LOD Cloud". linkeddatacatalog.dws.informatik.uni-mannheim.de. Retrieved 22 March 2018.
- "Linked open data around the clock (LATC)". latc-project.eu. Archived from the original on 19 September 2018. Retrieved 22 March 2018.
- "Welcome to PlanetData! - PlanetData". planet-data.eu. Retrieved 22 March 2018.
- "DaPaaS". project.dapaas.eu. Retrieved 22 March 2018.
- Linking Open Data 2 (LOD2)
- "CORDIS FP7 ICT Projects – LOD2". European Commission. 2010-04-20.
- "LOD2 Project Fact Sheet – Project Summary" (PDF). 2010-09-01. Archived from the original (PDF) on 2011-07-20. Retrieved 2010-12-18.
- "GRID Statistics". grid.ac/stats. Retrieved 2018-10-26.
- "GRID Policies". grid.ac. Retrieved 2018-10-26.
- "Instance relationships amongst datasets". fu-berlin.de. Retrieved 22 March 2018.
- "Class relationships amongst datasets". Archived from the original on 28 August 2011. Retrieved 22 March 2018.
Further reading
- Ahmet Soylu, Felix Mödritscher, and Patrick De Causmaecker. 2012. “Ubiquitous Web Navigation through Harvesting Embedded Semantic Data: A Mobile Scenario.” Integrated Computer-Aided Engineering 19 (1): 93–109.
- Linked Data: Evolving the Web into a Global Data Space (2011) by Tom Heath and Christian Bizer, Synthesis Lectures on the Semantic Web: Theory and Technology, Morgan & Claypool
- How to Publish Linked Data on the Web, by Chris Bizer, Richard Cyganiak and Tom Heath, Linked Data Tutorial at Freie Universität Berlin, Germany, 27 July 2007.
- The Web Turns 20: Linked Data Gives People Power, part 1 of 4, by Mark Fischetti, Scientific American 2010 October 23
- Linked Data Is Merely More Data – Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, and Amit P. Sheth. In: Dan Brickley, Vinay K. Chaudhri, Harry Halpin, and Deborah McGuinness: Linked Data Meets Artificial Intelligence. Technical Report SS-10-07, AAAI Press, Menlo Park, California, 2010, pp. 82–86.
- Moving beyond sameAs with PLATO: Partonomy detection for Linked Data – Prateek Jain, Pascal Hitzler, Kunal Verma, Peter Z. Yeh, Amit Sheth. In: Proceedings of the 23rd ACM Hypertext and Social Media conference (HT 2012), Milwaukee, WI, USA, June 25–28, 2012.
- Freitas, André, Edward Curry, João Gabriel Oliveira, and Sean O’Riain. 2012. “Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends.” IEEE Internet Computing 16 (1): 24–33.
- Interlinking Open Data on the Web – Chris Bizer, Tom Heath, Danny Ayers, Yves Raimond. In Proceedings Poster Track, ESWC2007, Innsbruck, Austria
- Ontology Alignment for Linked Open Data – Prateek Jain, Pascal Hitzler, Amit Sheth, Kunal Verma, Peter Z. Yeh. In proceedings of the 9th International Semantic Web Conference, ISWC 2010, Shanghai, China
- Linked open drug data for pharmaceutical research and development – Samwald, Jentzsch, Bouton, Kallesøe, Willighagen, Hajagos, Marshall, Prud'hommeaux, Hassenzadeh, Pichler, and Stephens. J Cheminform. 2011; 3: 19 (May 2011)
- Interview with Sören Auer, head of the LOD2 project about the continuation of LOD2 in 2011, June 2011
- Linked Open Data: The Essentials – Florian Bauer and Martin Kaltenböck (January 2012)
- The Flap of a Butterfly Wing – semanticweb.com, Richard Wallis (February 2012)