Linked Data Pattern
Also known as: Linked Data
The Linked Data pattern provides a set of best practices for publishing and connecting structured data on the Web, enabling the creation of a global data graph.
1. Overview
The Linked Data pattern is a set of design principles for publishing and interlinking structured data on the Web. It was first described by Tim Berners-Lee in a 2006 design note [2]. The primary goal of Linked Data is to enable the creation of a web of data, where data from different sources can be connected and queried as a single information space. This is in contrast to the traditional web of documents, where information is primarily presented for human consumption.
2. Core Principles
The Linked Data pattern is based on four core principles [2]:
- Use URIs as names for things: Every entity or concept should be identified by a Uniform Resource Identifier (URI).
- Use HTTP URIs so that people can look up those names: These URIs should be accessible over the web using the HTTP protocol.
- When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL): When a URI is dereferenced, it should return data in a standard format, such as the Resource Description Framework (RDF).
- Include links to other URIs, so that they can discover more things: The data should contain links to other related URIs, allowing applications to navigate the web of data.
3. Key Practices
The web is a vast repository of information, but most of it is locked away in unstructured documents, making it difficult for machines to process and understand. Data is often published in isolated silos, with no easy way to connect and integrate information from different sources. This limits the potential for data reuse and the creation of new and innovative applications.
4. Implementation
The Linked Data pattern provides a solution to this problem by establishing a common set of rules for publishing and interlinking data on the web. By using standard web technologies such as URIs and HTTP, and data models like RDF, Linked Data allows data from different sources to be connected and queried in a uniform way. This creates a global data graph that can be navigated and explored by both humans and machines.
5. 7 Pillars Assessment
| Pillar | Score (1-5) | Rationale |
|---|---|---|
| Purpose | 3 | Serves a clear technical purpose in system design |
| Governance | 3 | Can be governed through standard engineering practices |
| Culture | 3 | Supports engineering culture of reliability and quality |
| Incentives | 3 | Aligns incentives toward system stability |
| Knowledge | 4 | Well-documented pattern with extensive community knowledge |
| Technology | 4 | Directly applicable to modern technology stacks |
| Resilience | 4 | Contributes to overall system resilience |
| Overall | 3.4 | A valuable technical pattern that supports commons infrastructure |
| Pros | Cons |
|---|---|
| Increased data interoperability: Linked Data makes it easier to integrate data from different sources. | Complexity: Implementing Linked Data can be complex and requires specialized knowledge. |
| Improved data discoverability: Linked Data makes it easier to find and reuse data. | Data quality: The quality of Linked Data can be variable, and there is no central authority to ensure data accuracy. |
| Enhanced data integration: Linked Data enables the creation of new applications that combine data from multiple sources. | Scalability: Querying large amounts of Linked Data can be challenging and may require specialized infrastructure. |
6. When to Use
- DBpedia: A community effort to extract structured information from Wikipedia and make it available as Linked Data.
- GeoNames: A geographical database that provides information about millions of places around the world as Linked Data.
- Bio2RDF: A project that converts and integrates biological data from multiple sources into Linked Data.
7. Anti-Patterns & Gotchas
In the cognitive era, Linked Data can play a crucial role in providing the structured data needed to train and power AI and machine learning models. By creating a web of interconnected data, Linked Data can help to break down data silos and provide a richer source of information for cognitive applications. For example, a chatbot could use Linked Data to answer complex questions that require integrating information from multiple sources.
8. References
| Commons Principle | Assessment |
|---|---|
| Shared Resource | The Linked Data pattern promotes the creation of a shared resource of interconnected data that can be used by anyone. |
| Democratic Governance | The governance of Linked Data is decentralized, with no single entity controlling the entire data graph. |
| Equitable Access | Linked Data is based on open standards and can be accessed by anyone with an internet connection. |
| Sustainability | The sustainability of Linked Data depends on the willingness of data publishers to maintain their data and keep it up-to-date. |
| Community Benefit | The Linked Data pattern has the potential to create significant community benefits by enabling the creation of new and innovative applications that can solve real-world problems. |
8. References
[1] Dodds, L., & Davis, I. (2022). Linked Data Patterns. https://patterns.dataincubator.org/book/linked-data-patterns.pdf
[2] Berners-Lee, T. (2006). Linked Data - Design Issues. W3C. https://www.w3.org/DesignIssues/LinkedData.html
[3] Wikipedia. (2023). Linked data. https://en.wikipedia.org/wiki/Linked_data