Validating RDF Data [Paperback]

RDF and Linked Data have broad applicability across many fields, from aircraft manufacturing to zoology. Requirements for detecting bad data differ across communities, fields, and tasks, but nearly all involve some form of data validation. This book introduces data validation and describes its practical use in day-to-day data exchange.

The Semantic Web offers a bold, new take on how to organize, distribute, index, and share data. Using Web addresses (URIs) as identifiers for data elements enables the construction of distributed databases on a global scale. Like the Web, the Semantic Web is heralded as an information revolution, and also like the Web, it is encumbered by data quality issues. The quality of Semantic Web data is compromised by the lack of resources for data curation, for maintenance, and for developing globally applicable data models.

At the enterprise scale, these problems have conventional solutions. Master data management provides an enterprise-wide vocabulary, while constraint languages capture and enforce data structures. Filling a need long recognized by Semantic Web users, shapes languages provide models and vocabularies for expressing such structural constraints.

This book describes two technologies for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL), the rationales for their designs, a comparison of the two, and some example applications.
Preface xv
Foreword xvii
Phil Archer
Foreword xix
Tom Baker
Foreword xxi
Dan Brickley
Libby Miller
Acknowledgments xxiii
1 Introduction 1(8)
1.1 RDF and the Web of Data 1(1)
1.2 RDF: The Good Parts 1(2)
1.3 Challenges for RDF Adoption 3(2)
1.4 Structure of the Book 5(1)
1.5 Conventions and Notation 6(3)
2 The RDF Ecosystem 9(18)
2.1 RDF History 9(1)
2.2 RDF Data Model 10(7)
2.3 Shared Entities and Vocabularies 17(1)
2.4 Technologies Related with RDF 18(7)
2.4.1 SPARQL 18(2)
2.4.2 Inference Systems: RDF Schema and OWL 20(3)
2.4.3 Linked Data, JSON-LD, Microdata, and RDFa 23(2)
2.5 Summary 25(1)
2.6 Suggested Reading 25(2)
3 Data Quality 27(28)
3.1 Non-RDF Schema Languages 28(12)
3.1.1 UML 28(1)
3.1.2 SQL and Relational Databases 29(2)
3.1.3 XML 31(6)
3.1.4 JSON 37(2)
3.1.5 CSV 39(1)
3.2 Understanding the RDF Validation Problem 40(5)
3.3 Previous RDF Validation Approaches 45(3)
3.3.1 Query-based Validation 45(2)
3.3.2 Inference-based Approaches 47(1)
3.3.3 Structural Languages 48(1)
3.4 Validation Requirements 48(4)
3.4.1 General Requirements 49(1)
3.4.2 Graph-based Requirements 49(1)
3.4.3 RDF Data Model Requirements 50(1)
3.4.4 Data-modeling-based Requirements 50(1)
3.4.5 Expressiveness of Schema Language 51(1)
3.4.6 Validation Invocation Requirements 52(1)
3.4.7 Usability Requirements 52(1)
3.5 Summary 52(1)
3.6 Suggested Reading 53(2)
4 Shape Expressions 55(64)
4.1 Use of ShEx 55(1)
4.2 First Example 56(2)
4.3 ShEx implementations 58(1)
4.4 The Shape Expressions Language 59(6)
4.4.1 Shape Expressions Compact Syntax 59(1)
4.4.2 Invoking Validation 60(3)
4.4.3 Structure of Shape Expressions 63(1)
4.4.4 Start Shape Expression 64(1)
4.5 Node Constraints 65(13)
4.5.1 Node kinds 67(1)
4.5.2 Datatypes 68(2)
4.5.3 Facets on Literals 70(3)
4.5.4 Value Sets 73(5)
4.6 Shapes 78(12)
4.6.1 Triple Constraints 79(1)
4.6.2 Groupings 80(1)
4.6.3 Cardinalities 80(2)
4.6.4 Choices 82(2)
4.6.5 Nested Shapes 84(1)
4.6.6 Inverse Triple Constraints 85(1)
4.6.7 Repeated Properties 86(1)
4.6.8 Permitting other Triples 87(3)
4.7 References 90(5)
4.7.1 Shape References 90(1)
4.7.2 Recursion and Cyclic References 91(1)
4.7.3 External Shapes 92(1)
4.7.4 Labeled Triple Expression 93(1)
4.7.5 Annotations 94(1)
4.8 Logical Operators 95(10)
4.8.1 Conjunction 95(3)
4.8.2 Disjunction 98(3)
4.8.3 Negation 101(4)
4.9 Shape Maps 105(14)
4.9.1 Fixed Shape Maps 105(1)
4.9.2 Query Shape Maps 106(2)
4.9.3 Result Shape Maps 108(1)
4.9.4 JSON Representation 109(1)
4.9.5 Chaining Validation Workflows 110(1)
4.10 Semantic Actions 110(1)
4.11 ShEx and Inference 111(2)
4.12 Importing schemas 113(1)
4.13 RDF and JSON-LD Syntax 114(2)
4.14 Summary 116(1)
4.15 Suggested Reading 116(3)
5 SHACL 119(76)
5.1 Simple Example 119(3)
5.2 SHACL Implementations 122(2)
5.3 Basic Definitions: Shapes Graphs, Node, and Property Shapes 124(1)
5.4 Importing other Shapes Graphs 125(1)
5.5 Validation Report 126(3)
5.6 Shapes 129(8)
5.6.1 Node shapes 129(1)
5.6.2 Property Shapes 130(1)
5.6.3 Constraint Components 131(2)
5.6.4 Human Friendly Messages 133(1)
5.6.5 Declaring Shape Severities 134(1)
5.6.6 Deactivating Shapes 135(2)
5.7 Target Declarations 137(4)
5.7.1 Target Node 137(1)
5.7.2 Target Class 138(1)
5.7.3 Implicit Class Target 139(1)
5.7.4 Target Subjects Of 140(1)
5.7.5 Target Objects Of 141(1)
5.8 Cardinality 141(1)
5.9 Constraints on Values 142(6)
5.9.1 Datatypes 142(3)
5.9.2 Class of Values 145(1)
5.9.3 Node Kinds 146(1)
5.9.4 Sets of Values 147(1)
5.9.5 Specific Value 148(1)
5.10 Datatype Facets 148(6)
5.10.1 Value Ranges 149(1)
5.10.2 String-based Constraints 149(2)
5.10.3 Language-based Constraints 151(3)
5.11 Logical Constraints: and, or, not, xone 154(10)
5.11.1 AND 154(3)
5.11.2 OR 157(2)
5.11.3 Exactly One 159(3)
5.11.4 Not 162(1)
5.11.5 Combining Logical Operators 162(2)
5.12 Shape-based Constraints 164(13)
5.12.1 Shape References and Recursion 166(8)
5.12.2 Qualified Value Shapes 174(3)
5.13 Closed Shapes 177(3)
5.14 Property Pair Constraints 180(2)
5.15 Non-validating SHACL Properties 182(2)
5.16 SHACL-SPARQL 184(4)
5.16.1 SPARQL Constraints 184(1)
5.16.2 SPARQL-based Constraint Components 185(3)
5.17 SHACL and Inference Systems 188(2)
5.18 SHACL Compact Syntax 190(1)
5.19 SHACL Rules and Advanced Features 190(3)
5.20 SHACL Javascript 193(1)
5.21 Summary 194(1)
5.22 Suggested Reading 194(1)
6 Applications 195(38)
6.1 Describing a Linked Data Portal 195(9)
6.1.1 WebIndex in ShEx 197(3)
6.1.2 WebIndex in SHACL 200(4)
6.2 Describing Clinical Records-FHIR 204(8)
6.2.1 FHIR as Linked Data 206(1)
6.2.2 Consistency constraints 206(3)
6.2.3 FHIR/RDF Development 209(1)
6.2.4 Generic Properties 210(2)
6.3 Springer Nature SciGraph 212(1)
6.4 DBpedia Validation Use Cases 213(6)
6.4.1 Ontology-based Validation 213(1)
6.4.2 RDF Mappings Validation 214(1)
6.4.3 Validating Link Contributions with SHACL 215(1)
6.4.4 Ontology Validation with SHACL 216(3)
6.5 ShEx for ShEx 219(6)
6.6 SHACL in SHACL 225(5)
6.7 Summary 230(1)
6.8 Suggested Reading 231(2)
7 Comparing ShEx and SHACL 233(34)
7.1 Common Features 233(4)
7.2 Syntactic Differences 237(2)
7.3 Foundation: Schema vs. Constraints 239(1)
7.4 Invoking Validation 240(2)
7.5 Modularization and Reusability 242(2)
7.6 Shapes, Classes, and Inference 244(2)
7.7 Violation Reporting and Severities 246(1)
7.8 Default Cardinalities 246(1)
7.9 Property Paths 247(1)
7.10 Recursion 248(2)
7.11 Property Pair Constraints and Uniqueness 250(1)
7.12 Repeated Properties 251(3)
7.13 Exactly One and Alternatives 254(3)
7.14 Treatment of Closed Shapes 257(2)
7.15 Stems and Stem Ranges 259(1)
7.16 Annotations 260(1)
7.17 Semantics and Complexity 261(1)
7.18 Extension Mechanisms 262(1)
7.19 Conclusions and Outlook 263(3)
7.20 Summary 266(1)
7.21 Suggested Reading 266(1)
A WebIndex in ShEx 267(2)
B WebIndex in SHACL 269(6)
C ShEx in ShEx 275(4)
D SHACL in SHACL 279(6)
Bibliography 285(10)
Authors' Biographies 295