Introduction to SHACL

Shapes Constraint Language

Kathrin Dentler (kathrin.dentler@triply.cc),
Thomas de Groot (thomas.de.groot@triply.cc),
Wouter Beek (wouter@triply.cc)

Introduction to SHACL: Basics

Kathrin Dentler (kathrin.dentler@triply.cc)

Overview

  • SHACL Core and SHACL-SPARQL
  • SHACL Advanced Features
  • ShEx and DASH
  • Knowledge Model versus Information Model

SHACL

  • W3C Recommendation 20 July 2017
  • Validates data against a set of conditions: the shapes
  • A data graph is validated against a shapes graph
  • Shapes can also be read as a description of the data
  • SHACL Core, optionally extended with SHACL-SPARQL
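
As a minimal sketch of the two graphs, the Turtle below pairs a small shapes graph with a small data graph. The `ex:` namespace and every name in it are made up for illustration; only the `sh:` and `xsd:` terms come from the standards.

```turtle
@prefix ex: <https://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Shapes graph (hypothetical): every ex:Person has exactly one ex:name.
ex:PersonShape
  a sh:NodeShape;
  sh:targetClass ex:Person;
  sh:property
    [ sh:datatype xsd:string;
      sh:maxCount 1;
      sh:minCount 1;
      sh:path ex:name ].

# Data graph (hypothetical).
ex:alice a ex:Person; ex:name "Alice".   # conforms
ex:bob a ex:Person.                      # violates sh:minCount 1
```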

Use Cases

  • Validation
  • User Interface Building: display, edit and validate
  • Data Generation
  • Code Generation
  • Data Integration

Shapes and Constraints

Validation and Graphs

  • A processor validates the data and produces a validation report
  • SHACL defines an RDF Validation Report Vocabulary
  • Try it out in the SHACL Playground: https://shacl.org/playground/
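
To give an idea of what such a report looks like, here is a sketch that uses the Validation Report Vocabulary. It assumes a hypothetical node `ex:bob` that lacks a required `ex:name` value; the `ex:` names are illustrative only, and a real processor may add further properties such as `sh:sourceShape` or `sh:resultMessage`.

```turtle
@prefix ex: <https://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

[] a sh:ValidationReport;
   sh:conforms false;                      # at least one violation was found
   sh:result
     [ a sh:ValidationResult;
       sh:focusNode ex:bob;                # the node that was validated
       sh:resultPath ex:name;              # the property that caused the result
       sh:resultSeverity sh:Violation;
       sh:sourceConstraintComponent sh:MinCountConstraintComponent ].
```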

SHACL-SPARQL
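
SHACL-SPARQL lets a shape express a constraint as a SPARQL SELECT query: every solution the query returns is reported as a violation for the focus node bound to `$this`. The following is a minimal sketch with hypothetical `ex:` names. Full IRIs are written out inside the query so that no `sh:prefixes` declarations are needed.

```turtle
@prefix ex: <https://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

ex:PeriodShape
  a sh:NodeShape;
  sh:targetClass ex:Period;
  sh:sparql
    [ a sh:SPARQLConstraint;
      sh:message "The end date must not precede the start date.";
      # Each row returned by this query is one violation.
      sh:select """
        SELECT $this ?value
        WHERE {
          $this <https://example.org/start> ?start ;
                <https://example.org/end> ?value .
          FILTER (?value < ?start)
        }
      """ ].
```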

Introduction to SHACL: Advanced

Kathrin Dentler (kathrin.dentler@triply.cc)

SHACL Advanced Features

ShEx

DASH

  • http://datashapes.org/forms.html
  • Unofficial Draft 15 September 2020
  • Holger Knublauch (TopQuadrant, Inc.)
  • Shapes to drive user interfaces: display and edit
  • Layouts mirror the definitions in shapes
  • Extensions: Editors, e.g. dash:DatePickerEditor and
    Viewers, e.g. dash:ImageViewer
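
A sketch of how a shape might carry such user interface hints, assuming the `dash:editor` and `dash:viewer` properties described in the DASH forms draft. The `ex:` names are hypothetical.

```turtle
@prefix dash: <http://datashapes.org/dash#> .
@prefix ex: <https://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:PersonShape
  a sh:NodeShape;
  sh:targetClass ex:Person;
  sh:property
    # Edit the birth date with a date picker widget.
    [ dash:editor dash:DatePickerEditor;
      sh:datatype xsd:date;
      sh:path ex:birthDate ],
    # Display the photo IRI as an image.
    [ dash:viewer dash:ImageViewer;
      sh:nodeKind sh:IRI;
      sh:path ex:photo ].
```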

Information model & knowledge model

  • Information model: specific (SHACL)
  • Knowledge model: generic (RDFS + OWL)

Knowledge model


                def:surface
                  a owl:DatatypeProperty;
                  dct:source law:someArticle;
                  rdfs:domain def:PlaceOfResidence;
                  rdfs:range xsd:positiveInteger;
                  rdfs:seeAlso "https://link.to.online.definition"^^xsd:anyURI;
                  skos:definition "The surface area of a place of residence that qualifies as living space. Recorded in square meters (m²)."@en;
                  skos:prefLabel "has surface"@en.
              
  • What is a surface, as defined in law?
  • Must be fully understandable by data users.

Information model


                shape:oppervlakte
                  sh:datatype xsd:positiveInteger;
                  sh:description "A natural number between 1 and 999,999."@en;
                  sh:maxCount 1;
                  sh:minCount 1;
                  sh:path bag:oppervlakte;
                  sh:pattern "^[0-9]{1,6}$". # anchored, because sh:pattern matches substrings
              
  • How is a surface represented in this internal system?
  • Must be understandable by users of the internal system.

How Can I Use SHACL Validation?

Thomas de Groot (thomas.de.groot@triply.cc)

SHACL Validation is a feature in RATT!

Setup


                git clone git@github.com:RWS-NL/dis-ld.git
                cd dis-ld
                cp .ratt .ratt-private
                yarn
                yarn build
            
Go to me/tokens to create a write token.
Edit your .ratt-private file and paste in your token.

SHACL Validation is a feature in RATT!

Adding SHACL validation is as simple as adding a `Middleware`.

                app.use(
                  mw.validateShacl("shapes.trig", {
                    reportLocation: `SHACL-report-kernGIS.ttl.gz`
                  })
                );
            
And executing the ETL now applies SHACL validation.

              yarn exec ratt ./lib/meridian/etl.js
            

SHACL Best Practices

Wouter Beek (wouter.beek@triply.cc)

Streaming ETL + validation (1/2)


                shape:status
                  sh:class def:Status;
                  sh:path def:status.

                id:someRoad def:status status:used.
              
We transform one record, for example one road, at a time. But this does not validate 😿: the record itself does not state that status:used is an instance of def:Status.
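
A processor that only sees this one record would report the failure roughly as follows. This is a sketch: the prefix IRIs are placeholders because the slides do not declare them, and it assumes that shape:status is applied to id:someRoad.

```turtle
@prefix def: <https://example.org/def/> .         # placeholder IRI
@prefix id: <https://example.org/id/> .           # placeholder IRI
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix shape: <https://example.org/shape/> .     # placeholder IRI
@prefix status: <https://example.org/status/> .   # placeholder IRI

[] a sh:ValidationReport;
   sh:conforms false;
   sh:result
     [ a sh:ValidationResult;
       sh:focusNode id:someRoad;
       sh:resultPath def:status;
       sh:resultSeverity sh:Violation;
       sh:sourceConstraintComponent sh:ClassConstraintComponent;
       sh:sourceShape shape:status;
       sh:value status:used ].   # no rdf:type def:Status found in the record
```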

Streaming ETL + validation (2/2)


                shape:status
                  sh:class def:Status;
                  sh:path def:status.

                id:someRoad def:status status:used.
                status:used a def:Status. # ← needed to pass streaming validation
              
We can still stream & validate, but this requires an extra triple. Now we pass validation for self-contained records 😺.

Regex (1/3)


                shape:City
                  sh:property
                    [ sh:datatype xsd:string;
                      sh:path shape:name ];
                  sh:targetClass sdo:City.
              
Too generic 😿: do we really want to allow the empty string?

Regex (2/3)


                shape:City
                  sh:property
                    [ sh:datatype xsd:string;
                      sh:path shape:name;
                      sh:pattern "^[A-Za-z]{1,80}$" ]; # anchored to cover the whole name
                  sh:targetClass sdo:City.
              
Too specific 😿: what about “The Hague” and “Köln”?

Regex (3/3)


                shape:City
                  sh:property
                    [ sh:datatype xsd:string;
                      sh:path shape:name;
                      sh:pattern "^[\\p{L} '-]{1,80}$" ];
                  sh:targetClass sdo:City.
              

Just right 😺! Between 1 and 80 letters (in any script), spaces, apostrophes and hyphens.

See XML Schema Datatypes for (much) more information.
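
One detail worth checking when writing such patterns: `sh:pattern` follows the semantics of the SPARQL REGEX function, which looks for a match anywhere in the string. A pattern therefore only restricts the whole value when it is anchored with `^` and `$`. The sketch below uses hypothetical `ex:` names to show the effect.

```turtle
@prefix ex: <https://example.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:CityShape
  a sh:NodeShape;
  sh:targetClass ex:City;
  sh:property
    [ sh:datatype xsd:string;
      sh:path ex:name;
      sh:pattern "^[A-Za-z]{1,80}$" ].

ex:city1 a ex:City; ex:name "Amsterdam".   # conforms
ex:city2 a ex:City; ex:name "The Hague".   # violation: the space is not in [A-Za-z]
ex:city3 a ex:City; ex:name "Köln".        # violation: "ö" is not in [A-Za-z]
ex:city4 a ex:City; ex:name "".            # violation: at least one character is required

# Without "^" and "$", "The Hague" and "Köln" would conform,
# because the substrings "The" and "K" already match.
```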

Human-readable labels (1/2)


                shape:label
                  sh:languageIn ( "en" "nl" );
                  sh:maxCount 2;
                  sh:minCount 2;
                  sh:path skos:prefLabel.

                city:theHague
                  skos:prefLabel
                    "Den Haag"@nl,
                    "'s-Gravenhage"@nl.
              
Still validates 😿: there are exactly two labels and both use an allowed language. We wanted one English and one Dutch label.

Human-readable labels (2/2)


                shape:label
                  sh:languageIn ( "en" "nl" );
                  sh:maxCount 2;
                  sh:minCount 2;
                  sh:path skos:prefLabel;
                  sh:uniqueLang true. # ← this expresses what we want

                city:theHague
                  skos:altLabel "'s-Gravenhage"@nl;
                  skos:prefLabel
                    "The Hague"@en,
                    "Den Haag"@nl.
              
😺

Value lists (1/2)


                def:Status
                  owl:oneOf
                    ( status:used
                      …
                      status:unused ).

                shape:status
                  sh:in
                    ( status:used
                      …
                      status:unused );
                  sh:path def:status.
              
We need to maintain the same list twice 😿.

Value lists (2/2)


                def:Status
                  owl:oneOf
                    ( status:used
                      …
                      status:unused ).

                shape:status
                  sh:class def:Status;
                  sh:path def:status.
              
😺

                construct {
                  ?nodeShape sh:in ?valueList.
                } where {
                  ?nodeShape sh:class ?class.
                  ?class owl:oneOf ?valueList.
                }
              
Enrichment step (part of the ETL).
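
To make the effect of the enrichment concrete, here is a sketch with a hypothetical two-value list standing in for the elided one (all `ex:` names are made up). It assumes the generated triple is added to the graph that already holds the `owl:oneOf` list, so that both properties point at the same list node.

```turtle
@prefix ex: <https://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Maintained by hand: the value list lives in one place only.
ex:Status owl:oneOf _:values.
_:values
  rdf:first ex:used;
  rdf:rest ( ex:unused ).

ex:statusShape
  sh:class ex:Status;
  sh:in _:values;   # ← added by the enrichment step
  sh:path ex:status.
```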

Hierarchic reuse (1/2)


                def:Building rdfs:subClassOf geo:Feature.
                def:Road rdfs:subClassOf geo:Feature.

                shape:Building
                  sh:property
                    shape:geometry, # reuse potential
                    shape:address;
                  sh:targetClass def:Building.

                shape:Road
                  sh:property
                    shape:geometry, # reuse potential
                    shape:surfaceType;
                  sh:targetClass def:Road.
              
Declaring $M$ shared properties separately for each of $N$ subclasses results in $M \times (N-1)$ duplicate declarations 😿.

Hierarchic reuse (2/2)


                def:Building rdfs:subClassOf geo:Feature.
                def:Road rdfs:subClassOf geo:Feature.

                shape:Feature
                  sh:property shape:geometry; # reuse
                  sh:targetClass geo:Feature.

                shape:Building
                  sh:property shape:address;
                  sh:targetClass def:Building.

                shape:Road
                  sh:property shape:surfaceType;
                  sh:targetClass def:Road.
              
😺
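
This kind of reuse works because SHACL class targets follow `rdfs:subClassOf` triples in the data graph: a node typed with a subclass is also a focus node for shapes that target the superclass. A sketch, with placeholder IRIs for the `def:` and `id:` prefixes and assuming that `geo:` is the GeoSPARQL namespace:

```turtle
@prefix def: <https://example.org/def/> .                # placeholder IRI
@prefix geo: <http://www.opengis.net/ont/geosparql#> .   # assumed: GeoSPARQL
@prefix id: <https://example.org/id/> .                  # placeholder IRI
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# The subclass link must be present in the data graph.
def:Building rdfs:subClassOf geo:Feature.

# Focus node for shapes that target def:Building
# and for shapes that target geo:Feature.
id:someBuilding a def:Building.
```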

Triply B.V.