Querying Knowledge Graphs

Wouter Beek (wouter@triply.cc)

Thomas de Groot (thomas.de.groot@triply.cc)


4 SPARQL forms

Graph → Y/N
Graph → Graph
IRI → Graph
Graph → Table


Graph → Y/N

ask query
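
A minimal sketch of an ask query, assuming the foaf:depiction property used elsewhere in these slides. It returns true when at least one triple matches the pattern, and false otherwise.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

# Is there at least one resource that has a depiction?
ask {
  ?item foaf:depiction ?image
}
```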


Graph → Table

select query

  • RDF data is stored in a graph.
  • A select query creates a tabular view by matching a pattern against the graph data.

Components of a select query

Projection
Specifies the columns of the table.
Pattern
Specifies how the cells of the table are filled.
Modifiers
Specifies additional operations over the table.


Triple Patterns

1-1: Our first select query

Projection (columns)
select ?s ?p ?o
Pattern (graph match)
{ ?s ?p ?o }
limit 25

1-2: Table of creative Works + image links

select ?item ?image {
  ?item <http://xmlns.com/foaf/0.1/depiction> ?image
}
limit 25
Match specific arcs in the graph.
?item and ?image
Use descriptive names for variables.

1-3: Abbreviated IRI notation

Abbreviated query notation
Allows foaf:depiction to be written instead of <http://xmlns.com/foaf/0.1/depiction>.
Abbreviated result notation
Display id:568328 instead of <https://iisg.amsterdam/id/item/568328>.
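
As a sketch, the abbreviated query notation could look as follows. The prefix declaration maps the alias foaf: to the full namespace IRI, after which foaf:depiction can be used in the pattern.

```sparql
# Declare the alias once, at the top of the query.
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?item ?image {
  # Equivalent to <http://xmlns.com/foaf/0.1/depiction>
  ?item foaf:depiction ?image
}
limit 25
```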

1-4: Invert the projection

The first column contains the image links.
The second column contains the CreativeWork.

1-5: Change the projection

Only return the column for image links.
A hidden variable: one whose bindings are not returned.
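
A possible form of such a query is sketched here. The variable ?item is still matched in the pattern, but because it is left out of the projection it is hidden from the result table.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

# Only ?image is projected; ?item is a hidden variable.
select ?image {
  ?item foaf:depiction ?image
}
limit 25
```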

1-6: The generic projection

select *
Return columns for all variables. Columns appear in unspecified order.

1-7: Introduce a variable

Add a column with values that are not matched in the graph.
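
A minimal sketch of introducing a variable with bind. The constant 'Hi!' is not present in the graph; it is simply repeated in every row of the new ?widget column.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?item ?image ?widget {
  ?item foaf:depiction ?image .
  # The value of ?widget is computed, not matched in the graph.
  bind('Hi!' as ?widget)
}
limit 25
```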

1-8: HTML template

Use ?VARIABLE in a template string.
Triply-specific feature
Still a standards-compliant SPARQL 1.1 query, but will not perform HTML templating in other SPARQL editors.
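
A sketch of a templated widget, using the {{image}} placeholder form that appears in the overview table of these slides. This relies on the Triply-specific templating; whether a particular editor needs extra markup on the literal is an assumption not covered here.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?item ?widget {
  ?item foaf:depiction ?image .
  # The placeholder is meant to be filled with the binding of ?image per row.
  bind('<img src="{{image}}">' as ?widget)
}
limit 25
```

In an editor without templating support, ?widget is just a plain string.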

1-9: Limit the number of rows

limit 250
Return at most 250 rows.

1-10: Skip a number of rows

offset 250
Skip the first 250 rows. Combined with limit 25, this returns the 251st through the 275th row.
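
A minimal paging sketch that combines limit and offset. The order by clause is an addition for illustration: without an explicit ordering, SPARQL does not guarantee that rows come back in the same order between requests, so pages may overlap.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?item ?image {
  ?item foaf:depiction ?image
}
order by ?item   # a stable order makes the pages well-defined
limit 25         # page size
offset 250       # skip the first 10 pages of 25 rows
```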


Construct    Purpose                        Examples
Prefix       Abbreviate syntax              prefix ex: <https://example.com/>
Projection   Select columns                 select ?x ?y
                                            select *
Pattern      Match cell values              { ?s ?p ?o }
                                            { ?s ex:p ?o }
Binding      Introduce new variables        bind('Hi!' as ?widget)
Template     Return HTML widgets            bind('<img src="{{image}}">' as ?widget)
Limit        Set a maximum number of rows   limit 10
Offset       Skip a number of rows          offset 10


Graph Patterns

2-1: Graph Pattern: One Triple Pattern

  • Graph patterns contain zero or more Triple Patterns.
  • We leave out prefix declarations from now on.
  • We leave out widget bindings (bind) from now on.

2-2: Graph Pattern: Two Triple Patterns

A shared variable connects two or more triple patterns.
. (dot)
Marks the end of a Triple Pattern. (The dot after the last Triple Pattern is optional.)
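
A sketch of a graph pattern with two triple patterns that share the variable ?item. The sdo:name property (schema.org) is an assumption chosen for illustration; the dataset used in the slides may use a different property for names.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix sdo: <https://schema.org/>   # assumed vocabulary for names

select ?item ?name ?image {
  ?item sdo:name ?name .         # first triple pattern
  ?item foaf:depiction ?image .  # second triple pattern, joined on ?item
}
limit 25
```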

2-3: Graph Pattern: Four Triple Patterns

2-4: Graph Pattern: Abbr. notation

; (semi-colon)
Repeat the previous subject term.
, (comma)
Repeat the previous subject and predicate terms.
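
The two abbreviations can be sketched in one pattern. The sdo:name property is an assumption for illustration only.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix sdo: <https://schema.org/>   # assumed vocabulary for names

select ?item ?name ?image1 ?image2 {
  ?item
    sdo:name ?name ;                     # ';' repeats the subject ?item
    foaf:depiction ?image1 , ?image2 .   # ',' repeats ?item and foaf:depiction
}
limit 25
```

Written out in full, this pattern consists of three triple patterns that all have ?item as their subject.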

2-5: Multiple values

filter( … )
A non-graph restriction that is added to the pattern.
X != Y, X < Y, …
X and Y must not be the same, X must be smaller than Y, etc.
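
A sketch of a filter over items that have more than one image. Comparing with < instead of != is a design choice made here: it excludes identical pairs and also returns each pair only once. The str() calls are needed because < is not defined for IRIs directly.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?item ?image1 ?image2 {
  ?item foaf:depiction ?image1 , ?image2 .
  # Keep only pairs of distinct images, each pair once.
  filter(str(?image1) < str(?image2))
}
limit 25
```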

2-6: Filter by language

lang( … )
Returns the language tag of a language-tagged string.
filter( A && B )
Apply filter A and filter B.
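
A sketch of a combined filter. The sdo:name property and the length threshold are assumptions for illustration.

```sparql
prefix sdo: <https://schema.org/>   # assumed vocabulary for names

select ?item ?name {
  ?item sdo:name ?name .
  # Both conditions must hold: Dutch language tag and a non-trivial length.
  filter(lang(?name) = 'nl' && strlen(?name) > 3)
}
limit 25
```

Note that lang(?name) = 'nl' does not match regional tags such as nl-nl; langMatches(lang(?name), 'nl') is the standard function for that broader match.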

2-7: Graph pattern: Intermediate node

Some nodes appear both in the subject and in the object position.

2-8: Property Paths

P/Q
Sequence: first follow P, then follow Q.
P|Q
Choice: follow P or follow Q.
P+
Follow P one or more times.
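
The three path operators can be combined, as in this sketch. The use of skos:broader and the two label properties is an assumption for illustration; any hierarchy-forming property would work the same way.

```sparql
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix skos: <http://www.w3.org/2004/02/skos/core#>

select ?concept ?ancestorLabel {
  # One or more skos:broader steps (+), followed by (/) a label
  # that is reached through either property (|).
  ?concept skos:broader+/(rdfs:label|skos:prefLabel) ?ancestorLabel .
}
limit 25
```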

2-9: Making the query more specific

Instantiating a variable makes the query more specific.

2-10: Sort rows

order by ?x
Sorts rows from oldest to newest work.
order by ?x ?y ?z
It is possible to sort by multiple criteria.

2-11: Inversely sort rows

order by desc(?x)
Inversely sort rows (descending).
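
A sketch that sorts on two criteria, the first one descending. The sdo:dateCreated property is an assumption for illustration.

```sparql
prefix sdo: <https://schema.org/>   # assumed vocabulary for dates

select ?item ?created {
  ?item sdo:dateCreated ?created .
}
# Newest first; rows with the same date are sorted by ?item (ascending).
order by desc(?created) ?item
limit 25
```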

construct queries

Graph → Graph
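
A minimal sketch of a construct query. It matches foaf:depiction triples and builds a new graph in which the direction is reversed, using the FOAF property foaf:depicts. The choice of this transformation is purely illustrative.

```sparql
prefix foaf: <http://xmlns.com/foaf/0.1/>

# Template: the triples of the output graph.
construct {
  ?image foaf:depicts ?item .
}
# Pattern: matched against the input graph.
where {
  ?item foaf:depiction ?image .
}
```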

describe queries

IRI → Graph
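
A minimal sketch of a describe query, using the item IRI shown earlier in these slides. Which triples end up in the returned graph is determined by the SPARQL implementation, so results may differ between endpoints.

```sparql
prefix id: <https://iisg.amsterdam/id/item/>

# Returns a graph that describes the given resource.
describe id:568328
```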


Geospatial data model (GeoSPARQL)

A feature can have both 2D and 3D shapes; each shape can be serialized in GML as well as in WKT.

GeoSPARQL: Geometry

geo:hasGeometry and geo:asWKT
GeoSPARQL, standardized by the Open Geospatial Consortium (OGC).
Popup for the shape bound to ?shape.

GeoSPARQL: Geometry

[ P O ]
Anonymous node notation (square brackets, […]) can be used to abbreviate unused subject terms.
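
A sketch of the bracket notation applied to the two GeoSPARQL properties. The geometry node that sits between the feature and its WKT literal is never needed in the result, so it is written as an anonymous node.

```sparql
prefix geo: <http://www.opengis.net/ont/geosparql#>

select ?feature ?shape {
  # Shorthand for: ?feature geo:hasGeometry ?g . ?g geo:asWKT ?shape .
  ?feature geo:hasGeometry [ geo:asWKT ?shape ] .
}
limit 25
```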

Find a Dutch building

bag namespace
Vocabulary of the Dutch Base Registry for Buildings (BAG) by the Dutch Cadastre (Kadaster).

Exercise: Find a house or street in the Netherlands.

PDOK Endpoint

Find a Dutch building: bracketed

S Q [ P O ]
Anonymous nodes are regular nodes that can be used in the object position as well.


  "hoofdadres": {
    "bijbehorendeOpenbareRuimte": {
      "bijbehorendeWoonplaats": { "label": "Amsterdam" },
      "naamOpenbareRuimte": "De Boelelaan"
    },
    "huisnummer": 1105,
    "postcode": "1081HV"
  }

  bag:hoofdadres [
    bag:bijbehorendeOpenbareRuimte [
      bag:bijbehorendeWoonplaats/rdfs:label "Amsterdam"@nl;
      bag:naamOpenbareRuimte "De Boelelaan"
    ];
    bag:huisnummer 1105;
    bag:postcode "1081HV"
  ]

GeoSPARQL: 3D geometries

select * {
  service <…> {
    …
  }
  values (?shapeColor ?shapeHeight) { ('green' 60) }
  bind(?shapeLabel as ?shapeName)
}
Color of the shape bound to ?shape.
Height of the shape bound to ?shape.
Name of the shape bound to ?shape.
values (?var … ?var) { (?term … ?term) … (?term … ?term) }
Specify multiple bindings.
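
A self-contained sketch of values with two rows of bindings. The second row is an assumption added for illustration; the query needs no graph data and simply returns the table that is spelled out inline.

```sparql
select ?shapeColor ?shapeHeight {
  # Each parenthesized tuple becomes one row of bindings.
  values (?shapeColor ?shapeHeight) {
    ('green' 60)
    ('red'   30)
  }
}
```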

GeoSPARQL + modifiers: order by

Show the 25 oldest buildings in Apeldoorn.


What is aggregation?

One or more functions that are applied to groups of values.

The groups are generated for each unique combination of values for a specified set of variables.

An example of groups

viaf:89204476   "Eduard Douwes Dekker"@en
viaf:89204476   "Eduard Douwes Dekker"@ja-jp
viaf:89204476   "Eduard Douwes Dekker"@hr-hr
viaf:89204476   "Eduard Douwes Dekker"@nl-nl

The set of variables is {?id}.

The groups are the sets of names per PersonID.

3-1: Count function

Applies the count function to each group of names.
group by ?id
Explicit grouping criterion.


having( … )
Applies a filter expression to each group.
Unlike filter, having can use aggregation functions.
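
A sketch that combines count, group by and having. The sdo:name property and the variable ?numberOfNames are assumptions for illustration.

```sparql
prefix sdo: <https://schema.org/>   # assumed vocabulary for names

select ?id (count(?name) as ?numberOfNames) {
  ?id sdo:name ?name .
}
group by ?id                   # one group of names per ?id
having (count(?name) > 1)      # keep only groups with more than one name
order by desc(?numberOfNames)
```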

3-2: Implicit grouping

Implicit grouping
When there is at least one aggregation function (e.g., count) and there is no group by clause.
group by ?id
The implicitly grouped-by variables are the ones that are (1) visible and (2) not an argument to an aggregation function.

3-3: Implicit grouping gone wrong

?id is no longer visible.

3-4: Group concatenation

Concatenate all arguments into one new string.
(group_concat(STRING;separator=STRING) as VAR)
Concatenate all bindings, interspersed with separators, into one new string.
Often used together with sub-select
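
A sketch of group_concat with an explicit grouping criterion. The sdo:name property is an assumption for illustration; distinct removes duplicate names before they are joined.

```sparql
prefix sdo: <https://schema.org/>   # assumed vocabulary for names

select ?id (group_concat(distinct ?name; separator='; ') as ?names) {
  ?id sdo:name ?name .
}
group by ?id
```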

3-5: Multiple aggregation functions

(sum(?employees) as ?employees)
Sum the number of employees per company ?id, regardless of ?name.
(group_concat(distinct ?name;separator=';') as ?name)
Concatenate the names for each company ?id.
Implicitly grouped by ?id
Visible and does not occur in an aggregation function.


Transitive predicates

Hierarchy: Org Chart

Uses the Historical International Standard Classification of Occupations (HISCO) dataset.

Hierarchy: TreeMap

Uses the Historical International Standard Classification of Occupations (HISCO) dataset.


DataCube: Observation

observation:0007ddade4 a qb:Observation;
  qb:dataSet dataset:countries;
  dimension:location country:Netherlands;
  dimension:year "2002"^^xsd:gYear;
  measure:lifeExpectancy 7.9696e1.

DataCube: Dataset

dataset:countries a qb:DataSet;
  qb:structure dsd:countries;
  sdmx-attribute:unitMeasure dbr:Year.

DataCube: Data Structure Definition

dsd:countries a qb:DataStructureDefinition;
  qb:component
    [ qb:dimension dimension:location ],
    [ qb:dimension dimension:year ],
    [ qb:measure measure:lifeExpectancy ],
    [ qb:attribute sdmx-attribute:unitMeasure;
      qb:componentAttachment qb:DataSet ].

dimension:location a qb:DimensionProperty;
  qb:concept sdmx-concept:refArea;
  rdfs:range vocab:Country;
  rdfs:subPropertyOf sdmx-dimension:refArea.

dimension:year a qb:DimensionProperty;
  qb:concept sdmx-concept:refPeriod;
  rdfs:range xsd:gYear;
  rdfs:subPropertyOf sdmx-dimension:refPeriod.

measure:lifeExpectancy a qb:MeasureProperty;
  rdfs:range xsd:double;
  rdfs:subPropertyOf sdmx-measure:obsValue.

Plot a measure for one dimension

Fixed dimension
dimension:year "2007"^^xsd:gYear
Plotted dimension
dimension:location/rdfs:label ?country
Plotted measure
measure:lifeExpectancy ?value
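
The three parts can be put together roughly as follows. The namespace IRIs behind the dimension: and measure: prefixes are not given in these slides, so the example.com IRIs are hypothetical placeholders.

```sparql
prefix dimension: <https://example.com/dimension/>   # hypothetical namespace
prefix measure: <https://example.com/measure/>       # hypothetical namespace
prefix qb: <http://purl.org/linked-data/cube#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select ?country ?value {
  ?observation a qb:Observation ;
    dimension:year "2007"^^xsd:gYear ;        # fixed dimension
    dimension:location/rdfs:label ?country ;  # plotted dimension
    measure:lifeExpectancy ?value .           # plotted measure
}
order by ?country
```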

Plot a measure for one dimension

  • Column 1: plotted dimension
  • Column 2: plotted measure
  • Column 3: coordinate label
  • Column 4: tooltip


  • Caption, axes, legend
  • Linear/polynomial trend
  • Error bars
  • Log scale

Thank you for your attention!

Wouter Beek (wouter@triply.cc),
Thomas de Groot (thomas.de.groot@triply.cc)