Testing for a bad namespace with SHACL
Overview
This is a brute-force approach to using SHACL to report invalid use of a namespace. It is only effective when the number of combinations of bad namespace and matching class [1] to be tested is small.
Using the SHACL shapes:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix d1: <http://ns.dataone.org/schema/SO/nsvalidation#> .

d1:DatasetBad1Shape
    a sh:NodeShape ;
    sh:targetClass <https://schema.orgDataset/> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <https://schema.org>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .

d1:DatasetBad2Shape
    a sh:NodeShape ;
    sh:targetClass <http://schema.org/Dataset> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <http://schema.org/>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .

d1:DatasetBad3Shape
    a sh:NodeShape ;
    sh:targetClass <http://schema.orgDataset/> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <http://schema.org>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .
and a graph with three SO:Dataset sub-graphs that use invalid namespaces:
[
    {
        "@context": {
            "@vocab": "https://schema.org"
        },
        "@id": "demo_0",
        "@type": "Dataset",
        "name": "https, no trailing slash"
    },
    {
        "@context": {
            "@vocab": "http://schema.org"
        },
        "@id": "demo_1",
        "@type": "Dataset",
        "name": "http, no trailing slash"
    },
    {
        "@context": {
            "@vocab": "http://schema.org/"
        },
        "@id": "demo_2",
        "@type": "Dataset",
        "name": "http only"
    }
]
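To make the three deviations explicit, the following minimal sketch compares each "@vocab" value against the expected https://schema.org/ and lists how it differs. It is not part of the SHACL test: the helper name describe_vocab is hypothetical, and the data is the graph above trimmed to the fields the check needs.

```python
import json

EXPECTED_VOCAB = "https://schema.org/"

# The three nodes from the graph above, trimmed to the fields used here.
DATA = json.loads("""
[
  {"@context": {"@vocab": "https://schema.org"}, "@id": "demo_0", "@type": "Dataset"},
  {"@context": {"@vocab": "http://schema.org"}, "@id": "demo_1", "@type": "Dataset"},
  {"@context": {"@vocab": "http://schema.org/"}, "@id": "demo_2", "@type": "Dataset"}
]
""")


def describe_vocab(vocab):
    """Return a list of the ways `vocab` differs from EXPECTED_VOCAB (hypothetical helper)."""
    problems = []
    if not vocab.startswith("https://"):
        problems.append("scheme is not https")
    if not vocab.endswith("/"):
        problems.append("no trailing slash")
    if not problems and vocab != EXPECTED_VOCAB:
        problems.append("unexpected namespace")
    return problems


# Map each node identifier to the problems found in its @vocab value.
report = {node["@id"]: describe_vocab(node["@context"]["@vocab"]) for node in DATA}
for node_id, problems in report.items():
    print(node_id, "->", ", ".join(problems) or "ok")
```

Each of the three nodes shows a different deviation, which is why the SHACL approach needs one shape per variant.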
The SHACL tests are applied and results printed:
import rdflib
import pyshacl

# Load the SHACL shapes
shape_graph = rdflib.Graph()
shape_graph.parse("examples/shapes/test_namespace.ttl", format="turtle")

# Load the data to be tested
data_graphs = rdflib.ConjunctiveGraph()
data_graphs.parse("examples/data/ds_bad_namespace.json", format="json-ld", publicID="https://example.net/")

conforms, results_graph, results_text = pyshacl.validate(
    data_graphs,
    shacl_graph=shape_graph,
    inference="rdfs",
    meta_shacl=True,
    abort_on_error=False,
    debug=False
)
print(results_text)
Validation Report
Conforms: False
Results (3):
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
Severity: sh:Violation
Source Shape: d1:DatasetBad2Shape
Focus Node: <https://example.net/demo_2>
Value Node: <https://example.net/demo_2>
Message: Expecting SO namespace of <https://schema.org/> not <http://schema.org/>
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
Severity: sh:Violation
Source Shape: d1:DatasetBad3Shape
Focus Node: <https://example.net/demo_1>
Value Node: <https://example.net/demo_1>
Message: Expecting SO namespace of <https://schema.org/> not <http://schema.org>
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
Severity: sh:Violation
Source Shape: d1:DatasetBad1Shape
Focus Node: <https://example.net/demo_0>
Value Node: <https://example.net/demo_0>
Message: Expecting SO namespace of <https://schema.org/> not <https://schema.org>
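When only the offending nodes and messages are of interest, the report text can be reduced to one record per result. The sketch below is an illustration rather than a pyshacl feature: the function name summarize is hypothetical, and the parsing assumes the text layout shown in the report above. The results_graph returned by pyshacl.validate is likely the more robust source for programmatic use.

```python
import re

# The report text printed above, used here as sample input.
REPORT = """\
Validation Report
Conforms: False
Results (3):
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
    Severity: sh:Violation
    Source Shape: d1:DatasetBad2Shape
    Focus Node: <https://example.net/demo_2>
    Value Node: <https://example.net/demo_2>
    Message: Expecting SO namespace of <https://schema.org/> not <http://schema.org/>
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
    Severity: sh:Violation
    Source Shape: d1:DatasetBad3Shape
    Focus Node: <https://example.net/demo_1>
    Value Node: <https://example.net/demo_1>
    Message: Expecting SO namespace of <https://schema.org/> not <http://schema.org>
Constraint Violation in NotConstraintComponent (http://www.w3.org/ns/shacl#NotConstraintComponent):
    Severity: sh:Violation
    Source Shape: d1:DatasetBad1Shape
    Focus Node: <https://example.net/demo_0>
    Value Node: <https://example.net/demo_0>
    Message: Expecting SO namespace of <https://schema.org/> not <https://schema.org>
"""

# Matches the three fields of interest at the start of a (possibly indented) line.
FIELD = re.compile(r"^\s*(Source Shape|Focus Node|Message):\s*(.+?)\s*$", re.MULTILINE)


def summarize(report_text):
    """Return one dict per result, keyed by field name (hypothetical helper)."""
    # Each result starts with a "Constraint Violation in ..." line; the text
    # before the first such line is the report header and is skipped.
    blocks = re.split(r"^\s*Constraint Violation in .*$", report_text, flags=re.MULTILINE)[1:]
    return [dict(FIELD.findall(block)) for block in blocks]


results = summarize(REPORT)
for result in results:
    print(result["Focus Node"], result["Source Shape"])
```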
For comparison, a valid SO:Dataset:
{
    "@context": {
        "@vocab": "https://schema.org/"
    },
    "@graph": [
        {
            "@type": "Dataset",
            "@id": "./",
            "identifier": "dataset-01",
            "name": "Dataset with metadata about",
            "description": "Dataset snippet with metadata and data components indicated by hasPart and the descriptive metadata through an about association.",
            "license": "https://creativecommons.org/publicdomain/zero/1.0/",
            "hasPart": [
                {
                    "@id": "./metadata.xml"
                },
                {
                    "@id": "./data_part_a.csv"
                }
            ]
        },
        {
            "@id": "./metadata.xml",
            "@type": "MediaObject",
            "contentUrl": "https://example.org/my/data/1/metadata.xml",
            "dateModified": "2019-10-10T12:43:11.000+00:00",
            "description": "A metadata document describing the Dataset and the data component",
            "encodingFormat": "http://www.isotc211.org/2005/gmd",
            "about": [
                {
                    "@id": "./"
                },
                {
                    "@id": "./data_part_a.csv"
                }
            ]
        },
        {
            "@id": "./data_part_a.csv",
            "@type": "MediaObject",
            "contentUrl": "https://example.org/my/data/1/data_part_a.csv"
        }
    ]
}
This graph does not match any of the bad namespace tests and so conforms.
# Start from an empty graph so the invalid examples loaded earlier are not tested again
data_graphs = rdflib.ConjunctiveGraph()
data_graphs.parse("examples/data/ds_m_about.json", format="json-ld", publicID="https://example.net/")

conforms, results_graph, results_text = pyshacl.validate(
    data_graphs,
    shacl_graph=shape_graph,
    inference="rdfs",
    meta_shacl=True,
    abort_on_error=False,
    debug=False
)
print(results_text)
Validation Report
Conforms: True
Footnotes
[1] The limitation of this approach stems from the need to identify a target node that the SHACL constraints are applied against. Adding checks for additional SO: types with this pattern requires a separate sh:targetClass rule for each combination of namespace and type. In this case, three entries for each type being tested would be required.
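The growth described in this footnote can be sketched by generating the shapes rather than writing them by hand. The following is an illustration under stated assumptions, not part of the original test: the function name build_shapes and the shape naming scheme are hypothetical, MediaObject is used only as an example of a second type, and the target IRI templates are copied from the three Dataset shapes used on this page.

```python
PREFIXES = """\
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix d1: <http://ns.dataone.org/schema/SO/nsvalidation#> .
"""

# (namespace shown in the message, target IRI template), following the
# patterns of the three Dataset shapes on this page.
BAD_NAMESPACES = [
    ("https://schema.org", "https://schema.org{name}/"),
    ("http://schema.org/", "http://schema.org/{name}"),
    ("http://schema.org", "http://schema.org{name}/"),
]

SHAPE_TEMPLATE = """\
d1:{name}Bad{index}Shape
    a sh:NodeShape ;
    sh:targetClass <{target}> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <{label}>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .
"""


def build_shapes(type_names):
    """Return one Turtle shape per (type, bad namespace) pair (hypothetical helper)."""
    shapes = []
    for name in type_names:
        for index, (label, template) in enumerate(BAD_NAMESPACES, start=1):
            shapes.append(
                SHAPE_TEMPLATE.format(
                    name=name,
                    index=index,
                    target=template.format(name=name),
                    label=label,
                )
            )
    return shapes


# Two types and three bad namespaces give six shapes.
shapes = build_shapes(["Dataset", "MediaObject"])
print(PREFIXES + "\n" + "\n".join(shapes))
```

The number of shapes is the number of types multiplied by the number of bad namespace variants, which is why the pattern is practical only for a short list of types.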
Running code on this page
All examples on this page can be run live in Binder. To do so:
- Click on the “Activate Binder” button.
- Wait for Binder to be active. This can take a while; you can watch progress in your browser’s JavaScript console. When a line like
Kernel: connected (89dfd3c8...
appears, Binder should be ready to go.
- Run the following before any other script on the page. This sets the right path context for loading examples etc.
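import os

# Change to the documentation source folder; ignore the error if it is
# missing (for example, when already running from that folder).
try:
    os.chdir("docsource/source")
except OSError:
    pass
print("Page is ready. You can now run other code blocks on this page.")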