Monday, September 7, 2009

Creating and Managing Metadata about RDF statements

Metadata about RDF statements can be quite useful and even required for a number of purposes. For example, let’s consider questions one may have about a statement “Washington, DC is a capital of the United States”:

  • What is the provenance of this statement – who said this, when did they say this?
  • What is the temporal scope of this statement – when did DC become a capital of the US, and is it still the capital?
  • What is the access control for this statement – who can see it, who can change it?


Interest in this topic is evidenced by Paul Hermans' blog summarizing recent discussions on approaches to implementing such metadata http://www.proxml.be/users/paul/weblog/9d47d/A_must_read__Temporal_Scope_for_RDF_Triples.html.

We are often asked what approach TopQuadrant recommends for supporting statements about statements, for example for versioning or governance. TopBraid Suite is fully flexible in this respect and can be used to implement any number of approaches. In our own work, however, we have found an approach based on RDF reification to be particularly useful.
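To make the idea of reification concrete, here is a minimal sketch in plain Python (no RDF library) that describes the example triple as an rdf:Statement and then attaches metadata to it. The rdf:type, rdf:subject, rdf:predicate and rdf:object terms are the standard RDF reification vocabulary; the `http://example.org/` namespace, the `assertedBy` and `assertedOn` properties and all values are assumptions made up for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def reify(statement_id, triple):
    """Return the four triples that describe `triple` as an rdf:Statement."""
    s, p, o = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


# The statement we want to talk about.
capital = (EX + "WashingtonDC", EX + "capitalOf", EX + "UnitedStates")
stmt = EX + "statement1"

graph = reify(stmt, capital)

# Metadata hangs off the statement resource, not off the original triple.
# Both properties and both values below are hypothetical.
graph.append((stmt, EX + "assertedBy", EX + "someEditor"))
graph.append((stmt, EX + "assertedOn", "2009-09-07"))

for triple in graph:
    print(triple)
```

The point of the design is that `stmt` is an ordinary resource, so provenance, temporal scope or access control can be expressed with ordinary triples about it.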

TopBraid Composer provides a Change History view that shows every added and deleted triple. The view is based on a small ontology called change.owl, which is available as part of the TopBraid library. It contains a class Change and a handful of properties – added, deleted, graph and timestamp. The class Change is described as follows: “A change to an RDF Graph, encapsulating lists of added and/or deleted rdf:Statements. Additional metadata such as a timeStamp (or author or whatever) can be added”.
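The shape of such a change record can be sketched in Python as follows. This is only an illustration of the structure described above, not the change.owl ontology itself: the field names mirror the added, deleted, graph and timestamp properties, `author` is a hypothetical extension, and computing a change by comparing two snapshots is a simplification chosen for the example rather than how TopBraid Composer records changes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Change:
    """One change to a graph: lists of added and deleted triples plus metadata."""
    graph: str
    added: List[Triple] = field(default_factory=list)
    deleted: List[Triple] = field(default_factory=list)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    author: Optional[str] = None  # hypothetical extension, not a core property


def diff(graph_uri, before, after, author=None):
    """Build a Change from two snapshots (sets of triples) of the same graph."""
    return Change(
        graph=graph_uri,
        added=sorted(after - before),
        deleted=sorted(before - after),
        author=author,
    )


EX = "http://example.org/"  # hypothetical namespace
old = {(EX + "Term1", EX + "label", "Capitol")}
new = {(EX + "Term1", EX + "label", "Capital")}

change = diff(EX + "vocabulary", old, new, author="editor1")
print(change.added)
print(change.deleted)
```

Correcting a label shows up as one deleted triple and one added triple in the same record, which is what makes a history view of this kind readable.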

We often extend this model for particular applications to add the metadata the application requires, for example, author or scope. TopBraid Composer inserts change statements automatically: every time a triple is added or deleted, a new change statement is created. In web applications deployed under TopBraid Live, we use SPARQLMotion scripts that start with the sml:TrackChanges module.

sml:TrackChanges is used to implement services that are executed as a side effect of a change to an RDF model. In TopBraid, any script containing an instance of this module is executed as part of each change. The output of this module uses the http://topbraid.org/change ontology, with triples describing the changes that have happened.

In other words, TopBraid listens for changes and, when a change happens, triggers execution of any scripts containing the TrackChanges module. One can provide a filter to specify which types of changes a script should react to. As with any SPARQLMotion script, what happens when the script is triggered is up to the script designer. For example, in the Enterprise Vocabulary Management solution we use this approach to stamp every change with the author id and a timestamp, and also to trigger governance processes: sending e-mails about the changes to the appropriate parties, promoting approved changes, and so on.
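The listen, filter and react pattern described above can be illustrated with a small Python sketch. It is not SPARQLMotion and does not use any TopBraid API; `TrackedGraph`, `on_change`, `stamp`, `only_labels` and the `http://example.org/` namespace are all hypothetical names introduced for this example.

```python
from datetime import datetime, timezone


class TrackedGraph:
    """A toy triple store that notifies registered handlers about each change."""

    def __init__(self):
        self.triples = set()
        self._listeners = []  # list of (filter, handler) pairs

    def on_change(self, handler, change_filter=None):
        """Register a handler; the optional filter decides which changes it sees."""
        accept_all = lambda kind, triple: True
        self._listeners.append((change_filter or accept_all, handler))

    def add(self, triple):
        if triple not in self.triples:
            self.triples.add(triple)
            self._notify("added", triple)

    def delete(self, triple):
        if triple in self.triples:
            self.triples.remove(triple)
            self._notify("deleted", triple)

    def _notify(self, kind, triple):
        for change_filter, handler in self._listeners:
            if change_filter(kind, triple):
                handler(kind, triple)


EX = "http://example.org/"  # hypothetical namespace
audit_log = []


def stamp(kind, triple):
    """Handler: record the change with an author id and a timestamp."""
    audit_log.append({
        "kind": kind,
        "triple": triple,
        "author": "editor1",  # hypothetical author id
        "timestamp": datetime.now(timezone.utc),
    })


def only_labels(kind, triple):
    """Filter: react only to changes of the (hypothetical) label property."""
    return triple[1] == EX + "label"


graph = TrackedGraph()
graph.on_change(stamp, only_labels)

graph.add((EX + "Term1", EX + "label", "Capital city"))
graph.add((EX + "Term1", EX + "broader", EX + "Term0"))  # ignored by the filter
graph.delete((EX + "Term1", EX + "label", "Capital city"))

print([entry["kind"] for entry in audit_log])
```

A governance step such as sending a notification would simply be another handler registered with its own filter.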

This is a blog by TopQuadrant, developers of the TopBraid Suite, created to support the pursuit of our ongoing mission - to explode strange semantic myths, to seek out new models that support a new generation of dynamic business applications, to boldly integrate data that no one has integrated before.