
Sunday, June 26, 2011

Comparing SPIN with RIF

Since SPIN (SPARQL Inferencing Notation), also known as SPARQL Rules, became a W3C member submission, we find ourselves responding to growing interest in it.

With this, a question some may ask is how SPIN differs from or resembles RIF, the W3C standard for rule interchange.

While I have heard this asked a couple of times, I was pleasantly surprised that it is not a very common question. Pleasantly, because a certain level of confusion is to be expected about new things, and both SPIN and RIF are relatively new. If so few people ask this question, then the SPIN specification did a good job of explaining and positioning it, and people easily grasp the unique and important needs it serves. Still, I thought it was worthwhile to write up my thoughts on comparing SPIN with RIF.

The goal of RIF was to create an interchange format for use between rule engines. As such, unlike SPIN, RIF is not specifically or particularly aligned with RDF. This is why RIF was created as XML (although there is now work on an RDF serialization). I am not pointing this out as a shortcoming of RIF, but rather to put the origin of and reason for RIF in perspective. In its goals, RIF is similar to OMG's XMI, which also uses XML and was created as an interchange format between different tools.

Given this similarity, XMI's failure to become a reliable interchange format is relevant when considering RIF's future. Will RIF succeed in reaching its goal? One can easily argue that, given the variety of available rule languages and engines, RIF's job is harder than what XMI needed to do to succeed.

As noted here, different rule languages exist because there are different algorithms and formalisms for rules. Furthermore, different rule products have different sets of capabilities. RIF dialects are intended to be least common denominators for a given type of rule engine. This means that in order to effectively use the same set of RIF rules in rule engine A and in rule engine B, the following needs to happen:

1. The RIF dialect used to express the rules needs to be supported by both rule engines.

Checking the implementation page, one will see that the overlap between any two engines is currently not that great. Some support BLD, some support PRD plus Core, others support partial BLD or PRD minus something, and so on.


2. The RIF dialect used to express the rules must be sufficient for the task at hand.


As mentioned above, RIF is by design somewhat of a least common denominator. This means that a user can always do more with a given rule engine than they can express in a dialect of RIF.

For example (as noted here), SPARQL is more expressive than what is possible with RIF. This is not unique to SPARQL; it is true for pretty much any rules technology.
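To give a flavor of the expressiveness gap, here is a small sketch of a SPARQL 1.1 query using aggregation and negation (features that RIF Core rules do not offer). The `ex:` property names are made up for illustration:

```sparql
# Per-department average salary, ignoring terminated employees.
# Aggregation (AVG, GROUP BY) and FILTER NOT EXISTS are SPARQL 1.1
# features with no direct counterpart in RIF Core.
PREFIX ex: <http://example.org/>

SELECT ?dept (AVG(?salary) AS ?avgSalary)
WHERE {
    ?emp ex:dept ?dept ;
         ex:salary ?salary .
    FILTER NOT EXISTS { ?emp ex:status ex:Terminated }
}
GROUP BY ?dept
```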

3. The interchange must work.

Given the well-known XMI issues, I am quite keen to see RIF test cases, as well as test case results from the implementers.

The attitude of the major rule engine vendors toward RIF is currently, at best, lukewarm. For example, on the Oracle forum, support engineers recommend against attempting to interchange rules, saying:

“In a hybrid environment I'd recommend that rules authored in ILOG be executed in the ILOG engine, and that rules authored in OPA be executed in the OPA engine, rather than attempt to interchange rules between the two products. As long as there is a clear scope boundary between what the rule sets are used for, then there wouldn't be any duplication or interchange of rules.”

Having considered the design goals and challenges of RIF, it is easy to see that the design goals of SPIN are quite different. SPIN is not about capturing rules that can then be translated for execution by different types of rule engines. Rather, it is about capturing rules that can be executed directly over RDF data and about having rules that are intimately connected to the Semantic Web models.

With these goals in mind, we identified the following three things as important principles in SPIN's design:

1. Rules can be expressed in a familiar language. People working with RDF must know SPARQL. Using SPARQL for rules means they don't need to learn another language.

2. Rules can be executed by any RDF database. Since they are in SPARQL, rules are portable – not across rule engines, but across RDF stores.

3. Evolution of the models does not unnecessarily break the rules. For example, let's say we change the URI of a resource used in a rule. If the rule is expressed in some other format (such as XML) and is connected to the underlying RDF only as an opaque blob, it becomes hard to keep these two sets of information in sync.
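To make these principles concrete, here is a minimal sketch of what a SPIN rule looks like. In SPIN, a SPARQL CONSTRUCT query is attached to a class (via `spin:rule`), and the variable `?this` is bound to each instance of that class; the `ex:` names below are hypothetical:

```sparql
# A rule one might attach to an ex:Person class: infer grandparents
# from two parent links. ?this ranges over instances of the class.
PREFIX ex: <http://example.org/>

CONSTRUCT {
    ?this ex:grandParent ?grandParent .
}
WHERE {
    ?this ex:parent ?parent .
    ?parent ex:parent ?grandParent .
}
```

Because the rule is just SPARQL over the model's own URIs, renaming `ex:parent` in the ontology can be propagated to the rule by tooling, rather than leaving a dangling reference inside an opaque blob.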

Finally, SPIN takes an object-oriented approach to rules. It is about programming and about associating behavior with classes, while RIF takes a model-theoretic view of how rules may relate to ontologies. This is a key difference, as noted in the W3C comments on the SPIN submission.

In short, SPIN and RIF address different needs and have different design goals. They can be considered complementary.

What about using SPIN and RIF together? Given the key role SPARQL plays in the architecture of Semantic Web solutions, I am certain that, should RIF gain traction, someone will create a RIF profile for SPARQL and write a RIF-to-SPARQL translation.

Wednesday, October 14, 2009

SPIN Tutorial Available

I recently wrote in my personal blog about how, since joining TopQuadrant, I've grown to appreciate how well SPARQL can serve as a rules language. SPARQL Inferencing Notation, or SPIN, lets you associate rules and constraints expressed in SPARQL with classes of triples. While you don't need to use TopQuadrant products to take advantage of SPIN, they sure make it much easier, especially if you want to use those rules as part of an application. I just finished writing a tutorial (pdf) on how to implement SPIN rules and constraints with your models using TopBraid Composer, and except for one optional detail of the tutorial, it all works with the free edition, so it's available for anyone with a Mac or Windows machine to try.

Using a small collection of data about service contracts and materials purchases, the tutorial walks you through the creation of:

  • your own functions, written in SPARQL and returning values of whatever type you like

  • inferencing: the generation of new triples based on other data (in the case of the tutorial, the generation of ISO 8601 yyyy-mm-dd format invoiceDate values from "mm/dd/yy" date values stored in the original data)

  • constructors: the automatic generation of a postingDate value when a new MaterialsPurchase or ServiceContract instance is created

  • constraints: setting up the system to alert the user to unpaid materials purchases that are more than 90 days old or unpaid service contracts that are more than 60 days old. Instead of using a lot of redundant code to achieve these two different but similar goals, the tutorial shows how to define a reusable template with a SPARQL query and pass parameters to it (in this case, the numbers 60 or 90) when using the template.
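To give a flavor of the inferencing step, here is a sketch of a SPIN-style rule in plain SPARQL that derives an ISO 8601 date from an "mm/dd/yy" string. The property names and the assumption that all two-digit years fall in the 2000s are mine, not the tutorial's exact code:

```sparql
# For each instance (?this) with a raw "mm/dd/yy" date string,
# construct a yyyy-mm-dd invoiceDate value.
# SUBSTR is 1-based in SPARQL 1.1: positions 1-2 = month,
# 4-5 = day, 7-8 = two-digit year.
PREFIX ex: <http://example.org/>

CONSTRUCT {
    ?this ex:invoiceDate ?iso .
}
WHERE {
    ?this ex:rawDate ?raw .
    BIND (CONCAT("20", SUBSTR(?raw, 7, 2), "-",
                 SUBSTR(?raw, 1, 2), "-",
                 SUBSTR(?raw, 4, 2)) AS ?iso)
}
```

For example, a raw value of "06/26/11" would yield "2011-06-26". The constraint templates in the tutorial work similarly, except that a number such as 60 or 90 is passed in as a template argument instead of being hard-coded in the query.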

I hope the tutorial demonstrates the potential connections between SPIN technology and real-world business issues to its readers, as well as the ease of implementing it all with TopBraid Composer.

This is a blog by TopQuadrant, developers of the TopBraid Suite, created to support the pursuit of our ongoing mission - to explode strange semantic myths, to seek out new models that support a new generation of dynamic business applications, to boldly integrate data that no one has integrated before.