Sunday, June 26, 2011

Comparing SPIN with RIF

Since SPIN (SPARQL Inferencing Notation) aka SPARQL Rules became W3C member submission,we find ourselves responding to the growing interest to it.

With this, a question some may ask is how SPIN is different from or similar to RIF - W3C's standard for rules interchange.

While I have heard this asked a couple of times, I was pleasantly surprised that it was is not a very common question. Pleasantly, because a certain level of confusion is to be expected about new things and, both, SPIN and RIF are relatively new. If so few people ask this question, then SPIN specification did a good job explaining and positioning it and people easily grasp the unique and important needs it serves. Still, I thought it was worth while to write up my thoughts on comparing SPIN with RIF.

The goal of RIF was to create an interchange format for use between rules engines. As such, unlike SPIN, RIF is not an idea that is specifically or particularly aligned with RDF. This is why RIF was created as XML (although there is now work on RDF serialization). I am not pointing this out as a shortcoming of RIF, but rather to put in perspective the origin and the reason for RIF. In its goals, RIF is similar to OMG's XMI which also uses XML and was created to be an interchange format between different tools.

Given this similarity, XMI’s failure in being a reliable interchange format becomes relevant when considering RIF's future. Will RIF succeed in reaching its goal? One can easily argue that with the variety of available rules languages and engines, RIF’s job is harder than what XMI needed to do to succeed.

As noted here, different rules languages exist because there are different algorithms and formalisms for rules. Furthermore, different rule products have different sets of capabilities. RIF dialects are intended to be the least common denominators for a given type of a rule engine. This means that in order to effectively use the same set of RIF rules in the ‘rules engine A’ and in the ‘rules engine B’, the following needs to happen:

1. RIF dialect used to express the rules, needs to be supported by both rules engines.

Checking the implementation page, one will see that currently the overlap between any two engines is not that great. Some support BLD, some support PRD + Core, others support BLD partial or PRD minus something, etc.


2. RIF dialect used to express the rules, must be enough for the task at hand.


As mentioned above, RIF by design is somewhat of a least common denominator. This means that a user could always do more with a given rules engine than they can express in a dialect of RIF.

For example (as noted here), SPARQL is more expressive than what is possible with RIF. This is not unique to SPARQL, it is true for pretty much any rules technology.

3. The interchange must work

Given well known XMI issues, I am quite keen to see RIF test cases as well as test case results from the implementers

Attitude of the major rule engine vendors towards RIF is currently, at best, lukewarm. For example, on the Oracle forum, support engineers recommend against attempting to interchange rules by saying:

“In a hybrid environment I'd recommend that rules authored in ILOG be executed in the ILOG engine, and that rules authored in OPA be executed in the OPA engine, rather than attempt to interchange rules between the two products. As long as there is a clear scope boundary between what the rule sets are used for, then there wouldn't be any duplication or interchange of rules.”

Having considered the design goals and challenges of RIF, it is easy to see that the design goals of SPIN are quite different. SPIN is not about capturing rules that can then be translated for execution by different types of rule engines. Rather it is about capturing rules that can be executed directly over RDF data and about having rules that are intimately connected to the Semantic Web models.

With these goals in mind, we identified the following three things as important principles in SPIN's design:

1. Rules can be expressed in a familiar language. People working with RDF must know SPARQL. Using SPARQL for rules means that they don’t need to use another language

2. Rules can be executed by any RDF database. Since they are in SPARQL, rules are portable – not across rules engines, but across RDF stores

3. Evolution of the models does not unnecessarily break the rules. For example, let’s say we change the URI of a resource used in a rule. If a rule uses some other format (XML) and is not connected to the underlying RDF in a way other than a blob, it becomes hard to maintain these two different sets of information

Finally, SPIN takes an object-oriented approach to rules. It is about programming and about associating behavior with classes while RIF takes a model-theoretic view on how the rules may relate to ontologies. This is a key difference as noted in W3C comments on SPIN submission.

In short, SPIN and RIF address different needs and have different design goals. They can be considered complimentary.

What about using SPIN and RIF together? Given the key role SPARQL plays in the architecture of Semantic Web solutions, I am certain that should RIF get traction in its adoption, someone will create a RIF profile for SPARQL and write a RIF to SPARQL translation.

This is a blog by TopQuadrant, developers of the TopBraid Suite, created to support the pursuit of our ongoing mission - to explode strange semantic myths, to seek out new models that support a new generation of dynamic business applications, to boldly integrate data that no one has integrated before.