The SKOS-XL extension to the W3C’s SKOS standard for vocabulary management adds flexibility in how
you track concept names, but it adds complexity and potential confusion that
are rarely, if ever, worth it.
What is the appeal of SKOS-XL? Information modelers who want to
separate concepts (abstract ideas) from terms (the names people use
for concepts) often base their thinking on the Semiotic Triangle (http://en.wikipedia.org/wiki/Triangle_of_reference),
or Peirce’s Triangle. Sometimes also called the triangle of meaning, this model
distinguishes a concept that exists in a human mind (a thought) from how it is
referred to and from a symbol that evokes it.
A referent is understood as a word. A symbol is typically
explained as a pictorial depiction. A key aspect of this theory is its focus on human
cognition. It postulates that there can be no name or identity intrinsic to a
concept as it only exists as a thought in a human mind.
The challenge of applying this thinking to information modeling is
that, ultimately, in information modeling we must commit everything to paper,
electronic or otherwise. Thus, every concept must have an identity and a name.
As a result, a separate model for concepts and terms where terms themselves
have identity, names, relationships and are tracked separately from concepts is
typically an over-complication that does not deliver practical value. For one
thing, even explaining the difference between a concept and a term to a business
audience is not simple. Colloquially, these words are often used interchangeably.
Even once the difference is explained, distinguishing the two and keeping track of them on an ongoing basis,
when both “concepts” and “terms” are more often than not named using the same
words, can be mind-boggling.
SKOS takes a simpler and what we believe to be a more practical approach
to information modeling. It provides a way to describe concepts by giving each
one:
- A globally unique identity
- A preferred label, called skos:prefLabel, that is unique for a given human language (such as English or German) within the scope of a particular “concept scheme”.
- Any number of alternative labels, called skos:altLabel. A concept’s alternative labels in a given language should not be the same as its preferred label in that language.
- Whatever other properties (attributes and relationships) are deemed necessary:
  - SKOS supplies some standard relationships, such as skos:broader, skos:related and skos:exactMatch, and a number of annotations that are thought to be universally useful, such as skos:definition and skos:editorialNote.
  - Users of SKOS are free to add properties specific to their domain. For example, when using SKOS to describe companies, a user may want to add a stock ticker field.
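To make this concrete, here is a minimal Turtle sketch of a single concept described in plain SKOS. The ex: namespace, the ex:AcmeCorp, ex:Companies and ex:Manufacturer resources, and the ex:stockTicker property are hypothetical names made up for illustration; only the skos: terms come from the standard.

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/vocab/> .   # hypothetical namespace

# One concept with a global identity (its URI), labels and other properties
ex:AcmeCorp a skos:Concept ;
    skos:inScheme   ex:Companies ;                  # hypothetical concept scheme
    skos:prefLabel  "Acme Corporation"@en ;         # one preferred label per language
    skos:altLabel   "Acme"@en , "Acme Corp."@en ;   # any number of alternative labels
    skos:definition "A fictional manufacturer used here only as an example."@en ;
    skos:broader    ex:Manufacturer ;               # standard SKOS relationship
    ex:stockTicker  "ACME" .                        # domain-specific property, not part of SKOS
```

Note that the labels here are plain strings attached directly to the concept; they have no identity of their own.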
If needed, metadata about labels can be captured without giving
the labels an identity of their own. TopBraid EVN is
a good example of a tool that offers this capability. Apart from the language
tag, such metadata is typically not about the label itself but about its
relationship to the concept: for example, who said that this is a preferred
label for this concept, and when. All relationships are between concepts,
not between labels.
The W3C has published an optional extension to SKOS called SKOS-XL
(SKOS eXtension for Labels) that accommodates those who want to give separate
identity to concepts and terms. It does not use the word “term”, presumably because
terms are often informally understood as concepts and vice versa. Instead, it
introduces a class, skosxl:Label, explained as a “lexical entity”. While the extension
is small, with only one new class and five new properties, its implications are far-reaching.
As a result, providing tool support for SKOS-XL is considerably more complex
than for SKOS proper.
Concepts are connected to Labels by relationships that indicate the
preferred (skosxl:prefLabel) and alternative (skosxl:altLabel) Labels for a
Concept. There are no cardinality restrictions on these relationships; that is, a
Concept can be linked to multiple Labels using the skosxl:prefLabel link. Labels
can be linked to each other using the skosxl:labelRelation
relationship. These links are separate from the relationships between Concepts.
Direct use of the SKOS properties that associate label strings with Concepts can be tricky when using SKOS-XL. According to SKOS-XL, label strings for Concepts are derived using rules such as:
The property chain (skosxl:prefLabel, skosxl:literalForm) is a sub-property of skos:prefLabel.
This means that if there is a Label ex:Label1 with the literal form
“love” and a Concept ex:Concept1 that connects to ex:Label1 using a
skosxl:prefLabel relationship, we can conclude that ex:Concept1 has a skos:prefLabel value of “love”. Since keeping
directly entered and inferred values consistent at the same time is problematic, any tool supporting
SKOS-XL must prevent the user from directly entering label strings for Concepts.
This makes it difficult to use the same tool to edit both SKOS and SKOS-XL
vocabularies, especially if users want to intermix different vocabulary
formats.
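As an illustration, the following Turtle sketch shows the SKOS-XL structure described above next to the label strings that the property chain rules would derive from it. The ex: namespace and its resources are hypothetical, the second Label (“affection”) is invented for the example, and the entailed triples appear only as comments because they are inferred rather than asserted.

```turtle
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ex:     <http://example.org/vocab/> .   # hypothetical namespace

# Asserted: Labels are resources in their own right
ex:Label1 a skosxl:Label ;
    skosxl:literalForm "love"@en .

ex:Label2 a skosxl:Label ;
    skosxl:literalForm "affection"@en ;
    skosxl:labelRelation ex:Label1 .     # Label-to-Label link, independent of Concept links

# Asserted: the Concept points at Label resources, not at strings
ex:Concept1 a skos:Concept ;
    skosxl:prefLabel ex:Label1 ;
    skosxl:altLabel  ex:Label2 .

# Entailed by the SKOS-XL property chain rules (not entered by the user):
#   ex:Concept1 skos:prefLabel "love"@en .
#   ex:Concept1 skos:altLabel  "affection"@en .
```

If a user also asserted a skos:prefLabel string directly on ex:Concept1, a tool would have to reconcile it with the entailed value, which is the integrity problem described above.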
Furthermore, a user will see the same text label for different
entities. This happens not only because different Labels can have the same
literal form, but also because Concept resources “inherit” string labels
from their associated Label resources. This can easily lead to confusing results.
There are also various integrity clashes between SKOS and SKOS-XL.
For example:
1. Two different preferred labels in the same language
ex:Concept1 skosxl:prefLabel ex:Label1 ; skosxl:prefLabel ex:Label2 .
ex:Label1 skosxl:literalForm "love"@en .
ex:Label2 skosxl:literalForm "adoration"@en .
This is not “wrong” according to SKOS-XL, because a Concept can be
connected to multiple Labels using the skosxl:prefLabel relationship. It does
mean, however, that ex:Concept1 has skos:prefLabel values of both "love"@en and
"adoration"@en. This violates SKOS constraint S14, which
prohibits a concept from having more than one preferred label string in a given
language.
2. Clash between preferred and alternative labels
ex:Concept1 skosxl:prefLabel ex:Label1 ; skosxl:altLabel ex:Label2 .
ex:Label1 skosxl:literalForm "love"@en .
ex:Label2 skosxl:literalForm "love"@en .
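Again, this is not “wrong” according to SKOS-XL, because different
Labels can have the same literal form, but it is a problem for SKOS because it
implies identical English-language preferred and alternative label strings for
ex:Concept1, which SKOS constraint S13 rules out by making skos:prefLabel, skos:altLabel and skos:hiddenLabel pairwise disjoint.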
Without a doubt, these issues help explain why, while the
use of and tool support for SKOS are growing, there are few if any tools for
SKOS-XL or published SKOS-XL vocabularies. An even more important factor is the
lack of compelling business value that would justify the complexity of SKOS-XL. Having
talked to a wide range of users working on business vocabularies, we have yet
to hear a use case that cannot be supported by SKOS alone.
Does SKOS-XL look like the only viable approach to your vocabulary
management needs? Let’s discuss it—maybe we can help you find a simpler
solution.