Yesterday a newcomer on the TopBraid Users Forum asked how ontologies differ from logical data models. As is to be expected on the TopBraid Forum, by ontologies he meant specifically ontology models expressed in RDFS/OWL. Because we frequently hear this or similar questions in our trainings and workshops and in conversations with customers, I decided to respond in a blog post instead of writing an e-mail.
Data modeling was invented more than thirty years ago to help with the design of databases, specifically relational databases. As quoted below, the ANSI definition from 1975 differentiated between three data models – conceptual, logical and physical. Data modeling quickly became recognized as a tool for analyzing the semantics of an organization with respect to the structure and flow of the information used in carrying out the organization’s activities. Wikipedia offers the following definition of Data Modeling:
Data modeling is a method used to define and analyze data requirements needed to support the business processes of an organization. The data requirements are recorded as a conceptual data model with associated data definitions. Actual implementation of the conceptual model is called a logical data model.
<…>
In 1975 ANSI described three kinds of data-model instance:
- Conceptual schema: describes the semantics of a domain (the scope of the model). For example, it may be a model of the interest area of an organization or of an industry. This consists of entity classes, representing kinds of things of significance in the domain, and relationship assertions about associations between pairs of entity classes. A conceptual schema specifies the kinds of facts or propositions that can be expressed using the model. In that sense, it defines the allowed expressions in an artificial "language" with a scope that is limited by the scope of the model.
- Logical schema: describes the structure of some domain of information. This consists of descriptions of (for example) tables, columns, object-oriented classes, and XML tags.
- Physical schema: describes the physical means used to store data. This is concerned with partitions, CPUs, tablespaces, and the like.
According to ANSI, this approach allows the three perspectives to be relatively independent of each other. Storage technology can change without affecting either the logical or the conceptual model. The table/column structure can change without (necessarily) affecting the conceptual model.
These definitions describe a clear progression from conceptual to logical to physical data models. Since their origin is in the 70s, they reflect certain technology assumptions that no longer hold true.
When information modeling is done to create a relational database, the conceptual model must be different from the logical model because there is no place in a relational database structure to, for example, capture business rules, create subsumption relationships or describe other key aspects of a conceptual model. This semantic information, collected and documented as part of the initial modeling, is left behind when modelers and designers move on to define a logical data model. The "left behind" parts are used by software developers as they encode business semantics directly into custom programs.
A logical data model is a subset of a conceptual model that can be expressed using a particular technology. However, there are always some performance considerations that require additional changes to the logical data model before it can be implemented in a relational database. Hence, some aspects of a logical model are left behind as it gets translated into a physical data model.
Since an ontology is a model of a domain describing the objects that inhabit it, all three types of data models can be thought of as ontologies. They range from the most expressive, which describes business concepts and processes (the conceptual model), to the less expressive, which move progressively from describing business semantics to describing the physical structures of the data as it is stored in databases (the logical and physical data models). A physical model can be thought of as an ontology of a particular database. Wikipedia goes on to note:
Early phases of many software-development projects emphasize the design of a conceptual data model. Such a design can be detailed into a logical data model. In later stages, this model may be translated into physical data model. However, it is also possible to implement a conceptual model directly.
Semantic Web standards (governed by the W3C, the World Wide Web Consortium) make it possible to implement conceptual models directly. This is possible due to the layered architecture of the Semantic Web technology stack consisting of:
- RDF – a canonical data model that is like the relational data model in its ability to connect related objects and unlike the relational data model in that the data objects (or resources in RDF-speak) are highly granular.
- RDFS (RDF Schema) and OWL (Web Ontology Language) – RDF-based languages for expressing business semantics.
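To make the two layers concrete, here is a minimal sketch in plain Python (no RDF library) of how a subsumption relationship from a conceptual model can live in the same triple structure as the data itself. The `ex:` names, the `triples` set and the `infer_types` helper are hypothetical illustrations, not part of any TopBraid or W3C API; the only rule applied is the standard RDFS one that an instance of a class is also an instance of its superclasses.

```python
# RDF layer: every statement is a (subject, predicate, object) triple.
# RDFS layer: rdfs:subClassOf statements are themselves triples in the same graph.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example data; the ex: names are made up for illustration.
triples = {
    ("ex:Manager", SUBCLASS_OF, "ex:Employee"),
    ("ex:Employee", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Manager"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
}


def infer_types(graph):
    """Return the graph plus the rdf:type triples implied by rdfs:subClassOf."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, p2, parent in list(inferred):
                if p2 == SUBCLASS_OF and sub == o:
                    new_triple = (s, RDF_TYPE, parent)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


closure = infer_types(triples)
alice_types = sorted(o for s, p, o in closure if s == "ex:alice" and p == RDF_TYPE)
print(alice_types)
```

The point of the sketch is that the class hierarchy is not left behind in a separate design document: it is stored and queried alongside the instance data. A real system would use an RDF store and an RDFS/OWL reasoner rather than this naive loop.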
A growing number of standards bodies and communities of interest are publishing RDF/OWL data models for their particular domains. For example:
- SKOS – provides a way to represent taxonomies and thesauri
- ISO 15926 – offers a data model for sharing life-cycle data for process plants including oil and gas production facilities
- Ontology for Media Resources - defines a core set of metadata properties for multimedia resources
- SIOC - defines information about online communities
- QUDT - provides models describing measurable quantities, units for measuring different kinds of quantities and the data types used to store and manipulate these objects in software
- Provenance Vocabulary - defines provenance-related metadata
I will end by pointing to a few related blog posts and web pages we have published before:
- How to extend an ontology http://topquadrantblog.blogspot.com/2011/03/how-to-extend-ontology.html
- Ontology Mapping with SPINMap http://topquadrantblog.blogspot.com/search/label/SPINMap
- Training on RDF, OWL and ontology modeling http://www.topquadrant.com/training/training_overview.html
- Transforming XML Schemas and XML into RDF/OWL http://topquadrantblog.blogspot.com/2011/09/living-in-xml-and-owl-world.html
- Converting UML models to OWL http://topquadrantblog.blogspot.com/2011/02/converting-uml-models-to-owl-part-1.html
4 comments:
Thanks!!!
The information in this blog is extremely useful. One thing, the links for the related blogs errors out. It would be great if these links are updated
Thanks!!!
The information in this blog is extremely useful. One thing, the links for related blog at the end do not work. would appreciate if these are updated
Clearly 'modelling power' and 'direct implementation' are two key factors, but I miss the OWA/CWA discussion here, which might be even more key (false versus unknown).
Furthermore, these first two factors are not 'exact'. With good old EXPRESS (STEP technology) we could also model specialisation/taxonomies and define complex rules (there called where-rules). And as long as you have a clean/complete one-to-one mapping to the implementing system, EXPRESS could also abstract from the underlying implementation mechanism (just like you can have a triple store or RDBMS backend for RDF/OWL). All in all, my first real factor for differentiating between ontologies and data models would be OWA/CWA....my two cents, Michel Böhms
The open world assumption adds flexibility and enables knowledge discovery/classification. However, for many practical enterprise applications, it is important to be able to close the world. SPIN (SPARQL Rules) makes this possible. So do some other rules-based approaches. Typical of RDF, this can be done with a lot of flexibility - closing the world for some operations and leaving it open for others.
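The "false versus unknown" distinction can be illustrated with a small sketch in plain Python. This is not SPIN and not how any particular engine works; the `facts` set and the two `holds_*` functions are hypothetical names, and the open-world function is deliberately simplified (it never derives an explicit "false", which a real OWL reasoner can do from negative or disjointness axioms).

```python
from typing import Optional

# Hypothetical graph of known facts as (subject, predicate, object) triples.
facts = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
}


def holds_closed_world(graph, triple) -> bool:
    """Closed world: anything not stated is treated as false."""
    return triple in graph


def holds_open_world(graph, triple) -> Optional[bool]:
    """Open world (simplified): anything not stated is unknown (None)."""
    return True if triple in graph else None


query = ("ex:bob", "ex:worksFor", "ex:acme")
print(holds_closed_world(facts, query))  # absent fact is read as false
print(holds_open_world(facts, query))    # absent fact is read as unknown
```

Choosing which function to call per operation mirrors the idea of closing the world for some operations and leaving it open for others.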
Keep in mind that this blog entry was not about ontologies in general, but specifically about RDF/Linked Data/Semantic Web ontologies. The second sentence tries to make this point very specific by saying that "As to be expected on the TopBraid Forum, by ontologies he meant specifically ontology models expressed in RDFS/OWL." Thus, for the blog authors the most fundamental differentiation is in RDF itself - its support for globally unique identifiers, data and schema distribution and merging, etc.
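As a rough illustration of why globally unique identifiers make merging straightforward, the sketch below treats two independently produced graphs as Python sets of triples. All `http://example.org/...` identifiers and the graph names are invented for the example; the assumption is only that both sources use the same IRI for the same resource.

```python
# Hypothetical shared identifier used by both sources.
ALICE = "http://example.org/id/alice"

# Graph published by one (made-up) system.
hr_graph = {
    (ALICE, "http://example.org/hr#worksFor", "http://example.org/id/acme"),
}

# Graph published by another (made-up) system, using its own schema.
crm_graph = {
    (ALICE, "http://example.org/crm#email", "alice@example.org"),
}

# Because statements are self-contained triples keyed by global identifiers,
# merging the two graphs is a plain set union; no join keys need to be reconciled.
merged = hr_graph | crm_graph

about_alice = {(p, o) for s, p, o in merged if s == ALICE}
print(len(about_alice))
```

In practice different sources often mint different identifiers for the same thing, so some mapping step is usually still needed before the union gives a combined view.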