Enterprise Ontologies – The Holy Grail or the Emperor’s New Clothes?
December 17, 2010
Enterprise ontology was a topic of academic research in the 90s, e.g. in Canada, Holland, and Scotland. This early work was informed by industrial practice in enterprise modeling, and emphasized industrial issues like product design, requirements, organization, manufacturing, transportation, quality, inventory, etc. The added value of an ontological approach over conventional enterprise modeling was, however, never clear to industrial practitioners, and ontology remained an academic exercise that was not picked up by leading tool vendors.
Recently, the fad of the “semantic web” has brought forward an even more theoretical approach to enterprise ontology. Its proponents seem unaware of the earlier work. As before, interoperability is the core concern that ontologies address, e.g. in the IDEAS framework. However, the focus seems to have moved from interoperability of enterprises to exchange of enterprise models. This post questions whether such an approach is viable. In the absence of any evidence demonstrating that ontologies work in practice, I apologize for the theoretical nature of this post.
The Holy Grail
I recently attended a meeting where a seemingly well-designed and useful solution to model exchange was rejected because it was not the “holy grail”, the full and complete solution, which an ontology would provide, sometime in the future. The holy grail does not exist. Interoperability is not a problem that can be solved perfectly. The root of the problem is not technical, but organizational and social. People use different languages because their tasks and responsibilities differ, as well as their background and training. There are sound reasons for interoperability problems, having to do with e.g. specialization of skills and the core competences of different organizations. What is needed then, is a practical language that allows people to communicate although they see things differently, and interpret the terms of the language differently. Perfect interoperability is unattainable. Interoperability is an ongoing process that aims to continuously increase shared understanding, not a finished product or standard.
Through a formal and mathematical foundation, ontologies offer the illusion that a precise language can be defined, one which removes all the misunderstandings that prevent interoperability. The formal foundation guarantees that there is only one correct meaning of the data, independent of context or interpretation. The ontological approach thus describes a world where there are no interoperability problems. Problem solved, right? Not really. An ontology is just a language; the real-world interoperability problems are still there. It is just that the ontology does not allow inconsistency, uncertainty, ambiguity, conflicting interpretations, conflicts of interest, etc. All these interoperability problems still exist; they just have to be kept outside the ontology. In practice, they will appear on the boundary of the ontology instead, when different communities try to map their local data to the common ontology. The key question of whether an ontological approach is useful lies in the support it offers for this mapping, not in the internal soundness, precision, or consistency of the ontology itself.
Any given ontology should of course be evaluated according to the same criteria as any other language or common data model. For enterprise architectures and other data designed for human interpretation more than computer processing, the increased mathematical precision that an ontology requires may be counterproductive. Because people bring their own background and knowledge to the interpretation of a piece of data, they are often able to deal with uncertainty, ambiguity, conflicts, and inconsistencies. Imprecision is a benefit when it helps people communicate their uncertainty. A formal language may create artificial precision by prohibiting the many nuances of meaning inherent in communication between people. In the real world, all data will be interpreted in context.
The Emperor’s New Clothes, Recycled
Artificial intelligence (AI) is the academic tradition that spawned semantic web and ontologies. The field combines tried and tested software engineering techniques with a strong belief in formal logic. It seems to have survived through “reincarnation”, by renaming and rebranding itself once every few years. The field first builds a strong hype that leads to a lot of government research funding, although the interest from industry is lukewarm. A few years pass, and the solutions promised do not appear. If successful products exist, they are based on conventional software engineering, and do not really utilize formal logic. The buzzword becomes a joke. Someone comes up with a new name, and the cycle starts again. So far we’ve seen artificial intelligence, expert systems, intelligent agents, and numerous knowledge-based things come and go. The semantic web may have a year or two left. After the misuse of terms like “intelligence”, “knowledge”, and “expert”, it will be interesting to see what comes next.
Semantics is the latest term being misused by the AI community. Like their previous “brands”, the term has been reduced to mean something to do with formal logic. Compared to what happened to “expert”, “knowledge”, and “intelligence”, this latest abduction of a term looks relatively innocent. An in-depth look should still shed some light on the issue.
Let us first take a historical perspective. Berners-Lee is commonly credited with having coined the term “semantic web”, referring to a web of machine-understandable information. He also points out that “Semantic Web is not Artificial Intelligence”. RDF is different from conventional AI: “Where the other models are related to previous unmet promises of computer science, now passed into folk law as unsolvable problems, they suggest a fear that the goal of a Semantic Web is inappropriate.” Unlike conventional models, the semantic web should not be based on the “closed world assumption”, it should be open, distributed, and decentralized. While Berners-Lee is eager to point out the distance between AI and the semantic web, he also emphasizes the closeness between the semantic web and conventional software engineering practices like entity-relationship modeling and relational databases.
These ideas led us to consider RDF as a suitable foundation for interoperability a few years ago. Ontologies, on the other hand, are mostly centralized and closed models, following a conventional AI approach.
We can also take a philosophical approach to understanding the semantic web. Let us start with the semiotic triangle (which some people even call the semantic triangle). Semantics, the study of meaning, deals with all sides of the triangle: what the actor interprets the representation to refer to in the real world, and what knowledge it conveys about those things or phenomena. The relationship between the representation and what it stands for is derived, and always subject to interpretation.
What formal semantics does is replace the human actor at the top of the triangle with a mathematical “engine”. This is the standard move that AI makes in abducting terms like semantics, knowledge, and intelligence. This automatic actor is more godlike than human, knowing all there is to know, seeing all there is to see, never making mistakes. This approach, more religion than science, is something ontologies and mainstream semantic web research share with previous incarnations of AI. The relationship to the real-world referents cannot, of course, violate the core principles of science, so the techniques are non-referential, saying nothing about the real world. We seem to have moved quite far away from real-world interoperability issues here.
The philosophical discipline of ontology is the “study of the nature of being, existence or reality as such, as well as the basic categories of being and their relations” (Wikipedia). This perspective inspired information systems research, and the conceptual framework of Mario Bunge was used as a starting point for early work on ontology. But according to this definition, ontology and formal semantics are opposite approaches, starting in different corners of the triangle, with nothing in common to study.
At least according to Wikipedia, a new definition of ontology dominates computer science: “a formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts. It is used to reason about the entities within that domain, and may be used to describe the domain”. From a software engineering perspective an ontology is just like any other model, except it is formal.
Ontology, then, is yet another well-established term, abducted and reinterpreted by the AI community in such a way that it ends up meaning almost the opposite of what it used to. Their definition confuses the three corners of the semiotic triangle, referring to concepts as units of representation, in a domain of (real world?) entities it can reason about. This confusion is also common in typical ontologies: both OWL and IDEAS call their most generic term “Thing”. Other modeling frameworks, even the more technical ones, recognize that they are defining a language, and use e.g. “Element” as the root construct (see UML).
According to Bunge, the meaning of a concept is determined by its reference set, the set of material and immaterial objects it refers to, and its intension. The intension specifies which properties or features the concept implies that the referents possess. Concepts can be categorized by their reference into individual (definite or indefinite), class, relation (comparative or not), and quantitative concepts. Here the concepts are ordered according to growing logical strength, their ability to express facts precisely. Many ontologies in IT follow these principles, basing their formalization on set theory, where individual referents are the basic members of the sets. OWL lets you define a class by enumerating all its members (the reference set), and also by property restriction (intension).
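The two ways of defining a class can be sketched in plain Python (a toy illustration of Bunge’s distinction, not actual OWL syntax; the example domain is made up):

```python
# Sketch of Bunge's distinction. A class defined extensionally enumerates
# its reference set; a class defined intensionally is a property
# restriction (a predicate) that all referents must satisfy.

weekdays = {"Mon", "Tue", "Wed", "Thu", "Fri"}   # extension: enumerated members

def is_weekday(day):
    """Intension: a property restriction -- a day is a weekday iff it is
    not a weekend day."""
    return day not in {"Sat", "Sun"}

all_days = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}

# Both definitions pick out the same referents here, but they are not the
# same concept: the intension also says *why* something belongs to the class.
assert {d for d in all_days if is_weekday(d)} == weekdays
```

OWL’s `oneOf` construct corresponds to the enumerated set, and its property restrictions to the predicate.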
Specialization defines a subset of the references, often associated with adding properties to the intension. The reasoning power of ontologies seems to primarily utilize specialization and class membership, deriving consequences from the fact that statements about a class apply to all its members, as well as to all members of its subclasses. This semantic rule is of course also implemented in most non-ontological modeling and programming languages, as inheritance. It can thus not be regarded as a benefit of an ontological approach; it is standard computer science. In order to find the holy grail of ontology, we need to look elsewhere.
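A minimal sketch of why this is nothing new: the same rule falls out of ordinary inheritance in any object-oriented language (the class names here are hypothetical):

```python
# The "reasoning" described above -- a statement about a class holds for all
# members of its subclasses -- is ordinary inheritance, no ontology engine
# required.

class Activity:
    def has_duration(self):
        return True          # a statement about every Activity

class Process(Activity):     # specialization: a subset of the referents
    pass

p = Process()
assert isinstance(p, Activity)   # class membership propagates upward
assert p.has_duration()          # the statement applies to subclass members too
```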
Formal and Natural Languages
Looking at an ontology as just another model or language, there are a number of areas where it may be better suited than the alternatives, if it is
- Better at specifying automated behavior than conventional programming,
- Better for human communication than natural language,
- Easier to understand for humans than business process and enterprise models,
- A better and more neutral common data model for translation between enterprise modeling languages,
- A better serialization format for model exchange than e.g. XMI, or
- Better at specifying mappings than e.g. XSLT.
For the first three, the answer is clearly “no”, although AI has earlier claimed the opposite, e.g. when it comes to programming languages.
As a common data model, formal mathematics may seem to offer an independent and neutral framework. If we look at the ontologies that have actually been promoted, however, the background and bias of their designers are evident. The difference between the early enterprise ontologies mentioned in the introduction and more recent proposals like IDEAS illustrates this. DEMO represents useful concepts for business analysis, like the difference between producing and coordinating activities. IDEAS, on the other hand, follows object-oriented programming, focusing on material nouns (things, objects), and regarding the difference between individuals and types as fundamental. It looks more like a “UML ontology” than an enterprise architecture ontology. Enterprise architects realize that what one person sees as an individual, another may see as a type, and EA tools do not distinguish so strictly between the two. For a software engineer, a single codebase represents an individual “software”. For the test manager, this software running on different platforms represents different individuals, so he interprets the software engineering instance as a type. This is one of many examples where the precision required by an ontology is alien to the less precise languages it is supposed to translate between, adding complexity and noise, and changing the semantics along the way. It is simply not a good fit.
Poor syntactical quality has been a challenge for the XML serializations of ontology languages. For several years, OWL and RDF did not have well-defined XML schemas, making it difficult to use them in e.g. a service-oriented implementation. This is not my area of expertise, but it seems that an ontological exchange format will have to be significantly better than the already established XMI if it is to be taken up by tool vendors. Similarly, for mapping, promoters of ontology often use trivial examples, like concatenating or splitting text values, to illustrate their techniques. An ontological approach seems like overkill for such simple problems, so they need to make a better argument about added value. In general, though, formal logic is a language that appeals to a very small group of people. On a scale from user-friendly to technical, formal logic is seen as far more technical than any programming language. This is a critical barrier.
Rather than going for the most cryptic languages, we should recognize interoperability as a problem that originates from social organization, and look into how people are able to communicate with natural language, even though they never understand each other perfectly. Ontologists could learn a lot about semantics in the real world by looking into the sociology of knowledge, e.g. the discussion of meaning in Communities of Practice: Learning, Meaning, and Identity by Etienne Wenger.
Towards Interoperable Enterprise Models
In order to solve the problem of sharing and exchanging enterprise architecture models across tools, a pragmatic perspective is needed, not a theoretical one. In order to maximize the transfer of semantic information between notations and tools, a common language should be as neutral as possible, and not linked to one of the least suitable notations, like UML or MOF. At the same time, point-to-point mappings should be utilized in addition to the common hub. By allowing a federation of common and local mappings, you are able to simplify the common model by not including everything that is common to just two or a few tools.
A typical weakness of ontological approaches, and many other attempts at interoperability, is to focus too much attention on the common language. We see a need to ensure interoperability on several levels:
- On the model data level, the identity of each element must be preserved in a roundtrip from one tool to another and back. This is often difficult because the tools have different ways of handling identity, even when they use the same framework, e.g. GUIDs in XMI. Many tools store external identifiers to enable roundtrips, but few can handle several external identification schemes. A common information identification framework is thus the first foundation of interoperability.
- On the real world referent level, identity is also a problem. There may be many ways of defining what in the real world an information element refers to, the “semantic identifiers”, if you like. This is important for removing duplicates when two separate models are to be merged and they refer to some of the same real world entities.
- On the language level (metamodel), there is of course a need for mapping between element types. However, especially if the tools that you are trying to integrate are capable of extending their data models, like many EA tools are, the language level is just one part of the puzzle.
- On the metametamodel level we find the underlying modeling frameworks of the tools. They can differ in many ways. Even UML tools, in theory implementing the same metamodeling stack, have their own local extensions and nuances. Typical issues on this level deal with the separation of types and individuals, the representation of relationships as pointers or separate elements, whether relationships can have properties, parts, and relationships to other elements, how roles are encoded, etc.
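To make the model-data-level point concrete, here is a minimal sketch of an element that carries one external identifier per identification scheme, so that a roundtrip between tools can resolve back to the same element. All names and schemes here are hypothetical, not any tool’s actual API:

```python
# Hypothetical sketch: an element keeps one identifier per external scheme,
# so a roundtrip Tool A -> Tool B -> Tool A maps back to the same element.

class Element:
    def __init__(self, local_id):
        self.local_id = local_id
        self.external_ids = {}        # scheme name -> identifier in that scheme

    def record_external_id(self, scheme, identifier):
        self.external_ids[scheme] = identifier

class Repository:
    def __init__(self):
        self.by_scheme = {}           # (scheme, identifier) -> element

    def add(self, element):
        for scheme, ident in element.external_ids.items():
            self.by_scheme[(scheme, ident)] = element

    def resolve(self, scheme, identifier):
        return self.by_scheme.get((scheme, identifier))

e = Element("local-42")
e.record_external_id("xmi-guid", "ab12")
e.record_external_id("tool-b", "B-7")
repo = Repository()
repo.add(e)

# Whichever scheme the other tool uses, we find the same element again.
assert repo.resolve("xmi-guid", "ab12") is repo.resolve("tool-b", "B-7")
```

Most tools implement only the first half of this (storing one external ID); supporting several schemes at once is what enables federation.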
One of the major weaknesses of ontologies is that they try to define an overly rigorous common modeling framework, spanning multiple levels. Whereas the concept of a relationship may be represented slightly differently in two different modeling tools, and thus have slightly different semantics, you would still like to translate relationships in a model from one tool into relationships in the other. Interoperability is achieved in spite of semantic differences. If there are no semantic differences, there are only syntactical interoperability problems, and for those you need no ontology. The high level of formalism in an ontology creates overly complex common data models, on multiple levels. This creates a need for duplicating many of the same elements on the model, metamodel and metametamodel levels, so that you not only have X, you also have ClassOfX or XType, and ClassOfClassOfX or XTypeType.
There are also lessons to be learned from the way people use natural language to communicate. Again, AI seems more interested in coming up with formal models of natural language than in understanding how natural languages actually work in a social setting. They talk the same talk all the time, but are not very interested in listening.
One of the most interesting perspectives that IT has learned from natural language is the concept of semantic holism, which I first saw in a paper on interoperability by Kangassalo. It points out that signs and terms are interpreted in context, not by themselves. Rather than looking for a one-to-one correspondence between a single sign and its referent, we need to look at how a complete model or a complete text as a whole refers to the real world. Take for instance the term “a car” in these two sentences: “I saw a car” and “A car has an engine”. In the first, it refers to an individual; in the second, to a type. You need to see the whole sentence in order to understand which it is. This is one of our core modeling principles, and it is especially important for simplifying the data models.
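As a toy illustration of the holism point, here is a crude sketch that classifies the term from the surrounding sentence rather than from the term itself. The heuristic is made up for the two example sentences only and is in no way a general method:

```python
# Toy illustration of semantic holism: the same term, "a car", is read as
# an individual or a type depending on the rest of the sentence, not on
# the term alone. The heuristic below only covers the two examples.

def interpret_term(sentence):
    """A generic statement about properties reads the term as a type;
    an episodic report reads it as an individual."""
    if " has " in sentence or " have " in sentence:
        return "type"
    return "individual"

assert interpret_term("I saw a car") == "individual"
assert interpret_term("A car has an engine") == "type"
```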
Storytelling is often regarded as the most effective means of communicating knowledge. A work of fiction by definition does not have any referents in the real world, yet the full text is able to create meaning, if the reader sees parallels with his own experience and knowledge, in a holistic way.
In modeling, holism leads us to focus on patterns and contexts. Anything that you are able to deduce from the context of an element, you need not represent in that element. If you choose to represent it also in the element, you end up having to define consistency rules between the two representations. A typical example is a composition structure, where some languages define different types for atomic and composite elements, e.g. Action and Process as subtypes of Activity. Then you need rules stating that an Action cannot have sub-activities, and that a Process must have sub-activities. This is unnecessarily complex and makes evolution difficult. Instead, you should define Process and Action as derived types, stating that an activity that has sub-activities is a Process, and an atomic activity is an Action. Similarly, rather than having to define ClassOfX and ClassOfClassOfX in addition to X in the ontology, it should be sufficient to define X, and represent ClassOfX as X and Class, in a multidimensional manner. This is similar to how the same words may refer to both classes and individuals, depending on the context. It should be said that OWL property restrictions provide this mechanism, so it is possible to design some good features into an ontology framework as well.
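The derived-type idea can be sketched as follows (a toy Python illustration; the activity names are made up):

```python
# Sketch of the derived-type idea: instead of declaring Action and Process
# as subtypes of Activity (and then policing consistency rules between
# them), classify an activity from its context -- here, whether it has
# sub-activities.

class Activity:
    def __init__(self, name, sub_activities=()):
        self.name = name
        self.sub_activities = list(sub_activities)

def kind(activity):
    """Derived classification: no extra type hierarchy to keep consistent."""
    return "Process" if activity.sub_activities else "Action"

pay = Activity("Pay invoice")
billing = Activity("Billing", [pay])
assert kind(pay) == "Action"
assert kind(billing) == "Process"
```

Adding a sub-activity reclassifies the element automatically; with declared subtypes, the same change would require migrating the element from Action to Process and revalidating the model.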
More often, however, the focus on formalization leads to blindness. Most people promoting ontologies seem to lack the practical experience needed to understand the limitations of a formal approach. Like formal computer science, many ontologists focus on the simple issues, those easiest to formalize, rather than the complex and hard problems. They look for the problems that have a perfect solution, not realizing that those are non-problems in practice. For instance, they tend to start with material things, rather than with more difficult concepts like quality, goal, activity, adjective, etc.
I also have a stereotypical perspective that influences my interpretation and problem framing. That perspective is to see computing as fundamentally about interaction between autonomous components, not primarily as automation inside a single machine. When some of the components are human users, organizational, social and psychological factors become important. In this perspective, interoperability becomes a problem that is solved through an unfolding, partially automated process, where the actors involved are supported in interacting about their different semantic interpretations. This seems a more appropriate understanding of interoperability than “lack of precision requiring formal logic”, at least that is my perspective.
PS: In an extensional ontology like IDEAS, where “physical existence is the criterion for identity”, the two concepts “holy grail” and “emperor’s new clothes” may appear as identical (their real-world extension is the empty set in both cases). The intensions, however, are quite different.