Ontology and Taxonomy: Strange Bedfellows

Explore the relationship between taxonomy and ontology with this presentation by Michael Uschold, from a keynote talk at the International Conference on Semantic Computing.

Click Here to View The PDF

The Menu (Taxonomy) vs. the Meal (Ontology)

Taxonomy and Thesauri:

  • Focus is on words, not concepts (the menu).
  • Relationships are between terms: synonym, hyponym, broader/narrower term.
  • Each term should refer to just one concept.

Ontology:

  • Focus is on concepts (the meal).
  • Relationships are between concepts.
  • Formal definitions.
  • Automated inference.

How do we bring it all together?

  • Understand where each approach adds the most value.
  • Find the touch points and link them all up.
  • Can everyone and every tool live in harmony?
  • It is not impossible; we are pushing hard, and it gets easier!

Ontologies and Taxonomies

Are you struggling with how to make best use of your company’s knowledge assets that have grown overly complex? Have you wondered how to blend the more informal taxonomic knowledge with the more formal ontological knowledge? This has been a real head-scratcher for us for quite a while now. We described some breakthroughs we have made in the past couple of years on this front in a keynote talk at the International Conference on Semantic Computing in Newport Beach.

Ontologies and Taxonomies: Strange Bedfellows

Abstract:

In large companies, key knowledge assets are often unnecessarily complex, making them hard to understand, evolve and reuse. Ambiguity is at the root of the problem; it is often reflected in poorly structured information. We describe an approach using taxonomies and ontologies to root out ambiguity and create a set of building blocks that acts as a solid foundation for creating more useful structure.

We describe the challenges of working with both taxonomies and ontologies, and how we married them to provide a foundation that supports integration across a wide range of enterprise assets including spreadsheets, applications and databases.

Click here to view the presentation.

Written and presented by Michael Uschold

Read Next:

Part 6: Definitions are even more important than terms are

In recent posts, we stated that while terms are less important than concepts, and they mean nothing from a formal semantics perspective, they are very important for socializing the ontology. The same is true for text definitions, but even more so. Just like terms, the text definitions and any other comments have zero impact on the inferences that will be sanctioned by the ontology axioms. However, from the perspective of communicating meaning (i.e. semantics) to a human being, they play a very important role. Many of the people who want to understand the enterprise ontology will mainly be looking at the terms and the text definitions, and will never see the axioms.

Text definitions help the human get a better idea of the intended semantics for a term, even for those who choose to view the axioms as well. For those interested in the axioms, the text helps clarify the meaning and makes it possible to spot errors in the axioms. For example, the text may imply something that conflicts with, or is very different from, what the axioms say. The text definitions also say things that are too difficult or unnecessary to say formally with axioms. Other comments that are not definitions but should be included in the ontology are examples and counterexamples, and things that are true about a concept but are not part of defining it. Collectively, all this informal text that is hidden from the inference engine contributes greatly to human understanding of the ontology, which is on the critical path to putting the ontology to use.

Read Next:

Part 2: Don’t let terms get in the way!

It frequently happens that a group of experts use a term so differently that they just cannot agree on a single meaning or definition. This problem arises in spades in the area of ‘risk’. For example, in traditional operational risk management (ORM), when you measure risk, you multiply the probability of a loss by the amount of the loss. In the modern view of ORM, risk is a measure of loss at a level of uncertainty. The modern definition of risk requires both exposure and uncertainty[1]. So you get two different numbers if you measure risk from these different perspectives. One can go round and round with a group of experts trying to agree on a definition of ‘risk’ and generate a lot of heat with little illumination. But when we change our perspective from the term and instead start looking for underlying concepts that everyone agrees on, we don’t have to look very far. When we found them, we expressed them in simple non-technical terms to minimize ambiguity. Here they are:

  1. Something bad might happen
  2. There is a likelihood of the bad thing happening
  3. There are undesirable impacts whose nature and severity varies (e.g. financial, reputational)
  4. There is a need to take steps to reduce the likelihood of the bad thing happening, or to reduce the impact if it does happen.

After many discussions and no agreement on a definition of the term ‘risk’, we wrote down these four things and asked the experts: “When you are talking about risk, are you always talking about some combination of these four things?” “Yes” was unanimous. The experts differ on how to combine them and what to call them. For example, the modern view and the traditional view of risk each combine these underlying concepts in different ways to define what they mean by ‘risk’. In the modern view, if the probability of loss is 100%, there is no risk because there is no uncertainty. The concept that is called ‘risk’ in the traditional view is called ‘expected loss’ in the modern view, but it is the same underlying concept. Compared to wading through the muck and the mire of trying to agree on terms, focusing on the underlying concepts using simple non-jargon terms is like a hot knife going through cold butter.

Terms get in the way of a happy marriage too! How many times have you disagreed with your partner on the meaning of a word? It’s more than just semantics; it’s often emotional too. I believe we are all divided by a common language, in that no two people use words to mean exactly the same thing, even everyday words like “support” or “meeting”. I have learned that it is easier to learn and use the language of my spouse than it is to convince her that the term I use is the right one (despite the seductive appeal of the latter).
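To make the two views of ‘risk’ concrete, here is a minimal Python sketch. The class and function names are illustrative assumptions, not taken from the cited paper. It records the likelihood and the impact of a bad thing that might happen, computes the traditional measure (probability times loss amount), and flags whether any uncertainty is present. Treating a probability of exactly 0% as "no uncertainty" is also an assumption made here for symmetry; the post only mentions the 100% case.

```python
from dataclasses import dataclass


@dataclass
class PotentialLoss:
    """One 'bad thing that might happen' (hypothetical structure for illustration)."""
    description: str
    probability: float  # likelihood of the bad thing happening, between 0 and 1
    loss_amount: float  # size of the impact if it does happen


def expected_loss(event: PotentialLoss) -> float:
    """Traditional ORM 'risk': probability of a loss times the amount of the loss.

    The modern view calls this same quantity 'expected loss'.
    """
    return event.probability * event.loss_amount


def has_uncertainty(event: PotentialLoss) -> bool:
    """True when the outcome is neither certain nor impossible (an assumption of this sketch)."""
    return 0.0 < event.probability < 1.0


outage = PotentialLoss("data center outage", probability=0.25, loss_amount=8000.0)
certain_fee = PotentialLoss("annual licence fee", probability=1.0, loss_amount=500.0)

print(expected_loss(outage), has_uncertainty(outage))
print(expected_loss(certain_fee), has_uncertainty(certain_fee))
```

Both events have an expected loss, so both carry ‘risk’ in the traditional sense. Only the outage involves uncertainty, so only it would count as risk in the modern sense.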


[1] “A New Approach for Managing Operational Risk: Addressing the Issues Underlying the 2008 Global Financial Crisis”. Sponsored by: Joint Risk Management Section, Society of Actuaries, Canadian Institute of Actuaries, and Casualty Actuarial Society.

For further reading, refer to Michael Uschold’s additional posts in this series.

Read Next:

Part 5: Good terms are important for socializing the ontology

In my previous post, I explained why, when building an enterprise ontology, it is a good idea to focus on concepts first and to decide on terms later. Today, we will discuss what to do when ‘later’ arrives. But first, if terms don’t matter from a logic perspective, why do we care about them? In short, they are essential for learning and understanding the ontology.

This is true even if there is a single developer, who should be able to know the meaning of a term at a glance without having to rely on memory. It is more important if there are multiple developers. However, the most important reason to have good terms is that without them, it is nearly impossible for anyone else to become familiar with the ontology, which in turn severely limits its potential for being used.

How do we choose good terms for an enterprise ontology? An ideal term is one that is strongly suggestive of its meaning and would be readily understood by anyone in the enterprise who needs to know what is going on and is unfamiliar with terms tied to specific applications.

A term being strongly suggestive of its meaning just makes things easier for everyone. It requires not only that the term is (or could be, with minimal disruption) commonly used across the enterprise to express the concept it is naming, but also that the same term is not used for a variety of other things too. Such ambiguity is the enemy. Because an enterprise ontology is designed to represent the real world in the given enterprise independent from its applications, it is important that the terms are independent from any particular application.

This is easier said than done, as the terminology of a widely used application in an enterprise often becomes the terminology of the enterprise in general. Individuals in the enterprise forget that various terms are tied to a particular application and vendor just like we forget that ‘Kleenex’ is tied to a particular brand and manufacturer. Also, because the enterprise ontology is intended for use across the whole enterprise, it is not a good idea to use jargon terms that are only understood by specialists in a given area, and will likely be confusing to others. Future applications that are based on the enterprise ontology can introduce local terms that are understood by the narrower group of people.

To reap the most rewards from the enterprise ontology in the long term, it is important to explicitly link the terms in the application to the concepts in the enterprise ontology. This way, the terms in the application effectively become synonyms for the terms in the ontology reflecting the mapped concepts.
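One way to picture the explicit linking described above is a simple lookup table. The following Python sketch uses invented concept identifiers, application names and terms. It is an illustration of the idea, not a description of any particular tool or of how the author implemented the links.

```python
# Hypothetical concept identifiers and their preferred enterprise-wide labels.
ONTOLOGY_LABELS = {
    "ex:Customer": "Customer",
    "ex:PurchaseOrder": "Purchase Order",
}

# Explicit links from (application, local term) to the ontology concept it names.
APPLICATION_TERMS = {
    ("crm", "Account"): "ex:Customer",
    ("billing", "Payer"): "ex:Customer",
    ("erp", "PO"): "ex:PurchaseOrder",
}


def concept_for(application, term):
    """Return the ontology concept a local application term is linked to, or None."""
    return APPLICATION_TERMS.get((application, term))


def synonyms_for(concept_id):
    """Return the ontology label followed by every application term mapped to the concept."""
    local_terms = sorted(
        term for (_, term), linked in APPLICATION_TERMS.items() if linked == concept_id
    )
    return [ONTOLOGY_LABELS[concept_id]] + local_terms


print(concept_for("crm", "Account"))
print(synonyms_for("ex:Customer"))
```

Because "Account" and "Payer" both resolve to the same concept, they behave as synonyms of "Customer" without either application having to change its vocabulary.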

Read Next:

Part 4: Identify the underlying concepts

In the previous posts in the series, we discussed how it is important to focus on the concepts first and then the terms. Today we discuss identifying what the central concepts are in the enterprise. Every enterprise typically has a small handful of core concepts that all the other concepts hinge on. For retail manufacturing, it is all about products and specifications (of products), which leads to manufacturing. For health care, everything hinges on patients and providing care (to patients), which is the driver for diagnosis and treatment procedures. But how do we identify those core concepts? Unfortunately, there is no magic answer.

The trick is to get into beginner’s mind and start asking basic questions. Sometimes it takes a while before it is clear what the core concepts are. One good sign that you have them is that everything seems to click nicely into place. They are the distilled essence of a complex web of ideas. Once identified, this small handful of concepts becomes the glue for holding the enterprise ontology together, as well as the basis for the story of explaining and socializing it to stakeholders when it is ready.

Read Next:

Part 3: Concepts first, then terms

In my previous blog, I described how for very broad and general terms, it can be nearly impossible to get a roomful of experts to agree on a definition of the term. However, it can be relatively easy to identify a small set of core concepts that everyone agrees are central to what they are talking about when they use that particular term.

In this blog, we explore the role of concepts vs. terms in the ontology engineering process more broadly; that is, focusing on all terms, not just the more challenging ones. First of all, it is important to understand the role of terms when building an ontology in a formal logic formalism such as OWL. Basically, they don’t matter. Well, that’s not quite true. What is true is that from the perspective of the logic, the formal semantics and the behavior of any inference engine that uses the ontology, they don’t matter.

You could change any term in an ontology, or all of them, and logically, the ontology is still exactly the same. A rose by any other name smells just as sweet. You can call it ‘soccer’ or ‘football’, but it is still the same game. So, especially in the early stages of building the ontology, it is important to focus first on getting the concepts nailed, and to defer any difficult discussions about terms. Of course, you have to give the concept a name, so you can refer to it in the ontology, or to anyone else interested in the ontology. If there is a handy name that people are generally happy with, then just use that. When no good terms come to mind, then use a long descriptive term that is suggestive of the meaning, and get back to it later.
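The point that terms do not affect the logic can be shown with a toy example. The Python sketch below is a deliberately simplified stand-in for an OWL reasoner, written under the assumption that concepts carry opaque identifiers and that human-readable terms are stored separately. The identifiers and labels are invented.

```python
def infer_subclasses(axioms):
    """Compute the transitive closure of (subclass, superclass) pairs."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Axioms refer only to opaque concept identifiers.
AXIOMS = {("C1", "C2"), ("C2", "C3")}

# Two alternative sets of terms for the same concepts.
labels_v1 = {"C1": "Soccer", "C2": "TeamSport", "C3": "Sport"}
labels_v2 = {"C1": "Football", "C2": "TeamSport", "C3": "Sport"}

# The inference step never looks at the labels.
inferred = infer_subclasses(AXIOMS)


def render(pairs, labels):
    """Show inferred pairs using whichever terms people prefer."""
    return sorted((labels[a], labels[b]) for a, b in pairs)


print(render(inferred, labels_v1))
print(render(inferred, labels_v2))
```

Swapping `labels_v1` for `labels_v2` changes what people read, but the set of inferred relationships is computed once and is identical in both cases.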

The Importance of Distinguishing between Terms and Concepts

Michael Uschold, Semantic Arts Senior Ontologist, will share a six-part series on the importance of distinguishing terms from concepts when building enterprise ontologies for corporate clients. This first article summarizes five key points; each is elaborated in greater detail in the subsequent posts. (Note: The titles of each post will become linkable below as they are published.)

The Importance of Distinguishing Terms and Concepts: Overview

  1. Don’t Let Terms Get In The Way. Some terms are so broad and used in so many different ways (e.g. ‘risk’, ‘process’, ‘control’) that it may be impossible to get even a small group of people to agree on a single meaning and definition. Oddly, it can be relatively easy to get those same people to agree on a few core things that they all are talking about when they are using the term. The concepts are the real things you are trying to model; the terms are ‘merely’ what you call those things.
  2. Focus on Concepts First, Then Terms. We noted that getting a clear understanding of what the underlying concepts are is easier than getting agreement on terms. It is also more important, especially in the early stages of discovery and ontology development.  When modeling in a logic-based ontology language like OWL, the terms have zero impact on the semantics, the logic and the inference behavior of the ontology. From this perspective terms don’t matter at all.  This means we can safely defer the challenging discussions about terminology until after the concepts are firmly in place.
  3. Identify The Underlying Concepts. In a given enterprise, there is usually a small handful of key concepts that nearly everything else hinges on.  In manufacturing, it is about products and specifications. In healthcare, it is about patients and care provision.  Identifying these concepts first lays the foundation for building the rest of the ontology.
  4. Good Terms Are Important For Socializing The Ontology. Terms do matter tremendously for overcoming the challenge of getting other people to learn and understand the ontology. Confusing and inconsistent terminology just exacerbates the problem, which in turn hinders usability. Having good terms matters to the ontology developers so that at a glance, they easily know what a term is intended to mean. It matters even more when socializing the ontology so that others can easily learn and understand it.
  5. Text Definitions Are Important For Socializing The Ontology. From a logic perspective, text definitions and other comments are meaningless, just as terms are. But in ontology development, they play an even more important role up front than terms do. This is because text definitions are the way we communicate to the stakeholders that we have understood the concepts they have in mind.

Stay tuned: we will discuss each item in further detail in the coming weeks. If you have questions for Michael, email him at [email protected].

Read Next:

Dublin Core and OWL

Why isn’t there an OWL version of Dublin Core?

We’ve known about the Dublin Core (http://www.dublincore.org/) pretty much forever. We know it has a following in Library Science and content management systems, and Adobe uses its tags as the basis for XMP (www.adobe.com/products/xmp/). And we knew that at least one of the original architects of the Dublin Core, Eric Miller (www.w3.org/People/EM/), is now deeply invested in the Semantic Web. So, we knew it was just a matter of time until we came to a client who was implementing a content management system using the Dublin Core tags and who wanted to integrate that with their Enterprise Ontology. We assumed there was a Dublin Core OWL implementation just for this purpose. If there is, it’s pretty well hidden. (One of my motivations in writing this is to see if it brings one out of the woodwork.) The obvious one (the one that comes up first in a Google search) is from Stanford (protege.stanford.edu/plugins/owl/dc/protege-dc.owl).

On closer inspection, the only OWL construct used in this ontology is owl:AnnotationProperty (comment). The rest of it is really just naming the tags and providing the human-readable definitions, which isn’t helpful for integration.

It turns out there are several other problems with the Dublin Core for this type of usage. For instance, the preferred usage of the “creator” tag is a LastName, FirstName literal, which is pretty ambiguous. There are a lot of “Smith, John”s in most corporate databases, and in many cases we know much more precisely (to the urn: level) which John Smith we’re dealing with when we capture the document.

We have built an OWL version of the Dublin Core suitable for integration with Enterprise Ontologies. We ended up, I’m sure, re-inventing the wheel. I’m on the road again starting tomorrow, but within a week or two we expect to have it vetted and out in a suitable public place. In the next installment I’ll go over some of the design tradeoffs we made along the way. By the way, what suitable public places are people going to for their ontologies these days?
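The ambiguity of a LastName, FirstName creator literal, mentioned above, is easy to demonstrate. In this small Python sketch the document titles and the `urn:example:` identifiers are invented for illustration. They are not drawn from the Dublin Core specification or from the OWL version described in this post.

```python
# Two documents written by two different people who share a name (made-up data).
documents = [
    {
        "title": "Q3 Forecast",
        "creator_literal": "Smith, John",
        "creator_id": "urn:example:employee:1001",
    },
    {
        "title": "Safety Audit",
        "creator_literal": "Smith, John",
        "creator_id": "urn:example:employee:2047",
    },
]


def group_by(docs, key):
    """Group document titles by the value of the given creator field."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc[key], []).append(doc["title"])
    return groups


by_literal = group_by(documents, "creator_literal")
by_identifier = group_by(documents, "creator_id")

print(len(by_literal), "creator(s) when grouping by name literal")
print(len(by_identifier), "creator(s) when grouping by identifier")
```

Grouping by the literal merges the two authors into one, while grouping by identifier keeps them apart. That distinction is what matters when document metadata has to be integrated with other enterprise data about people.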

Building Ontologies Visually Using OWL

Faced with the challenges of UML and other modeling notations, we developed our own Visio-based ontology authoring tool. We’ve been building large enterprise ontologies for our clients using the W3C Web Ontology Language (OWL). If you’re not familiar with OWL, think of it as a data modeling language on steroids. It also has the fascinating property of being machine interpretable.

You can read the model with the aid of an inference engine, which not only tells you if all the assertions you have made are consistent, but also can infer additional parts of the model which logically follow from those assertions. So far the available tools for developing OWL ontologies, like TopBraid Composer and Protégé, look and feel like programming development environments. Some visualization tools are available, but there is no graphical authoring tool, like Erwin or ER Studio, of the kind data modelers have become used to.
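For readers new to inference engines, the two jobs described above (checking consistency and inferring what logically follows) can be sketched in a few lines of Python. This is a toy illustration with invented class names and a single-parent hierarchy. It is not how an OWL reasoner is implemented, and it covers only subclass and disjointness assertions.

```python
# Hypothetical model: each class has at most one direct superclass (no cycles assumed).
SUBCLASS_OF = {"Employee": "Person", "Contractor": "Person"}

# Pairs of classes that may not share a member.
DISJOINT = {frozenset({"Person", "Organization"})}


def inferred_types(asserted):
    """Add every superclass that follows from the asserted types."""
    result = {}
    for individual, classes in asserted.items():
        all_classes = set(classes)
        for cls in classes:
            while cls in SUBCLASS_OF:
                cls = SUBCLASS_OF[cls]
                all_classes.add(cls)
        result[individual] = all_classes
    return result


def inconsistencies(types):
    """List individuals that belong to two classes declared disjoint."""
    problems = []
    for individual, classes in types.items():
        for pair in DISJOINT:
            if pair <= classes:
                problems.append((individual, tuple(sorted(pair))))
    return problems


good = inferred_types({"alice": {"Employee"}, "acme": {"Organization"}})
bad = inferred_types({"acme": {"Organization", "Employee"}})

print(good["alice"])
print(inconsistencies(good))
print(inconsistencies(bad))
```

In the second model nobody asserted that "acme" is a Person. The contradiction only surfaces once the subclass assertion has been applied, which is the kind of error that is hard to spot by reading a large model by hand.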

The larger your ontology gets, the harder it is to understand and navigate using the current tools, and enterprise models push the boundaries of size and complexity for OWL ontologies. Another problem is that, because of OWL’s expressiveness, the UML diagramming conventions can result in a very complex-looking diagram even for simple ontologies.

This makes it hard to review models with subject matter experts (SMEs), who typically have about 15 minutes’ worth of tolerance for training in complex modeling notations. Faced with these problems, we developed our own Visio-based ontology authoring tool. It uses a more compact, more intuitive diagramming convention that is easier for SMEs to grasp.

From the diagram you can generate compliant OWL in XML format that can be read by any of the standard editors or processed by an OWL inference engine. The tool, which we call e6tOWL, is built as an add-in to Visio 2007 which provides the diagramming platform. In addition to the Visio template containing the diagramming shapes and the OWL generation function, the tool provides a lot of layout and management functionality to help deal with the challenges of maintaining large and complex diagrams.

So far the results have been good. We find it faster, easier and more fun to create ontologies graphically, and SMEs seem able to understand the diagramming conventions and provide meaningful feedback on the models. We are still using Composer and Protégé for running inference and debugging, but all our authoring is now done in Visio. We currently maintain the tool for our own use and provide it to our clients for free to help with the ongoing development and maintenance of ontologies.
