Part 2: Don’t let terms get in the way!

It frequently happens that a group of experts uses a term so differently that they simply cannot agree on a single meaning or definition. This problem arises in spades in the area of ‘risk’. For example, in traditional operational risk management (ORM), when you measure risk, you multiply the probability of a loss by the amount of the loss. In the modern view of ORM, risk is a measure of loss at a level of uncertainty; the modern definition of risk requires both exposure and uncertainty[1]. So you get two different numbers if you measure risk from these two perspectives.

One can go round and round with a group of experts trying to agree on a definition of ‘risk’ and generate a lot of heat with little illumination. But when we shift our attention away from the term and instead start looking for underlying concepts that everyone agrees on, we don’t have to look very far. When we found them, we expressed them in simple, non-technical terms to minimize ambiguity. Here they are:

  1. Something bad might happen
  2. There is a likelihood of the bad thing happening
  3. There are undesirable impacts whose nature and severity vary (e.g. financial, reputational)
  4. There is a need to take steps to reduce the likelihood of the bad thing happening, or to reduce the impact if it does happen.

After many discussions and no agreement on a definition of the term ‘risk’, we wrote down these four things and asked the experts: “when you are talking about risk, are you always talking about some combination of these four things?” The “yes” was unanimous. The experts differ on how to combine them and what to call them. For example, the modern view and the traditional view of risk each combine these underlying concepts in different ways to define what they mean by ‘risk’. In the modern view, if the probability of loss is 100%, there is no risk because there is no uncertainty. The concept that is called ‘risk’ in the traditional view is called ‘expected loss’ in the modern view, but it is the same underlying concept. Compared to wading through the muck and the mire of trying to agree on terms, focusing on the underlying concepts using simple, non-jargon terms is like a hot knife going through cold butter.

Terms get in the way of a happy marriage too! How many times have you disagreed with your partner on the meaning of a word? It’s more than just semantics; it’s often emotional too. I believe we are all divided by a common language, in that no two people use words to mean exactly the same thing, even everyday words like “support” or “meeting”. I have learned that it is easier to learn and use the language of my spouse than it is to convince her that the term I use is the right one (despite the seductive appeal of the latter).
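The contrast between the two measures of risk can be made concrete with a small sketch. The function names and figures below are my own illustration, not from the cited paper:

```python
# Illustrative sketch of the two views of 'risk'. Names and figures
# are hypothetical examples, not taken from the cited paper.

def traditional_risk(probability: float, loss: float) -> float:
    """Traditional ORM: risk = probability of a loss times the amount of the loss."""
    return probability * loss

def modern_risk_present(probability: float, exposure: float) -> bool:
    """Modern ORM: risk requires both exposure and uncertainty.

    If the loss is certain (p = 1.0) or impossible (p = 0.0) there is
    no uncertainty, hence no risk in the modern sense; the certain-loss
    case is what the modern view calls 'expected loss'.
    """
    return exposure > 0 and 0.0 < probability < 1.0

# A certain $1M loss: the traditional measure reports $1M of 'risk',
# while the modern view reports no risk at all, only expected loss.
print(traditional_risk(1.0, 1_000_000))     # 1000000.0
print(modern_risk_present(1.0, 1_000_000))  # False
```

The point is not the arithmetic; it is that both functions combine the same four underlying ingredients, just differently.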


[1] “A New Approach for Managing Operational Risk Addressing the Issues Underlying the 2008 Global Financial Crisis”  Sponsored by: Joint Risk Management Section, Society of Actuaries,  Canadian Institute of Actuaries, and Casualty Actuarial Society

For further reading, refer to Michael Uschold’s additional posts in this series.


Part 5: Good terms are important for socializing the ontology

In my previous post, I explained why, when building an enterprise ontology, it is a good idea to focus on concepts first and to decide on terms later. Today, we will discuss what to do when ‘later’ arrives. But first, if terms don’t matter from a logic perspective, why do we care about them? In short, they are essential for learning and understanding the ontology.

This is true even if there is a single developer, who should be able to know the meaning of a term at a glance, without having to rely on memory. It is more important if there are multiple developers. However, the most important reason to have good terms is that without them, it is nearly impossible for anyone else to become familiar with the ontology, which in turn severely limits its potential for being used. How do we choose good terms for an enterprise ontology? An ideal term is one that is strongly suggestive of its meaning and would be readily understood by anyone in the enterprise who needs to know what is going on, even someone unfamiliar with terms tied to specific applications.

A term that is strongly suggestive of its meaning just makes things easier for everyone. This requires not only that the term is (or could be, with minimal disruption) commonly used across the enterprise to express the concept it is naming, but also that the same term is not used for a variety of other things. Such ambiguity is the enemy. Because an enterprise ontology is designed to represent the real world in the given enterprise independently of its applications, it is important that the terms are independent of any particular application.

This is easier said than done, as the terminology of a widely used application in an enterprise often becomes the terminology of the enterprise in general. Individuals in the enterprise forget that various terms are tied to a particular application and vendor, just like we forget that ‘Kleenex’ is tied to a particular brand and manufacturer. Also, because the enterprise ontology is intended for use across the whole enterprise, it is not a good idea to use jargon terms that are understood only by specialists in a given area and will likely confuse others. Future applications that are based on the enterprise ontology can introduce local terms that are understood by the narrower group of people.

To reap the most rewards from the enterprise ontology in the long term, it is important to explicitly link the terms in the application to the concepts in the enterprise ontology. This way, the terms in the application effectively become synonyms for the terms in the ontology reflecting the mapped concepts.
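As a minimal sketch of such a linkage (the application terms and concept identifiers below are invented for illustration; a real deployment would typically record the mapping in the ontology itself, e.g. via annotation or SKOS-style labels):

```python
# Hypothetical mapping from application-local terms to enterprise
# ontology concepts; the terms and IRIs are invented for illustration.
ONTOLOGY_CONCEPT_FOR = {
    "cust_acct":      "ex:CustomerAccount",      # billing application
    "client_account": "ex:CustomerAccount",      # CRM application
    "sku":            "ex:ProductSpecification",
}

def concept_for(app_term: str) -> str:
    """Resolve an application term to its enterprise ontology concept."""
    return ONTOLOGY_CONCEPT_FOR[app_term]

def synonyms(app_term: str) -> list[str]:
    """All application terms that name the same underlying concept."""
    concept = concept_for(app_term)
    return [t for t, c in ONTOLOGY_CONCEPT_FOR.items() if c == concept]

# 'cust_acct' and 'client_account' are effectively synonyms:
print(synonyms("cust_acct"))  # ['cust_acct', 'client_account']
```

Once the mapping exists, any data expressed in an application's local vocabulary can be read in terms of the shared enterprise concepts.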


Part 4: Identify the underlying concepts

In the previous posts in the series, we discussed how it is important to focus on the concepts first and then the terms. Today we discuss identifying what the central concepts are in the enterprise. Every enterprise typically has a small handful of core concepts that all the other concepts hinge on. For retail manufacturing, it is all about products and specifications (of products), which leads to manufacturing. For health care, everything hinges on patients and providing care (to patients), which is the driver for diagnosis and treatment procedures. But how do we identify those core concepts? Unfortunately, there is no magic answer.

The trick is to get into a beginner’s mind and start asking basic questions. Sometimes it takes a while before it is clear what the core concepts are. One good sign that you have them is that everything seems to click nicely into place. It is the distilled essence of a complex web of ideas. Once identified, this small handful of concepts becomes the glue for holding the enterprise ontology together, as well as the basis for the story used to explain and socialize it to stakeholders when it is ready.


Part 3: Concepts first, then terms

In my previous blog, I described how for very broad and general terms, it can be nearly impossible to get a roomful of experts to agree on a definition of the term. However, it can be relatively easy to identify a small set of core concepts that everyone agrees are central to what they are talking about when they use that particular term.

In this blog, we explore the role of concepts vs. terms in the ontology engineering process more broadly; that is, focusing on all terms, not just the more challenging ones. First of all, it is important to understand the role of terms when building an ontology in a formal logic formalism such as OWL. Basically, they don’t matter. Well, that’s not quite true. What is true is that from the perspective of the logic, the formal semantics, and the behavior of any inference engine that uses the ontology, they don’t matter.

You could change any term in an ontology, or all of them, and logically the ontology is still exactly the same. A rose by any other name smells just as sweet. You can call it ‘soccer’ or ‘football’, but it is still the same game. So, especially in the early stages of building the ontology, it is important to focus first on getting the concepts nailed down, and to defer any difficult discussions about terms. Of course, you have to give each concept a name so you can refer to it in the ontology and describe it to anyone else interested in the ontology. If there is a handy name that people are generally happy with, then just use that. When no good term comes to mind, use a long descriptive term that is suggestive of the meaning, and get back to it later.
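This rename-invariance can be illustrated with a toy inference procedure in plain Python. The hierarchy and class names are invented for the example; a real OWL reasoner behaves the same way in this respect, operating on structure rather than names:

```python
# Toy illustration: inference over an ontology's structure is unaffected
# by renaming terms. Class and term names here are invented examples.

def transitive_subclasses(subclass_of: dict[str, str], cls: str) -> set[str]:
    """All direct and inherited superclasses of cls."""
    supers = set()
    while cls in subclass_of:
        cls = subclass_of[cls]
        supers.add(cls)
    return supers

# The same tiny hierarchy under two naming choices:
soccer_terms   = {"PenaltyKick": "Kick", "Kick": "GameEvent"}
football_terms = {"PenaltyKick": "Kick", "Kick": "MatchEvent"}  # one term renamed

# Mapping the inferences of one ontology through the renaming yields
# exactly the inferences of the other: the logic never saw the names.
rename = {"GameEvent": "MatchEvent"}
mapped = {rename.get(c, c) for c in transitive_subclasses(soccer_terms, "PenaltyKick")}
assert mapped == transitive_subclasses(football_terms, "PenaltyKick")
```

The entailments are identical up to the renaming, which is exactly why term choices can safely be deferred.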

Spectrograph

Last night at the EDM Council meeting, Dave Newman from Wells Fargo used spectroscopy as an analogy for the process of decomposing business concepts into their constituent parts. The more I’ve been thinking about it, the more I like it.

Last week I was staring at the concept “unemployment rate” as part of a client project. Under normal light (that is, using traditional modeling approaches) we’d see “unemployment rate” as a number, probably attach it to the thing the unemployment rate was measuring, say the US economy, and be done with it. But when we shine the semantic spectrometer at it, the constituent parts start to light up. It is a measurement. Stare at those little lines a bit harder: the measurement has a value (because the term “value” is overloaded, in gist we’d say it has a magnitude), and the magnitude is a percentage. Percentages are measured as ratios, and this one (stare a little harder at the spectrograph) is the ratio of two populations (in this case, groups of humans). One population consists of those people who are not currently employed and who have been actively seeking employment over the past week; the other is the first group plus those who are currently employed.

These two populations are abstract concepts until someone decides to measure unemployment. At that time, the measurement process has us establish an intensional group (say, residents of Autauga County, Alabama) and perform some process (maybe a phone survey) on some sample (a sub-population) of the residents. Each contacted resident is categorized into one of three sub-sub-populations: currently working; currently not working and actively seeking work; and not working and not actively seeking work. (Note: there is another group that logically follows from this decomposition, is not of interest to the Bureau of Labor Statistics, but is of interest to recruiters: working and actively seeking employment.) Finally, the measurement process dictates whether the measure is a point in time or an average of several measures made over time.

This seems like a lot of work for what started as just a simple number. But look at what we’ve done: we have a completely non-subjective definition of what the concept means. We have a first-class concept that we can associate with many different reference points; for example, the same concept can be applied to national, state, or local unemployment. An ontology will organize this concept in close proximity to other closely related concepts. And the constituent parts of the model (the populations, for instance) are now fully reusable concepts as well. The other thing of interest is that the entire definition was built out of reusable parts (Magnitude, Measurement, Population, Measurement Process, Residence, and Geographic Areas) that existed (in gist) prior to this examination. The only thing that needed to be postulated to complete this definition was what would currently be two taxonomic distinctions: working and seeking work.

David, thanks for that analogy.
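The decomposition can be sketched in code. This is a minimal sketch with invented class and property names, not gist’s actual vocabulary, and made-up counts:

```python
# Sketch of the decomposed "unemployment rate" concept.
# Class, property, and population names are illustrative inventions.
from dataclasses import dataclass

@dataclass
class Population:
    description: str
    count: int

@dataclass
class RatioMeasurement:
    """A measurement whose magnitude is the ratio of two populations."""
    numerator: Population
    denominator: Population

    @property
    def magnitude(self) -> float:
        """The measurement's magnitude, expressed as a percentage."""
        return 100.0 * self.numerator.count / self.denominator.count

seeking = Population("not employed, actively seeking work (past week)", 5_000)
labor_force = Population("the seeking group plus the currently employed", 100_000)

unemployment_rate = RatioMeasurement(numerator=seeking, denominator=labor_force)
print(unemployment_rate.magnitude)  # 5.0
```

Because Population and RatioMeasurement are first-class, the same structure can be reused for national, state, or local unemployment just by swapping in different populations.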

The Profound Effect of Linked Data on Enterprise Systems

Jan Voskuil, of Taxonic, recently sent me this white paper. It is excellent. It’s so good I put it in the taxonomic category, “I wish I would have written this”. But since I didn’t, I did the next best thing: Jan has agreed to let us host a copy here on our web site. Enjoy a succinct 18-page white paper suitable for technical and non-technical audiences. It is a very nice explanation of why the flexible data structures of linked data are making such a profound difference in the cost of changing enterprise systems.

Download whitepaper by Jan Voskuil – Linked Data in the Enterprise

How Data Became Metadata

Is it just me, or is the news that the NSA is getting off the hook on its surveillance of us because it’s just “metadata” more than a bit duplicitous? Somehow the general public is being sold the idea that if the NSA is not looking at the content of our phone calls or email, then maybe it’s alright. I’m not sure where this definition of metadata came from, but unfortunately it’s one of the first the general public has had, and it’s in danger of sticking. Our industry has not done ourselves any favors by touting cute definitions of metadata such as “data about data.” Not terribly helpful. Those of us who have been in the industry longer than we want to admit generally refer to metadata as being equivalent to schema. So the metadata for an email system might be something like:

  • sender (email address)
  • to (email addresses)
  • cc (email addresses)
  • bcc (email addresses)
  • sent (datetime)
  • received (datetime)
  • read (boolean)
  • subject (text)
  • content (text)
  • attachments (various mimetypes)

If we had built an email system in a relational database, these would be the column headings (we’d have to make a few extra tables to allow for multi-value fields like “to”). If we were designing in XML, these would be the elements. But that’s it. That’s the metadata. Ten pieces of metadata.

The NSA is suggesting that the values of the first seven fields are also “metadata” and that only the values of the last three constitute “data.” Seems like an odd distinction to me. And if calling the first seven presumably harmless “metadata,” thereby giving themselves a free pass, doesn’t creep you out, then check out this: https://immersion.media.mit.edu/ . Some researchers at MIT have created a site where you can look at your own email graph in much the same way that the NSA can.

Here is my home email graph (that dense cloud above is the co-housing community I belong to and the source of most of my home email traffic). Everybody’s name is on their nodes. All patterns of interaction are revealed. And this is just one view. There are potentially views over time, views with whether an email was read, responded to, how soon, etc. Imagine this data for everyone in the country, coupled with the same data for phone calls. If that doesn’t raise the hackles of your civil libertarianism, then nothing will. ‘Course, it’s only metadata.

by Dave McComb
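To make the schema-versus-values distinction concrete, here is a sketch using the ten fields above. The record type and the seven/three split are illustrations of the point being made, not anyone’s actual system:

```python
# Sketch: the ten schema fields above as a record type, plus the split
# described here between "metadata" field values and "data" values.
from dataclasses import dataclass, field

@dataclass
class Email:
    sender: str            # email address
    to: list[str]          # email addresses
    cc: list[str]
    bcc: list[str]
    sent: str              # datetime, kept as ISO text for simplicity
    received: str
    read: bool
    subject: str
    content: str
    attachments: list[str] = field(default_factory=list)

# The schema itself -- metadata in the traditional sense -- is just
# the ten field names:
SCHEMA = list(Email.__dataclass_fields__)

# The claimed split: values of the first seven fields are "metadata"
# too; only the values of the last three count as "data".
NSA_METADATA_FIELDS = SCHEMA[:7]
NSA_DATA_FIELDS = SCHEMA[7:]
print(NSA_DATA_FIELDS)  # ['subject', 'content', 'attachments']
```

Note that the `sender`/`to` values are precisely what is needed to draw the email graph described below; the "harmless" seven fields carry the whole social network.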

The Importance of Distinguishing between Terms and Concepts

Michael Uschold, Semantic Arts Senior Ontologist, will share a six-part series on the importance of distinguishing terms from concepts when building enterprise ontologies for corporate clients. This first article summarizes five key points; each is elaborated in greater detail in the subsequent posts. (Note: The titles of each post will become linkable below as they are published.)

The Importance of Distinguishing Terms and Concepts: Overview

  1. Don’t Let Terms Get In The Way. Some terms are so broad and used in so many different ways (e.g. ‘risk’, ‘process’, ‘control’) that it may be impossible to get even a small group of people to agree on a single meaning and definition. Oddly, it can be relatively easy to get those same people to agree on a few core things that they are all talking about when they use the term. The concepts are the real things you are trying to model; the terms are ‘merely’ what you call those things.
  2. Focus On Concepts First, Then Terms. We noted that getting a clear understanding of what the underlying concepts are is easier than getting agreement on terms. It is also more important, especially in the early stages of discovery and ontology development. When modeling in a logic-based ontology language like OWL, the terms have zero impact on the semantics, the logic, and the inference behavior of the ontology. From this perspective, terms don’t matter at all. This means we can safely defer the challenging discussions about terminology until after the concepts are firmly in place.
  3. Identify The Underlying Concepts. In a given enterprise, there is usually a small handful of key concepts that nearly everything else hinges on. In manufacturing, it is about products and specifications. In healthcare, it is about patients and care provision. Identifying these concepts first lays the foundation for building the rest of the ontology.
  4. Good Terms Are Important For Socializing The Ontology. Terms do matter tremendously for overcoming the challenge of getting other people to learn and understand the ontology. Confusing and inconsistent terminology just exacerbates the problem, which in turn hinders usability. Having good terms matters to the ontology developers, so that at a glance they easily know what a term is intended to mean. It matters even more when socializing the ontology, so that others can easily learn and understand it.
  5. Text Definitions Are Important For Socializing The Ontology. From a logic perspective, text definitions and other comments are meaningless, just as terms are. But in ontology development, they play an even more important role up front than terms do. This is because text definitions are the way we communicate to the stakeholders that we have understood the concepts they have in mind.

Stay tuned: we will discuss each item in further detail in the coming weeks. If you have questions for Michael, email him at [email protected].


The re-release of DBBO: Why it’s better.

I’m writing this on an airplane as I’m watching the South Park episode where the lads attempt to save classic films from their directors, who want to re-release them to make them more politically correct and appeal to new audiences. (The remake of Saving Private Ryan where all the guns were replaced with cell phones, for instance.) So it’s with great trepidation that I describe the motivation and result of the “re-release” of DBBO.

We’ve been teaching “Designing and Building Business Ontologies” for over 10 years, and we have been constantly updating it, generally adding more material. What we found was that while the material was quite good, we had to rush to get through it all. Additionally, we had added material that caused us to explain concepts we hadn’t yet covered in order to make a particular point. And of course there was ever more material we wanted to add. We were also interested in modularizing the material, so that people could take less than the full course if that’s what they needed.

We essentially created a dependency graph of the concepts we were covering and dramatically re-sequenced things. This shone a light on a number of areas where additional content was needed. Our first trial run was an internal training project with Johnson & Johnson. This group at J&J had already built some ontologies in Protégé and TopBraid and had training in Linked Open Data and SPARQL. So we were able to try out two aspects of the new modularity. First, if students had the right prerequisites, they could start in the middle somewhere. Second, with the new packaging, would it be possible to spread the training out over a longer period of time and still give the students something they could use in the short term? So with J&J we did the new days 3, 4 and 5 in two sessions separated by two weeks. I’m happy to report that it went very well.

Then in May we had our first public session. We’ve decided to have days 1-3 on Wednesday through Friday and days 4-6 on the following Monday through Wednesday. Some people who really want to learn all there is to learn can power through all six sessions contiguously. There is a lot to do in the Fort Collins area, and we’ve heard the break is good for consolidating what was learned before getting back into it. Others have decided to take days 1-3 and come back later to finish up days 4-6.

The course is now more logically structured and not quite as rushed (although there is still a lot of material). Hard to imagine with all the hands-on work, but we created nearly 2,000 slides, 90% of which were new or substantially upgraded.

Learn more or sign up for DBBO.

Dave McComb

Semantic Tech meets Data Gov

Watch out world! Eric Callmann, a data governance veteran, recently joined the Semantic Arts team as a consultant. We like his fresh and unique perspective on how to use semantic technology to help manage the massive amounts of data that could potentially drive us all mad.

We did a little Q&A with Eric to find out more about him and his take on #SemTech2013.

SA: What’s your background and how did we find you?

Eric Callmann: My background is in developing information quality programs at large information services companies. I found my way to Semantic Arts by way of an Information Governance Council where I met Dave McComb.

SA: What made you want to take the leap into semantics?

EC: The arena of semantics and ontologies is at the forefront of emerging, next generation tools to help us manage the larger and larger amounts of data that are being produced. Also the development of data governance programs had me thinking that as the realm of semantics grows, there is a greater need to govern and ensure high quality triples created within ontologies. I shared this with Dave and he felt the same way. As a result, I joined Semantic Arts in May and have been drinking from the semantic fire hose ever since.

SA: So we sent you to SemTech in June, what was your experience?

EC: I was fortunate to be able to attend SemTech2013. For someone who is not a diehard techie (e.g. I don’t build apps, manage databases, install hardware, etc.) but rather a business person who knows how to leverage technology, the event was a way to see into the future.

SA: What do you mean by ‘seeing into the future’?

EC: What I mean by seeing into the future is that semantic technologies are, and will continue to be, making it easier for all of us to find the things we need, both on the internet and within enterprises. For example, Walmart presented their use of semantics to improve the search experience on Walmart.com. They are leveraging publicly available information, such as Wikipedia, to help the search engine understand the context of what one is searching for. Throughout the conference there were numerous presentations about using this technology in new ways to improve analytics and develop new services. To say the least, the amount of innovation and entrepreneurship happening in this space is astounding. There are already new services being offered that use this technology, such as Whisk.com, which allows you to find a recipe and have your grocery list created automatically at your favorite grocery store (note: this is only available in the UK…lucky Londoners). If Walmart is doing it, it is pretty obvious that Google and Yahoo! are leveraging this great technology in really cool ways too.

More from Eric’s perspective in the future. We are excited to have Eric on board! If you have any questions for Eric, make sure to shoot him an email: [email protected]
