S&P Global Commodity Insights (formerly known as Platts) provides benchmark price assessments for the physical commodities markets. As we have noted before, if you hear on the news that Brent Crude is trading at $100 a barrel, it’s likely that S&P Global determined this price.
The price of a given commodity, be it coal, crude oil, or steel, depends on several factors, and in large part on the commodity’s chemical characteristics. Crude oil, for example, comes in a variety of “grades.” A quick look at this periodic table of crude oil grades will give a sense of how much variety there really is. If you hover over and click on some of the boxes, you will see that each crude oil grade is associated with a technical specification covering characteristics like sulfur concentration and API Gravity. These characteristics determine whether crude oil is “sweet” or “sour” (depending on its sulfur concentration) and whether it is “heavy” or “light” (depending on its API Gravity).
While crude oils have other characteristics, the market has determined sulfur concentration and API Gravity to be the most important for the purposes of price assessments. The same thing holds for other types of commodities—there are “key market characteristics” for any commodity that will be important for pricing purposes.
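To make the sweet/sour and heavy/light distinctions concrete, here is a minimal Python sketch of how a grade's two key characteristics could be turned into a label. The cutoff values, the intermediate "medium" band, and the names `CrudeGrade`, `sulfur_class`, and `density_class` are all assumptions chosen for illustration; they are not S&P's specification values, and the example grades are invented.

```python
from dataclasses import dataclass

# Illustrative cutoffs only (assumptions, not S&P's published specifications).
SWEET_MAX_SULFUR_PCT = 0.5   # sulfur at or below this share by weight counts as "sweet"
LIGHT_MIN_API = 31.1         # API Gravity at or above this counts as "light"
HEAVY_MAX_API = 22.3         # API Gravity at or below this counts as "heavy"


@dataclass(frozen=True)
class CrudeGrade:
    """Hypothetical record holding the two key market characteristics."""
    name: str
    sulfur_pct: float    # sulfur concentration, percent by weight
    api_gravity: float   # API Gravity, in degrees


def sulfur_class(grade: CrudeGrade) -> str:
    """Return 'sweet' or 'sour' based on sulfur concentration."""
    return "sweet" if grade.sulfur_pct <= SWEET_MAX_SULFUR_PCT else "sour"


def density_class(grade: CrudeGrade) -> str:
    """Return 'light', 'medium', or 'heavy' based on API Gravity."""
    if grade.api_gravity >= LIGHT_MIN_API:
        return "light"
    if grade.api_gravity <= HEAVY_MAX_API:
        return "heavy"
    return "medium"  # assumed intermediate band between the two cutoffs


def describe(grade: CrudeGrade) -> str:
    """Combine both classifications into a short label."""
    return f"{grade.name}: {density_class(grade)} {sulfur_class(grade)}"


# Invented example grades, not real assessments.
for example in (CrudeGrade("Example A", 0.2, 40.0), CrudeGrade("Example B", 3.0, 20.0)):
    print(describe(example))
```

The point of the sketch is only that a small number of measurable characteristics is enough to place a grade in the categories the market talks about.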
While this knowledge about commodities is commonplace at S&P, pinning down the precise meaning of “commodity grade” turned out to be a significant internal challenge. This is one place where Semantic Arts was able to provide value. We identified and analyzed the various concepts underpinning the idea of a commodity and produced a clear definition.
One might think of a commodity grade as a “leaf” in the hierarchy of commodity types. Crude oil is the general type, while the Saharan Blend is a specific grade at the bottom of the hierarchy. This is a good first-pass definition, but it fails to get at the essence of what a commodity grade really is.
To make headway in clarifying the definition, our strategy was to learn from subject matter experts about how the concept is used in practice.
A key question was:
When people talk about commodity grades at S&P, what are they truly talking about?
It turned out that whatever else people might mean by a commodity grade, the concept always included a specification of the features that the market cares about. Additionally, a grade is specific enough to be priced in the marketplace (by S&P or anyone else). With the benefit of this analysis, we were able to create a clear, formal definition of a commodity grade, expressed using W3C semantic standards. This was the first step. It provided the basis for deeper data quality checks, which are critical to Commodity Insights because S&P’s core business is selling accurate data under constantly changing conditions.
The second step was to create a means of validating, in S&P’s data, that commodity grades had specifications for the relevant key market characteristics. For example, any crude oil grade should have a specification indicating an acceptable range of sulfur concentration. To achieve this, we developed a solution using SPARQL and SHACL that returned validation reports indicating which grades might be missing specifications for certain characteristics. This semantic method proved significantly more efficient than relying on Excel spreadsheets and mistake-prone manual entry processes.
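The logic of that check can be sketched in a few lines of plain Python. This is a stand-in for illustration, not the SPARQL and SHACL solution described above: the `Grade` class, the `REQUIRED_CHARACTERISTICS` mapping, the `validate` function, and the sample grades are all hypothetical, and the spec ranges are invented placeholders.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: commodity type -> key market characteristics that
# every grade of that type is expected to have a specification for.
REQUIRED_CHARACTERISTICS = {
    "crude oil": {"sulfur concentration", "API gravity"},
}


@dataclass
class Grade:
    """Hypothetical grade record; specifications map a characteristic
    name to an acceptable (minimum, maximum) range."""
    name: str
    commodity_type: str
    specifications: dict = field(default_factory=dict)


def validate(grades):
    """Return one report entry per grade that lacks a required specification."""
    report = []
    for grade in grades:
        required = REQUIRED_CHARACTERISTICS.get(grade.commodity_type, set())
        missing = sorted(required - set(grade.specifications))
        if missing:
            report.append({"grade": grade.name, "missing": missing})
    return report


# Invented sample data: the second grade has no API gravity specification.
grades = [
    Grade("Example A", "crude oil",
          {"sulfur concentration": (0.0, 0.5), "API gravity": (35.0, 45.0)}),
    Grade("Example B", "crude oil",
          {"sulfur concentration": (1.0, 3.5)}),
]

for entry in validate(grades):
    print(entry["grade"], "is missing:", ", ".join(entry["missing"]))
```

In a semantic implementation, the required characteristics would be declared once as constraints over the data model rather than hard-coded in a script, which is what lets the same check run across every commodity type.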
While we used crude oil as an example to illustrate this idea, the framework we developed can be applied to all commodities. Most importantly, it brought automation, data interoperability, and consistent meaning to multiple teams that had previously interpreted “commodity grade” through different lenses. The solution makes it easier to implement changes to values, saves many hours previously spent aggregating spreadsheets, and improves data integrity and traceability.
There were a few important takeaways. First, it can take years of working on a project to determine with confidence what something means. Delivering a semantic layer brings data reusability, and applying data-centric principles addresses the core challenge of dissolving data silos.
Second, it’s possible to live with an unclear definition, but once you’ve cracked the semantic puzzle across the enterprise, the efficiency and accuracy of shared knowledge reach a new level. A tangible benefit of knowing exactly what something means is that problems caused by persistent ambiguity become a thing of the past. That clarity is now built into standard S&P operating procedures through a data-centric, semantics-driven solution.