The Flagging Art of Saying Nothing

Who doesn’t like a nice flag? Waving in the breeze, reminding us of who we are and what we stand for. Flags are a nice way of providing a rallying point around which to gather and show our colors to the world. They are a way of showing membership in a group, or providing a warning. Which is why it is so unfortunate when we find flags in a data management system, because they are reduced to saying nothing. Let me explain.

When we see Old Glory, we instantly know it is emblematic of the United States. We also instantly recognize the United Kingdom’s emblematic Union Jack and Canada’s Maple Leaf Flag. Another type of flag is a Warning flag alerting us to danger. In either case, we have a clear reference to what the flag represents. How about when you look at a data set and see ‘Yes’, or ‘7’? Sure, ‘Yes’ is a positive assertion and 7 is a number, but those are classifications, not meaning. Yes what? 7 what? There is no intrinsic meaning in these flags. Another step is required to understand the context of what is being asserted as ‘Yes’. Numeric values have even more ambiguity. Is it a count of something, perhaps 7 toasters? Is it a ranking, 7th place? Or perhaps it is just a label, Group 7?

In data systems the number of steps required to understand a value’s meaning is critical both for reducing ambiguity, and, more importantly, for increasing efficiency. An additional step is required to understand that ‘Yes’ means ‘needs review’, so the processing steps have doubled to extract its meaning. In traditional systems, the two-step flag dance is required because two steps are required to capture the value. First a structure has to be created to hold the value, the ‘Needs Review’ column. Then a value must be placed into that structure. More often than not, an obfuscated name like ‘NdsRvw’ is used, which requires a third step to understand what that means. Only when the structure is understood can the value and meaning the system designer was hoping to capture be deciphered.

In cases where the value that should be contained in the structure isn’t known, a NULL value is inserted as a placeholder. That’s right, a value literally saying nothing. Traditional systems are built as structure first, content second. First the schema, the structure definition, gets built. Then it is populated with content. The meaning of the content may or may not survive the contortions required to stuff it into the structure, but it gets stuffed in anyway in the hope it can be deciphered later when extracted for a given purpose. Given situations where there is a paucity of data, there is a special name for a structure that largely says nothing – sparse tables. These are tables known to likely contain only a very few of the possible values, but the structure still has to be defined before the rare case values actually show up. Sparse tables are like requiring you to have a shoe box for every type of shoe you could possibly ever own even though you actually only own a few pairs.

Structure-first thinking is so embedded in our DNA that we find it inconceivable that we can manage data without first building the structure. As a result, flag structures are often put in to drive system functionality. Logic then gets built to execute the flag dance, and it gets executed every time interaction with the data occurs. The logic says something like this:
IF this flag DOESN’T say nothing
THEN do this next thing
OTHERWISE skip that next step
OR do something else completely.
Sadly, structure-first thinking requires this type of logic to be in place. The NULL placeholders are a default value to keep the empty space accounted for, and there has to be logic to deal with them.
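
As a rough illustration of the flag dance, here is a minimal Python sketch. The rows, the contract names other than Jefferson, and the function name are all hypothetical, and None stands in for NULL; it is only meant to show where the extra checks creep in, not how any particular system does it.

```python
# Hypothetical rows from a structure-first table; None stands in for NULL.
contracts = [
    {"name": "Jefferson", "NdsRvw": "Yes"},
    {"name": "Adams", "NdsRvw": "No"},
    {"name": "Madison", "NdsRvw": None},
]


def contracts_to_review(rows):
    """Run the flag dance: check each flag for NULL, then decode it."""
    selected = []
    for row in rows:
        flag = row["NdsRvw"]
        if flag is None:
            # NULL says nothing, so the logic has to decide what to do with it.
            continue
        if flag == "Yes":
            # Only now do we learn that 'Yes' means 'needs review'.
            selected.append(row["name"])
    return selected


to_review = contracts_to_review(contracts)
print(to_review)
```

Every row is visited and every flag is inspected, including the ones that say nothing.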

Semantics, on the other hand, is meaning-first thinking. Since there is no meaning in NULL, there is no concept of storing NULL. Semantics captures meaning by making assertions. In semantics we write code that says “DO this with this data set.” No IF-THEN logic, just DO this and get on with it. Here is an example of how semantics maintains the fidelity of our information without having vacuous assertions.

The system can contain an assertion that the Jefferson contract is categorized as ‘Needs Review’ which puts it into the set of all contracts needing review. It is a subset of all the contracts. The rest of the contracts are in the set of all contracts NOT needing review. These are separate and distinct sets which are collectively the set of all contracts, a third set. System functionality can be driven by simply selecting the set requiring action, the “Needs Review” set, the set that excludes those that need review, or the set of all contracts. Because the contracts requiring review are in a different set, a sub-set, and it was done with a single step, the processing logic is cut in half. Where else can you get a 50% discount and do less work to get it?
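
The set-based idea above can be sketched with plain Python sets. This is only an analogy, assuming hypothetical contract names and a made-up `review` function; a real semantic system would hold these assertions in a knowledge graph rather than in memory.

```python
# Hypothetical contract names; each set membership is a positive assertion.
all_contracts = {"Jefferson", "Adams", "Madison"}
needs_review = {"Jefferson"}  # "The Jefferson contract needs review."

# The complement is derived on demand, never stored as 'No' or NULL.
not_needing_review = all_contracts - needs_review


def review(contract_names):
    """DO this with this data set: no per-row flag check is needed."""
    return sorted(f"review {name}" for name in contract_names)


actions = review(needs_review)
print(actions)
```

The function receives only the set that requires action, so there is no flag to test and no NULL to step around.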

I love a good flag, but I don’t think they would have caught on if we needed to ask the flag-bearer what the label on the flagpole said to understand what it stood for.

Blog post by Mark Ouska 

For more reading on the topic, check out this post by Dave McComb.

The Data-Centric Revolution: Chapter 2

The Data-Centric Revolution

Below is an excerpt and downloadable copy of the “Chapter 2: What is Data-Centric?”

CHAPTER 2

What is Data-Centric?

Our position is:

A data-centric enterprise is one where all application functionality is based on a single, simple, extensible data model.

First, let’s make sure we distinguish this from the status quo, which we can describe as an application-centric mindset. Very few large enterprises have a single data model. They have one data model per application, and they have thousands of applications (including those they bought and those they built). These models are not simple. In every case we examined, application data models are at least 10 times more complex than they need to be, and the sum total of all application data models is at least 100-1000 times more complex than necessary.

Our measure of complexity is the sum total of all the items in the schema that developers and users must learn in order to master a system. In relational technology this would be the number of tables plus the number of all attributes (columns). In object-oriented systems, it is the number of classes plus the number of attributes. In an XML- or JSON-based system it is the number of unique elements and/or keys.

The number of items in the schema directly drives the number of lines of application code that must be written and tested. It also drives the complexity for the end user, as each item eventually surfaces in forms or reports and the user must master what these mean and how they relate to each other to use the system.
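
To make the complexity measure concrete, here is a small Python sketch for the relational case. The schema, its table names, and its column names are invented for illustration; the function simply counts each table plus each of its columns, as described above.

```python
# Hypothetical relational schema: table name -> list of column names.
schema = {
    "customer": ["id", "name", "email"],
    "order": ["id", "customer_id", "placed_on", "total"],
}


def schema_complexity(tables):
    """Count the items to learn: each table plus each of its columns."""
    return len(tables) + sum(len(columns) for columns in tables.values())


complexity = schema_complexity(schema)
print(complexity)
```

Two tables with three and four columns give a complexity of nine. Run the same count across thousands of application schemas and the totals described above start to look plausible.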

Very few organizations have applications based on an extensible model. Most data models are very rigid.  This is why we call them “structured data.”  We define the structure, typically in a conceptual model, and then convert that structure to a logical model and finally a physical (database specific) model.  All code is written to the model.  As a result, extending the model is a big deal.  You go back to the conceptual model, make the change, then do a bunch of impact analysis to figure out how much code must change.

An extensible model, by contrast, is one that is designed and implemented such that changes can be added to the model even while the application is in use. Later in this book and especially in the two companion books we get into a lot more detail on the techniques that need to be in place to make this possible.

In the data-centric world we are talking about a data model that is primarily about what the data means (that is, the semantics). It is only secondarily, and sometimes locally, about the structure, constraints, and validation to be performed on the data.

Many people think that a model of meaning is “merely” a conceptual model that must be translated into a “logical” model, and finally into a “physical” model, before it can be implemented. Many people think a conceptual model lacks the requisite detail and/or fidelity to support implementation. What we have found over the last decade of implementing these systems is that done well, the semantic (conceptual) data model can be put directly into production. And that it contains all the requisite detail to support the business requirements.

And let’s be clear, being data-centric is a matter of degree. It is not binary. A firm is data-centric to the extent (or to the percentage) its application landscape adheres to this goal.

Data-Centric vs. Data-Driven

Many firms claim to be, and many firms are, “data-driven.” This is not quite the same thing as data-centric. “Data-driven” refers more to the place of data in decision processes. A non-data-driven company relies on human judgement as the justification for decisions. A data-driven company relies on evidence from data.

Data-driven is not the opposite of data-centric. In fact, they are quite compatible, but merely being data-driven does not ensure that you are data-centric. You could drive all your decisions from data sets and still have thousands of non-integrated data sets.

Our position is that data-driven is a valid aspiration, though data-driven does not imply data-centric. Data-driven would benefit greatly from being data-centric as the simplicity and ease of integration make being data-driven easier and more effective.

We Need our Applications to be Ephemeral

The first corollary to the data-centric position is that applications are ephemeral, and data is the important and enduring asset. Again, this is the opposite of the current status quo. In traditional development, every time you implement a new application, you convert the data to the new application’s representation. These application systems are very large capital projects. This causes people to think of them like more traditional capital projects (factories, office buildings, and the like). When you invest $100 Million in a new ERP or CRM system, you are not inclined to think of it as throwaway. But you should. Well, really you shouldn’t be spending that kind of money on application systems, but given that you already have, it is time to reframe this as sunk cost.

One of the ways application systems have become entrenched is through the application’s relation to the data it manages. The application becomes the gatekeeper to the data. The data is a second-class citizen, and the application is the main thing. In data-centric, the data is permanent and enduring, and applications can come and go.

Data-Centric is Designed with Data Sharing in Mind

The second corollary to the data-centric position is default sharing. The default position for application-centric systems is to assume local self-sufficiency. Most relational database systems base their integrity management on having required foreign key constraints. That is, an ordering system requires that all orders be from valid customers. The way they manage this is to have a local table of valid customers. This is not sharing information. This is local hoarding, made possible by copying customer data from somewhere else. And this copying process is an ongoing systems integration tax. If they were really sharing information, they would just refer to the customers as they existed in another system. Some API-based systems get part of the way there, but there is still tight coupling between the ordering system and the customer system that is hosting the API. This is an improvement but hardly the end game.

As we will see later in this book, it is now possible to have a single instantiation of each of your key data types—not a “golden source” that is copied and restructured to the various application consumers, but a single copy that can be used in place.

Is Data-Centric Even Possible?

Most experienced developers, after reading the above, will explain to you why this is impossible. Based on their experience, it is impossible. Most of them have grown up with traditional development approaches. They have learned how to build traditional standalone applications. They know how applications based on relational systems work. They will use this experience to explain to you why this is impossible. They will tell you they tried this before, and it didn’t work.

Further, they have no idea how a much simpler model could recreate all the distinctions needed in a complex business application. There is no such thing as an extensible data model in traditional practice.

You need to be sympathetic and recognize that based on their experience, extensive though it might be, they are right. As far as they are concerned, it is impossible.

But someone’s opinion that something is impossible is not the same as it not being possible. In the late 1400s, most Europeans thought that the world was flat and sailing west to get to the far east was futile. In a similar vein, in 1900 most people were convinced that heavier than air flight was impossible.

The advantage we have relative to the pre-Columbians, and the pre-Wrights is that we are already post-Columbus and post-Wrights. These ideas are both theoretically correct and have already been proved.

The Data-Centric Vision

To hitch your wagon to something like this, we need to make a few aspects of the end game much clearer. We earlier said the core of this was the idea of a single, simple, extensible data model. Let’s drill in on this a bit deeper.

Click here to download the entire chapter.

Use the code: SemanticArts for a 20% discount off of Technicspub.com

Semantic Ontology: The Basics

What is Semantics?

Semantics is the study of meaning. By creating a common understanding of the meaning of things, semantics helps us better understand each other. Common meaning helps people understand each other despite different experiences or points of view. Common meaning in semantic technology helps computer systems more accurately interpret what people mean. Common meaning enables disparate IT systems – data sources and applications – to interface more efficiently and productively.

What is an Ontology?

An ontology defines all of the elements involved in a business ecosystem and organizes them by their relationship to each other. The benefits of building an ontology are:

  • Everyone agrees on a common set of terms used to describe things
  • Different systems – databases and applications – can communicate with each other without having to directly connect to each other.

Enterprise Ontology

An Ontology is a set of formal concept definitions.

An Enterprise Ontology is an Ontology of the key concepts that organize and structure an Organization’s information systems. Having an Enterprise Ontology provides a unifying whole that makes system integration bearable.

An Enterprise Ontology is like a data dictionary or a controlled vocabulary, however it is different in a couple of key regards. A data dictionary, or a controlled vocabulary, or even a taxonomy, relies on humans to read the definitions and place items into the right categories. An ontology is a series of rules about class (concept) membership that uses relationships to set up the inclusion criteria. This has several benefits, one of the main ones being that a system (an inference engine) can assign individuals to classes consistently and automatically.
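
The difference between a human-read definition and a membership rule can be sketched in a few lines of Python. This is a toy stand-in for an inference engine, not OWL reasoning; the facts, the relationship names, and the single rule are hypothetical.

```python
# Hypothetical facts, each a (subject, relationship, object) statement.
facts = {
    ("alice", "placedOrder", "order-1"),
    ("bob", "requestedQuote", "quote-7"),
}

# Hypothetical rule: anything that has placed an order is a Customer.
membership_rules = {"Customer": "placedOrder"}


def classify(statements, rules):
    """Assign each individual to every class whose rule it satisfies."""
    members = {class_name: set() for class_name in rules}
    for subject, relationship, _ in statements:
        for class_name, required in rules.items():
            if relationship == required:
                members[class_name].add(subject)
    return members


classes = classify(facts, membership_rules)
print(classes)
```

Nobody had to read a definition and file alice under Customer; the relationship she participates in did the work, and it would do the same for every other individual.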

By building the ontology in application neutral terminology it can fill the role of “common denominator” between the many existing and potential data sources you have within your enterprise. Best practice in ontology building favors building an Enterprise Ontology with the fewest concepts needed to promote interoperability, and this in turn allows it to fill the role of “least common denominator.”

Building an Enterprise Ontology is the jumping off point for a number of Semantic Technology initiatives. We’ll only mention in passing here the variety of those initiatives (we invite you to poke around our web site to find out more). We believe that Semantic Technology will change the way we implement systems in three major areas:

  • Harvest – Most of the information used to run most large organizations comes from their “applications” (their ERP or EHR or Case Management or whatever internal application). Getting new information is a matter of building screens in these applications and (usually) paying your employees to enter data, such that you can later extract it for other purposes. Semantic Technology introduces approaches to harvest data not only from internal apps, but from Social Media, unstructured data and the vast and growing sets of publicly available data waiting to be integrated.
  • Organize – Relational, and even Object Oriented, technology imposes a rigid, pre-defined structure and set of constraints on what data can be stored and how it is organized. Semantic Technology replaces this with a flexible data structure that can be changed without converting the underlying data. It is so flexible that not all the users of a data set need to share the same schema (they need to share some part of the schema, otherwise there is no basis for sharing, but they don’t need to be in lockstep, each can extend the model independently). Further, the semantic approach promotes the idea that the information is at least partially “self-organizing.” Using URIs (Web based Uniform Resource Identifiers) and graph-based databases allows these systems to infer new information from existing information and then use that new information in the dynamic assembly of data structures.
  • Consume — Finally we think semantic technology is going to change the way we consume information. It is already changing the nature of work flow-oriented systems (ask us about BeInformed). It is changing data analytics. It is the third “V” in Big Data (“Variety”). Semantic Based mashups are changing the nature of presentation. Semantic based Search Engine Optimization (SEO) is changing internal and external search.

Given all that, how does one get started?

Well you can do it yourself. We’ve been working in this space for more than twenty years and have been observing clients take on a DIY approach, and while there have been some successes, in general we see people recapitulating many of the twists and turns that we have worked through over the last decade.

You can engage some of our competitors (contact us and we’d be happy to give you a list). But, let us warn you ahead of time: most of our competitors are selling products, and as such their “solutions” are going to favor the scope of the problem that their tools address. Nothing wrong with that, but you should know going in, that this is a likely bias. And, in our opinion, our competitors are just not as good at this as we are. Now it may come to pass that you need to go with one of our competitors (we are a relatively small shop and we can’t always handle all the requests we get) and if so, we wish you all the best…

If you do decide that you’d like to engage us, we’d suggest a good place to get started would be with an Enterprise Ontology. If you’d like to get an idea, for your budgeting purposes, what this might entail, click here to get in touch, and you’ll go through a process where we help you clarify a scope such that we can estimate from it. Don’t worry about being descended on by some over eager sales types, we recognize that these things have their own timetables and we will be answering questions and helping you decide what to do next. We recognize that these days “selling” is far less effective than helping clients do their own research and supporting your buying process.

That said, there are three pretty predictable next steps:

  • Ask us to outline what it would cost to build an Enterprise Ontology for your organization (you’d be surprised; it is far less than the effort to build an Enterprise Data Model or equivalent)
  • gist – as a byproduct of our work with many Enterprise Ontologies over the last decade we have built and made publicly available “gist” which is an upper ontology for business systems. We use it in all our work and we have made it publicly available via a Creative Commons Share Alike license (you can use it for any purpose provided you acknowledge where you got it)
  • Training – if you’d like to learn more about the language and technology behind this (either through public courses or in house) check out our offerings in training.

How is Semantic Technology different from Artificial Intelligence?

Artificial Intelligence (AI) is a 50+ year old academic discipline that provided many technologies that are now in commercial use. Two things comprise the core of semantic technology. The first stems from AI research in knowledge representation and reasoning done in the 70s and 80s and includes ontology representation languages such as OWL and inference engines like FaCT++. The second relates to data representation and querying using triple stores, RDF and SPARQL, which are largely unrelated to AI. A broad definition of semantic technology includes a variety of other technologies that emerged from AI. These include machine learning, natural language processing, intelligent agents and to a lesser extent speech recognition and planning. Areas of AI not usually associated with semantic technology include creativity, vision and robotics.

How Does Semantics Use Inference to Build Knowledge?

Semantics organizes data into well-defined categories with clearly defined relationships. Classifying information in this way enables humans and machines to read, understand and infer knowledge based on its classification. For example, if we see a red-breasted bird outside our window in April, our general knowledge leads us to identify it as a robin. Once it is properly categorized, we can infer a lot more information about the robin than just its name.

We know, for example, that it is a bird; it flies; it sings a song; it spends its winter somewhere else and the fact that it has shown up means that good weather is on its way.

We know this other information because the robin has been correctly identified within the schematic of our general knowledge about birds, a higher classification; seasons, a related classification, etc.

This is a simple example of how, by correctly classifying information into a predefined structure, we can infer new knowledge. In a semantic model, once the relationships are set up, a computer can classify data appropriately, analyze it based on the predetermined relationships and then infer new knowledge based on this analysis.
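
The robin example can be sketched as a very small piece of Python. The two dictionaries are a hypothetical, hand-built stand-in for a semantic model, and the `infer` function only walks up the class hierarchy; a real inference engine does considerably more.

```python
# Hypothetical model: what each class is a kind of, and what is known about it.
subclass_of = {"Robin": "Bird"}
known_about = {
    "Robin": {"spends its winter somewhere else"},
    "Bird": {"flies", "sings a song"},
}


def infer(class_name):
    """Collect what is known about a class and all of its ancestors."""
    inferred = set()
    current = class_name
    while current is not None:
        inferred |= known_about.get(current, set())
        current = subclass_of.get(current)
    return inferred


robin_facts = infer("Robin")
print(sorted(robin_facts))
```

Nothing about flying or singing was ever asserted for the robin itself. Classifying it as a bird was enough to bring that knowledge along.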

What is Semantic Agreement?

The primary challenge in building an ontology is getting people to agree about what they really mean when they describe the concepts that define their business. Gaining semantic agreement is the process of helping people understand exactly what they mean when they express themselves.

Semantic technologists accomplish this because they define terms and relationships independent from the context of how they are applied or the IT systems that store the information, so they can build pure and consistent definitions across disciplines.

Why is Semantic Agreement Important?

Semantic agreement is important because it enables disparate computer systems to communicate directly with each other. If one application defines a customer as someone who has placed an order and another application defines the customer as someone who might place an order, then the two applications cannot pass information back and forth because they are talking about two different people. In a traditional IT approach, the only way the two applications will be able to pass information back and forth is through a systems integration patch. Building these patches costs time and money because it requires the owners of the two systems to negotiate a common meaning and write incremental code to ensure that the information is passed back and forth correctly. In a semantic enabled IT environment, all the concepts that mean the same thing are defined by a common meaning, so the different applications are able to communicate with each other without having to write systems integration code.

What is the Difference Between a Taxonomy and Ontology?

A taxonomy is a set of definitions that are organized by a hierarchy that starts at the most general description of something and gets more defined and specific as you go down the hierarchy of terms. For example, a red-tailed hawk could be represented in a common language taxonomy as follows:

  • Bird
    • Raptors
      • Hawks
        • Red Tailed Hawk

An ontology describes a concept both by its position in a hierarchy of common factors like the above description of the red-tailed hawk but also by its relationships to other concepts. For example, the red-tailed hawk would also be associated with the concept of predators or animals that live in trees.

The richness of the relationships described in an ontology is what makes it a powerful tool for modeling complex business ecosystems.
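
Here is a minimal Python sketch of the distinction, assuming Hawks sit under Raptors in the hierarchy. The relationship names (`isA`, `livesIn`) and the concept names Predator and Tree are hypothetical labels for the associations mentioned above.

```python
# The taxonomy: each term points only to its broader term.
broader = {
    "Red Tailed Hawk": "Hawks",
    "Hawks": "Raptors",
    "Raptors": "Bird",
}

# The ontology adds typed relationships to other concepts.
# These relationship and concept names are hypothetical.
relationships = {
    ("Red Tailed Hawk", "isA", "Predator"),
    ("Red Tailed Hawk", "livesIn", "Tree"),
}


def ancestors(term):
    """Walk up the hierarchy from a term to the most general term."""
    path = []
    while term in broader:
        term = broader[term]
        path.append(term)
    return path


def related(term):
    """Return the non-hierarchical relationships asserted for a term."""
    return sorted((rel, obj) for subj, rel, obj in relationships if subj == term)


hawk_ancestors = ancestors("Red Tailed Hawk")
hawk_related = related("Red Tailed Hawk")
print(hawk_ancestors, hawk_related)
```

The taxonomy can only answer "what is this a kind of?" The added relationships let us ask what else the hawk is connected to.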

What is the Difference Between a Logical Data Model and Ontology?

The purpose of an ontology is to model the business. It is independent from the computer systems, e.g. legacy or future applications and databases. Its purpose is to use formal logic and common terms to describe the business, in a way that both humans and machines can understand. Ontologies use OWL axioms to describe classes and properties that are shared across multiple lines of business so concepts can be defined by their relationships, making them extensible to increasing levels of detail as required. Good ontologies are ‘fractal’ in nature, meaning that the common abstractions create an organizing structure that easily expands to accommodate the complex information management requirements of the business. The purpose of a logical model is to describe the structure of the data required for a particular application or service. Typically, a logical model shows all the entities, relationships and attributes required for a proposed application. It only includes data relevant to the particular application in question. Ideally logical models are derived from the ontology which ensures consistent meaning and naming across future information systems.

How can an Ontology Link Computer Systems Together?

Since an ontology is separate from any IT structure, it is not limited by the constraints required by specific software or hardware. The ontology exists as a common reference point for any IT system to access. Thanks to this independence, it can serve as a common ground for different:

  • database structures, such as relational and hierarchical,
  • applications, such as an SAP ERP system and a cloud-hosted e-market,
  • devices, such as an iPad or cell phone.

The benefit of the semantic approach is that you can link the legacy IT systems that are the backbone of most business to exciting new IT solutions, like cloud computing and mobile delivery.

What are 5 Business Benefits of Semantic Technology Solutions?

Semantic technology helps us:

  1. Find more relevant and useful information
    • Because it enables us to search information from disparate sources (federated search) and automatically refine our searches (faceted search).
  2. Better understand what is happening
    • Because it enables us to use the relationships between concepts to predict and interpret change.
  3. Build more transparent systems and communications
    • Because it is based on common meanings and mutual understanding of the key concepts and relationships that govern our business ecosystems.
  4. Increase our effectiveness, efficiency and strategic advantage
    • Because it enables us to make changes to our information systems more quickly and easily.
  5. Become more perceptive, intelligent and collaborative
    • Because it enables us to ask questions we couldn’t ask before.

How Can Semantic Technology Enable Dynamic Workflow?

Semantic-driven dynamic workflow systems are a new way to organize, document and support knowledge management. They include two key things:

  1. A consistent, comprehensive and rigorous definition of an ecosystem that defines all its elements and the relationships between elements. It is like a map.
  2. A set of tools that use this model to:
    • Gather and deliver ad hoc, relevant data.
    • Generate a list of actions – tasks, decisions, communications, etc. – based on the current situation.
    • Facilitate and document interactions in the ecosystem.

These tools work like a GPS system that uses the map to adjust its recommendations based on human interactions. This new approach to workflow management enables organizations to respond faster, make better decisions and increase productivity.

Why Do Organizations Need Semantic-Driven, Dynamic Workflow Systems?

A business ecosystem is a series of interconnected systems that is constantly changing. People need flexible, accurate and timely information and tools to positively impact their ecosystems. Then they need to see how their actions impact the systems’ energy and flow. Semantic-driven, dynamic workflow systems enable users to access information from non-integrated sources, set up rules to monitor this information and initiate workflow procedures when the dynamics of the relationship between two concepts change. They also support the definition of roles and responsibilities to ensure that this automated process is managed appropriately and securely. Organizational benefits to implementing semantic-driven, dynamic workflow systems include:

  • Improved management of complexity
  • Better access to accurate and timely information
  • Improved insight and decision making
  • Proactive management of risk and opportunity
  • Increased organizational responsiveness to change
  • Better understanding of the interlocking systems that influence the health of the business ecosystem

Blog post by Dave McComb

Click here to read a free chapter of Dave McComb’s book, “A Data-Centric Revolution”

 

What Size is Your Meaning?

It’s an odd question, yet determining size is the tacit assumption behind traditional data management efforts. That assumption exists because, traditionally, there has always been a structure built in rows and columns to store information. This is based on physical thinking.

Size matters when building physical things. Your bookshelf needs to be tall, wide and deep enough for your books. If the garage is too small, you won’t be able to fit your truck.

Rows and columns have been around since the early days of data processing, but Dan Bricklin brought this paradigm to the masses when he invented VisiCalc. His digital structure allowed us to perform operations on entire rows or columns of information. This is a very powerful concept. It allows a great deal of analysis to be done and great insight to be delivered. It is, however, still rooted in the same constraint as the bookshelf or garage: how tall, wide, and deep must the structure be?

Semantic technology flips this constraint on its head by shifting away from structure and focusing on meaning.

Meaning, unlike books, has no physical size or dimension.

Meaning will have volume when we commit it to a storage system, but it remains shapeless just like water. There is no concept of having to organize water in a particular order or structure it within a vessel. It simply fills the available space.

At home, we use water in its raw form. It’s delivered to us through a system of pipes as a liquid, which is then managed according to its default molecular properties. When thirsty, we pour it into a glass. If we want a cold beverage, we freeze it in an ice cube tray. Heated into steam, it gives us the ability to make cappuccino.

We don’t have different storage or pipes to manage delivery in each of these forms; it is stored in reservoirs and towers and is delivered through a system of pipes as a liquid. Only after delivery do we begin to change it for our consumption patterns. Storage and consumption are disambiguated from one another.

Semantic technology treats meaning like water. Data is stored in a knowledge graph, in the form of triples, where it remains fluid. Only when we extract meaning do we change it from triples into a form to serve our consumption patterns. Semantics effectively disambiguates the storage and consumption concerns, freeing the data to be applied in many ways previously unavailable.

Meaning can still be extracted in rows and columns where the power of aggregate functions can be applied. It can also be extracted as a graph whose shape can be studied, manipulated, and applied to different kinds of complex problem solving. This is possible because semantic technology works at the molecular level preventing structure from being imposed prematurely.
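As a rough illustration of the two extraction styles described above, the sketch below uses plain Python tuples as a stand-in for a triple store. The `ex:` identifiers, the sample facts, and the helper names `to_rows` and `to_adjacency` are all hypothetical, invented for this example; a real knowledge graph would typically be queried through a graph database.

```python
from collections import defaultdict

# Hypothetical sample data: each fact is a (subject, predicate, object) tuple.
TRIPLES = {
    ("ex:order1", "ex:placedBy", "ex:alice"),
    ("ex:order1", "ex:total", 40),
    ("ex:order2", "ex:placedBy", "ex:alice"),
    ("ex:order2", "ex:total", 25),
    ("ex:order3", "ex:placedBy", "ex:bob"),
    ("ex:order3", "ex:total", 10),
}


def to_rows(triples, columns):
    """Pivot triples into one row (dict) per subject, keeping only the requested predicates."""
    rows = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in columns:
            rows[subject][predicate] = obj
    return dict(rows)


def to_adjacency(triples):
    """View the same triples as a graph: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# Tabular consumption: aggregate order totals per customer.
rows = to_rows(TRIPLES, {"ex:placedBy", "ex:total"})
totals = defaultdict(int)
for row in rows.values():
    totals[row["ex:placedBy"]] += row["ex:total"]
totals = dict(totals)

# Graph consumption: the same stored facts, shaped as edges instead of rows.
graph = to_adjacency(TRIPLES)
```

The stored triples never change; only the shape chosen at extraction time differs, which is the separation of storage and consumption the article describes.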

Knowledge graphs are made up of globally unique information units (atoms) which are then combined into triples (molecules). Unlike water’s two elements, ontologies establish a large collection of elements from which the required set of triples (molecules) is created. Each triple is composed of a Subject, a Predicate, and an Object, and is an assertion of some fact about the Subject. Triples in the knowledge graph are all independently floating around in a database affectionately known as a “bag of triples” because of its fluid nature.
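A minimal sketch of the “bag of triples” idea follows, assuming a Python set is an acceptable stand-in for the database. The `Triple` type, the `https://example.com/` namespace, and the sample facts are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; full identifiers are what make each "atom" globally unique.
EX = "https://example.com/"

bag = {
    Triple(EX + "water", EX + "hasElement", EX + "hydrogen"),
    Triple(EX + "water", EX + "hasElement", EX + "oxygen"),
    Triple(EX + "water", EX + "boilsAtCelsius", "100"),
}

# The same assertions added in a different order produce an equal bag:
# a set has no rows, columns, or positions to maintain.
same_bag = set()
for triple in sorted(bag, reverse=True):
    same_bag.add(triple)

# Asserting a fact that is already present changes nothing.
same_bag.add(Triple(EX + "water", EX + "hasElement", EX + "oxygen"))


def facts_about(triples, subject):
    """Return every (predicate, object) pair asserted about one subject."""
    return {(t.predicate, t.obj) for t in triples if t.subject == subject}
```

Because each triple stands on its own, nothing about the container has to be sized or ordered in advance; new facts are simply added to the bag.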

Semantic technology stores meaning in a knowledge graph using Description Logics to formalize what our minds conceptualize. Water can be stored in many containers and still come out as water just like a knowledge graph can be distributed across multiple databases and still contain the same meaning. Data storage and data consumption are separate concerns that should be disambiguated from one another.
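The claim that a distributed graph keeps the same meaning can be sketched with set union, again using plain Python sets as a stand-in for real databases. The helper names `split_across` and `merge` and the sample facts are assumptions made for this example, not part of any particular product.

```python
def split_across(triples, store_count):
    """Deal triples out to several stores; any assignment would work equally well."""
    stores = [set() for _ in range(store_count)]
    for index, triple in enumerate(sorted(triples)):
        stores[index % store_count].add(triple)
    return stores


def merge(stores):
    """Recombine stores by set union; duplicates collapse automatically."""
    merged = set()
    for store in stores:
        merged |= store
    return merged


# Hypothetical facts, written as (subject, predicate, object) tuples.
original = {
    ("ex:acme", "ex:hasCustomer", "ex:alice"),
    ("ex:alice", "ex:livesIn", "ex:denver"),
    ("ex:alice", "ex:ordered", "ex:toaster"),
    ("ex:toaster", "ex:madeBy", "ex:acme"),
}

two_stores = split_across(original, 2)
three_stores = split_across(original, 3)

# However the triples are distributed, the union asserts the same facts.
assert merge(two_stores) == original
assert merge(three_stores) == original
```

This works only because the identifiers are globally unique: the same subject name means the same thing in every store, so no reconciliation step is needed in this simplified picture.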

Semantic technology is here, robust and mature, and fully ready to take on enterprise data management. Rows and columns have taken us a long way, but they are getting a bit soggy.

It’s time to stop imposing artificial structure when storing our data and instead focus on meaning. Let’s make semantic technology our default approach to handling the data tsunami.

Blog post by Mark Ouska

What is Software Architecture and How to Select an Effective Architect

What is Software Architecture?

Originally published as What is Software Architecture on August 1, 2010

As Howard Roark pointed out in “The Fountainhead,” the difference between an artist and an architect is that an architect needs a client.

Software Architecture is the design of the major components of complex information systems, and how they interact to form a coherent whole. The identification of what constitutes the “major” components is primarily a matter of scale and scope, but generally starts at the enterprise level and works down through subsidiary levels.

The term “architecture” has been overused in the software industry to the extent that it is in danger of becoming meaningless. This is unfortunate, for it comes at a time when companies are in greatest need of some architectural direction. Architecture deals primarily with the specific configuration of technologies and applications in place and those desired to be in place in a particular institution. While we often speak of a “client/server” architecture or a “thin client” architecture, what we are really referring to is an architectural style, in much the same way that we would refer to “gothic” as a style of physical architecture, but the architecture itself only exists in particular buildings.

It isn’t architecture until it’s built

As Howard Roark pointed out in “The Fountainhead,” the difference between an artist and an architect is that an architect needs a client. Architecture, in the built world as well as the software world, generally only comes into play when the scale of the endeavor is such that an individual cannot execute it by themselves.

Generally, architecture is needed because of the scale of the problem to be solved. The phrase “it isn’t architecture until it’s built” makes the point that architects whose drawings may be interesting or attractive, but do not result in structures being built, have only participated in artwork, not architecture.

Dealing with the “as-is”

Another area of confusion for the subject is the relationship between the “architecture” of a procured item, and the architecture of the environment in which it is implemented. We often speak of software with a “J2EE architecture,” and while it is true the framework has an architecture, the real architecture is the combination of the framework of the procured item with the host of components that exist in the implementation environment.

In the built world we may procure an elevator system, and this may be instrumental in the height and use of the building we design and build, and while the elevator system itself no doubt has architecture, we wouldn’t say that the building has an “elevator architecture.” This confusion of the procured parts with the architecture is what often leads people to give short shrift to their existing architecture. Sponsors may sense that their current architecture is inadequate and desire to replace it with something new.

However, unless they are in a position to completely eliminate the existing systems, they will be dealing with the architecture of the procured item as well as the incumbent one. All information systems have an architecture. Many are accidental, but there is an organization of components and some way they operate together to address the goals of the organization. Much as a remodeler will often employ an architect to document the “as built” and “as maintained” architecture before removing a bearing wall, remodelers of information systems would do well to do the same.

Architecture’s many layers

Architecture occurs in many layers or levels, but each is given by the context of the higher-level architecture. So, we can think of the architecture of the plumbing of a house. It has major components (pipes and traps and vents) and their interrelationship, but the architecture of the plumbing only makes sense in the context of the architecture of the dwelling.

It is the same in information systems. We can consider the error handling architecture, but only in the context of a broader architecture, such as Service Oriented or Client/Server. The real difference between an intentional and an accidental architecture is whether the layering was planned and executed top-down, or whether the higher-level architectures just emerged from a bottom-up process.

Beyond Building Materials

The software industry seems to equate building materials with architecture. We might talk about an architecture being “C++” or “Oracle” or “Object Oriented.” (We’ve heard all of these as answers to “What is your architecture?”)

But this confusion between what we build things out of and how we assemble the main pieces would never happen in the built world. No architect would say a building architecture was “brick” or “dry wall” or even “post and lintel,” even though they may use these items or techniques in their architecture.

Conclusion

No doubt there will continue to be confusion about architecture and what it means in the software world, but with a bit of discipline we may be able to revive the term and make it meaningful.

Who Needs Software Architecture?

Originally published as Who Needs Software Architecture on August 5, 2010

Most firms don’t need a new architecture.

They have an architecture and it works fine. If you live in a Tudor house, most of the time you don’t think about architecture; you think about it only when you want to make some pretty major changes. You might expect, given that we do software architecture and this is on our web site, that we would eventually try to construct this theme to say, well, nearly everyone sooner or later needs a software architect.

But that’s just not true. Most companies don’t need software architects and even those that do don’t need them most of the time. Let’s take a look at some of the situations where companies don’t need software architects.

Size

Small companies generally don’t need software architects. By small we mean companies of typically fewer than 100 people; however, this can vary quite a bit depending on the complexity of the information they need to process. If they are in any standard industry and packaged software exists that addresses their business needs, most small-business people would be far better off adopting that package, or the package of their choice, and simply living with the architecture that comes with it.

For instance, in the restaurant industry now, there is a company called Squirrel that has by far the largest market share of the restaurant management applications. You can take orders on Squirrel, print out the receipts, take the credit cards, manage your inventory, schedule your wait people, cooks, busboys, and the like. For the most part, restaurant owners should not care what architecture Squirrel uses. It has an architecture but it’s not an important concern at that scale.

Stability

Larger companies most of the time will find themselves in a relatively stable state. They have a set of applications sitting on a set of platforms using a set of database management systems communicating via some networking protocol and communicating with some set of desktop or portable devices.

No matter how varied it is, that is their current architecture. To the extent that it is stable and they are able to maintain it, extend it, and satisfy their needs, that is exactly what they should do: live within the architecture they’ve created, no matter how accidental the architectural creation process was.

It is really only where there are relatively complex systems, where the complexity is interfering with productivity or the ability to change and respond, or where major changes to the infrastructure are being contemplated, that companies should really consider undertaking architectural projects.

What Does a Software Architect Do?

Originally published as What Does a Software Architect Do? on August 11, 2010

The Software Architect’s primary job is to help a client understand deeply the architecture they have built, understand clearly what they desire from their information systems, and help them construct a plan to get there.

The simple answer of course is that the software architect or architectural firm creates the architecture.

The more involved question is, what goes into that and what process is typically followed to get to that result? The architecture, or as we sometimes refer to it, the “to-be” or target architecture, is an architecture that does not yet exist; and in that sense it is prescriptive. However, in order to define, articulate, draw, and envision a future architecture, we must start from where the client’s architecture currently is and work forward from there.

Divining the “As-is” Architecture

The client’s current architecture is a combination of a descriptive and visual representation of all the key components in the information infrastructure that currently exist. We have found time and time again that the mere inventorying, ordering, arranging, and presenting of this information has yielded tremendous benefit and insights to many clients.

Typically, the process involves reviewing whatever written documentation is available for the existing systems. Sometimes this is catalogued information such as a listing or repository of existing applications/technologies. Sometimes it’s research into the licensing of pieces of software. Sometimes it’s a review of diagrams: network diagrams, hardware schematics, etc.

The architect then interviews many of the key personnel: primarily technical personnel but also knowledgeable end-user personnel who interact with the systems and in many cases understand where other shadow IS systems live.

The end product of these interviews is a set of diagrams and summary documentation that show not only the major components but how they are interrelated. For instance, in some cases we have found it important to document the technical dependency relationships, which include the relationship of an application to the technologies in which it was created and therefore on which it is dependent. (See our article on technical dependencies for more detail in this area.)
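One way to picture the technical dependency relationships mentioned above is as a simple mapping from applications to technologies that can be inverted to answer impact questions. The sketch below is only an assumption about how such an inventory might be recorded; the application names, technology names, and the helpers `dependents_by_technology` and `impacted_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory gathered from documentation and interviews:
# each application is mapped to the technologies it was built with.
DEPENDS_ON = {
    "Order Entry": {"Oracle", "Java"},
    "Billing": {"Oracle", "COBOL"},
    "Scheduling": {"SQL Server", "Java"},
}


def dependents_by_technology(depends_on):
    """Invert the inventory: technology -> applications that depend on it."""
    dependents = defaultdict(set)
    for application, technologies in depends_on.items():
        for technology in technologies:
            dependents[technology].add(application)
    return dict(dependents)


def impacted_by(depends_on, technology):
    """List the applications affected if one technology is retired or upgraded."""
    return sorted(dependents_by_technology(depends_on).get(technology, set()))


print(impacted_by(DEPENDS_ON, "Oracle"))  # ['Billing', 'Order Entry']
```

Even a small table like this makes visible which applications would be touched by a change to a single technology, which is the kind of insight the as-is inventory is meant to surface.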

Listening to the Stakeholders

The second set of inputs will come primarily from the business or user side of the organization. This will include interviews to establish not only what exists, and especially what exists that doesn’t work well, but also what is envisioned: what the organization wishes to live into, and where its existing information systems potentially hamper it.

This is a very intense active listening activity in that we do not expect end users to be able to articulate architectural themes, needs, requirements, or anything of that nature. However, they do have the key raw material that is needed to construct the target architecture, which can be drawn out in conversation. The end product of this activity combined with what’s known from the current architecture is the first draft of what is called the target architecture or the to-be architecture. At this point the major themes, or styles if you will, are described and decided upon. It’s very much as if at this point the client is choosing between new urbanism or neo-modern styles of architecture.

Again, unless you know going in the style of architecture that will be required, it is best to work with architects who have a range of styles and capabilities. As the architects conceive of the overall architecture, they shift into a collaborative and consensus building mode with senior management and any other stakeholders that are essentially the owners of the long-term architecture. This process is not merely laying out the blueprints and describing them but is a longer ongoing process of describing themes, trade-offs, economics, manageability, and the like; trying out ideas and gathering feedback from the team. Again, active listening is employed to ensure that all concerns are heard and that, in the end, all participants are in agreement as to the overall direction.

Migration Plan

The real art comes in how we get from where we are now to where we want to be. First, the architects and management team need to discuss urgency, overall levels of effort, and related questions. Getting from an existing architecture to a future architecture very often resembles construction on a major arterial highway.

We all know it would be simpler and far more economical to shut down the highway for six months or a year and do all the improvements. But the fact is that most urban arterial roads are in very heavy use and shutting them down for efficient construction is not feasible, so the actual road construction project becomes a very clever series of detours, lane expansion, and, unfortunately, very often reworking the same piece of pavement multiple times. And so it is in the software industry.

Ten years ago, it was fashionable to “bulldoze the slums,” in other words, to launch massive projects that would essentially start from a green field and build all new systems that the owners could move into. There have been multiple problems with this approach over the years, the first being that the sheer size of these projects has had a dramatically negative impact on their success.

The second problem is that we are all, if you will, living in those slums; we are running our businesses with the existing systems and it is very often not feasible to tear them down in a wholesale fashion. So, one of the duties of the architects is to construct a series of incremental projects, each of which will move the architecture forward. At the same time, many, if not all, should be designed to provide some business benefit for the project itself.

This is easier said than done, but very often there is a backlog of projects that needs to be completed. These projects have ROI (return-on-investment) that has been documented and it is a matter of, perhaps, re-scoping, retargeting, or rearranging the project in a way that not only achieves its independent business function and return-on-investment but also advances the architecture.

Balancing Short Term and Long-Term Goals

This is an area that in the past has been sorely neglected. Each project has come along focused very narrowly on its short-term payoff. The net result has been a large series of projects that not only neglects the overall architecture but also continues to make it worse and worse, such that each subsequent project faces higher and higher hurdles of development productivity that it must overcome in order to achieve its payback.

When an overall plan and sequencing of projects has been agreed upon, which by the way often takes quite a significant amount of time, the plan is ready to be converted into what is more normally thought of as a long-range information system plan, where we begin to put high-level estimates on projects, define some of the resources, and the like.

That, in a nutshell, is what the software architect does. At the completion of this process, the client knows with a great deal of certainty where he’s headed architecturally, why his destination architecture is superior to the architecture he currently has, and the benefits that will accrue once he is in that architecture.

And finally, he has a road map and timeline for getting from his current state to the desired state.

How to Select a Software Architect

Originally published as How to Select a Software Architect on August 31, 2010

Selecting a Software Architect is an important decision, as the resulting architecture will impact your information systems for a long time.

We present a few thoughts for you to keep in mind as you consider your decision. Assuming you have come to the conclusion that you can use the services of a software architect, the next question becomes, how do you select one? We’re going to suggest three major areas as the focus of your attention:

  • Experience
  • Prejudice
  • Chemistry

Experience

By experience we are not referring to the number of years of specific experience with a given technology. For instance, suppose you did “know” somehow that your new architecture was going to be Java J2EE-based (though, by the way, a decision like that would normally be part of the architectural planning process, and it would often be detrimental to “know” this information going in). Even then, it would not necessarily be beneficial to base your selection of an architect on it.

This would be akin to selecting your building architect based on the number of years of dry walling or landscaping experience they had. At the same time, you certainly do not want inexperienced architects. The architectural decisions are going to have wide-ranging implications for your systems for years to come, and you want to look for professionals who have great depth of knowledge and breadth of experience, across different companies and even different industries, that they can draw upon to form the conclusions that will be the basis for your architecture.

Prejudice

By prejudice we mean literally prejudgment. You would like to find an architect as free as possible from pre-determined opinions about the direction and composition of your architecture. There are many ways that prejudice creeps into architecture, some subtle and some not so subtle. For starters, hardware vendors and major software platform vendors have architects on staff who would be glad to help you with your architectural decisions. Keep in mind that most of them either overtly or perhaps more subtly are prejudiced to create a target architecture that prominently features their products, whether or not that is the best solution for your needs.

Other more subtle forms of prejudice come from firms with considerable depth of experience in a particular architecture. You may find firms with a great deal of experience with Enterprise JavaBeans or Microsoft Foundation Classes, and in each case, it would be quite unusual to find them designing an architecture that excluded the very things that they are familiar with. The final source of prejudice is with firms who use architecture as a loss leader to define and then subsequently bid on development projects. You do not really want your architecture defined by a firm whose primary motive is to use the architecture to define a series of future development projects.

Chemistry

The last criterion, chemistry, is perhaps the most elusive. We’re considering chemistry here because a great deal of what the architect must do to be successful is to elicit needs, aspirations, and constraints from the client, the client’s employees, potentially their customers and suppliers, and from existing systems, and to hear all of that in full fidelity. For this to work well there must be a melding of the cultures, or at least an ability to communicate forthrightly and frankly about these issues. Really the only way to make this sort of determination is through interview and reference. The selection of the software architect is an important decision for most companies, as the creation of the architecture is likely to be the single most important decision that will affect future productivity as well as the ability to add and change functionality within a system.
