July 15, 2010
I've come across a couple of articles recently on either the difficulty or the impossibility of constructing a Return on Investment (ROI) analysis on enterprise architecture projects, which, of course, we would take to also include service oriented architecture projects. While I agree with many of the authors' points and frustrations, particularly in organizations with a policy of ROI-based project approval, I don't completely agree with the assessment that ROI is not possible. The difficulties in constructing an ROI for architectural projects lie in two major areas: timeframe and metrics.
In many organizations, ROI is a euphemism for quick-fix or short-term payout projects. It's not so much that an architectural project lacks a return on investment, but that the investment horizon is typically considerably longer than befits a quick-fix project. Organizations that overemphasize ROI are typically addicted to short-term solutions. While a great deal of good may be done under a shorter time horizon, such projects often have a dark side: the quick fix is implemented at the expense of the architecture. What remains after a long period of time is often an architecture of many arbitrary pieces. As a result, each succeeding project becomes harder to accomplish and maintenance costs go up and up. However, these weren't factors in the ROI for the project, and as a result the project approval scooted on through. Indeed, if there were no return on investment for an architectural project, one could argue that we shouldn't do it at all: if we were going to spend more than we would ever save, we should just save ourselves the headache. However, I and most people who give it serious thought recognize that there is a great deal of payoff in rationalizing one's architecture, and that the payout occurs over a long period of time in increased adaptability, reduced maintenance, additional reuse, etc.
The ROI Denominator
Another piece of the ROI equation is the denominator. Return on investment is the benefit, or return, you got divided by the investment. One of the difficulties in justifying some architectural projects is that the denominator is just too large: the cost of some architectural projects will overwhelm and outweigh any conceivable benefit. However, these projects do not have to be large and costly to be effective. Indeed, we find that the best combination is a fairly modest architectural planning project, which then uses monies that would have been spent anyway, supplemented with small bits of seed money, to grow the architecture in a desired direction incrementally over a long period of time. Not only does this reduce the denominator; more importantly, it reduces the risk, because with any large infrastructure-driven architectural project there's not only a large investment to recoup but also the risk that the architecture might not work at all.
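To make the effect of the denominator concrete, here is a minimal sketch in Python. It uses the simple ratio described above (benefit divided by investment); some organizations instead use net benefit divided by investment. The function name `roi` and all dollar figures are hypothetical, chosen only for illustration.

```python
def roi(total_benefit: float, investment: float) -> float:
    """Simple ROI ratio as described in the text: benefit divided by investment."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return total_benefit / investment


# Hypothetical figures, for illustration only.
benefit = 3_000_000.0                 # assume the same benefit in both cases
big_bang_investment = 5_000_000.0     # large infrastructure-driven project
incremental_investment = 750_000.0    # modest planning project plus seed money

print(f"big bang ROI:    {roi(benefit, big_bang_investment):.2f}")
print(f"incremental ROI: {roi(benefit, incremental_investment):.2f}")
```

The sketch deliberately leaves out the risk that the architecture might not work at all; weighting the benefit by a probability of success would widen the gap further.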
Getting the Right Metrics
The final problem, even after 50 years of software development, is that we're still not routinely collecting the metrics we need to make a rational decision in this area. Sure, we have a few very gross metrics, such as percent of IT spending to sales or proportion of new development versus maintenance, and we have some low-level metrics, such as cost per line of code or cost per function point. But neither kind is much help in shining light on the kinds of changes that will make a great economic difference. To do this, we now need to look at the numerator of the ROI equation. The numerator will consist primarily of benefits expressed as cost savings relative to the current baseline. Within that numerator we will divide the activity into the following categories:
- Operational costs
- User costs
- Maintenance costs
- Opportunity costs
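As a rough sketch of how these four categories could feed the numerator, the Python below subtracts a projected cost profile from the current baseline, category by category. The class `AnnualCosts`, the function names, and every figure are assumptions made for illustration; they are not taken from any real baseline.

```python
from dataclasses import dataclass, fields


@dataclass
class AnnualCosts:
    """Yearly cost in each of the four categories listed above."""
    operational: float
    user: float
    maintenance: float
    opportunity: float


def savings_by_category(baseline: AnnualCosts, projected: AnnualCosts) -> dict:
    """Annual saving per category: baseline cost minus projected cost."""
    return {
        f.name: getattr(baseline, f.name) - getattr(projected, f.name)
        for f in fields(baseline)
    }


def roi_numerator(baseline: AnnualCosts, projected: AnnualCosts, years: int) -> float:
    """Total undiscounted savings over the chosen time horizon."""
    return sum(savings_by_category(baseline, projected).values()) * years


# Hypothetical baseline and target profiles.
baseline = AnnualCosts(4_000_000.0, 1_500_000.0, 2_000_000.0, 500_000.0)
projected = AnnualCosts(3_400_000.0, 1_100_000.0, 1_500_000.0, 300_000.0)

for category, saving in savings_by_category(baseline, projected).items():
    print(f"{category:12s} {saving:12,.0f}")
print(f"five-year numerator: {roi_numerator(baseline, projected, 5):,.0f}")
```

Keeping the categories separate, rather than summing them immediately, shows which kind of cost a proposed architectural change is actually expected to move.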
Operational costs are the total costs required to keep the information systems running. This includes hardware costs (leases and amortization of purchase costs) and software licensing costs (one-time costs and especially annual support costs). It also includes all the administrative costs that must be borne to support the systems, such as backup, indexing, and the like. In addition, we should include the costs of direct support, such as help desk support, which is required just so that users can continue to use the software systems as intended. This category of costs is relatively easy to get in the aggregate: from your financial statements you probably have a pretty good idea of these costs in total. In many organizations it's more difficult to break these costs into the smaller units where the investment analysis really occurs. This means breaking them down into application-level costs as well as cost per unit of useful work, which we will talk about more toward the end of this article.
An enterprise architecture can very often have dramatic impacts on operational costs. Some of the more obvious include shifting production away from inappropriate hardware and operating system platforms. For example, in many cases mainframe data centers are cost disadvantaged relative to newer technology. Very often the architecture can also help with rationalizing vendor licenses by consolidating database management systems or application vendors, and there can be considerable savings in that area. An architecture project may be targeted at changing an unfavorable outsourcing arrangement or, conversely, introducing outsourcing if it's economically sensible. The longer-run goal of service oriented architecture is to make as many as possible of the provided services into commodities, with the specific intention of driving down the cost per unit of work. Without the architecture in place, your switching costs are high enough that, for most shops, it's very difficult to ratchet down from higher to lower cost suppliers.
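The breakdown into application-level costs and cost per unit of useful work can be sketched as follows. What counts as a "unit of work" (an order processed, an invoice issued) has to be defined per application; the application names, costs, and volumes here are invented for illustration.

```python
def cost_per_unit_of_work(annual_cost: float, units_of_work: int) -> float:
    """Annual operational cost of an application divided by the work it performed."""
    if units_of_work <= 0:
        raise ValueError("units_of_work must be positive")
    return annual_cost / units_of_work


# Hypothetical applications: (annual operational cost, units of work per year).
application_costs = {
    "order_entry": (1_200_000.0, 400_000),
    "billing": (900_000.0, 150_000),
}

unit_costs = {
    name: cost_per_unit_of_work(cost, units)
    for name, (cost, units) in application_costs.items()
}

# List the most expensive applications per unit of work first.
for name, unit_cost in sorted(unit_costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {unit_cost:.2f} per unit of work")
```

Tracked over time, a figure like this is what would show whether moving a service to a lower cost supplier actually ratcheted the cost down.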
By user costs we mean the amount of time spent by users of the system over and above the absolute bare minimum that would be required to complete a job function. So if a job requires hopping from system to system, navigating, copying data from one system to another or from a report into a system, transcribing, studying, etc., all of this activity is excess cost created by the logistics of using the system. These costs are much harder to gather because they are spread throughout the organization and there is no routine collection of the amount of time spent on these activities versus other activities that are not mediated by a system. Typically, what's needed in this area is some form of activity-based costing, where you audit on a sampling basis how people spend their time and compare that against a "should cost" analysis of the same tasks. Even when the task has been offloaded to the end user, in what's called "self-service," it still may be worthwhile to review how much excess time is being used. In this case, it's not so much a measure of the resources lost from the organization as an indicator that competitors may be able to take advantage of an effort difference and steal customers. Many aspects of service oriented architecture are aimed exactly at this category of costs. Certainly, composite applications, much of system integration, and the like are aimed at reducing the non-value-added time that end users spend with their systems. Implementing service oriented architecture, as well as workflow, business process management, or even process orchestration, can be aimed directly at these costs.
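The sampling comparison against a "should cost" figure might look like the sketch below. It assumes a simple model: the average observed time per task, minus the should-cost time, multiplied by annual task volume and a loaded labor rate. The function name and all numbers are hypothetical.

```python
from statistics import mean


def excess_user_cost(
    observed_minutes: list,
    should_cost_minutes: float,
    tasks_per_year: int,
    hourly_rate: float,
) -> float:
    """Estimate the yearly cost of time spent beyond the should-cost minimum.

    observed_minutes is a sample of audited task durations; the sample mean
    is assumed to be representative of all tasks of this type.
    """
    excess_minutes = max(mean(observed_minutes) - should_cost_minutes, 0.0)
    return excess_minutes * tasks_per_year * hourly_rate / 60.0


# Hypothetical audit: three sampled durations for one task type.
sample = [12, 9, 15]
estimate = excess_user_cost(sample, should_cost_minutes=7.0,
                            tasks_per_year=60_000, hourly_rate=30.0)
print(f"estimated excess user cost per year: {estimate:,.0f}")
```

A real audit would need a much larger sample and some measure of its variability before the estimate could carry weight in an ROI case.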
Maintenance costs are the costs of keeping an information system in working order. They include breakdown maintenance, which is fixing problems that occur in production; preventive maintenance, which is rarely done in the information industry but would include refactoring; and reactive maintenance, which is maintenance required by changes in the environment. This last kind includes changes to the technical environment, such as when an operating system is discontinued and we're forced to modify an application to keep it running, as well as changes in the regulatory environment, where a law changes and we are forced to make our systems comply. I do not include proactive maintenance, that is, maintenance that improves the user or business experience, in this category; I will include it in the opportunity cost category. Maintenance costs are typically a function not so much of the change to be made as of the complexity of the thing to which the change is being applied. Most maintenance to mature systems involves relatively small numbers of lines of code. Especially when we exclude changes that are meant to improve the system, we find fewer and fewer lines of code for any given maintenance activity. That's not to say that maintenance isn't consuming a lot of time; it is. Maintenance very often involves a great deal of analysis to pinpoint either where the operational problem is or where changes need to be made to comply with environmental change. Once the change site is identified, another large amount of analysis needs to be done to determine the impact the change is likely to have. Unfortunately, the trend in the nineties toward larger integrated systems essentially meant a larger scope in which to search for the problem and a larger scope for the change analysis. The other major difficulty with getting metrics on maintenance is that many architectural changes eliminate the cost so effectively that people no longer recognize that they are saving money.
One architectural principle that we used in the mid-nineties was called "riding lightly on the operating environment." We argued that a system should have the smallest possible footprint on the operating system's API. In many ways this is the opposite of the way many applications are built. Many application developers try to get the maximum use out of their operating environment, which makes sense in the short-term development productivity equation, but as we discovered, the fewer points of contact you have with the operating system, the more immune you are to changes in the operating system. As a result, systems we built with that architecture survived multiple operating system upgrades, in many cases without even a recompile and in other cases with only a simple recompile or a few lines of code changed. A well-designed enterprise architecture goes far beyond that in reducing long-term maintenance costs. In the first place, the emphasis on modularity, partitioning, and loose coupling means that there are fewer places to look for the source of a problem, there is less investigation to do for side effects, and any extremely problematic area can just be replaced. In this area, we will likely have to calculate the total cost per environmental change event, such as the cost to upgrade a version of a server operating system or, as recent history showed, the cost when the environment changed at Y2K and two-digit years were no longer sufficient.
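One way to picture "riding lightly on the operating environment" is to route every contact with the environment through a single narrow interface. The Python below is only an illustration of that idea, not a reconstruction of the systems mentioned above; `OperatingEnvironment`, `LocalFiles`, `InMemoryFiles`, and `archive_report` are all hypothetical names.

```python
from typing import Dict, Protocol


class OperatingEnvironment(Protocol):
    """The only environment services this hypothetical application uses."""

    def read_text(self, path: str) -> str: ...

    def write_text(self, path: str, data: str) -> None: ...


class LocalFiles:
    """Adapter for the local file system: the single point of contact."""

    def read_text(self, path: str) -> str:
        with open(path, encoding="utf-8") as handle:
            return handle.read()

    def write_text(self, path: str, data: str) -> None:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(data)


class InMemoryFiles:
    """Stand-in environment, showing that application code need not change."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}

    def read_text(self, path: str) -> str:
        return self._files[path]

    def write_text(self, path: str, data: str) -> None:
        self._files[path] = data


def archive_report(env: OperatingEnvironment, path: str, body: str) -> int:
    """Application logic: stores a report and returns the stored length.

    It depends only on the narrow interface, never on the platform directly.
    """
    env.write_text(path, body)
    return len(env.read_text(path))


print(archive_report(InMemoryFiles(), "report.txt", "quarterly totals"))
```

When the operating environment changes, only the adapter class needs attention, which is where a low cost per environmental change event would come from.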
In the opportunity cost category I'm putting both the current cost to make proactive changes (in other words, the cost to improve the system to deliver better information with fewer keystrokes) and the cost of the opportunity lost by not making a change because it was too difficult. The architecture can drastically improve both of these measures. In the first case, it's relatively straightforward to get the total cost of proactive changes to current systems. However, we likely need to go much beyond that and look at changes by type. For instance, what is the cost of adding additional information into a use case? What is the cost of streamlining three steps in the use case into one? Perhaps someone will come up with a good taxonomy in this area that would give some comparable values across companies. We also have to include the non-IS costs that go into making these kinds of changes. Not only does this include the business analyst's design time, but if it's currently possible for analysts to make a change on their own, we should count that as development time. In the longer run, we expect end user groups to make many of these kinds of changes on their own; indeed, this is one of the major goals of some of the components of an enterprise architecture. The other side, the opportunity lost, is very difficult to measure and can only be estimated through interviews and guesswork. But it's very true that many companies miss opportunities to enter new markets, to improve processes, etc., because their existing systems are so rigid, expensive, and hard to maintain that they simply don't bother. Also in this category are the costs of delay. If a proposed change would have a benefit but people delay, often for years, in order to get the budgetary approval to make the necessary change, this puts off the benefit stream, potentially by years.
With a modern architecture, very often the change requires no capital investment and therefore could be implemented as soon as it's been well articulated.
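The cost of delay can be approximated very simply. The sketch below assumes a constant annual benefit, no discounting, and a fixed planning horizon, so every year spent waiting for approval removes one year of benefit from that horizon. The function name and figures are hypothetical.

```python
def cost_of_delay(annual_benefit: float, delay_years: float, horizon_years: float) -> float:
    """Benefit forgone within the planning horizon because the change started late.

    Assumes a constant, undiscounted annual benefit.
    """
    if delay_years < 0 or horizon_years < 0:
        raise ValueError("delay_years and horizon_years must be non-negative")
    lost_years = min(delay_years, horizon_years)
    return annual_benefit * lost_years


# Hypothetical change worth 200,000 per year, delayed two years by budgeting.
print(f"{cost_of_delay(200_000.0, delay_years=2, horizon_years=5):,.0f}")
```

If a change needs no capital approval and can start as soon as it is articulated, `delay_years` approaches zero and this cost disappears.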
Putting It All Together
An enterprise architecture project can have a substantial return on investment. Indeed, it's often hard to imagine things that could have a larger return on investment. The real question is whether the organization feels compelled to calculate the ROI. Most organizations that succeed with enterprise architecture initiatives, in my observation, have done so without rigorous ROI analyses. For those that need the comfort of an ROI, there is hope. But it comes at a cost. That cost is the need to get, in the manner we've described in this white paper, a rigorous baseline of the current costs and what causes them. Armed with this you can make very intelligent decisions about changes to your architecture that are directly targeted at changing these metrics. In particular, I think people will find that the change that has the biggest multiplier effect is literally changing the cost of change.