White Paper: How long should your URIs be?

This applies to URIs that a system needs to generate when it finds it needs to mint a new resource.

I’ve been thinking a lot about automated URI assignment lately. In particular, the scheme we’ve been using (relying on the database to maintain a “next available number” and incrementing it) is fraught with potential problems. However, I really don’t like the guid style either, with its large, unwieldy, and mostly unnecessarily long strings.

I did some back-of-the-envelope thinking and came up with the following recommendations. After the fact, I decided to search the web and see what I could find. I found some excellent material, but not this in particular, nor anything that seemed to rule it out. Of note, Phil Archer has some excellent guidelines here: http://philarcher.org/diary/2013/uripersistence/. His scope is much broader than mine, but it is very good. He even has “avoid auto increment” as one of his top 10 recommendations.

The points in this paper don’t apply to hand-crafted URIs (as you would typically have for your classes and properties, and even some of your hand-curated special instances); they apply to URIs that a system needs to generate when it must mint a new resource. A quick survey of the approaches and who uses them:

  • Hand curate all—dbpedia essentially has the author create a URI when they create a new topic.
  • Modest-sized number—dbpedia page IDs and page revision IDs look like next available number types.
  • Type+longish number—yago has URIs like yago:Horseman110185793 (class plus up to a billion numbers; not sure if there is a next available number behind this, but it kind of looks like there is).
  • Guids—cyc identifies everything with a long string like Mx4rvkS9GZwpEbGdrcN5Y29ycA.
  • Guids—Microsoft uses 128-bit guids for identifying system components, such as {21EC2020-3AEA-4069-A2DD-08002B30309D}. The random version uses 6 bits to indicate that it is random and therefore has a namespace of about 10^36, thought to be large enough that the probability of generating the same number twice is negligible.

Being a pragmatist, I wanted to figure out if there is an optimal size and way to generate URIs.
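One back-of-the-envelope tool for comparing these schemes is the birthday bound: if identifiers are drawn at random from a space of N possible values, the probability of at least one collision after minting k of them is roughly 1 - exp(-k(k-1)/(2N)). The sketch below compares a few suffix lengths; the alphabet, lengths, and mint counts are illustrative assumptions, not the recommendations from the full paper:

```python
import math

def collision_probability(minted: int, id_length: int, alphabet: int = 36) -> float:
    """Approximate birthday-bound probability of at least one collision after
    `minted` random identifiers of `id_length` symbols drawn from an alphabet
    of `alphabet` characters (36 = the digits plus a-z)."""
    space = alphabet ** id_length
    return 1.0 - math.exp(-minted * (minted - 1) / (2.0 * space))

# Illustrative comparison: random suffixes of a few lengths vs. the 122 random
# bits of a version-4 guid, after minting ten million resources.
for length in (6, 9, 12):
    print(f"{length}-char base-36 suffix:", collision_probability(10_000_000, length))
print("122-bit guid:", collision_probability(10_000_000, 122, alphabet=2))
```

Run with these assumed numbers, a 6-character suffix is almost certain to collide, a 12-character suffix is effectively safe, and the guid is overkill by many orders of magnitude; that is the kind of trade-off the question in this paper is about.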


The Problem with Default Values

Are there hidden problems with default values in software?

Virtually all information systems have “default values.” We put them in our systems to make things easier for the end users as well as for the system itself. As we will investigate in this white paper, there are many hidden problems with default values, some of which first surface in the edge cases. But as we begin to reflect on them, we realize that these problems infest all of our systems.

Default values and data capture

Most of our use of default values is tied in with the capture of data. If we’re capturing data from another electronic source, we may know categorically that everything from this source gets a transaction code of B-17. In that case, B-17 is a default value for transactions from that source. This rarely creates a problem. The area where default values create problems is exactly the area where they create benefits. Default values are very often used as an aid to data entry. If we can provide users with the data they most often would have entered anyway, then we have done them a favor and sped up data entry.

The most interesting distinction is between default values that a user actually saw and therefore could have changed, might have seen, or did not see at all. As we transition from mainframe to web-based systems, and as we adopt more portal-oriented technology for our front ends, we will very often not know whether the end user saw and agreed with an individual piece of default data, because the construction of a user interface may have excluded it, put it on a tab they didn’t visit, or put it in a scroll region they didn’t see. We have to keep these considerations in mind because of what it implies when a user “accepts” a default value. Let’s take a look at some of the types of defaults and where they might go amiss.
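One way to keep this distinction from being silently lost is to record, alongside each captured value, whether the user typed it, saw it as a default they could have changed, or never saw it at all. Here is a minimal sketch of that idea; the class and field names are hypothetical, not from any particular system:

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    USER_ENTERED = "user typed the value"
    DEFAULT_SEEN = "default shown on screen; user could have changed it"
    DEFAULT_UNSEEN = "default applied on a tab or scroll region the user never saw"

@dataclass
class CapturedValue:
    field: str
    value: object
    provenance: Provenance

# Later analysis can separate values a user actually confirmed from
# values that merely passed through as defaults.
record = [
    CapturedValue("transaction_code", "B-17", Provenance.DEFAULT_UNSEEN),
    CapturedValue("eye_color", "brown", Provenance.DEFAULT_SEEN),
]
confirmed = [v for v in record if v.provenance is Provenance.USER_ENTERED]
```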

Default types

One of the most useful defaults that a system can provide is the current date. This may be used as a default for when a piece of information was posted to the system, in which case by definition the default is almost always exactly right. We can also use the current date as a default value for when an event occurred. This is a useful default if events are being captured nearly contemporaneously. The next type of default is the most commonly occurring category value. Maybe our portfolio management system categorizes information assets as being either applications or infrastructure. Since we have a lot more applications than infrastructure, it may make sense to default that field to “application.” It will save the users, on average, a fair amount of time not to have to fill that in. The third major type of default value is the implicit clone. An implicit clone is a new record based on existing instances or values that are deemed to be valid or representative. So in QuickBooks, every time we enter another credit card charge, it assumes it will be the same credit card company and the same date as the last one.

Where the problem lies

The problem comes when we analyze the data. Did the user actually supply the data, or did they take advantage of the fact that it was a default and let it go through? For instance, let’s say we capture eye color in our database, and let’s say that statistically 50% of the population has brown eyes, so we make that our default value. We also know that statistically 10% of the population has blue eyes. When we later survey the data in our database, we find that only five percent of our population has blue eyes and 70% has brown eyes. How much can we trust this data? Did eye color really skew that way, or did the data entry people discover that it was easier to go with the default, since there didn’t seem to be much downside?
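A crude after-the-fact check is to compare the distribution in the database against a known baseline; when the default value is heavily over-represented, the field deserves suspicion. A sketch using the made-up eye-color numbers above:

```python
# Baseline population rates vs. what we observe in the database
# (these are the made-up numbers from the text).
baseline = {"brown": 0.50, "blue": 0.10}
observed = {"brown": 0.70, "blue": 0.05}
default_value = "brown"

for color in baseline:
    ratio = observed[color] / baseline[color]
    # Flag the default when it shows up noticeably more often than expected.
    flag = "  <- default, over-represented" if color == default_value and ratio > 1.2 else ""
    print(f"{color}: observed/expected = {ratio:.2f}{flag}")
```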

Conclusion

The use of default values is a two-edged sword. While it offers some convenience on data entry, as we build more and more interesting systems we may find that the presence of default values confuses us as much as it helps us, because it increases our uncertainty about what the user really did.

Written by Dave McComb

Three Value Logic: The Case for an ‘Unknown’ Value in Software Logic

Relational databases, relational theory, relational calculus, and predicate logic all rely on a two-valued truth. That is, a given proposition or predicate is either true or false. The fact that the results of a query can be proven correct rests on the fact that it can be established that each part of a query, each piece of a predicate, is either true or false.

The Customer ID in the order table either equals the ID in the Customer table or it doesn’t. The value for the customer location either equals the literal in the query or it doesn’t. We don’t think about this much. It works and, as pragmatists, we use it. However, there are already many situations where two-valued logic is a hindrance and we have to find subtle workarounds to its limitations. As our systems become more dynamic and the reach of our queries becomes more global, we will bump up against these limitations more and more often. In this paper, we will discuss three-valued logic and just a few of its implications for systems building.

Three-value logic

The three values are: true, false, and don’t know. Let’s use an example from one of our clients. This client wishes to find patients in their patient database who are eligible for a particular clinical drug trial. The drug trial has very specific criteria for eligibility. Let’s say you have to be between 21 and 50 years of age, not pregnant, and have not taken inhaled steroids for at least six months. Real drug trial eligibility criteria are typically more complex than this, but they are structurally similar.

There are two problems with doing this query against our patient database. The first is that we’ve likely expressed these concepts in different terms in our database. However, we can eventually resolve that. But the second is, how do we deal with missing data in our database? The simple answer is to make all data required. However, in practice this is completely unworkable. There are small amounts of data that you might make required for some purposes. You might be able to get a birth date for all your patients.

Probably you should. However, if you haven’t seen a patient for several months, you might not know whether they are pregnant, and unless you ask them or you prescribed the aerosol inhaler yourself, you likely won’t know whether they have taken inhaled steroids. And this is the first place where two-valued logic falls down. We are forced to assume that the “absence of evidence” is equivalent to the “evidence of absence”; that is, if we do not have a record of your steroid use, then you haven’t used steroids.

But it doesn’t take long to realize that this is just not so: any database represents only a tiny subset of the total data about, and the state of, any given object at any point in time. The first thing that three-valued logic introduces is the concept of “don’t know.” We are dividing the result set of our query into three populations.

First, there will be some people for whom we know all three predicates are true. We know their date of birth and therefore know their age is in range; we know, for instance, that they are male and therefore infer they are not pregnant (we’ll have to discuss in more detail how an inference like this can be converted to a truth value); and if we happen to have recently asked whether they have used an inhaled steroid in the last six months and they answered in the negative, we take those three pieces of data and assign that individual to the set of people who truthfully match all three parts of the predicate.

It’s even easier to put someone in the category of ineligible, because a false value on any one of the three predicates is sufficient to exclude them from the population. So underage, overage, pregnant, or the existence of a record of dispensing an inhaled steroid medication would be sufficient to exclude people. However, this very likely leaves us with a large group of people in the don’t know category.
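One way to make this partitioning concrete in code is to let each predicate evaluate to true, false, or unknown, and combine them with Kleene’s three-valued AND: false dominates, unknown comes next, and the result is true only when every conjunct is known to be true. A minimal sketch, with the patient fields as illustrative assumptions:

```python
from typing import Optional

def kleene_and(*values: Optional[bool]) -> Optional[bool]:
    """Three-valued AND: any False wins, then any unknown (None), else True."""
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True

def eligible(age: Optional[int], pregnant: Optional[bool],
             inhaled_steroids_6mo: Optional[bool]) -> Optional[bool]:
    age_ok = None if age is None else (21 <= age <= 50)
    not_pregnant = None if pregnant is None else (not pregnant)
    no_steroids = None if inhaled_steroids_6mo is None else (not inhaled_steroids_6mo)
    return kleene_and(age_ok, not_pregnant, no_steroids)

print(eligible(35, False, False))  # True:  clearly eligible
print(eligible(17, None, None))    # False: one failing predicate is enough to exclude
print(eligible(35, None, False))   # None:  the "don't know" population
```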

Don’t know and inference

The don’t know category is the most interesting and in some ways the most productive of the categories. If you’re looking for clinical trial members, certainly those that match on all the criteria are the “low-hanging fruit.” But the fact is that this set is generally far too small to satisfy the recruitment requirements, and we are forced to deal with the large and imperfectly known set of patients who might be eligible.

By the way, this thinking is not restricted to an exotic field like clinical trial recruiting. I’m using this example because I think it’s a clear and real one, but any application with a search involved, including a web-based search, has an element of this problem. That is, there are many documents or people or topics that “may” be relevant, and we won’t know until we investigate further. The first line of attack is to determine what can be known categorically. We already know that we do not have complete information at the instance level; that is, for a particular patient we do not have the particular data value we need to make the evaluation.

But there are some things that we can know with a great deal of certainty at a categorical level. Earlier we alluded to the fact that we can pretty well assume that a male patient is not pregnant. There are many ways to express and evaluate this, and we will not go into them here except to say that this style of evaluation is different from pure predicate evaluation based on the value of the instance data. Perhaps the most important difference is that as we move off of implications that are near certainties, such as the non-pregnant male, we move into more probabilistic inferences. We would like to be able to subdivide our population into groups that are statistically likely to be included once we find their true values.

For instance, we may know statistically that at any point in time women between the ages of 21 and 30 have a 5% chance of being pregnant and women between the ages of 40 and 50 have a 2% chance. (Those are made-up numbers, but I think you can see the point.) Extending this logic over all the pieces of data in our sample, such as the likelihood that a given individual has taken an inhaled steroid (adjusted for medical condition, since asthma patients are much more likely to have done so than the population at large), could give us a stratified population with a likelihood of a match.
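To make the stratification arithmetic concrete: under an independence assumption (a simplification), the likelihood that a “don’t know” patient would turn out to match is just the product of the probabilities of the unknown predicates, using whatever we do know to pick the right rate. A sketch using the made-up numbers above (the steroid rates are additional made-up assumptions):

```python
# Made-up rates: chance of pregnancy by age band, and an assumed chance of
# recent inhaled-steroid use by asthma status.
pregnancy_rate = {"21-30": 0.05, "40-50": 0.02}
steroid_rate = {"asthma": 0.40, "no_asthma": 0.03}

def match_likelihood(age_band: str, has_asthma: bool) -> float:
    """Probability that an otherwise-unknown female patient in this band would
    satisfy both 'not pregnant' and 'no inhaled steroids' (independence assumed)."""
    p_not_pregnant = 1.0 - pregnancy_rate[age_band]
    p_no_steroids = 1.0 - steroid_rate["asthma" if has_asthma else "no_asthma"]
    return p_not_pregnant * p_no_steroids

print(round(match_likelihood("21-30", False), 3))  # ~0.92: worth pursuing
print(round(match_likelihood("40-50", True), 3))   # ~0.59: lower priority stratum
```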

It’s a bit of a stretch but this is somewhat analogous to some of the techniques that Google employs to improve the percentage chance that the documents they return are the documents we’re interested in. They don’t use this technique – they use techniques based on references from authoritative sites and proximity of words to each other and the like – but the result is somewhat similar.

Uncertainty and the cost of knowing

The next thing that is inescapable if you pursue this is that eventually we will have to turn our uncertainty into an acceptable level of certainty. In virtually all cases, this involves effort. What we are looking for is what can give us the best result with the least effort. As it turns out, acquisition of information can very often be categorized based on the process that will be required to acquire the knowledge.

For instance, if a piece of data we need has to do with recent blood chemistry, our effort is great; we must schedule the patient to visit a facility where we can draw some blood, and we have to send that blood to a lab to acquire the information. If the information is about past blood chemistry and we merely have to find and pull the paper chart and look it up, that’s considerably easier, less expensive, and faster. If we can look it up in an electronic medical record, better yet. In many cases, it means we have to go to the source: if we have to call and ask someone a question, there is a cost in time.

If we can send an e-mail with some reasonable likelihood of response, there is a lower cost. What we would like to do is stratify our population based on the expected cost to acquire information that is most likely to make a difference. In other words, if there are dozens of pieces of data that we need confirmed in order to get to a positive yes, then the fact that any one of those pieces of data is inexpensive to acquire is not necessarily helpful. We may need to stratify our search by pieces of data that could either rule in or rule out a particular individual.
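One way to operationalize this is to score each unknown piece of data by its acquisition cost and by how likely it is to settle the question, then work the cheapest, most decisive items first. A sketch with entirely hypothetical costs and probabilities:

```python
# Hypothetical acquisition channels, costs, and chances of ruling a patient out.
unknowns = [
    {"item": "pregnancy status", "cost": 5,   "p_rules_out": 0.05},  # phone call
    {"item": "steroid use",      "cost": 1,   "p_rules_out": 0.10},  # email / EMR lookup
    {"item": "blood chemistry",  "cost": 200, "p_rules_out": 0.30},  # lab visit
]

# Rank by "decisiveness per unit cost": cheap items that are likely to rule
# someone in or out get worked first.
ranked = sorted(unknowns, key=lambda u: u["p_rules_out"] / u["cost"], reverse=True)
for u in ranked:
    print(u["item"], round(u["p_rules_out"] / u["cost"], 4))
```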

Conclusion

In this white paper we are not dealing with the technology of exactly how you would apply three-valued logic, or with any one of its many, many implications. What I wanted to do was create some awareness of both the need for this and the potential benefits of incorporating it into an overall architecture.

Written by Dave McComb

Response Time Quanta

How do we perceive software response time? (I’m indebted to another author for this insight, but unfortunately I cannot credit him or her because I’ve lost the reference and can’t find it either in the pile of papers I call an office or on the Internet. So, if anyone is aware whose insight this was, please let me know so I can acknowledge them.)

Basic Thesis

  • In most situations, a faster response from a system (whether it is a computer system or a human system) is more desirable than a slower one.
  • People develop strategies for dealing with their experience of and expectation of response times from systems.
  • Attempts to improve response time will not even be perceived (and therefore will be effort wasted) unless the improvement crosses a threshold to where the user changes his or her strategy.

These three observations combine to create a situation where the reaction to response time improvement is not linear: a 30% improvement in response time may produce no effect, while a 40% improvement may have a dramatic effect. It is this “quantum-like” effect that gave rise to the title.

First Cut Empirical Model – No Overlaps

Our first cut of the model lumps each response into a non-overlapping range. As we’ll observe later, it is not likely that simple; however, it is surprising how far you can get with this.

For each quantum, the list below gives its approximate response time, examples, the typical user perception, and the user’s response or strategy.

  • Simultaneous (less than 1/10 second). Examples: mouse cursor movement on a fast system, selection highlighting, turning on an incandescent light bulb. Perception: users believe that the two things are the same thing, that there is no indirection; moving the mouse is moving the cursor, the click directly selects the item, the switch turns on the light. Strategy: transparency; users are not aware there is an intermediary between their action and the result.
  • Instant (1/10 to 1/2 second). Examples: scrolling, dropping a physical object. Perception: a barely perceptible gap between stimulus and response, but just enough to realize the stimulus causes the effect. Strategy: users are aware but in control; their every action is swiftly answered with a predictable response; no strategy required.
  • Snappy/Quick (1/2 to 2 seconds). Examples: opening a new window, pulling down a drop-down list, turning on a fluorescent light. Perception: must pay attention, "did I click that button?" (Have you ever spun the knob on a bedside lamp in a hotel, thinking it wasn't working, when you were just too fast for the fluorescent?) Strategy: a brief pause to prevent initiating the response twice; requires conscious attention to what you are doing, which distracts from the direct experience.
  • Pause (2 to 10 seconds). Examples: a good web site on a good connection; the time for someone to orally respond to a question. Perception: I have a few seconds to focus my attention elsewhere; I can plan what I'm going to do next, start another task, etc.; frustration if it's not obvious the activity is in progress (hourglass needed). Strategy: think of or do something else; many people now click on a web link and then task-switch to another program, look at their watch, or something else; this was the time when data entry people would turn the page to get to the next document.
  • Mini task (10 to 90 seconds). Examples: launching a program, shutting down, asking someone to pass something at the dinner table. Perception: this task is going into the background until it is complete; time to start another task (but not multiple other tasks); time for a progress bar. Strategy: you're obligated to do something else to avoid boredom; pick up the phone, check your to-do list, engage in conversation, etc.
  • Task (90 seconds to 10 minutes). Examples: a long compile, turning on your computer, rewinding a video tape. Perception: not only do I start another task of comparable length, I also expect some notification that the first task is complete (a dialog box, the click the video makes). Strategy: the user starts another task, very often changing context (leaving the office, getting on the phone, etc.); however, the second task may be interruptible when the first task finishes.
  • Job (10 to 60 minutes). Examples: a very long compile, doing a load of laundry. Perception: the job is long enough that it is not worth hanging around until it is complete. Strategy: plan ahead; do not casually start a process that will take this long until you have other filler tasks planned (lunch, a meeting, something to read, etc.); come back when you're pretty sure it will be done.
  • Batch process (1 to 12 hours). Examples: an old-fashioned MRP or large report run, an airplane flight. Perception: deal with the schedule more than monitoring the actual event in progress. Strategy: schedule these.
  • Wait (1/2 to 3 days). Examples: a response to email, a reference-check call back, dry cleaning. Perception: I potentially have too many of these at once; I'll lose track of them if I don't write them down. Strategy: to-do lists.
  • Project (3 days to 4 months). Examples: a software project, a marketing campaign, gardening. Perception: this is too long to wait to find out what is happening. Strategy: active statusing at periodic intervals.

My contention is that once a user recognizes a situation and categorizes it into one of these quanta, they will adopt the appropriate strategy. For many of the strategies they won’t notice if the response time has improved, until and unless it improves enough to cause them to change strategies. Getting a C++ compile time down from 4 minutes to 2 minutes likely won’t change anyone’s work habits, but going to a Pause or Snappy turnaround, like in a Java IDE, will. In many cases the strategy obviates any awareness of the improvement. If I drop my car at the car wash before lunch and pick it up afterward, I’ll have no idea if they improved the throughput such that what used to take 40 minutes now only takes 15. However, a drive-through that only takes 10 minutes might cause me to change how I do car washes.
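To make the quantum effect concrete, here is a small sketch that classifies a response time into its band, using the approximate boundaries from the list above, and reports whether an improvement crosses into a faster band (the only kind of improvement the user is likely to notice):

```python
# Approximate band boundaries (in seconds) taken from the list above.
QUANTA = [
    ("Simultaneous", 0.1), ("Instant", 0.5), ("Snappy", 2), ("Pause", 10),
    ("Mini task", 90), ("Task", 600), ("Job", 3600), ("Batch process", 43200),
    ("Wait", 259200), ("Project", float("inf")),
]

def quantum(seconds: float) -> str:
    """Return the name of the response-time band a duration falls into."""
    for name, upper_bound in QUANTA:
        if seconds < upper_bound:
            return name
    return QUANTA[-1][0]

def improvement_matters(old_seconds: float, new_seconds: float) -> bool:
    """Per the thesis above, an improvement registers only if it crosses a band."""
    return quantum(new_seconds) != quantum(old_seconds)

print(improvement_matters(240, 120))  # False: 4 min -> 2 min is still a "Task"
print(improvement_matters(240, 8))    # True: the compile now lands in "Pause"
```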

Overlapping Edges

While I think the quantum effect is quite valid, I don’t believe the categories are as precise as I’ve suggested, and I think they may vary as someone moves up and down the hierarchy. For instance, a 2.5-second response time may in some contexts be considered snappy.

Implications

I think this has implications for systems design as well as business design. The customer-facing part of a business presents a response time to the customer. The first implication is that any project (software, hardware or network improvement, or business process reengineering) should have a response time goal, with a reason for it, just as valid as any other requirement of the project. Where an improvement is desired, it should cross at least one quantum threshold, and the benefit ascribed to doing so should be documented. IBM made hay in the ’70s with studies showing that dramatic productivity gains from sub-second response time on their systems more than made up for the increased cost of hardware. What was interesting was that the mathematical savings from the time shaved off each transaction weren’t enough to justify the change; rather, users worked with their systems differently (i.e., they were more engaged) when the response time went down. Some implications:

  • Call center response time: if you expect the call will be a “job” (more than 10 minutes), you will plan your call much more carefully.
  • Online ordering: when products arrive first thing the next morning and people expect that, they deal with ordering by setting up reminders that something will arrive.
  • Installation programs: unless an install is a “mini task” and can be done in-line (like getting a plug-in), you need to make sure that all the questions can be answered up front so the install can then run in the background. Many writers of installation programs wrongly believe that asking the user questions throughout the installation process will make the installation feel snappy. Hello: nobody thinks that; they expect it to be a “task” and would like to turn their attention elsewhere. However, if they do something else, come back, and find the install stopped because it was waiting for more input, they get pissed (it was supposed to be done when they got back to it).

Written by Dave McComb

Application Scope: A Fractal View

The scope of an application is one of those topics that seems to be quite important and yet frustratingly slippery.

There are several very good reasons for its seeming importance. One, the scope of an application eventually determines its size and, therefore, the cost to develop and/or implement it. It’s also important in that it defines what a given team is responsible for and what they have chosen to leave undone or be left for others to complete. But it’s slippery because our definitions of scope sound good; we have descriptive phrases about what is in and what is out of scope.

But as a project progresses, we find initially a trickle and eventually a tidal wave of concerns that seem to be right on the border between what is inside and outside scope. Also, one of the most puzzling aspects of an application’s scope is the way a very similar scope description for one organization translates into a drastically different size of implementation project when applied to another organization. In this paper, we explore what scope really is, the relationship between it and the effort to implement a project, and how an analogy to fractal geometry will have some bearing and perhaps be an interesting metaphor for thinking about application scope issues.

What is Fractal Geometry?

Fractal geometries are based on recursive functions that describe a geometric shape. For instance, start with a simple geometric shape, a triangle. Now apply a function which says, “in the middle of each side of the triangle put another triangle; and in the middle of each side of that triangle put another, and yet another, etc.” We’ve all seen the beautiful, multicolored, spiral shapes that have been created with Mandelbrot sets. These are all variations of the fractal geometry idea.

There are several things of interest about fractal geometry besides its aesthetic attractiveness. The main thing to note in this example, and in most others like it, is that while the contained area of the shape grows a little bit with each iteration of the function, the perimeter continues to grow at a constant rate. In this case, with each iteration, the perimeter gets 33% longer: on any one side that was three units long, we remove the middle unit and replace it with two units of equal length, so we now have four-thirds of the perimeter on that one side; this is repeated on every side, and at every level. If we were to continue to run the recursion, the detail would get finer and finer until the resolution was such that we could no longer see all the details of the edges. And yet, as we magnify the image, we would see that the detail was in fact there.

Fractal geometers sometimes say that Britain has an infinitely long coastline. The explanation is that if you took a low-resolution map of Britain and measured the coastline, you would get a particular figure. But as you increase the resolution of your map, you find many crags and edges and inlets that the previous resolution had omitted. If you were to add them up, the total coastline would now be greater. And if you went to an even higher resolution, it would be greater again, and so on, supposedly ad infinitum. So the perimeter of a fractal shape is a matter of the resolution at which we observe and measure it. Hold that thought.
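The arithmetic behind the “infinitely long coastline” is easy to verify for the triangle example: each iteration multiplies the perimeter by 4/3, so the perimeter grows without bound even though the enclosed area converges to a finite limit. A quick sketch:

```python
def koch_perimeter(initial_perimeter: float, iterations: int) -> float:
    """Perimeter of a Koch-style snowflake after n iterations: each side's
    middle third is replaced by two equal-length sides, so the perimeter
    is multiplied by 4/3 at every step."""
    return initial_perimeter * (4.0 / 3.0) ** iterations

for n in (0, 1, 5, 20, 50):
    print(n, round(koch_perimeter(3.0, n), 2))
# The perimeter diverges, while the enclosed area stays bounded
# (for the snowflake it converges to 8/5 of the original triangle's area).
```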

Application Size

When we talk about the size of an application, we are usually concerned with some aspect of that application that could change in such a way that our effort to develop or implement it would change. For instance, a common size metric for applications has been the number of lines of code. A system with one million lines of code is considered in many ways ten times as large as one with 100,000 lines of code. That is a gross simplification, but let’s start there: it would take roughly ten times as long to write one million lines of code as it would to write 100,000 lines of code.

Similarly, it would take ten times as long, on average, to modify part of a system that has one million lines of code as opposed to one with 100,000 lines of code. But lines of code was a reasonable metric only when most of our systems were written by hand, in procedural code, in languages that were at least comparable in their expressiveness.

Nowadays, a given system will be built of handwritten code plus generated code plus graphically designed components, style sheets, inherited code, libraries, etc. The general concepts of size and complexity still exist, but we are more likely to find the complexity in some other areas. One of the main areas where we find complexity in information systems is the size and variety of the user interfaces. For instance, a system with 100 different user interfaces can be thought of as one-tenth as complex as one with 1000 different user interfaces.

This relationship held with the lines-of-code metric in the old days because each screen was written by hand; the more screens, the more lines of code. Nowadays, although the connection between the screens and the lines of code may not be as direct because much of the code is generated, there is still ten times as much complexity in designing the user interfaces, laying them out, discussing them with users, etc. Another metric that tends to indicate a larger application is the number of APIs (application programming interfaces) it exposes.

An API is a function call that the system has made public and that can be used by other systems. An application API could be almost anything: it could be get_quote(order_number) or it could be post_ledger(tran). The size of applications that support APIs tends to be proportional to the number of APIs they have published. This is partly because each API must have code behind it to implement the behavior described, and partly because the act of creating the signature and the interface requires definition, design, negotiation with other system users, and maintaining and upgrading these interfaces.

Finally, one other metric that tends to move proportionally with the size of application systems is the size of their schema. The schema is the definition of the data that is being maintained by the application. Classically, it is the tables and columns in the database. However, with object oriented or XML style systems, the schema will have a slightly different definition.

An application for a database with 10,000 items in its schema (which might be 1,000 tables and 9,000 columns in total), all other things being equal, will very often be ten times as complex as an application with 1,000 items in its schema. This is because each of those individual attributes is there for some reason and typically has to be moved from the database to a screen, an API, or an algorithm, and has to be moved back when it has been changed.

Surface Area v Volume

Now, imagine an application as a shape. It might be an amorphous amoeba-like shape or it might be a very ordered and beautiful snowflake-like shape that could have been created from fractal geometry. The important thing to think about is the ratio of the surface area to the volume of the application. My contention is that most business applications have very little volume. What would constitute volume in a business application would be algorithms or rules about what is allowed and permissible. And while it may first seem that this is the preponderance of what you find when you look at a business application, the truth is far from that.

Typically, 1 or 2% of the code of a traditional business application is devoted to algorithms and/or true business rules. The vast majority is dealing with the surface area itself. To try another analogy, you may think of a business system as very much like a brain, where almost all the cognitive activity occurs on the surface and the interior is devoted to connecting regions of the surface. This goes a long way towards explaining why the surface of the brain is a convoluted and folded structure rather than a smooth organ like a liver.

Returning to our business application, let’s take a simple example of an inventory control system. Let’s say in this example that the algorithms of the application (the “volume”) have to do economic order quantity analysis, in other words, calculate how fast inventory is moving and determine, therefore, how much should be reordered and when. Now, imagine this is a very simple system. The surface area is relatively small compared to the volume and would consist of user interfaces to process receipts, to process issues, perhaps to process adjustments to the inventory, and a resultant inventory listing.

The schema, too, would be simple and, in this case, would have no API. Here is where it becomes interesting. This exact same system, implemented in a very small company with a tiny amount of inventory, would be fairly small and easy to develop and implement. Now take the same system and attempt to implement it in a large company.

The first thing you’ll find is that whatever simplistic arrangement we made for recording where the inventory was would now have to be supplemented with a more codified description, such as bin and location; for an even larger company that has automated robotic picking, we might need a very precise three-dimensional location.

To take it further, a company that stores its inventory in multiple geographic locations must keep track of not only the inventory in each of these locations but also the relative cost and time to move inventory between locations. This is a non-issue in our simplistic implementation.

So it would go with every aspect of the system. We would find in the large organization enough variation in the materials themselves to warrant changes in the user interfaces and the schema to accommodate them, such as the differences between raw materials and finished goods, between serialized and non-serialized parts, and between items that do and do not have shelf life. We would find that the sheer volume of inventory would require us to invent functionality for cycle counting, and, as we got larger still, special functionality to deal with count errors, to hold picking while a cycle count was occurring, etc.

Each of these additions increases the surface area, and, therefore, the size of the application. I’m not suggesting at all that this is wasteful. Indeed, each company will choose to extend the surface area of its application in exactly those areas where capital investment in computerizing their process will pay off. So what we have is the fractal shape becoming more complex in some areas and not in others, very dependent on the exact nature of the problem and the client site to which it is being implemented.

Conclusion

Once we see an application in this light, that is, as a fractal object with most of its size in or near its perimeter or surface, we see why the “same” application implemented in organizations of vastly different size will result in applications of vastly different size. We also see why a general description of the scope of an application may do a fair job of describing its volume but can miss the mark wildly when describing its surface area and perimeter.

To date, I know of no technique that allows us to scope and estimate the surface area of an application until we’ve done sufficient design, nor to analyze return on investment to determine just how far we should go with each aspect of the design. My intention with this white paper is to make people aware of the issue and of the sensitivity of project estimates to fine details that can rapidly add up.

The Case for the Shared Service Lite

When are Two Service Implementations Better than One?

One of the huge potential benefits of an SOA is the possibility of eliminating the need to re-implement large chunks of application functionality by using shared services.  For instance, rather than each application writing its own correspondence system to create and mail out letters, a shared service that could be invoked by all applications would do this generically.

A simple shared correspondence service in this case might maintain a number of templates (think of a Word template) and a message-based API. The application would send a message containing the template ID, the name and address of the correspondent, and any other parameters to the service. The service would substitute in the parameterized data, mail out the letter, and possibly return a confirmation message to the calling application.
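As a concrete illustration, the request message for such a service might look something like the sketch below; the field and template names are hypothetical, not a published API:

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class CorrespondenceRequest:
    """Hypothetical message an application sends to the shared service."""
    template_id: str                 # which letter template to use
    recipient_name: str
    recipient_address: str
    parameters: Dict[str, str] = field(default_factory=dict)  # values merged into the template

request = CorrespondenceRequest(
    template_id="ORDER_CONFIRMATION",
    recipient_name="Pat Example",
    recipient_address="123 Main St",
    parameters={"order_number": "A-1001", "ship_date": "2024-06-01"},
)
# The service substitutes the parameters into the template, mails the letter,
# and may return a confirmation message to the caller.
```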

So far so good, and if each application’s correspondence needs are limited to this then we are good to go. We have saved ourselves all the application code that hard-codes the specific correspondence in each of the applications, or alternatively replaced all the code in the application’s own internal template-based correspondence service.

However let’s say we have ten applications and all of them have slightly different correspondence requirements; one needs to be able to specify chunks of boilerplate text to be included in the final document, another needs to be able to mail the same document to multiple recipients, another needs to support high volume end of month mailings, and so on.  As the service is extended to meet all these requirements it gets increasingly generalized and increasingly complex.

The shared service is likely to be significantly more complex than any one of the application-specific implementations that it is replacing. At one level this is a good thing. If any individual application needs to extend its correspondence capabilities in the future there is a reasonable chance that it can do so without much impact: just make a change to the request message, update a template or two and off you go. At another level, however, this is a potential problem.

SOA is looking to simplify the development of applications and of the architecture as a whole, but flexibility usually comes at the cost of some complexity. Now each application has to understand, configure, and test the correspondence service for its particular needs. If your original correspondence problem was a simple one, this could mean replacing a relatively simple coded solution with a relatively complex configuration problem. Exacerbating this effect is the fact that many shared service implementations will be packaged solutions designed to deal with the varied correspondence requirements of multiple enterprises, including features that none of your applications will use. When considering one service in isolation this might not be such a big deal, but an SOA is likely to have tens of services that application designers need to use.

Each service, like the correspondence service, will likely have a relatively simple API that the application uses to invoke it, but it will also likely have a configuration database that is used to provide the flexibility the service needs. The correspondence service has its templates; the workflow service has its roles, tasks, and routing rules; the security service has its roles, resources, and access rules; and so on.

An application team now has to understand, configure and test each of these potentially complicated pieces of software. It also has to keep this configuration data in sync between services. Even with the help of a service specialist this could potentially be a daunting task. In short each application, no matter how simple its requirements for a service, has to deal with the complexity generated by a service complex enough to meet the current and future requirements of all the applications in the enterprise.

One solution to this problem is to implement two versions of a service. Let’s say you have ten applications, seven of which have very simple correspondence requirements. Start with a simple correspondence service solution that meets the needs of these seven applications, and later, as you migrate the first of the remaining three apps, you can implement the more complex generalized solution. This approach has a number of advantages:

  1. Assuming that most applications could get away with using the simpler service, and only require more robust services in a minority of cases, the overall complexity of the application is reduced.
  2. If an application can get away with using the lite version of the service it can have its cake and eat it, too: current simplicity and future flexibility. If its requirements change in the future it can upgrade to the full-function version with minimal effort, assuming we have done a good job of standardizing the interfaces between both versions of the services.
  3. It will reduce the time it takes to implement at least one version of each service when migrating to an SOA. Implementing the ultimate correspondence service is likely to take far longer than implementing a lightweight version. This could greatly reduce the amount of rework necessitated for applications developed early in the process before all the services are available.
  4. A lightweight package implementation is required early on in the SOA planning process anyway, in order to test out key concepts and educate developers in the new architectural practices.
  5. If carefully designed, it might be possible to increase the degree of runtime decoupling in the architecture by building applications that can hot-swap between the two services if one or the other is not available; a sketch of such an interface follows this list. This might be valuable for applications with high availability requirements, and it depends on the application being able to make do with the functionality in the lite version of the service in an emergency. You could also swap the other way, from the full-function to the lite version, but you would lose any productivity benefits you had gained, as you now have to configure both services.
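Here is a sketch of what the standardized interface and hot-swap arrangement mentioned above might look like; all of the class and method names are hypothetical:

```python
from typing import Protocol

class CorrespondenceService(Protocol):
    """Interface shared by the lite and full-function implementations."""
    def send(self, template_id: str, recipient: str, params: dict) -> bool: ...

class LiteCorrespondenceService:
    def send(self, template_id: str, recipient: str, params: dict) -> bool:
        # Simple template merge and send; no boilerplate chunks, no bulk runs.
        return True

class FullCorrespondenceService:
    def send(self, template_id: str, recipient: str, params: dict) -> bool:
        # Generalized, configuration-driven implementation.
        return True

class FailoverCorrespondence:
    """Prefer one implementation, fall back to the other if it is unavailable."""
    def __init__(self, primary: CorrespondenceService, fallback: CorrespondenceService):
        self.primary, self.fallback = primary, fallback

    def send(self, template_id: str, recipient: str, params: dict) -> bool:
        try:
            return self.primary.send(template_id, recipient, params)
        except Exception:
            return self.fallback.send(template_id, recipient, params)
```

Because both implementations expose the same interface, an application written against the lite version can later be pointed at the full-function version with minimal effort, which is the upgrade path item 2 above relies on.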

The downside of this approach, of course, is that you now have to maintain two shared services, but this is still an improvement over each individual application coding its own functionality in each project. To summarize: when planning a migration path to an SOA, consider the benefits of implementing lite versions of each service early in the process and keeping them around even after more robust versions have been implemented, in order to reduce the overall complexity of application development.
