Friday, May 18, 2007

On fragmented layers

In a previous post I described a layered approach to an organization. This time I'd like to extend the model a little and give some areas where it might be a useful decision-making tool. You can see a figure of the stack on the left. The point of the model is that different layers deal with different aspects of how an organization is built and that they are highly interdependent with changes in any of them causing a cascade of changes both upwards and down. It's also worth noting that, almost by definition, all projects impact all of the layers. Usually new pieces are added to the ends so somebody needs to make sure that the picture stays consistent, new pieces fit with existing stuff and do not cause discrepancies with others. All of this is pretty straightforward for the technical architecture but is often disregarded for the other aspects of a business.

How would one use the model in real life? One useful application I have found is explaining people why they should consider other things (like new processes or even teams) besides functionality when they are setting up a project. It also helps to visualize responsibilities (who deals with the functional architecture in your company?).

One use I'd like to focus a little more on, is the fragmentation aspect of the model. In short, the message goes: if you are to crack a layer, you better align it with cracks in neighboring ones.

Consider, for example, a scenario when you have two web applications supported by two different business organizations on two different continents. Which means there's a division in both business and organizational layers. Of course, the crack runs all the way and the applications are not integrated in any way. Now what if somebody up in the management decides, very sensibly, that it really sucks that customers would need to go to two different stores to get their SkypeOut minutes and headsets. Makes perfect sense and just making two systems talk to each other is not a fundamental obstacle. However, could you imagine two teams 4 timezones apart sharing responsibility for what the same piece of code does (i.e. integrating the technical architecture)? Or could you imagine an actual purchase flow (functional architecture) where you buy a SkypeIN number with all of it's details and finesse of all the legal requirements we have there and at the same time compare 4 headsets? Quite difficult, isn't it? Of course all of this could be done actually, but just linking the infrastructure without thinking of the organizational (how is responsibility shared among the teams?), functional (how do the different purchase flows fit together?), business (what about revenues and, say, marketing costs of the banners in the store?) or support (do our, say, release cycles need to be synchronized with the ones of our partner) dependencies are handled makes little sense.

Monday, May 7, 2007

Time to review

There has been yet another case of rebellion in wonderland recently. Basically, a design decision was challenged long after it had been made and also implemented. A year ago, we had designed (and implemented) a system that had a substantial influence to some billing and destination resolution logic in our calling infrastructure. As this coincided with some other changes in the same modules a decision was made to start separating that logic from the actual signaling logic as the later is highly stable (and needs to be very robust) while the former is liable to change much more often.

At the same time, our developers sought to standardize communication (and load balancing, redundancy, configuration management etc. issues that come with it) between separately deployed components and stuck with ICE. So, keen to play around with the new technology, they conducted some tests and a decision was made to use it for the newly created lump of business logic.

Historically, most of our business logic has resided within our databases. Not a bad decision at all given the horizontal and vertical splitting technology plus Postgres know-how we have in-house. However, this also meant that most of the knowledge of billing and routing internals resided with people who knew databases and were not about to start writing C++ code overnight, especially when it usually did not make any sense to ship data to a remote component for decision-making.

As a result, we ended up with a fairly slim layer of logic between the calling infrastructure and the database that, at the first glimpse, did very little but call a bunch of stored procedures. Of course, come time to deploy the thing, our operations people came asking why the heck they needed to support (and make highly available) an additional component that didn't add any value at all. Which was the rebellion at the beginning of the story.

So we discussed. And ended up with an understanding that in terms of design, developers still find value in that layer as the decision _which_ procedures to call is quite significant. Also, the data structures that get passed between it and the calling infrastructure are complex and it would be unwise to build serialization into flat structures required by the database into all of the calling systems. Some of the supportability concerns (but not all) the ops guys had could actually be solved quite easily, too. No major change in the architecture, then.

The reason I'm writing about this event is that there are several very important conclusions to draw from this event
  • Your architectural decisions should take into account the organization you operate with. In some other situation, the very idea of moving logic from a middleware layer to a database would have been pure lunacy (most of the organizations struggle to do the opposite) but given the stuff our DBAs pull of on a regular basis it's not that bad
  • Challenges are valuable, regular ones are even better. No design decision should be cast to stone, no concept should be considered OK only because "this is the way things are done". Although, most of the cases you still end up retaining the original idea, sometimes you don't. And this is where architectural evolution happens
  • Work closely with the operations people. They provide very good reality check. Helps with deployment griefs, too