Sunday, April 22, 2012

The big bucket at the center of all things

A lot of organizations have a central database.

The database is the equivalent of a large bucket into which all information is poured. The database is the center of the processing universe for these companies, with application programs relegated to the role of satellites orbiting the database.

The problem with this approach is that the applications become tied to the database schema. When every application reads and writes the central database directly, its code is easily bound to the tables, columns, and stored procedures of that schema.

The result is a large collection of applications that all depend on the schema of the database. If you change the schema, you run the risk of breaking some or all of your applications. At minimum you must recompile and redeploy your applications; it may be necessary to redesign some of them.

Notice that this approach scales poorly. As you create new applications, your ability to change the database schema declines. (More specifically, the cost of changing the database schema increases.)

You can make some types of changes to the schema without affecting applications. You can add columns to tables, and you can add tables and views. You can add new entities without affecting existing applications, since they will not be using the new entities. But you cannot rename a table or column, or change the parameters to a stored procedure, without breaking the applications that use those elements. Your ability to change the schema depends on your knowledge of specific dependencies of applications on the database.
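A minimal sketch of this asymmetry, using Python's built-in sqlite3 module and an in-memory database. The "customer" table, its columns, and the application_query function are hypothetical names chosen for illustration; they stand in for any application that names tables and columns explicitly.

```python
import sqlite3

# Hypothetical schema: a single "customer" table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")


def application_query(db):
    """Stand-in for an application that names the table and columns explicitly."""
    return db.execute("SELECT id, name FROM customer ORDER BY id").fetchall()


before = application_query(conn)

# Additive change: a new column does not disturb the existing query.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after_add = application_query(conn)

# Breaking change: renaming the table invalidates the application's SQL.
conn.execute("ALTER TABLE customer RENAME TO client")
try:
    application_query(conn)
    broke = False
except sqlite3.OperationalError:
    broke = True
```

Note the explicit column list in the query. An application that uses SELECT * or inserts rows without naming columns can be affected even by an added column, which is one more reason the safety of a change depends on knowing how each application uses the database.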

Cloud computing may help with this problem. Not because cloud computing has scalable processing or scalable database storage. Not because cloud computing uses virtualized servers. And not because cloud computing has neat brand names like "Elastic Cloud" and "Azure".

Cloud computing helps with the "database at the center of the universe" problem by changing the way people think of systems. With cloud computing, designers think in terms of services rather than physical entities. Instead of thinking of a single processor, cloud designers think of processing farms. Instead of thinking of a web server, designers think of web services. Instead of thinking of files, cloud designers think of message queues.

Cloud computing can help solve the problem of the central database by getting people to think of the database as data provided by services, not data defined by a schema. By thinking in terms of data services, application designers then build their applications to consume services. A service layer can map the exposed service to the private database schema. When the schema changes, the service layer can absorb the changes and the applications can remain unchanged.
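The idea of a service layer absorbing a schema change can be sketched roughly as follows, again in Python with sqlite3. Everything named here is an assumption made for illustration: the CustomerService classes, the get_customer contract, the greeting application, and both schemas. It is one possible shape for such a layer, not a prescribed design.

```python
import sqlite3


class CustomerService:
    """Hypothetical data service written against the original schema."""

    def __init__(self, db):
        self._db = db

    def get_customer(self, customer_id):
        # Only the service layer knows the private table and column names.
        row = self._db.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class CustomerServiceAfterChange(CustomerService):
    """Same exposed contract, remapped to a changed schema."""

    def get_customer(self, customer_id):
        row = self._db.execute(
            "SELECT client_id, first_name || ' ' || last_name "
            "FROM client WHERE client_id = ?",
            (customer_id,),
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


def greeting(service, customer_id):
    """Stand-in for an application: it consumes the service, not the schema."""
    customer = service.get_customer(customer_id)
    return "Hello, " + customer["name"]


# Original schema: one table with a single name column.
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
old_db.execute("INSERT INTO customer VALUES (1, 'Ada Lovelace')")

# Changed schema: renamed table, renamed key, name split into two columns.
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE client (client_id INTEGER PRIMARY KEY, "
    "first_name TEXT, last_name TEXT)"
)
new_db.execute("INSERT INTO client VALUES (1, 'Ada', 'Lovelace')")

old_result = greeting(CustomerService(old_db), 1)
new_result = greeting(CustomerServiceAfterChange(new_db), 1)
```

The greeting function is identical in both cases; only the service's SQL changed. That is the sense in which the applications remain unchanged while the schema moves underneath them.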

Some changes to the database schema may bleed through the service layer. Some changes are too large to be absorbed. For those cases, the challenge becomes identifying the applications that use specific data services, a task that I think will be easier than identifying applications that use specific database tables.
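One way this identification might be made easier, assuming every application names itself when it calls a service, is for the service layer to keep a record of its consumers. The ServiceRegistry class and the application and service names below are hypothetical; this is a sketch of the idea, not a description of any particular product.

```python
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical record of which application called which data service."""

    def __init__(self):
        self._consumers = defaultdict(set)

    def record_call(self, service_name, application_name):
        # A set keeps one entry per application, however often it calls.
        self._consumers[service_name].add(application_name)

    def consumers_of(self, service_name):
        # .get avoids creating an empty entry for an unknown service.
        return sorted(self._consumers.get(service_name, set()))


registry = ServiceRegistry()
registry.record_call("get_customer", "billing")
registry.record_call("get_customer", "shipping")
registry.record_call("get_order", "shipping")
registry.record_call("get_customer", "billing")  # repeat call, no new entry

# Applications to review before changing the get_customer service.
affected = registry.consumers_of("get_customer")
```

A direct database connection offers no comparable single point at which to observe which application touches which table, which is part of why tracing table-level dependencies tends to be harder.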
