In an old episode of the science-fiction series "Dr. Who", the villain roams the galaxy and captures entire planets, all to power "time dams" used to prevent his tribal matriarch from dying. The effort was in vain: while one can delay changes, one cannot hold them back indefinitely.
A lot of effort in IT is also spent on "keeping time still" or preventing changes.
Projects using a "waterfall" process prevent changes by agreeing early on the requirements, and then "freezing" them. A year-long project can start with a two-month phase to gather, review, and finalize requirements; the remainder of the year is devoted to implementing those requirements, exactly as agreed, with no changes or additions. The result is often disappointing: the delivered system is incorrect (because the requirements, despite review, were incorrect) or incomplete (for the same reason), and even when neither is true, the requirements are a year out of date. Time has progressed, and changes have occurred, outside of the project "bubble".
Some waterfall-managed projects allow for changes, usually with an onerous "change control" process that requires a description and justification of the change and agreement (again) among all of the concerned parties. This allows for changes, but puts a "brake" on them, limiting the number and scope of changes.
But project management methodologies are not the only way we try to hold back time. Other areas in which we try to prevent changes include:
Python's "requirements.txt" file, which lists the required packages. When used responsibly, it lists each required package and its minimum version. (A good idea, as one does need to know the packages and the versions, and this is a consistent method.) Some projects try to hold back changes by pinning a package to an exact version (such as "must be version 1.4 and no other") for fear that a later version may break something.
Locking components to specific versions will eventually fail: a component will not be available, or the specified version will not work on a new operating system or in a new version of the interpreter. (Perhaps even the Python interpreter itself, if held back in this manner, will fail.)
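As a sketch, the two styles look like this in a requirements.txt file (the package names and versions here are only illustrative):

```
# Minimum versions: newer, compatible releases are allowed in
requests>=2.20
sqlalchemy>=1.3

# Exact pins: "holds back time" -- and will eventually fail
requests==2.20.0
sqlalchemy==1.3.0
```

The first style states what is known to be needed; the second tries to freeze the world in place.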
Containers, which hold the "things that an application needs". Many "containerized" applications include a database and the database software, but they can also include other utilities. The container holds a frozen set of applications and libraries, installed each time the container is deployed. While they can be updated, that doesn't mean they are updated.
Those utilities and libraries that are "frozen in time" will eventually cause problems. They are not stand-alone; they often rely on other utilities and libraries, which may not be present in the container. At some point, the "outside" libraries will not work for the "inside" applications.
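A hypothetical Dockerfile shows how the freeze happens (every image name and version below is an assumption for illustration): whatever is named here is installed, identically, on every deployment, no matter how many years pass.

```
# Illustrative only: a container "frozen in time"
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y postgresql-10
COPY app /app
CMD ["/app/run"]
```

The operating system, the database, and their libraries stay at the versions captured when the image was built -- which is exactly the problem described above.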
Virtual machines that run old versions of operating systems, in order to run old applications that work only on those old operating systems. Virtual machines can be used for other purposes, but this use is yet another form of "holding back time".
Virtual machines with old versions of operating systems, running old versions of applications, also have problems. Their ability to communicate with other systems on the network will (probably) break, due to expired certificates or a change in a protocol.
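One of those breakages, certificate expiry, is easy to sketch. The helper below is illustrative, not from any particular system: it parses the "notAfter" date format that Python's ssl module reports for a certificate, and counts the days remaining.

```python
import datetime

# Parse a certificate's "notAfter" field (the format Python's ssl module
# reports, e.g. "Jan  1 00:00:00 2030 GMT") and count days until expiry.
# A negative result means the certificate has already expired.
def days_until_expiry(not_after: str, now: datetime.datetime) -> int:
    expiry = datetime.datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z")
    return (expiry - now).days

# A VM frozen years ago may carry certificates that have quietly run out.
remaining = days_until_expiry("Jan  1 00:00:00 2030 GMT",
                              datetime.datetime(2029, 12, 31))
```

A frozen virtual machine never runs a check like this, of course -- that is rather the point.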
All of these techniques pretend to solve a problem. But they are not really solutions -- they simply delay the problem. Eventually, you will have an incompatibility, somewhere. But that isn't the biggest problem.
The biggest problem may be in thinking that you don't have a problem.
Wednesday, March 11, 2020
Sunday, February 12, 2017
Databases, containers, and Clarke's first law
A blog post by a (self-admitted) beginner engineer rants about running databases inside of containers. The author lays out the case against it, pointing out potential problems from security to configuration time to the difficulty of holding state within a container. The argument is intense and passionate, although a bit difficult for me to follow. (That, I believe, is due to my limited knowledge of databases and my even more limited knowledge of containers.)
I believe he raises questions which should be answered before one uses databases in containers. So in one sense, I think he is right.
In a larger sense, I believe he is wrong.
For that opinion, I refer to Clarke's first law, which states: "When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong."
I suspect that it applies to sysadmins and IT engineers just as much as it does to scientists, and that age has rather little effect. Our case is one of a not-elderly not-scientist claiming that running databases inside of containers is impossible, or at least a Bad Idea and Will Lead Only To Suffering.
My view is that containers are useful, and databases are useful, and many in the IT field will want to use databases inside of containers. Not just run programs that access databases on some other (non-containerized) server, but host the database within a container.
Not only will people want to use databases in containers, there will be enough pressure and enough interested people that they will make it happen. If our current database technology does not work well with containers, then engineers will modify containers and databases to make them work. The result will be, quite possibly, different from what we have today. Tomorrow's database may look and act differently from today's databases. (Just as today's phones look and act differently from phones of a decade ago.)
Utility is one of the driving features of technology. Containers have it, so they will be around for a while. Databases have it (they've had it for decades) and they will be around for a while. One or both may change to work with the other.
We'll still call them databases, though. The term is useful, too.
Sunday, October 18, 2015
More virtual, less machine
A virtual machine, in the end, is really an elaborate game of "let's pretend". The host system (often called a hypervisor) persuades an operating system that a physical machine exists, and the operating system works "as normal", driving video cards that do not really exist and responding to timer interrupts created by the hypervisor.
Physical computers (that is, the real computers one can touch) often serve multiple purposes. A desktop PC provides e-mail, word processing, spreadsheets, photo editing, and a bunch of other services.
Virtual computers tend to be specialized. We build virtual machines often as single-purpose servers: web servers, database servers, message queue servers, ... you get the idea.
Our operating systems and system configurations have been designed around the desktop computer, the one serving multiple purposes. Thus, the operating system has to provide all possible services, including those that might never be used.
But with specialized virtual servers, perhaps we can benefit from a different approach. Perhaps we can use a specialized operating system, one that includes only the features we need for our application. For example, a web server needs an operating system and the web server software, and possibly some custom scripts or programs to assist the web server -- but that's it. It doesn't need to worry about video cards or printing. It doesn't need to worry about programmers and their IDEs, and it doesn't need to have a special debug mode for the processor.
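Containers already hint at this minimal style. As an illustrative sketch (the binary name is hypothetical), a statically linked web server can ship in an image that contains nothing else -- no shell, no package manager, no video or print support:

```
# Dockerfile sketch: the image holds only one static binary
FROM scratch
COPY mywebserver /mywebserver
EXPOSE 80
ENTRYPOINT ["/mywebserver"]
```

Everything the operating system would normally provide "just in case" is simply absent.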
Message queue servers are also specialized, and if they keep everything in memory then they need little in the way of file systems or file I/O. (They may need enough to bootstrap the operating system.)
All of our specialized servers -- and maybe some specialized desktop or laptop PCs -- could get along with a specialized operating system, one that uses the components of a "real" operating system, and just enough of those components to get the job done.
We could change policy management on servers. Our current arrangement sees each server as a little stand-alone unit that must receive policies and updates to those policies. That means that the operating system must be able to receive the policy updates. But we could change that. We could, upon instantiation of the virtual server, build in the policies that we desire. If the policies change, instead of sending an update, we create a new virtual instance of our server with the new policies. Think of it as "server management meets immutable objects".
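The "immutable objects" analogy can be sketched in a few lines of Python (the class and field names here are invented for illustration): a policy change never mutates a running server; it produces a replacement.

```python
from dataclasses import dataclass, replace

# Toy model: a server's policies are fixed when the instance is created.
@dataclass(frozen=True)
class VirtualServer:
    image: str
    policy_version: int

def apply_policy_update(server: VirtualServer, new_version: int) -> VirtualServer:
    # No in-place update: build a new instance with the policy baked in,
    # then retire the old one.
    return replace(server, policy_version=new_version)

old = VirtualServer(image="webserver", policy_version=1)
new = apply_policy_update(old, 2)
```

The old instance is unchanged; it is simply discarded, just as the old virtual server would be.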
The beauty of virtual servers is not that they are cheaper to run, it is that we can throw them away and create new ones on demand.
Our initial use of virtual machines was to duplicate our physical machines. Yet in the past decade, we have learned about the advantages of virtual machines, including the ability to create (and destroy) virtual machines on demand. These abilities have changed our ideas about computers.