Sunday, December 17, 2017

Single point of failure

If you're going to have a single point of failure, make it replaceable.

We strive to avoid single points of failure. They hold risk -- if a single point of failure fails, then the entire system fails.

It is not always possible to avoid a single point of failure. Sometimes the constraint is cost. Other times the design requires a single component for a function.

If you have a single point of failure, make it easy to replace. Design the component so that you can replace it quickly and with little risk. When it fails, you can respond by installing the replacement component. (Think of the spare tire in an automobile. Strictly speaking, the tires on a car are not a single point of failure, because there are four of them, but you get the idea.)

A component is easiest to replace when its design is simple. A simple design for a single point of failure (or any component) requires care and attention. Design the component with minimal functionality, and move whatever you can to other, redundant components.
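
As a rough illustration, here is a minimal Python sketch of a single point of failure kept small and replaceable. Every name in it (`Dispatcher`, `RoundRobinDispatcher`, `make_worker`, `System`) is hypothetical and invented for this example; it is one possible arrangement under those assumptions, not a prescribed design. The dispatcher is the single component every request passes through, so it does only one job: hand the request to the next worker. Cleanup and formatting live in the workers, which are redundant.

```python
from typing import Callable, List, Protocol


class Dispatcher(Protocol):
    """Hypothetical narrow interface for the single point of failure."""

    def dispatch(self, request: str) -> str: ...


class RoundRobinDispatcher:
    """The single point of failure: picks the next worker and nothing else."""

    def __init__(self, workers: List[Callable[[str], str]]) -> None:
        self._workers = workers
        self._next = 0

    def dispatch(self, request: str) -> str:
        worker = self._workers[self._next % len(self._workers)]
        self._next += 1
        return worker(request)


def make_worker(name: str) -> Callable[[str], str]:
    """Redundant component: request cleanup and formatting live here."""

    def handle(request: str) -> str:
        cleaned = request.strip().lower()
        return f"{name}:{cleaned}"

    return handle


class System:
    """Holds the dispatcher behind the narrow interface so it can be swapped."""

    def __init__(self, dispatcher: Dispatcher) -> None:
        self.dispatcher = dispatcher

    def replace_dispatcher(self, spare: Dispatcher) -> None:
        # Replacement is a single assignment because the interface is tiny.
        self.dispatcher = spare

    def handle(self, request: str) -> str:
        return self.dispatcher.dispatch(request)


workers = [make_worker("a"), make_worker("b")]
system = System(RoundRobinDispatcher(workers))
first = system.handle(" Hello ")   # handled by worker "a"
second = system.handle("World")    # handled by worker "b"

# Simulate a failure response: install a spare dispatcher.
system.replace_dispatcher(RoundRobinDispatcher(workers))
third = system.handle("Again")     # the spare starts over at worker "a"
```

Because the dispatcher exposes one method and holds almost no state, the spare can be built and installed in one step. Had the cleanup logic lived in the dispatcher, the replacement would have to reproduce it too.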

You also have to guard that simplicity against change. Over time, designs change. People add to designs: they want new features, or extensions to existing features. Watch for changes that would complicate the single point of failure, and put them in other, redundant components in the system instead.
