Wednesday, February 1, 2023

To build and to maintain

I had the opportunity to visit another country recently (which one doesn't matter) and I enjoyed the warmer climate and the food. I also had the chance to observe another country's practices for building and maintaining houses, office buildings, roads, bridges, and other things.

The United States is pretty good at building things (roads, bridges, buildings, and such) and also good at maintaining them. The quality of construction and the practices for maintenance vary, of course, and overall governments and large corporations are better at them than small companies or individuals.

In the country I visited, the level of maintenance was lower. The culture of the country is such that people in the country are good at building things, but less concerned with maintaining them. This was apparent in things like signs in public parks: once installed they were left exposed to the elements where they faded and broke in the sun and wind.

My point is not to criticize the country or its culture, but to observe that maintaining something is quite different from building it.

That difference also applies to software. The practices of maintaining software are different from the practices of constructing software.

Software does not wear or erode like physical objects. Buildings expand and contract, develop leaks, and suffer damage. Software, stored in bits, does not expand and contract. It does not develop leaks (memory leaks aside), and it is impervious to wind, rain, and fire. So why do I say that software needs maintenance?

I can make two arguments for maintenance of software. The first argument is a cyber-world analog of damage: The technology platform changes, and sometimes the software must change to adapt. A Windows application, for example, may have been designed for one version of Windows. Windows, though, is not a permanent platform; Microsoft releases new versions with new capabilities and other changes. While Microsoft makes a considerable effort to maintain compatibility, there are times when changes are necessary. Thus, maintenance is required.

The second argument is less direct, but perhaps more persuasive. The purpose of maintenance (for software) is to ensure that the software continues to run, possibly with other enhancements or changes. Yet software, when initially built, can be assembled via shortcuts and poor implementations -- what we commonly call "technical debt". Often, those choices were made to allow for rapid delivery.

Once the software is "complete" -- or at least functional -- maintenance can be the act of reducing technical debt, with the goal of allowing future changes to be made quickly and reliably. This is not the traditional meaning of maintenance for software, yet it seems to correspond well with the maintenance of "real world" objects such as automobiles and houses. Maintenance is work performed to keep the object running.

If we accept this definition of maintenance for software, then we have a closer alignment of software with real-world objects. It also provides a purpose for maintenance: to ensure the long-term viability of the software.

Let's go back to the notions of building and maintaining. They are very different, as anyone who has maintained software (or a house, or an automobile) can attest.

Building a thing (software or otherwise) requires a certain set of skills and experience.

Maintaining that thing requires a different set of skills and experience. Which probably means that the work for maintenance needs a different set of management techniques, and a different set of measurements.

And building a thing in such a way that it can be maintained requires yet another set of skills and experience. And that implies yet another set of management techniques and measurements.

All of this may be intuitively obvious (like solutions to certain mathematics problems were intuitively obvious to my professors). Or perhaps not obvious (like solutions to certain math problems were to me). In either case, I think it is worth considering.

Monday, January 16, 2023

The end of more

From the very beginning, PC users wanted more. More pixels and more colors on the screen. More memory. Faster processors. More floppy disks. More data on floppy disks. (Later, it would be more data on hard disks.)

When IBM announced the PC/XT, we all longed for the space (and convenience) of its built-in hard drive. When IBM announced the PC/AT we envied those with the more powerful 80286 processor (faster! more memory! protected mode!). When IBM announced the EGA (Enhanced Graphics Adapter) we all longed for the higher-resolution graphics. With the PS/2, we wanted the reliability of 3.5" floppy disks and the millions of colors on a VGA display.

The desire for more didn't stop in the 1980s. We wanted the 80386 processor, and networks, and more memory, and faster printers, and multitasking. More programs! More data!

But maybe -- just maybe -- we have reached a point that we don't need (or want) more.

To quote a recent article in Macworld:

"Ever since Apple announced its Apple silicon chip transition, the Mac Pro is the one Mac that everyone has anxiously been awaiting. Not because we’re all going to buy one–most of the people reading this (not to mention me, my editor, and other co-workers) won’t even consider the Mac Pro. It’s a pricey machine and the work that we do is handled just as well by any Mac in the current lineup".

Here's the part I find interesting:

"the work that we do is handled just as well by any Mac in the current lineup"

Let that sink in a minute.

The work done in the offices of Macworld (which I assume is typical office work) can be handled by any of Apple's Mac computers. That means that the lowliest Apple computer can handle the work. Therefore, Macworld, being a commercial enterprise and wanting to reduce expenses, should be equipping its staff with the low-end MacBook Air or Mac mini. To do otherwise would be wasteful.

It is not just the Apple computers that have outpaced computing needs. Low-end Windows PCs also handle most office work. (I myself am typing this on a Dell desktop that was made in 2007.)

The move from 32-bit processing to 64-bit processing had a negligible effect on many computing tasks. Microsoft Word, for example, ran just as well in 32-bit Windows as it did in 64-bit Windows. The move to 64-bit processing did not improve word processing.

There are some who do still want more. People who play games want the best performance from not only video cards but also central processors and memory. Folks who edit video want performance and high-resolution displays.

But the folks who need, really need, high performance are a small part of the PC landscape. Many of the demanding tasks in computation can be handled better by cloud-based systems. It is only a few tasks that require local, high-performance processing.

The majority of PC users can get by with a low-end PC. The majority of PC users are content. One may look at a new PC with more memory or more pixels, but the envy has dissipated. We have enough colors, enough pixels, and enough storage.

If we have reached "peak more" in PCs, what does that mean for the future of PCs?

An obvious change is that people will buy PCs less frequently. With no urge to upgrade, people will keep their existing equipment longer. Corporations that buy PCs for employees may continue on a "replace every three years" schedule, but that was driven by depreciation rules and tax laws. Small mom-and-pop businesses will probably keep computers until a replacement is necessary (I suspect that they have been doing that all along). Some larger corporations may choose to defer PC replacements, noting that cash outlays for new equipment are still cash outlays, and should be minimized.

PC manufacturers will probably focus on other aspects of their wares. PC makers will strive for better battery life, durability, or ergonomic design. They may even offer Linux as an alternative to Windows.

It may be that our ideas about computing are changing. It may be that instead of local PCs that do everything, we are now looking at cloud computing (and perhaps older web applications) and seeing a larger expanse of computing. Maybe, instead of wanting faster PCs, we will shift our desires to faster cloud-based systems.

If that is true, then the emphasis will be on features of cloud platforms. They won't compete on pixels or colors, but they may compete on virtual processors, administration services, availability, and supported languages and databases. Maybe we won't be envious of new video cards and local memory, but envious instead of uptime and automated replication. 

Monday, January 9, 2023

After the GUI

Some time ago (perhaps five or six years ago) I watched a demo of a new version of Microsoft's Visual Studio. The new version (at the time) had a new feature: the command search box. It allowed the user to search for a command in Visual Studio. Visual Studio, like any Windows program, used menus and icons to activate commands. The problem was that Visual Studio was complex and had a lot of commands -- so many commands that the menu structure to hold them all was enormous, and searching for a command was difficult. Many times, users failed to find the command.

The command search box solved that problem. Instead of searching through menus, one could type the name of the command and Visual Studio would execute it (or maybe tell you the path to the command).

I also remember, at the time, thinking that this was not a good idea. I had the distinct impression that the command search box showed that the GUI paradigm had failed, that it worked up to a point of complexity but not beyond that point.

In one sense, I was right. The GUI paradigm does fail after a certain level of complexity.

But in another sense, I was wrong. Microsoft was right to introduce the command search box.

Microsoft has added the command search box to the online versions of Word and Excel. These command boxes work well, once you get acquainted with them. And you must get acquainted with them; some commands are available only through the command search box, and not through the traditional GUI.

Looking back, I can see the benefit of changing the user interface, and changing it in such a way as to make a new type of user interface.

The first user interface for personal computers was the command line. In the days of PC-DOS and CP/M-86, users had to type commands to invoke actions. There were some systems (such as the UCSD p-System) that used full-screen text displays as their interface, but these were rare. Most systems required the user to learn the commands and type them.

Apple's Macintosh and Microsoft's Windows used a GUI (Graphical User Interface) which provided the possible commands on the screen. Users could click on an icon to open a file, another icon to save the file, and a third icon to print the file. The icons were visible, and more importantly, they were the same across all programs. Rarely used commands were listed in menus, and one could quickly look through the menu to find a command.

Graphical User Interfaces with icons and buttons and menus worked, until they didn't. They were adequate for simple programs such as the early versions of Word and Excel, but they were difficult to use on complex programs that offered dozens (hundreds?) of commands.

The command search box addresses that problem. A program that uses the command search box, instead of displaying all possible commands in icons and buttons and menus, shows the commonly-used commands in the GUI and hides the less-used commands in the search box.

The search box is also rather intelligent. Enter a word or a phrase and the application shows a list of commands that are either what you want or close to it. It is, in a sense, a small search engine tuned to the commands for the application. As such, you don't have to remember the exact command.

This is a departure from the original concept of "show all possible actions". Some may consider it a refinement of the GUI; I think of it as a separate form of user interface.

I think that it is a separate form of interface because this concept could be applied to the traditional command line. (Command line interfaces are still around. Ask any user of Linux, or any admin of a server.) Today's command line interfaces are pretty much the same as the original ones from the 1970s, in that you must type the command from memory.

Some command shell programs now prompt you with suggestions to auto-complete a command. That's a nice enhancement. I think another enhancement could be something similar to the command search box of Microsoft Excel: a command that takes a phrase and reports matches. Such an option does not require graphics, so I think that this search-based interface is not tied to a GUI.
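A minimal sketch of such a search-based interface, in Python, might look like the following. The command names, descriptions, and matching threshold here are my own invention for illustration, not those of any particular application or shell; the fuzzy matching uses the standard library's difflib.

```python
import difflib

# A hypothetical command set; a real application would have hundreds.
COMMANDS = {
    "save-file": "Write the current buffer to disk",
    "open-file": "Load a file into a new buffer",
    "print-document": "Send the document to a printer",
    "find-and-replace": "Search for text and substitute it",
}

def search_commands(phrase: str, max_results: int = 3) -> list:
    """Return command names that roughly match the user's phrase."""
    # Fuzzy match against command names; the cutoff is deliberately loose
    # so the user does not need to remember the exact command name.
    return difflib.get_close_matches(
        phrase, COMMANDS.keys(), n=max_results, cutoff=0.4
    )
```

A query like "save file" would match "save-file" even though the user typed a space instead of a hyphen -- which is exactly the point: recognition of an approximate phrase, rather than recall of an exact command.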

Command search boxes are the next step in the user interface. They follow the first two designs: the command line (where you must memorize commands and type them exactly) and the GUI (where you can see all of the commands in icons and menus). Command search boxes don't require every command to be visible (like in a GUI) and they don't require the user to recall each command exactly (like in a command line). They really are something new.

Now all we need is a name that is better than "command search box".

Monday, January 2, 2023

Southwest Airlines and computers

Southwest Airlines garnered a lot of attention last week. A large winter storm caused delays on a large number of flights, a problem with which all of the airlines had to cope. But Southwest had a more difficult time of it, and people are now jumping to conclusions about Southwest and its IT systems.

Before I comment on the conclusions to which people are jumping, let me explain what I know about the problem.

The problem in Southwest's IT systems, from what I can tell, has little to do with the age of their programs or the programming languages that they chose. Instead, the problem is caused by a mix of automated and manual processes.

Southwest, like all airlines, must manage its aircraft and crews. For a large airline, this is a daunting task. Airplanes fly across the country, starting at one point and ending at a second point. Many times (especially for Southwest) the planes stop at intermediate points. Not only do airplanes make these transits, but crews do as well. The pilots and cabin attendants go along for the ride, so to speak.

Southwest, or any airline, cannot simply assign planes and crews at random. They must take into account various constraints. Flight crews, for example, can work only so many hours before they must rest. Aircraft must be serviced at regular intervals. The distribution of planes (and crews) must be balanced -- an airline cannot end its business day with all of its aircraft and crews on the west coast, for example. The day must end with planes and crews positioned to start the next day.

For a very small airline (say one with two planes) this scheduling can be done by hand. For an airline with hundreds of planes, thousands of employees, and thousands of flights each day, the task is complex. It is no surprise that airlines use computers to plan the assignment of planes and crews. Computers can track all of the movements and ensure that constraints are respected by the plan.
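To make the idea of constraint checking concrete, here is a toy sketch in Python. The duty-hour and maintenance limits are invented for illustration -- real crew-rest and service rules are far more detailed -- but the shape of the check is the same: every proposed assignment is tested against each constraint before it goes into the plan.

```python
from dataclasses import dataclass

# Illustrative limits only; actual regulations are more complex.
MAX_CREW_DUTY_HOURS = 10.0
MAINTENANCE_INTERVAL_HOURS = 100.0

@dataclass
class Crew:
    name: str
    duty_hours_today: float

@dataclass
class Aircraft:
    tail_number: str
    hours_since_service: float

def can_assign(crew: Crew, plane: Aircraft, flight_hours: float) -> bool:
    """Check one proposed flight assignment against the (simplified) constraints."""
    if crew.duty_hours_today + flight_hours > MAX_CREW_DUTY_HOURS:
        return False  # crew would exceed duty limit and must rest
    if plane.hours_since_service + flight_hours > MAINTENANCE_INTERVAL_HOURS:
        return False  # aircraft would be overdue for service
    return True
```

A real scheduler must satisfy thousands of such checks simultaneously, plus the positioning constraint (planes and crews must end the day where tomorrow's flights begin), which is why this is a job for computers and not for clerks with clipboards.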

But the task does not end with the creation of a set of flight assignments. During each day, random events can happen that delay a flight. Delays can be caused by headwinds, inclement weather, or sick passengers. (I guess crew members, being people, can get sick, too.)

Delays in one flight may mean delays in subsequent flights. Airlines may swap crews or planes from one planned flight to another, or they may simply wait for the late equipment. Whatever the reason, and whatever the change, the flight assignments have to be recalculated. (Much like a GPS system in your car recalculates the route when you miss an exit or a turn, except on a much larger scale.)

Southwest's system has two main components: an automated system and a manual process. The automated system handles the scheduling of aircraft and crews. The manual process handles the delays, and provides information to the automated system.

During the large winter storm, a large number of flights were delayed. So many flights were delayed that the manual process for updating information was overwhelmed -- people could not track and input the information fast enough to keep the automated system up to date.

A second problem happened on the automated side. So many people visited the web site (to check the status of flights) that it, too, could not handle all of the requests.

This is what I think happened. (At least, this makes sense to me.)

A number of people have jumped to the conclusion that Southwest's IT systems were antiquated and outdated, and that led to the breakdown. Some people have jumped further and concluded that Southwest's management actively prevented maintenance and enhancements of their IT systems to increase profits and dividend payouts.

I'm not willing to blame Southwest's management, at least not without evidence. (And I have seen none.)

I will share these thoughts:

1. Southwest's IT systems -- even if they are outdated -- worked for years (decades?) prior to this failure.

2. All systems fail, given the right conditions.

One can argue that Southwest's system, a combination of automated and manual processes, could be redesigned to have more work handled by the automated side. It would require some way to track flights and record crews and planes arriving at a destination. Such changes are not trivial, and should be made with care.

One can argue that Southwest's IT systems use old programming techniques (and maybe even old programming languages), and Southwest should modernize their code. I find this argument unpersuasive, as newer programming languages and code written in those languages are not necessarily better (or more reliable) than the old code.

One can argue that Southwest's IT system could not scale up to handle the additional demand, and that Southwest should use cloud technologies to better meet variable demand. That is also a weak argument; moving to cloud technologies will not automatically make a system scalable.

Clearly this event was an embarrassment for Southwest, as well as a loss of some customer goodwill. (Not to mention the expense of refunds.) Given that a large winter storm could happen again (if not this year then possibly next year), Southwest may want to make adjustments to its scheduling systems and processes. But I would caution them against a large-scale re-write of their entire system. Such large projects tend to fail. Instead, I would recommend small, incremental improvements to their databases, their web sites, and their scheduling systems.

Whatever course Southwest chooses, I hope that it is executed with care, and with respect for the risks involved.

Sunday, December 18, 2022

Moving fast and breaking things is not enough

Many have lauded the phrase "Move fast and break things". Uttered by Mark Zuckerberg, founder of Facebook, it became a rallying cry for developing at a fast pace. It is a rejection of the older philosophy of careful analysis, reviewed design, and comprehensive tests. And while the pace of "move fast and break things" has its appeal, it is clear that "move fast and break things", by itself, is not enough.

Moving fast and breaking things results in, obviously, broken things. Broken things can be useful (more on this later) but they are, well, broken. A broken web site does not help customers. A broken database does not produce end-of-month reports. A broken... you get the idea.

Clearly, the one thing that you must do after you break something is to fix it. The fix may be easy or may be difficult, depending on the nature of the failures that occurred. A developer, working in a private sandbox, can break things and then restore them to working order with a "revert" command to the version control system. (This assumes a version control system, which I think in 2022 is a reasonable assumption.)

Moving fast and breaking things in the production environment is most likely a larger problem. One cannot simply revert everything to last night's backup -- today's transactions must be maintained. So we can say that moving fast is safer in developer sandboxes and riskier in production. (Just about everything is riskier in production, I think.)

But breaking things and fixing them is not enough, either. There is little point in breaking something and then fixing it by putting things back as they were.

As I see it, the point of breaking things (and fixing them) is to learn. One can learn about the system: its strengths and weaknesses, how errors are propagated, the dependencies of different components, and the information contained in logs.

With new information, one can fix a system and provide a solution that is better than the previous design. One can identify future areas for improvements. One can understand the limitations of external services and third-party libraries. That knowledge can be used to improve the system, to make it more resilient against failures, to make it more flexible for future enhancements.

So yes, by all means move fast and break things. But also fix things, and learn about the system.

Monday, November 21, 2022

More Twitter

Elon Musk has stirred up quite a controversy with his latest actions at Twitter (namely, terminating employment of a large number of employees, terminating the contracts for a large number of contractors, and discontinuing many of Twitter's services). His decisions have been almost universally derided; it seems that the entire internet is against him.

Let's take a contrarian position. Let's assume -- for the moment -- that Musk knows what he is doing, and that he has good reasons for his actions. Why would he take those actions, and what is his goal?

The former is open to speculation. My thought is that Twitter is losing money (it is) and is unable to fill the gap between income and "outgo" with investments. Thus, Twitter must raise revenue or reduce spending, or some combination of both. While this fits with Musk's actions, it may or may not be his motivation. 

The question of Musk's goal may be easier to answer. His goal is to improve the performance of Twitter, making it profitable and either keeping the company or selling it. (We can rule out the goal of destroying the company.) Keeping Twitter gives Musk a large communication channel to lots of people (free advertising for Tesla?) and makes him a notable figure in the tech (software) community. If Musk can "turn Twitter around" (that is, make it profitable, whether he keeps it or sells it) he builds on his reputation as a capable business leader.

Reducing the staff at Twitter has two immediate effects. The first is obvious: reduced expenses. The second is less obvious: a smaller company with fewer teams, and therefore more responsive. Usually, a smaller organization can make decisions faster than a large one, and can act faster than a large one.

It is true that a lot of "institutional knowledge" can be lost with large decreases in staff. That knowledge ranges from the design of Twitter's core software and its databases to its processes for updates and its operations (keeping the site running). Yet a lot of knowledge can be stored in software (and database structures), and read by others if the software is well-written.

I'm not ready to bury Twitter just yet. Musk may be able to make Twitter profitable and keep a commanding presence in the tech space.

But I'm also not ready to build on top of Twitter. Musk's effort may fail, and Twitter may fail. I'm taking a cautious approach, using it for distributing and collecting non-critical information.

Wednesday, November 2, 2022

Twitter

Elon Musk has bought Twitter and started making changes. Lots of people have commented on the changes. Here are my thoughts.

Musk's actions are radical and seem reckless. (At least, they seem reckless to me.) Dissolving the board, terminating employment of senior managers, demanding that employees work 84-hour weeks to quickly implement a new feature (a fee for the blue 'authenticated' checkmark), and threatening to terminate the employment of employees who don't meet performance metrics are no way to win friends -- although it may influence people.

Musk may think that running Twitter is similar to running his other companies. But Tesla, SpaceX, and The Boring Company are quite different from Twitter.

Twitter has a number of components. It has software: the various clients that provide Twitter to devices and PCs, the database of tweets, the query routines that select the tweets to show to individuals, and advertising inventory (ads) and the functions that inject those ads into the viewed streams.

But notice that the database of tweets is not made by Twitter. It is made by Twitter's users. It is the user base that creates the tweets, not Twitter employees. (Nor are they mined from the ground or grown on trees.)

The risk that Twitter now faces is one of reputation. If the quality (or the perceived quality) of Twitter falls, people (users) will leave. And like all social media, the value of Twitter is mostly defined by how many other people are on the service. Facebook's predecessor MySpace knows this, as does MySpace's predecessor Friendster.

Social media is like a telephone. A telephone is useful when lots of people have them. If you were the only person on Earth with a phone, it would be useless to you. (Who could you call?) The more people who use Twitter, the more valuable it is.

Musk's actions are damaging Twitter's reputation. A number of people have already closed their accounts, and more are claiming that they will do so in the future. (Those future closures haven't occurred, and it is possible that those individuals will decide to stay on Twitter.)

As I see it, Twitter has technical problems (all companies do) but their larger issues are management and leadership issues. Musk may have made some unforced errors that will drive away users, advertisers, employees, and future investors.