Wednesday, May 26, 2021

"Clean Code" isn't necessary today, because it succeeded

A recent rant -- no, a complaint -- about Robert Martin's "Clean Code" raises interesting points.

Yet while reading the complaint, I got the feeling that the author (of the complaint) didn't understand the situation at the time, and the purpose of "Clean Code".

"Clean Code" was written in 2008. Prior to that, programming in Windows was difficult. The tools were inconsistent and difficult to use. Not all tools worked with other tools. Getting one's development environment working was a challenge, as was keeping it working. Programmers had to worry about lots of things, and the quality of the code was low on the list.

Programmers in companies (that is, most programmers) had to worry about schedules and due dates. Here, the priorities were clear: shipping a poorly-built product on time was better than shipping a "clean code" product late.

Very few programmers worried about the quality of their code. Very few programmers talked about "clean code". Robert Martin was one of the few who did.

Converting poorly-designed code into "clean code" was not easy. We did not even have a standard for clean code -- everyone had their own ideas. Robert Martin gave us some standards for clean code. He also made the argument that making code clean from the start was better than writing poor code and later making it clean. The work of making code clean (refactoring large functions into smaller ones, renaming variables and functions) often had to be done manually, and it took a fair amount of effort.

Over time, programming tools improved. Windows improved. Testing tools improved. Version control improved. As tools got better, programmers could spend less time worrying about them and more time worrying about the quality of code. Also, the argument for "clean code" was beginning to make sense to managers. Studies were showing that clean code was less expensive overall: slower to write but faster to debug, faster to change, easier for new team members to understand, and less prone to defects when changes were made. Those studies made the argument for clean code to managers in terms of money, time, and risk (three of the four dimensions that managers understand).

I think we can say that "Clean Code" marks the time that PC programmers (programmers of Windows applications) stopped worrying about tools and technology and were able to worry about the quality of their code. Or at least a significant number of programmers.

"Clean Code" is not suited to today's development. It relies on an early version of Java. It is heavily object-oriented. It recommends some extreme code techniques ("functions of no more than two or three lines"). Yet it contains truths ("a function should have no side effects").

But the suitability of "Clean Code" is not my point. My point is that "Clean Code" marked a turning point in IT: a point in time when programmers had enough free time to think about the quality of code, and they could be persuaded to do so (as could managers).

Since that time, many of the ideas of "Clean Code" have become accepted as standard practices. The author of the complaint (the only name I found was the nom-de-plume "qntm") notes "This is all good advice, if a little tepid and entry-level".

Perhaps we can look back with some sense of satisfaction that "Clean Code" is no longer necessary. Robert Martin's plea for programmers to think about quality was a success -- programmers now think about quality, so much so that programmers with less than a decade of experience might think that it has always been done this way.

Three additional thoughts --

This complaint marks another point of progress: the point at which programmers accept as a given that code quality is important and deserving of attention. The actual point may have occurred earlier, but this complaint is documented evidence of the attitude about code quality.

The rise of Agile methods may have helped the focus on quality to gain acceptance. (Or perhaps the focus on quality helped Agile gain acceptance. Or maybe they form a self-reinforcing cycle.)

The Linux folks can (rightfully) point to Kernighan and Plauger's "The Elements of Programming Style" from 1974, some thirty-plus years ahead of Martin's "Clean Code". Written from experience on Unix systems, it covers many of the same ideas. Its early arrival is not surprising; Unix had a stable set of tools that worked well together, and Unix was often used in research and academic settings, which have different attitudes towards due dates and quality of work.

Tuesday, May 18, 2021

The new bold colors of iMacs are for Apple, not you

I must admit that when I first saw Apple's new iMacs and the bold colors that Apple assigned to them, I was puzzled. Why would anyone want those iMacs?

Not that the colors are unappealing. They are, in fact, quite nice.

But why put such colors on iMacs? That is, why put such colors on computers that are not portable?

I understand the reasoning for colors on laptops. Bright or bold colors (and the Apple logo) on laptops make sense. MacBook owners identify with their laptops. In an earlier age, when MacBooks were white, their owners festooned them with stickers and artwork. Today, carrying around a MacBook lets everyone else know that one is in the club of Cool Apple Kids.

But that logic doesn't work for iMacs. People don't (as a general rule) carry their iMacs from home to the office, or use them at the local coffee shop.

And why, of all places, did Apple decide to put the colors on the back of the display? That is the one place that the user isn't looking. Users of iMacs -- at least the users who I know -- look at the display and rarely look at the back of the unit. Most folks position the iMac on a desk up against a wall, where no one can see the back of the iMac.

After a bit of puzzling, I arrived at an answer.

The colors on the iMac are not for the user.

The colors on the iMac are for Apple.

Apple's positioning of colors on the back of an iMac, and the use of bold colors, makes sense from a certain point of view -- advertising. Specifically, advertising in the corporate environment.

It's true that iMacs used in a home will be positioned on desks against a wall. But that doesn't hold for the corporate environment, with its open office plans where people sit around desks that are little more than flat tables.

In those offices, people do see the backs of computers (or displays, if the CPU is on the desk or below on the floor).

By using bold colors, Apple lets everyone in an office quickly see that a new computer has arrived. All of the other computers are black; the new Apple iMacs are red, or blue, or green, or yellow. A new iMac in an office shouts out to the entire office "I'm an Apple iMac!" -- no, better than that, it shouts "I'm a new Apple iMac!".

This is advertising, and I think it will be effective. Once one person gets a new iMac, many other folks in the office will want new iMacs. "If Sam can get a new iMac, why can't I?" will be the thinking.

Notice that this advertising is targeted for offices. It doesn't work in the home. (Although in the home, with everyone knowing what everyone else has, bold colors are not necessary to generate demand.) This advertising works in offices, especially those offices where equipment is associated with status. iMacs are the Cool New Thing, and the Very Cool People always have the Cool New Thing.

Apple is leveraging its brand well.

Monday, May 10, 2021

Large programming languages considered harmful

I have become disenchanted with the C# programming language. When it was introduced in 2001, I liked the language. But the last few years have seen me less interested. I finally figured out why.

The reason for my disenchantment is the size of C#. The original version was a medium-sized language. It was an object-oriented language, and in many ways a copy of Java (which was also a medium-sized language in 2001).

Over the years, Microsoft has released new versions of C#. Each new version added features, and increased the capabilities of the language. But as Microsoft increased the capabilities, it also increased the size of the language.

The size of a programming language is an imprecise concept. It is more than a simple count of the keywords, or the number of rules for syntax. The measure I like to use is a rough guess of how much space it requires in the head of a programmer; how much brainpower is required to learn the language and how many neurons are needed to remember the different concepts, keywords, and rules of the language.

Such a measure has not been made with any tools, at least not that I know of. All I have is a rough estimate of a language's size. But that rough estimate is good enough to classify languages into small (BASIC, AWK, original FORTRAN), medium (Ruby, Python), and large (COBOL, C#, and Perl).

It may seem natural that languages expand over time. Languages other than C# have been expanded: Java (by Sun and later Oracle), Visual Basic (by Microsoft), C++ (by committee), Perl, Python, Ruby, and even languages such as COBOL and Fortran.

But such expansions of languages worry me. The source of my worry goes back to the "language wars" of the early days of computing.

In the 1960s, 1970s, and 1980s programmers argued (passionately) over programming languages. C vs Pascal, BASIC vs FORTRAN, Assembly language vs... everything.

Those arguments were fueled, in my opinion, mostly by the high cost of changing. Programming languages were not free. Compilers and interpreters were sold (or licensed). Changing languages meant spending for the new language -- and abandoning the investment in the old. And that meant that, once invested in a language, you were loath to give it up. And that meant you would defend that choice of programming language. People would rather fight than switch.

In the 2000s, thanks to open source, compilers and interpreters became free. The financial cost of changing from one language to another disappeared. And that meant that people could switch programming languages. And that meant that people could switch rather than fight.

So why am I worried, now, in 2021, about a new round of language wars?

The reason is the size of programming languages. More specifically, the size of the environment for any one programming language. That environment includes the language, the compiler (or interpreter), the standard library (or common packages used for development), and the IDE. Each of these components requires some amount of effort to learn and remember.

As each of these environments grows, the effort to learn it grows. And that means that the effort to switch from one language to another also grows. Changing from C# to Python, for example, requires not only learning the Python syntax, but also learning the common packages that are necessary for effective Python programs, and learning the IDE (probably PyCharm, which is quite different from Visual Studio).
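As a small (and made-up) illustration: even a trivial task in Python is less about the syntax and more about knowing what the standard library and common packages offer. The file name and column name below are invented for the example.

    # Average a column from a CSV file. The syntax is the easy part;
    # knowing that csv.DictReader and statistics.mean exist -- and how
    # they behave -- is the part that takes time to learn.
    import csv
    import statistics

    def average_price(filename):
        with open(filename, newline="") as f:
            return statistics.mean(float(row["price"])
                                   for row in csv.DictReader(f))

A C# programmer can read that code easily enough, but writing it fluently means carrying a mental map of the Python environment, and that map takes time to build.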

We are rebuilding the barriers between programming languages. The old barrier was financial: it cost a lot to switch from one language to another. The new barrier is not financial but technical: the tools are free but the time to learn them is significant.

Barriers to switching programming languages can put us back in the position of defending our choices. Once again, programmers may rather fight than switch.

Monday, May 3, 2021

The fall and possible rise of UML

Lots of folks are discussing UML, and specifically the death of UML. What killed UML? Lots of people have different ideas. I have some ideas too. Rather than pin the failure on one reason, I have a bunch.

First, our methods changed, and UML was not a good fit with the newer methods. UML was created in the world of large-scale waterfall projects. It works well with those projects, with design up front (disparagingly called "Big Design Up Front") as a precursor to coding. UML does not work well with Agile methods, where design and coding occur in parallel. UML assigns value to code; the idea of up-front design is to build the right code from the start and not revise it. In the UML world, changes to code are expensive and to be avoided. UML attaches similar value to the designs themselves, with the same desire to avoid changes to designs. (Although changes to designs are preferred over changes to code.)

UML works well with object-oriented programming, but not with cloud computing (small scripts instead of big code).

Second, UML entailed costs. UML notation was difficult to learn. Or at least required some time to learn. The tools took time to learn, and they also cost significant sums. The mindset was "invest now (by learning UML and buying the tools) to prevent more costly mistakes later". At the time, there were charts showing the cost of a mistake, and comparing the cost of detecting the mistake at different points in the project. A mistake detected early (say in requirements or design) was less expensive than a mistake detected later (say in coding or testing). Mistakes detected after deployment were the most expensive. This effect justified the expense of UML tools.

But UML tools were expensive, and not everyone on the team got UML tools. The tools were reserved for the designers; coders were limited to printed copies of UML diagrams. This led to the notion that designers were special and worth more than programmers. (The elite received UML tools; the plebes did not.) This in turn led to resentment among programmers.

A third (and often overlooked) reason was the expense for designers. When programmers performed both design and programming, their salaries covered both activities. UML formalized the design process and required a subteam of designers, and each of those designers required a salary. (And they often wanted salaries higher than those of programmers.)

A fourth (and also often overlooked) reason was the added delay to the development process.

UML created an additional step in the waterfall process. Theoretically, it did not, because UML was simply formalized design documents. But in practice, UML did create an additional step.

Before UML, a project would have the formal steps of requirements, design, coding, testing, and deployment. That's what managers thought they had. In reality, the steps were slightly different than those formal steps. The actual steps were requirements, design and coding, testing, and deployment.

Before UML: requirements -> design and code -> test -> deploy

Notice that design and code form one step, not two. That step was an activity performed by the programming team. As it was a single team, people could move from designing to coding and back again, revising the design as they developed the code.

UML and a formal design deliverable changed the process to the five steps the managers thought they had:

With UML: requirements -> design -> code -> test -> deploy

UML forced the separation of design from coding, and in doing so, changed the (informal) four-step process to a five-step process.

Programmers were used to designing as well as programming. With UML, programmers could not unilaterally change the design; they had to push back against the design. This set up conflicts between designers and programmers. Sometimes the designers "gave in" and allowed a change; other times they "held fast" and programmers had to build something they considered wrong. In either case, such differences introduced delays and political struggles when there were none before.

Those are my observations on UML, and why it failed: new methods to which UML was not suited, the direct expense of tools and training, the direct expense of designers, and a slower development process.

* * * * *

In a way, I am sorry for the loss of UML. I think it can be a helpful tool. But not in the way it was implemented.

UML was added to a project as a design aid, and one that occurred prior to coding. Perhaps it is better to have UML as a diagnostic instead of an aspiration. That is, instead of creating UML and then generating code from UML, create code and then generate UML from the code.

In this way, UML could be a kind of "super lint" that reports on the design of the system.

There was "round-tripping" which allowed for UML to be converted to code, and then that code converted back to UML. That is not the same; it leaves UML as the center for design. And round-tripping never really worked the way we needed. A one-way code-to-UML diagnostic puts code at the center and UML as a tool to assist the programmers. (That's my bias as a programmer showing.)

A code-to-UML diagnostic could be helpful to Agile projects, just as 'lint' and other style checkers are. The tools may be less expensive (we've gotten better at tools, and a diagnostic tool is easier to build than a UML editor). We would not have a separate design team, avoiding that expense (and the associated politics). And a diagnostic tool would not slow the development process -- or at least not so much.
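To show what I mean, here is a minimal sketch in Python (a toy of my own, not an existing tool): it reads a Python source file with the standard ast module and prints a PlantUML class diagram. A real diagnostic would report far more, but the direction -- code first, UML generated from it -- is the point.

    # A toy code-to-UML diagnostic: parse a Python source file and emit
    # a PlantUML class diagram (classes, methods, and simple inheritance).
    import ast
    import sys

    def classes_to_plantuml(source):
        lines = ["@startuml"]
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ClassDef):
                lines.append(f"class {node.name}")
                # Record inheritance for plain "class Child(Parent)" cases.
                for base in node.bases:
                    if isinstance(base, ast.Name):
                        lines.append(f"{base.id} <|-- {node.name}")
                # List the methods defined directly on the class.
                for item in node.body:
                    if isinstance(item, ast.FunctionDef):
                        lines.append(f"{node.name} : {item.name}()")
        lines.append("@enduml")
        return "\n".join(lines)

    if __name__ == "__main__":
        with open(sys.argv[1]) as f:
            print(classes_to_plantuml(f.read()))

The output could be fed to PlantUML to draw the diagram, or scanned by a style checker looking for oversized classes and tangled inheritance -- which is the "super lint" idea.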

Maybe we will see such a tool. If we do, it will have to be developed by the open-source community. (That is, an individual who wants to scratch an itch, much like Perl, or Python, or Linux.) I don't see a large corporation building one; I don't see a business model for it.

Anyone want to scratch an itch?