Wednesday, April 22, 2020

Three levels of Python programming

Python programming is not always what we think it is. I now think of Python programming as having three levels, three distinct forms of programming.

The first level is what we typically think of as programming in Python. It is writing Python code. This is the impression one gets when one has an "introduction to Python" class. The first program of "Hello, World" is written in Python, as are the successive programs in the class. Programs become more complex, with the addition of functions and later classes to organize larger and larger programs.

In this level, all of the code is Python. It is Python from top to bottom. And it works, for simple applications.

For some applications, it is not "Python all the way down". Some applications are complex. They must manage large quantities of data, and perform a significant number of calculations, and they must do it quickly. A Python-only solution is not satisfactory, because interpreted Python code is slow at this kind of heavy computation.

At this point, programmers include carefully-constructed modules that perform calculations quickly. The modules "numpy" and "scipy" are the common modules, but there are many.

This is the second level of programming in Python. It is not often thought of as "programming in Python" or even "programming". It is more often thought of as "importing modules and using the classes and functions in those modules".

That mindset makes sense. This work is less about Python and more about knowing which modules are available and which functions they provide. The task of programming is different; instead of writing all of the code, one assembles a solution from pre-packaged modules and uses Python to connect the various pieces.

That is why I think of it as a second level of programming. It is a different type of programming, a different type of thinking. It is not "how can I write code?" but instead "what existing code will perform this computation?".
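A minimal sketch of the difference between the first two levels, assuming a simple task (the mean of a list of numbers). The standard-library "statistics" module stands in here for modules such as "numpy", so that the sketch runs without third-party packages. The function names and the sample data are hypothetical.

```python
import statistics


def mean_level_one(values):
    """Level one: every step is written by hand in Python."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
    return total / count


def mean_level_two(values):
    """Level two: find an existing module that already does the work."""
    return statistics.fmean(values)


readings = [2.0, 4.0, 6.0, 8.0]  # hypothetical sample data
print(mean_level_one(readings))
print(mean_level_two(readings))
```

Both functions give the same answer. The second one required no algorithm, only the knowledge that the module and the function exist.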

Which brings us to the third level.

The third level of Python programming is building your own module. The existing Python modules, if they do what you need, are fast and effective. But if they do not do what you need, then they are not helpful.

Writing your own solution in Python will result in a slow program -- perhaps unacceptably slow. Therefore, as a last resort, one writes one's own module (in C or C++) and imports it into the main Python program.

This is, purists will argue, programming not in Python but in C or C++. They have a point -- it is writing C or C++ code.

But when the objective is to build a system to perform a specific task, and the top layer of the application is written in Python, then one can argue that the C code is merely an extension of the same application.

Or, one can think of the task as creating a system in multiple modules and multiple languages, not a single program in a single programming language, and using the best language for each piece of the system.
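A rough sketch of the Python side of that boundary, under some loud assumptions. Writing the C module itself cannot be shown in a Python snippet, so the system's C math library stands in for a hand-written module, and the standard "ctypes" module does the loading. The function name load_c_sqrt is hypothetical, and the code falls back to Python's own math.sqrt on platforms where the C library cannot be found.

```python
import ctypes
import ctypes.util
import math


def load_c_sqrt():
    """Return sqrt from the C math library, or math.sqrt as a fallback."""
    path = ctypes.util.find_library("m")  # may be None on some platforms
    if path is None:
        return math.sqrt
    try:
        library = ctypes.CDLL(path)
    except OSError:
        return math.sqrt
    # Declare the C signature: double sqrt(double)
    library.sqrt.argtypes = [ctypes.c_double]
    library.sqrt.restype = ctypes.c_double
    return library.sqrt


c_sqrt = load_c_sqrt()
print(c_sqrt(9.0))
```

The main program calls c_sqrt like any other Python function. The caller does not need to know which language sits behind it.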

Python programming (or systems development) is often less about coding in a particular language and more about solving problems. With Python, we have three levels at which we can solve those problems.

Thursday, April 16, 2020

Lessons from the 2020 pandemic

In the middle of the 2020 pandemic, we can look around and see that many companies have shifted from "work in the office" to "work from home". (Many companies, especially retail, restaurants, movie theaters, and entertainment venues, have closed completely, with no ability to work from home.)

For those companies that have made the change, we can look and wonder why they did not make this change earlier. While some companies offered limited "work from home" opportunities (and many companies offered nothing), the pandemic has forced companies to change. Why the sudden change?

Some observations:

Shifting from "work in the office" to "work from home" is possible when the infrastructure is present. The automation of work, starting with PC-based word processors (in the 1980s) and continuing with networks (in the 1990s) and then connected networks and high-speed internet in the home (in the 2000s), allows remote work to occur. But even with the infrastructure in place, office culture held that face-to-face interactions and work in the office were better than work from home.

Allowing your entire employee base (or a large percentage of it) to work from home is easy when a government order closes your office and forbids employees from working in it. Some work, even the small amount that gets done when working from home, is better than none.

Allowing your workforce to work from home is also easy when all other companies -- especially your competition -- are allowing their employees to work from home. Being "part of the crowd" reduces the risk (or the perceived risk) of such a change. With all companies making the change, the risk reverses: the oddball is not the company that shifts to "work from home" but the company that remains in the office.

Changing from "work in the office" to "work from home" is also easy when all of your employees make the change, instead of a few chosen workers. The typical approach to change (small pilot programs with a few employees) sets up the dynamics of "chosen" and "not chosen" employees, which can create resentment among the "not chosen" employees. When all employees shift to "work from home" it is clear that there is no favoritism and that "work from home" is not a reward for good behavior.


The change from "work in the office" to "work from home" did happen, for many companies, quickly and easily. Much of that ease of change was from the risks, or rather the change in the risk profile. The technology was in place, other companies were making the same change, all employees (or as many as practical) were involved, and the government was issuing orders that made "work in the office" impossible.

Looking forward, will companies shift back to "work in the office"? I suspect that the office culture of face-to-face interactions still holds, so that will pull managers towards a "work in the office" arrangement. In the other direction, no company wants to be first, especially when the risk of COVID-19 is still present. The decision to shift from "work from home" to "work in the office" will not be an easy one.

Tuesday, April 7, 2020

Tech debt considered possibly not harmful

Current wisdom holds that tech debt (poorly-implemented programs or sections of programs) is bad, and should be avoided. Much has been said about "clean code" and keeping the code in a good state of repair at all times. The Agile Development methodology recognizes the need for refactoring, to reduce tech debt.

Everyone agrees that good code is good (for the project and for the company) and bad code is bad (also, for the project and the company).

Except possibly me.

Which is to say, I am not convinced that every project should take steps to avoid or reduce tech debt. I am of the opinion that some projects should avoid or reduce tech debt, and other projects should not.

Which projects should avoid tech debt -- and which projects should allow it -- is an interesting question, and not always easy to answer. But the answers lie within another question: why is tech debt bad?

Tech debt is bad, we all agree (including myself), in that it increases the development cost. The forms that tech debt takes -- poorly written programs, older programming languages, or programs that depend on old versions of interpreters or compilers -- slow the development of new features and fixes to existing features. Tech debt also makes for a brittle code base, such that a small change in one section can have large effects throughout the entire system. Thus, even the smallest of changes must be carefully analyzed, carefully designed, carefully implemented, carefully tested, carefully reviewed, and carefully tested again. Each of those "carefully" operations requires time and effort.

But preventing or fixing tech debt also has a cost. It diverts development resources from adding new features into fixing old code. That diversion can delay the implementation of new features (if you keep the size of the development team constant) or increase the cost of the development team (if you add members).

The decision to reduce tech debt depends on one thing, and one thing only: the value that the organization places on the software. And the value of software, while it can be calculated with the rules of accounting for capital expenditures and depreciation, is really dependent on how you use the software.

Any code base can have value from the following uses:
  • Using the application (that is, running it) for company business
  • Taking pieces of the code for use in other applications
  • Copying design of the code for use in other applications (in a different language)
  • Selling the code to another organization
If you are actively using the software (and most likely maintaining it with small fixes and possibly large enhancements), then the effects of tech debt will drive up the development costs, and tech debt should be evaluated and, when reasonable, reduced.

If you are not using the software, but intend to use pieces of the software in other systems, then the cost of tech debt must be discounted. Only the pieces that will be transferred should be considered. The remaining pieces, which will be discarded, have no intrinsic value and you should not fix their tech debt.

A different calculation applies for the transfer of not code but design. Transferring a poor design from one system to another is maintaining a poor design. But it may be more effective to fix the tech debt on the receiving end, rather than in the source system.

If you are selling the code, and this is a one-time event, then I see little incentive to improve the code. Odds are that the purchaser will not evaluate the quality of the code, or provide a higher purchase price for the improved code.

If you sell code often -- and perhaps are in the business of selling code -- then your code is your product, and you should look to remove tech debt from your code. Your code is your offering to your customers, and your reputation is built on it.

The one scenario that I did not list is the decommissioning of software. If your software has a short life (and you must define "short" as it varies from organization to organization) with no use after that life, then any investment will have a limited time for a return. We don't fix cars that are about to be hauled off to the junkyard, and we shouldn't fix software that we are about to discard.
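The "limited time for a return" can be put into rough numbers. This is a simplified sketch, not an accounting method: it assumes a one-time cost for the fix and a constant saving per month afterwards. The function name, the parameters, and the figures are all hypothetical.

```python
def worth_fixing(fix_cost, monthly_savings, remaining_months):
    """Return True when the savings expected over the software's
    remaining life exceed the one-time cost of the fix."""
    expected_savings = monthly_savings * remaining_months
    return expected_savings > fix_cost


# Hypothetical figures: a 40-hour cleanup that saves 5 hours per month.
long_lived = worth_fixing(fix_cost=40, monthly_savings=5, remaining_months=36)
nearly_retired = worth_fixing(fix_cost=40, monthly_savings=5, remaining_months=3)
print(long_lived)
print(nearly_retired)
```

The same cleanup pays for itself in a system with three years ahead of it, and does not in a system with three months ahead of it. Only the remaining life changed.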

The decision to avoid or reduce tech debt depends on the future use of the software. For some systems, this is easy: long-lived systems such as the Linux kernel, Microsoft Word, or an accounting system all benefit from reduced tech debt. Other, short-lived code (such as a short script that is discarded at the end of the day) gains little from refactoring and improvements.

The difficult part in this is determining the future of the software. But once you know that, you know how much effort you should put into the removal of tech debt.

Friday, April 3, 2020

Information in code

We programmers think of code as providing information, to the programmer as well as the computer. Code is converted into an executable that performs tasks for us, which is information for the computer. Code is also a description of those tasks, suitable for a programmer to read.

But information in code is not uniformly distributed. Some constructs in code provide more information than others.

Let's look at three different constructs that each provide information: data types, variable names, and comments. They all provide information to the reader of the code, and are all useful.

The first construct is the variable type. It conveys information. It is most likely correct, and it is certainly consistent with the operations performed by the code. The variable type exists in multiple places in the code, although it may not be obvious. Anywhere the variable is used, the type is present.

Variable types prevent errors, by restricting the contents of the variable and operations on the variable. (At least in statically-typed languages.)

The second construct is the variable name. It also conveys information, but information that is different from the variable type. The variable name provides information about the intent of the variable. A good variable name describes the contents in such a way as to be useful to the programmer.

The name exists in multiple places in the code. It is (syntactically) linked to the variable -- where the variable is used, the name is used.

The third construct is the comment. A comment in code conveys information, or at least it is capable of conveying information. A well-written comment is useful to the programmer: It can inform the reader of reasons for the code. Comments can explain why the code was built in a particular way. No other construct (in code) can provide this information.

A comment exists in only one place in the code. It does not appear in multiple places, like variable types or variable names. Thus, the placement of a comment is important.

Comments may be incorrect. (They can be wrong from the start, or they can be correct and then left unchanged as the associated code is changed.)

Notice that all three of these elements provide value to the programmer. Each provides some value. Together they provide a comprehensive set of information: reasons, purpose, and constraints.
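A small sketch showing the three constructs side by side. The function and its names are hypothetical. Note that Python's type annotations are optional and are checked only by external tools, so here they play the role of the variable type more loosely than a type would in a statically-typed language.

```python
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Return the fixed monthly payment for a loan."""
    # The annotations above are the constraints: numbers in, a number out.
    # The names carry the purpose: principal, annual_rate, months.
    monthly_rate = annual_rate / 12
    # The reason: a zero-interest loan is handled separately because the
    # general formula below would divide by zero.
    if monthly_rate == 0:
        return principal / months
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)


print(monthly_payment(1200.0, 0.0, 12))
```

Remove the comment and the code still runs, but the reader must rediscover why the special case exists. The types and the names cannot supply that reason.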

All three of these elements are important when writing a program. I won't say "necessary", as some programs can be written with expressive data types and descriptive variable names, and no comments. But any program beyond the trivial will benefit from comments.

Also notice that these three elements are different. Selecting the type of variable is mostly a technical decision, with clear right-and-wrong answers. Choosing a name for a variable is often difficult -- although it can be easy when a variable's contents correspond to a real-world concept. Composing a comment to explain the reason for decisions requires an understanding of that decision and the skill to convey that decision in clear (and concise) language.

Effective programmers will use all of these constructs (variable types, variable names, and comments). They will develop skills for selecting the right data types, for assigning descriptive variable names, and they will write comments that are helpful to themselves and other programmers.