Tuesday, July 20, 2021

Debugging

Development consists of several tasks. Analysis, design, coding, testing, and deployment are the ones typically listed. There is one more: debugging, and that is the task I want to talk about.

First, let me observe that programmers, as a group, like to improve their processes. Programmers write the compilers and editors and operating systems, and they build tools to make tasks easier.

Over the years, programmers have built tools to assist in the tasks of development. Programmers were unhappy with machine coding, so they wrote assemblers which converted text codes to numeric codes. They were unhappy with those early assemblers because they still had to compute locations for jump targets, so they wrote symbolic assemblers that did that work.

Programmers wrote compilers for higher-level languages, starting with FORTRAN and FLOW-MATIC and COBOL. We've created lots of languages, and lots of compilers, since.

Programmers created editors to allow for creation and modification of source code. Programmers have created lots of editors, from simple text editors that can run on a paper-printing terminal to the sophisticated editors in today's IDEs.

Oh, yes, programmers created IDEs (integrated development environments) too.

And tools for automated testing.

And tools to simplify deployment.

Programmers have made lots of tools to make the job easier, for every aspect of development.

Except debugging. Debugging has not changed in decades.

There are three techniques for debugging, and they have not changed in decades.

Desk check: Not used today. Used in the days of mainframe and batch processing, prior to interactive programming. To "desk check" a program, one looks at the source code (usually on paper) and checks it for errors.

This technique was replaced by tools such as lint and techniques such as code reviews and pair programming.

Logging: Modify the code to print information to a file for later examination. Also known as "debug by printf()".

This technique is in use today.
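For readers who have not seen it, here is a minimal sketch of the logging technique in Python, using the standard logging module. The function average() and the file name debug.log are made up for illustration.

```python
import logging

# Send debug messages to a file for later examination.
# "debug.log" is a hypothetical file name chosen for this sketch.
logging.basicConfig(
    filename="debug.log",
    level=logging.DEBUG,
    format="%(asctime)s %(funcName)s: %(message)s",
)
log = logging.getLogger(__name__)


def average(values):
    """Return the mean of values, logging intermediate state along the way."""
    log.debug("values=%r", values)
    total = sum(values)
    log.debug("total=%r count=%d", total, len(values))
    return total / len(values)
```

After a run, one reads debug.log and reconstructs what the program did. Note that the programmer still has to decide, in advance, which values are worth printing.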

Interactive debugging: This technique has been around since the early days of Unix. It was available in 8-bit operating systems like CP/M (the DDT program). The basic idea: Run the program with the debugger, pausing execution at some point. The debugger keeps the program loaded in memory, and one can examine or modify data. Some debuggers allow you to modify the code (typically with interpreted languages).

This technique is in use today. Modern IDEs such as Visual Studio and PyCharm provide interactive debuggers.

Those are the three techniques. They are fairly low-level technologies, and require the programmer to keep a lot of knowledge in his or her head.

These techniques gave us Kernighan's quote:

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"

— The Elements of Programming Style, 2nd edition, chapter 2

These debugging techniques are the equivalent of assemblers. They allow programmers to do the job, but put a lot of work on the programmers. They assist with the mechanical aspect of the task, but not the functional aspect. A programmer, working on a defect and using a debugger, usually follows this procedure:

- understand the defect
- load the program in the debugger
- place some breakpoints in the source code, to pause execution at points that seem close to the error
- start the program running, wait for a breakpoint
- examine the state of the program (variables and their contents)
- step through the program, one line at a time, to see which decisions are made ('if' statements)

This process requires the programmer to keep a model of the program inside his or her head. It requires concentration, and interruptions or distractions can destroy that model, requiring the programmer to start again.

I think that we are ready for a breakthrough in debugging. A new approach that will make it easier for the programmer.

That new approach, I think, will be innovative. It will not be an incremental improvement on the interactive debuggers of today. (Those debuggers are the result of 30-odd years of incremental improvements, and they still require lots of concentration.)

The new debugger may be something completely new, such as running two (slightly different) versions of the same program and identifying the points in the code where execution varies.
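To make the two-versions idea a little more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not a design for the tool: it uses the standard sys.settrace hook to record which lines run (and the local variables at each step), then reports the first step where two traces differ. The names record_trace, first_divergence, price_v1, and price_v2 are all invented for this example.

```python
import sys


def record_trace(func, *args):
    """Run func(*args); return a list of (line offset, locals) per executed line."""
    code = func.__code__
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames other than func itself
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            events.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()  # keep any tracer that is already installed
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return events


def first_divergence(trace_a, trace_b):
    """Return (step, event_a, event_b) where the traces first differ, or None."""
    for step, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return step, a, b
    if len(trace_a) != len(trace_b):
        return min(len(trace_a), len(trace_b)), None, None
    return None


# Two slightly different versions of the same (hypothetical) function.
def price_v1(qty):
    total = qty * 10
    if qty > 5:
        total -= 5
    return total


def price_v2(qty):
    total = qty * 10
    if qty >= 5:
        total -= 5
    return total


if __name__ == "__main__":
    print(first_divergence(record_trace(price_v1, 5), record_trace(price_v2, 5)))
```

With an input of 5, the two traces agree up to the 'if' statement and then part ways: one version applies the discount and the other skips it. With an input of 1, there is no divergence to report. A real tool would have to align traces across larger code changes, which this sketch does not attempt.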

Or possibly new techniques for visualizing the data of the program. Today's debuggers show us everything, with limited ways to pick out the items of interest and hide the items that we don't care about and don't want to see.

Or possibly visualization of the program's state, which would be a combination of variables and executed statements.
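As a small, hedged sketch of what "variables plus executed statements" might look like, the Python below records a history entry only when a variable of interest changes, along with the line that had just run. It again relies on the standard sys.settrace hook; trace_state, running_max, and the watch list are names I made up for the example.

```python
import sys


def trace_state(func, watch, *args):
    """Run func(*args); return (result, history).

    history holds (line offset, watched values) each time a watched
    variable changes; the offset is the line that ran just before the change
    was seen. Snapshots hold references, so this suits simple values best.
    """
    code = func.__code__
    history = []
    last = {}
    prev_line = None

    def tracer(frame, event, arg):
        nonlocal last, prev_line
        if frame.f_code is not code:
            return None  # ignore frames other than func itself
        if event in ("line", "return"):
            snapshot = {n: frame.f_locals[n] for n in watch if n in frame.f_locals}
            if snapshot != last:
                history.append((prev_line, snapshot))
                last = snapshot
            prev_line = frame.f_lineno - code.co_firstlineno
        return tracer

    previous = sys.gettrace()  # keep any tracer that is already installed
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, history


def running_max(values):
    best = values[0]
    for v in values:
        scratch = v * 2  # a variable we do not care about
        if v > best:
            best = v
    return best


if __name__ == "__main__":
    result, history = trace_state(running_max, ["best"], [3, 1, 7])
    print(result)
    for line, values in history:
        print(line, values)
```

The loop variable and the scratch value never appear in the history; only the assignments to 'best' do. That is the kind of filtering I would like a debugger to do for me, without my having to write the tracer.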

I will admit that creating a debugger (especially a new-style debugger) is hard. I have written two debuggers in my career: one for 8080 assembly language and another for an interpreter for BASIC. Both were challenges, and I was not happy with the results for either of them. I suspect that to write a debugger, one must be twice as clever as when writing the compiler or interpreter.

Yet I am hopeful that we will see a new kind of debugger. It may start as a tool specific to one language. It may be for an established language, but I suspect it will be for a newer one. Possibly a brand-new language with a brand-new debugger. (I suspect that it will be an interpreted language.) Once people see the advantages of it, the idea will be adopted by other language teams.

The new technique may be so different that we don't call it a debugger. We may give it a new name. So it may be that the new debugger is not a debugger at all.
