Tuesday, April 23, 2019

Full-stack developers and the split between development and system administration

The notion of a "full stack" developer has been with us for a while. Some say it is a better way to develop and deploy systems; others take the view that it is a way for a company to build systems at lower cost. Despite their differing opinions on the value of a full stack engineer, everyone agrees on the definition: A "full stack" developer (or engineer) is a person who can "do it all", from analysis to development and automated testing, from database design to web site deployment.

But here is a question: Why was there a split in functions? Why did we have separate roles for developers and system administrators? Why didn't we have combined roles from the beginning?

Well, at the very beginning of the modern computing era, we did have a single role. But things became complicated, and specialization was profitable for the providers of computers. Let's go back in time.

We're going way back in time, back before the current cloud-based, container-driven age. Back before the "old school" web age. Before the age of networked (but not internet-connected) PCs, and even before the PC era. We're going further back, before minicomputers and before commercial mainframes such as the IBM System/360.

We're going back to the dawn of modern electronic computing. This was a time before the operating system. Individuals who wanted to use a computer had to write their own code (machine code, not a high-level language such as COBOL), and those programs managed memory and manipulated input-output devices such as card readers and line printers. A program had total control of the computer -- there was no multiprocessing -- and it ran until it finished. When one programmer was finished with the computer, a second programmer could use it.

In this age, the programmer was a "full stack" developer, handling memory allocation, data structures, input and output routines, and business logic. There were no databases, no web servers, and no authentication protocols, but the programmer "did it all", including scheduling time on the computer with other programmers.

Once organizations developed programs that they found useful, especially programs that had to be run on a regular basis, they dedicated a person to the scheduling and running of those tasks. That person's job was to ensure that the important programs were run on the right day, at the right time, with the right resources (card decks and magnetic tapes).

Computer manufacturers provided people for those roles, and also provided training for client employees to learn the skills of the "system operator". There was a profit for the manufacturer -- and a cost to be avoided (or at least minimized) by the client. Hence, only a few people were given the training.

Of the five "waves" of computing technology (mainframes, minicomputers, personal computers, networked PCs, and web servers), most started with a brief period of "one person does it all" and then shifted to a model that divided labor among specialists. Mainframes split the work between programmers and system operators (and later, database administrators). Personal computers, by their very nature, started with one person doing everything, but later saw specialists for word processing, databases, and desktop publishing. Networked PCs saw specialization with enterprise administrators (such as Windows domain administrators) and programmers each learning different skills.

It was the first specialization of tasks, in the early mainframe era, that set the tone for later specializations.

Today, we're moving away from specialization. I suspect that the "full stack" engineer is desired by managers who have tired of the arguments between specialists. Companies don't want to hear sysadmins and programmers bickering about who is at fault when an error occurs; they want solutions. Forcing sysadmins and programmers to "wear the same hat" eliminates the arguments. (Or so managers hope.)

The specialization of tasks on the different computing platforms happened because it was more efficient. The different jobs required different skills, and it was easier (and cheaper) to train some individuals for some tasks and other individuals for other tasks, and to manage the two groups separately.

Perhaps the relative costs have changed. Perhaps, with our current technology, it is more difficult (and more expensive) to manage groups of specialists, and it is cheaper to train full-stack developers. That may say more about management skills than it does about technical skills.

Wednesday, April 10, 2019

Program language and program size

Can programs be "too big"? Does it depend on the language?

In the 1990s, the two popular programming languages from Microsoft were Visual Basic and Visual C++. (Microsoft also offered Fortran and an assembler, and I think COBOL, but they were used rarely.)

I used both Visual Basic and Visual C++. With Visual Basic it was easy to create a Windows application, but the applications in Visual Basic were limited. You could not, for example, launch a modal dialog from within a modal dialog. Visual C++ was much more capable; you had the entire Windows API available to you. But the construction of Visual C++ applications took more time and effort. A simple Visual Basic application could be "up and running" in a minute. The simplest Visual C++ application took at least twenty minutes. Applications with dialogs took quite a bit of time in Visual C++.

Visual Basic was better for small applications. They could be written quickly, and changed quickly. Visual C++ was better for large applications. Larger applications required more design and coding (and more testing) but could handle more complex tasks. Also, the performance benefits of C++ were only obtained for large applications.

(I will note that Microsoft has improved the experience since those early days of Windows programming. The .NET framework has made a large difference. Microsoft has also improved the dialog editors and other tools in what is now called Visual Studio.)

That early Windows experience got me thinking: are some languages better at small programs, and other languages better at large programs? Small programs written in languages that require a lot of code (verbose languages) carry a disadvantage because of the extra work. Visual C++ was a verbose language; Visual Basic was less so. Other languages weigh in at different points on the scale of verbosity.

Consider a "word count" program. (That is, a program to count the words in a file.) Different languages require different amounts of code. At the small end of the scale we have languages such as AWK and Perl. At the large end of the scale we have COBOL.

(I am considering lines of code here, and not executable size or the size of libraries. I don't count run-time environments or byte-code engines.)

I would much rather write (and maintain) the word-count program in AWK or Perl (or Ruby or Python). Not because these languages are modern, but because the program itself is small. (Trivial, actually.) The program in COBOL is large; COBOL has some string-handling functions (but not many), and it requires a fair amount of overhead to define the program. A COBOL program is long by design. COBOL is a verbose language.
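
To make the point concrete, here is a minimal sketch of the word-count program in Python (one of the terse languages I mentioned). It simply splits each line of standard input on whitespace and totals the pieces:

    # word_count.py -- count the words read from standard input
    import sys

    total = 0
    for line in sys.stdin:          # read the input one line at a time
        total += len(line.split())  # split on whitespace and count the pieces
    print(total)

Run it as "python word_count.py < somefile.txt". The equivalent COBOL program would need an IDENTIFICATION DIVISION, a DATA DIVISION with record layouts, and explicit file-handling code before any counting could begin.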

Thus, there is an incentive to build small programs in certain languages. (I should probably say that there is an incentive to build certain programs in certain languages.)

But that is on the small end of the scale of programs. What about the other end? Is there an incentive to build large programs in certain languages?

I believe that the answer is yes. Just as some languages are good for small programs, other languages are good for large programs. The languages that are good for large programs have structures and constructs which help us humans manage and understand code at a large scale.

Over the years, we have developed several techniques we use to manage source code. They include:

  • Multiple source files (#include files, copybooks, separate compiled files in a project, etc.)
  • A library of subroutines and functions (the "standard library")
  • A repository of libraries (CPAN, CRAN, gems, etc.)
  • The ability to define subroutines
  • The ability to define functions
  • Object-oriented programming (the ability to define types)
  • The ability to define interfaces
  • Mix-in fragments of classes
  • Lambdas and closures

These techniques help us by partitioning the code. We can "lump" and "split" the code into different subroutines, functions, modules, classes, and contexts. We can define rules to limit the information that is allowed to flow between the multiple "lumps" of a system. Limiting the flow of information simplifies the task of programming (or debugging, or documenting) a system.
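
As a small illustration (my own sketch, not taken from any particular system), here is the word-count idea again, this time partitioned behind a narrow interface. The caller sees only two methods; the internal running total is hidden and cannot be depended upon directly:

    # A sketch of partitioning: the rest of the program sees only the
    # public methods of WordCounter, not its internal state.
    class WordCounter:
        def __init__(self):
            self._total = 0            # internal state, hidden from callers

        def add_line(self, line: str) -> None:
            self._total += len(line.split())

        def total(self) -> int:
            return self._total

    counter = WordCounter()
    for line in ["hello world", "a second line of text"]:
        counter.add_line(line)
    print(counter.total())             # prints 7

The same discipline scales up: whether the "lump" is a class, a module, or a whole service, the caller is limited to the published interface, and that limit is what keeps a large program understandable.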

Is there a point when a program is simply "too big" for a language?

I think there are two concepts lurking in that question. The first is a relative answer, and the second is an absolute answer.

Let's start with a hypothetical example. A mind experiment, if you will.

Let's imagine a program. It can be any program, but it is small and simple. (Perhaps it is "Hello, world!") Let's pick a language for our program. As the program is small, let's pick a language that is good for small programs. (It could be Visual Basic or AWK.)

Let's continue our experiment by increasing the size of our program. As this was a hypothetical program, we can easily expand it. (We don't have to write the actual code -- we simply expand the code in our mind.)

Now, keeping our program in mind, and remembering our initial choice of a programming language, let us consider other languages. Is there a point when we would like to switch from our chosen programming language to another language?

The relative answer applies to a language when compared to a different language. In my earlier example, I compared Visual Basic with Visual C++. Visual Basic was better for small programs, Visual C++ for large programs.

The exact point of change is not clear. It wasn't clear in the early days of Windows programming, either. But there must be a crossover point, where the situation changes from "better in Visual Basic" to "better in Visual C++".

The two languages don't have to be Visual Basic and Visual C++. They could be any pair. One could compare COBOL and assembler, or Java and Perl, or Go and Ruby. Each pair has its own crossover point, but the crossover point is there: a size at which it becomes better to select the more verbose language, because of its capabilities for managing large amounts of code.

That's the relative case, which considers two languages and picks the better of the two. Then there is the absolute case, which considers only one language.

For the absolute case, the question is not "Which is the better language for a given program?", but "Should we write a program in a given language?". That is, there may be some programs which are too large, too complex, too difficult to write in a specific programming language.

Well-informed readers will be aware that a program written in a language that is "Turing complete" can be translated into any other programming language that is also "Turing complete". That is not the point. The question is not "Can this program be written in a given language?" but "Should this program be written in a given language?".

That is a much subtler question, and much more subjective. I may consider a program "too big" for language X while another developer might consider it within bounds. I don't have metrics for such a decision -- and even if I did, one could argue that my cutoff point (a complexity value of 2000, say) is arbitrary and that the better value is somewhat higher (perhaps 2750). One might argue that a more talented team can handle programs that are larger and more complex.

Someday we may have agreed-upon metrics, and someday we may have agreed-upon cutoff values. Someday we may be able to run our program through a tool for analysis, one that computes the complexity and compares the result to our cutoff values. Such a tool would be an impartial judge of the suitability of the programming language for our task. (Assuming that we write programs that are efficient and correct in the given programming language.)
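
We can approximate such a tool today. The sketch below is my own (the metric and the cutoff are arbitrary choices, as discussed above); it uses Python's ast module to count the branching constructs in a source file and flags the file when the count exceeds a threshold:

    # complexity_check.py -- a crude "is this program too big?" check
    # The metric (a count of branching constructs) and the cutoff are
    # arbitrary choices, not agreed-upon standards.
    import ast
    import sys

    CUTOFF = 200   # an arbitrary limit

    def branch_count(source: str) -> int:
        tree = ast.parse(source)
        branches = (ast.If, ast.For, ast.While, ast.Try, ast.FunctionDef)
        return sum(isinstance(node, branches) for node in ast.walk(tree))

    if __name__ == "__main__":
        with open(sys.argv[1]) as f:
            count = branch_count(f.read())
        verdict = "within bounds" if count <= CUTOFF else "over the limit"
        print(f"{sys.argv[1]}: {count} branching constructs -- {verdict}")

A real tool would use an agreed-upon metric (cyclomatic complexity, for example) rather than a raw count, but the shape of the check -- compute, compare, decide -- would be the same.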

Someday we may have all of that, and the discipline to discard (or re-design) programs that exceed the boundaries.

But we don't have that today.