Tuesday, December 12, 2017

Do you want to be on time and on budget, or do you want a better product?

Project management in IT is full of options, opinions, and arguments. Yet one thing that just about everyone agrees on is this: a successful development project must have a clear vision of the product (the software) and everyone on the team has to understand that vision.

I'm not sure that I agree with that idea. But my explanation will be a bit lengthy.

I'll start with a summary of one of my projects: a BASIC interpreter.

* * * * *

It started with the development of an interpreter. My goal was not to build a BASIC interpreter, but to learn the Ruby programming language. I had built some small programs in Ruby, and I needed a larger, more ambitious project to learn the language in depth. (I learn by doing. Actually, I learn by making mistakes, and then fixing the mistakes. So an ambitious project was an opportunity to make mistakes.)

My initial clear vision of the product was just that: clear. I was to build a working interpreter for the BASIC language, implementing BASIC as described in a 1965 text by Kemeny and Kurtz (the authors of BASIC). That version had numeric variables but not text (string) variables. The lack of string variables simplified several aspects of the project, from parsing to execution. But the project was not trivial; there were some interesting aspects of a numeric-only BASIC language, including matrix operations and output formatting.

After some effort (and lots of mistakes), I had a working interpreter. It really ran BASIC! I could enter the programs from the "BASIC Programming" text, run them, and see the results!

The choice of Kemeny and Kurtz' "BASIC Programming" was fortuitous. It contains a series of programs, starting with simple ones and working up to complex programs, and it shows the output of each. I could build a very simple interpreter to run the initial programs, and then expand it gradually as I worked my way through the text. At each step I could check my work against the provided output.

Then things became interesting. After I had the interpreter working, I forked the source code and created a second interpreter that included string variables. A second interpreter was not part of my initial vision, and some might consider this change "scope creep". It is a valid criticism, because I was expanding the scope of the product.

Yet I felt that the expansion of features, the processing of string variables, was worth the effort. In my mind, there may be someone who wants a BASIC interpreter. (Goodness knows why, but perhaps they do.) If so, they most likely want a version that can handle string variables.

My reasoning wasn't "the product needs this feature to be successful"; it was "users of the product will find this feature helpful". I was making the lives of (possibly imaginary) users easier.

I had to find a different reference for my tests. "BASIC Programming" said nothing about string variables. So off I went, looking for old texts on BASIC. And I found them! I found three useful texts: Coan's "Basic BASIC", Tracton's "57 Practical Programs and Games", and David Ahl's "101 BASIC Computer Games".

And I was successful at adding string variables to the interpreter.

Things had become interesting (from a project management perspective) with the scope expansion for an interpreter that had string variables. And things stayed interesting: I kept expanding the scope. As I worked on a feature, I thought about new, different features. As I did, I noted them and kept working on the current feature. When I finished one feature, I started another.

I added statements to process arrays of data. BASIC can process individual variables (scalars) and matrices. I extended the definition of BASIC and created new statements to process arrays. (BASICs that handle matrices often process arrays as degenerate forms of matrices. The internal structure of the interpreter made it easy to add statements specific to arrays.)

I added statements to read and write files. This of course required statements to open and close files. These were a little challenging to create, but not that hard. Most of the work had already been done with the input-output processing for the console.

I added a trace option, to see each line as it executed. I found it useful for debugging. Using my logic from the expansion for string variables, if I found it useful then other users would (possibly) find it useful. And adding it was a simple operation: the interpreter was already processing each line, and all I had to do was add some logic to display the line as it was interpreted.

I added a profiler, to count and time the execution of each line of code. This helped me reduce the run-time of programs, by identifying inefficient areas of the code. This was also easy to add, as the interpreter was processing each line. I simply added a counter to each line's internal data, and incremented the counter when the line was executed.
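As a rough illustration of why trace and profiling were cheap to add, here is a minimal sketch in Python (not the Ruby of the actual interpreter). The names `Line`, `run`, and the `execute` callback are hypothetical, and the loop simply runs lines in sequence, ignoring GOTO and other control flow.

```python
import time
from dataclasses import dataclass


@dataclass
class Line:
    """One program line plus the profiling data stored with it."""
    number: int
    text: str
    count: int = 0        # how many times the line has executed
    seconds: float = 0.0  # total time spent executing the line


def run(lines, execute, trace=False, out=print):
    """Execute each line in order, tracing on request and always profiling."""
    for line in lines:
        if trace:
            out(f"{line.number} {line.text}")  # show the line as it runs
        start = time.perf_counter()
        execute(line)                          # hypothetical statement executor
        line.seconds += time.perf_counter() - start
        line.count += 1


program = [Line(10, "LET A = 1"), Line(20, "PRINT A")]
traced = []
# The executor here does nothing; a real one would interpret line.text.
run(program, execute=lambda line: None, trace=True, out=traced.append)
```

Both features hang off the loop that already visits every line, which is the point made above: the hard work was done before either feature was considered.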

Then I added a cross-reference command, which lists variables, functions, and constants, and the lines in which they appear. I use this to identify errors. For example, a variable that appears in one line (and only one line) is probably an error. It is either initialized without being used, or used without being initialized.
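The single-line check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword list is a small hypothetical subset, and names are found with a regular expression, where a real interpreter would take them from its parser.

```python
import re
from collections import defaultdict

# Assumption: a small subset of BASIC keywords, enough for this example.
KEYWORDS = {"LET", "PRINT", "IF", "THEN", "GOTO", "FOR", "TO", "NEXT", "END"}


def cross_reference(program):
    """Map each variable name to the sorted line numbers where it appears."""
    refs = defaultdict(set)
    for number, text in program.items():
        text = re.sub(r'"[^"]*"', "", text)  # ignore quoted strings
        for name in re.findall(r"[A-Z][A-Z0-9]*\$?", text):
            if name not in KEYWORDS:
                refs[name].add(number)
    return {name: sorted(numbers) for name, numbers in refs.items()}


def suspects(refs):
    """Variables that appear on exactly one line are probably errors."""
    return sorted(name for name, numbers in refs.items() if len(numbers) == 1)


program = {
    10: "LET A = 1",
    20: "LET B = A + 1",
    30: "PRINT A, C",
    40: "END",
}
refs = cross_reference(program)
```

In this sample, `suspects(refs)` flags `B` (initialized but never used) and `C` (used but never initialized), the two cases described above.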

I decided to add a debugger. A debugger is exactly like trace mode, with the option to enter a command after each statement. This feature, too, helps the typical user.

* * * * *

Stepping back from the project, we can see that the end result (two interpreters each with profiler, cross-reference, trace mode, and debugger) is quite far from the initial vision of a simple interpreter for BASIC.

According to the predominant thinking in project management, my project is a failure. It delivered a product with many more features than initially planned, and it consumed more time than planned, two sins of project management.

Yet for me, the project is a success. First, I learned quite a bit about the Ruby programming language. Second -- and perhaps more important -- the product is much more capable and serves the user better.

* * * * *

This experience shows a difference in project management. As a side project, one without a firm budget or deadline, it was successful. The final product is much more capable than the initial vision. But more importantly, my motivation was to provide a better experience for the user.

That result is not desired in corporate software. Oh, I'm sure that corporate managers will quickly claim that they deliver a better experience to their customers. But they will do it only when their first priority has been met: profits for the corporation. And for those, the project must fit within expense limits and time limits. Thus, a successful corporate project delivers the initial vision on time and on budget -- not an expanded version that is late and over budget.

Friday, December 8, 2017

The cult of fastest

In IT, we (well, some of us) are obsessed with speed. The speed-cravers seek the fastest hardware, the fastest software, and the fastest network connections. They have been with us since the days of the IBM PC AT, which ran at 6MHz, faster than the IBM PC (and XT) speed of 4.77MHz.

Now we see speed competition among browsers. First Firefox claims their browser is fastest. Then Google releases a new version of Chrome, and claims that it is the fastest. At some point, Microsoft will claim that their Edge browser is the fastest.

It is one thing to improve performance. When faced with a long-running job, we want the computer to be faster. That makes sense; we get results quicker and we can take actions faster. Sometimes it is reasonable to go to great lengths to improve performance.

I once had a job that compared source files for duplicate code. With 10,000 source files, and the need to compare each file against each other file, there were 1,000,000 comparisons. Each comparison took about a minute, so the total job was projected to run for 1,000,000 minutes -- or about 2 years! I revised the job significantly, using a simpler (and faster) comparison to identify if two files had any common lines of code and then using the more detailed (and longer) comparison on only those pairs with over 1,000 lines of common code.
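The two-pass idea can be sketched as follows. This is a Python illustration, not the original job: `common_lines`, `find_duplicates`, the `threshold` parameter, and the `detailed_compare` callback are all hypothetical names, and the cheap test here simply counts distinct shared lines.

```python
from itertools import combinations


def common_lines(a, b):
    """Cheap first pass: count distinct non-blank lines two files share."""
    shared = {line.strip() for line in a} & {line.strip() for line in b}
    shared.discard("")
    return len(shared)


def find_duplicates(files, threshold, detailed_compare):
    """Run the slow comparison only on pairs that pass the cheap test.

    files maps a file name to its list of source lines.
    """
    results = {}
    for (name_a, a), (name_b, b) in combinations(sorted(files.items()), 2):
        if common_lines(a, b) > threshold:
            results[(name_a, name_b)] = detailed_compare(a, b)
    return results


files = {
    "a.c": ["int x;", "x = 1;", "return x;"],
    "b.c": ["int x;", "x = 1;", "return 0;"],
    "c.c": ["float y;"],
}
calls = []


def detailed(a, b):
    """Stand-in for the expensive comparison; records that it was called."""
    calls.append(1)
    return "compared"


result = find_duplicates(files, threshold=1, detailed_compare=detailed)
```

Of the three pairs in this toy input, only one passes the cheap test, so the expensive comparison runs once instead of three times.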

Looking for faster processing in that case made sense.

But it is another thing to look for faster processing by itself.

Consider a word processor. Microsoft Word has been around for decades. (It actually started its life in MS-DOS.) Word was designed for systems with much smaller memory and much slower processors, and it still has some of that design. The code for Word is efficient. It spends most of its time not in processing words but in waiting for the user to type a key or click the mouse. Making the code twice as fast would not improve its performance (much), because the slowness comes from the user.

E-mail is another example. Most of the time for e-mail is, like Word, the computer waiting for the user to type something. When an e-mail is sent, the e-mail is passed from one e-mail server to another until it arrives at the assigned destination. Changing the servers would let the e-mail arrive quicker, but it doesn't help with the composition. The acts of writing and reading the e-mail are based on the human brain and physiology; faster processors won't help.

The pursuit of faster processing without definite benefits is, ironically, a waste of time.

Instead of blindly seeking faster hardware and software, we should think about what we want. We should identify the performance improvements that will benefit us. (For managers, this means lower cost or less time to obtain business results.)

Once we insist on benefits for improved performance, we find a new concept: the idea of "fast enough". When an improvement lets us meet a goal (a goal more specific than "go faster"), we can justify the effort or expense for faster performance. But once we meet that goal, we stop.

This is a useful tool. It allows us to eliminate effort and focus on changes that will help us. If we decide that our internet service is fast enough, then we can look at other things such as databases and compilers. If we decide that our systems are fast enough, then we can look at security.

Which is not to say that we should simply declare our systems "fast enough" and ignore them. The decision should be well-considered, especially in the light of our competitors and their capabilities. The conditions that let us rate our systems as "fast enough" today may not hold in the future, so a periodic review is prudent.

We shouldn't ignore opportunities to improve performance. But we shouldn't spend all of our effort for them and avoid other things. We shouldn't pick a solution because it is the fastest. A solution that is "fast enough" is, at the end of the day, fast enough.

Tuesday, November 28, 2017

Root with no password

Apple made the news today, and not in a good way. It seems that their latest version of macOS, "High Sierra", allows anyone to sit at a machine and gain access to administrative functions (guarded by a name-and-password dialog) by entering the name "root" and a password of ... nothing.

This behavior in macOS is not desired, and this "bug" is severe. (Perhaps the most severe defect I have seen in the industry -- and I started prior to Windows and MS-DOS, with CP/M and other operating systems.) But my point here is not to bash Apple.

My point is this: The three major operating systems for desktop and laptop computers (Windows, macOS, and Linux) are all very good, and none are perfect.

Decades ago, Apple had superior reliability and immunity from malware. That immunity was due in part to the design of macOS and in part to Apple's small market share. (Microsoft Windows was a more tempting target.) Those conditions have changed. Microsoft has improved Windows. Malware now targets macOS in addition to Windows. (And some targets Linux.)

Each of Windows, macOS, and Linux has strengths, and each has areas for improvement. Microsoft Windows has excellent support, good office tools, and good development tools. Apple's macOS has a (slightly) better user interface but a shorter expected lifespan. (Apple retires old hardware and software more quickly than Microsoft.) Linux is reliable, has lots of support, and many tools are available for free; you have more work configuring it and you must become (or hire) a system administrator.

If you choose your operating system based on the idea that it is better than the others, that it is superior to the other choices, then you are making a mistake -- possibly larger than Apple's goof. Which is best for you depends on the tasks you intend to perform.

So think before you choose. Understand the differences. Understand your use cases. Don't simply pick Microsoft because the competition is using it. Don't pick Apple because the screen looks "cool". Don't pick Linux because you want to be a rebel.

Instead, pick Microsoft when the tools for Windows are a good match for your team and your plans. Or pick macOS because you're working on iPhone apps. Or pick Linux because your team has experience with Linux and your product or service will run on Linux and serve your customers.

Think before you choose.

Saturday, November 18, 2017

Technical debt is a bet

A long time ago, I worked in the IT department of a small, regional bank with 14 or so branch offices.

The IT team was proud of their mainframe-based online teller network. All teller transactions were cleared through the system and it prevented the fraud of someone making withdrawals at different branches, withdrawals that would exceed the account balance. (We might think little of such a system today, but in that time it was an impressive system.)

But the system had technical debt. It was written in IBM's assembler language, and it was extremely difficult to change. At the core of the system was the branch table. Not "branch" as in "jump instruction", but "branch" as in "branch office". The table allowed for 20 branch offices, and no more.

Lots of code was built around the branch table, and that code had built-in dependencies on the size of the table. In other words, the entire system "knew" that the size of the branch table was 20.

Things were okay as long as the bank had 20 (or fewer) branches. Which they did.

Until the president retired, and a new president took the helm. The new president wanted to expand the bank, and he did, acquiring a few branch offices from other banks.

The IT team started working on the expansion of the branch table. It wasn't an immediate problem, but they knew that the limit would be exceeded. They had to expand the table.

After months of analysis, coding, and tests, the IT team came to a difficult realization: they were unable to expand the branch table. The director of data processing had to inform the president. (I imagine the meeting was not pleasant.)

Technical debt exists in your systems, but it is a bet against your competitors.

It doesn't matter if the debt is your reliance on an out-of-date compiler, an old version of an operating system, or a lot of messy source code.

Each of these is a form of technical debt, and each of these is a drag on agility. Each slows your ability to respond to changes in the market, changes in technology, and competition. Yet in the end, it is only the competition that matters.

Does the technical debt of your existing system -- the hard-to-read code, the magic build machine, the inconsistent database schema -- slow you in responding to the competition?

It doesn't have to be a new product from the competition. It could be something that affects the entire market, such as legislation, to which you and your competition must respond. Your technical debt may delay that response. Does your competition have similar technical debt, such that their response will also be delayed? Are you sure?

That's the risk of technical debt.

Tuesday, November 14, 2017

Apple Copies Microsoft

We're familiar with the story behind Windows, and how Microsoft created Windows to compete with Apple's Macintosh. (And tech-savvy folks know how Apple copied the Xerox Star to make the Macintosh -- but that's not important here.)

Apple has just recently copied Microsoft.

In a small way.

They did it with the numbering scheme for iPhones. Apple released two iPhones this year, the iPhone 8 and the iPhone X (which Apple insists is pronounced "ten").

There is no iPhone 9.

So what does this have to do with Microsoft?

Back in 2015, Microsoft released Windows 10. It was the successor to Windows 8 (or Windows 8.1, if you want to be picky).

There is no Windows 9.

There were Windows 95 and Windows 98, collectively referred to as "Windows 9x". Some software identified those versions with the test

windowsVersion.startswith("9")

which works for Windows 95 and Windows 98 -- and probably doesn't do what you want on an imaginary Windows 9 operating system. So "Windows 10" came to be.
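The pitfall is easy to demonstrate. The sketch below is in Python and assumes, purely for illustration, that the version arrives as a bare string such as "95"; real programs checked differently formatted strings, and both function names are hypothetical.

```python
def is_windows_9x(windows_version):
    """The prefix test quoted above: true for any version starting with '9'."""
    return windows_version.startswith("9")


def is_windows_9x_exact(windows_version):
    """A stricter test that names the two versions it means."""
    return windows_version in {"95", "98"}


versions = ["95", "98", "9", "10"]
prefix_results = {v: is_windows_9x(v) for v in versions}
exact_results = {v: is_windows_9x_exact(v) for v in versions}
```

The prefix test treats an imaginary version "9" as Windows 9x, while the exact test does not. Neither matches "10", which is why skipping to that name avoids the problem.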

Apple, of course, never had an "iPhone 95" or an "iPhone 98", so they didn't have the same problem as Microsoft. They picked "iPhone X" to celebrate the 10th anniversary of the iPhone.

Did they realize that they were following Microsoft's lead? Perhaps. Perhaps not.

I'm not concerned that Apple is going to follow Microsoft in other matters.

But I do find it amusing.

Wednesday, November 8, 2017

Go is not the language we think it is

When I first started working with the Go language, my impression was that it was a better version of C. A C with strings, with better enums, and most importantly a C without a preprocessor.

All of those aspects are true, but I'm not sure that the "origin language" of Go was C.

I say this after reading Kernighan's 1981 paper "Why Pascal is not my Favorite Programming Language". In it, he lists the major deficiencies of Pascal, including types, fixed-length arrays (or the absence of variable-length arrays), lack of static variables and initialization, lack of separate compilation (units), breaks from loops, order of evaluation in expressions, the detection of end-of-line and end-of-file, the use of 'begin' and 'end' instead of braces, and the use of semicolons as separators (required in some places, forbidden in others).

All of these criticisms are addressed in the Go language. It is as if Kernighan had sent these thoughts telepathically to Griesemer, Pike, and Thompson (the credited creators of Go).

Now, I don't believe that Kernighan used mind control on the creators of Go. What I do believe is somewhat more mundane: Kernighan, working with Pike and Thompson on earlier projects, shared his ideas on programming languages with them. And since the principal players enjoyed long and fruitful careers together, they were receptive to those ideas. (It may be that Kernighan adopted some ideas from the others, too.)

My conclusion is that the ideas that became the Go language were present much earlier than the introduction of Go, or even the start of the Go project. They were present in the 1980s, and stimulated by the failings of Pascal. Go is, as I see it, a "better Pascal", not a "better C".

If that doesn't convince you, consider this: The assignment operator in C is '=' and in Pascal it is ':='. In Go, the assignment operators (there are two) are ':=' and '='. (Go uses the first form to declare new variables and the second form to assign to existing variables.)

In the end, which language (C or Pascal) was the predecessor of Go matters little. What matters is that we have the Go language and that it is a usable language.

Sunday, October 29, 2017

We have a problem

The Rust programming language has a problem.

The problem is one of compactness, or the lack thereof. This problem was brought to my attention by a blog post about the Unix 'yes' program.

In short, Rust requires a lot of code to handle a very simple task.

The simple task, in this case, is the "yes" program from Unix. This program feeds the string "y\n" ('y' with newline) to output as many times as possible.

Here's the program in C:

main(argc, argv)
char **argv;
{
  for (;;)
    printf("%s\n", argc>1? argv[1]: "y");
}

And here is an attempt in Rust:

use std::env;

fn main() {
  let expletive = env::args().nth(1).unwrap_or("y".into());
  loop {
    println!("{}", expletive);
  }
}

The Rust version is quite slow compared to the C version, so the author and others made some "improvements" to Make It Go Fast:

use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;

use std::ffi::OsString;
pub const BUFFER_CAPACITY: usize = 64 * 1024;

pub fn to_bytes(os_str: OsString) -> Vec<u8> {
  use std::os::unix::ffi::OsStringExt;
  os_str.into_vec()
}

fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
  if output.len() > buffer.len() / 2 {
    return output;
  }

  let mut buffer_size = output.len();
  buffer[..buffer_size].clone_from_slice(output);

  while buffer_size < buffer.len() / 2 {
    let (left, right) = buffer.split_at_mut(buffer_size);
    right[..buffer_size].clone_from_slice(left);
    buffer_size *= 2;
  }

  &buffer[..buffer_size]
}

fn write(output: &[u8]) {
  let stdout = io::stdout();
  let mut locked = stdout.lock();
  let mut buffer = [0u8; BUFFER_CAPACITY];

  let filled = fill_up_buffer(&mut buffer, output);
  while locked.write_all(filled).is_ok() {}
}

fn main() {
  write(&env::args_os().nth(1).map(to_bytes).map_or(
    Cow::Borrowed(
      &b"y\n"[..],
    ),
    |mut arg| {
      arg.push(b'\n');
      Cow::Owned(arg)
    },
  ));
  process::exit(1);
}

Now, that's a lot of code. Really a lot. For a simple task.

To be fair, the author mentions that the GNU version of 'yes' weighs in at 128 lines, more than twice this monstrosity in Rust. But another blogger posted this code which improves performance:

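#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LEN 2
#define TOTAL 8192
int main() {
    char yes[LEN] = {'y', '\n'};
    char *buf = malloc(TOTAL);
    int bufused = 0;
    /* Fill the buffer with repeated copies of "y\n". */
    while (bufused < TOTAL) {
        memcpy(buf+bufused, yes, LEN);
        bufused += LEN;
    }
    /* Write the whole buffer until write() fails or writes nothing. */
    while (write(1, buf, TOTAL) > 0);
    return 1;
}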

Programming languages should be saving us work. The high-performance solution in Rust is long, way too long, for such simple operations.

We have a problem. It may be in our programming languages. It may be in run-time libraries. It may be in the operating systems and their APIs. It may be in the hardware architecture. It may be a combination of several.

But a problem we have.