A long time ago, I worked in the IT department of a small, regional bank with 14 or so branch offices.
The IT team was proud of their mainframe-based online teller network. All teller transactions cleared through the system, which prevented a then-common fraud: making withdrawals at several branches that together exceeded the account balance. (We might think little of such a system today, but in its time it was impressive.)
But the system had technical debt. It was written in IBM's assembler language, and it was extremely difficult to change. At the core of the system was the branch table. Not "branch" as in "jump instruction", but "branch" as in "branch office". The table allowed for 20 branch offices, and no more.
Lots of code was built around the branch table, and that code had built-in dependencies on the size of the table. In other words, the entire system "knew" that the size of the branch table was 20.
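The bank's system was in IBM assembler, but the same trap can be sketched in any language. Here is a hypothetical illustration in Go (the names `MaxBranches`, `Branch`, and `totalBalance` are inventions of this sketch, not the bank's actual code) of how a table's size leaks into the code around it:

```go
package main

import "fmt"

// MaxBranches is baked into the array type and every loop below.
// Raising it means finding and changing each dependent spot -- the
// whole program "knows" the table holds 20 entries.
const MaxBranches = 20

type Branch struct {
	ID      int
	Balance int
}

var branches [MaxBranches]Branch // fixed-length table, as in the bank's system

func totalBalance() int {
	total := 0
	for i := 0; i < MaxBranches; i++ { // another hidden dependency on 20
		total += branches[i].Balance
	}
	return total
}

func main() {
	branches[0] = Branch{ID: 1, Balance: 100}
	fmt.Println(totalBalance())
}
```

With modern slices this dependency disappears, but in 1980s assembler every one of those hidden "20"s had to be found by hand.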
Things were okay as long as the bank had 20 (or fewer) branches. Which they did.
Until the president retired, and a new president took the helm. The new president wanted to expand the bank, and he did, acquiring a few branch offices from other banks.
The IT team started working on the expansion of the branch table. It wasn't an immediate problem, but they knew that the limit would be exceeded. They had to expand the table.
After months of analysis, coding, and tests, the IT team came to a difficult realization: they were unable to expand the branch table. The director of data processing had to inform the president. (I imagine the meeting was not pleasant.)
Technical debt exists in your systems, and it is a bet against your competitors.
It doesn't matter whether the debt is reliance on an out-of-date compiler, an old version of an operating system, or a lot of messy source code.
Each of these is a form of technical debt, and each of these is a drag on agility. It slows your ability to respond to changes in the market, changes in technology, and competition. Yet in the end, it is only the competition that matters.
Does the technical debt of your existing system -- the hard-to-read code, the magic build machine, the inconsistent database schema -- slow you in responding to the competition?
It doesn't have to be a new product from the competition. It could be something that affects the entire market, such as legislation, to which you and your competition must respond. Your technical debt may delay that response. Does your competition have similar technical debt, such that their response will also be delayed? Are you sure?
That's the risk of technical debt.
Saturday, November 18, 2017
Tuesday, November 14, 2017
Apple Copies Microsoft
We're familiar with the story behind Windows, and how Microsoft created Windows to compete with Apple's Macintosh. (And tech-savvy folks know how Apple copied the Xerox Star to make the Macintosh -- but that's not important here.)
Apple has just recently copied Microsoft.
In a small way.
They did it with the numbering scheme for iPhones. Apple released two iPhones this year, the iPhone 8 and the iPhone X (which Apple insists is pronounced "ten").
There is no iPhone 9.
So what does this have to do with Microsoft?
Back in 2015, Microsoft released Windows 10. It was the successor to Windows 8 (or Windows 8.1, if you want to be picky).
There is no Windows 9.
There was Windows 95 and Windows 98, collectively referred to as "Windows 9x". Some software identified those versions with the test
windowsVersion.startswith("9")
which works for Windows 95 and Windows 98 -- and probably doesn't do what you want on an imaginary Windows 9 operating system. So "Windows 10" came to be.
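The hazard is easy to demonstrate. A small sketch in Go (the function name `isWindows9x` is invented for this illustration) of the same prefix test:

```go
package main

import (
	"fmt"
	"strings"
)

// isWindows9x reproduces the suspect test from the post: any version
// string starting with "9" is treated as Windows 95/98.
func isWindows9x(version string) bool {
	return strings.HasPrefix(version, "9")
}

func main() {
	fmt.Println(isWindows9x("95")) // true: Windows 95
	fmt.Println(isWindows9x("98")) // true: Windows 98
	fmt.Println(isWindows9x("9"))  // true: a hypothetical Windows 9 is misidentified
	fmt.Println(isWindows9x("10")) // false: hence "Windows 10" is safe
}
```

Any version string beginning with "9" matches, which is exactly why a real "Windows 9" was never shipped.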
Apple, of course, never had an "iPhone 95" or an "iPhone 98", so they didn't have the same problem as Microsoft. They picked "iPhone X" to celebrate the 10th anniversary of the iPhone.
Did they realize that they were following Microsoft's lead? Perhaps. Perhaps not.
I'm not concerned that Apple is going to follow Microsoft in other matters.
But I do find it amusing.
Wednesday, November 8, 2017
Go is not the language we think it is
When I first started working with the Go language, my impression was that it was a better version of C. A C with strings, with better enums, and most importantly a C without a preprocessor.
All of those aspects are true, but I'm not sure that the "origin language" of Go was C.
I say this after reading Kernighan's 1981 paper "Why Pascal is not my Favorite Programming Language". In it, he lists the major deficiencies of Pascal, including types, fixed-length arrays (or the absence of variable-length arrays), lack of static variables and initialization, lack of separate compilation (units), breaks from loops, order of evaluation in expressions, the detection of end-of-line and end-of-file, the use of 'begin' and 'end' instead of braces, and the use of semicolons as separators (required in some places, forbidden in others).
All of these criticisms are addressed in the Go language. It is as if Kernighan had sent these thoughts telepathically to Griesemer, Pike, and Thompson (the credited creators of Go).
Now, I don't believe that Kernighan used mind control on the creators of Go. What I do believe is somewhat more mundane: Kernighan, working with Pike and Thompson on earlier projects, shared his ideas on programming languages with them. And since the principal players enjoyed long and fruitful careers together, they were receptive to those ideas. (It may be that Kernighan adopted some ideas from the others, too.)
My conclusion is that the ideas that became the Go language were present much earlier than the introduction of Go, or even the start of the Go project. They were present in the 1980s, and stimulated by the failings of Pascal. Go is, as I see it, a "better Pascal", not a "better C".
If that doesn't convince you, consider this: The assignment operator in C is '=' and in Pascal it is ':='. In Go, the assignment operators (there are two) are ':=' and '='. (Go uses the first form to declare new variables and the second form to assign to existing variables.)
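The two forms side by side, in a few lines of Go:

```go
package main

import "fmt"

func main() {
	x := 1 // ':=' declares and initializes a new variable (echoing Pascal's ':=')
	x = 2  // '=' assigns to an existing variable (as in C)
	fmt.Println(x) // prints 2
}
```

Using ':=' on an already-declared variable, or '=' on an undeclared one, is a compile-time error, so the distinction is enforced rather than stylistic.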
In the end, which language (C or Pascal) was the predecessor of Go matters little. What matters is that we have the Go language and that it is a usable language.
Sunday, October 29, 2017
We have a problem
The Rust programming language has a problem.
The problem is one of compactness, or the lack thereof. This problem was brought to my attention by a blog post about the Unix 'yes' program.
In short, Rust requires a lot of code to handle a very simple task.
The simple task, in this case, is the "yes" program from Unix. This program feeds the string "y\n" ('y' with newline) to output as many times as possible.
Here's the program in C:
main(argc, argv)
char **argv;
{
    for (;;)
        printf("%s\n", argc > 1 ? argv[1] : "y");
}
And here is an attempt in Rust:
use std::env;

fn main() {
    let expletive = env::args().nth(1).unwrap_or("y".into());
    loop {
        println!("{}", expletive);
    }
}
The Rust version is quite slow compared to the C version, so the author and others made some "improvements" to Make It Go Fast:
use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;
use std::ffi::OsString;

pub const BUFFER_CAPACITY: usize = 64 * 1024;

pub fn to_bytes(os_str: OsString) -> Vec<u8> {
    use std::os::unix::ffi::OsStringExt;
    os_str.into_vec()
}

fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
    if output.len() > buffer.len() / 2 {
        return output;
    }
    let mut buffer_size = output.len();
    buffer[..buffer_size].clone_from_slice(output);
    while buffer_size < buffer.len() / 2 {
        let (left, right) = buffer.split_at_mut(buffer_size);
        right[..buffer_size].clone_from_slice(left);
        buffer_size *= 2;
    }
    &buffer[..buffer_size]
}

fn write(output: &[u8]) {
    let stdout = io::stdout();
    let mut locked = stdout.lock();
    let mut buffer = [0u8; BUFFER_CAPACITY];
    let filled = fill_up_buffer(&mut buffer, output);
    while locked.write_all(filled).is_ok() {}
}

fn main() {
    write(&env::args_os().nth(1).map(to_bytes).map_or(
        Cow::Borrowed(&b"y\n"[..]),
        |mut arg| {
            arg.push(b'\n');
            Cow::Owned(arg)
        },
    ));
    process::exit(1);
}
Now, that's a lot of code. Really a lot. For a simple task.
To be fair, the author mentions that the GNU version of 'yes' weighs in at 128 lines, more than twice this monstrosity in Rust. But another blogger posted this code, which improves performance:
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LEN 2
#define TOTAL 8192

int main()
{
    char yes[LEN] = {'y', '\n'};
    char *buf = malloc(TOTAL);
    int bufused = 0;

    while (bufused < TOTAL) {
        memcpy(buf + bufused, yes, LEN);
        bufused += LEN;
    }
    while (write(1, buf, TOTAL));
    return 1;
}
Programming languages should be saving us work. The high-performance solution in Rust is long, way too long, for such simple operations.
We have a problem. It may be in our programming languages. It may be in run-time libraries. It may be in the operating systems and their APIs. It may be in the hardware architecture. It may be a combination of several.
But a problem we have.
Sunday, October 15, 2017
Don't make things worse
We make many compromises in IT.
Don't make things worse. Don't break more things to accommodate one broken thing.
On one project in my history, we had an elderly Windows application that used some data files. The design of the application was such that the data files had to reside within the same directory as the executable. This design varies from the typical design for a Windows application, which sees data files stored in a user-writeable location. Writing to C:\Program Files should be done only by install programs, and only with elevated privileges.
Fortunately, the data files were read by the application but not written, so we did not have to grant the application write access to a location under C:\Program Files. The program could run, with its unusual configuration, and no harm was done.
But things change, and there came a time when this application had to share data with another application, and that application *did* write to its data files.
The choices were to violate generally-accepted security configurations, or modify the offending application. We could grant the second application write permissions into C:\Program Files (actually, a subdirectory, but still a variance from good security).
Or, we could install the original application in a different location, one in which the second application could write to files. This, too, is a variance from good security. Executables are locked away in C:\Program Files for a reason -- they are the targets of malware, and Windows guards that directory. (True, Windows does look for malware in all directories, but it's better for executables to be locked away until needed.)
Our third option was to modify the original application. This we could do; we had the source code and we had built it in the past. The code was not in the best of shape, and small changes could break things, but we did have experience with changes and a battery of tests to back us up.
In the end, we selected the third option. This was the best option, for a number of reasons.
First, it moved the original application closer to the standard model for Windows. (There were other things we did not fix, so the application is not perfect.)
Second, it allowed us to follow accepted procedures for our Windows systems.
Finally, it prevented the spread of bad practices. Compromising security to accommodate a poorly-written application is a dangerous path. It expands the "mess" of one application into the configuration of the operating system. Better to contain the mess and not let it grow.
We were lucky. We had an option to fix the problem application and maintain security. We had the source code for the application and knowledge about the program. Sometimes the situation is not so nice, and a compromise is necessary.
But whenever possible, don't make things worse.
Sunday, October 8, 2017
The Amazon Contest
Allow me to wander from my usual space of technology and share some thoughts on the Amazon.com announcement.
The announcement is for their 'HQ2', an office building (or complex) with 50,000 well-paid employees that is up for grabs to a lucky metropolitan area. City planners across the country are salivating over winning such an employer for their struggling town. Amazon has announced the criteria that they will consider for the "winner", including educated workforce, transit, and cost of living.
The one thing that I haven't seen is an analysis of the workforce numbers. From this one factor alone, we can narrow the hopeful cities to a handful.
Amazon.com wants their HQ2 complex to employ 50,000 people. That means that they will either hire the people locally or they will relocate them. Let's assume that they relocate one-third of the employees in HQ2. (The relocated could be current employees at other Amazon.com offices or new hires from out of the HQ2 area.)
That leaves about 33,000 people to hire. Assuming that they hire half as entry-level, they will need the other half to be experienced. (I'm assuming that Amazon.com will not relocate entry-level personnel.)
The winning city will have to supply 16,000 experienced professionals and 16,000 entry-level people. That's not an easy lift, and not one that many cities can offer. It means that the city (or metro area) must have a large population of professionals -- larger than 16,000 because not everyone will be willing to leave their current position and enlist with Amazon.com. (And Amazon.com may be unwilling to hire all candidates.)
If we assume that only one in ten professionals is willing to move, then Amazon.com needs a metro area with at least 160,000 professionals. (Or, if Amazon.com expects to pick one in ten candidates, the result is the same.)
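The arithmetic above can be checked in a few lines. (Integer division yields figures slightly different from the post's rounded 33,000 / 16,000 / 160,000; the assumptions -- one-third relocated, half entry-level, one in ten willing to move -- are the post's own.)

```go
package main

import "fmt"

func main() {
	const totalHQ2 = 50000

	relocated := totalHQ2 / 3            // assume one-third are relocated
	hiredLocally := totalHQ2 - relocated // remainder hired from the metro area
	experienced := hiredLocally / 2      // half experienced, half entry-level
	poolNeeded := experienced * 10       // ten candidates for every hire

	fmt.Println(relocated, hiredLocally, experienced, poolNeeded)
}
```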
And don't forget the relocated employees. They will need housing. Middle class, ready to own, housing -- not "fixer uppers" or "investment opportunities". A few relocatees may choose the "buy and invest" option, but most are going to want a house that is ready to go. How many cities have 15,000 modern housing units available?
These two numbers -- available housing and available talent -- set the entrance fee. Without them, metro areas cannot compete, no matter how good the schools or the transit system or the tax abatement.
So when Amazon.com announces the location of HQ2, I won't be surprised if it has a large population of professionals and a large supply of housing. I also won't be surprised if it doesn't have some of the other attributes that Amazon.com put on the list, such as incentives and tax structure.
Wednesday, October 4, 2017
Performance, missing and found
One of the constants in technology has been the improvement of performance. More powerful processors, faster memory, larger capacity in physically smaller disks, and faster communications have been the results of better technology.
This increase in performance is mostly mythological. We are told that our processors are more powerful, we are told that memory and network connections are faster. Yet what is our experience? What are the empirical results?
For me, word processors and spreadsheets run just as fast as they did decades ago. Operating systems load just as fast -- or just as slow.
Linux on my 2006 Apple MacBook loads slower than 1980s-vintage systems with eight-bit processors and floppy disk drives. Windows loads quickly, sort of. It displays a log-on screen and lets me enter a name and password, but then it takes at least five minutes (and sometimes an hour) updating various things.
Compilers and IDEs suffer the same fate. Each new version of Visual Studio takes longer to load. Eclipse is no escape -- it has always required a short eternity to load and edit a file. Slow performance is not limited to loading; compilation times have improved but only slightly, and not by the orders of magnitude to match the advertised improvements in hardware.
Where are the improvements? Where is the blazing speed that our hardware manufacturers promise?
I recently found that "missing" performance. It was noted in an article on the longevity of the C language, of all things. The author clearly and succinctly describes C and its place in the world. On the way, he describes the performance of one of his own C programs:
"In 1987 this code took around half an hour to run, today 0.03 seconds."

And there it is. A description of the performance improvements we should see in our systems.

The performance improvements we expect from better hardware have gone into software.
We have "invested" that performance in our operating systems, our programming languages, and user interfaces. Instead of taking all the improvements for reduced running times, we have diverted performance to new languages and to "improvements" in older languages. We invested in STL over plain old C++, Java over C++ (with or without STL), Python over Java and C#.
Why not? It's better to prevent mistakes than to have fast-running programs that crash or -- worse -- don't crash but provide incorrect results. Our processors are faster, and our programming languages do more for us. Boot times, load times, and compile times may be about the same as decades ago, but errors are far fewer, more easily detected, and much less dangerous.
Yes, there are still defects which can be exploited to hack into systems. We have not achieved perfection.
Our systems are much better with operating systems and programming languages that do the checking that they now do, and businesses and individuals can rely on computers to get the job done.
That's worth some CPU cycles.