Sunday, December 17, 2017

Single point of failure

If you're going to have a single point of failure, make it replaceable.

We strive to avoid single points of failure. They concentrate risk -- if a single point of failure fails, the entire system fails.

It is not always possible to avoid a single point of failure. Sometimes the constraint is cost. Other times the design requires a single component for a function.

If you have a single point of failure, make it easy to replace. Design the component so that you can replace it quickly and with little risk. When it fails, you can respond and install the replacement component. (Think of the spare tire on an automobile. The four tires on a car are not a single point of failure, because there are four of them -- but you get the idea.)

A simple design for a single point of failure (or any component) requires care and attention. You have to design the component with minimal functionality. Move what you can to other, redundant components.

You also have to guard against changes to the simplicity. Over time, designs change. People add to designs. They want new features, or extensions to existing features. Watch for changes that complicate the single point of failure. Add them to other, redundant components in the system.

Tuesday, December 12, 2017

Do you want to be on time and on budget, or do you want a better product?

Project management in IT is full of options, opinions, and arguments. Yet one thing that just about everyone agrees on is this: a successful development project must have a clear vision of the product (the software), and everyone on the team has to understand that vision.

I'm not sure that I agree with that idea. But my explanation will be a bit lengthy.

I'll start with a summary of one of my projects: a BASIC interpreter.

* * * * *

It started with the development of an interpreter. My goal was not to build a BASIC interpreter, but to learn the Ruby programming language. I had built some small programs in Ruby, and I needed a larger, more ambitious project to learn the language in depth. (I learn by doing. Actually, I learn by making mistakes, and then fixing the mistakes. So an ambitious project was an opportunity to make mistakes.)

My initial clear vision of the product was just that: clear. I was to build a working interpreter for the BASIC language, implementing BASIC as described in a 1965 text by Kemeny and Kurtz (the authors of BASIC). That version had numeric variables but not text (string) variables. The lack of string variables simplified several aspects of the project, from parsing to execution. But the project was not trivial; there were some interesting aspects of a numeric-only BASIC language, including matrix operations and output formatting.

After some effort (and lots of mistakes), I had a working interpreter. It really ran BASIC! I could enter the programs from the "BASIC Programming" text, run them, and see the results!

The choice of Kemeny and Kurtz' "BASIC Programming" was fortuitous. It contains a series of programs, starting with simple ones and working up to complex programs, and it shows the output of each. I could build a very simple interpreter to run the initial programs, and then expand it gradually as I worked my way through the text. At each step I could check my work against the provided output.

Then things became interesting. After I had the interpreter working, I forked the source code and created a second interpreter that included string variables. A second interpreter was not part of my initial vision, and some might consider this change "scope creep". It is a valid criticism, because I was expanding the scope of the product.

Yet I felt that the expansion of features, the processing of string variables, was worth the effort. In my mind, there may be someone who wants a BASIC interpreter. (Goodness knows why, but perhaps they do.) If so, they most likely want a version that can handle string variables.

My reasoning wasn't "the product needs this feature to be successful"; it was "users of the product will find this feature helpful". I was making the lives of (possibly imaginary) users easier.

I had to find a different reference for my tests. "BASIC Programming" said nothing about string variables. So off I went, looking for old texts on BASIC. And I found them! I found three useful texts: Coan's "Basic BASIC", Tracton's "57 Practical Programs and Games", and David Ahl's "101 BASIC Computer Games".

And I was successful at adding string variables to the interpreter.

Things had become interesting (from a project management perspective) with the scope expansion for an interpreter that had string variables. And things stayed interesting: I kept expanding the scope. As I worked on a feature, I thought about new, different features. As I did, I noted them and kept working on the current feature. When I finished one feature, I started another.

I added statements to process arrays of data. BASIC can process individual variables (scalars) and matrices. I extended the definition of BASIC and created new statements to process arrays. (BASICs that handle matrices often process arrays as degenerate forms of matrices. The internal structure of the interpreter made it easy to add statements specific to arrays.)

I added statements to read and write files. This of course required statements to open and close files. These were a little challenging to create, but not that hard. Most of the work had already been done with the input-output processing for the console.

I added a trace option, to see each line as it executed. I found it useful for debugging. Using my logic from the expansion for string variables, if I found it useful then other users would (possibly) find it useful. And adding it was a simple operation: the interpreter was already processing each line, and all I had to do was add some logic to display the line as it was interpreted.

I added a profiler, to count and time the execution of each line of code. This helped me reduce the run-time of programs, by identifying inefficient areas of the code. This was also easy to add, as the interpreter was processing each line. I simply added a counter to each line's internal data, and incremented the counter when the line was executed.
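
That mechanism is simple enough to sketch. The fragment below is not code from my interpreter -- just a minimal Ruby illustration of the idea, with a stubbed-out execute step: each stored line carries a counter, and the main loop bumps it (and, for trace mode, echoes the line) on every execution:
# Minimal sketch: per-line execution counts, plus an optional trace.
Line = Struct.new(:number, :text, :count)

lines = [
  Line.new(10, 'LET A = 1', 0),
  Line.new(20, 'PRINT A', 0)
]

def execute(line)
  # a real interpreter would parse and run the statement here
end

10.times do
  lines.each do |line|
    execute(line)
    line.count += 1           # profiler: count each execution
    puts line.text if $trace  # trace mode: show the line as it runs
  end
end

lines.each { |line| puts "#{line.number}: executed #{line.count} times" }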

Then I added a cross-reference command, which lists variables, functions, and constants, and the lines in which they appear. I use this to identify errors. For example, a variable that appears in one line (and only one line) is probably an error. It is either initialized without being used, or used without being initialized.

I decided to add a debugger. A debugger is exactly like trace mode, with the option to enter a command after each statement. This feature, too, helps the typical user.

* * * * *

Stepping back from the project, we can see that the end result (two interpreters each with profiler, cross-reference, trace mode, and debugger) is quite far from the initial vision of a simple interpreter for BASIC.

According to the predominant thinking in project management, my project is a failure. It delivered a product with many more features than initially planned, and it consumed more time than planned, two sins of project management.

Yet for me, the project is a success. First, I learned quite a bit about the Ruby programming language. Second -- and perhaps more important -- the product is much more capable and serves the user better.

* * * * *

This experience shows a difference in project management. As a side project, one without a firm budget or deadline, it was successful. The final product is much more capable than the initial vision. But more importantly, my motivation was to provide a better experience for the user.

That result is not desired in corporate software. Oh, I'm sure that corporate managers will quickly claim that they deliver a better experience to their customers. But they will do it only when their first priority has been met: profits for the corporation. And for those, the project must fit within expense limits and time limits. Thus, a successful corporate project delivers the initial vision on time and on budget -- not an expanded version that is late and over budget.

Friday, December 8, 2017

The cult of fastest

In IT, we (well, some of us) are obsessed with speed. The speed-cravers seek the fastest hardware, the fastest software, and the fastest network connections. They have been with us since the days of the IBM PC AT, which ran at 6MHz -- faster than the 4.77MHz of the IBM PC (and XT).

Now we see speed competition among browsers. First Firefox claims their browser is fastest. Then Google releases a new version of Chrome, and claims that it is the fastest. At some point, Microsoft will claim that their Edge browser is the fastest.

It is one thing to improve performance. When faced with a long-running job, we want the computer to be faster. That makes sense; we get results quicker and we can take actions faster. Sometimes it is reasonable to go to great lengths to improve performance.

I once had a job that compared source files for duplicate code. With 10,000 source files, and the need to compare each file against each other file, there were 1,000,000 comparisons. Each comparison took about a minute, so the total job was projected to run for 1,000,000 minutes -- or about 2 years! I revised the job significantly, using a simpler (and faster) comparison to identify if two files had any common lines of code and then using the more detailed (and longer) comparison on only those pairs with over 1,000 lines of common code.
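
The revised approach is easy to sketch. Here is the shape of it in Ruby -- greatly simplified, with detailed_compare standing in for the slow, minute-long comparison and the source directory chosen arbitrarily:
require 'set'

def common_line_count(file_a, file_b)
  # cheap pass: how many lines do the two files share?
  (Set.new(File.readlines(file_a)) & Set.new(File.readlines(file_b))).size
end

def detailed_compare(file_a, file_b)
  # placeholder for the expensive duplicate-code analysis
end

files = Dir.glob('src/**/*.c')            # arbitrary source tree
files.combination(2) do |file_a, file_b|
  next if common_line_count(file_a, file_b) <= 1000  # skip unpromising pairs
  detailed_compare(file_a, file_b)                   # expensive pass only when warranted
end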

Looking for faster processing in that case made sense.

But it is another thing to look for faster processing by itself.

Consider a word processor. Microsoft Word has been around for decades. (It actually started its life in MS-DOS.) Word was designed for systems with much smaller memory and much slower processors, and it still has some of that design. The code for Word is efficient. It spends most of its time not in processing words but in waiting for the user to type a key or click the mouse. Making the code twice as fast would not improve its performance (much), because the slowness comes from the user.

E-mail is another example. Most of the time for e-mail is, like Word, the computer waiting for the user to type something. When an e-mail is sent, the e-mail is passed from one e-mail server to another until it arrives at the assigned destination. Changing the servers would let the e-mail arrive quicker, but it doesn't help with the composition. The acts of writing and reading the e-mail are based on the human brain and physiology; faster processors won't help.

The pursuit of faster processing without definite benefits is, ironically, a waste of time.

Instead of blindly seeking faster hardware and software, we should think about what we want. We should identify the performance improvements that will benefit us. (For managers, this means lower cost or less time to obtain business results.)

Once we insist on benefits for improved performance, we find a new concept: the idea of "fast enough". When an improvement lets us meet a goal (a goal more specific than "go faster"), we can justify the effort or expense for faster performance. But once we meet that goal, we stop.

This is a useful tool. It allows us to eliminate effort and focus on changes that will help us. If we decide that our internet service is fast enough, then we can look at other things such as databases and compilers. If we decide that our systems are fast enough, then we can look at security.

Which is not to say that we should simply declare our systems "fast enough" and ignore them. The decision should be well-considered, especially in the light of our competitors and their capabilities. The conditions that let us rate our systems as "fast enough" today may not hold in the future, so a periodic review is prudent.

We shouldn't ignore opportunities to improve performance. But we shouldn't spend all of our effort for them and avoid other things. We shouldn't pick a solution because it is the fastest. A solution that is "fast enough" is, at the end of the day, fast enough.

Tuesday, November 28, 2017

Root with no password

Apple made the news today, and not in a good way. It seems that their latest version of macOS, "High Sierra", allows anyone sitting at a machine to gain access to administrative functions (normally guarded by a name-and-password dialog) by entering the name "root" and a password of ... nothing.

This behavior in macOS is not desired, and this "bug" is severe. (Perhaps the most severe defect I have seen in the industry -- and I started prior to Windows and MS-DOS, with CP/M and other operating systems.) But my point here is not to bash Apple.

My point is this: The three major operating systems for desktop and laptop computers (Windows, macOS, and Linux) are all very good, and none are perfect.

Decades ago, Apple had superior reliability and immunity from malware. That immunity was due in part to the design of macOS and in part to Apple's small market share. (Microsoft Windows was a more tempting target.) Those conditions have changed. Microsoft has improved Windows. Malware now targets macOS in addition to Windows. (And some targets Linux.)

Each of Windows, macOS, and Linux has strengths, and each has areas for improvement. Microsoft Windows has excellent support, good office tools, and good development tools. Apple's macOS has a (slightly) better user interface but a shorter expected lifespan. (Apple retires old hardware and software more quickly than Microsoft.) Linux is reliable, has lots of support, and many tools are available for free; but you have more work configuring it and you must become (or hire) a system administrator.

If you choose your operating system based on the idea that it is better than the others, that it is superior to the other choices, then you are making a mistake -- possibly larger than Apple's goof. Which is best for you depends on the tasks you intend to perform.

So think before you choose. Understand the differences. Understand your use cases. Don't simply pick Microsoft because the competition is using it. Don't pick Apple because the screen looks "cool". Don't pick Linux because you want to be a rebel.

Instead, pick Microsoft when the tools for Windows are a good match for your team and your plans. Or pick macOS because you're working on iPhone apps. Or pick Linux because your team has experience with Linux and your product or service will run on Linux and serve your customers.

Think before you choose.

Saturday, November 18, 2017

Technical debt is a bet

A long time ago, I worked in the IT department of a small, regional bank with 14 or so branch offices.

The IT team was proud of their mainframe-based online teller network. All teller transactions were cleared through the system, and it prevented the fraud of someone making withdrawals at different branches, withdrawals that together would exceed the account balance. (We might think little of such a system today, but at the time it was an impressive system.)

But the system had technical debt. It was written in IBM's assembler language, and it was extremely difficult to change. At the core of the system was the branch table. Not "branch" as in "jump instruction", but "branch" as in "branch office". The table allowed for 20 branch offices, and no more.

Lots of code was built around the branch table, and that code had built-in dependencies on the size of the table. In other words, the entire system "knew" that the size of the branch table was 20.

Things were okay as long as the bank had 20 (or fewer) branches. Which they did.

Until the president retired, and a new president took the helm. The new president wanted to expand the bank, and he did, acquiring a few branch offices from other banks.

The IT team started working on the expansion of the branch table. It wasn't an immediate problem, but they knew that the limit would be exceeded. They had to expand the table.

After months of analysis, coding, and tests, the IT team came to a difficult realization: they were unable to expand the branch table. The director of data processing had to inform the president. (I imagine the meeting was not pleasant.)

Technical debt exists in your systems, but it is a bet against your competitors.

It doesn't matter if the debt is your reliance on an out-of-date compiler, an old version of an operating system, or a lot of messy source code.

Each of these is a form of technical debt, and each of these is a drag on agility. It slows your ability to respond to changes in the market, changes in technology, and competition. Yet in the end, it is only the competition that matters.

Does the technical debt of your existing system -- the hard-to-read code, the magic build machine, the inconsistent database schema -- slow you in responding to the competition?

It doesn't have to be a new product from the competition. It could be something that affects the entire market, such as legislation, to which you and your competition must respond. Your technical debt may delay that response. Does your competition have similar technical debt, such that their response will also be delayed? Are you sure?

That's the risk of technical debt.

Tuesday, November 14, 2017

Apple Copies Microsoft

We're familiar with the story behind Windows, and how Microsoft created Windows to compete with Apple's Macintosh. (And tech-savvy folks know how Apple copied the Xerox Star to make the Macintosh -- but that's not important here.)

Apple has just recently copied Microsoft.

In a small way.

They did it with the numbering scheme for iPhones. Apple released two iPhones this year, the iPhone 8 and the iPhone X (which Apple insists is pronounced "ten").

There is no iPhone 9.

So what does this have to do with Microsoft?

Back in 2015, Microsoft released Windows 10. It was the successor to Windows 8 (or Windows 8.1, if you want to be picky).

There is no Windows 9.

There was Windows 95 and Windows 98, collectively referred to as "Windows 9x". Some software identified those versions with the test

windowsVersion.startswith("9")

which works for Windows 95 and Windows 98 -- and probably doesn't do what you want on an imaginary Windows 9 operating system. So "Windows 10" came to be.
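
The collision is easy to demonstrate with a few lines of Ruby (hypothetical version strings, not Microsoft's actual code -- just the shape of the check):
['Windows 95', 'Windows 98', 'Windows 9'].each do |name|
  version = name.sub('Windows ', '')
  # a hypothetical "Windows 9" passes the same test as 95 and 98
  puts "#{name}: looks like Windows 9x? #{version.start_with?('9')}"
end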

Apple, of course, never had an "iPhone 95" or an "iPhone 98", so they didn't have the same problem as Microsoft. They picked "iPhone X" to celebrate the 10th anniversary of the iPhone.

Did they realize that they were following Microsoft's lead? Perhaps. Perhaps not.

I'm not concerned that Apple is going to follow Microsoft in other matters.

But I do find it amusing.

Wednesday, November 8, 2017

Go is not the language we think it is

When I first started working with the Go language, my impression was that it was a better version of C. A C with strings, with better enums, and most importantly a C without a preprocessor.

All of those aspects are true, but I'm not sure that the "origin language" of Go was C.

I say this after reading Kernighan's 1981 paper "Why Pascal is not my Favorite Programming Language". In it, he lists the major deficiencies of Pascal, including types, fixed-length arrays (or the absence of variable-length arrays), lack of static variables and initialization, lack of separate compilation (units), breaks from loops, order of evaluation in expressions, the detection of end-of-line and end-of-file, the use of 'begin' and 'end' instead of braces, and the use of semicolons as separators (required in some places, forbidden in others).

All of these criticisms are addressed in the Go language. It is as if Kernighan had sent these thoughts telepathically to Griesemer, Pike, and Thompson (the credited creators of Go).

Now, I don't believe that Kernighan used mind control on the creators of Go. What I do believe is somewhat more mundane: Kernighan, working with Pike and Thompson on earlier projects, shared his ideas on programming languages with them. And since the principal players enjoyed long and fruitful careers together, they were receptive to those ideas. (It may be that Kernighan adopted some ideas from the others, too.)

My conclusion is that the ideas that became the Go language were present much earlier than the introduction of Go, or even the start of the Go project. They were present in the 1980s, and stimulated by the failings of Pascal. Go is, as I see it, a "better Pascal", not a "better C".

If that doesn't convince you, consider this: The assignment operator in C is '=' and in Pascal it is ':='. In Go, the assignment operators (there are two) are ':=' and '='. (Go uses the first form to declare new variables and the second form to assign to existing variables.)

In the end, which language (C or Pascal) was the predecessor of Go matters little. What matters is that we have the Go language and that it is a usable language.

Sunday, October 29, 2017

We have a problem

The Rust programming language has a problem.

The problem is one of compactness, or the lack thereof. This problem was brought to my attention by a blog post about the Unix 'yes' program.

In short, Rust requires a lot of code to handle a very simple task.

The simple task, in this case, is the "yes" program from Unix. This program feeds the string "y\n" ('y' with newline) to output as many times as possible.

Here's the program in C:
main(argc, argv)
char **argv;
{
  for (;;)
    printf("%s\n", argc>1? argv[1]: "y");
}
And here is an attempt in Rust:
use std::env;

fn main() {
  let expletive = env::args().nth(1).unwrap_or("y".into());
  loop {
    println!("{}", expletive);
  }
}
The Rust version is quite slow compared to the C version, so the author and others made some "improvements" to Make It Go Fast:
use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;

use std::ffi::OsString;
pub const BUFFER_CAPACITY: usize = 64 * 1024;

pub fn to_bytes(os_str: OsString) -> Vec<u8> {
  use std::os::unix::ffi::OsStringExt;
  os_str.into_vec()
}

fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
  if output.len() > buffer.len() / 2 {
    return output;
  }

  let mut buffer_size = output.len();
  buffer[..buffer_size].clone_from_slice(output);

  while buffer_size < buffer.len() / 2 {
    let (left, right) = buffer.split_at_mut(buffer_size);
    right[..buffer_size].clone_from_slice(left);
    buffer_size *= 2;
  }

  &buffer[..buffer_size]
}

fn write(output: &[u8]) {
  let stdout = io::stdout();
  let mut locked = stdout.lock();
  let mut buffer = [0u8; BUFFER_CAPACITY];

  let filled = fill_up_buffer(&mut buffer, output);
  while locked.write_all(filled).is_ok() {}
}

fn main() {
  write(&env::args_os().nth(1).map(to_bytes).map_or(
    Cow::Borrowed(
      &b"y\n"[..],
    ),
    |mut arg| {
      arg.push(b'\n');
      Cow::Owned(arg)
    },
  ));
  process::exit(1);
}
Now, that's a lot of code. Really a lot. For a simple task.

To be fair, the author mentions that the GNU version of 'yes' weighs in at 128 lines, more than twice the length of this monstrosity in Rust. But another blogger posted this code which improves performance:
#include <stdlib.h>   /* malloc */
#include <string.h>   /* memcpy */
#include <unistd.h>   /* write */
#define LEN 2
#define TOTAL 8192
int main() {
    char yes[LEN] = {'y', '\n'};
    char *buf = malloc(TOTAL);
    int bufused = 0;
    while (bufused < TOTAL) {
        memcpy(buf+bufused, yes, LEN);
        bufused += LEN;
    }
    while(write(1, buf, TOTAL));
    return 1;
}

Programming languages should be saving us work. The high-performance solution in Rust is long, way too long, for such simple operations.

We have a problem. It may be in our programming languages. It may be in run-time libraries. It may be in the operating systems and their APIs. It may be in the hardware architecture. It may be a combination of several.

But a problem we have.

Sunday, October 15, 2017

Don't make things worse

We make many compromises in IT.

Don't make things worse. Don't break more things to accommodate one broken thing.

On one project in my history, we had an elderly Windows application that used some data files. The design of the application was such that the data files had to reside within the same directory as the executable. This design varies from the typical design for a Windows application, which sees data files stored in a user-writeable location. Writing to C:\Program Files should be done only by install programs, and only with elevated privileges.
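
For contrast, the conventional arrangement looks something like this Ruby sketch (the application name is hypothetical; a real Windows program would use the platform APIs, but the idea is the same):
require 'fileutils'

# Conventional Windows layout: the executable lives under C:\Program Files,
# per-user data lives under %APPDATA%, which the application may write
# without elevated privileges.
data_dir = File.join(ENV.fetch('APPDATA'), 'MyOldApp')
FileUtils.mkdir_p(data_dir)
File.write(File.join(data_dir, 'settings.dat'), 'last_run=today')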

Fortunately, the data files were read by the application but not written, so we did not have to grant the application write access to a location under C:\Program Files. The program could run, with its unusual configuration, and no harm was done.

But things change, and there came a time when this application had to share data with another application, and that application *did* write to its data files.

The choices were to violate generally-accepted security configurations, or modify the offending application. We could grant the second application write permissions into C:\Program Files (actually, a subdirectory, but still a variance from good security).

Or, we could install the original application in a different location, one in which the second application could write to files. This, too, is a variance from good security. Executables are locked away in C:\Program Files for a reason -- they are the targets of malware, and Windows guards that directory. (True, Windows does look for malware in all directories, but it's better for executables to be locked away until needed.)

Our third option was to modify the original application. This we could do; we had the source code and we had built it in the past. The code was not in the best of shape, and small changes could break things, but we did have experience with changes and a battery of tests to back us up.

In the end, we selected the third option. This was the best option, for a number of reasons.

First, it moved the original application closer to the standard model for Windows. (There were other things we did not fix, so the application is not perfect.)

Second, it allowed us to follow accepted procedures for our Windows systems.

Finally, it prevented the spread of bad practices. Compromising security to accommodate a poorly-written application is a dangerous path. It expands the "mess" of one application into the configuration of the operating system. Better to contain the mess and not let it grow.

We were lucky. We had an option to fix the problem application and maintain security. We had the source code for the application and knowledge about the program. Sometimes the situation is not so nice, and a compromise is necessary.

But whenever possible, don't make things worse.

Sunday, October 8, 2017

The Amazon Contest

Allow me to wander from my usual space of technology and share some thoughts on the Amazon.com announcement.

The announcement is for their 'HQ2', an office building (or complex) with 50,000 well-paid employees that is up for grabs to a lucky metropolitan area. City planners across the country are salivating over winning such an employer for their struggling town. Amazon has announced the criteria that they will consider for the "winner", including educated workforce, transit, and cost of living.

The one thing that I haven't seen is an analysis of the workforce numbers. From this one factor alone, we can narrow the hopeful cities to a handful.

Amazon.com wants their HQ2 complex to employ 50,000 people. That means that they will either hire the people locally or they will relocate them. Let's assume that they relocate one-third of the employees in HQ2. (The relocated could be current employees at other Amazon.com offices or new hires from out of the HQ2 area.)

That leaves about 33,000 people to hire. Assuming that they hire half as entry-level, they will need the other half to be experienced. (I'm assuming that Amazon.com will not relocate entry-level personnel.)

The winning city will have to supply 16,000 experienced professionals and 16,000 entry-level people. That's not an easy lift, and not one that many cities can offer. It means that the city (or metro area) must have a large population of professionals -- larger than 16,000 because not everyone will be willing to leave their current position and enlist with Amazon.com. (And Amazon.com may be unwilling to hire all candidates.)

If we assume that only one in ten professionals are willing to move, then Amazon.com needs a metro area with at least 160,000 professionals. (Or, if Amazon.com expected to pick one in ten candidates, the result is the same.)

And don't forget the relocated employees. They will need housing. Middle-class, move-in-ready housing -- not "fixer-uppers" or "investment opportunities". A few relocatees may choose the "buy and invest" option, but most are going to want a house that is ready to go. How many cities have 15,000 modern housing units available?

These two numbers -- available housing and available talent -- set the entrance fee. Without them, metro areas cannot compete, no matter how good the schools or the transit system or the tax abatement.

So when Amazon.com announces the location of HQ2, I won't be surprised if it has a large population of professionals and a large supply of housing. I also won't be surprised if it doesn't have some of the other attributes that Amazon.com put on the list, such as incentives and tax structure.

Wednesday, October 4, 2017

Performance, missing and found

One of the constants in technology has been the improvement of performance. More powerful processors, faster memory, larger capacity in physically smaller disks, and faster communications have been the results of better technology.

This increase in performance is mostly mythological. We are told that our processors are more powerful, we are told that memory and network connections are faster. Yet what is our experience? What are the empirical results?

For me, word processors and spreadsheets run just as fast as they did decades ago. Operating systems load just as fast -- or just as slow.

Linux on my 2006 Apple MacBook loads slower than 1980s-vintage systems with eight-bit processors and floppy disk drives. Windows loads quickly, sort of. It displays a log-on screen and lets me enter a name and password, but then it takes at least five minutes (and sometimes an hour) updating various things.

Compilers and IDEs suffer the same fate. Each new version of Visual Studio takes longer to load. Eclipse is no escape -- it has always required a short eternity to load and edit a file. Slow performance is not limited to loading; compilation times have improved but only slightly, and not by the orders of magnitude to match the advertised improvements in hardware.

Where are the improvements? Where is the blazing speed that our hardware manufacturers promise?

I recently found that "missing" performance. It was noted in an article on the longevity of the C language, of all things. The author clearly and succinctly describes C and its place in the world. On the way, he describes the performance of one of his own C programs:
"In 1987 this code took around half an hour to run, today 0.03 seconds."
And there it is. A description of the performance improvements we should see in our systems.

The performance improvements we expect from better hardware have gone into software.

We have "invested" that performance in our operating systems, our programming languages, and user interfaces. Instead of taking all the improvements for reduced running times, we have diverted performance to new languages and to "improvements" in older languages. We invested in STL over plain old C++, Java over C++ (with or without STL), Python over Java and C#.

Why not? It's better to prevent mistakes than to have fast-running programs that crash or -- worse -- don't crash but provide incorrect results. Our processors are faster, and our programming languages do more for us. Boot times, load times, and compile times may be about the same as from decades ago, but errors are far fewer, easily detected, and much less dangerous.

Yes, there are still defects which can be exploited to hack into systems. We have not achieved perfection.

Our systems are much better with operating systems and programming languages that do the checking that they now do, and businesses and individuals can rely on computers to get the job done.

That's worth some CPU cycles.

Monday, September 25, 2017

Web services are the new files

Files have been the common element of computing since at least the 1960s. Files existed before disk drives and file systems, as one could put multiple files on a magnetic tape.

MS-DOS used files. Windows used files. OS/2 used files. (Even the p-System used files.)

Files were the unit of data storage. Applications read data from files and wrote data to files. Applications shared data through files. Word processor? Files. Spreadsheet? Files. Editor? Files. Compiler? Files.

The development of databases saw another channel for sharing data. Databases were (and still are) used in specialized applications. Relational databases are good for consistently structured data, and provide transactions to update multiple tables at once. Microsoft hosts its Team Foundation Server on top of its SQL Server. (Git, in contrast, uses files exclusively.)

Despite the advantages of databases, the main method for storing and sharing data remains files.

Until now. Or in a little while.

Cloud computing and web services are changing the picture. Web services are replacing files. Web services can store data and retrieve data, just as files do. But web services are cloud residents; files are for local computing. Using URLs, one can think of a web service as a file with a rather funny name.
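
Ruby makes the point nicely. With the standard open-uri library, a URL can be read in almost the same way as a file (the file name and URL below are placeholders):
require 'open-uri'

local_data  = File.open('data.txt', &:read)                     # a local file
remote_data = URI.open('https://example.com/api/data', &:read)  # a web service

puts local_data.size
puts remote_data.size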

Web services are also dynamic. A file is a static collection of bytes: what you read is exactly what was written. A web service can provide a set of bytes that is constructed "on the fly".

Applications that use local computing -- desktop applications -- will continue to use files. Cloud applications will use web services.

Those web services will be, at some point, reading and writing files, or database entries, which will eventually be stored in files. Files will continue to exist, as the basement of data storage -- around, but visited by only a few people who have business there.

At the application layer, cloud applications and mobile applications will use web services. The web service will be the dominant method of storing, retrieving, and sharing data. It will become the dominant method because the cloud will become the dominant location for storing data. Local computing, long the leading form, will fall to the cloud.

The default location for data will be the cloud; new applications will store data in the cloud; everyone will think of the cloud. Local storage and local computing will be the oddball configuration. Legacy systems will use local storage; modern systems will use the cloud.

Monday, September 18, 2017

What Agile and Waterfall have in common

Agile and Waterfall are often described in contrasts: Waterfall is large and bureaucratic, Agile is light and nimble. Waterfall is old, Agile is new. Waterfall is... you get the idea.

But Waterfall and Agile have things in common.

First and most obvious, Waterfall and Agile are both used to manage development projects. They both deliver software.

I'm sure that they are both used to deliver things other than software. They are tools for managing projects, not limited to software projects.

But those are the obvious common elements.

An overlooked commonality is the task of defining small project steps. For Waterfall, this is the design phase, in which requirements are translated into system design. The complete set of requirements can paint a broad picture of the system, providing a general shape and contours. (Individual requirements can be quite specific, with details on input data, calculations, and output data.)

Breaking down the large idea of the system into smaller, code-able pieces is a talent required for Waterfall. It is how you move from requirements to coding.

Agile also needs that talent. In contrast to Waterfall, Agile does not envision the completed system and does not break that large picture into smaller segments. Instead, Agile asks the team to start with a small piece of the system (most often a core function) and build that single piece.

This focus on a single task is, essentially, the same as the design phase in Waterfall. It converts a requirement (or user story, or use case, or whatever small unit is convenient) into design for code.

The difference between Waterfall and Agile is obvious: Waterfall converts all requirements in one large batch before any coding is started, and Agile performs the conversions serially, seeing one requirement all the way to coding and testing (or more properly, testing and then coding!) before starting the next.

So whether you use Waterfall or Agile, you need the ability to "zoom in" from requirements to design, and then on to tests and code. Waterfall and Agile are different, but the differences are more in the sequence of performing tasks and not the tasks themselves.

Monday, September 11, 2017

Legacy cloud applications

We have legacy web applications. We have legacy Windows desktop applications. We have legacy DOS applications (albeit few). We have legacy mainframe applications (possibly the first type to be named "legacy").

Will we have legacy cloud applications? I see no reason why not. Any technology that changes over time (which is just about every technology) has legacy applications. Cloud technology changes over time, so I am confident that, at some time, someone, somewhere, will point to an older cloud application and declare it "legacy".

What makes a legacy application a legacy application? Why do we consider some applications "legacy" and others not?

Simply put, the technology world changed and the application did not. There are multiple aspects to the technology world, and any one of them, when left unchanged, may cause us to view an application as legacy.

It may be the user interface. (Are we using an old version of HTML and CSS? An old version of JavaScript?) It may be the database. (Are we using a relational database and not a NoSQL database?) The back-end code may be difficult to read. The back-end code may be in a language that has fallen out of favor. (Perl, or Visual Basic, or C, or maybe an early version of Java?)

One can ask similar questions about legacy Windows desktop applications or mainframe applications. (C++ and MFC? COBOL and CICS?)

But let us come back to cloud computing. Cloud computing has been around since 2006. (There was an earlier use of the term "cloud computing", but for our purposes the year 2006 is sufficient.)

So let's assume that the earliest cloud applications were built in 2006. Cloud computing has changed since then. Have all of these applications kept up with those changes? Or have some of them languished, retaining their original design and techniques? If they have not kept up with changing technology, we can consider them legacy cloud applications.

Which means, as owners or custodians of applications, we now not only have to worry about legacy mainframe applications and legacy web applications and legacy desktop applications. We can add legacy cloud applications to our list.

Cloud computing is a form of computing, but it is not magical. It evolves over time, just like other forms of computing. Those who look after applications must either make the effort to modify cloud applications over time (to keep up with the mainstream) or live with legacy cloud applications. That effort is an expense.

Like any other expense, it is really a business decision: invest time and money in an old (legacy) application or invest the time and money somewhere else. Both paths have benefits and costs; managers must decide which has the greater merit. Choosing to let an old system remain old is an acceptable decision, provided you recognize the cost of maintaining that older technology.

Monday, September 4, 2017

Agile can be cheap; Waterfall is expensive

Agile can be cheap, but Waterfall will always be expensive.

Here's why:

Waterfall starts its process with an estimate. The Waterfall method uses a set of phases (analysis, design, coding, testing, and deployment) which are executed according to a fixed schedule. Many Waterfall projects assign specific times to each phase. Waterfall needs this planning because it makes a promise: deliver a set of features on a specific date.

But notice that Waterfall begins with an estimate: the features that can be implemented in a specific time frame. That estimate is crucial to the success of the project. What is necessary to obtain that estimate?

Only people with knowledge and experience can provide a meaningful estimate. (One could, foolishly, ask an inexperienced person for the estimate, but that estimate has no value.)

What knowledge does that experienced person need? Here are some ideas:
- The existing code
- The programming language and tools used
- The different teams involved in development and testing
- The procedures and techniques used to coordinate efforts
- The terms and concepts used by the business

With knowledge of these, a person can provide a reasonable estimate for the effort.

These areas of knowledge do not come easily. They can be learned only by working on the project and in different capacities.

In other words, the estimate must be provided by a senior member of the team.

In other words, the team must have at least one senior member.

Waterfall relies on team members having knowledge about the business, the code, and the development processes.

Agile, in contrast, does not rely on that experience. Agile is designed to allow inexperienced people to work on the project.

Thus, Agile projects can get by without senior, experienced team members, but Waterfall projects must have at least one (and probably more) senior team members. Since senior personnel are more expensive than junior, and Waterfall requires senior personnel, we can see that Waterfall projects will, on average, cost more than Agile projects. (At least in terms of per-person costs.)

Do not take this to mean that you should run all projects with Agile methods. Waterfall may be more expensive, but it provides different value. It promises a specific set of functionality on a specific date, a promise that Agile does not make. If you need the promises of Waterfall, it may be worth the extra cost (higher wages). This is a business decision, similar to using proprietary tools over open-source tools, or leasing premium office space in the suburbs over discount office space in a not-so-nice part of town.

Which method you choose is up to you. But be aware that they are not the same, not only in terms of deliverables but in staffing requirements. Keep those differences in mind when you make your decision.

Saturday, August 19, 2017

Cloud, like other forms of computers, changes over time

Cloud computing has been with us a while. In its short life, and like other types of computing, it has changed.

"Cloud" started out as the outsourcing of system administration.

Then "cloud" was about scalability, and the ability to "spin up" servers as you needed them and "spin down" servers when they were not needed.

Shortly after, "cloud" was a cost-control measure: pay for only the servers you use.

For a while, "cloud" was a new type of system architecture with dedicated servers (web, database) connected by message queues.

Then "cloud" was about microservices, which are small web services that are less than complete applications. (Connect the right microservices in the right way, and you have an application!)

Lately, "cloud" has been all about containers, and the rapid and lightweight deployment of applications.

So what is "cloud computing", really?

Well, it's all of these things. As I see it, cloud computing is a new form of computing, different from mainframe computing, desktop computing, and web applications. As a new form of computing, it has taken us a while to fully understand it.

We had similar transitions with desktop (or PC) computing and web applications. Early desktop microcomputers (the Apple II, the TRS-80, and even the IBM PC) were small, slow, and difficult to use. Over time, we modified those PCs: powerful processors, bigger displays, more memory, simpler attachments (USB instead of serial), and better interfaces (Windows instead of DOS).

Web applications went through their own transitions, from static web pages to CGI Perl scripts to AJAX applications to new standards for HTML.

Cloud computing is undergoing a similar process. It shouldn't be a surprise; this process of gradual improvement is less about technology and more about human creativity. We're always looking for new ways of doing things.

One can argue that PCs and web applications have not stopped changing. We've just added touchscreens to desktop and laptop computers, and we've invented NoSQL databases for web applications (and mobile applications). It may be that cloud computing will continue to change, too.

It seems we're pretty good at changing things.

Sunday, August 13, 2017

Make it go faster

I've worked on a number of projects, and a (not insignificant) number of them had a requirement (or, more accurately, a request) to improve the performance of an existing system.

A computerized system consists of hardware and software. The software is a set of instructions that perform computations, and the hardware executes those instructions.

The most common method of improving performance is to use a faster computer. Leave the software unchanged, and run it on a faster processor. (Or if the system performs a lot of I/O, a faster disk, possibly an SSD instead of a spinning hard drive.) This method is simple, with no changes to the software and therefore low risk. It is the first method of reducing run time: perform computations faster.

Another traditional method is to change your algorithm. Some algorithms are faster than others, often by using more memory. This method has higher risk, as it changes the software.
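
A small Ruby illustration of that trade-off (not tied to any particular project): the naive Fibonacci recomputes the same values over and over, while the memoized version spends memory on a cache and runs dramatically faster:
require 'benchmark'

# Naive: exponential time, no extra memory.
def fib_slow(n)
  n < 2 ? n : fib_slow(n - 1) + fib_slow(n - 2)
end

# Memoized: linear time, at the cost of a cache held in memory.
def fib_fast(n, cache = {})
  return n if n < 2
  cache[n] ||= fib_fast(n - 1, cache) + fib_fast(n - 2, cache)
end

puts Benchmark.measure { fib_slow(30) }
puts Benchmark.measure { fib_fast(30) }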

Today's technology sees cloud computing as a way to reduce computing time. If your calculations are partitionable (that is, subsets can be computed independently) then you can break a large set of computations into a group of smaller computations, assign each smaller set to its own processor, and compute them in parallel. Effectively, this is computing faster, provided that the gain in parallel processing outweighs the cost of partitioning your data and combining the multiple results.

One overlooked method is using a different compiler. (I'm assuming that you're using a compiled language such as C or C++. If you are using Python or Ruby, or even Java, you may want to change languages.) Switching from one compiler to another can make a difference in performance. The code emitted by one compiler may be fine-tuned to a specific processor; the code from another compiler may be generic and intended for all processors in a family.

Switching from one processor to another may improve performance. Often, such a change requires a different compiler, so you are changing two things, not one. But a different processor may perform the computations faster.

Fred Brooks has written about essential complexity and accidental complexity. Essential complexity is necessary and unavoidable; accidental complexity can be removed from the task. There may be an equivalent in computations, with essential computations that are unavoidable and accidental computations that can be removed (or reduced). But removing accidental complexity is merely reducing the number of computations.

To improve performance you can either perform the computations faster or you can reduce the number of computations. That's the list. You can use a faster processor, you can change your algorithms, you can change from one processor and compiler to a different processor and compiler. But there is an essential number of computations, in a specific sequence. You cannot reduce the work below that limit.

Wednesday, August 2, 2017

Agile is for startups; waterfall for established projects

The Agile method was conceived as a revolt against the Waterfall method. Agile was going to be everything that Waterfall was not: lightweight, simple, free of bureaucracy, and successful. In retrospect, Agile was successful, but that doesn't mean that Waterfall is obsolete.

It turns out that Agile is good for some situations, and Waterfall is good for others.

Agile is effective when the functionality of the completed system is not known in advance. The Agile method moves forward in small steps which explore functionality, with reviews after each step. The Agile method also allows for changes in direction after each review. Agile provides flexibility.

Waterfall is effective when the functionality and the delivery date is known in advance. The Waterfall method starts with a set of requirements and executes a plan for analysis, design, coding, testing, and deployment. It commits to delivering that functionality on the delivery date, and does not allow (in its pure form) for changes in direction. Waterfall provides predictability.

Established companies, which have an ongoing business with procedures and policies, often want the predictability that Waterfall offers. They have business partners and customers and people to whom they make commitments (such as "a new feature will be ready for the new season"). They know in detail the change they want and they know precisely when they want the change to be effective. For them, the Waterfall method is appropriate.

Start-ups, on the other hand, are building their business model and are not always certain of how it will work. Many start-ups adjust their initial vision to obtain profitability. Start-ups don't have customers and business partners, or at least not with long-standing relationships and specific expectations. Start-ups expect to make changes to their plan. While they want to start offering services and products quickly (to generate income) they don't have a specific date in mind (other than the "end of runway" date when they run out of cash). Their world is very different from the world of the established company. They need the flexibility of Agile.

The rise of Agile did not mean the death of Waterfall -- despite some intentions of the instigators. Both have strengths, and each can be used effectively. (Of course, each can be mis-used, too. It is quite easy to manage a project "into the ground" with the wrong method, or a poorly applied method.)

The moral is: know what is best for your project and use the right methods.

Tuesday, July 18, 2017

A sharp edge in Ruby

I like the Ruby programming language. I've been using it for several projects, including an interpreter for the BASIC language. (The interpreter for BASIC was an excuse to do something in Ruby and learn about the language.)

My experience with Ruby has been a good one. I find that the language lets me do what I need, and often very quickly. The included classes are well-designed and include the functions I need. From time to time I do have to add some obscure capability, but those are rare.

Yet Ruby has a sharp edge to it, an aspect that can cause trouble if you fail to pay attention.

That aspect is one of its chief features: flexibility.

Let me explain.

Ruby is an object-oriented language, which means the programmer can define classes. Each class has a name, some functions, and usually some private data. You can do quite a bit with class definitions, including class variables, class instance variables, and mix-ins to implement features in multiple classes.

You can even modify existing classes, simply by declaring the same class name and defining new functions. Ruby accepts a second definition of a class and merges it into the first definition, quietly and efficiently. And that's the sharp edge.

This "sharp edge" cut me when I wasn't expecting it. I was working on my BASIC interpreter, and had just finished a class called "Matrix", which implemented matrix operations within the language. My next enhancement was for array operations (a matrix being a two-dimensional structure and an array being a one-dimensional structure).

I defined a class called "Array" and defined some functions, including a "to_s" function. (The name "to_s" is the Ruby equivalent of "ToString()" or "to_string()" in other languages.)

And my code behaved oddly. Existing functions, having nothing to do with arrays or my Array class, broke.

Experienced Ruby programmers are probably chuckling at this description, knowing the problem.

Ruby has its own Array class, and my Array class was not a new class but a modification of the existing, built-in class named "Array". My program, in actuality, was quite different from my imagined idea. When I defined the function "to_s" in "my" Array class, I was actually overwriting the existing "to_s" function in the Ruby-supplied Array class. And that happened quietly and efficiently -- no warning, no error, no information message.
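
A few lines of Ruby are enough to reproduce the surprise:
# Ruby's built-in Array already exists; this does not create a new class,
# it reopens the existing one and silently replaces its to_s.
class Array
  def to_s
    'my array'
  end
end

puts [1, 2, 3].to_s   # prints "my array", not "[1, 2, 3]"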

Part of this problem is my fault. I was not on my guard against such a problem. But part of the problem, I believe, is Ruby's -- specifically the design of Ruby. Letting one easily modify an existing class, with no warning, is dangerous. And I say this not simply due to my background with languages that use static checking.

My error aside, I can think of two situations in which this can be a problem. The first is when a new version of the Ruby language (and its system libraries) are released. Are there new classes defined in the libraries? Could the names of those classes duplicate any names I have used in my project? For example, will Ruby one day come with a class named "Matrix"? If it does, it will collide with my class named "Matrix". How will I know that there is a duplicate name?

The second situation is on a project with multiple developers. What happens if two developers create classes with the same name? Will they know? Or will they have to wait for something "weird" to happen?

Ruby has some mechanisms to prevent this problem. One can wrap classes in modules, which serve as namespaces and prevent such name conflicts. A simple grep of the code, looking for "class [A-Z]\w*", followed by a sort and a check for duplicates, will identify repeated names. But these solutions require discipline and will -- they don't come "for free".

As I said earlier, this is a sharp edge to Ruby. Is it a defect? No, I think this is the expected behavior for the language. It's not a defect. But it is an aspect of the language, and one that may limit the practicality of Ruby on large applications.

I started this post with the statement that I like Ruby. I still like Ruby. It has a sharp edge (like all useful tools) and I think that we should be aware of it.

Sunday, July 9, 2017

Cloud and optimizations

We all recognize that cloud computing is different.

It may be that cloud computing breaks some of our algorithms.

A colleague of mine, a long time ago, shared a story about programming early IBM mainframes. They used assembly language, because code written in assembly executed faster than code written in COBOL. (And for business applications on IBM mainframes, at the time, those were the only two options.)

Not only did they write in assembly language, they wrote code to be fast. That is, they "optimized" the code. One of the optimizations was with the "multiply" instruction.

The multiply instruction does what you think: it multiplies two numbers and stores the result. To optimize it, they wrote the code to place the larger of the two values in one register and the smaller of the two values in the other register. The multiply instruction was implemented as a "repeated addition" operation, so the second register was really a count of the number of addition operations that would be performed. By storing the smaller number in the second register, programmers reduced the number of "add" operations and improved performance.

(Technically inclined folks may balk at the notion of reducing a multiply operation to repeated additions, and observe that it works for integer values but not floating-point values. The technique was valid on early IBM equipment, because the numeric values were either integers or fixed-point values, not floating-point values.)
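
The trick is easy to show in miniature. Here is the idea in Ruby rather than IBM assembly -- multiplication as repeated addition, with the second operand acting as the loop count:
# Multiplication as repeated addition; the second argument is the loop count.
def multiply(value, count)
  total = 0
  count.times { total += value }
  total
end

multiply(250, 3)   # 3 additions
multiply(3, 250)   # 250 additions -- same answer, far more work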

It was an optimization that was useful at the time, when computers were relatively slow and relatively expensive. Today's faster, cheaper computers can perform multiplication quite quickly, and we don't need to optimize it.

Over time, changes in technology make certain optimizations obsolete.

Which brings us to cloud computing.

Cloud computing is a change in technology. It makes available a variable number of processors.

Certain problems have a large number of possible outcomes, with only certain outcomes considered good. The problems could describe the travels of a salesman, or the number of items in a sack, or playing a game of checkers. We have algorithms to solve specific configurations of these problems.

One algorithm is the brute-force, search-every-possibility method, which does just what you think. While it is guaranteed to find an optimal solution, there are sometimes so many possibilities (millions upon millions, or billions, or quintillions) that this method is impractical.

Faced with an impractical algorithm, we invent others. Many are iterative algorithms which start with a set of conditions and then move closer and closer to a solution by making adjustments to the starting conditions. Other algorithms discard certain possibilities ("pruning") which are known to be no better than current solutions. Both techniques reduce the number of tested possibilities and therefore reduce the time to find a solution.
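As a toy illustration of the iterative approach (a sketch with a made-up objective function, not a real algorithm from the literature):

def objective(x)
  -((x - 42)**2)        # a stand-in "goodness" measure; best at x == 42
end

def hill_climb(start)
  current = start
  loop do
    better = [current - 1, current + 1].max_by { |n| objective(n) }   # try the neighbors
    break if objective(better) <= objective(current)                  # stop when no neighbor is better
    current = better
  end
  current
end

hill_climb(0)   # => 42, found without examining every possible value

The search examines only a narrow path of candidates, trading the brute-force guarantee for speed.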

But observe: The improved algorithms assume a set of sequential operations. They are designed for a single computer (or a single person), and they are designed to minimize time.

With cloud computing, we no longer have a single processor. We have multiple processors, each operating in parallel. Algorithms designed to optimize for time on a single processor may not be suitable for cloud computing.

Instead of using one processor to iteratively find a solution, it may be possible to harness thousands (millions?) of cloud-based processors, each working on a distinct configuration. Instead of examining solutions in sequence, we can examine solutions in parallel. The result may be a faster solution to the problem, in terms of "wall time" -- the time we humans are waiting for the solution.
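A rough sketch of the idea, with a thread standing in for each cloud-based processor and a made-up "evaluate" function:

def evaluate(configuration)
  configuration.sum                 # stand-in for an expensive scoring computation
end

configurations = (1..1_000).map { |n| [n, n * 2, n * 3] }

workers = configurations.each_slice(250).map do |chunk|
  Thread.new { chunk.max_by { |config| evaluate(config) } }   # each chunk examined independently
end

best = workers.map(&:value).max_by { |config| evaluate(config) }   # compare the few per-worker winners

In a real system, each chunk would be packaged and shipped to its own cloud processor; that packaging, shipping, and final comparison are exactly the costs described in the next paragraph.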

I recognize that this approach has its costs. Cloud computing is not free, in terms of money or in terms of computing time. Money aside, there is a cost in creating the multiple configurations, sending them to their respective cloud processors, and then comparing the many results. That time is a cost, and it must be included in our evaluation.

None of these ideas are new to the folks who have been working with parallel processing. There are studies, papers, and ideas, most of which have been ignored by mainstream (sequential) computing.

Cloud computing will lead, I believe, to the re-evaluation of many of our algorithms. We may find that many of them have a built-in bias for single-processor operation. The work done in parallel computing will be pertinent to cloud computing.

Cloud computing is a very different form of computing. We're still learning about it. The application of concepts from parallel processing is one aspect of it. I won't be surprised if there are more. There may be all sorts of surprises ahead of us.

Sunday, July 2, 2017

It's not always A or B

We folks in IT almost pride ourselves on our fierce debates over technologies. And we have so many of them: emacs vs. vim, Windows vs. Mac, Windows vs. Linux, C vs. Pascal, C# vs. Java, ... the list goes on and on.

But the battles in IT are nothing compared to the fight between the two different types of electricity. In the late 19th and early 20th centuries, Edison led the group for direct current, and Tesla led the alternate group for, well, alternating current. The battle between these two made our disputes look like a Sunday picnic. Edison famously electrocuted an elephant -- with the "wrong" type of electricity, of course.

I think we in IT can learn from the Great Electricity War. (And it's not that we should be electrocuting elephants.)

Despite all of the animosity, despite all of the propaganda, despite all of the innovation on both sides, neither format "won". Neither vanquished its opponent. We use both types of electricity.

For power generation, transmission, and large appliances, we use alternating current. (Large appliances include washing machines, dryers, refrigerators, and vacuum cleaners.)

Small appliances (personal computers, digital televisions, calculators, cell phones) use direct current. They may plug into the AC wall outlet, but the first thing they do is convert 110 VAC into lower-voltage DC.

Alternating current has advantages in certain situations and direct current has advantages in other situations. It's not that one type of electricity is better than the other; it's that one type is better for a specific application.

We have a multitude of solutions in IT: multiple operating systems, multiple programming languages, multiple editors, multiple hardware platforms... lots and lots of choices. We too often pick one of many, name it our "standard", and force entire companies to use that one selection. That may be convenient for the purchasing team, and probably for the support team, but is it the best strategy for a company?

Yes, we in IT can learn a lot from electricity. And please, respect the elephants.

Monday, June 26, 2017

The humble PRINT statement

One of the first statements we learn in any language is the "print" statement. It is the core of the "Hello, world!" program. We normally learn it and then discard it, focusing our efforts on database and web service calls.

But the lowly "print" statement has its uses, as I was recently reminded.

I was working on a project with a medium-sized C++ application. We needed information to resolve several problems, information that would normally be available from the debugger and from the profiler. But the IDE's debugger was not usable (executing under the debugger would require a run time of about six hours) and the IDE did not have a profiler.

What to do?

For both cases, the PRINT statement (the "fprintf()" statement, actually, as we were using C++) was the thing we needed. A few carefully placed statements allowed us to capture the necessary information, make decisions, and resolve the problems.

The process wasn't that simple, of course. We needed several iterations, adding and removing PRINT statements in various locations. We also captured counts and timings of various functions.

The effort was worth it.

PRINT statements (or "printf()", or "print()", or "puts()", whatever you use) are useful tools. Here's how they can help:
  • They can capture values of internal variables and state when the debugger is not available.
  • They can capture lots of values of variables and state, for analysis at a level higher than the interactive level of the debugger. (Consider viewing several thousand values for trends in a debugger.)
  • They can capture performance when a profiler is not available.
  • They can extract information from the "release" version of software, because sometimes the problem doesn't occur in "debug" mode.
They may be simple, but they are useful. Keep PRINT statements in your toolbox.
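Here is a sketch of the kind of instrumentation we used, translated into Ruby for brevity (our project used C++ and fprintf(); the names below are made up):

$call_count = 0

def process_record(record)
  $call_count += 1
  started = Time.now
  result = record.upcase                      # stand-in for the real work
  elapsed = Time.now - started
  # one carefully placed print captures state, a count, and a timing,
  # with no debugger and no profiler
  puts "process_record ##{$call_count}: input=#{record.inspect} took #{elapsed}s"
  result
end

%w[alpha beta gamma].each { |r| process_record(r) }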

* * * * *

I was uncertain about the title for this column. I considered the C/C++ form of the generic statement ('printf()'). I also considered the general form used by other languages ('print()', 'puts()', 'WriteLine()'). I settled on BASIC's form of PRINT -- all capitals, no parentheses. All popular languages have such a statement; in the end, I suspect it matters little. Use what is best for you.

Sunday, June 18, 2017

Three models of computing

Computing comes in different flavors. We're probably most familiar with personal computers and web applications. Let's look at the models used by different vendors.

Apple has the simplest model: devices that compute. Apple has built its empire on high-quality personal computing devices. They do not offer cloud computing services. (They do offer their "iCloud" backup service, which is an accessory to the central computing of the iMac or MacBook.) I have argued that this model is the same as personal computing in the 1970s.

Google has a different model: web-based computing. This is obvious in their Chromebook, which is a lightweight computer that can run a browser -- and nothing else. All of the "real" computing occurs on the servers in Google's data center. The same approach is visible in most of the Google Android apps -- lightweight apps that communicate with servers. In some ways, this model is an update of the 1970s minicomputer model, with terminals connected to a central processor.

Microsoft has a third model, a hybrid of the two. In Microsoft's model, some computing occurs on the personal computer and some occurs in the data center. It is the most interesting of the three, requiring communication and coordination of two components.

Microsoft did not always have their current approach. Their original model was the same as Apple's: personal computers as complete and independent computing entities. Microsoft started with implementations of the BASIC language, and then sold PC-DOS to IBM. Even early versions of Windows were for stand-alone, independent PCs.

Change to that model started with Windows for Workgroups, and became serious with Windows NT, domains, and ActiveDirectory. Those three components allowed for networked computing and distributed processing. (There were network solutions from other vendors, but the Microsoft set was a change in Microsoft's strategy.)

Today, Microsoft offers an array of services under its "Azure" mark. Azure provides servers, message queues, databases, and other services, all hosted in its cloud environment. They allow individuals and companies to create applications that can combine PC and cloud technologies. These applications perform some computing on the local PC and some computing in the Azure cloud. You can, of course, build an application that runs completely on the PC, or completely in the cloud. That you can build those applications shows the flexibility of the Microsoft platform.

I think this hybrid model, combining local computing and server-based computing, has the best potential. It is more complex, but it can handle a wider variety of applications than either the PC-only solution (Apple's) or the server-only solution (Google's). Look for Microsoft to support this model with development tools, operating systems, and communication protocols and libraries.

Looking forward, I can see Microsoft working on a "fluid" model of computing, where some processing can move from the server to the local PC (for systems with powerful local PCs) and from the PC to the server (for systems with limited PCs).

Many things in the IT realm started in a "fixed" configuration, and over time have become more flexible. I think processing is about to join them.

Wednesday, June 14, 2017

Evangelism from Microsoft

Microsoft has designated certain employees as "evangelists": people knowledgeable in the details of specific products and competent at presentations.

It strikes me that the folks in the evangelist role were mostly, well, preaching to the choir. They would appear at events where one would expect Microsoft users, fans, and enthusiasts to gather.

I'm not sure that Microsoft needs them, and it seems that Microsoft is coming to the same conclusion. A recent blog post on MSDN seems to indicate that the Developer Evangelist group is being disbanded. (The post is vague.)

Can Microsoft compete (and survive) without the evangelist team? I believe that they can.

First, I believe that Satya Nadella is confident in his position as CEO of Microsoft, and that confidence flows down to the entire company.

Second, I believe that Microsoft has confidence in the direction of its products and services. It has ceased being the "Windows company", in which everything revolved around Windows. Today, Microsoft has embraced outside technologies (namely open source) and developed its cloud services (Azure), and it competes successfully in that market.

In short, Microsoft feels good about its current position and its future.

With such confidence in its products and services, Microsoft doesn't need the re-assurance of evangelists. Perhaps they were there to tell Microsoft -- not customers -- that its products were good. Now Microsoft believes it without their help.

Sunday, June 11, 2017

Apple's Files App is an admission of imperfection

When Apple introduced the iPhone, they introduced not just a smart phone but a new approach to computing. The iPhone experience was a new, simpler experience for the user. The iPhone (and iOS) did away with much of the administrative work of PCs. It eliminated the notion of user accounts and administrator accounts. Updates were automatic and painless. Apps knew how to get their data. The phone "just worked".

The need for a Files app is an admission that the iPhone and iPad experience does not meet those expectations. It raises the hood and allows the user to meddle with some of the innards of the device. One explanation for its existence is that apps cannot always find the files they need, and the Files app lets you (the user) find those files.

Does anyone see the irony in making the user do the work that the computer should do? Especially a computer from Apple?

To be fair, Android has had File Manager apps for years, so the Android experience does not meet those expectations either. Microsoft's Surface tablets, starting with the first one, have had Windows Explorer built in, so they are failing to provide the new, simpler experience too.

A curmudgeon might declare that the introduction of the Files App shows that even Apple cannot provide the desired user experience, and if Apple can't do it then no one can.

I'm not willing to go that far.

I will say that the original vision of a simple, easy-to-use, reliable computing device still holds. It may be that the major players have not delivered on that vision, but that doesn't mean the vision is unobtainable.

It may be that the iPhone (and Android) are steps in a larger process, one starting with the build-it-yourself microcomputers of the mid 1970s, passing through IBM PCs with DOS and later PC-compatibles with Windows, and currently with iPhones and tablets. Perhaps we will see a new concept in personal computing, one that improves upon the iPhone experience. It may be as different from iPhone and Android as those operating systems are from Windows and MacOS. It may be part of the "internet of things" and expand personal computing to household appliances.

I'm looking forward to it.

Monday, June 5, 2017

Better programming languages let us do more -- and less

We tend to think that better programming languages let us programmers do more. Which is true, but it is not the complete picture.

Better languages also let us do less. They remove capabilities. In doing so, they remove the possibility for errors.

PL/I was better than COBOL and FORTRAN because it let us write free-form source code. In COBOL and FORTRAN, the column in which code appeared was significant. The restrictions were from the technology of the time (punch cards) but once in the language they were difficult to remove.

BASIC was better than FORTRAN because it eliminated FORMAT specifications. FORMAT specifications were necessary to parse input data and format output data. They were precise, opaque, and easy to get wrong. BASIC, with no such specifications, removed that entire class of errors. BASIC also fixed the DO loops of FORTRAN and removed restrictions on subscript form. (In FORTRAN, a subscript could not be an arbitrary expression but had to have the form A*B+C. Any component could be zero and omitted, so A+C was allowed, as was A*B. But you could not use A+B+C or A/2.)

Pascal was better than BASIC because it limited the use of GOTO statements. In BASIC, you could use a GOTO to transfer control to any other part of the program, including in and out of loops or subroutines. It made for "spaghetti code" which was difficult to understand and debug. Pascal put an end to that, with a constrained form of GOTO.

Java eliminated the need for the explicit 'delete' or 'free' operations on allocated memory. You cannot forget the 'delete' operation -- you can't write one at all! The internal garbage collector recycles memory. In Java, it is much harder to create memory leaks than in C++ and C.

Python forces us to consider indentation as part of the code. In C, C++, Java, and C#, you can write:

initialize();

if (some_condition)
    do_something();
    do_another_thing();

complete_the_work();

But the code acts in a way you may not expect: despite the indentation, do_another_thing() is always executed, because the "if" controls only the statement that immediately follows it. Python's use of indentation to specify code organization makes the code clearer. The Python code:

initialize()

if some_condition:
    do_something()
    do_another_thing()

complete_the_work()

does what you expect.

New programming languages do provide new capabilities. (Often, they are refinements to constructs and concepts that were implemented roughly in earlier programming languages.) A new programming language is a combination of new things we can do and old things we no longer need to do.

When considering a new language (or reviewing the current language for a project), keep in mind not only the things that a new language lets you do, but also the things that it won't let you do.