Sunday, October 15, 2017

Don't make things worse

We make many compromises in IT.

Don't make things worse. Don't break more things to accommodate one broken thing.

On one project in my history, we had an elderly Windows application that used some data files. The design of the application was such that the data files had to reside within the same directory as the executable. This design varies from the typical design for a Windows application, which sees data files stored in a user-writeable location. Writing to C:\Program Files should be done only by install programs, and only with elevated privileges.
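
For contrast, here is a minimal sketch of the conventional approach, written in Python with a hypothetical application name: the data lives in a per-user, writeable location (such as the directory named by %APPDATA%) rather than next to the executable.

    import os
    from pathlib import Path

    def user_data_dir(app_name):
        # %APPDATA% points at the user's writeable profile area
        # (e.g. C:\Users\<name>\AppData\Roaming); fall back to the home directory.
        base = os.environ.get("APPDATA", str(Path.home()))
        path = Path(base) / app_name
        path.mkdir(parents=True, exist_ok=True)  # no elevated privileges required
        return path

    settings_file = user_data_dir("SomeLegacyApp") / "settings.dat"

An application built this way never needs write access under C:\Program Files.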

Fortunately, the data files were read by the application but not written, so we did not have to grant the application write access to a location under C:\Program Files. The program could run, with its unusual configuration, and no harm was done.

But things change, and there came a time when this application had to share data with another application, and that application *did* write to its data files.

The choices were to violate generally-accepted security configurations, or modify the offending application. We could grant the second application write permissions into C:\Program Files (actually, a subdirectory, but still a variance from good security).

Or, we could install the original application in a different location, one in which the second application could write to files. This, too, is a variance from good security. Executables are locked away in C:\Program Files for a reason -- they are the targets of malware, and Windows guards that directory. (True, Windows does look for malware in all directories, but it's better for executables to be locked away until needed.)

Our third option was to modify the original application. This we could do; we had the source code and we had built it in the past. The code was not in the best of shape, and small changes could break things, but we did have experience with changes and a battery of tests to back us up.

In the end, we selected the third option. This was the best option, for a number of reasons.

First, it moved the original application closer to the standard model for Windows. (There were other things we did not fix, so the application is not perfect.)

Second, it allowed us to follow accepted procedures for our Windows systems.

Finally, it prevented the spread of bad practices. Compromising security to accommodate a poorly-written application is a dangerous path. It expands the "mess" of one application into the configuration of the operating system. Better to contain the mess and not let it grow.

We were lucky. We had an option to fix the problem application and maintain security. We had the source code for the application and knowledge about the program. Sometimes the situation is not so nice, and a compromise is necessary.

But whenever possible, don't make things worse.

Sunday, October 8, 2017

The Amazon Contest

Allow me to wander from my usual space of technology and share some thoughts on the Amazon.com announcement.

The announcement is for their 'HQ2', an office building (or complex) with 50,000 well-paid employees that is up for grabs to a lucky metropolitan area. City planners across the country are salivating over winning such an employer for their struggling town. Amazon has announced the criteria that they will consider for the "winner", including educated workforce, transit, and cost of living.

The one thing that I haven't seen is an analysis of the workforce numbers. From this one factor alone, we can narrow the hopeful cities to a handful.

Amazon.com wants their HQ2 complex to employ 50,000 people. That means that they will either hire the people locally or they will relocate them. Let's assume that they relocate one-third of the employees in HQ2. (The relocated could be current employees at other Amazon.com offices or new hires from out of the HQ2 area.)

That leaves about 33,000 people to hire. Assuming that they hire half as entry-level, they will need the other half to be experienced. (I'm assuming that Amazon.com will not relocate entry-level personnel.)

The winning city will have to supply 16,000 experienced professionals and 16,000 entry-level people. That's not an easy lift, and not one that many cities can offer. It means that the city (or metro area) must have a large population of professionals -- larger than 16,000 because not everyone will be willing to leave their current position and enlist with Amazon.com. (And Amazon.com may be unwilling to hire all candidates.)

If we assume that only one in ten professionals is willing to move, then Amazon.com needs a metro area with at least 160,000 professionals. (Or, if Amazon.com expects to pick only one in ten candidates, the result is the same.)
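
Laid out as back-of-the-envelope arithmetic (a rough sketch in Python, using the same guesses as above):

    total_jobs  = 50000
    relocated   = total_jobs // 3            # roughly 17,000 brought in from elsewhere
    local_hires = total_jobs - relocated     # roughly 33,000 hired locally
    experienced = local_hires // 2           # roughly 16,000 to 17,000 experienced hires
    entry_level = local_hires - experienced  # and about as many entry-level hires

    willing_rate = 1 / 10                    # one in ten professionals willing to sign on
    professionals_needed = experienced / willing_rate
    print(professionals_needed)              # on the order of 160,000 to 170,000 professionals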

And don't forget the relocated employees. They will need housing. Middle-class, ready-to-own housing -- not "fixer uppers" or "investment opportunities". A few relocatees may choose the "buy and invest" option, but most are going to want a house that is ready to go. How many cities have 15,000 modern housing units available?

These two numbers -- available housing and available talent -- set the entrance fee. Without them, metro areas cannot compete, no matter how good the schools or the transit system or the tax abatement.

So when Amazon.com announces the location of HQ2, I won't be surprised if it has a large population of professionals and a large supply of housing. I also won't be surprised if it doesn't have some of the other attributes that Amazon.com put on the list, such as incentives and tax structure.

Wednesday, October 4, 2017

Performance, missing and found

One of the constants in technology has been the improvement of performance. More powerful processors, faster memory, larger capacity in physically smaller disks, and faster communications have been the results of better technology.

This increase in performance is mostly mythological. We are told that our processors are more powerful, we are told that memory and network connections are faster. Yet what is our experience? What are the empirical results?

For me, word processors and spreadsheets run just as fast as they did decades ago. Operating systems load just as fast -- or just as slow.

Linux on my 2006 Apple MacBook loads slower than 1980s-vintage systems with eight-bit processors and floppy disk drives. Windows loads quickly, sort of. It displays a log-on screen and lets me enter a name and password, but then it takes at least five minutes (and sometimes an hour) updating various things.

Compilers and IDEs suffer the same fate. Each new version of Visual Studio takes longer to load. Eclipse is no escape -- it has always required a short eternity to load and edit a file. Slow performance is not limited to loading; compilation times have improved but only slightly, and not by the orders of magnitude to match the advertised improvements in hardware.

Where are the improvements? Where is the blazing speed that our hardware manufacturers promise?

I recently found that "missing" performance. It was noted in an article on the longevity of the C language, of all things. The author clearly and succinctly describes C and its place in the world. Along the way, he describes the performance of one of his own C programs:
"In 1987 this code took around half an hour to run, today 0.03 seconds."
And there it is. A description of the performance improvements we should see in our systems.
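
Put in numbers, that is a speed-up of roughly 60,000 times (a quick calculation from the figures in the quote):

    then_seconds = 30 * 60           # "around half an hour" in 1987
    now_seconds  = 0.03              # today
    speedup = then_seconds / now_seconds
    print(speedup)                   # about 60,000x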

The performance improvements we expect from better hardware have gone into software.

We have "invested" that performance in our operating systems, our programming languages, and user interfaces. Instead of taking all the improvements for reduced running times, we have diverted performance to new languages and to "improvements" in older languages. We invested in STL over plain old C++, Java over C++ (with or without STL), Python over Java and C#.

Why not? It's better to prevent mistakes than to have fast-running programs that crash or -- worse -- don't crash but provide incorrect results. Our processors are faster, and our programming languages do more for us. Boot times, load times, and compile times may be about the same as decades ago, but errors are far fewer, more easily detected, and much less dangerous.

Yes, there are still defects which can be exploited to hack into systems. We have not achieved perfection.

Our systems are much better for having operating systems and programming languages that do the checking they now do, and businesses and individuals can rely on computers to get the job done.

That's worth some CPU cycles.

Monday, September 25, 2017

Web services are the new files

Files have been the common element of computing since at least the 1960s. Files existed before disk drives and file systems, as one could put multiple files on a magnetic tape.

MS-DOS used files. Windows used files. OS/2 used files. (Even the p-System used files.)

Files were the unit of data storage. Applications read data from files and wrote data to files. Applications shared data through files. Word processor? Files. Spreadsheet? Files. Editor? Files. Compiler? Files.

The development of databases provided another channel for sharing data. Databases were (and still are) used in specialized applications. Relational databases are good for consistently structured data, and provide transactions to update multiple tables at once. Microsoft hosts its Team Foundation Server on top of its SQL Server. (Git, in contrast, uses files exclusively.)

Despite the advantages of databases, the main method for storing and sharing data remains files.

Until now. Or in a little while.

Cloud computing and web services are changing the picture. Web services are replacing files. Web services can store data and retrieve data, just as files do. But web services are cloud residents; files are for local computing. Using URLs, one can think of a web service as a file with a rather funny name.

Web services are also dynamic. A file is a static collection of bytes: what you read is exactly what was written. A web service can provide a set of bytes that is constructed "on the fly".
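
The parallel shows up directly in code. A sketch in Python, with placeholder names for the file and the URL:

    from urllib.request import urlopen

    # A local file: a static collection of bytes, exactly what was written.
    with open("orders.json", "rb") as f:
        local_data = f.read()

    # A web service: the same kind of read() call, but the bytes may be built on the fly.
    with urlopen("https://api.example.com/orders/123") as response:
        remote_data = response.read()

The read calls look alike; the difference is where the bytes live and how they are produced.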

Applications that use local computing -- desktop applications -- will continue to use files. Cloud applications will use web services.

Those web services will be, at some point, reading and writing files, or database entries, which will eventually be stored in files. Files will continue to exist, as the basement of data storage -- around, but visited by only a few people who have business there.

At the application layer, cloud applications and mobile applications will use web services. The web service will be the dominant method of storing, retrieving, and sharing data. It will become the dominant method because the cloud will become the dominant location for storing data. Local computing, long the leading form, will fall to the cloud.

The default location for data will be the cloud; new applications will store data in the cloud; everyone will think of the cloud. Local storage and local computing will be the oddball configuration. Legacy systems will use local storage; modern systems will use the cloud.

Monday, September 18, 2017

What Agile and Waterfall have in common

Agile and Waterfall are often described in contrasts: Waterfall is large and bureaucratic, Agile is light and nimble. Waterfall is old, Agile is new. Waterfall is... you get the idea.

But Waterfall and Agile have things in common.

First and most obvious, Waterfall and Agile are both used to manage development projects. They both deliver software.

I'm sure that they are both used to deliver things other than software. They are tools for managing projects, not limited to software projects.

But those are the obvious common elements.

An overlooked commonality is the task of defining small project steps. For Waterfall, this is the design phase, in which requirements are translated into system design. The complete set of requirements can paint a broad picture of the system, providing a general shape and contours. (Individual requirements can be quite specific, with details on input data, calculations, and output data.)

Breaking down the large idea of the system into smaller, code-able pieces is a talent required for Waterfall. It is how you move from requirements to coding.

Agile also needs that talent. In contrast to Waterfall, Agile does not envision the completed system and does not break that large picture into smaller segments. Instead, Agile asks the team to start with a small piece of the system (most often a core function) and build that single piece.

This focus on a single task is, essentially, the same as the design phase in Waterfall. It converts a requirement (or user story, or use case, or whatever small unit is convenient) into design for code.

The difference between Waterfall and Agile is obvious: Waterfall converts all requirements in one large batch before any coding is started, and Agile performs the conversions serially, seeing one requirement all the way to coding and testing (or more properly, testing and then coding!) before starting the next.
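
To make the "testing and then coding" order concrete, here is a small sketch with an invented requirement and invented names (a 5% discount on orders of $100 or more): the requirement is expressed first as tests, and then just enough code is written to satisfy them.

    import unittest

    # The requirement, written as tests before any production code exists.
    class DiscountTests(unittest.TestCase):
        def test_large_order_earns_discount(self):
            self.assertEqual(discount(100.00), 5.00)

        def test_small_order_earns_nothing(self):
            self.assertEqual(discount(99.99), 0.00)

    # The code written afterward, just enough to make the tests pass.
    def discount(order_total):
        return round(order_total * 0.05, 2) if order_total >= 100.00 else 0.00

    if __name__ == "__main__":
        unittest.main()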

So whether you use Waterfall or Agile, you need the ability to "zoom in" from requirements to design, and then on to tests and code. Waterfall and Agile are different, but the differences are more in the sequence of performing tasks and not the tasks themselves.

Monday, September 11, 2017

Legacy cloud applications

We have legacy web applications. We have legacy Windows desktop applications. We have legacy DOS applications (albeit few). We have legacy mainframe applications (possibly the first type to be named "legacy").

Will we have legacy cloud applications? I see no reason why not. Any technology that changes over time (which is just about every technology) has legacy applications. Cloud technology changes over time, so I am confident that, at some time, someone, somewhere, will point to an older cloud application and declare it "legacy".

What makes a legacy application a legacy application? Why do we consider some applications "legacy" and others not?

Simply put, the technology world changed and the application did not. There are multiple aspects to the technology world, and falling behind in any one of them may cause us to view an application as legacy.

It may be the user interface. (Are we using an old version of HTML and CSS? An old version of JavaScript?) It may be the database. (Are we using a relational database and not a NoSQL database?) The back-end code may be difficult to read. The back-end code may be in a language that has fallen out of favor. (Perl, or Visual Basic, or C, or maybe an early version of Java?)

One can ask similar questions about legacy Windows desktop applications or mainframe applications. (C++ and MFC? COBOL and CICS?)

But let us come back to cloud computing. Cloud computing has been around since 2006. (There was an earlier use of the term "cloud computing", but for our purposes the year 2006 is sufficient.)

So let's assume that the earliest cloud applications were built in 2006. Cloud computing has changed since then. Have all of these applications kept up with those changes? Or have some of them languished, retaining their original design and techniques? If they have not kept up with changing technology, we can consider them legacy cloud applications.

Which means that, as owners or custodians of applications, we no longer have to worry only about legacy mainframe applications, legacy web applications, and legacy desktop applications. We can add legacy cloud applications to our list.

Cloud computing is a form of computing, but it is not magical. It evolves over time, just like other forms of computing. Those who look after applications must either make the effort to modify cloud applications over time (to keep up with the mainstream) or live with legacy cloud applications. That effort is an expense.

Like any other expense, it is really a business decision: invest time and money in an old (legacy) application or invest the time and money somewhere else. Both paths have benefits and costs; managers must decide which has the greater merit. Choosing to let an old system remain old is an acceptable decision, provided you recognize the cost of maintaining that older technology.

Monday, September 4, 2017

Agile can be cheap; Waterfall is expensive

Agile can be cheap, but Waterfall will always be expensive.

Here's why:

Waterfall starts its process with an estimate. The Waterfall method uses a set of phases (analysis, design, coding, testing, and deployment) which are executed according to a fixed schedule. Many Waterfall projects assign specific times to each phase. Waterfall needs this planning because it makes a promise: deliver a set of features on a specific date.

But notice that Waterfall begins with an estimate: the features that can be implemented in a specific time frame. That estimate is crucial to the success of the project. What is necessary to obtain that estimate?

Only people with knowledge and experience can provide a meaningful estimate. (One could, foolishly, ask an inexperienced person for the estimate, but that estimate has no value.)

What knowledge does that experienced person need? Here are some ideas:
- The existing code
- The programming language and tools used
- The different teams involved in development and testing
- The procedures and techniques used to coordinate efforts
- The terms and concepts used by the business

With knowledge of these, a person can provide a reasonable estimate for the effort.

These areas of knowledge do not come easily. They can be learned only by working on the project and in different capacities.

In other words, the estimate must be provided by a senior member of the team.

In other words, the team must have at least one senior member.

Waterfall relies on team members having knowledge about the business, the code, and the development processes.

Agile, in contrast, does not rely on that experience. Agile is designed to allow inexperienced people to work on the project.

Thus, Agile projects can get by without senior, experienced team members, but Waterfall projects must have at least one (and probably more) senior team members. Since senior personnel are more expensive than junior, and Waterfall requires senior personnel, we can see that Waterfall projects will, on average, cost more than Agile projects. (At least in terms of per-person costs.)

Do not take this to mean that you should run all projects with Agile methods. Waterfall may be more expensive, but it provides different value. It promises a specific set of functionality on a specific date, a promise that Agile does not make. If you need the promises of Waterfall, it may be worth the extra cost (higher wages). This is a business decision, similar to using proprietary tools over open-source tools, or leasing premium office space in the suburbs over discount office space in a not-so-nice part of town.

Which method you choose is up to you. But be aware that they are not the same, not only in terms of deliverables but in staffing requirements. Keep those differences in mind when you make your decision.