In the closed-source world, the market encourages duplicate efforts. Lotus creates and sells a spreadsheet, Borland creates and sells a spreadsheet, Microsoft creates and sells a spreadsheet... you get the idea. Each vendor can differentiate their product and make a profit. Vendors keep their source code closed, so each company must create their own spreadsheet from scratch.
The open source world is different. There is no need to create a competing product from scratch. The Libre Office project includes a word processor and a spreadsheet (among other things) and it is open source. If I wanted to create a competing spreadsheet, I could take the code from Libre Office, modify it (a little or a lot) and redistribute it. (The catch is that I would also have to distribute my modified version of the source code.)
Rather than build my own version with private enhancements, it would be easier to suggest my enhancements to the team that maintains Libre Office. With private enhancements, I have to make the same changes with each new release of Libre Office (assuming I want the latest version); by submitting my enhancements (and getting them included) they then become part of the product and I get them with each update. (Of course, so does everyone else.)
Open source is not "one solution only". It has different software packages that exist in the same "space". There are a multitude of text editors. There are different display managers for Linux. There are multiple windowing systems. One can even argue that the languages Awk, Perl, Python, and Ruby all compete. There can be competing efforts in open source.
The closed-source world does not always provide competition. It has settled on some "winner" programs: Microsoft Word for word processing. Microsoft Excel for spreadsheets. Photoshop for editing pictures. Competitors may emerge, but the cost of entry to the market is high.
In general, I think that the overall trend (for closed source and open source) is to move to a single package. The "network effect" exerts a gentle but consistent pull toward a single solution in both worlds. The open source market consolidates more quickly than the closed-source market; for-profit vendors have more to gain by keeping their product in the market, and they resist the tug of the network effect longer.
Open source becomes a more efficient space. With fewer people working to create similar-but-different products, the open source world can work on a more diverse set of problems. Or it can invest less effort for the same result.
Many companies invest effort in core competencies and outsource non-essential activities. Open source may be the cost-effective method for those non-essential activities.
Sunday, June 15, 2014
Untangle code with automated testing
Of all of the tools and techniques for untangling code, the most important is automated testing.
What does automated testing have to do with the untangling of code?
Automated testing provides insurance. It provides a back-stop against which developers can make changes.
The task of untangling code, of making code readable, often requires changes across multiple modules and multiple classes. While a few improvements can be made to single modules (or classes), most require changes in multiple modules. Improvements can require changes to the methods exposed by a class, or the removal of access to member variables. These changes ripple through other classes.
Moreover, the improvement of tangled code often requires a re-thinking of the organization of the code. You move functions from one class to another. You rename variables. You split classes into smaller classes.
These are significant changes, and they can have significant effects on the operation of the code. Of course, while you want to change the organization of the code you want the results of calculations to remain unchanged. That's how automated tests can help.
Automated tests verify that your improvements have no effect on the calculations.
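As a minimal sketch (in Python, with a hypothetical invoice_total() function standing in for whatever calculation your tangled code performs), a test that pins down the current behavior might look like this:

import unittest

# Hypothetical function standing in for a calculation buried in tangled code.
# Its internals are free to change during untangling; its results are not.
def invoice_total(prices, tax_rate):
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)

class InvoiceTotalTests(unittest.TestCase):
    def test_total_with_tax(self):
        # The expected value is the back-stop: a refactoring that changes
        # the calculation makes this test fail.
        self.assertAlmostEqual(invoice_total([10.00, 20.00, 5.50], 0.06), 37.63, places=2)

    def test_total_with_no_items(self):
        self.assertAlmostEqual(invoice_total([], 0.06), 0.00, places=2)

if __name__ == "__main__":
    unittest.main()

Run the suite before a change and again after; a green run says the reorganization left the calculations alone.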
The tests must be automated. Manual tests are expensive: they require time and attention. Manual tests are easy to skip. Manual tests are easy to "flub". Manual tests can be difficult to verify. Automated tests are consistent, accurate, and most of all, cheap. They do not require attention or effort. They are complete.
Automated tests let programmers make significant improvements to the code base and have confidence that their changes are correct. That's how automated tests help you untangle code.
Wednesday, June 11, 2014
Learning to program, without objects
Programming is hard.
Object-oriented programming is really hard.
Plain (non-object-oriented) programming has the concepts of statements, sequences, loops, comparisons, boolean logic, variables, variable types (text and numeric), input, output, syntax, editing, and execution. That's a lot to comprehend.
Object-oriented programming has all of that, plus classes, encapsulation, access, inheritance, and polymorphism.
Somewhere in between the two are the concepts of modules and multi-module programs, structured programming, subroutines, user-defined types (structs), and debugging.
For novices, the first steps of programming (plain, non-object-oriented programming) are daunting. Learning to program in BASIC was hard. (The challenge was in organizing data into small, discrete chunks and processes into small, discrete steps.)
I think that the days of an object-oriented programming language as the "first language to learn" are over. We will not be teaching C# or Java as the introduction to programming. (And certainly not C++.)
The introduction to programming will be with languages that are not necessarily object-oriented: Python or Ruby. Both are, technically, object-oriented programming languages, supporting classes, inheritance, and polymorphism. But you don't have to use those features.
C# and Java, in contrast, force one to learn about classes from the start. One cannot write a program without classes. Even the simple "Hello, world!" program in C# or Java requires a class to hold main().
Python and Ruby can get by with a simple
print "Hello, world"
and be done with it.
Real object-oriented programs (ones that include a class hierarchy and inheritance and polymorphism) require a bunch of types (at least two, probably three) and operations complex enough to justify that many types. The canonical examples of drawing shapes or simulating an ATM are complex enough to warrant object-oriented code.
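A bare-bones Python version of the shapes example (the names and numbers here are mine, purely for illustration) shows that floor: one base class, at least two concrete classes, and an operation that varies by type.

import math

class Shape:
    def area(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

# Polymorphism earns its keep only when several types answer the same call.
for shape in [Circle(1.0), Square(2.0)]:
    print(shape.area())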
A true object-oriented program has a minimum level of complexity.
When learning the art of programming, do we want to start with that level of complexity?
Let us divide programming into two semesters. The first semester can be devoted to plain programming. The second semester can introduce object-oriented programming. I think that the "basics" of plain programming are enough for a single semester. I also think that one must be comfortable with those basics before one starts with object-oriented programming.
Tuesday, June 10, 2014
Slow and steady wins the race -- or does it?
Apple and Google run at a faster pace than their predecessors. Apple introduces new products often: new iPhones, new iPad tablets, new versions of iOS; Google does the same with Nexus phones and Android.
Apple and Google's quicker pace is not limited to the introduction of new products. They also drop items from their product line.
The "old school" was IBM and Microsoft. These companies moved slowly, introduced new products and services after careful planning, and supported their customers for years. New versions of software were backwards compatible. New hardware platforms supported the software from previous platforms. When a product was discontinued, customers were offered a path forward. (For example, IBM discontinued the System/36 minicomputers and offered the AS/400 line.)
IBM and Microsoft were the safe choices for IT, in part because they supported their customers.
Apple and Google, in contrast, have dropped products and services with no alternatives. Apple dropped .Mac. Google dropped their RSS reader. (I started this rant when I learned that Google dropped their conversion services from App Engine.)
I was about to chide Google and Apple for their inappropriate behavior when I thought of something.
Maybe I am wrong.
Maybe this new model of business (fast change, short product life) is the future?
What are the consequences of this business model?
For starters, businesses that rely on these products and services will have to change. These businesses can no longer rely on long product lifetimes. They can no longer rely on a guarantee of "a path forward" -- at least not with Apple and Google.
Yet IBM and Microsoft are not the safe havens of the past. IBM is out of the PC business, and getting out of the server business. Microsoft is increasing the frequency of operating system releases. (Windows 9 is expected to arrive in 2015. The two years of Windows 8's life are much shorter than the decade of Windows XP.) The "old school" suppliers of PC technology are gone.
Companies no longer have the comfort of selecting technology and using it for decades. Technology will "rev" faster, and the new versions will not always be backwards compatible.
Organizations with large IT infrastructures will find that their technologies are less homogeneous. Companies can no longer select a "standard PC" and purchase it over a period of years. Instead, every few months will see new hardware.
Organizations will see software change too. New versions of operating systems. New versions of applications. New versions of online services (software as a service, platform as a service, infrastructure as a service, web services) will occur -- and not always on a convenient schedule.
More frequent changes to the base upon which companies build their infrastructure will mean that companies spend more time responding to those changes. More frequent changes to the hardware will mean that companies have more variations of hardware (or they spend more time and money keeping everyone equipped with the latest).
IT support groups will be stressed as they must learn the new hardware and software, and more frequently. Roll-outs of internal systems will become more complex, as the target base will be more diverse.
Development groups must deliver new versions of their products on a faster schedule, and to a broader set of hardware (and software). It's no longer acceptable to deliver an application for "Windows only". One must include MacOS, the web, tablets, and phones. (And maybe Kindle tablets, too.)
Large organizations (corporations, governments, and others) have developed procedures to control the technology within (and to minimize costs). Those procedures often include standards, centralized procurement, and change review boards (in other words, bureaucracy). The outside world (suppliers and competitors) cares not one whit about a company's internal bureaucracy, and it keeps changing.
The slow, sedate pace of application development is a thing of the past. We live in faster times.
"Slow and steady" used to win. The tortoise would, in the long run, win over the hare. Today, I think the hare has the advantage.
Sunday, June 8, 2014
Untangle code with small classes
If you want to simplify code, build small classes.
I have written (for different systems) classes for things such as ZIP Codes, account numbers, weights, years, year-month combinations, and file names.
These are small, simple classes, usually equipped with a constructor, comparison operators, and a "to string" operator. Sometimes they have other operators. For example, the YearMonth class has next_month() and previous_month() functions.
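A minimal Python sketch of such a class (illustrative only, not the actual code from any of those systems) might look like this:

class YearMonth:
    """A small value class: a year and a month, nothing more."""

    def __init__(self, year, month):
        if not 1 <= month <= 12:
            raise ValueError("month must be between 1 and 12")
        self.year = year
        self.month = month

    def next_month(self):
        if self.month == 12:
            return YearMonth(self.year + 1, 1)
        return YearMonth(self.year, self.month + 1)

    def previous_month(self):
        if self.month == 1:
            return YearMonth(self.year - 1, 12)
        return YearMonth(self.year, self.month - 1)

    def quarter(self):
        # Months 1-3 fall in quarter 1, months 4-6 in quarter 2, and so on.
        return (self.month - 1) // 3 + 1

    def __eq__(self, other):
        return (self.year, self.month) == (other.year, other.month)

    def __str__(self):
        return "%04d-%02d" % (self.year, self.month)

With that in place, str(YearMonth(2014, 12).next_month()) yields "2015-01", and the rollover logic lives in exactly one place.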
Why create a class for something as simple? After all, a year can easily be represented by an int (or an unsigned int, if you prefer). A file name can be held in a string. Why have a separate class for them?
Small classes provide a number of benefits.
Check for validity: The constructor can check for the validity of the contents. With the proper checks in place, you know that every instance of the class is a valid instance. With primitive types (such as a string to hold a ZIP Code), you are never sure. (A small Year class illustrating several of these points is sketched after this list.)
Consolidate redundant code: A class can hold the logic that is duplicated in the main code. The Year class can tell if a year is a leap year, instead of repeating if (year % 4 == 0) in the code. It is easier (and more readable) to have code say if (year.is_leap_year()).
Consistent operations: Our Year class performs the proper calculation for leap years (not the simple one listed above). Using the Year class for all instances of a year means that the calculations for leap year are consistent (and correct).
Clear names for operations: Our Year class has operations named next_year() and previous_year(), which give clear meaning to the operations year + 1 and year - 1.
Limit operations: Custom classes provide the operations you specify and no others. The standard library provides classes with lots of operations, some of which may be inappropriate for your needs.
Add operations: Our YearMonth class has operations next_month() and previous_month(), operations which are not supplied in the standard library's Date class. (Yes, one can add a TimeSpan object, provided one gets the right number of days in the TimeSpan, but the code is more complex.) Also, our YearMonth class can calculate the quarter of the year, something we need for our computations.
Prevent accidental use: An object of a specific class cannot be used accidentally. If passed to a function or class, the target must be ready to accept the class. Our Year class cannot be carelessly passed to another function. If we stored our years in ints, those ints could be passed to any function that expected an int.
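Here is the promised Year sketch (again Python, again illustrative; the supported range is an assumption, not a rule), showing the validity check, the consolidated leap-year rule, and the clearly named operations:

class Year:
    """A year, validated at construction, with one correct leap-year test."""

    def __init__(self, value):
        # Check for validity: the range here is an assumption; a real system
        # would use whatever range it actually needs to support.
        if not 1900 <= value <= 2100:
            raise ValueError("year out of supported range")
        self.value = value

    def is_leap_year(self):
        # The full Gregorian rule, not the simple "divisible by 4" shortcut.
        return (self.value % 4 == 0 and self.value % 100 != 0) or self.value % 400 == 0

    def next_year(self):
        return Year(self.value + 1)

    def previous_year(self):
        return Year(self.value - 1)

A function written to take a Year (and call its methods) fails loudly if handed a bare int, which is the "prevent accidental use" benefit in miniature.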
These benefits simplify the main code. Custom classes for small data elements let you ensure that objects are complete and internally consistent. They let you consolidate logic into a single place. They let you tailor the operations to your needs. They prevent accidental use or assignment.
Simplifying the main code means that the main code becomes, well, simpler. By moving low-level operations to low-level classes, your "mainline" code focusses on higher-level concepts. You spend less of your time worrying about low-level things and more of your time thinking about high-level (that is, application-level) ideas.
Thursday, June 5, 2014
Apple Swift language increases separation of tech tribes
A surprising announcement at the recent Apple WWDC was Swift, a new programming language for iOS and MacOS. It is surprising because a new programming language was not among any of the predictions.
The advantages of Swift are significant: cleaner syntax and memory management.
Yet there is a downside to this choice: more separation of technology tribes.
The Apple tribe uses iOS and MacOS, with languages Swift, Objective-C, C, and C++. I expect Swift to soon become the preferred language for apps (preferred by Apple, that is).
The Microsoft tribe uses Windows "Intel" and Windows RT, with languages C# and C++, with C# being the preferred (by Microsoft) language. Other languages (C, F#, IronPython, IronRuby, etc.) form a small asteroid belt around the primaries.
The Google tribe uses Chrome and Android, with the languages Java and JavaScript.
There is little in common among these technology tribes. Apple's selection of Swift does nothing to bring them together; instead, it widens the distance.
To some extent, the technology tribes have always been separate. People argued about processors and operating systems (and still do). People argued about programming languages (and still do). People even argued about character sets (and some may still argue, but we've pretty much all moved to Unicode). Despite the differences, there were always elements that spanned platforms, often in programming languages.
Languages, while debated, could bring us together. The big, popular languages (COBOL, FORTRAN, BASIC, C, and C++) were developed under standards committees specifically formed to address the needs of multiple platforms. Some popular languages extended standards (Turbo Pascal, Microsoft BASIC) and some popular languages were new constructs (Visual Basic, Java). Visual Basic was limited to Windows, but Java ran everywhere.
Now we are entering an age with distinct platforms and distinct languages. Your platform defines your language -- or your language defines your platform. The overlap between tribes shrinks, and we have less in common.
This may be caused by vendors, but it also may be caused by developers. There are cross-platform languages (Java is still supported on lots of platforms, COBOL is still with us) but we as developers seem to identify with platforms. ("I'm a Windows developer." "I'm an iPhone developer.") We're not pushing for cross-platform tools.
The future may see more specialization and more tribal separation. I see nothing to pull the tribes together, no central mass to exert a gravitational pull to a center.