Showing posts with label code quality. Show all posts

Sunday, April 9, 2017

Looking inwards and outwards

It's easy to categorize languages. Compiled versus interpreted. Static typing versus dynamic. Strongly typed versus weakly typed. Strict syntax versus liberal. Procedural. Object-oriented. Functional. Languages we like; languages we dislike.

One categorization I have not seen is by the mechanism for assuring quality. Its obscurity is not a surprise -- the mechanisms are more a function of the community than of the language itself.

Quality assurance tends to fall into two categories: syntax checking and unit tests. Both aim to verify that programs perform as expected. The former relies on features of the language, the latter relies on tests that are external to the language (or at least external to the compiler or interpreter).

Interestingly, there is a correlation between execution type (compiled or interpreted) and assurance type (language features or tests). Compiled languages (C, C++, C#) tend to rely on features of the language to ensure correctness. Interpreted languages (Perl, Python, Ruby) tend to rely on external tests.

That interpreted languages rely on external tests is not a surprise. The languages are designed for flexibility and do not have the concepts needed to verify the correctness of code. Ruby especially supports the ability to modify objects and classes at runtime, which means that static code analysis must be either extremely limited or extremely sophisticated.
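Python shares this dynamism, and a small sketch (with a made-up class and method) shows why static analysis struggles with such languages:

```python
# A hypothetical example: a method is attached to a class at runtime,
# so a static analyzer reading only the class definition cannot know
# that the method exists.

class Invoice:
    def __init__(self, amount):
        self.amount = amount

# Later -- perhaps in a different module entirely -- a method is added.
def total_with_tax(self, rate=0.05):
    return round(self.amount * (1 + rate), 2)

Invoice.total_with_tax = total_with_tax  # modifying the class at runtime

inv = Invoice(100)
print(inv.total_with_tax())  # works, yet no static reading of Invoice predicts it
```

Ruby permits the same kind of modification (and more), which is why static analysis of such code must be extremely limited or extremely sophisticated.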

That compiled languages (and the languages I mentioned are strongly and statically typed) rely on features of the language is also not a surprise. IDEs such as Visual Studio can leverage the typing of the language and analyze the code relatively easily.

We could use tests to verify the behavior of compiled code. Some projects do. But many do not, and I surmise from the behavior of most projects that it is easier to analyze the code than it is to build and run tests. That matches my experience. On some projects, I have refactored code (renaming classes or member variables) and checked in changes after recompiling and without running tests. In these cases, the syntax checking of the compiler is sufficient to ensure quality.

But I think that tests will win out in the end. My reasoning is: language features such as strong typing and static analysis are inward-looking. They verify that the code meets certain syntactic requirements.

Tests, when done right, look not at the code but at the requirements. Good tests are built on requirements, not code syntax. As such, tests are more aligned with the user's needs, and not the techniques used to build the code. Tests are more "in touch" with the actual needs of the system.
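A small sketch (with a made-up requirement, function, and threshold) shows the difference: the test asserts a business rule, not a code structure:

```python
# Hypothetical requirement: "Orders over $100 ship free."
# The test restates the requirement; it says nothing about how
# shipping_cost is implemented.

def shipping_cost(order_total):
    return 0.0 if order_total > 100 else 7.95

def test_orders_over_100_ship_free():
    assert shipping_cost(150.00) == 0.0

def test_orders_at_or_under_100_pay_standard_rate():
    assert shipping_cost(100.00) == 7.95

test_orders_over_100_ship_free()
test_orders_at_or_under_100_pay_standard_rate()
```

If the implementation is later rewritten, the tests still describe what the user needs -- which is exactly the point.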

The syntax requirements of languages are inward looking. They verify that the code conforms to a set of rules. (This isn't bad, and at times I want C and C++ compilers to require indentation much like Python does.) But conforming to rules, while nice (and possibly necessary), is not sufficient.

Quality software requires looking inward and outward. Good code is easy to read (and easy to change). Good code also performs the necessary tasks, and it is tests -- and only tests -- that can verify that.

Monday, August 8, 2016

Agile is all about code quality

Agile promises clean code. That's the purpose of the 'refactor' phase. After creating a test and modifying the code, the developer refactors the code to eliminate compromises made during the changes.

But how much refactoring is enough? One might flippantly say "as much as it takes" but that's not an answer.

For many shops, the answer seems to be "as much as the developer thinks is needed". Other shops allow refactoring until the end of the development cycle. The first is subjective and opens the development team to the risk of spending too much time on refactoring and not enough on adding features. The second is arbitrary and risks short-changing the refactoring phase and allowing messy code to remain in the system.

Agile removes risk by creating automated tests, creating them before modifying the code, and having developers run those automated tests after all changes. Developers must ensure that all tests pass; they cannot move on to other changes while tests are failing.

This process removes judgement from the developer. A developer cannot say that the code is "good enough" without the tests confirming it. The tests are the deciders of completeness.

I believe that we want the same philosophy for code quality. Instead of allowing a developer to decide when refactoring has reached "good enough", we will instead use an automated process to make that decision.

We already have code quality tools. C and C++ have had lint for decades. Other languages have tools as well. (Wikipedia has a page for static analysis tools.) Some are commercial, others open source. Most can be tailored to meet the needs of the team, placing more weight on some issues and ignoring others. My favorite at the moment is 'Rubocop', a style-checking tool for Ruby.

I expect that Agile processes will adopt a measured approach to refactoring. By using one (or several) code assessors, a team can ensure quality of the code.

Such a change is not without ramifications. This change, like the use of automated tests, takes judgement away from the programmer. Code assessment tools can consider many things, some of which are matters of style. They can examine indentation, names of variables or functions, the length or complexity of a function, or the length of a line of code. They can check the number of layers of 'if' statements or 'while' loops.
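As a toy illustration -- not a real tool like lint or Rubocop -- a couple of those checks might be sketched like this (the limits are arbitrary):

```python
# A miniature style checker: flags over-long lines and deep nesting.
# Real tools perform many more checks and allow per-project tailoring.

MAX_LINE_LENGTH = 80
MAX_NESTING = 3  # levels of 4-space indentation

def check_style(source):
    """Return a list of (line_number, message) style complaints."""
    complaints = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            complaints.append((number, "line too long"))
        indent = len(line) - len(line.lstrip(" "))
        if indent // 4 > MAX_NESTING:
            complaints.append((number, "too deeply nested"))
    return complaints

bad = "x = 1\n" + "y" * 90 + " = 2\n" + " " * 16 + "z = 3\n"
for number, message in check_style(bad):
    print(number, message)
```

A team would tune the limits (and disable checks it disagrees with), just as real assessment tools allow.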

Deferring judgement to the style checkers will affect managers as well as programmers. If a developer must refactor code until it passes the style checker, then a manager cannot cut short the refactoring phase. Managers will probably not like this change -- it takes away some control. Yet it is necessary to maintain code quality. By ending refactoring before the code is at an acceptable quality, managers allow poor code to remain in the system, which will affect future development.

Agile is all about code quality.

Sunday, July 31, 2016

Agile pushes ugliness out of the system

Agile differs from Waterfall in many ways. One significant way is that Agile handles ugliness, and Waterfall doesn't.

Agile starts by defining "ugliness" as an unmet requirement. It could be a new feature or a change to an existing one. The Agile process sees the ugliness move through the system, from requirements to test to code to deployment. (Waterfall, in contrast, has the notion of requirements but not the concept of ugliness.)

Let's look at how Agile considers ugliness to be larger than just unmet requirements.

The first stage is an unmet requirement. With the Agile process, development occurs in a set of changes (sometimes called "sprints") with a small set of new requirements. Stakeholders may have a long list of unmet requirements, but a single sprint handles a small, manageable set of them. The "ugliness" is the fact that the system (as it is at the beginning of the sprint) does not perform them.

The second stage transforms the unmet requirements into tests. By creating a test -- an automated test -- the unmet requirement is documented and captured in a specific form. The "ugliness" has been captured and specified.

After capture, changes to code move the "ugliness" from a test to code. A developer changes the system to perform the necessary function, and in doing so changes the code. But the resulting code may be "ugly" -- it may duplicate other code, or it may be difficult to read.

The fourth stage (after unmet requirements, capture, and coding) is to remove the "ugliness" of the code. This is the "refactoring" stage, when code is improved without changing the functions it performs. Modifying the code to remove the ugliness is the last stage. After refactoring, the "ugliness" is gone.

The ability to handle "ugliness" is the unique capability of Agile methods. Waterfall has no concept of code quality. It can measure the number of defects, the number of requirements implemented, and even the number of lines of code, but it doesn't recognize the quality of the code. The quality of the code is simply its ability to deliver functionality. This means that ugly code can collect, and collect, and collect. There is nothing in Waterfall to address it.

Agile is different. Agile recognizes that code quality is important. That's the reason for the "refactor" phase. Agile transforms requirements into tests, then into ugly code, and finally into beautiful (or at least non-ugly) code. The result is requirements that are transformed into maintainable code.

Thursday, July 21, 2016

Spaghetti in the Cloud

Will cloud computing eliminate spaghetti code? The question is a good one, and the answer is unclear.

First, let's understand the term "spaghetti code". It dates back to the 1970s, according to Wikipedia, and was probably coined as an argument for structured programming techniques. Unstructured programming was harder to read and understand, and the term offered an apt analogy for messy code.

Spaghetti code was bad. It was hard to understand. It was fragile, and small changes led to unexpected failures. Structured programming was, well, structured and therefore (theoretically) spaghetti programming could not occur under the discipline of structured programming.

But theory didn't work quite right, and even with the benefits of structured programming, we found that we had code that was difficult to maintain. (In other words, spaghetti code.)

After structured programming, object-oriented programming was the solution. Object-oriented programming, with its ability to group data and functions into classes, was going to solve the problems of spaghetti code.

Like structured programming before it, object-oriented programming didn't make all code easy to read and modify.

Which brings us to cloud computing. Will cloud computing suffer from "spaghetti code"? Will we have difficult to read and difficult to maintain systems in the cloud?

The obvious answer is "yes". Companies and individuals who transfer existing (difficult to read) systems into the cloud will have ... difficult-to-understand code in the cloud.

The more subtle answer is... "yes".

The problem of difficult-to-read code lies not in the programming style (unstructured, structured, or object-oriented) but in mutable state. "State" is the combination of values for all variables and changeable entities in a program. For a program with mutable state, these variables change over time. To read and understand the code, one must understand the current state, that is, the current value of all of those variables. But to know the current value of those variables, one must understand all of the operations that led to the current state, and that list can be daunting.

Functional programming (another programming technique) doesn't allow mutable variables. Variables are fixed and unchanging. Once created, they exist and retain their value forever.
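A short Python sketch shows the contrast between the two styles:

```python
# Mutable state: to know 'balance' at any point, the reader must
# replay every operation that touched it.
balance = 100
balance += 50
balance -= 30
# Three statements must be traced to know 'balance' here.

# Immutable style: every intermediate value gets its own fixed name,
# so each name means one thing forever.
opening = 100
after_deposit = opening + 50
after_withdrawal = after_deposit - 30

assert balance == after_withdrawal == 120
```

With the immutable version, understanding any one line requires no history; the name tells you which moment in the computation you are looking at.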

With cloud computing, programs (and variables) do not hold state. Instead, state is stored in databases, and programs run "stateless". Programs are simpler too, with a cloud system using smaller programs linked together with databases and message queues.
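A minimal sketch of such a stateless program, with a plain dictionary standing in for the database (the handler and its names are hypothetical):

```python
# A "stateless" handler: it keeps no state of its own. It receives the
# current state, computes the new state, and returns it to be persisted.

def handle_deposit(store, account_id, amount):
    """Pure handler: reads state from the store, returns an updated copy."""
    new_store = dict(store)  # do not mutate the caller's state
    new_store[account_id] = new_store.get(account_id, 0) + amount
    return new_store

db = {"alice": 100}
db = handle_deposit(db, "alice", 50)
db = handle_deposit(db, "bob", 25)
print(db)  # {'alice': 150, 'bob': 25}
```

Because the handler holds nothing between calls, any number of copies of it can run behind a queue or load balancer -- which is the shape cloud systems encourage.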

But that doesn't prevent people from moving large, complicated programs into the cloud. It doesn't prevent people from writing large, complicated programs in the cloud. Some programs in the cloud will be small and easy to read. Others will be large and hard to understand.

So, will spaghetti code exist in the cloud? Yes. But perhaps not as much as in previous technologies.

Sunday, May 15, 2016

Agile values clean code; waterfall may but doesn't have to

Agile and Waterfall are different in a number of ways.

Agile promises that your code is always ready to ship. Waterfall promises that the code will be ready on a specific date in the future.

Agile promises that your system passes the tests (at least the tests for code that has been implemented). Waterfall promises that every requested feature will be implemented.

There is another difference between Agile and Waterfall. Agile values clean code; Waterfall values code that performs as intended but has no notion of code quality. The Agile cycle includes a step for refactoring, a time for developers to modify the code and improve its design. The Waterfall method has no corresponding step or phase.

Which is not to say that Waterfall projects always result in poorly designed code. It is possible to build well-designed code with Waterfall. But Agile explicitly recognizes the value of clean code and allocates time for correcting design errors. Waterfall, in contrast, has its multiple phases (analysis, design, coding, testing, and deployment) with the assumption that working code is clean code -- or code of acceptable quality.

I have seen (and participated in) a number of Waterfall projects, and the prevailing attitude is that code improvements can always be made later, "as time allows". The problem is that time never allows.

Many project managers have the mindset that developers should be working on features with "business value". Typically these changes fall into one of three categories: features to increase revenue, features to reduce costs, and defect corrections. The mindset also considers any effort outside of those areas to be not adding value to the business and therefore not worthy of attention.

Improving code quality is an investment in the future. It is positioning the code to handle changes -- in requirements or staff or technology -- and reducing the effort and cost of those changes. In this light, Agile is looking to the future, and waterfall is looking to the past (or perhaps only the current release).

Sunday, June 7, 2015

Code quality doesn't matter today

In the 1990s, people cared about code quality. We held code reviews and developed metrics to measure code. We debated the different methods of measuring code: lines of code, cyclomatic complexity, function points, and more. Today, there is little interest in code metrics, or in code quality.

I have several possible explanations.

Agile methods
Specifically, people believe that agile methods provide high-quality code (and therefore there is no need to measure it). This is possible; most advocates of agile tout the reduction in defects, and many people equate the lack of defects with high quality. But while re-factoring occurs (or should occur) in agile methods, it doesn't guarantee high quality. Without measurements, how do we know?

Managers don't care
More specifically, managers are focused on other aspects of the development process. They care more about the short-term cost, or features, or cloud management.

Managers see little value in code
It is possible that managers think that code is a temporary thing, something that must be constantly re-written. If it has a short expected life, there is little incentive to build quality code.

I have one more idea:

We don't know what makes good code good
In the 1990s and 2000s, we built code in C++, Java, and later, C#. Those languages are designed on object-oriented principles, and we know what makes good code good. We know it so well that we can build tools to measure that code. The concept of "goodness" is well understood.

We've moved to other languages. Today we build systems in Python, Ruby, and JavaScript. These languages are more dynamic than C++, C#, and Java. Goodness in these languages is elusive. What is "good" JavaScript? What designs are good for Ruby? or Python? Many times, programming concepts are good in a specific context and not-so-good in a different context. Evaluating the goodness of a program requires more than just the code; it requires knowledge of the business problem.

So it is possible that we've advanced our programming languages to the point that we cannot evaluate the quality of our programs, at least temporarily. I have no doubt that code metrics and code quality will return.

Monday, March 24, 2014

Software is not always soft

Tim O'Reilly asks "Can hardware really change as fast as software":

We're used to the idea that software changes faster than hardware. It's widely accepted as common knowledge. ("Of course software changes faster than hardware! Software is soft and hardware is hard!")

Yet it's not that simple.

Software is easy to change... sometimes. There are times when software is easier to change, and there are times when software is harder to change.

Tim O'Reilly's tweet (and the referenced article) consider software in the context of cell phones. While cell phones have been changing over time, the apps for phones tend to "rev" faster. But consider the software used on PCs. Sometimes PC software changes at a rate much slower than the hardware. The "Windows XP problem" is an example: people stay with Windows XP because their software runs on Windows XP (and not later versions of Windows).

Long-term software is not limited to PCs. Corporations and governments have large systems built with mainframe technology (COBOL, batch processing) and these systems have outlasted several generations of mainframe hardware. These systems are resistant to change and do not easily translate to our current technology set of virtualized servers and cloud computing.

What makes some software easy to change and other software hard? In my view, the answer is not in the software, but in the culture and processes of the organization. "Hard" software is a result, not a cause.

Development teams that use automated testing and that refactor code frequently have a better chance of building "soft" software -- software that is easy to change. Tests keep the developers "honest" and alert them to problems. Automated tests are cheap to run and therefore run frequently, giving developers immediate feedback; comprehensive tests give complete feedback, alerting developers to any deviation from the requirements.

Refactoring is important; it allows developers to improve the code over time. We rarely get code right the first time. Often we are happy that it works, and we don't care about the simplicity or the consistency of the code. Refactoring lets us re-visit that code and make it simpler and consistent -- both of which make it easier to understand and change.

Development teams that use manual methods of testing (or no testing!) have little chance at building "soft" software. Without automated tests, the risk of introducing a defect while making a change is high. Developers and managers will both avoid unnecessary changes and will consider refactoring to be an unnecessary change. The result is that code is developed but never simplified or made consistent. The code remains hard to read and difficult to change.

If you want software to be soft -- to be easy to change -- then I encourage automated testing. I see no way to get "soft" software without it.

On the other hand, if you want "hard" software -- software that is resistant to change -- then skip the automated testing. Build a culture that avoids improvements to the code and allows only those changes that are necessary to meet new requirements.

But please don't complain about the difficulty of changes.

Friday, February 1, 2013

Refactoring and code cleanup are everyone's job

Code is like fish. Over time (and a surprisingly short period of time), it "goes bad" and starts to smell. While fish must be discarded, code can be improved.

Code can be messy for a number of reasons. It can be assembled from older (poorly written) systems. It can be developed under aggressive timeframes. The developers can be careless or inexperienced.

You want to improve your code. Messy code is hard to understand, difficult to debug, and problematic to change. Projects with messy code find that they miss deadlines and have a large number of defects.

Refactoring is the process of changing the structure of code while maintaining its behavior. By changing the structure, you can improve the readability and maintainability of the code. By keeping the functionality, you keep all current features.
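A small, contrived before-and-after sketch -- the structure changes, the behavior does not (and automated tests are what let you verify the "does not"):

```python
# Before: the same formatting logic is duplicated in two functions.
def greet_customer_before(name):
    return "Hello, " + name.strip().title() + "!"

def greet_employee_before(name):
    return "Hello, " + name.strip().title() + "!"

# After: the duplication is extracted into one helper.
def _greeting(name):
    return "Hello, " + name.strip().title() + "!"

def greet_customer_after(name):
    return _greeting(name)

def greet_employee_after(name):
    return _greeting(name)

# The refactoring preserves behavior:
assert greet_customer_before("  ada lovelace ") == greet_customer_after("  ada lovelace ")
```

The "after" version is easier to change: a new greeting format is edited in one place instead of two.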

One might think that assigning the task of refactoring to a subset of the team is sufficient. The idea is that this subteam will improve the code, cleaning up the mess that has developed over time. But I believe that such an approach is ineffective.

Refactoring (and code quality in general) is a task for everyone on the project. The approach of a separate team does not work. Here's why:

The team members dedicated to the task are viewed as a separate team. Usually they are viewed as the elite members of the team or, more darkly, as diva-developers. Sometimes they are viewed as servants, or a lower caste of the team. Fracturing the team in this way benefits no one.

Other (non-refactoring) members of the team can become sloppy. They know that someone will come after them to clean their code. That knowledge sets up an incentive for sloppy code -- or at least removes the incentive for clean code.

The biggest reason, though, is one of numbers. The refactoring team is smaller than the rest of the team, yet it is attempting to clean up the mess created by the entire team. Your team's current processes create messy code (for whatever reason), so the larger (non-refactoring) team keeps creating a mess while the smaller team attempts to clean it. This doesn't work.

As I see it, the only way to clean code is to get everyone involved. No separate team, no elite squad, no penalty assignments. The team's process must change to create clean code (or to improve messy code). Nothing less will do.

Wednesday, August 10, 2011

A measure of quality

I propose that quality of code (source code) is inversely correlated to duplications within the code. That is, the more duplications, the worse the code. Good code will have few or no duplications.

The traditional argument against duplicate code is the increased risk of defects. When I copy and paste code, I copy not only the code but also all defects within the copied code. (Or if requirements later change and we must modify the copied code, we may miss one of the duplicate locations and make an incomplete set of changes.)

Modern languages allow for the consolidation of duplicate code. Subroutines and functions, parent classes, and code blocks as first-class entities allow the development team to eliminate the duplication of code.

So let us assume that duplicate code is bad. Is it possible to measure (or even detect) code duplications? The answer is yes. I have done it.

Is it easy to detect duplicate code? Again, the answer is yes. Most developers, after some experience with the code base, will know if there are duplicate sections of code. But is there an automated way to detect duplicate code?

And what about measuring duplicate code? Is it easy (or even possible) to create a metric of duplicate code?

Let's handle these separately.

Identifying duplicate blocks of code within a system can be viewed as a scaled-up version of the same problem between two files. Given two separate source files, how can one find the duplicate blocks of code? The method I used was to run a custom program on the two files, a program that identified common blocks of code. The program operated like 'diff', but in reverse: instead of finding differences, it found common blocks. (And in fact that is how we wrote our program. We wrote 'diff', and then changed it to output the common blocks and not the different blocks.)
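The same idea can be sketched with Python's standard difflib, which reports matching (rather than differing) blocks between two sequences. Our actual utility was custom-built; this is only an approximation of it:

```python
# Find runs of identical lines shared by two files, 'diff' in reverse.
from difflib import SequenceMatcher

def common_blocks(lines_a, lines_b, min_size=3):
    """Yield (start_a, start_b, size) for shared runs of at least min_size lines."""
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        if block.size >= min_size:
            yield (block.a, block.b, block.size)

file_a = ["def f():", "    x = 1", "    y = 2", "    return x + y", "print(f())"]
file_b = ["# other file", "def f():", "    x = 1", "    y = 2", "    return x + y"]

for start_a, start_b, size in common_blocks(file_a, file_b):
    print(start_a, start_b, size)  # one 4-line block shared between the files
```

In practice you would read the line lists from files and normalize whitespace first, so that trivially reformatted copies are still caught.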

Writing our 'anti-diff' utility (we called it 'common') was hard enough. Writing it in such a way that it was fast was another challenge. (You can learn about some of the techniques by looking for 'how is grep fast' articles on the web.)

Once the problem has been solved for two files, you can scale it up to all of the files in your project. But be careful! After a moment's thought, you realize that to find all of the common blocks of code, you must compare every file against every other file, and this algorithm scales as O(n²). This is a bad factor for scaling, and we solved it by throwing hardware at the problem. (Fortunately, the algorithm is parallelizable.)

After more thought, you realize that there may be common blocks within a single file, and that you need a special case (and a special utility) to detect them. You are relieved that this special case scales at O(n).

Eventually, you have a process that identifies the duplicate blocks of code within your source code.

The task of identifying duplications may be hard, but assigning a metric is open to debate. Should a block of 10 lines duplicated twice (for a total of three occurrences) count the same as a block of 15 lines duplicated once? Is the longer duplication worse? Or is the more frequent duplication the more severe?

We picked a set of "badness factors" and used them to generate reports. We didn't care too much about the specific factors, or the "quantity vs. length" problem. For us, it was more important to use a consistent set of factors, get a consistent set of metrics, and observe the overall trend. (Which went up for a while, and then levelled off and later decreased as we requested a reduction in duplicate code. Having the reports of the most serious problems was helpful in convincing the development team to address the problem.)
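As a sketch, such a weighting scheme might look like this (the weights are illustrative, not the ones we actually used):

```python
# A toy duplication metric with "badness factors": each duplicate block
# contributes by its length and by how often it is repeated, with repeat
# occurrences penalized more heavily than sheer length.

def duplication_score(blocks, length_weight=1.0, occurrence_weight=2.0):
    """Score a list of (lines, occurrences) duplicate blocks; higher is worse."""
    score = 0.0
    for lines, occurrences in blocks:
        extra = occurrences - 1  # the first copy is the "original"
        score += length_weight * lines * extra + occurrence_weight * extra
    return score

# The "quantity vs. length" question from above, under these weights:
print(duplication_score([(15, 2)]))  # a 15-line block duplicated once
print(duplication_score([(10, 3)]))  # a 10-line block duplicated twice
```

The specific factors matter less than consistency: score the code the same way every week and watch the trend.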

In the end, one must review the costs and the benefits. Was this effort of identifying duplicate code worth the cost? We like to think that it was, for four reasons:

We reduced our code base: we identified and eliminated duplicate code.

We corrected defects: We identified near-identical code and found that the near-duplicates were true duplicates, some with fixes and some without. We combined the code and ensured that all code paths had the right fixes.

We demonstrated an interest in the quality of the code: Rather than focus on only the behavior of the code, we took an active interest in the quality of our source code.

We obtained a leading indicator of quality: Regression tests are lagging indicators of quality, observable only after the coding is complete. We can measure duplicate code from the source code, and from the first day of the project, getting measurements immediately.

We believe that we get the behavior that we reward. By imposing soft penalties for duplicate code, measuring the code, and distributing that information, we changed the behavior of the development team and improved the quality of our code. We made it easy to eliminate the duplicate code, by providing lists of the duplicate code and the locations within the code base.