Sunday, March 6, 2011

One level down

A while back, I was build master for a large project. The project consisted of twenty or so Visual C++ projects ("solutions", in Microsoft's terms) and five C#/.NET projects.

As build master, I had to maintain the build scripts and the system that ran them. The build system itself was a complicated application: a Java program with dozens of classes, XML files for the scripts, and an interface that ran on a web page. Maintaining the build system was expensive, so we chose to rewrite it. The result was a much simpler collection of batch files. The batch files looked something like this:


CD project-directory-1
MSDEV /build /solution project-1.sln /project Release
CD ..
CD project-directory-2
MSDEV /build /solution project-2.sln /project Release
CD ..
... repeat for all twenty-five projects

The one feature we needed in the system was for it to stop on an error. That is, if a Visual C++ solution failed to compile, we wanted the build system to stop and report the failure, not continue on and attempt to build the rest of the projects.

Batch files in Windows are not good at stopping. In fact, they are very good at continuing on: when a command fails, the batch file simply moves to the next line. You can force a batch file to stop, though. Here's our first attempt:


CD project-directory-1
MSDEV /build /solution project-1.sln /project Release
IF %ERRORLEVEL% NEQ 0 EXIT /B 1
CD ..
CD project-directory-2
MSDEV /build /solution project-2.sln /project Release
IF %ERRORLEVEL% NEQ 0 EXIT /B 1
CD ..
... repeat for all twenty-five projects

This solution is pretty ugly, since it mixes in the control of the execution with the tasks of the execution. (Not to mention the repetitiveness of the 'IF/EXIT' command.) The problem was pervasive: we wanted our scripts to stop after a failure in any part of the process, not just compiling projects. Thus we needed 'IF/EXIT' lines sprinkled in the early phases of the job when we were getting files from version control and in the later part of the job when we were bundling files into an install package.

After a bit of thought and several discussions, we implemented a different solution. We wrote our own command processor, one that would feed commands to CMD.EXE one at a time, and check the results of each command. When a command failed, our command processor would stop and report the error.
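Our actual command processor is not reproduced here, but the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names run_script and start_dir are hypothetical, and the sketch tracks CD commands itself because each command runs in a fresh shell (how the original handled directory changes is not described above).

```python
import os
import subprocess

def run_script(lines, start_dir="."):
    """Run script lines one at a time and stop at the first failure.

    Returns (exit_code, failed_command); failed_command is None on success.
    Hypothetical helper for illustration, not the original tool.
    """
    cwd = os.path.abspath(start_dir)
    for line in lines:
        command = line.strip()
        if not command:
            continue  # skip blank lines
        parts = command.split(None, 1)
        # Assumption: every command gets its own shell, so a CD would be
        # forgotten immediately. Track the working directory here instead.
        if parts[0].upper() == "CD" and len(parts) == 2:
            target = os.path.normpath(os.path.join(cwd, parts[1]))
            if not os.path.isdir(target):
                return 1, command  # treat a bad directory as a failure
            cwd = target
            continue
        # shell=True hands the line to the platform shell (CMD.EXE on Windows).
        result = subprocess.run(command, shell=True, cwd=cwd)
        if result.returncode != 0:
            return result.returncode, command  # stop; do not run later lines
    return 0, None
```

The script passed in contains only the tasks. The loop owns the decision to stop, which is the whole point: the error check is written once instead of after every command.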

The result was a much simpler script. We took out the 'IF/EXIT' lines, and the script once again focussed on the task of building our projects.

With our new command processor in place, we added logic to capture the output of the called programs. We captured the output of the compilers, the source control utilities, and the install packager. This allowed for an even simpler and more focussed script, since we removed the '>log.txt' and '2>errlog.txt' clauses on each line.
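Capturing output at the command-processor level can be sketched the same way. The Python below is a minimal illustration, not our original code: run_captured is a hypothetical name, and merging standard error into standard output in a single log is an assumption made for brevity.

```python
import subprocess

def run_captured(command, log, cwd=None):
    """Run one command, write its output to an open log, return its exit code.

    Hypothetical helper for illustration. stdout and stderr are merged so
    the log shows messages in one stream.
    """
    result = subprocess.run(
        command,
        shell=True,                 # hand the line to the platform shell
        cwd=cwd,
        stdout=subprocess.PIPE,     # capture instead of printing
        stderr=subprocess.STDOUT,   # fold error output into the same stream
        text=True,                  # decode bytes to str
    )
    log.write(f"> {command}\n")
    log.write(result.stdout)
    if result.stdout and not result.stdout.endswith("\n"):
        log.write("\n")
    log.write(f"[exit code {result.returncode}]\n")
    return result.returncode
```

Because the redirection lives in the processor, no line of the build script needs its own '>log.txt' or '2>errlog.txt' clause.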

Looking back, I realize that our solution was to move the logic for error detection down one level. It moved the problem out of the script and into the command processor that runs the script.

Sometimes, pushing a problem to a different level is the right thing to do.
