Build Automation: Setting the Stage

Written by Kendall Miller on May 22, 2008 – 12:35 am

Editor’s Note: This is the first article in a three-article series, with a new article posted every few days.

If you haven’t experienced the difference an automated software build system can make to your entire approach to development, this article series will show you why it’s worth your time and how to get it done. Before we launch into the nuts and bolts of setting up a build automation system, let’s step back and establish some common ground.

What’s A Build?

A build is the process that takes your source code and translates it into an installable product. There are some definitions that merely look at the first part (building executable files), but I prefer to look at things from a results standpoint: A process should achieve an external result, and the external result of building software is that you have a package that can be distributed and installed by users.

The critical goal is to ensure traceability from product back to the source code that created it:

  1. A given version of your product must represent a unique build so you know that there’s just one “1.1.1452” version of your product in existence.
  2. Each binary file (.dll, .exe, .jar, etc.) needs to have a unique version number to ensure that there is just one “1.1.1452” version of “MyCoolApp.exe” so you can look up the source code by that version number.
  3. The source code for each binary must be labeled with the version number so you know what source code made that version.
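
As a minimal sketch of these three rules, assume a hypothetical in-memory `BuildRegistry`. The class, its method names, and the label format are illustrative only and do not belong to any real source control tool. It refuses to issue the same version number twice and maps each version back to the source label that produced it.

```python
class BuildRegistry:
    """Hypothetical record mapping each version number to its source label."""

    def __init__(self):
        self._labels = {}  # version string -> source control label

    def register(self, version, source_label):
        # Rules 1 and 2: a version number may be issued exactly once.
        if version in self._labels:
            raise ValueError(f"version {version} already exists")
        self._labels[version] = source_label

    def source_for(self, version):
        # Rule 3: look up the labeled source that produced this version.
        return self._labels[version]


registry = BuildRegistry()
registry.register("1.1.1452", "label/MyCoolApp/1.1.1452")
print(registry.source_for("1.1.1452"))
```

A real build would keep this record in the source code control system itself (as labels) rather than in memory; the point is only that a duplicate version is rejected rather than silently overwritten.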

The same rules apply to non-compiled code as well; you just tend to treat it at a higher level (e.g. a whole set of PHP files as a group instead of each individual file).

To achieve these goals, I’ve always used a few simple rules:

  1. Every exchange loops through the source code control system: From computer to computer or process to process, do it by checking the output into the source code control system and getting it from there on the other side. This ensures you have a way of seeing the output of each stage.
  2. Only builds leave development: When you are going to bridge from your raw development environment to any other environment - test, certification, whatever - it’s done through a full build that has its own unique tracking number. Even if you just made another build 10 minutes ago.

These rules eliminate the possibility of transient work products (e.g. binaries) getting anywhere without the tracking to back up where they came from. They also ensure that any developer that has pack rat tendencies (and most do) will have to push things from their box to the source code control system, which should be on a nice safe server that’s backed up.

Sidebar: Seriously. Your source code control system is virtually irreplaceable. It should live on server-grade hardware fed nice clean power with a UPS and regular backups. The system you select should have a strong track record of never corrupting data and you should be comfortable that your backups of it are top flight. I recommend a product that stores into a commercial-grade database because the data is just that important.

What’s In Your Build Process?

At a high level the process to achieve this traceability is going to look something like this:

  1. Get the code for each project that needs to be compiled.
  2. Update the version information so you get a unique version of the compiled files.
  3. Compile them.
  4. Label the source code you compiled with the version number.
  5. Package the binary files with everything else needed for the product into a distribution format.
  6. Store that distribution in a central location with a version number or name indicating what version it is.
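
The six steps above can be sketched as a single function. This is a stub under stated assumptions: `run_build`, the project names, and the `builds/` location are all hypothetical, and each step only records what it would do instead of calling a compiler or a source control client.

```python
def run_build(projects, version):
    """Return the ordered list of actions a build would perform (stubbed)."""
    actions = []
    for project in projects:
        actions.append(f"get {project}")              # 1. get the code
        actions.append(f"stamp {project} {version}")  # 2. update version information
        actions.append(f"compile {project}")          # 3. compile
        actions.append(f"label {project} {version}")  # 4. label the source
    actions.append(f"package {version}")              # 5. package the distribution
    actions.append(f"store builds/{version}")         # 6. store it centrally
    return actions


for action in run_build(["Core", "MyCoolApp"], "1.1.1452"):
    print(action)
```

Even in stub form, the loop shows that steps 1 through 4 repeat for every project, while packaging and storing happen once per build.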

That feels very simple and straightforward, doesn’t it? Just six steps. When you look closer, you’ll notice there are a lot of loops: You have to get, label, and compile the source code for every project that needs to be built. Often, these projects have to be built in a specific order to work correctly. It may not even be obvious until the code is smoke tested that they were built out of order and won’t run together as a group. You also need to do this with absolute confidence in the integrity of the process so when you find a problem on a computer and it appears to be running version “1.1.1452” you have confidence in exactly what that means, all the way back to the source code.
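
One way to make the build order explicit, rather than something discovered at smoke test time, is to declare each project's dependencies and sort them. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the project names and the `build_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical projects: each key lists the projects it depends on.
dependencies = {
    "Core": set(),
    "Data": {"Core"},
    "MyCoolApp": {"Core", "Data"},
}


def build_order(deps):
    """Return project names so every project follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies form a loop.
    return list(TopologicalSorter(deps).static_order())


print(build_order(dependencies))  # ['Core', 'Data', 'MyCoolApp']
```

A circular dependency fails loudly here instead of producing binaries that won't run together as a group.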

Pretty much every development environment includes some form of build automation. In the old days it was “Make”. In Visual Studio it’s now MSBuild. For the most part, these tools are competent at performing the basic steps necessary to take source code and produce binaries, but they aren’t generally going to handle the other elements like labeling source code, checking in outputs, and copying the final distribution to a central location. If they can be extended to do that, it’s usually fairly high effort, and can easily get in the way of the routine local builds your developers need to do on their development systems.

But Wait, There’s More

This is a very simplistic view of what a build looks like because it leaves out a critical step: The smoke test. It really can’t be called a build if it can’t be installed and at least fire up without falling over and dying. It’d also be nice to pull together release notes including the defects that were fixed or new features added in this build. Finally, let’s notify the team that a new build exists so they can pick up where the build leaves off.
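
The "fire up without dying" half of a smoke test can be sketched with the standard `subprocess` module. The `smoke_test` helper and its 30-second default are assumptions for illustration, and a trivial Python process stands in for launching the installed product; verifying the install itself is not covered.

```python
import subprocess
import sys


def smoke_test(command, timeout_seconds=30):
    """Return True if the command starts and exits cleanly within the timeout."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout_seconds)
    except (OSError, subprocess.TimeoutExpired):
        # The program could not be launched, or it hung past the timeout.
        return False
    return result.returncode == 0


# Stand-in for launching the installed product: a trivial Python process.
print(smoke_test([sys.executable, "-c", "print('started')"]))
```

A build that fails this check would be marked as broken rather than published to the team.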

You Don’t Need an Automated Build

You can do all of this by hand indefinitely. After all, if you document the process it should be possible for a professional to correctly execute the build by hand every time, following each step.

There are three key problems with this approach:

  1. Humans are fallible: A well-trained professional doing an intricate task will still make a mistake around two percent of the time. That’s one in 50 opportunities: They’ll put the wrong version number on something, label the same folder twice and one not at all, not clean out the working directory first, something.
  2. The potential for mistakes degrades value: Because a main point of the build process is to have confidence that you can absolutely go from distribution package back to every element of source code it contains, even the possibility that there was an error in how the build process was executed will make you doubt its integrity and therefore you won’t achieve the value you wanted.
  3. It’s wasteful: Each build occupies a well-trained professional’s time. If you need to do a new build at 2:00AM, you need a well-trained professional to execute a possibly lengthy process accurately. This costs you resources and even worse it’s not a job any developer likes, so it costs morale.
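
To see how the two percent figure compounds, here is a rough calculation. It assumes each manual step is an independent opportunity for a mistake, which is a simplification; the `chance_of_mistake` helper and the step counts are illustrative.

```python
def chance_of_mistake(steps, error_rate=0.02):
    """Probability of at least one mistake across independent manual steps."""
    return 1 - (1 - error_rate) ** steps


for steps in (1, 6, 50):
    print(steps, round(chance_of_mistake(steps), 3))
```

Under that assumption, a six-step manual build goes wrong roughly 11 percent of the time, and a process with 50 manual opportunities goes wrong more often than not.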

Over time, the fact that each build is a risk and a waste will tend to unconsciously affect the decision making of the development team, making them more likely to defer a fix or change they might be able to code and unit test on their own computer but don’t feel is worth the overhead of the build.

Traditional Resistance

There are a number of reasons that are typically put forward against having a central, automated build process. The most common ones I hear are:

  1. It will slow down testing and certification: Since each build that is going to be tested outside of a developer’s machine has to come from the build system, that means that even a small error found in certification will require the entire build be run before it can be tested. Why not let a developer just recompile the offending file and slip it onto the cert system to verify it?
  2. It takes extra resources: Having an engineer set up and maintain the build process takes time away from development, which means my customers will get fewer features, etc.
  3. It slows down change: Every time we want to add a new binary file or a dependency we will have to update the build system and possibly the build process and retest it. This will get in the way of an individual developer being able to get things done as fast as possible.
  4. Single point of failure: What if the build computer fails? If it’s the only place to do a build, we’re stopped.

These objections generally spring from two underlying problems within the development team: developers who lack confidence, and fear of change.

Developers Playing Hide the Ball

If there is a developer on your team who isn’t up to the rest of your team’s level and is trying to hide it, a central build is virtually guaranteed to bring it to everyone’s attention. They won’t be able to just slip a new file into the build or slip a fix into test without it being clear what happened.

If the time it takes to perform a build - whatever that is - is an impediment to certifying your software because you need to fix problems faster than that time, you have a more fundamental issue: Your developers are not thinking through their code before it’s included in the build. Fundamentally, it’s called Certification, not Debugging, for a reason: Developers should be genuinely surprised when their code doesn’t work as expected after it leaves their hands.

If this is the case, then when a problem makes it to test it shouldn’t matter if the build takes 30 minutes or even two hours. Any development process that needs to go from the developer’s fingertips to certification in less than that time has more fundamental quality control and process issues.

If you have developers concerned that this slows down their ability to add new projects or dependencies because they have to think through how to update the build system, this is really a good thing: These decisions matter by the time you want to ship a product to customers, so the earlier you can address them, the lower the probability you’ll discover in certification that redistributing a particular dependency is hard or being done wrong.

Fear Of The Unknown

Most developers are not IT administrators, and all developers are humans. Human beings fundamentally don’t like change. They will actively fight change, often with very good prose. Giving up the control of doing a local compile and taking the binaries that work on their box, in exchange for a central system that is opaque, is uncomfortable. The very same developer who was perfectly willing to switch to Visual Studio 2008 the second it was posted to MSDN, and who downloads the latest nightly build of NHibernate, will come up with all sorts of creative reasons against a central, automated build because of their fear of change.

If you are following reasonable source code control rules, you really don’t need to worry about backing up an individual developer’s system: There shouldn’t be much that’s on it uniquely if it were to be lost, preferably at most a day’s work (which is within the time frame of a backup/restore loss anyway). The build system is special: As part of making it the central authority for building your distribution, it really is inconvenient to have to recreate it from scratch by reinstalling all of the software components, etc. It is likely to be slightly different from your developers’ computers (server-grade hardware vs. desktops) so your normal developer image won’t work on it. Back it up as part of your normal production server backup scheme, and invest in redundant disks so it’s unlikely you’ll need those backups. Redundant disks will tend to give you better build performance anyway, so it’s a double benefit.

Coming Next: Benefits of Automation and Centralization

Check back for the second article in this series focusing on the benefits for your team of automating the build process and centralizing it, including the roles and capabilities of an automated build system. From there the series will continue with how to create an automated build incrementally and make it a natural evolutionary process of your team.

Posted in Process, Software Development