Build Automation: Get Rich Slowly

Written by Kendall Miller on May 29, 2008 – 10:56 pm

Editor’s Note: This is the final article in a three-article series. For a list of the entire series, see the Article Series page.

In the final article of our series we’ll look at how to create an automated build incrementally and make it a natural, evolutionary process for your team, providing both immediate and ongoing value. Really, it’s no joke and it’s not going to require upending your technology or team.

Components of Build Automation

When introducing an automated build, we recommend pursuing components in the following order:

  1. Compile files from Source: Retrieve all of the input data from the configuration management system, and compile everything that needs to be compiled. If you’re working in a language or technology that doesn’t require compilation, just retrieve and label all of the input files. Compiling may include activities such as automatically assembling release notes from the defect tracking system or any other action necessary to create a file that is distributed. Check compiled results back into the configuration management system.
  2. Assemble Build from all Files: Take all of the files needed to create your distribution, including every dependency, release notes, etc., and create your distributable build. The distribution (a.k.a. “build”) ultimately should be exactly one file per target platform. Copy it to a central networked location, either named with the unique build number or otherwise clearly identified.
  3. Automated Unit Test: Using a third party framework (recommended) or your own custom framework, perform automated tests designed to exercise the individual components of your system in a very detailed way. This is often easier than a real system test because the tests will tend to be more stable over time and more compartmentalized.
  4. Install and Smoke Test: On a green system, perform a fresh installation and basic smoke test of the system.
  5. Automated System Test: Perform a full automated test of the public surface area of the system (the part reachable by users).
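
The ordering above lends itself to a staged pipeline that stops at the first failure, so you can start with just the compile stage and append the others as they become ready. Here is a minimal Python sketch of that idea. The stage names, the `run_pipeline` helper, and the placeholder stages are illustrative assumptions, not part of any particular build product.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that succeeded.
    """
    passed: List[str] = []
    for name, action in stages:
        if not action():
            break  # later stages depend on earlier ones, so stop here
        passed.append(name)
    return passed


# Hypothetical placeholder stages; a real build would invoke its tools here.
EXAMPLE_STAGES: List[Stage] = [
    ("compile", lambda: True),
    ("assemble", lambda: True),
    ("unit test", lambda: False),  # simulate a failing unit test run
    ("install and smoke test", lambda: True),
    ("system test", lambda: True),
]
```

Calling `run_pipeline(EXAMPLE_STAGES[:1])` runs only the compile stage, which is roughly where an incremental rollout would begin.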

Compiling Files from Source Control

The first task to take on is to automate compiling all of the files from source. This is required before the build can be centralized, and really is the cornerstone of the process. This step will require the most investment before you realize any return. We recommend taking a developer with IT administration experience and dedicating them to the task. Many IT administrators are used to automating tasks and to installing and cleaning up software, and a solid understanding of administration can be very handy for this step. In our experience it can take anywhere from a few days to a few weeks, depending on the experience of the developer and the complexity of the product.

It’s important that the build process be idempotent to be valid: Regardless of where a previous build stopped, you should be able to restart it and have it recover, clean up, and then work. Generally it’s best to clean up the failed build and then proceed with a good build instead of trying to pick up where the previous build left off (it’s more deterministic). This approach also lets you develop the build iteratively, with a failed build recovery stage added to the start of a normal build process.
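
As a rough illustration of putting the recovery stage at the front of every build, the Python sketch below wipes the working directory before building, so a restart always begins from the same state. The `run_build` and `demo` names, the directory layout, and the file names are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def run_build(work_dir: Path, build_steps: Callable[[Path], None]) -> None:
    """Clean up whatever an earlier build left behind, then build from scratch."""
    # Recovery stage: runs on every build, whether or not the last one failed.
    if work_dir.exists():
        shutil.rmtree(work_dir)
    work_dir.mkdir(parents=True)
    # Normal build stage: always starts from the same known-empty state.
    build_steps(work_dir)


def demo() -> List[str]:
    """Simulate a failed build's leftovers, then run the build twice."""
    work_dir = Path(tempfile.mkdtemp()) / "work"
    work_dir.mkdir()
    (work_dir / "stale.obj").write_text("left over from a failed build")

    def steps(directory: Path) -> None:
        # Stand-in for the real compile and assemble steps.
        (directory / "product.zip").write_text("build output")

    run_build(work_dir, steps)
    run_build(work_dir, steps)  # restarting gives the same result
    return sorted(p.name for p in work_dir.iterdir())
```

Because the cleanup runs unconditionally, there is no separate code path for "the last build failed", which is what keeps the process deterministic.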

The build process must ensure that it generates the right labels to meet the traceability goals of a build. The best way to do this is to generate a unique build number, label the source code, then pull the source code based on the label. Even in source code management systems that aren’t completely transactional, retrieving source code by label is going to be consistent, which is the key goal. This allows the build to run at any time without requiring developers to hold off on checking source code in or out for fear of interfering with the build. For best consistency, the build process should label all of the necessary source code in as short an interval as feasible (to guard against drift between projects), and then it can pull the source code as needed during the build process.
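
The label-then-pull sequence can be sketched in Python as follows. The `InMemoryScm` class is a hypothetical stand-in for a real source code management system, and the label format is an assumption. The point is only that a check-in made after labeling does not change what the build retrieves.

```python
from typing import Dict


class InMemoryScm:
    """Hypothetical stand-in for a source control system; not a real API."""

    def __init__(self) -> None:
        self.head: Dict[str, str] = {}
        self.labels: Dict[str, Dict[str, str]] = {}

    def check_in(self, path: str, content: str) -> None:
        self.head[path] = content

    def apply_label(self, label: str) -> None:
        # Record which version of every file the label refers to.
        self.labels[label] = dict(self.head)

    def get_by_label(self, label: str) -> Dict[str, str]:
        return dict(self.labels[label])


def start_build(scm: InMemoryScm, product: str, build_number: int) -> str:
    """Label the source with a unique build identifier and return the label."""
    label = f"{product}-build-{build_number}"
    scm.apply_label(label)
    return label


def demo() -> Dict[str, str]:
    scm = InMemoryScm()
    scm.check_in("Main.cs", "version 1")
    label = start_build(scm, "Widget", 42)
    scm.check_in("Main.cs", "version 2")  # a developer checks in mid-build
    return scm.get_by_label(label)        # the build still sees version 1
```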

The build should be easy to extend with new projects. This requires spending a little time considering how to externalize what projects to compile, where to get them, and where to put the output from the raw build process itself. We spent some time writing a standard build script, integrated with our product of choice, that reads a single data file containing all of the information it needs for any one product. This lets us set up new products very quickly and amortizes the development effort of making the build process over multiple projects.
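
To give a feel for what such a data file might contain, here is a small Python sketch that loads one product’s settings from JSON. The JSON format, the field names, and the sample values are all assumptions for illustration; they are not the format our own script uses.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ProductBuild:
    """Everything the shared build script needs to know about one product."""
    name: str
    source_paths: List[str]   # where to get the source
    projects: List[str]       # what to compile, in order
    output_dir: str           # where to put the build output


def load_product(text: str) -> ProductBuild:
    """Parse one product's data file (JSON is an assumed format)."""
    data = json.loads(text)
    return ProductBuild(
        name=data["name"],
        source_paths=list(data["source_paths"]),
        projects=list(data["projects"]),
        output_dir=data["output_dir"],
    )


# Hypothetical data file for a product called "Widget".
EXAMPLE_DATA_FILE = """
{
  "name": "Widget",
  "source_paths": ["Widget/trunk"],
  "projects": ["Widget.Core", "Widget.UI"],
  "output_dir": "builds/Widget"
}
"""
```

Adding a new product then means writing another small data file rather than another build script.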

Some example products that can help you fully automate your build:

  1. ANT: Free and capable. Probably the best freely available build system. Available in many flavors to match your technology (like NANT).
  2. Visual Build Pro: We’ve used this on several projects and it’s what we use internally. Sports a great GUI for developing and debugging the build process, comes with built-in interfaces for pretty much anything you want to talk to on Windows, and can be easily extended. Great for folks that prefer an IDE. Very cost effective.
  3. MSBuild: Visual Studio ships with an internal build environment that can be extended to handle a number of tasks and, with enough force of will, can be used to do most anything you want. Debugging and extending it to perform tasks beyond basic compilation and file copying can be a challenge, and your time is probably better spent on a dedicated build framework.

Centralizing the Build

Once you’ve automated your build, you can move it to a central server. To centralize the build on a common server, you need to have a mechanism that satisfies at least the following:

  1. Anyone can trigger a build remotely: Anyone (subject to some basic security authorization) can trigger a build, and do so without any particularly specialized knowledge. The build system will automatically know if it’s safe to build (such as ensuring conflicting builds aren’t run at the same time). This has to be available from anywhere developers are.
  2. Easy access to the status of builds remotely: It should be trivial to know if a build is underway and to know the historical success of the build process. This has to be available from anywhere developers are.
  3. Works logged off and through restarts: The build process should not require the system be left logged in or require manual steps to bring online after a computer restart.
  4. Runs as a unique user: The build process shouldn’t use anyone else’s identity to log into the source code management system or access other resources so it’s very clear from an audit perspective when the build did something.
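
The centralization products listed in this section handle the “is it safe to build?” check for you. Purely to illustrate the idea behind that check, here is a Python sketch that uses an exclusively created lock file to refuse a second build while one is running. The `build_lock` helper, the `BuildInProgress` exception, and the lock file name are assumptions, and a production version would also need to deal with a stale lock left by a crashed machine.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, List


class BuildInProgress(Exception):
    """Raised when another build already holds the lock."""


@contextmanager
def build_lock(lock_path: Path) -> Iterator[None]:
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise BuildInProgress(f"lock held: {lock_path}") from None
    try:
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(str(os.getpid()))  # record who holds the lock
        yield
    finally:
        os.remove(lock_path)  # release the lock even if the build fails


def demo() -> List[str]:
    lock = Path(tempfile.mkdtemp()) / "build.lock"
    events: List[str] = []
    with build_lock(lock):
        events.append("first build running")
        try:
            with build_lock(lock):
                events.append("second build running")
        except BuildInProgress:
            events.append("second build refused")
    events.append("lock released" if not lock.exists() else "lock leaked")
    return events
```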

Some example products that can centralize your build:

  1. Cruise Control: Pretty much the standard. Free, capable, and satisfies the requirements. We use the .NET oriented version, Cruise Control .NET, for our in-house build system. The user interface is fairly primitive, and the non-web client is unduly cranky, but it is a good remoting system. It isn’t a particularly good build system - you’ll want to use one of the build products listed above.
  2. Automated Build Studio: This commercial product for the Windows platform is reasonably capable and is both a build product and a build centralization system. While it is capable, the pricing is fairly high when you compare it to Cruise Control + Visual Build Pro (see above), which is substantially as capable. This is because you really don’t need many Visual Build Pro licenses on a typical team, but you will need a number of Automated Build Studio licenses to allow anyone to invoke and monitor a build. The centralization capability is relatively new in the product. The main reason to go this route is that one configuration IDE can give you centralization and automation, so you are trading money for time.

Challenge Your Team To Fill It Up, Then Buy A Bigger One

While it is very tempting to recycle some old developer system or server as the build server because it doesn’t feel like performance should matter, investing in a high speed build server can pay back quickly. A fast server lets you optimize the build process for strictness and reliability instead of performance, while keeping it fast enough to preserve the development team’s attitude that builds are free. When your build process gets long (up to 1 hour), purchasing just one system gives your team a great boost in productivity. Consider that a great new build server should top out at $5,000. If you replaced it every 18 months, it would double in performance at the same price.

When specifying a build server, you want to emphasize single processor performance and disk performance. Memory is generally not a particular issue, so invest in the fastest pair of disks you can find with a good hardware RAID controller (for the very best disk throughput) and the highest gigahertz single socket processor you can get. Build processes can very rarely take advantage of multiple cores, so anything beyond two cores isn’t going to speed up a single build, but gigahertz will. Finally, make sure it has a big network pipe to the configuration management system because it will spend a fair amount of its time pushing things in and out of it.

Automated Unit Tests

Connect your existing unit test system to the build system to automatically perform the unit tests on each build and post the results. This can generate some useful metrics that you can use to understand the quality and progress of your development:

  1. Increase in Proportion to Features: As you claim victory on adding features to your system, you should see a roughly linear increase in unit tests. For example, if you have 100 unit tests and the current system has, say, 200 design features, then in broad terms you should see 10 new unit tests for every 20 new design features. It isn’t completely accurate; it’s a guideline. However, if you see only 5 new unit tests for 25 new features, you know something is up.
  2. Indication of Design Complexity Problems: If you are seeing unit tests routinely break that previously worked, or a single unit test repeatedly break, this will tend to indicate that the design of the system (the architecture or software patterns or implementation) is unreasonably complicated for its feature goals. In the abstract it’s often hard to have team discussions on these points because it’s all a matter of tradeoffs, experience, and opinions. This will give you empirical evidence that the software is overly complicated to keep functional.
  3. Indication of Performance Issues: Every unit test, when tracked over time, gives you a clear trend to understand the performance of your system. It provides a highly standardized test case where the exact same routine was run on the same hardware in the same way, and timed. If you see the duration of your unit tests change (up OR down) it’s worth investigating - it may be failing quickly or slowly internally, or you may have counter-optimized the code.
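
Two of these metrics reduce to simple arithmetic, sketched in Python below. The function names and the 20% tolerance for flagging a duration change are assumptions chosen for illustration; pick a threshold that suits the noise level of your own build server.

```python
def expected_new_tests(total_tests: int, total_features: int,
                       new_features: int) -> float:
    """Scale the existing tests-per-feature ratio to a batch of new features."""
    return total_tests / total_features * new_features


def duration_changed(previous_seconds: float, current_seconds: float,
                     tolerance: float = 0.20) -> bool:
    """Flag a unit test whose run time moved up OR down by more than tolerance."""
    if previous_seconds <= 0:
        raise ValueError("previous_seconds must be positive")
    change = abs(current_seconds - previous_seconds) / previous_seconds
    return change > tolerance
```

With the figures from the first item, `expected_new_tests(100, 200, 25)` gives 12.5, so 5 new tests for 25 new features is well short of the guideline. Note that `duration_changed` deliberately treats a test that suddenly runs faster as worth a look too, since it may be failing quickly internally.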

Most automated build systems can directly run one or more unit test frameworks, and each of the build products discussed above can do this. If your build system can’t, you might look at one of these. You should definitely check out:

  1. NUnit: Unit testing for .NET. And yes, we get the irony that the site is written in PHP. This is our unit test framework of choice. However, it’s worth noting that we’d be a lot less enamored with it if it weren’t for ReSharper’s ability to act as a dramatically better test runner.
  2. JUnit: Unit testing for Java. Don’t let the nearly comical web site fool you, this test system is all meat.
  3. TestComplete: A commercial product from the same folks that made Automated Build Studio. It goes way beyond unit testing (as do most commercial test products) and integrates with both Visual Studio and Automated Build Studio, so if you go with ABS you might look to use these together. It earns our honorable mention because it’s reasonably affordable, easy to use, and very approachable.
  4. Hundreds of others: There are many automated testing products out there. We recommend you start with something very simple and straightforward and stay away from the large enterprise testing systems. These are really not meant for the needs of small and mid-sized development teams, particularly where your QA staff are largely development trained.

Run It Every Day

Every development project should be built every day if there’s any change made to the source it comes from. The build centralization systems mentioned above can detect if there is any change to the affected projects checked in and automatically queue a build for a fixed time of day (for a scheduled, automatic build) or a few minutes after the change is checked in. The latter is a great approach for a team that is fully adopting agile development practices: Configure it to automatically start a build once a few minutes have passed since the last check-in. This gives rapid feedback to the team and encourages good configuration management discipline: What you check in had better build and pass tests, so don’t check it in until you’re ready.
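
The centralization products provide this trigger as a configuration setting, so you should not need to write it yourself. To make the rule concrete, here is a Python sketch of the decision. The `should_start_build` name and the five minute default quiet period are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_start_build(last_check_in: Optional[datetime],
                       last_build_started: Optional[datetime],
                       now: datetime,
                       quiet_period: timedelta = timedelta(minutes=5)) -> bool:
    """Start a build only when there are unbuilt changes and check-ins have gone quiet."""
    if last_check_in is None:
        return False  # nothing has ever been checked in
    if last_build_started is not None and last_build_started >= last_check_in:
        return False  # the latest change has already been built
    # Wait until no one has checked in for the whole quiet period.
    return now - last_check_in >= quiet_period
```

Each new check-in moves `last_check_in` forward, which restarts the wait, so a burst of related check-ins produces one build instead of several.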

Frequent execution ensures that build changes are coordinated with code changes, because uncoordinated changes will fail the build immediately. This will quickly instill the discipline within the development team to keep the build clean, which in the end takes up the least time. Just as it’s easier to maintain something than to fix it, it takes less time across the team to keep the build live and accurate than to deal with the downstream consequences of not having your house in order.

Get There In Stages

The great thing about centralizing and automating your build process is that you can get there incrementally. You can make minor investments in time from various team members and each investment will add a little value to your automated build process which in turn will generate a return from there forward. There aren’t a lot of development practices that can be incrementally adopted in such small chunks yet produce steady returns.

The incremental approach is highly recommended because it helps conquer some objections by not requiring a major change in developer habits or a major investment in developer time up front. You can even usually get it done with such small investments in time that it doesn’t need to be a formal project or even a formal part of your project plan, if you are concerned about internal resistance to spending work on non-development activities.

Wait, What about Visual Studio Team System?

Microsoft’s Visual Studio Team System (VSTS) does provide all of the infrastructure you need to automate your build process - centralization, configuration management, testing, reporting, the whole package. If you can afford its licensing fees and the investment in time and resources it takes to set it up, it will fit the bill. VSTS is aimed at larger development teams - on the small side, 20+ developers (people actually writing code). The tools and techniques discussed in this article are intended to provide value on teams as small as two developers. Unless you have large development teams or free VSTS licenses, it’s probably not your best bet.

About Product Recommendations

This article features specific product references and recommendations. Neither the authors of this article nor eSymmetrix are affiliated with any company mentioned, nor have they received any consideration at any time from a party with an interest in these products.

Posted in Process, Software Development
