Understanding Distributed Builds

Introduction

Distributed Builds is an extension to the base Continuum functionality that lets us process multiple independent builds beyond the capacity of a single server. It also lets us run builds on multiple platforms while retaining a unified view of all project builds.

Architecture

Continuum follows a Client-Server model using XML-RPC as the protocol. However, since it uses a bi-directional XML-RPC implementation, we instead distinguish the components by calling them Master and Build Agent.

The Master is a Continuum instance that has the ability to delegate the builds to registered Build Agents.

The Build Agent is a standalone Jetty-bundled webapp that listens for any build requests from the Master it is assigned to.

There is a one-to-many relationship between the Master and the Build Agents. A Master may have many Build Agents, but each Build Agent can have only one Master.

TODO: insert image here

Behavior

Distributed builds happen at the project group level of Continuum. When an entire project group is built in the Master, independent projects (single projects or multi-module projects) are distributed to any available registered Build Agent. A Build Agent is considered available when it is not currently building anything, because it can attend to only one build request from the Master at a time.
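
The notion of an independent project can be illustrated with a small sketch. It assumes that projects are identified by name and that any dependency between two projects forces them into the same distributable unit, which is then computed with a union-find structure. The class and method names (IndependentProjectGroups, addProject, addDependency, groups) are hypothetical and are not part of Continuum; this is only one way such grouping could be done.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not a Continuum class: groups projects so that
// projects linked by a dependency end up in the same unit of work.
final class IndependentProjectGroups {

    // Union-find parent pointers, keyed by project name (sorted for stable output).
    private final Map<String, String> parent = new TreeMap<>();

    // Returns the representative of the group that contains the project.
    private String find(String project) {
        parent.putIfAbsent(project, project);
        String root = project;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        // Path compression: point every visited node directly at the root.
        String current = project;
        while (!current.equals(root)) {
            String next = parent.get(current);
            parent.put(current, root);
            current = next;
        }
        return root;
    }

    // Registers a project that may have no dependencies at all.
    void addProject(String project) {
        find(project);
    }

    // Records that two projects depend on each other and must be built together.
    void addDependency(String a, String b) {
        String rootA = find(a);
        String rootB = find(b);
        if (!rootA.equals(rootB)) {
            parent.put(rootA, rootB);
        }
    }

    // Each returned list is one unit that could be delegated as a single build.
    Collection<List<String>> groups() {
        Map<String, List<String>> byRoot = new TreeMap<>();
        for (String project : new ArrayList<>(parent.keySet())) {
            byRoot.computeIfAbsent(find(project), k -> new ArrayList<>()).add(project);
        }
        return byRoot.values();
    }

    public static void main(String[] args) {
        IndependentProjectGroups g = new IndependentProjectGroups();
        g.addProject("standalone");
        g.addDependency("parent-pom", "module-a");
        g.addDependency("parent-pom", "module-b");
        System.out.println(g.groups());
    }
}
```

In this sketch the three projects of the multi-module build form one group and the standalone project forms another, so at most two build requests would be sent out.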

In a project group containing a mix of projects, the distribution of work goes through the following steps:

  1. In the Master, a build in the project group is triggered.
  2. Every independent project within the project group is identified, whether as a single project or a multi-module project. Projects with inter-dependencies cannot be distributed separately, so multi-module projects are delegated to a Build Agent as one build.
  3. For each independent project, the Master iterates over the list of registered Build Agents and queries each one for availability. The query is an XML-RPC ping() followed by a getBuildSizeOfAgent() invocation.
  4. If there is a Build Agent available, the Master collects the information necessary for the build (SCM URL, project id, etc.) and passes it when invoking buildProjects() on the Build Agent with the smallest number of tasks in its queue.
  5. In the Build Agent, the build request is processed: the build is queued and then executed. On execution, the Build Agent first performs an SCM checkout, then runs the actual build.
  6. While the build is running, the Master can invoke cancelBuild(), which returns a transient build result, and getBuildResult(), which updates the build output viewed in the Master.
  7. After the build, the Build Agent returns the complete build result to the Master by invoking the callback method returnBuildResult(). The Master aggregates these results to provide a unified view of projects.

    TODO: insert sequence diagram here
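
The selection and dispatch steps above can be sketched roughly as follows. This is a minimal illustration, not Continuum's implementation: the method names ping(), getBuildSizeOfAgent() and buildProjects() come from the steps above, but their signatures, the BuildAgentClient interface, the string-based build context and the FakeAgent test double are all assumptions made for the example. The sketch treats any agent whose ping fails as unavailable and picks the reachable agent with the fewest queued tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class MasterDispatchSketch {

    // Hypothetical view of a Build Agent as seen from the Master.
    // Signatures are assumed; only the method names appear in the text.
    interface BuildAgentClient {
        boolean ping() throws Exception;
        int getBuildSizeOfAgent() throws Exception;
        void buildProjects(List<String> buildContext) throws Exception;
    }

    // Step 3 and part of step 4: query every agent, keep the least loaded one.
    static Optional<BuildAgentClient> selectAgent(List<? extends BuildAgentClient> agents) {
        BuildAgentClient best = null;
        int bestSize = Integer.MAX_VALUE;
        for (BuildAgentClient agent : agents) {
            try {
                if (!agent.ping()) {
                    continue;
                }
                int size = agent.getBuildSizeOfAgent();
                if (size < bestSize) {
                    best = agent;
                    bestSize = size;
                }
            } catch (Exception e) {
                // An agent that cannot be reached is simply skipped.
            }
        }
        return Optional.ofNullable(best);
    }

    // Rest of step 4: hand the build information to the selected agent.
    // Returns false when no agent is available or the hand-off fails.
    static boolean dispatch(List<? extends BuildAgentClient> agents, List<String> buildContext) {
        Optional<BuildAgentClient> agent = selectAgent(agents);
        if (!agent.isPresent()) {
            return false;
        }
        try {
            agent.get().buildProjects(buildContext);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // In-memory stand-in for a remote agent, used only to exercise the sketch.
    static final class FakeAgent implements BuildAgentClient {
        final String name;
        final boolean reachable;
        final int queued;
        final List<List<String>> received = new ArrayList<>();

        FakeAgent(String name, boolean reachable, int queued) {
            this.name = name;
            this.reachable = reachable;
            this.queued = queued;
        }

        public boolean ping() {
            if (!reachable) {
                throw new IllegalStateException(name + " is unreachable");
            }
            return true;
        }

        public int getBuildSizeOfAgent() {
            return queued;
        }

        public void buildProjects(List<String> buildContext) {
            received.add(buildContext);
        }
    }

    public static void main(String[] args) {
        FakeAgent busy = new FakeAgent("busy", true, 3);
        FakeAgent idle = new FakeAgent("idle", true, 0);
        boolean sent = dispatch(Arrays.asList(busy, idle), Arrays.asList("projectId=1"));
        System.out.println("dispatched=" + sent + ", idle received " + idle.received.size());
    }
}
```

Returning false instead of throwing when no agent is available is a design choice of the sketch; it leaves the decision about retrying or queuing to the caller.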

Setup

  • Install and configure one or more Build Agents.
  • Enable the Distributed Builds option in the General Configuration.
  • Add your Build Agents to the Continuum Master.
  • Add your Build Agents to a Build Agent Group.
  • Add your Build Agent Group to a Build Environment.

    WARNING: A central remote repository is needed to store the artifacts created by a Build Agent so that other Build Agents can use the new artifacts.

Limitations

  • only a system administrator can enable or disable distributed builds
  • credentials (such as SVN credentials) are passed along if specified, but if the server cache is used, it must be set up individually on each Build Agent
  • there is no tracking of SCM changes
  • the Build Agent needs a configuration web interface
  • all projects in a project group will be distributed to the same Build Agent

Future Enhancements

  • Remote builders
    • Builders can be installed on remote machines, and a Continuum manager sends them actions to run. An action can be run on all builders, on some of them, or only on one available builder if we don't want to run more than one build. Actions can be sent with JMS, and builders can apply filters if they don't want to receive all actions. This allows some parallel builds, but the dependency tree must be respected for the build order. To work correctly with dependencies, each builder must use a central local repository. Maybe we can use an internal Archiva.
    • With Continuum builders configured to receive all commands, users can run a multi-platform build for each build definition execution.
    • With Continuum builders configured to receive only some project types, users can use a different builder for each project group. In this case, all projects build quickly because commands are balanced across a few servers.
    • With Continuum builders configured to build when they are available, users can install a few builders on several machines to balance the load. In this case, it will be possible to run some builds in parallel.
    • When a builder finishes its work, it sends a message to the manager to signal the end of the process.
    • With JMS used for communication, we can add listeners to create reports and statistics or to log information.
  • Policy-based distribution
    • Next available
    • Load balanced
    • Targeted environment matching
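
As a rough illustration of what the three policies listed above might mean, the following sketch picks an agent from a list under each policy. Everything here is hypothetical: the Agent fields, the Policy enum, the choose method and the interpretation of each policy (first idle agent, fewest queued builds, matching platform) are assumptions for the example, since the enhancement is not specified in more detail.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class DistributionPolicySketch {

    // Hypothetical description of an agent's current state.
    static final class Agent {
        final String name;
        final String platform;
        final int queuedBuilds;

        Agent(String name, String platform, int queuedBuilds) {
            this.name = name;
            this.platform = platform;
            this.queuedBuilds = queuedBuilds;
        }
    }

    enum Policy { NEXT_AVAILABLE, LOAD_BALANCED, TARGETED_ENVIRONMENT }

    // Orders agents by how many builds they have queued.
    private static final Comparator<Agent> BY_LOAD =
            Comparator.comparingInt((Agent a) -> a.queuedBuilds);

    // requiredPlatform is only consulted for TARGETED_ENVIRONMENT.
    static Optional<Agent> choose(Policy policy, List<Agent> agents, String requiredPlatform) {
        switch (policy) {
            case NEXT_AVAILABLE:
                // First agent in list order that has nothing queued.
                return agents.stream().filter(a -> a.queuedBuilds == 0).findFirst();
            case LOAD_BALANCED:
                // Agent with the fewest queued builds, idle or not.
                return agents.stream().min(BY_LOAD);
            case TARGETED_ENVIRONMENT:
                // Least loaded agent among those on the requested platform.
                return agents.stream()
                        .filter(a -> a.platform.equals(requiredPlatform))
                        .min(BY_LOAD);
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        List<Agent> agents = Arrays.asList(
                new Agent("linux-1", "linux", 2),
                new Agent("windows-1", "windows", 1),
                new Agent("linux-2", "linux", 0));
        for (Policy policy : Policy.values()) {
            Optional<Agent> chosen = choose(policy, agents, "windows");
            System.out.println(policy + " -> " + chosen.map(a -> a.name).orElse("none"));
        }
    }
}
```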