-*- mode: org -*-

This is a prototype of a tool/service that aims to make it easier to perform
many builds, potentially across many machines, and to do something with the
results and outputs of those builds.

The aim of the tool is to make it easier to operate a build farm providing
Guix substitutes, or do testing of the builds for Guix packages, potentially
across different machines with different hardware and software setups.

* Usage instructions

This is a prototype: many features don't work properly yet, if they're
implemented at all.

All the following commands should be run from the root of the Git
repository. Either use direnv to manage the environment, or run guix
environment to set up the environment.

#+BEGIN_SRC sh
  guix environment -l guix-dev.scm
  export PATH="$PWD/scripts:$PATH"
#+END_SRC

If you haven't yet done so, run the following commands to set up the
repository so that the software can be run.

#+BEGIN_SRC sh
  ./bootstrap.sh
  ./configure
  make
#+END_SRC

Run guix-build-coordinator to start the coordinator process. By default, this
will use sqitch to create the guix_build_coordinator.db SQLite database file,
as well as sqitch.db which contains metadata about the database state.

#+BEGIN_SRC sh
  guix-build-coordinator
#+END_SRC

In another terminal, also at the root of the repository, run the following
commands to set up an agent process.

#+BEGIN_SRC sh
  guix-build-coordinator agent new
#+END_SRC

Note the UUID of the generated agent.

#+BEGIN_SRC sh
  guix-build-coordinator agent <AGENT ID> password new
#+END_SRC

Note the generated password for the agent.

#+BEGIN_SRC sh
  guix-build-coordinator-agent --uuid=<AGENT ID> --password=<AGENT PASSWORD>
#+END_SRC

At this point, both processes should be running and the guix-build-coordinator
should be logging requests from the agent.

In a third terminal, also at the root of the repository, generate a
derivation, and then instruct the coordinator to have it built.

#+BEGIN_SRC sh
  guix build --no-grafts -d hello
#+END_SRC

Note the derivation that is output.

#+BEGIN_SRC sh
  guix-build-coordinator build <DERIVATION FILE>
#+END_SRC

This will return a randomly generated UUID that represents the build. If
everything works, the agent will set up and perform the build, and then the
coordinator will output something like:

  build <BUILD ID> succeeded (on agent <AGENT ID>)
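
The last two steps can be combined so that the derivation file name doesn't
need to be copied by hand. The following is a minimal sketch, assuming guix
build -d prints a single derivation file name on standard output, as it does
for a single package like hello. The submit_package function is a
hypothetical helper for illustration, not something provided by this
repository.

#+BEGIN_SRC sh
  # Hypothetical helper: compute the derivation for a package and ask the
  # coordinator to build it.
  submit_package() {
    # -d prints the derivation file name instead of building the package.
    drv="$(guix build --no-grafts -d "$1")" || return 1
    guix-build-coordinator build "$drv"
  }
#+END_SRC

With the function defined in the current shell, running submit_package hello
should have the same effect as the two separate commands above.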

* Architecture

One coordinator process manages one or more agent processes. The coordinator
stores what to build, and allocates builds to agents as they request
them. Agent processes perform the builds, and inform the coordinator when the
build succeeds or fails. When the build succeeds, the agent sends the outputs
produced to the coordinator.

Builds have a specified or randomly generated UUID. The action to perform is
specified by a derivation as understood by GNU Guix. It's expected that the
derivation is either available to the coordinator and all agents, or that
they're able to download it from a substitute server.

Agents will only build the derivation they've been instructed to. It's
expected that any inputs required are either available, or downloadable from a
substitute server. If an input isn't available, the agent will report a setup
failure to the coordinator.

Agents also require that the outputs of the derivation they're going to build
are not already present. They'll attempt to delete any that are, and report a
setup failure to the coordinator if this doesn't work. The build may then be
tried on another agent if one is available.
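
The two setup checks described above can be illustrated in shell. This is
only a sketch of the decision logic, not the agent's actual implementation:
check_setup is a hypothetical function, and it leaves out the steps where a
real agent would try a substitute server for missing inputs and attempt to
delete outputs that already exist.

#+BEGIN_SRC sh
  # Hypothetical sketch of the agent's setup checks; not the real code.
  # $1: space separated input file names, $2: space separated output file
  # names. Store file names don't contain spaces, so word splitting is safe.
  check_setup() {
    for input in $1; do
      # An input that is still missing is reported as a setup failure.
      [ -e "$input" ] || { echo "setup failure: missing input $input"; return 1; }
    done
    for output in $2; do
      # Outputs must not be present before the build starts.
      [ ! -e "$output" ] || { echo "setup failure: output present $output"; return 1; }
    done
    echo "setup ok"
  }
#+END_SRC

The function prints setup ok and returns zero only when every input exists
and no output does; otherwise it prints the first problem found and returns
non-zero.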

Some coordinator behaviour is configurable, but hooks are also provided to
execute code on certain events. This code can access the coordinator
datastore, and perform operations like submitting builds.

There are hooks that trigger when a build is successful, when a build fails,
and when an agent reports missing inputs. The default missing inputs hook will
submit builds for these missing inputs if no such builds are already
present. This default allows derivations to be built automatically when their
inputs are not available, but the hook can be replaced if desired.

Both the datastore for the coordinator and the agent <-> coordinator
communication are designed to support different modes of operation. For the
datastore, SQLite support is implemented and PostgreSQL support is
planned. For the agent <-> coordinator communication, HTTP is currently used,
but other methods like message passing over SSH could be supported in the
future.

When HTTP is used for agent <-> coordinator communication, it should run over
TLS if the network isn't secure. Each agent uses basic authentication to
connect to the coordinator.
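
To show why TLS matters here, the following sketch builds the header that
HTTP basic authentication sends with each request. The credentials are only
base64 encoded, not encrypted, so anyone able to read the traffic can recover
them. The basic_auth_header function is a hypothetical helper, and using the
agent UUID as the user name is an assumption based on the --uuid and
--password options shown earlier.

#+BEGIN_SRC sh
  # Hypothetical helper: print the Authorization header for HTTP basic
  # authentication, which is "Basic " followed by base64("<user>:<password>").
  # Assumption: the agent UUID is used as the user name.
  basic_auth_header() {
    # tr removes the line breaks that base64 may insert into long output.
    printf 'Authorization: Basic %s\n' \
      "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
  }
#+END_SRC

For example, basic_auth_header user pass prints Authorization: Basic
dXNlcjpwYXNz, which decodes straight back to user:pass.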