-*- mode: org -*-

This is a utility for managing a collection of nars (normalized archives, in the context of Guix) along with the corresponding narinfo files which contain some signed metadata.

While tasks like publishing local store items as nars are easy with tools like guix publish, nar-herder is aimed at serving the same collection of nars from multiple machines at once, including moving the nars from machine to machine according to different criteria.

A reverse proxy (like nginx) should be used for the actual serving of the nars, as well as for proxying the relevant requests to nar-herder.

1. Design

This utility was designed to help manage a collection of nars from a substitute server. It can help move the nars between machines, as well as assist in setting up machines to serve the nars (mirrors).

Both of these tasks can be accomplished without any specialised tooling. For example, rsync can be used to move nars between machines, and there are many tools for setting up reverse proxies which function as mirrors.

Even so, I think there are a few reasons why the nar-herder can add some value.

Firstly, storing the narinfo information in a SQLite database can facilitate things like tagging the narinfos and performing garbage-collection-like tasks. Plus, because the narinfos are quite small, I believe storing them in a database is actually more performant and efficient than storing them as files on the filesystem, even with the duplication that comes with the database schema being used. It's also easier to copy all the narinfos between machines when you can download a single database file, rather than copying the files individually.

Secondly, while tools like nginx work great as a reverse proxy for the nar files, proxying the requests for the narinfo files can be problematic, particularly when caching is involved. Tools like guix weather, in particular, make lots of requests for narinfos in quick succession. Having the nar-herder respond to narinfo requests directly from the database can improve performance in these cases.

1.1. Example uses

1.1.1. Mirroring

In this example, foo.example.com is a substitute server using the nar-herder. We want to set up a mirror, mirror.example.com. To do this, nginx will be used as a caching reverse proxy, and the nar-herder will be used to serve the narinfos.

Run the nar-herder like:

nar-herder run-server --mirror=https://foo.example.com

When run for the first time, the nar-herder will download the database from foo.example.com and then apply any recent changes (new or removed nars). Once this has happened, it will periodically check for changes and apply them.

Then, configure nginx to reverse proxy the *.narinfo requests to the nar-herder (by default it listens on port 8080), and the nar* requests to foo.example.com. By adding caching, you can improve the performance for frequently requested files.
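The proxying described above could be set up roughly as follows. This is a minimal sketch, not a tested production configuration: the output path, cache directory, cache zone name, cache size and lifetimes are all assumptions, nar requests are assumed to live under /nar/ (as with guix publish), and TLS for the mirror itself is left out.

```shell
#!/bin/sh
# Sketch: write an nginx configuration for mirror.example.com.
# Assumptions (not taken from nar-herder itself): the output path, the
# cache directory, the zone name "nars", the cache limits, and the /nar/
# URL prefix for nar requests.
set -eu

conf="${NGINX_CONF:-./nar-herder-mirror.conf}"

cat > "$conf" <<'EOF'
# Include this file from within the http { } context.
proxy_cache_path /var/cache/nginx/nars levels=1:2 keys_zone=nars:10m
                 max_size=50g inactive=30d use_temp_path=off;

server {
    listen 80;
    server_name mirror.example.com;

    # narinfo requests are answered by the local nar-herder
    location ~ \.narinfo$ {
        proxy_pass http://127.0.0.1:8080;
    }

    # nar requests are fetched from the upstream server and cached locally
    location /nar/ {
        proxy_pass https://foo.example.com;
        proxy_ssl_server_name on;
        proxy_cache nars;
        proxy_cache_valid 200 30d;
    }
}
EOF

echo "wrote $conf"
```

Only the nar location is cached here, since the narinfo requests are already answered locally by the nar-herder.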

1.1.2. Moving nars between machines

As in the previous example, foo.example.com is a substitute server using the nar-herder. This time though, we only want it to store some of the nars. All of them will be stored on storage.example.com, and foo.example.com will reverse proxy any requests for nars it doesn't have to storage.example.com.

Looking first at the nar-herder configuration for foo.example.com, the important options are the storage limit and storage nar removal criteria. The storage limit is the limit in bytes that the storage directory should not exceed. By setting it to 0, we're saying that the storage directory should be empty. To delete a nar though, the storage nar removal criteria must be met. In this case, it says the nar must be stored on storage.example.com. When looking for nars to delete, the nar-herder on foo.example.com will query storage.example.com to check if the nar-herder there is storing the files.

nar-herder run-server --storage=/var/lib/nars --storage-limit=0 --storage-nar-removal-criteria=stored-on=https://storage.example.com
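To see how far a storage directory currently is from a given limit, something like the following can be used. This is only a sketch for inspecting the directory by hand: the function name is hypothetical, it assumes GNU find, and it simply sums file sizes, which may not match how the nar-herder itself measures usage.

```shell
#!/bin/sh
# Sketch: report how many bytes a storage directory is over a limit.
# Assumption: usage is the sum of the sizes of all regular files in the
# directory; the nar-herder may measure this differently.
set -eu

# bytes_over_limit DIR LIMIT_BYTES (hypothetical helper)
bytes_over_limit() {
    # GNU find prints each file's size in bytes; awk adds them up
    used=$(find "$1" -type f -printf '%s\n' \
               | awk '{ total += $1 } END { printf "%.0f\n", total }')
    if [ "$used" -gt "$2" ]; then
        echo $((used - $2))
    else
        echo 0
    fi
}
```

For example, bytes_over_limit /var/lib/nars 0 prints the total size of the stored files, which is the amount that would have to be removed to satisfy a storage limit of 0.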

On the storage.example.com side, this is similar to the previous mirror example, but because we want the nar-herder to actually download and store the nars from foo.example.com, we set a storage directory. Note that this will currently just keep downloading nars until they've either all been downloaded, or there's no more disk space. As on foo.example.com, you can set a --storage-limit to prevent this.

nar-herder run-server --mirror=https://foo.example.com --storage=/var/lib/nars
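For the reverse proxy on foo.example.com in this example, serving nars from local storage and falling back to storage.example.com for the rest might look roughly like this. It is a sketch under assumptions: the output path is arbitrary, nar requests are assumed to live under /nar/, and the files are assumed to be laid out under /var/lib/nars so that they match the request path. TLS for foo.example.com itself is left out.

```shell
#!/bin/sh
# Sketch: write an nginx configuration for foo.example.com that serves
# nars from local storage and falls back to storage.example.com.
# Assumptions: the output path, the /nar/ URL prefix, and that files in
# /var/lib/nars are laid out to match the request path.
set -eu

conf="${NGINX_CONF:-./nar-herder-foo.conf}"

cat > "$conf" <<'EOF'
server {
    listen 80;
    server_name foo.example.com;

    # narinfo requests are answered by the local nar-herder
    location ~ \.narinfo$ {
        proxy_pass http://127.0.0.1:8080;
    }

    # serve the nar from local storage if present, otherwise fall back
    location /nar/ {
        root /var/lib/nars;
        try_files $uri @storage;
    }

    # nars that have been removed locally are fetched from storage
    location @storage {
        proxy_pass https://storage.example.com;
        proxy_ssl_server_name on;
    }
}
EOF

echo "wrote $conf"
```

With this arrangement, a nar removed from foo.example.com once it is stored on storage.example.com stays reachable through the same URL.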