Commit messages

Since time has already been spent building the outputs, wait longer before giving up on submitting them.

Don't compress everything before sending: compression can be slower than sending, so doing both at the same time is probably faster. Add make-chunked-output-port*, which might be more efficient than the Guile chunked output port; it disables garbage collection to avoid issues with GnuTLS, and tries to force the garbage collector to run if there's garbage building up.

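The compress-while-sending idea can be sketched outside Guile too. This illustrative Python snippet (the project itself is Guile Scheme; the function name is made up) feeds input chunks to a zlib compressor and hands each compressed piece onward as soon as it is ready, so sending overlaps with compression instead of waiting for the whole payload:

```python
import zlib

def compress_chunks(chunks):
    """Yield compressed pieces as input chunks arrive, so sending can
    overlap with compression rather than happening strictly after it."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:                 # the compressor may buffer small inputs
            yield piece
    yield compressor.flush()      # emit whatever is still buffered
```

Each yielded piece could then be written as, say, one HTTP chunk by the sender.
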
Since the GC-breaking GnuTLS problem can probably occur for these requests as well.

I think this works better.

This was put in to try to prevent the crashes inside GnuTLS, but it was ineffective, since the actual trigger for the issues is garbage collection rather than parallel requests.

There might be some benefit from limiting request parallelism in the future, but that can be thought through then.

Guile's garbage collector interferes with Guile + GnuTLS, which makes sending files while the garbage collector is active difficult. These changes try to work around this by disabling the garbage collector just as the data is being written, then enabling it again afterwards. I think this helps to work around the issue.

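The pattern described here, presumably built on Guile's gc-disable and gc-enable, can be sketched in Python (purely for illustration; `send_data` is hypothetical) so that collection is restored even if the write fails:

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Disable garbage collection around a critical section (here, writing
    data through a GC-sensitive library), restoring the previous state
    afterwards even if the body raises."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

# Usage (send_data is a stand-in for the real writer):
#   with gc_paused():
#       send_data(...)
```
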
Prefer upfront compression, as this might reduce GC activity while sending the data.

Just use it when uploading files.

As I think this might make it faster.

This avoids the need to create agents upfront, which could be useful when creating many childhurd VMs or using scheduling tools to dynamically run agents.

I've seen the process hang on the Hurd, and I think this might help.

I'm seeing this pull in sqlite3 unnecessarily on the Hurd.

This will allow adding more agent messaging approaches.

So that the agent spends less time waiting.

As it's shorter, and this keeps the logging neat.

Previously, an agent could end up fetching builds from the coordinator but not receiving the response, say because of a network issue or timeout. When it retried, it would fetch even more builds, leaving some allocated to it that it didn't know about.

These changes attempt to make fetching builds idempotent: rather than returning only the newly allocated builds, the coordinator returns all of the agent's builds, and rather than requesting a number of new builds, the agent specifies the total number of builds it wants allocated.

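The idempotent scheme can be sketched as follows (illustrative Python with made-up names; the real coordinator's API will differ): the agent asks for a target total of allocated builds and always receives its full current allocation, so a lost response costs nothing when the same request is retried.

```python
def fetch_builds(allocations, agent_id, target_total, pending_builds):
    """Allocate builds so that agent_id holds up to target_total builds,
    then return ALL of its allocated builds, not just the new ones.
    Repeating the same request yields the same result (idempotent)."""
    allocated = allocations.setdefault(agent_id, [])
    while len(allocated) < target_total and pending_builds:
        allocated.append(pending_builds.pop(0))
    return list(allocated)
```

If the response is lost and the agent retries with the same target total, no further builds are allocated and the same list comes back.
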
When submitting builds, the agent will now retry the relevant step, like uploading the log file, if the coordinator says that still needs doing.

So that the client part doesn't depend on fibers.

To start making it possible to use the agent without having to load anything related to fibers (as fibers doesn't work on the Hurd yet).

Which was extracted from the Guix Build Coordinator.

Rather than the lzlib module within Guix.

This isn't particularly accurate: what's actually being stored is the time when the record is inserted into the coordinator database. That should happen just before the agent starts the build, though, so hopefully it's good enough.

Also support fetching builds for specific systems from the Guix Data Service.

As there seem to be some failures in this area.

I'm seeing "Resource temporarily unavailable, try again" errors from GnuTLS, mostly around the file uploads I think. I'm not sure what's going on here, but it seems to happen when using multiple threads in parallel.

This commit uses some mutexes to avoid uploading files in parallel, and also improves error handling generally. I'm pretty sure this isn't sufficient to fix the issue, but I could be looking in completely the wrong place for the problem.

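Serialising uploads behind a lock looks roughly like this (illustrative Python; the project uses Guile mutexes, and `send` here stands in for the real upload call):

```python
import threading

upload_lock = threading.Lock()   # one upload at a time across all threads

def upload_file(send, path, data):
    """Perform the upload while holding the lock, so parallel worker
    threads never drive the upload path concurrently; wrap low-level
    errors so the caller can decide whether to retry."""
    with upload_lock:
        try:
            send(path, data)
        except OSError as err:
            raise RuntimeError(f"upload of {path} failed") from err
```

The lock is released on both the success and error paths, so a failed upload never blocks subsequent ones.
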
Otherwise old values persist if an agent has no allocated builds.

Associate this with the coordinator, rather than having the logic in the agent communication code.

I'm looking to listen for client instructions ("build this", ...), maybe on a UNIX socket. This looks to be possible with fibers, but doing it at the same time as using a network socket for agent messaging requires more access than run-server from the fibers web server module currently allows.

To get around this, patch the fibers web server run-server procedure to do less, and do the rest in the guix-build-coordinator instead. This is somewhat similar to what I think Cuirass does to allow it to do more with fibers.

This required messing with the current-fiber parameter in a couple more places around threads; I'm not really sure why that problem has occurred now. This current-fiber parameter issue should be resolved in the next fibers release.

One good thing about these changes is that some behaviours not related to agent communication, like triggering build allocation on startup, have been moved out of the agent communication code.

Only move the file into the destination location when the upload completes.

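Writing to a temporary name and renaming only on success is a common way to make the final path appear atomically. A sketch of the idea in Python (the helper name is made up; `os.replace` is an atomic rename on POSIX):

```python
import os
import tempfile

def receive_upload(stream, destination):
    """Write incoming chunks to a temporary file in the destination's
    directory, then rename it into place only once the upload has
    completed, so a partial upload never appears at `destination`."""
    directory = os.path.dirname(destination) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in stream:
                tmp.write(chunk)
        os.replace(tmp_path, destination)   # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_path)                 # discard the partial file
        raise
```

Creating the temporary file in the same directory matters: `os.replace` is only atomic within one filesystem.
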
I'm not sure why I did this... but it's slower and more complex than just not base64 encoding the data.

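Dropping the base64 step avoids both an encode/decode pass and the size inflation base64 adds: encoded output is 4 bytes for every 3 bytes of input, as a quick check shows.

```python
import base64

payload = b"\x00" * 300             # 300 raw bytes
encoded = base64.b64encode(payload)
assert len(encoded) == 400          # base64 grows data by a third
```
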
Add time logging, increase the buffer size for dump-file, and increase the retry times.

This should reduce the request durations and make retrying slightly easier.

4 might result in contention.