Just instrument the update-managed-metrics! function, and move some code
around so this is clearer in the logs.
As I've seen exceptions here.
I think this will help when handling new requests after failed ones.
The error handling here should be handled by unwinding.
So that the only thing taking place in the upload slot is the actual upload,
which should improve throughput.
This can help if the output has been uploaded but the hash isn't present yet,
since trying to submit the build result will prompt for the output to be sent
again; it doesn't need to be sent, the agent just needs to wait.
This is a little inelegant; maybe there needs to be some way for the agent to
explicitly check that the hash has been computed, but I'm hoping these changes
will help with uploading large outputs.
As I've seen decompression errors.
When submitting an output. This also fixes a regression in not passing
report-bytes-sent on to call-with-streaming-http-request.
I think the case where the agent tries to send 0 bytes to the coordinator can
happen when the last request to the coordinator times out, probably because
computing the hash takes so long.
This means agents reattempting uploads don't have to start from scratch, and
can instead pick up from what's already been uploaded to the coordinator.
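
The resume flow this describes could be sketched as follows; a minimal Python illustration (the project itself is Guile Scheme), where the coordinator-reported byte count and the send_chunk callback are hypothetical stand-ins:

```python
# Sketch of resuming a partial upload: the coordinator reports how many
# bytes it already holds, and the agent streams only the remainder.

def resume_offset(output_size, bytes_on_coordinator):
    # Never seek past the end of the output, even if the coordinator
    # somehow reports more bytes than exist.
    return min(bytes_on_coordinator, output_size)

def upload(output, bytes_on_coordinator, send_chunk, chunk_size=4):
    # Stream the remainder chunk by chunk via the send_chunk callback.
    offset = resume_offset(len(output), bytes_on_coordinator)
    while offset < len(output):
        chunk = output[offset:offset + chunk_size]
        send_chunk(offset, chunk)
        offset += len(chunk)
    return offset

sent = []
upload(b"0123456789", 6, lambda off, chunk: sent.append((off, chunk)))
# Only bytes 6..9 are re-sent: sent == [(6, b"6789")]
```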
This provides some extra safety on top of the guarantees from TCP around the
integrity of the data received.
I'm introducing this now in preparation for supporting resuming partial
uploads. Because this will add some extra complexity around receiving uploads,
this extra check should ensure that issues with the implementation cannot lead
to corrupt uploads.
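
As a rough illustration of the check being described (in Python rather than the project's Guile Scheme; SHA-256 and the helper names are assumptions, since the message doesn't say which hash is used):

```python
import hashlib

def receive_with_check(chunks, expected_sha256):
    # Hash the data as it arrives, then compare with the hash the
    # sender claimed; a mismatch means the upload is corrupt (e.g. a
    # bug in the resume/offset bookkeeping), so reject it rather than
    # storing bad data.
    digest = hashlib.sha256()
    received = bytearray()
    for chunk in chunks:
        digest.update(chunk)
        received.extend(chunk)
    if digest.hexdigest() != expected_sha256:
        raise ValueError("uploaded data does not match expected hash")
    return bytes(received)

data = b"example output"
ok = receive_with_check([data[:7], data[7:]],
                        hashlib.sha256(data).hexdigest())
```

TCP's checksums only protect individual segments in transit; an end-to-end hash also catches mistakes in how the receiver reassembles resumed uploads.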
Trying to avoid the GnuTLS bindings breaking when the garbage collector runs
is quite difficult, and the current approach isn't very effective.
I want to try instead to support resuming partial uploads, as that should help
both with the GnuTLS GC issue and with network interruptions in general.
I think that approach is going to be easier if the files are compressed
up-front, so revert to doing that.
This partially reverts commit 8258e9c8d9f729b2670a602c523c59847b676b1a.
Such that the retry happens with a fresh slot (and the associated tracking
information).
Since time has been spent building them, wait longer before giving up on
submitting the outputs.
Don't compress then send, since I think compression can be slower than
sending, so doing both at the same time is probably faster. Add
make-chunked-output-port*, which might be more efficient than the Guile
chunked output port; it disables garbage collection to avoid issues with
GnuTLS, and tries to force the garbage collector to run if garbage is
building up.
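
For reference, a chunked output port ultimately has to produce HTTP/1.1 chunked transfer-encoding framing; a minimal sketch of that framing (a Python illustration, not the actual make-chunked-output-port* implementation):

```python
def encode_chunk(data):
    # HTTP/1.1 chunked framing: hex length, CRLF, payload, CRLF.
    return format(len(data), "x").encode() + b"\r\n" + data + b"\r\n"

def last_chunk():
    # A zero-length chunk terminates the body.
    return b"0\r\n\r\n"

body = encode_chunk(b"hello") + encode_chunk(b", world") + last_chunk()
# body == b"5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n"
```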
Since the problem of the GC breaking GnuTLS can probably occur for these
requests as well.
I think this works better.
This was put in to try and prevent the crashes inside gnutls, but was
ineffective since the actual trigger for the issues is garbage collection,
rather than parallel requests.
There might be some benefit from limiting request parallelism in the future,
but that can be thought through then.
Guile's garbage collector interferes with Guile+gnutls, which means that
sending files while the garbage collector is active is difficult.
These changes try to work around this by disabling the garbage collector just
as the data is being written, then enabling it again afterwards; I think this
helps.
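
The pattern described, pausing collection only around the write, looks roughly like this (a Python analogy; Python's gc module only controls the cyclic collector, and the real code uses Guile's GC controls, so this is purely illustrative):

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    # Disable automatic garbage collection for the duration of the
    # write, then restore the previous state afterwards.
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

with gc_paused():
    inside = gc.isenabled()   # collection is off while writing
after = gc.isenabled()        # collection is restored afterwards
```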
Prefer upfront compression, as this might reduce GC activity while sending the
data.
Just use it when uploading files.
As I think this might make it faster.
This can happen if the request doesn't arrive in chunks.
I think this can happen if the log doesn't arrive as a chunked HTTP request.
I'm seeing mmap(PROT_NONE) failed crashes, and maybe these metrics will help
in understanding what's going on.
This avoids the need to create agents upfront, which could be useful when
creating many childhurd VMs or using scheduling tools to dynamically run
agents.
I've seen the process hang on the Hurd, and I think this might help.
I'm seeing this pull in sqlite3 unnecessarily on the Hurd.