path: root/guix-build-coordinator/agent-messaging
Commit message (Author, Age)
* Move retry in submit-output (Christopher Baines, 2022-04-09)
|
* Add a timeout for submitting outputs (Christopher Baines, 2022-04-09)
|
* Only use GC protection when gnutls won't internally retry (Christopher Baines, 2022-02-04)
|
* Fix the threshold for metrics delay logging (Christopher Baines, 2022-02-01)
|
* Tweak metrics delay logging (Christopher Baines, 2022-01-20)
|   Just instrument the update-managed-metrics! function, and move some code around so this is clearer in the logs.
* Remove bodies from responses to HEAD requests (Christopher Baines, 2021-12-27)
|
* Fix route for getting the bytes uploaded for an output (Christopher Baines, 2021-12-24)
|
* Improve still more to send log message (Christopher Baines, 2021-12-22)
|
* Improve logging when submitting outputs (Christopher Baines, 2021-12-22)
|
* Check before deleting files (Christopher Baines, 2021-11-26)
|   As I've seen exceptions here.
* Remove redundant if in the controller (Christopher Baines, 2021-11-26)
|
* Don't print backtraces in the controller when chunked inputs end (Christopher Baines, 2021-11-26)
|
* Delete existing files when processing upload requests (Christopher Baines, 2021-11-22)
|   I think this will help when handling new requests after failed ones.
* Unwind on some exceptions (Christopher Baines, 2021-11-22)
|   The error handling here should be handled by unwinding.
* Improve some way numbers are displayed (Christopher Baines, 2021-11-22)
|
* Only check the size of the file once when uploading (Christopher Baines, 2021-11-21)
|
* Fix variable reference in submit-output (Christopher Baines, 2021-11-21)
|
* Compress outputs outside of the upload slot (Christopher Baines, 2021-11-20)
|   So that the only thing taking place in the upload slot is the actual upload, which should improve throughput.
* Track delays for reporting metrics (Christopher Baines, 2021-11-16)
|
* Check if an output has been uploaded before trying to upload it (Christopher Baines, 2021-11-16)
|   This can help if the output has been uploaded but the hash isn't present: trying to submit the build result will prompt for the output to be sent again, when the agent only needs to wait. This is a little inelegant, and maybe there needs to be some way for the agent to explicitly check for the hash to be computed, but I'm hoping these changes will help with uploading large outputs.
* Add error handling around computing output hashes (Christopher Baines, 2021-11-15)
|   As I've seen decompression errors.
* Fix the uri when calling coordinator-handle-failed-request (Christopher Baines, 2021-11-15)
|
* Handle the case where there are no more bytes to send (Christopher Baines, 2021-11-15)
|   When submitting an output. This also fixes a regression in not passing report-bytes-sent on to call-with-streaming-http-request. I think this case where the agent is trying to send 0 bytes to the coordinator can happen when the last request to the coordinator times out, probably due to the computing of the hash taking so long.
* Remove some test code (Christopher Baines, 2021-11-14)
|
* Implement initial support for resuming HTTP uploads (Christopher Baines, 2021-11-14)
|   This means agents reattempting uploads don't have to start from scratch, and can instead pick up from what's already been uploaded to the coordinator.
* Don't error for 404 responses in coordinator-http-request (Christopher Baines, 2021-11-14)
|
* Don't error for responses with no body in coordinator-http-request (Christopher Baines, 2021-11-14)
|
* Start checking the hashes of submitted outputs (Christopher Baines, 2021-11-14)
|   This provides some extra safety on top of the guarantees from TCP around the integrity of the data received. I'm introducing this now in preparation for supporting resuming partial uploads. Because this will add some extra complexity around receiving uploads, this extra check should ensure that issues with the implementation cannot lead to corrupt uploads.
* Return to compressing outputs then sending them (Christopher Baines, 2021-11-07)
|   Trying to avoid the GnuTLS bindings breaking when the garbage collector runs is quite difficult, and the current approach isn't very effective. I want to try instead to support resuming partial uploads, as that should both help with the GnuTLS GC issue, as well as network interruptions in general. I think that approach is going to be easier with compressing the files up-front, so revert to doing that. This partially reverts commit 8258e9c8d9f729b2670a602c523c59847b676b1a.
* Move retrying uploads out of the with-upload-slot region (Christopher Baines, 2021-08-07)
|   Such that the retry happens with a fresh slot (and the associated tracking information).
* Support reporting bytes sent when submitting outputs (Christopher Baines, 2021-06-08)
|
* Retry more when sending outputs (Christopher Baines, 2021-05-30)
|   Since time has been spent building them, wait longer before giving up on submitting the outputs.
* Improve the received output message (Christopher Baines, 2021-05-30)
|
* Further tweak sending chunked HTTP requests (Christopher Baines, 2021-05-29)
|   Don't compress then send, since I think compression can be slower than sending, so doing both at the same time is probably faster. Add make-chunked-output-port*, which might be more efficient than the Guile chunked output port, disables garbage collection to avoid issues with GnuTLS, and tries to force the garbage collector to run if there's garbage building up.
* Add a space in coordinator-handle-failed-request (Christopher Baines, 2021-05-28)
|
* Use GC protection for normal requests to the coordinator as well (Christopher Baines, 2021-05-28)
|   Since the problem of the GC breaking gnutls can probably occur for these requests as well.
* Increase the buffer size for sending outputs and log files (Christopher Baines, 2021-05-28)
|   I think this works better.
* Get rid of the request mutex (Christopher Baines, 2021-05-28)
|   This was put in to try and prevent the crashes inside gnutls, but was ineffective since the actual trigger for the issues is garbage collection, rather than parallel requests. There might be some benefit from limiting request parallelism in the future, but that can be thought through then.
* Tune sending files over HTTP (Christopher Baines, 2021-05-28)
|   Guile's garbage collector interferes with Guile+gnutls, which means that sending files while the garbage collector is active is difficult. These changes try to work around this by disabling the garbage collector just as the data is being written, then enabling it again. I think this helps to work around the issue.
* Reduce the threshold for compressing nars on the fly (Christopher Baines, 2021-05-26)
|   Prefer upfront compression, as this might reduce GC activity while sending the data.
* Remove stale log files (Christopher Baines, 2021-05-26)
|
* Drop the request mutex for most requests (Christopher Baines, 2021-05-21)
|   Just use it when uploading files.
* Use a bigger buffer when uploading logs (Christopher Baines, 2021-05-13)
|   As I think this might make it faster.
* Handle receiving outputs as a bytevector (Christopher Baines, 2021-04-23)
|   This can happen if the request doesn't arrive in chunks.
* Handle receiving logs as bytevectors (Christopher Baines, 2021-04-09)
|   I think this can happen if the log doesn't arrive as a chunked HTTP request.
* Add Guile GC related metrics (Christopher Baines, 2021-03-25)
|   I'm seeing mmap(PROT_NONE) failed crashes, and maybe these metrics will help in understanding what's going on.
* Add a new dynamic authentication approach (Christopher Baines, 2021-02-28)
|   This avoids the need to create agents upfront, which could be useful when creating many childhurd VMs or using scheduling tools to dynamically run agents.
* Avoid some threads and locks when running on the hurd (Christopher Baines, 2021-02-15)
|   I've seen the process hang on the hurd, and I think this might help.
* Remove unused coordinator module from the http agent messaging module (Christopher Baines, 2021-02-13)
|
* Remove (guix-build-coordinator datastore) import from agent module (Christopher Baines, 2021-02-13)
|   I'm seeing this pull in sqlite3 unnecessarily on the hurd.