path: root/guix-build-coordinator/agent-messaging/http
Commit message (Author, Age)
* Don't use a chunked response for the metrics (Christopher Baines, 2024-06-30)
|
* Remove support for chunked requests (Christopher Baines, 2024-06-23)
|   This was a hack to work around reading the entire request/response body into memory, and is no longer needed.
* Use the new process metrics exporter (Christopher Baines, 2024-04-17)
|
* Track the metrics endpoint duration (Christopher Baines, 2023-11-17)
|
* Try to avoid the metrics endpoint timing out (Christopher Baines, 2023-11-10)
|   As this makes it harder to debug issues.
* Exit when the server fails to start (Christopher Baines, 2023-08-29)
|   To avoid the process half working.
* Name more threads (Christopher Baines, 2023-08-09)
|   To help with debugging.
* Try and get backtraces when current output port seems broken (Christopher Baines, 2023-06-04)
|   With the "conversion to port encoding failed" (#62590) error, I'm seeing the "error: when processing" logs, but the backtrace doesn't get logged, maybe because it's going to the current output port, which might be broken? Anyway, try sending the backtrace to the current error port, in the hope that this port is still working.
* Better handle the output hashing being completed for upload requests (Christopher Baines, 2023-05-23)
|   This currently causes an error on the server side and a timeout on the client side.
* Better handle upload requests with no content (Christopher Baines, 2023-05-22)
|   The idea with these is to allow the agent to resume waiting for the coordinator to finish computing the output hash.
* Guard against the stat call failing (Christopher Baines, 2023-05-17)
|   If the file doesn't exist.
* Add extra log line when hashing outputs (Christopher Baines, 2023-05-14)
|
* Not being able to log reliably is frustrating :( (Christopher Baines, 2023-05-14)
|   I'm seeing things like this, which I'm guessing relate to logging failing:
|
|     2023-05-14 13:40:44 (ERROR): exception in output hash thread: #<&compound-exception components: (#<&error> #<&origin origin: "put-char"> #<&message message: "conversion to port encoding failed"> #<&irritants irritants: 84> #<&exception-with-kind-and-args kind: encoding-error args: ("put-char" "conversion to port encoding failed" 84 #<output: file 1> #\2)>)>
* Guard against logging failures in the output hash thread (Christopher Baines, 2023-05-14)
|
* Set the name of the hash management thread (Christopher Baines, 2023-05-14)
|   For debugging purposes.
* Move some logging (Christopher Baines, 2023-05-14)
|   So if it fails, it doesn't leave things in an inconsistent state.
* Add more logging around computing hashes (Christopher Baines, 2023-05-14)
|
* Fix missing logger argument (Christopher Baines, 2023-05-14)
|
* Try to make starting hashing outputs more reliable (Christopher Baines, 2023-05-12)
|   Even if the connection to the agent has dropped when the upload has completed.
* Don't look at the content-length header for chunked transfers (Christopher Baines, 2023-05-12)
|   Since this isn't supposed to be set.
* Add more exception handling in the hash computing thread (Christopher Baines, 2023-05-11)
|   As I've seen log-msg here raise exceptions.
* Have the coordinator report on the outputs that are being hashed (Christopher Baines, 2023-05-11)
|   As this is useful to observe since it can take a long time for large outputs.
* Don't log so much when the database is busy (Christopher Baines, 2023-05-10)
|
* Move output hash related operations into the dedicated thread (Christopher Baines, 2023-05-10)
|   So in case of connection loss, this still happens and the work to compute the hash isn't wasted.
* Move computing output hashes to dedicated threads (Christopher Baines, 2023-05-10)
|   This should help the coordinator and agents ensure hashes are computed and the agent finds out when this has happened, even in a situation where the coordinator is restarted/crashes and the connection between the agents and coordinator is lost.
* More port forcing (Christopher Baines, 2023-05-09)
|   I think NGINX might time out reading the response headers for streaming requests, since the headers probably don't fill the buffer. So force output at that point. Also, address the issue with forcing output to the chunked port.
* Don't specify transfer encoding header in responses (Christopher Baines, 2023-05-08)
|   As the fibers web server takes care of this. It's currently adding the header twice, so this should be fixed.
* Specifically handle empty partial uploads (Christopher Baines, 2023-05-08)
|   This will happen when the upload process is interrupted after the data has been received, but before the hash has been computed, perhaps by the coordinator restarting.
* Make logging conditional on the request content length (Christopher Baines, 2023-05-08)
|
* Stop using chunked transfers for file uploads (Christopher Baines, 2023-05-08)
|   As the amount of data to upload is known, this is unnecessary complexity and overhead.
* Provide progress reporting for computing the hashes of outputs (Christopher Baines, 2023-05-08)
|   This happens on the server and can be very slow for large outputs, exceeding the time that NGINX will keep waiting for a response. To address this, stream progress information back to the client; this should keep the connection from timing out.
* Expose the number of threads as a metric (Christopher Baines, 2023-05-08)
|   As I think this might want reducing at some point.
* Include system uptime in the agent status information (Christopher Baines, 2023-05-05)
|   As I've found this useful in spotting systems which have problems.
* Try to instrument ports/file descriptors (Christopher Baines, 2023-04-27)
|   I'm seeing "too many open file" errors, and while I'm not sure if port-for-each is directly connected, it sounds like it might be worth instrumenting.
* Revert "Remove read-request-body workaround" (Christopher Baines, 2023-04-25)
|   I was mistaken, Guile doesn't handle chunked request bodies. This reverts commit 07e42953f257b846d44376d998cc7d654214ca17.
* Remove read-request-body workaround (Christopher Baines, 2023-04-25)
|   The Guile chunked input port now raises an exception when the input is incomplete.
* Deallocate canceled builds from agents when they startup (Christopher Baines, 2023-04-21)
|
* Include the submit_outputs information in the agent status response (Christopher Baines, 2023-04-21)
|   This means that agents will know whether to submit the outputs of builds, even if they're restarted.
* Stop using the guix-memory-metrics-updater (Christopher Baines, 2023-03-27)
|   The data doesn't look particularly useful, and I think the memory problem I was chasing was down to a broken hook (and poor handling of that).
* Instrument the size of some Guix managed hash tables (Christopher Baines, 2023-03-27)
|   In case any of these are a factor in the occasional high memory use.
* Add processor count to the agent status (Christopher Baines, 2023-03-24)
|   This is useful when interpreting the load information.
* Fix status load average handling (Christopher Baines, 2023-03-22)
|   As the keys in JSON are strings.
* Implement and extend the agent status functionality (Christopher Baines, 2023-03-22)
|   Previously, updating the status was used by the agent just to get back the list of builds it was already allocated. Now the status sent is actually stored, along with the 1min load average.
* Stop using procedures for responses where unnecessary (Christopher Baines, 2023-03-21)
|   In newer versions of Guile Fibers, this would mean that chunked transfer encoding is used, which is unnecessary for small quick responses.
* Clean up some logging (Christopher Baines, 2022-10-22)
|
* Log when output files are renamed (Christopher Baines, 2022-10-22)
|
* Add more logging when output files are deleted (Christopher Baines, 2022-10-22)
|
* Copy upload error handling to the partial route (Christopher Baines, 2022-10-21)
|
* Include the file size and md5 hash in error messages (Christopher Baines, 2022-10-21)
|   When an error occurs while trying to compute the hash, as I hope this information will help to identify where things have gone wrong.
* Don't crash on chunked input exceptions (Christopher Baines, 2022-10-08)
|