Commit message (Author, Age)
...
* Not being able to log reliably is frustrating :( (Christopher Baines, 2023-05-14)
  I'm seeing things like this, which I'm guessing relate to logging failing:
  2023-05-14 13:40:44 (ERROR): exception in output hash thread: #<&compound-exception components: (#<&error> #<&origin origin: "put-char"> #<&message message: "conversion to port encoding failed"> #<&irritants irritants: 84> #<&exception-with-kind-and-args kind: encoding-error args: ("put-char" "conversion to port encoding failed" 84 #<output: file 1> #\2)>)>
* Guard against logging failures in the output hash thread (Christopher Baines, 2023-05-14)
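A guard of the kind named in the commit above might look roughly like this. It is an illustrative Python sketch, not the coordinator's Guile code; `safe_log` and `ascii_only_log` are hypothetical names. The idea is that logging is best effort, so a failure while writing a message (such as an encoding error on the output port) is reported to the caller rather than aborting the thread.

```python
def safe_log(log_fn, message):
    """Call log_fn(message); report failure instead of propagating it."""
    try:
        log_fn(message)
        return True
    except Exception:
        # Logging is best effort: a failure here (for example an encoding
        # error on the output port) must not abort the work being logged.
        return False


def ascii_only_log(message):
    """Hypothetical logger that fails on non-ASCII text."""
    message.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
```

A caller would check the return value only if it cares whether the message was written; the surrounding work continues either way.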
* Set the name of the hash management thread (Christopher Baines, 2023-05-14)
  For debugging purposes.
* Move some logging (Christopher Baines, 2023-05-14)
  So if it fails, it doesn't leave things in an inconsistent state.
* Add more logging around computing hashes (Christopher Baines, 2023-05-14)
* Fix missing logger argument (Christopher Baines, 2023-05-14)
* Try to make starting hashing outputs more reliable (Christopher Baines, 2023-05-12)
  Even if the connection to the agent has dropped when the upload has completed.
* Don't look at the content-length header for chunked transfers (Christopher Baines, 2023-05-12)
  Since this isn't supposed to be set.
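The rule behind the commit above can be sketched as follows. This is an illustrative Python sketch under the assumption that headers arrive as a plain name-to-value mapping; `declared_body_length` is a hypothetical helper, not a function from the coordinator. When a message uses chunked transfer encoding, its length comes from the chunk framing, so any Content-Length value is ignored.

```python
def declared_body_length(headers):
    """Return the declared body length in bytes, or None if not known up front."""
    normalized = {name.lower(): value for name, value in headers.items()}
    codings = [
        coding.strip().lower()
        for coding in normalized.get("transfer-encoding", "").split(",")
    ]
    if "chunked" in codings:
        # Chunked bodies carry their own framing; ignore any Content-Length.
        return None
    length = normalized.get("content-length")
    return int(length) if length is not None else None
```

Returning `None` for chunked requests lets callers (for instance, progress reporting) treat the total size as unknown instead of trusting a header that should not be present.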
* Fix retry-times not always being set (Christopher Baines, 2023-05-11)
* Clean up some handling of uploads for agents (Christopher Baines, 2023-05-11)
  This commit should correct the progress reporting on partial uploads.
* Add more exception handling in the hash computing thread (Christopher Baines, 2023-05-11)
  As I've seen log-msg here raise exceptions.
* Have the coordinator report on the outputs that are being hashed (Christopher Baines, 2023-05-11)
  This is useful to observe, since hashing can take a long time for large outputs.
* Have agents report on the progress of the coordinator hashing outputs (Christopher Baines, 2023-05-11)
  Otherwise it looks like the upload should finish, but hasn't.
* Add #:streaming? to call-with-streaming-http-request (Christopher Baines, 2023-05-11)
* Remove some leftover debugging (Christopher Baines, 2023-05-11)
* Add missing argument to port timeout errors (Christopher Baines, 2023-05-11)
* Don't log so much when the database is busy (Christopher Baines, 2023-05-10)
* Tweak retrying for status update requests (Christopher Baines, 2023-05-10)
  Don't retry status updates many times, since the information will be more out of date each time.
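A bounded retry of the kind described above can be sketched like this. It is an illustrative Python sketch, not the project's Guile implementation; `call_with_retries` and its parameters are hypothetical. The cap on attempts is the point: for a status update, a small `times` value means a stale update is dropped rather than retried indefinitely.

```python
import time


def call_with_retries(fn, times, delay=0.0):
    """Call fn(), making at most `times` attempts before giving up."""
    if times < 1:
        raise ValueError("times must be at least 1")
    for attempt in range(1, times + 1):
        try:
            return fn()
        except Exception:
            if attempt == times:
                # Out of attempts: let the last error reach the caller.
                raise
            time.sleep(delay)
```

A caller sending time-sensitive data would pass a low `times`, while a request whose payload stays valid (such as an upload) could reasonably use a higher one.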
* Move output hash related operations into the dedicated thread (Christopher Baines, 2023-05-10)
  So in case of connection loss, this still happens and the work to compute the hash isn't wasted.
* Move computing output hashes to dedicated threads (Christopher Baines, 2023-05-10)
  This should help the coordinator and agents ensure hashes are computed and the agent finds out when this has happened, even in a situation where the coordinator is restarted/crashes and the connection between the agents and coordinator is lost.
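The general shape of moving work to a dedicated thread can be sketched as below. This is an illustrative Python sketch, not the coordinator's Guile code; `start_hash_worker`, the queue of jobs, and the shared `results` mapping are all assumptions made for the example. Because the worker, not the request handler, owns the computation, a result is still recorded if the requester's connection goes away in the meantime.

```python
import queue
import threading


def start_hash_worker(jobs, results, compute):
    """Start a named thread that computes results for jobs taken from a queue."""

    def run():
        while True:
            job = jobs.get()
            if job is None:
                # A None job is the (hypothetical) shutdown signal.
                break
            # Store the result independently of whoever requested it.
            results[job] = compute(job)

    thread = threading.Thread(target=run, name="hash-worker", daemon=True)
    thread.start()
    return thread
```

Giving the thread a name, as several commits in this log do, makes it easier to identify in debugging output.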
* Correct some indentation (Christopher Baines, 2023-05-10)
* Separate read and write timeout exceptions (Christopher Baines, 2023-05-09)
  So it's clearer what has occurred.
* More port forcing (Christopher Baines, 2023-05-09)
  I think nginx might time out reading the response headers for streaming requests, since the headers probably don't fill the buffer. So force output at that point. Also, address the issue with forcing output to the chunked port.
* Remove make-chunked-output-port* as it's now unused (Christopher Baines, 2023-05-08)
* Don't specify transfer encoding header in responses (Christopher Baines, 2023-05-08)
  As the fibers web server takes care of this. It's currently adding the header twice, so this should be fixed.
* Force sending the request in call-with-streaming-http-request (Christopher Baines, 2023-05-08)
* Fix datastore-list-agent-builds canceled (Christopher Baines, 2023-05-08)
* Change submit-output to not spend so much time waiting (Christopher Baines, 2023-05-08)
  Make use of the coordinator trying to avoid the connection timing out. This should improve things when the coordinator is restarted or crashes.
* Specifically handle empty partial uploads (Christopher Baines, 2023-05-08)
  This will happen when the upload process is interrupted after the data has been received, but before the hash has been computed, perhaps by the coordinator restarting.
* Don't error when a build cannot be found (Christopher Baines, 2023-05-08)
* Make logging conditional on the request content length (Christopher Baines, 2023-05-08)
* Stop using chunked transfers for file uploads (Christopher Baines, 2023-05-08)
  As the amount of data to upload is known, this is unnecessary complexity and overhead.
* Provide progress reporting for computing the hashes of outputs (Christopher Baines, 2023-05-08)
  This happens on the server and can be very slow for large outputs, exceeding the time that nginx will keep waiting for a response. To address this, stream progress information back to the client, which should keep the connection from timing out.
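The approach described above, emitting progress while a slow hash is computed, can be sketched like this. It is an illustrative Python sketch rather than the coordinator's Guile code; `hash_with_progress`, the event dictionaries, and the choice of SHA-256 are assumptions made for the example. Each yielded event is something a server could write to the response stream, so data keeps flowing while the hash is still being computed.

```python
import hashlib
import io


def hash_with_progress(stream, total_bytes, chunk_size=65536):
    """Hash a stream in chunks, yielding progress events and then the result."""
    digest = hashlib.sha256()  # assumed algorithm, for illustration only
    done = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        done += len(chunk)
        # Each event gives the server something to send to the client.
        yield {"bytes_hashed": done, "total_bytes": total_bytes}
    yield {"result": digest.hexdigest()}
```

A client reading these events can both display progress and tell the difference between a slow hash and a stalled connection.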
* Expose the number of threads as a metric (Christopher Baines, 2023-05-08)
  As I think this might want reducing at some point.
* Include system uptime in datastore-find-agent-status (Christopher Baines, 2023-05-06)
* Add some comments about streaming http requests (Christopher Baines, 2023-05-05)
* Include system uptime in the agent status information (Christopher Baines, 2023-05-05)
  As I've found this useful in spotting systems which have problems.
* Add utility to get system uptime (Christopher Baines, 2023-05-05)
* Enable submitting regular status updates for the Hurd (Christopher Baines, 2023-05-05)
* Log which inputs are missing (Christopher Baines, 2023-05-05)
  As this can be useful.
* Break the allocator considered builds metric down by system (Christopher Baines, 2023-05-05)
  As this provides some extra useful information.
* Use a hash table for build systems in the derivation ordered allocator (Christopher Baines, 2023-05-05)
  Like the basic allocator, as this will probably be faster.
* Remove now redundant logging around gc protection (Christopher Baines, 2023-05-03)
* Fix using with-upload-monitoring in submit-one-output (Christopher Baines, 2023-05-03)
* Guard against errors in the initializer and destructor (Christopher Baines, 2023-05-02)
  In the worker threads.
* Name more of the worker thread channels (Christopher Baines, 2023-05-02)
* Remove the gbc prefix from the thread names (Christopher Baines, 2023-05-02)
  As this shouldn't be needed to help identify them.
* Name the thread pool threads for submitting builds (Christopher Baines, 2023-05-02)
* Name the sqlite worker threads (Christopher Baines, 2023-05-02)
  So it's easier to debug issues.
* Support naming the worker thread channel threads (Christopher Baines, 2023-05-02)