Commit message | Author | Age | ||
---|---|---|---|---|
... | ||||
* | Not being able to log reliably is frustrating :( | Christopher Baines | 2023-05-14 | |
| | | | | | | I'm seeing things like this, which I'm guessing relate to logging failing: 2023-05-14 13:40:44 (ERROR): exception in output hash thread: #<&compound-exception components: (#<&error> #<&origin origin: "put-char"> #<&message message: "conversion to port encoding failed"> #<&irritants irritants: 84> #<&exception-with-kind-and-args kind: encoding-error args: ("put-char" "conversion to port encoding failed" 84 #<output: file 1> #\2)>)> | |||
* | Guard against logging failures in the output hash thread | Christopher Baines | 2023-05-14 | |
| | ||||
* | Set the name of the hash management thread | Christopher Baines | 2023-05-14 | |
| | | | | For debugging purposes. | |||
* | Move some logging | Christopher Baines | 2023-05-14 | |
| | | | | So if it fails, it doesn't leave things in an inconsistent state. | |||
* | Add more logging around computing hashes | Christopher Baines | 2023-05-14 | |
| | ||||
* | Fix missing logger argument | Christopher Baines | 2023-05-14 | |
| | ||||
* | Try to make starting hashing outputs more reliable | Christopher Baines | 2023-05-12 | |
| | | | | Even if the connection to the agent has dropped when the upload has completed. | |||
* | Don't look at the content-length header for chunked transfers | Christopher Baines | 2023-05-12 | |
| | | | | Since this isn't supposed to be set. | |||
* | Fix retry-times not always being set | Christopher Baines | 2023-05-11 | |
| | ||||
* | Clean up some handling of uploads for agents | Christopher Baines | 2023-05-11 | |
| | | | | This commit should correct the progress reporting on partial uploads. | |||
* | Add more exception handling in the hash computing thread | Christopher Baines | 2023-05-11 | |
| | | | | As I've seen log-msg here raise exceptions. | |||
* | Have the coordinator report on the outputs that are being hashed | Christopher Baines | 2023-05-11 | |
| | | | | As this is useful to observe since it can take a long time for large outputs. | |||
* | Have agents report on the progress of the coordinator hashing outputs | Christopher Baines | 2023-05-11 | |
| | | | | Otherwise it looks like the upload should finish, but hasn't. | |||
* | Add #:streaming? to call-with-streaming-http-request | Christopher Baines | 2023-05-11 | |
| | ||||
* | Remove some left over debugging | Christopher Baines | 2023-05-11 | |
| | ||||
* | Add missing argument to port timeout errors | Christopher Baines | 2023-05-11 | |
| | ||||
* | Don't log so much when the database is busy | Christopher Baines | 2023-05-10 | |
| | ||||
* | Tweak retrying for status update requests | Christopher Baines | 2023-05-10 | |
| | | | | | Don't retry status updates many times, since the information will be more out of date each time. | |||
* | Move output hash related operations into the dedicated thread | Christopher Baines | 2023-05-10 | |
| | | | | | So in case of connection loss, this still happens and the work to compute the hash isn't wasted. | |||
* | Move computing output hashes to dedicated threads | Christopher Baines | 2023-05-10 | |
| | | | | | | | This should help the coordinator and agents ensure hashes are computed and that the agent finds out when this has happened, even in a situation where the coordinator is restarted/crashes and the connection between the agents and coordinator is lost. | |||
* | Correct some indentation | Christopher Baines | 2023-05-10 | |
| | ||||
* | Separate read and write timeout exceptions | Christopher Baines | 2023-05-09 | |
| | | | | So it's clearer what has occurred. | |||
* | More port forcing | Christopher Baines | 2023-05-09 | |
| | | | | | | | | I think NGinx might time out reading the response headers for streaming requests, since the headers probably don't fill the buffer. So force output at that point. Also, address the issue with forcing output to the chunked port. | |||
* | Remove make-chunked-output-port* as it's now unused | Christopher Baines | 2023-05-08 | |
| | ||||
* | Don't specify transfer encoding header in responses | Christopher Baines | 2023-05-08 | |
| | | | | | | As the fibers web server takes care of this. The header is currently being added twice, which this change should fix. | |||
* | Force sending the request in call-with-streaming-http-request | Christopher Baines | 2023-05-08 | |
| | ||||
* | Fix datastore-list-agent-builds canceled | Christopher Baines | 2023-05-08 | |
| | ||||
* | Change submit-output to not spend so much time waiting | Christopher Baines | 2023-05-08 | |
| | | | | | Make use of the coordinator trying to avoid the connection timing out. This should improve things when the coordinator is restarted or crashes. | |||
* | Specifically handle empty partial uploads | Christopher Baines | 2023-05-08 | |
| | | | | | | This will happen when the upload process is interrupted after the data has been received, but before the hash has been computed, perhaps by the coordinator restarting. | |||
* | Don't error when a build cannot be found | Christopher Baines | 2023-05-08 | |
| | ||||
* | Make logging conditional on the request content length | Christopher Baines | 2023-05-08 | |
| | ||||
* | Stop using chunked transfers for file uploads | Christopher Baines | 2023-05-08 | |
| | | | | | As the amount of data to upload is known, this is unnecessary complexity and overhead. | |||
* | Provide progress reporting for computing the hashes of outputs | Christopher Baines | 2023-05-08 | |
| | | | | | | | | This happens on the server and can be very slow for large outputs, exceeding the time that NGinx will keep waiting for a response. To address this, stream progress information back to the client, which should keep the connection from timing out. | |||
* | Expose the number of threads as a metric | Christopher Baines | 2023-05-08 | |
| | | | | As I think this might want reducing at some point. | |||
* | Include system uptime in datastore-find-agent-status | Christopher Baines | 2023-05-06 | |
| | ||||
* | Add some comments about streaming http requests | Christopher Baines | 2023-05-05 | |
| | ||||
* | Include system uptime in the agent status information | Christopher Baines | 2023-05-05 | |
| | | | | As I've found this useful in spotting systems which have problems. | |||
* | Add utility to get system uptime | Christopher Baines | 2023-05-05 | |
| | ||||
* | Enable submitting regular status updates for the hurd | Christopher Baines | 2023-05-05 | |
| | ||||
* | Log which inputs are missing | Christopher Baines | 2023-05-05 | |
| | | | | As this can be useful. | |||
* | Break the allocator considered builds metric down by system | Christopher Baines | 2023-05-05 | |
| | | | | As this provides some extra useful information. | |||
* | Use a hash table for build systems in the derivation ordered allocator | Christopher Baines | 2023-05-05 | |
| | | | | Like the basic allocator, as this will probably be faster. | |||
* | Remove now redundant logging around gc protection | Christopher Baines | 2023-05-03 | |
| | ||||
* | Fix using with-upload-monitoring in submit-one-output | Christopher Baines | 2023-05-03 | |
| | ||||
* | Guard against errors in the initializer and destructor | Christopher Baines | 2023-05-02 | |
| | | | | In the worker threads. | |||
* | Name more of the worker thread channels | Christopher Baines | 2023-05-02 | |
| | ||||
* | Remove the gbc prefix from the thread names | Christopher Baines | 2023-05-02 | |
| | | | | As this shouldn't be needed to help identify them. | |||
* | Name the thread pool threads for submitting builds | Christopher Baines | 2023-05-02 | |
| | ||||
* | Name the sqlite worker threads | Christopher Baines | 2023-05-02 | |
| | | | | So it's easier to debug issues. | |||
* | Support naming the worker thread channel threads | Christopher Baines | 2023-05-02 | |
| |