Commit message (Author, Date)
* Prioritise post build actions [post-build-prioritisation] (Christopher Baines, 2023-04-11)
  By the priority of the build, and then by the bytes that need uploading. This should help ensure that priority builds get handled first when there's congestion getting data back to the coordinator. Prioritising builds with less data to upload should also keep things moving when uploads are slow.
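The ordering described in this commit can be illustrated with a two-part sort key. This is a minimal sketch, not the coordinator's code: the `PostBuildAction` type and its field names are hypothetical, and it assumes that a larger priority number means a more urgent build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostBuildAction:
    # Hypothetical record for illustration; not a type from the project.
    build_id: str
    priority: int         # assumption: larger means more urgent
    bytes_to_upload: int


def upload_order(actions):
    """Order by build priority (highest first), then by least data to upload."""
    return sorted(actions, key=lambda a: (-a.priority, a.bytes_to_upload))


pending = [
    PostBuildAction("a", priority=0, bytes_to_upload=500),
    PostBuildAction("b", priority=10, bytes_to_upload=900),
    PostBuildAction("c", priority=10, bytes_to_upload=100),
]
ordered_ids = [a.build_id for a in upload_order(pending)]
```

With this key, the two priority 10 builds come before the priority 0 build, and among them the smaller upload goes first.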
* Add priority support to create-work-queue (Christopher Baines, 2023-04-11)
  This isn't ideal as the process-job interface changes when you enable prioritisation, but that's not a big issue. This should enable prioritising post build operations.
* Add missing join (Christopher Baines, 2023-04-11)
* Use underscores for derivation_name (Christopher Baines, 2023-04-11)
  As this is more consistent in the JSON responses.
  Signed-off-by: Christopher Baines <mail@cbaines.net>
* Expose the derived priority to agents (Christopher Baines, 2023-04-11)
  Rather than the priority, as it's the derived priority that they should be using for decision making.
* Remove datastore-select-allocated-builds (Christopher Baines, 2023-04-11)
  As it's a less well named copy of datastore-list-agent-builds.
* Drop the delay for retrying uploads on failure (Christopher Baines, 2023-04-11)
* Remove the crude alarm based timeout for submitting outputs (Christopher Baines, 2023-04-11)
  This should be unnecessary now that there's progress on getting the I/O operations to time out.
* Reduce logging on build failures (Christopher Baines, 2023-04-11)
* Include build priority when selecting allocated builds (Christopher Baines, 2023-04-11)
* Strip down the guix-dev.scm file (Christopher Baines, 2023-04-10)
  Assume that a recent version of guix will be used.
* Include the build priority when agents fetch builds (Christopher Baines, 2023-04-10)
  This means the agent can use it to prioritise various things.
* Change allocate-builds to update-build-allocation-plan (Christopher Baines, 2023-04-10)
  As this is a better name.
* Use a timeout when substituting derivations in the publish hook (Christopher Baines, 2023-04-10)
  As this can block if the store GC is running.
* Improve event/state id support for events (Christopher Baines, 2023-04-03)
  Support the Last-Event-ID header in the events endpoint, and include the event IDs in the responses.
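As a rough illustration of what Last-Event-ID support involves, the sketch below formats events in the server-sent events wire format and resumes a stream after the id the client reports. It is not the coordinator's implementation: the function names and the in-memory event list are hypothetical, and it assumes that event ids are ascending integers.

```python
import json


def format_event(event_id, name, payload):
    """Serialise one event in the server-sent events text format."""
    return f"id: {event_id}\nevent: {name}\ndata: {json.dumps(payload)}\n\n"


def events_to_send(events, last_event_id=None):
    """Pick the events a (re)connecting client still needs.

    `events` is a list of (id, name, payload) tuples in ascending id order.
    `last_event_id` is the raw Last-Event-ID header value, or None.
    Assumption: ids are integers; an unparsable header replays everything.
    """
    if last_event_id is None:
        return list(events)
    try:
        seen = int(last_event_id)
    except ValueError:
        return list(events)
    return [event for event in events if event[0] > seen]


history = [
    (1, "build-submitted", {"build": "x"}),
    (2, "agent-status-update", {"agent": "y"}),
    (3, "allocation-plan-update", {"size": 4}),
]
resumed = events_to_send(history, last_event_id="2")
wire = format_event(*resumed[0])
```

Including the id in each response is what lets a client send it back on reconnect and avoid missing or repeating events.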
* Try to improve hook exception handling (Christopher Baines, 2023-04-02)
  This should lead to more concise backtraces at least, although it may reintroduce the problem where backtraces lead to excessive memory usage.
* Don't call (backtrace) in the build allocator (Christopher Baines, 2023-04-01)
  It seems to cause the same memory issues as calling (backtrace) with the hooks.
* Give up printing backtraces for exceptions in hooks (Christopher Baines, 2023-03-29)
  I think it's causing problems that I'm struggling to reproduce and debug.
* Try and ensure that the non-fibers sleep is used in places (Christopher Baines, 2023-03-29)
  When not using fibers. I don't know if a different sleep is being used, and I don't think I've read anything about having to avoid this, but I'm running out of ideas.
* Provide more information in process-event error handling (Christopher Baines, 2023-03-29)
  There are still problems here, but it's unclear where.
* Remove backtrace printing from create-thread-pool (Christopher Baines, 2023-03-29)
  Just in case this is causing a problem with the exception handling within proc.
* Always keep one thread running to process hooks (Christopher Baines, 2023-03-29)
  This should reduce the need to keep stopping and starting threads.
* Guard against exceptions in the thread pool monitor thread (Christopher Baines, 2023-03-29)
  As I've seen the dreaded encoding-error here now that there's some logging.
* Switch the timeout approach for guix-data-service requests (Christopher Baines, 2023-03-29)
  In case this helps with avoiding hooks hanging.
* Decrease the planned builds for agents (Christopher Baines, 2023-03-29)
  Since the allocator is fast now.
* Add more logging around parallel hooks (Christopher Baines, 2023-03-28)
  I still can't reproduce any problems locally, so this might help work out what the state of the different threads is.
* Change how parallel hook processing works (Christopher Baines, 2023-03-28)
  The previous approach was inefficient, since there was a thread that just repeatedly tried to queue every unprocessed hook event for processing. Instead, use a thread pool that pulls events from the database. This still involves some work to not process the same event in different threads, but it should hopefully scale better.
* Add create-thread-pool (Christopher Baines, 2023-03-28)
  This is like create-work-queue, but pulls the jobs, rather than the jobs being pushed to it. This should work more efficiently for the hooks, where there are often lots of events to process.
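The difference between a pushed work queue and a pool that pulls its own jobs can be sketched as below. This is an illustration in Python rather than the project's Guile code: `run_pull_pool` and `PendingEvents` are hypothetical names, the in-memory list stands in for the database, and unlike the real pool these workers simply exit once nothing is left to claim.

```python
import threading


class PendingEvents:
    """Stand-in for the database of unprocessed hook events (hypothetical)."""

    def __init__(self, event_ids):
        self._lock = threading.Lock()
        self._pending = list(event_ids)

    def claim_next(self):
        # Claiming under a lock ensures no two threads take the same event.
        with self._lock:
            return self._pending.pop(0) if self._pending else None


def run_pull_pool(fetch_next_job, process_job, thread_count=4):
    """Start workers that each pull jobs until fetch_next_job returns None."""

    def worker():
        while True:
            job = fetch_next_job()
            if job is None:
                return
            process_job(job)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


events = PendingEvents(range(20))
processed = []
run_pull_pool(events.claim_next, processed.append)
```

Because each worker asks for work only when it is free, no separate thread has to keep re-queueing every unprocessed event.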
* Remove %random-state (Christopher Baines, 2023-03-28)
  As it's unused.
* Move waiting after hook errors into process-event (Christopher Baines, 2023-03-27)
  So that this happens for parallel hooks as well.
* Stop using the guix-memory-metrics-updater (Christopher Baines, 2023-03-27)
  The data doesn't look particularly useful, and I think the memory problem I was chasing was down to a broken hook (and poor handling of that).
* Handle the tags being a vector in submit-build (Christopher Baines, 2023-03-27)
  As the retry hook uses this functionality.
* Tweak how the exception handling and logging works for hooks (Christopher Baines, 2023-03-27)
  Since the errors don't seem to be getting logged properly; the backtrace output stops part way.
* Instrument the size of some Guix managed hash tables (Christopher Baines, 2023-03-27)
  In case any of these are a factor in the occasional high memory use.
* Fix sending tags as part of the build-submitted event (Christopher Baines, 2023-03-26)
* Include build tags in the agent-builds-allocated event (Christopher Baines, 2023-03-26)
* Send an event when builds are allocated to an agent (Christopher Baines, 2023-03-25)
* Send event on allocation plan update (Christopher Baines, 2023-03-25)
* Send event on agent status update (Christopher Baines, 2023-03-25)
* Add timestamp to the events (Christopher Baines, 2023-03-25)
* Add processor count to the agent status (Christopher Baines, 2023-03-24)
  This is useful when interpreting the load information.
* Include the timestamp when fetching the agent status from the db (Christopher Baines, 2023-03-23)
* Fix status load average handling (Christopher Baines, 2023-03-22)
  As the keys in JSON are strings.
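The class of bug behind this fix is easy to reproduce: JSON object keys are always strings, so numeric keys do not survive a round trip. The sketch below uses Python's `json` module for illustration; the `load_average` field name and the 1/5/15 minute keys are assumptions, not the coordinator's actual status format.

```python
import json

# Hypothetical agent status with numeric keys for the 1, 5 and 15 minute loads.
sent = {"load_average": {1: 0.5, 5: 0.7, 15: 0.9}}

# After serialising and parsing, the keys come back as strings.
received = json.loads(json.dumps(sent))


def load_average(status, minutes):
    """Look up a load average using the string key that JSON produces."""
    return status["load_average"][str(minutes)]


one_minute = load_average(received, 1)
```

Looking up `received["load_average"][1]` would raise a `KeyError`, which is why the lookup has to use the string form of the key.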
* Include agent tags in the coordinator state (Christopher Baines, 2023-03-22)
* Fix sharing build tags through the state (Christopher Baines, 2023-03-22)
* Include the allocation plan size in the coordinator state (Christopher Baines, 2023-03-22)
* Include build tags in the coordinator state (Christopher Baines, 2023-03-22)
* Include agent requested systems in the coordinator state (Christopher Baines, 2023-03-22)
* Include the last agent statuses in the overall status (Christopher Baines, 2023-03-22)
* Have agents send their status every 30 seconds (Christopher Baines, 2023-03-22)