path: root/guix-build-coordinator/coordinator.scm
Commit message (author, date)
* Delay storing derivations in the database (Christopher Baines, 2021-05-21)
  Until actually storing the build, since the build might not actually be submitted if there's a build for those outputs already.
* Make some SQLite related improvements (Christopher Baines, 2021-04-20)
  Don't keep database connections around forever as this relates to cached query plans, and also run the optimize pragma when closing connections.
* Start the allocator and hook threads later (Christopher Baines, 2021-03-29)
  It's important that this code doesn't run until Sqitch has run.
* Add a new dynamic authentication approach (Christopher Baines, 2021-02-28)
  This avoids the need to create agents upfront, which could be useful when creating many childhurd VMs or using scheduling tools to dynamically run agents.
* Add exception handling for the submit outputs hook (Christopher Baines, 2021-02-18)
* Add a hook for determining whether agents should submit outputs (Christopher Baines, 2021-02-17)
  This should make it possible to check properly whether the outputs are needed, instead of just assuming they are not if there's been a successful build.
* Show backtraces upon hook errors (Christopher Baines, 2021-02-14)
  This might not be helpful, but I think it's still worth trying, even if all the line numbers are within Guile itself...
* Add more logging around build result processing (Christopher Baines, 2021-02-08)
* Trigger build allocations when necessary for deferred builds (Christopher Baines, 2021-02-06)
* Use srfi-19 in the coordinator module (Christopher Baines, 2021-02-06)
* Don't use with-exception-handler with (backtrace) (Christopher Baines, 2021-01-22)
  With with-exception-handler being called with #:unwind? #f (implicitly). This breaks Guile internals used by (backtrace) [1], meaning you get a different exception/backtrace when Guile itself breaks. This should avoid the "string->number: Wrong type argument in position 1 (expecting string): #f" exception I've been haunted by for the last year.
  1: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=46009
* Sure up handling of exceptions within Guile (backtrace) (Christopher Baines, 2021-01-17)
* Tell agents when to submit built outputs (Christopher Baines, 2021-01-16)
  This is often unnecessary if the outputs have already been built.
* Move the build result storing logic in to the coordinator module (Christopher Baines, 2021-01-16)
  And out of the datastore. This means that datastore code doesn't have too much logic in it.
* Move build allocation complexity out of the datastore (Christopher Baines, 2021-01-16)
  And in to the coordinator module. This will make adding more datastores easier.
* Move triggering allocations out of the http server (Christopher Baines, 2021-01-15)
  As this code should be in the coordinator.
* Fix issues around datastore-count-builds-for-derivation (Christopher Baines, 2020-12-27)
  My refactoring went quite wrong.
* Implement deferring builds (Christopher Baines, 2020-12-27)
  This isn't intended as some time based scheduling, but more as a way to slow down builds by deferring processing them until some point in the future. I'm intending to use this to test fixed output derivations. I can look up all the derivations I want to test, then defer the builds to run spread out across some period. This feature saves having to submit the builds gradually.
* Change how some submit build options handle canceled builds (Christopher Baines, 2020-12-26)
  Don't include canceled builds in the build-for-derivation-exists? or build-for-output-already-exists? options. I think it makes sense to not include canceled builds in these options.
* Add a hook for when builds are canceled (Christopher Baines, 2020-12-21)
* Track the duration of hooks (Christopher Baines, 2020-12-20)
* Fix argument ordering for datastore-insert-build (Christopher Baines, 2020-12-17)
* Add missing db argument (Christopher Baines, 2020-12-17)
* Move cancel build logic in to the coordinator (Christopher Baines, 2020-12-16)
* Move more logic around submitting builds in to the coordinator (Christopher Baines, 2020-12-16)
  Originally I was trying to keep the implementation details of the datastore in the datastore modules, but this approach starts to crack as you cope with more and more complicated transactions. This change should help resolve issues around getting the coordinator logic in to the coordinator module, and simplifying the SQLite datastore in preparation for adding PostgreSQL support.
* Copy the agent log formatter to the coordinator (Christopher Baines, 2020-12-16)
  For some consistency.
* Implement build cancelation (Christopher Baines, 2020-12-16)
* Fix brackets (Christopher Baines, 2020-12-10)
* Use different buckets for the allocator duration metric (Christopher Baines, 2020-12-10)
  As the allocator currently can take much longer than 10 seconds.
* Start tracking the submit build coordinator action duration (Christopher Baines, 2020-12-07)
* Move some metrics out of base-datastore-metrics-updater (Christopher Baines, 2020-12-04)
  Some parts of this were quite slow with anything other than a small database, so instead of doing slow queries on every request, do some slow queries to setup the metrics, and then change them as part of the regular changes to the database.
* Improve how requested systems are handled in build allocation (Christopher Baines, 2020-12-02)
  Just join against the database table, rather than using the values.
* Manually handle WAL checkpointing (Christopher Baines, 2020-12-02)
  SQLite's usual approach doesn't seem to always contain the size of the WAL, so move this logic in to the application and regularly run a checkpoint.
* Better handle fetching builds (Christopher Baines, 2020-11-27)
  Previously, an agent could end up fetching builds from the coordinator, but not receiving the response, say because of a network issue or timeout. When it retries, it would fetch even more builds, and there would be some allocated to it, but that it doesn't know about. These changes attempt to make fetching builds more idempotent, rather than returning the new allocated builds, it returns all the builds, and rather than requesting a number of new builds, it's the total number of allocated builds that is specified.
* Propagate tags for ensure-all-related-derivation-outputs-have-builds (Christopher Baines, 2020-11-24)
  This seems like sensible behaviour, but it might be good to make this optional in the future.
* Fix build-started hook processing prompt (Christopher Baines, 2020-11-09)
* Make hook processing a bit more efficient (Christopher Baines, 2020-11-09)
  Rather than polling the database every second, use some condition variables to wake threads when there's probably an event.
* Add logging around hook processing (Christopher Baines, 2020-11-08)
  This might help work out why it gets stuck.
* Use the logger module to add times to the log output (Christopher Baines, 2020-11-07)
  Just for the request processing at the moment, but with a plan for more things in the future.
* Better handle agent errors on the coordinator side (Christopher Baines, 2020-10-24)
  Things like the agent not having the log file, or an output. This will allow the agent to actually retry the relevant thing.
* Add some validation for hooks (Christopher Baines, 2020-10-24)
* Remove unnecessary underscore (Christopher Baines, 2020-10-23)
  This matches a change in the guile prometheus library.
* Improve .tmp build log file handling (Christopher Baines, 2020-10-11)
  Make more of an effort to ignore the .tmp files.
* Move around the code for build log file locations (Christopher Baines, 2020-10-11)
  build-log-file-location replaces build-log-file-exists? as it doesn't always return a boolean, it also changes to return an absolute filepath for the log file if it exists, as this will be easier to use.
* Exclude .tmp files when checking for build logs (Christopher Baines, 2020-10-10)
* Separate the agent messaging server and client code (Christopher Baines, 2020-10-07)
  So that the client part doesn't depend on fibers.
* Split the fibers utils from the main utils module (Christopher Baines, 2020-10-07)
  To start making it possible to use the agent, without having to load anything related to fibers (as it doesn't work on the hurd yet).
* Track the number of builds the allocator is considering (Christopher Baines, 2020-09-23)
* Fix assq-ref when handling build results (Christopher Baines, 2020-09-20)
* Create a run-coordinator-service procedure (Christopher Baines, 2020-09-16)
  This is moving in the direction of not having to use the script to start the service. I think for a Guix service definition, being able to specify some Guile code directly will be better.