path: root/guix-build-coordinator
Commit message (Author, Age)
* Use the logger module to add times to the log output (Christopher Baines, 2020-11-07)
| | | | | Just for the request processing at the moment, but with a plan for more things in the future.
* Fix the unprocessed_builds table sticking around (Christopher Baines, 2020-11-06)
|
* Speed up populating the unbuilt_outputs table (Christopher Baines, 2020-11-06)
|
* Rework how the derivation ordered allocator gets builds (Christopher Baines, 2020-11-06)
| | | | | | Use a temporary table to avoid computing the priorities for all builds. This speeds up the allocation to only take a few seconds on the database I'm testing against.
* Handle multiple values in call-with-time-logging (Christopher Baines, 2020-11-06)
|
* Use the unbuilt_outputs table in the derivation ordered allocator (Christopher Baines, 2020-11-06)
| | | | As this speeds the query up substantially.
* Add an unbuilt_outputs table (Christopher Baines, 2020-11-06)
| | | | | | | One of the slow things in the derivation ordered allocator is working out what outputs are unbuilt, as this requires looking at all the derivation outputs (of which there are lots), and checking if any build exists which has succeeded.
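The idea of keeping a dedicated table of unbuilt outputs can be sketched as follows. This is an illustration in Python with the standard `sqlite3` module, not the coordinator's Guile code; the schema is a simplified assumption, and the helper names `populate_unbuilt_outputs` and `record_success` are hypothetical.

```python
import sqlite3

# Hypothetical, simplified schema: the real tables have more columns.
SCHEMA = """
CREATE TABLE derivation_outputs (derivation_name TEXT, output TEXT);
CREATE TABLE builds (id INTEGER PRIMARY KEY, derivation_name TEXT);
CREATE TABLE build_results (build_id INTEGER, result TEXT);
CREATE TABLE unbuilt_outputs (output TEXT PRIMARY KEY);
"""


def populate_unbuilt_outputs(db):
    """Rebuild unbuilt_outputs from scratch: the slow, full scan."""
    db.execute("DELETE FROM unbuilt_outputs")
    db.execute("""
        INSERT INTO unbuilt_outputs
        SELECT DISTINCT output FROM derivation_outputs
        WHERE output NOT IN (
            SELECT derivation_outputs.output
            FROM derivation_outputs
            JOIN builds
              ON builds.derivation_name = derivation_outputs.derivation_name
            JOIN build_results ON build_results.build_id = builds.id
            WHERE build_results.result = 'success')
    """)


def record_success(db, build_id):
    """Record a successful build and drop its outputs from the table."""
    db.execute("INSERT INTO build_results VALUES (?, 'success')", (build_id,))
    db.execute("""
        DELETE FROM unbuilt_outputs WHERE output IN (
            SELECT derivation_outputs.output
            FROM derivation_outputs
            JOIN builds
              ON builds.derivation_name = derivation_outputs.derivation_name
            WHERE builds.id = ?)
    """, (build_id,))
```

With the table kept current on each build result, an allocator query can join against `unbuilt_outputs` rather than re-examining every derivation output on each run.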
* Improve SQLite statement handling (Christopher Baines, 2020-11-04)
| | | | | | | | | | | | | | | The Guix Build Coordinator would segfault, and this seemed to come when preparing statements. I think this is happening because the (sqlite3) bindings finalize statements when they're out of scope, and this happens in the garbage collector thread. SQLite is running in multi-threaded mode, which means actions relating to one database connection shouldn't happen concurrently in different threads, hence I think this is leading to a segfault. To work around this behaviour, pass #:cache? #t to sqlite-prepare so statements are long-lived where possible, or in the few cases where the SQL is dynamic, make sure to finalize it before the garbage collector gets a chance. This'll hopefully mean that there are fewer segfaults...
* Include the Guile internal real and run times as metrics (Christopher Baines, 2020-11-02)
| | | | This will help track CPU time, as well as restarts/crashes.
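As a rough analogue of exposing these two timers, here is a Python sketch. It assumes that elapsed time since process start stands in for Guile's internal real time and that `time.process_time()` stands in for the internal run time; the metric names are hypothetical and are not the ones the coordinator exports.

```python
import time

# Captured at import time, so the "real time" value resets on restart.
_START = time.monotonic()


def internal_time_metrics():
    """Return elapsed wall-clock time and CPU time for this process."""
    return {
        "internal_real_time_seconds": time.monotonic() - _START,
        "internal_run_time_seconds": time.process_time(),
    }


def render(metrics):
    """Render metrics as simple 'name value' lines, sorted by name."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
```

A drop in the real-time value between scrapes indicates a restart or crash, while the run-time value tracks CPU usage.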
* Attempt to more gracefully handle the problem of missing derivations (Christopher Baines, 2020-11-02)
| | | | In the agent and allocator.
* Remove some left in debugging output (Christopher Baines, 2020-11-02)
|
* Only consider builds created in the last two weeks (Christopher Baines, 2020-10-29)
| | | | | | For the derivation ordered allocator. This is a quick alternative to having some kind of archival mechanism for builds. It should reduce the time it takes the allocator to run.
* Only consider unprocessed builds for prioritisation (Christopher Baines, 2020-10-29)
| | | | As there's no need to consider processed builds in this part of the query.
* Don't assume the missing input to a build is a direct input (Christopher Baines, 2020-10-24)
| | | | | | | | | | | | Substitutes could be available for all direct inputs, but be missing for things they reference. This could happen if those builds happened on a machine with the store items available for example. Therefore, search the entire graph for the relevant derivation when looking for the derivation to build to provide the missing input. This change matches up with the similar improvement around handling fetching substitutes.
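Searching the whole graph, rather than only the direct inputs, can be sketched as a breadth-first traversal. This is a Python illustration, not the coordinator's Guile code; `inputs_of` and `outputs_of` are hypothetical callbacks standing in for database lookups.

```python
from collections import deque


def find_derivation_for_output(root, inputs_of, outputs_of, missing_output):
    """Return the derivation in root's graph that produces missing_output.

    inputs_of(drv) and outputs_of(drv) are assumed lookups returning the
    input derivations and the output store items of a derivation.
    Returns None when no derivation in the graph produces the output.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        drv = queue.popleft()
        if missing_output in outputs_of(drv):
            return drv
        for dep in inputs_of(drv):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return None
```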
* Improve missing inputs behaviour (Christopher Baines, 2020-10-24)
| | | | | | | | When a substitute is found for a direct input, but it can't be fetched, this is probably because something it referenced isn't available. Therefore, look through the references recursively and collect up the store items that aren't available locally or via a substitute. Send this list to the coordinator so that it can schedule builds.
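The recursive collection described above might look roughly like this in Python. It is a sketch, not the agent's Guile code: the three callbacks are hypothetical stand-ins for querying the store and the substitute servers, and it assumes that references can only be followed for items that have a substitute.

```python
def collect_missing_items(item, references_of, valid_locally, substitute_available):
    """Collect store items that are neither valid locally nor substitutable.

    references_of, valid_locally and substitute_available are assumed
    callbacks. Items with a substitute are expanded into their references;
    items without one are reported as missing.
    """
    missing = []
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen or valid_locally(current):
            continue
        seen.add(current)
        if substitute_available(current):
            stack.extend(references_of(current))
        else:
            missing.append(current)
    return missing
```

The resulting list is what the agent would send to the coordinator so that builds can be scheduled for those items.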
* Add missing newline to failed to fetch substitutes message (Christopher Baines, 2020-10-24)
|
* Use valid-path? rather than file exists for testing store items (Christopher Baines, 2020-10-24)
| | | | | As the file might exist but be ignored because the daemon is treating it as invalid.
* Have the agent handle errors from the coordinator (Christopher Baines, 2020-10-24)
| | | | | When submitting builds. The agent will now retry the relevant thing, like uploading the log file if the coordinator says that still needs doing.
* Better handle agent errors on the coordinator side (Christopher Baines, 2020-10-24)
| | | | | Things like the agent not having the log file, or an output. This will allow the agent to actually retry the relevant thing.
* Extract out agents submitting log files (Christopher Baines, 2020-10-24)
| | | | So that this code can be retried if submitting the build result fails.
* Add the ability to ignore errors when retrying (Christopher Baines, 2020-10-24)
| | | | As this will enable responding to some exceptions at a higher level.
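One way to read this change is a retry helper that lets selected exception types pass straight through to the caller. The Python sketch below assumes that interpretation; the name `retry_on_error` and its parameters are hypothetical and may not match the project's Guile procedure.

```python
import time


def retry_on_error(thunk, times, delay=0.0, ignore=()):
    """Call thunk up to `times` times, retrying on any exception.

    Exception types listed in `ignore` are not retried: they propagate
    immediately so that a caller higher up can respond to them.
    """
    for attempt in range(1, times + 1):
        try:
            return thunk()
        except ignore:
            # Ignored by the retry logic: re-raise for the caller.
            raise
        except Exception:
            if attempt == times:
                raise
            time.sleep(delay)
```

For example, an agent could retry transient network failures while letting an "upload the log file first" error escape, then handle that error by uploading the log and resubmitting.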
* Improve the line length for the receiving outputs code (Christopher Baines, 2020-10-24)
|
* Allow configuring the s3-publish-hook with an aws command (Christopher Baines, 2020-10-24)
| | | | So that an absolute filename can be used.
* Add some validation for hooks (Christopher Baines, 2020-10-24)
|
* Make the s3 utils command configurable (Christopher Baines, 2020-10-24)
| | | | In case you want to use the absolute location of the binary.
* Remove unnecessary underscore (Christopher Baines, 2020-10-23)
| | | | This matches a change in the guile prometheus library.
* Move the post-publish-behaviour inside the s3 publish hook (Christopher Baines, 2020-10-22)
| | | | | | So that it only runs if the narinfo on S3 is changed. This should prevent it running when the hook wouldn't change the narinfo on S3 because one already exists.
* client-communication: Do not use a hard-coded uri. (Mathieu Othacehe, 2020-10-20)
| | | | | * guix-build-coordinator/client-communication.scm (send-request): Use coordinator-uri instead of the hard-coded localhost uri.
* Display exception details prior to backtrace (Christopher Baines, 2020-10-20)
| | | | | To make sure some useful information makes it out, because (backtrace) can raise an exception.
* Support extending the S3 publish hook (Christopher Baines, 2020-10-19)
| | | | To allow doing things with the nar/narinfo files before they're deleted.
* Improve .tmp build log file handling (Christopher Baines, 2020-10-11)
| | | | Make more of an effort to ignore the .tmp files.
* Show backtrace on agent exceptions (Christopher Baines, 2020-10-11)
|
* Move the registry file to a clearer name (Christopher Baines, 2020-10-11)
| | | | | This will be a breaking change for existing deployments, as the old sqitch.db file will need to be moved manually.
* Add a hook to recompress build log files (Christopher Baines, 2020-10-11)
|
* Move around the code for build log file locations (Christopher Baines, 2020-10-11)
| | | | | | build-log-file-location replaces build-log-file-exists?, as it doesn't always return a boolean. It also now returns an absolute filename for the log file if it exists, as this will be easier to use.
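A lookup that returns a location instead of a boolean, and skips in-progress `.tmp` files, could be sketched like this in Python. The file naming scheme (`<build-id>.log` plus an optional compression suffix) is an assumption made for illustration and may differ from what the coordinator uses.

```python
import os


def build_log_file_location(log_dir, build_id):
    """Return the absolute path of the finished log for build_id, or None.

    Assumes logs are named '<build-id>.log' with an optional suffix such
    as '.gz', and that partially received files end in '.tmp'.
    """
    prefix = f"{build_id}.log"
    for name in sorted(os.listdir(log_dir)):
        if name.startswith(prefix) and not name.endswith(".tmp"):
            return os.path.abspath(os.path.join(log_dir, name))
    return None
```

Callers can still treat the result as a truth value, but they also get the filename without a second lookup.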
* Exclude .tmp files when checking for build logs (Christopher Baines, 2020-10-10)
|
* Guard against receiving parts of build log files (Christopher Baines, 2020-10-10)
|
* Fix missing bad-request procedure (Christopher Baines, 2020-10-07)
|
* Separate the agent messaging server and client code (Christopher Baines, 2020-10-07)
| | | | So that the client part doesn't depend on fibers.
* Split the fibers utils from the main utils module (Christopher Baines, 2020-10-07)
| | | | | To start making it possible to use the agent, without having to load anything related to fibers (as it doesn't work on the hurd yet).
* Guard against Guix Data Service requests hanging (Christopher Baines, 2020-10-02)
| | | | | I don't know if this is happening, but the hooks are getting stuck, and this might be a cause.
* Track the number of builds the allocator is considering (Christopher Baines, 2020-09-23)
|
* Fix assq-ref when handling build results (Christopher Baines, 2020-09-20)
|
* Work around Guile not printing backtraces without failing (Christopher Baines, 2020-09-18)
| | | | | | | | | | | | For some exceptions raised in worker threads, seemingly those that come from guile-sqlite3, Guile can't print the backtrace without erroring itself [1]. Work around Guile not being helpful by just printing out the backtrace, that Guile may fail to print, after the details of the exception. At least then there's something informative in the output. 1: In procedure string->number: Wrong type argument in position 1 (expecting string): #f
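The ordering described here, details first and the possibly failing backtrace second, can be illustrated in Python. This is only an analogue of the Guile workaround; `report_exception` and its `print_backtrace` parameter are hypothetical names.

```python
import sys
import traceback


def report_exception(exc, out=sys.stderr, print_backtrace=traceback.print_exception):
    """Write the exception details first, then attempt the backtrace.

    If printing the backtrace itself raises, the details are already in
    the output, and the secondary failure is noted rather than propagated.
    """
    out.write(f"exception: {exc!r}\n")
    try:
        print_backtrace(type(exc), exc, exc.__traceback__, file=out)
    except Exception as backtrace_error:
        out.write(f"failed to print backtrace: {backtrace_error!r}\n")
```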
* Simplify sqlite transactionality (Christopher Baines, 2020-09-18)
| | | | Remove one layer of exception handling, as I don't think it was adding much.
* Better describe the default hooks (Christopher Baines, 2020-09-17)
|
* Create a run-coordinator-service procedure (Christopher Baines, 2020-09-16)
| | | | | | This is moving in the direction of not having to use the script to start the service. I think for a Guix service definition, being able to specify some Guile code directly will be better.
* Move more coordinator service startup out of the script (Christopher Baines, 2020-09-16)
|
* Don't patch fibers, just use the different procedure directly (Christopher Baines, 2020-09-16)
|
* Extract call-with-sigint to the utils module (Christopher Baines, 2020-09-16)
|