path: root/guix-data-service/model
Commit message | Author | Age
* Split up handling of package description data | Christopher Baines | 2024-01-31
| To hopefully see which part is slow.
* Remove even more time logging | Christopher Baines | 2024-01-28
|
* Remove some time logging | Christopher Baines | 2024-01-27
| As this is a bit noisy.
* Fixup tests | Christopher Baines | 2024-01-18
|
* Split and instrument parts of inferior-packages->package-metadata-ids | Christopher Baines | 2024-01-18
| As parts of it are slow.
* Rewrite part of insert-missing-data-and-return-all-ids to avoid filter | Christopher Baines | 2024-01-18
| As filter can use part of the input list, which then prevents modifying the filtered list.
* Have delete-duplicates/sort! take an equality procedure | Christopher Baines | 2024-01-18
| And change the default, as eq? doesn't always work.
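The idea behind a sort-based duplicate removal that accepts an equality procedure can be sketched as follows. This is only an illustration in Python, not the Guile code from the repository: the name `delete_duplicates_sort` and its parameters are hypothetical, and it assumes the sort order places equal items next to each other. The identity comparison stands in for `eq?`, which misses duplicates that are equal in value but are distinct objects.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def delete_duplicates_sort(
    items: List[T],
    key: Optional[Callable[[T], object]] = None,
    equal: Callable[[T, T], bool] = lambda a, b: a == b,
) -> List[T]:
    """Sort `items` in place, then keep the first of each run of equal items."""
    items.sort(key=key)  # mutates the caller's list, like a Scheme `!` procedure
    unique: List[T] = []
    for item in items:
        # After sorting, duplicates are adjacent, so one comparison suffices.
        if not unique or not equal(unique[-1], item):
            unique.append(item)
    return unique


# Three distinct list objects, two of which hold the same values.
rows = [[2, "b"], [1, "a"], [2, "b"]]

# Comparing by value removes the duplicate row.
by_value = delete_duplicates_sort(list(rows))

# Comparing by identity (the analogue of eq?) keeps both copies.
by_identity = delete_duplicates_sort(list(rows), equal=lambda a, b: a is b)
```

Sorting first costs O(n log n), after which a single pass finds duplicates, whereas a pairwise comparison of every element is quadratic in the worst case.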
* Use delete-duplicates/sort! in inferior-packages->license-set-ids | Christopher Baines | 2024-01-18
| As it should offer a speedup over delete-duplicates.
* Use delete-duplicates/sort! in insert-missing-data-and-return-all-ids | Christopher Baines | 2024-01-18
| As it's faster than delete-duplicates for large amounts of data.
* Memoize computing tokens | Christopher Baines | 2023-11-24
| As I'm not sure how expensive this is, but it doesn't need doing for every request.
* Handle derivations with no sources | Christopher Baines | 2023-11-05
|
* Include output information in the package page response | Christopher Baines | 2023-11-05
| As this will be useful for QA to say whether the package builds reproducibly or not.
* Use fibers when processing new revisions | Christopher Baines | 2023-11-05
| Just have one fiber at the moment, but this will enable using fibers for parallelism in the future.
|
| Fibers seemed to cause problems with the logging setup, which was a bit odd in the first place. So move logging to the parent process which is better anyway.
* Make some sweeping changes to loading new revisions | Christopher Baines | 2023-11-02
| Move in the direction of being able to run multiple inferior REPLs, and use some vectors rather than lists in places (maybe this is more efficient).
* Remove redundant joins from the select build query | Christopher Baines | 2023-10-16
|
* Support polling git repositories for new branches/revisions | Christopher Baines | 2023-10-09
| This is mostly a workaround for the occasional problems with the guix-commits mailing list, as it can break and then the data service doesn't learn about new revisions until the problem is fixed.
|
| I think it's still a generally good feature though, and allows deploying the data service without it consuming emails to learn about new revisions, and is a step towards integrating some kind of way of notifying the data service to poll.
* Try to fix backfilling blocked_builds | Christopher Baines | 2023-07-02
|
* Filter out duplicate ids for blocking builds | Christopher Baines | 2023-07-02
|
* Query for outputs when build events arrive | Christopher Baines | 2023-06-09
| This will keep the substitute information more up to date.
* Fix ignoring canceled builds | Christopher Baines | 2023-05-18
| The previous changes only affected searching for package derivations, and they also didn't work.
* Ignore canceled builds when querying package derivations | Christopher Baines | 2023-05-18
| This will help when using this to submit builds, since you won't end up ignoring derivations with canceled builds.
* Ensure the known and unknown keys appear | Christopher Baines | 2023-05-09
|
* Remove redundant match-lambda in select-package-output-availability-for-revision | Christopher Baines | 2023-05-09
|
* Use the package_derivations system id in a query | Christopher Baines | 2023-05-04
| Rather than the derivations system id, as this helps PostgreSQL run the query faster.
* Further tweak fetching narinfos | Christopher Baines | 2023-04-28
| Move the batching to the database, which should reduce memory usage while removing the limit on the number of fetched narinfos.
* Improve performance of select-fixed-output-package-derivations-in-revision | Christopher Baines | 2023-03-11
|
* Fix query in get-count-for-next-level | Christopher Baines | 2023-03-09
|
* Avoid a recursive CTE for finding blocked builds where possible | Christopher Baines | 2023-03-09
| Use the new approach of looking up the distribution of the derivations, and building a non recursive query specifically for this revision. This should avoid PostgreSQL picking a poor plan for performing the query.
* Store the distribution of derivations related to packages | Christopher Baines | 2023-03-09
| This might be generally useful, but I've been looking at it as it offers a way to try and improve query performance when you want to select all the derivations related to the packages for a revision.
|
| The data looks like this (for a specified system and target):
|
| ┌───────┬───────┐
| │ level │ count │
| ├───────┼───────┤
| │    15 │     2 │
| │    14 │     3 │
| │    13 │     3 │
| │    12 │     3 │
| │    11 │    14 │
| │    10 │    25 │
| │     9 │    44 │
| │     8 │    91 │
| │     7 │  1084 │
| │     6 │   311 │
| │     5 │   432 │
| │     4 │   515 │
| │     3 │   548 │
| │     2 │  2201 │
| │     1 │ 21162 │
| │     0 │ 22310 │
| └───────┴───────┘
|
| Level 0 reflects the number of packages. Level 1 is similar as you have all the derivations for the package origins. The remaining levels contain fewer packages since it's mostly just derivations involved in bootstrapping.
|
| When using a recursive CTE to collect all the derivations, PostgreSQL assumes that each derivation has the same number of inputs, and this leads to a large overestimation of the number of derivations per revision. This in turn can lead to PostgreSQL picking a slower way of running the query.
|
| When it's known how many new derivations you should see at each level, it's possible to inform PostgreSQL of this by using LIMITs at various points in the query. This reassures the query planner that it's not going to be handling lots of rows and helps it make better decisions about how to execute the query.
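One way to read the level/count data is as a breadth-first walk from the package derivations, counting the derivations first seen at each depth. The sketch below illustrates that reading in Python. It is an assumption about what "level" means, not the data service's implementation (which works in SQL against PostgreSQL), and the function name, the `inputs_of` mapping and the sample graph are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def level_distribution(
    roots: Iterable[Hashable],
    inputs_of: Dict[Hashable, List[Hashable]],
) -> Dict[int, int]:
    """Count derivations first reached at each depth, starting from `roots`."""
    seen = set(roots)
    frontier = list(seen)
    distribution: Dict[int, int] = {}
    level = 0
    while frontier:
        distribution[level] = len(frontier)
        next_frontier = []
        for derivation in frontier:
            for input_derivation in inputs_of.get(derivation, ()):
                # Only count a derivation the first time it is reached.
                if input_derivation not in seen:
                    seen.add(input_derivation)
                    next_frontier.append(input_derivation)
        frontier = next_frontier
        level += 1
    return distribution


# Hypothetical graph: two packages share a compiler, which needs a bootstrap
# derivation; each package also has its own source (origin) derivation.
example_inputs = {
    "pkg-a": ["src-a", "gcc"],
    "pkg-b": ["src-b", "gcc"],
    "gcc": ["bootstrap"],
}
example_distribution = level_distribution(["pkg-a", "pkg-b"], example_inputs)
```

With counts like these stored per revision, a query can be built with one step per level, each capped by a LIMIT equal to the known count for that level.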
* Guard against divide by 0 in update-derivation-outputs-statistics | Christopher Baines | 2022-11-28
|
* Do derivation inputs and outputs housekeeping at the end of each job | Christopher Baines | 2022-11-28
| This should help with query performance, as the recursive queries using derivation_inputs and derivation_outputs are particularly sensitive to the n_distinct values for these tables.
* Fix calling insert-blocked-builds | Christopher Baines | 2022-11-20
|
* Make backfilling blocked_builds a bit smarter | Christopher Baines | 2022-11-12
| And drop the chunk size.
* Handle deleting from blocked_builds when builds are scheduled | Christopher Baines | 2022-11-12
| As scheduling a build might unblock others.
* View scheduled builds like succeeded builds in terms of blocking | Christopher Baines | 2022-11-12
| This means that an output is viewed to not be blocking if it has a scheduled build, just as if it has a succeeded build. Also, scheduling builds will unblock blocked builds. This is helpful as it means that it reduces noise for blocking builds.
* Tweak backfilling the blocked builds | Christopher Baines | 2022-11-12
|
* Use latest_build_status rather than build_status | Christopher Baines | 2022-11-12
| In various places in the blocked-builds module.
* Have insert-blocked-builds cache when the partitions exist | Christopher Baines | 2022-11-11
| To make it more efficient.
* Rework insert-blocked-builds to make it more efficient | Christopher Baines | 2022-11-11
| This also fixes a typo in the partition name.
* Stop using exception handling when inserting blocked_builds entries | Christopher Baines | 2022-11-11
| As it doesn't work in a transaction.
* Add a blocking builds page | Christopher Baines | 2022-11-11
|
* Add support for incrementally tracking blocked builds | Christopher Baines | 2022-11-11
| This will hopefully provide a less expensive way of finding out if a scheduled build is probably blocked by other builds failing or being canceled. By working this out when the build events are received, it should be more feasible to include information about whether builds are likely blocked or not in various places (e.g. revision comparisons).
* Improve chunking when inserting derivation inputs | Christopher Baines | 2022-09-17
| Chunk the values inserted in the query, rather than the derivations involved, as this is more consistent.
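The difference between chunking by derivation and chunking by inserted value can be shown with a small Python sketch. The helper `chunked` and the sample data are hypothetical and are not taken from the repository. Because derivations have widely varying numbers of inputs, a fixed number of derivations per query gives uneven query sizes, while a fixed number of values per query does not.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(values: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` items from `values`."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical data: derivation id -> ids of its inputs (sizes vary a lot).
inputs_by_derivation = {1: [10, 11, 12, 13, 14], 2: [10], 3: [], 4: [11, 12]}

# Flatten to one (derivation_id, input_id) row per value to insert.
rows = [
    (derivation, input_id)
    for derivation, input_ids in inputs_by_derivation.items()
    for input_id in input_ids
]

# Chunking by derivation: two derivations per query, uneven row counts.
rows_per_query_by_derivation = [
    sum(len(inputs_by_derivation[d]) for d in group)
    for group in chunked(inputs_by_derivation, 2)
]

# Chunking by value: every query except the last carries exactly 3 rows.
rows_per_query_by_value = [len(chunk) for chunk in chunked(rows, 3)]
```

Bounding the number of rows per query keeps each INSERT at a predictable size, which is what makes the query timings more consistent.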
* Reduce some chunk sizes | Christopher Baines | 2022-09-17
|
* Further reduce some chunk sizes | Christopher Baines | 2022-09-15
|
* Chunk the data for some queries in insert-missing-data-and-return-all-ids | Christopher Baines | 2022-09-15
| This helps to avoid queries getting logged as slow just because of the amount of data.
* Format some queries generated in insert-missing-data-and-return-all-ids | Christopher Baines | 2022-09-14
|
* Reduce some chunk sizes | Christopher Baines | 2022-09-14
| As these queries are still slow enough to be logged.
* Speed up finding the locales for a revision | Christopher Baines | 2022-09-14
|
* Reduce chunk size for inserting derivation inputs | Christopher Baines | 2022-09-14
| As this query can take some time.