1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-09-21 01:38:48 +02:00
phorge-phorge/src
epriestley 2e09a93dc1 Improve efficiency of worker task GC for huge loads
Summary:
Fixes T9808.

An instance imported a very large repository, generating approximately 4 million tasks over the course of a few days. A week later, these tasks started expiring and became candidates for garbage collection.

The GC works by deleting 100 rows at at time over and over again. It finds the rows it's going to delete by querying for old rows.

Currently, this query generates a `WHERE dateCreated < X ORDER BY id DESC` query. This query can not efficiently execute using a single key, because it relies on `dateCreated` order to find the rows, then on `id` order to sort them. With a table with 4M rows, this is slow.

This would still be OK, except that the query has to execute a lot of times since it only deletes 100 rows each time. Particularly, it needs to execute a total of ~40K times.

Instead, generate `WHERE dateCreated < X ORDER BY dateCreated DESC, id DESC`. This should have the same effect in general and the GC definitely doesn't care about the difference, but it should be more efficient at large scales.

Test Plan:
I had to `TRUNCATE` the problem table so I don't have a perfect repro to completely convincingly test this anymore. Both queries behave fine at small scales, which is why we haven't seen this before.

I was able to run the newer query in production before I nuked the table and have it complete in a reasonable amount of time, while the old query hung longer than I wanted to wait (several minutes?). The query plan for the new query was also a good one, while the query plan for the old query was terrible.

I loaded the daemon console and ran `bin/garbage collect --collector worker.tasks --trace`. I verified the queries looked reasonable and produced reasonable results in production.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T9808

Differential Revision: https://secure.phabricator.com/D14505
2015-11-17 17:05:10 -08:00
..
__tests__ Use PhutilClassMapQuery instead of PhutilSymbolLoader 2015-08-14 07:49:01 +10:00
aphront Allow a domain other than the install domain to serve as a short Phurl domain 2015-11-09 11:34:20 -08:00
applications Remarkup links to link to short url instead of long and fix commenting on Phurl's 2015-11-17 11:02:13 -08:00
docs Starting the Calendar user guide 2015-11-07 07:50:47 -08:00
extensions
infrastructure Improve efficiency of worker task GC for huge loads 2015-11-17 17:05:10 -08:00
view Implement a basic version of ApplicationEditor in Paste 2015-11-03 10:11:54 -08:00
__phutil_library_init__.php
__phutil_library_map__.php Remarkup links to link to short url instead of long and fix commenting on Phurl's 2015-11-17 11:02:13 -08:00