1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-10 08:52:39 +01:00
No description
Find a file
epriestley 2e09a93dc1 Improve efficiency of worker task GC for huge loads
Summary:
Fixes T9808.

An instance imported a very large repository, generating approximately 4 million tasks over the course of a few days. A week later, these tasks started expiring and became candidates for garbage collection.

The GC works by deleting 100 rows at at time over and over again. It finds the rows it's going to delete by querying for old rows.

Currently, this query generates a `WHERE dateCreated < X ORDER BY id DESC` query. This query can not efficiently execute using a single key, because it relies on `dateCreated` order to find the rows, then on `id` order to sort them. With a table with 4M rows, this is slow.

This would still be OK, except that the query has to execute a lot of times since it only deletes 100 rows each time. Particularly, it needs to execute a total of ~40K times.

Instead, generate `WHERE dateCreated < X ORDER BY dateCreated DESC, id DESC`. This should have the same effect in general and the GC definitely doesn't care about the difference, but it should be more efficient at large scales.

Test Plan:
I had to `TRUNCATE` the problem table so I don't have a perfect repro to completely convincingly test this anymore. Both queries behave fine at small scales, which is why we haven't seen this before.

I was able to run the newer query in production before I nuked the table and have it complete in a reasonable amount of time, while the old query hung longer than I wanted to wait (several minutes?). The query plan for the new query was also a good one, while the query plan for the old query was terrible.

I loaded the daemon console and ran `bin/garbage collect --collector worker.tasks --trace`. I verified the queries looked reasonable and produced reasonable results in production.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T9808

Differential Revision: https://secure.phabricator.com/D14505
2015-11-17 17:05:10 -08:00
bin Provide bin/garbage for interacting with garbage collection 2015-10-02 09:17:24 -07:00
conf Mark some strings for translation 2015-06-09 23:06:52 +10:00
externals Fix Oblivious skin summary remarkup and partially fix title 2015-11-06 20:24:11 +00:00
resources Fix file PHID extraction in Owners and Differential 2015-11-17 08:36:50 -08:00
scripts Various translation improvements 2015-11-03 07:02:46 +11:00
src Improve efficiency of worker task GC for huge loads 2015-11-17 17:05:10 -08:00
support Add a "Startup" to DarkConsole 2015-08-21 14:53:29 -07:00
webroot Add comments to internal Phame Posts 2015-11-10 08:19:38 -08:00
.arcconfig Use the configuration driven unit test engine 2015-08-11 07:57:11 +10:00
.arclint Apply phutil XHPAST linter standard 2015-11-13 07:09:12 +11:00
.arcunit Use the configuration driven unit test engine 2015-08-11 07:57:11 +10:00
.editorconfig Fix text lint issues 2015-02-12 07:00:13 +11:00
.gitignore Add custom Cows and Figlet directories to .gitignore 2015-10-08 20:23:05 -07:00
LICENSE Fix text lint issues 2015-02-12 07:00:13 +11:00
NOTICE Update Phabricator NOTICE file to reflect modern legal circumstances 2014-06-25 13:42:13 -07:00
README.md Remove push to IRC from "readme.md" too 2015-10-24 18:39:16 -07:00

Phabricator is a collection of web applications which help software companies build better software.

Phabricator includes applications for:

  • reviewing and auditing source code;
  • hosting and browsing repositories;
  • tracking bugs;
  • managing projects;
  • conversing with team members;
  • assembling a party to venture forth;
  • writing stuff down and reading it later;
  • hiding stuff from coworkers; and
  • also some other things.

You can learn more about the project (and find links to documentation and resources) at Phabricator.org

Phabricator is developed and maintained by Phacility.


SUPPORT RESOURCES

For resources on filing bugs, requesting features, reporting security issues, and getting other kinds of support, see Support Resources.

NO PULL REQUESTS!

We do not accept pull requests through GitHub. If you would like to contribute code, please read our Contributor's Guide.

LICENSE

Phabricator is released under the Apache 2.0 license except as otherwise noted.