1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-12-19 12:00:55 +01:00
No description
Find a file
epriestley ee937e99fb Fix unbounded expansion of allocating resource pool
Summary:
Ref T9252. I think there's a more complex version of this problem discussed elsewhere, but here's what we hit today:

  - 5 commits land at the same time and trigger 5 builds.
  - All of them go to acquire a working copy.
  - Working copies have a limit of 1 right now, so 1 of them gets the lease on it.
  - The other 4 all trigger allocation of //new// working copies. So now we have: 1 active, leased working copy and 4 pending, leased working copies.
  - The 4 pending working copies will never activate without manual intervention, so these 4 builds are stuck forever.

To fix this, prevent WorkingCopies from giving out leases until they activate. So now the leases won't acquire until we know the working copy is good, which solves the first problem.

However, this creates a secondary problem:

  - As above, all 5 go to acquire a working copy.
  - One gets it.
  - The other 4 trigger allocations, but no longer acquire leases. This is an improvement.
  - Every time the leases update, they trigger another allocation, but never acquire. They trigger, say, a few thousand allocations.
  - Eventually the first build finishes up and the second lease acquires the working copy. After some time, all of the builds finish.
  - However, they generated an unboundedly large number of pending working copy resources during this time.

This is technically "okay-ish", in that it did work correctly, it just generated a gigantic mess as a side effect.

To solve this, at least for now, provide a mechanism to impose allocation rate limits and put a cap on the number of allocating resources of a given type. As hard-coded, this the greater of "1" or "25% of the active resources in the pool".

So if there are 40 working copies active, we'll start allocating up to 10 more and then cut new allocations off until those allocations get sorted out. This prevents us from getting runaway queues of limitless size.

This also imposes a total active working copy resource limit of 1, which incidentally also fixes the problem, although I expect to raise this soon.

These mechanisms will need refinement, but the basic idea is:

  - Resources which aren't sure if they can actually activate should wait until they do activate before allowing leases to acquire them. I'm fairly confident this rule is a reasonable one.
  - Then we limit how many bookkeeping side effects Drydock can generate once it starts encountering limits.

Broadly, some amount of mess is inevitable because Drydock is allowed to try things that might not work. In an extreme case we could prevent this mess by setting all these limits at "1" forever, which would degrade Drydock to effectively be a synchronous, blocking queue.

The idea here is to put some amount of slack in the system (more than zero, but less than infinity) so we get the performance benefits of having a parallel, asyncronous system without a finite, manageable amount of mess.

Numbers larger than 0 but less than infinity are pretty tricky, but I think rules like "X% of active resources" seem fairly reasonable, at least for resources like working copies.

Test Plan:
Ran something like this:

```
for i in `seq 1 5`; do sh -c '(./bin/harbormaster build --plan 10 rX... &) &'; done;
```

Saw 5 plans launch, acquire leases, proceed in an orderly fashion, and eventually finish successfully.

Reviewers: hach-que, chad

Reviewed By: chad

Maniphest Tasks: T9252

Differential Revision: https://secure.phabricator.com/D14236
2015-10-05 15:59:16 -07:00
bin Provide bin/garbage for interacting with garbage collection 2015-10-02 09:17:24 -07:00
conf Mark some strings for translation 2015-06-09 23:06:52 +10:00
externals Use PEAR Text_Figlet to render figlet fonts 2015-09-13 12:31:07 -07:00
resources Correct a Dashboard status constant in a migration 2015-10-02 09:17:43 -07:00
scripts Provide bin/garbage for interacting with garbage collection 2015-10-02 09:17:24 -07:00
src Fix unbounded expansion of allocating resource pool 2015-10-05 15:59:16 -07:00
support Add a "Startup" to DarkConsole 2015-08-21 14:53:29 -07:00
webroot Update Asana Logo 2015-09-30 10:15:53 -07:00
.arcconfig Use the configuration driven unit test engine 2015-08-11 07:57:11 +10:00
.arclint Turn lint TODO comments back on 2015-05-27 10:06:55 -07:00
.arcunit Use the configuration driven unit test engine 2015-08-11 07:57:11 +10:00
.editorconfig Fix text lint issues 2015-02-12 07:00:13 +11:00
.gitignore When registering a device, write a device ID 2015-01-22 16:06:04 -08:00
LICENSE Fix text lint issues 2015-02-12 07:00:13 +11:00
NOTICE Update Phabricator NOTICE file to reflect modern legal circumstances 2014-06-25 13:42:13 -07:00
README.md Marginal improvements to README 2015-03-08 11:29:06 -07:00

Phabricator is an open source collection of web applications which help software companies build better software.

Phabricator includes applications for:

  • reviewing and auditing source code;
  • hosting and browsing repositories;
  • tracking bugs;
  • managing projects;
  • conversing with team members;
  • assembling a party to venture forth;
  • writing stuff down and reading it later;
  • hiding stuff from coworkers; and
  • also some other things.

You can learn more about the project (and find links to documentation and resources) at Phabricator.org

Phabricator is developed and maintained by Phacility.


BUG REPORTS

Please update your install to HEAD before filing bug reports. Follow our bug reporting guide for complete instructions.

FEATURE REQUESTS

We're big fans of feature requests that state core problems, not just 'add this'. We've compiled a short guide to effective upstream requests here.

COMMUNITY CHAT

Please visit our IRC Channel (#phabricator on FreeNode) to talk with other members of the Phabricator community. There might be someone there who can help you with setup issues or what image to choose for a macro.

SECURITY ISSUES

Phabricator participates in HackerOne and may pay out for various issues reported there. You can find out more information on our HackerOne page.

PULL REQUESTS

We do not accept pull requests through GitHub. If you would like to contribute code, please read our Contributor's Guide for more information.

LICENSE

Phabricator is released under the Apache 2.0 license except as otherwise noted.