1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-12-12 00:26:13 +01:00
Commit graph

34 commits

Author SHA1 Message Date
Joshua Spence
acb1eb81cc Move some PhabricatorSearchField subclasses
Summary: Move some `PhabricatorSearchField` subclasses to be adjacent to the application to which they belong. This seems generally better to me than lumping them all together in the `src/applications/search/field/` directory. I was also wondering if it makes sense to rename these subclasses as `PhabricatorXSearchField` rather than `PhabricatorSearchXField` (as per T5655), but wasn't really sure if these objects are meant to be search-fields, or just fields belonging to the #search application.

Test Plan: N/A.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin

Differential Revision: https://secure.phabricator.com/D13374
2015-07-06 22:52:05 +10:00
epriestley
729606ba93 Update BulkJob and MetaMTA search engines for redesign-2015 2015-06-23 13:39:27 -07:00
epriestley
3215899925 Execute Maniphest batch edits in the background with a web UI progress bar
Summary:
Ref T8637. This does nothing interesting, just has empty scaffolding for a bulk job queue.

Basic idea is that when you do something like a batch edit in Maniphest, we:

  - Create a BulkJob with all the details.
  - Queue a worker to start the job.
  - Send you to a progress bar page for the job.

In the background:

  - The "start job" worker creates a ton of Task objects, then queues worker tasks to do the work.

In the foreground:

  - Fancy ajax animates the progress bar and it goes wooosh.

In general:

  - Big jobs actually work.
  - Jobs get logged.
  - You can monitor jobs.
  - Terrible junk like T8637 should be much harder to write and much easier to catch and diagnose.

Test Plan:
No interesting code/beahavior yet. Clean `storage adjust`.

{F526411}

Reviewers: chad, btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T8637

Differential Revision: https://secure.phabricator.com/D13392
2015-06-23 13:36:16 -07:00
Joshua Spence
36e2d02d6e phtize all the things
Summary: `pht`ize a whole bunch of strings in rP.

Test Plan: Intense eyeballing.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: hach-que, Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12797
2015-05-22 21:16:39 +10:00
Joshua Spence
acb45968d8 Use __CLASS__ instead of hard-coding class names
Summary: Use `__CLASS__` instead of hard-coding class names. Depends on D12605.

Test Plan: Eyeball it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: hach-que, Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12806
2015-05-14 07:21:13 +10:00
epriestley
31a89bb94d Revert a json_decode() which decodes possible scalars
See D12714, D12680.

Auditors: joshuaspence
2015-05-05 11:08:32 -07:00
Joshua Spence
70c8649142 Use phutil_json_decode instead of json_decode
Summary: Generally, `phutil_json_decode` should be preferred over `json_decode`.

Test Plan: Eyellballed.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12680
2015-05-05 20:48:55 +10:00
epriestley
55e49d7e31 Provide more buildXClause() and buildXClauseParts() on PolicyAwareQuery
Summary:
Ref T4100. Ref T5595. These functions are trivial for now, but move us toward being able to define more default query behavior by default.

Future changes will give these methods meaningful, nontrivial behaviors.

Test Plan: `arc unit --everything`

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T5595, T4100

Differential Revision: https://secure.phabricator.com/D12454
2015-04-20 10:06:10 -07:00
epriestley
f5580c7a08 Make buildWhereClause() a method of AphrontCursorPagedPolicyAwareQuery
Summary:
Ref T4100. Ref T5595.

To support a unified "Projects:" query across all applications, a future diff is going to add a set of "Edge Logic" capabilities to `PolicyAwareQuery` which write the required SELECT, JOIN, WHERE, HAVING and GROUP clauses for you.

With the addition of "Edge Logic", we'll have three systems which may need to build components of query claues: ordering/paging, customfields/applicationsearch, and edge logic.

For most clauses, queries don't currently call into the parent explicitly to get default components. I want to move more query construction logic up the class tree so it can be shared.

For most methods, this isn't a problem, but many subclasses define a `buildWhereClause()`. Make all such definitions protected and consistent.

This causes no behavioral changes.

Test Plan: Ran `arc unit --everything`, which does a pretty through job of verifying this statically.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: yelirekim, hach-que, epriestley

Maniphest Tasks: T4100, T5595

Differential Revision: https://secure.phabricator.com/D12453
2015-04-20 10:06:09 -07:00
epriestley
ed2a5a9a34 Fix PhabricatorWorkerTriggerQuery method visibility
Summary: I got these wrong and the test didn't trigger for some reason that I haven't looked into.

Test Plan: `arc unit --everything`

Reviewers: hach-que, btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D11453
2015-01-22 16:10:08 -08:00
epriestley
77bcbed9f9 Implement PolicyAwareQuery for triggers
Summary:
Ref T6881. I tried to cheat here by not implementing this, but we need it for destroying triggers directly with `bin/remove destroy`, since that needs to load them by PHID.

So, cheat slightly less. Implement PolicyAware but not CursorPagedPolicyAware.

Test Plan:
  - Used `bin/remove destroy` to destroy a trigger by PHID.
  - Browsed daemon console.
  - Ran trigger daemon.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11445
2015-01-20 13:32:43 -08:00
epriestley
ef106d2979 Add order-by-ID to PhabricatorWorkerTriggerQuery
Summary:
Ref T6881. By design, the EXECUTION order only selects tasks which have been scheduled (since it performs a JOIN). This is inconsistent with other queries and problematic for withID/withPHID queries which may want to select an unscheduled task.

Switch to standard ID ordering by default.

Test Plan:
  - Instances console now finds unscheduled triggers.
  - Verified that all existing queries specify an explicit order.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11436
2015-01-19 16:55:52 -08:00
epriestley
3860c56e85 Allow querying triggers by ID/PHID
Summary: Ref T6881. I just want to show trigger info in the instance management console.

Test Plan: Will test in Instances.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11428
2015-01-19 16:55:08 -08:00
epriestley
19be32656f Implement clock/trigger infrastructure for scheduling actions
Summary:
Ref T6881. Hopefully, this is the hard part.

This adds a new daemon (the "trigger" daemon) which processes triggers, schedules them, and then executes them at the scheduled time. The design is a little complicated, but has these goals:

  - High resistance to race conditions: only the application writes to the trigger table; only the daemon writes to the event table. We won't lose events if someone saves a meeting at the same time as we're sending a reminder out for it.
  - Execution guarantees: scheduled events are guaranteed to execute exactly once.
  - Support for arbitrarily large queues: the daemon will make progress even if there are millions of triggers in queue. The cost to update the queue is proportional to the number of changes in it; the cost to process the queue is proportional to the number of events to execute.
  - Relatively good observability: you can monitor the state of the trigger queue reasonably well from the web UI.
  - Modular Infrastructure: this is a very low-level construct that Calendar, Phortune, etc., should be able to build on top of.

It doesn't have this stuff yet:

  - Not very robust to bad actions: a misbehaving trigger can stop the queue fairly easily. This is OK for now since we aren't planning to make it part of any other applications for a while. We do still get execute-exaclty-once, but it might not happen for a long time (until someone goes and fixes the queue), when we could theoretically continue executing other events.
  - Doesn't start automatically: normal users don't need to run this thing yet so I'm not starting it by default.
  - Not super well tested: I've vetted the basics but haven't run real workloads through this yet.
  - No sophisticated tooling: I added some basic stuff but it's missing some pieces we'll have to build sooner or later, e.g. `bin/trigger cancel` or whatever.
  - Intentionally not realtime: This design puts execution guarantees far above realtime concerns, and will not give you precise event execution at 1-second resolution. I think this is the correct goal to pursue architecturally, and certainly correct for subscriptions and meeting reminders. Events which execute after they have become irrelevant can simply decline to do anything (like a meeting reminder which executes after the meeting is over).

In general, the expectation for applications is:

  - When creating an object (like a calendar event) that needs to trigger a scheduled action, write a trigger (and save the PHID if you plan to update it later).
  - The daemon will process the event and schedule the action efficiently, in a race-free way.
  - If you want to move the action, update the trigger and the daemon will take care of it.
  - Your action will eventually dump a task into the task queue, and the task daemons will actually perform it.

Test Plan:
Using a test script like this:

```
<?php

require_once 'scripts/__init_script__.php';

$trigger = id(new PhabricatorWorkerTrigger())
  ->setAction(
    new PhabricatorLogTriggerAction(
      array(
        'message' => 'test',
      )))
  ->setClock(
    new PhabricatorMetronomicTriggerClock(
      array(
        'period' => 33,
      )))
  ->save();

var_dump($trigger);
```

...I queued triggers and ran the daemon:

  - Verified triggers fire;
  - verified triggers reschedule;
  - verified trigger events show up in the web UI;
  - tried different periods;
  - added some triggers while the daemon was running;
  - examined `phd debug` output for anything suspicious.

It seems to work in trivial use case, at least.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11419
2015-01-16 12:13:31 -08:00
epriestley
8c4f3edd8a Skip some repository checks in cluster enviornments
Summary:
Ref T2783. Currently, the repository edit page does some checks agaisnt the local system to look for binaries and files on disk. These checks don't make sense in a cluster environment.

Ideally, we could make a Conduit call to the host (e.g., add something like `diffusion.querysetupstatus`) to do these checks, but since they're pretty basic config things and cluster installs are advanced, it doesn't seem super worthwhile for now.

Test Plan: Saw fewer checks in a cluster repo.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T2783

Differential Revision: https://secure.phabricator.com/D11102
2014-12-31 11:50:35 -08:00
epriestley
ba4ebf28ad Allow archived tasks to be queried by object PHID and order by id
Summary: Ref T5402.

Test Plan:
  - Queried archived tasks.
  - Grepped for use sites and verified no other callsites are order-sensitive.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11089
2014-12-30 15:54:56 -08:00
Bob Trahan
9219645287 Daemons - add "objectPHID" to task tables.
Summary: Ref T5402. This more or less "fixes" it but there's probably some polish to do?

Test Plan:
stopped and started daemons. error logs look good.

ran bin/storage upgrade.  noted that `adjust` added the appropriate indices for active and archive task.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11044
2014-12-23 16:30:05 -08:00
Bob Trahan
a4474a4975 Daemons - introduce PhabricatorWorkerArchiveTaskQuery
Summary: Ref T5402. This cleans up some code and sets us up to use this sort of data more easily later.

Test Plan: viewed the daemon console from the web and the log of a specific archived daemon. both looked good. for other callsites looked really, really carefully.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11042
2014-12-23 15:45:42 -08:00
epriestley
7d9bda59a6 Prevent worker queue leases from exceeding 64 characters
Summary:
Ref T6742. Root cause of the issue:

  - Daemon was running on a machine with a very long host name, which produced a lease name which was longer than 64 characters.
  - MySQL wasn't set in STRICT_ALL_TABLES.
  - The daemon would `UPDATE .. SET leaseOwner = <very long string>` to lock a task, and MySQL would silently truncate.
  - The daemon would then try to select the locked task, but fail, because there's no matching lease owner.

To resolve this, use only the first 32 characters of the hostname. See IRC for more discussion.

Test Plan: Will confirm with reporter.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6742

Differential Revision: https://secure.phabricator.com/D10998
2014-12-17 11:10:01 -08:00
epriestley
b5f7e9eec6 Reverse meaning of task priority column
Summary:
Ref T6615. Mixing ASC and DESC ordering on a multipart key makes it dramatically less effective (or perhaps totally ineffective).

Reverse the meaning of the `priority` column so it goes in the same direction as the `id` column (both ascending, lower values execute sooner).

Test Plan:
  - Queued 1.2M tasks with `bin/worker flood`.
  - Processed ~1 task/second with `bin/phd debug taskmaster` before patch.
  - Applied patch, took ~5 seconds for ~1.2M rows.
  - Processed ~100-200 tasks/second with `bin/phd debug taskmaster` after patch.
  - "Next in Queue" query on daemon page dropped from 1.5s to <1ms.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: aklapper, 20after4, epriestley

Maniphest Tasks: T6615

Differential Revision: https://secure.phabricator.com/D10895
2014-11-24 11:10:35 -08:00
epriestley
914b8bb32c Fix daemon task queue to respect task priority
Summary:
Fixes an issue with T5336 / D9871. We did 99% of the work here but didn't actually turn on the priority sorting. The unit test passed by default, which didn't catch this.

  - Fix the unit test (it failed).
  - Fix the query (test now passes).
  - Add a "Next in Queue" element to the UI to make this kind of thing easier to spot/understand.

Test Plan: Ran unit test. Viewed "Next in Queue". Queued some tasks, flushed the queue. Web UI tracked the state sensibly.

Reviewers: joshuaspence, btrahan

Reviewed By: btrahan

Subscribers: cburroughs, epriestley

Differential Revision: https://secure.phabricator.com/D10766
2014-10-31 09:27:04 -07:00
Joshua Spence
9a679bf374 Allow worker tasks to have priorities
Summary: Fixes T5336. Currently, `PhabricatorWorkerLeaseQuery` is basically FIFO. It makes more sense for the queue to be a priority-queue, and to assign higher priorities to alerts (email and SMS).

Test Plan: Created dummy tasks in the queue (with different priorities). Verified that the priority field was set correctly in the DB and that the priority was shown on the `/daemon/` page. Started a `PhabricatorTaskmasterDaemon` and verified that the higher priority tasks were executed before lower priority tasks.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: epriestley, Korvin

Maniphest Tasks: T5336

Differential Revision: https://secure.phabricator.com/D9871
2014-07-12 03:02:06 +10:00
Joshua Spence
8756d82cf6 Remove @group annotations
Summary: I'm pretty sure that `@group` annotations are useless now... see D9855. Also fixed various other minor issues.

Test Plan: Eye-ball it.

Reviewers: #blessed_reviewers, epriestley, chad

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9859
2014-07-10 08:12:48 +10:00
Joshua Spence
0a62f13464 Change double quotes to single quotes.
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.

Test Plan: Eyeballed it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9431
2014-06-09 11:36:50 -07:00
epriestley
cb545856a9 Make task queue more robust against long-running tasks
Summary:
See discussion in D8773. Three small adjustments which should help prevent this kind of issue:

  - When queueing followup tasks, hold them on the worker until we finish the task, then queue them only if the work was successful.
  - Increase the default lease time from 60 seconds to 2 hours. Although most tasks finish in far fewer than 60 seconds, the daemons are generally stable nowadays and these short leases don't serve much of a purpose. I think they also date from an era where lease expiry and failure were less clearly distinguished.
  - Increase the default wait-after-failure from 60 seconds to 5 minutes. This largely dates from the MetaMTA era, where Facebook ran services with high failure rates and it was appropriate to repeatedly hammer them until things went through. In modern infrastructure, such failures are rare.

Test Plan:
  - Verified that tasks queued properly after the main task was updated.
  - Verified that leases default to 7200 seconds.
  - Intentionally failed a task and verified default 300 second wait before retry.
  - Removed all default leases shorter than 7200 seconds (there was only one).
  - Checked all the wait before retry implementations for anything much shorter than 5 minutes (they all seem reasonable).

Reviewers: btrahan, sowedance

Reviewed By: sowedance

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D8774
2014-04-15 08:42:02 -07:00
epriestley
0ed281d25e Make taskmaster consumption of failed tasks more FIFO-ey
Summary:
Ref T1049. See discussion in D7745. We have some specific interest in this for D7745, but generally we want to consume tasks with expired leases in roughly FIFO order, just like we consume new tasks in roughly FIFO order. Currently, when we select an expired task we order them by `id`, but this is the original insert order, not lease expiration order. Instead, order by `leaseExpires`.

This query is actually much better than the old one was, since the WHERE part is `leaseExpries < VALUE`.

Test Plan: Ran `EXPLAIN` on the query. Ran a taskmaster in debug mode and saw it lease new and expired tasks successfully.

Reviewers: hach-que, btrahan

Reviewed By: hach-que

CC: aran

Maniphest Tasks: T1049

Differential Revision: https://secure.phabricator.com/D7746
2013-12-08 21:07:13 -08:00
epriestley
af321df3c0 Fix a bad query in LeaseQuery
Summary: Fixes T3610. This got un-scoped at some point, and is only used in Drydock so it escaped notice.

Test Plan: Ran `bin/drydock lease --type host` without hitting an exception about incorrect query construction.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T3610

Differential Revision: https://secure.phabricator.com/D6552
2013-07-24 11:32:03 -07:00
Jakub Vrana
3391e3d34b Use (a = ? AND b = ?) instead of (a, b) IN (?, ?)
Summary: MySQL is not able to use indexes with searching for tuples.

Test Plan:
Explained the query before and after, saw `key_len` 16 instead of 8.
Also saw time 0.0 s instead of 2.9 s (but that was probably caused by warming up).

Reviewers: epriestley

Reviewed By: epriestley

CC: aran, Korvin

Differential Revision: https://secure.phabricator.com/D5580
2013-04-05 23:02:06 -07:00
epriestley
22c64c67ff Fix performance problem for large task queues
Summary:
Some time ago, we added `ORDER BY id ASC` to the worker `UPDATE ...` query, because someone reported that their MySQL read slaves were complaining about the query (I can't find the exact error message, but something to the effect of the rows the query affected not being deterministic). This seemed harmless since it should be the same as the query's implicit order (I guess?), but actually made the query dramatically slower for large numbers of rows.

On my local machine, this query takes about 2 seconds with ~1M rows. If I run `SELECT`, or run `UPDATE` without ORDER BY, the query takes < 0.01s. I don't understand exactly what's happening -- my guess is something to do with the ORDER BY implying that a lot of rows need to be locked?

In T2372, a user is seeing 20-60s rumtimes on this query.

I solved this by doing a SELECT, followed by an UPDATE. Each query runs quickly. This introduces the possibility of a race (two processes SELECT the same rows, then try to UPDATE), which we currently recover from by having the second UPDATE fail and then having that daemon try again 1 second later. This seems generally reasonable. Some alternatives I considered:

  - We could SELECT ... LOCK FOR UPDATE, but failing and retrying a little later seems at least as good as blocking.
  - We could select more rows than we need, and then try to lock some of them randomly. I think this would work well, but it's a bit more complex than what we're doing now so I left it until we have a clearer need.

Test Plan:
Inserted ~1M tasks into the queue. Ran `phd debug taskmaster`, saw ~2s task updates. Applied patch. Ran `phd debug taskmaster`, saw <1ms updates. Ran `phd launch 8 taskmaster`, saw rapid completion of tasks.

This stuff also has fairly thorough unit test coverage.

Reviewers: vrana, btrahan

Reviewed By: vrana

CC: aran

Maniphest Tasks: T2372

Differential Revision: https://secure.phabricator.com/D4576
2013-01-22 12:27:18 -08:00
vrana
ef85f49adc Delete license headers from files
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).

We are removing the headers for these reasons:

- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.

This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).

Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.

Reviewers: epriestley, davidrecordon

Reviewed By: epriestley

CC: aran, Korvin

Maniphest Tasks: T2035

Differential Revision: https://secure.phabricator.com/D3886
2012-11-05 11:16:51 -08:00
epriestley
0364ccdd5f Fix an issue where PhabricatorWorkerLeaseQuery may lease different tasks than it selects
Summary:
We lock tasks by setting `leaseOwner` to a unique value, but the value is currently unique-to-the-process rather than unique-to-the-query. This means that if a process leases a task, then leases another task, both tasks will have the same `leaseOwner`. This can cause an issue where we go to select the task we just leased and get the other task instead, if we aren't careful about the select construction.

We can avoid this by being clever and making sure the select is constructed correctly, but making the `leaseOwner` unique to the query is much simpler and more foolproof. This guarantees we always select only the rows we just leased.

Also remove `PhabricatorGoodForNothingWorker` since `PhabricatorTestWorker` fills its role of allowing things to be tested, and simplify the unit tests since we don't need to be clever about avoiding this issue any more.

Test Plan: Ran unit tests.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D3862
2012-11-01 11:30:49 -07:00
epriestley
f0fdcf1a51 Undumb the Drydock resource allocator pipeline
Summary:
This was the major goal of D3859/D3855, and to a lesser degree D3854/D3852.

As Drydock is allocating a resource, it may need to allocate other resources first. For example, if it's allocating a working copy, it may need to allocate a host first.

Currently, we have the process basically queue up the allocation (insert a task into the queue) and sleep() until it finishes. This is problematic for a bunch of reasons, but the major one is that if allocation takes more resources (host, port, machine, DNS) than you have daemons, they could all end up sleeping and waiting for some other daemon to do their work. This is really stupid. Even if you only take up some of them, you're spending slots sleeping when you could be doing useful work.

To partially get around this and make the CLI experience less dumb, there's this goofy `synchronous` flag that gets passed around everywhere and pushes the workflow through a pile of special cases. Basically the `synchronous` flag causes us to do everything in-process. But this is dumb too because we'd rather do things in parallel if we can, and we have to have a lot of special case code to make it work at all.

Get rid of all of this. Instead of sleep()ing, try to work on the tasks that need to be worked on. If another daemon grabbed them already that's fine, but in the worst case we just gracefully degrade and do everything in process. So we get the best of both worlds: if we have parallelizable tasks and free daemons, things will execute in parallel. If we have nonparallelizable tasks or no free daemons, things will execute in process.

Test Plan: Ran `drydock_control.php --trace` and saw it perform cascading allocations without sleeping or special casing.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D3861
2012-11-01 11:30:42 -07:00
epriestley
84ee4cd9f6 Factor out task execution and formalize permanent failures
Summary:
  - Clean up a TODO about permanent failures.
  - Clean up a TODO about failing tasks after too many retries.
  - Clean up a TODO about testing for bad leases.
  - Make the lease/retry implementation more flexible and natural.
  - Make completely bogus tasks fail permanently.
  - Make PhabricatorMetaMTAWorker use new `getWaitBeforeRetry()` (as intended), not hackily implement logic in `getRequiredLeaseTime()`.
  - Document worker hooks for failures and retries.
  - Provide coverage on everything.

Test Plan: Ran unit tests. Ran `bin/phd debug taskmaster`.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D3859
2012-11-01 11:30:23 -07:00
epriestley
88fad90c1c Move task leasing to a dedicated query
Summary: This simplifies the fairly thorny logic of leasing tasks a bit. I'm planning to introduce another callsite shortly for Drydock.

Test Plan: Ran `bin/phd debug taskmaster`, observed sensible queries and correct operation.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D3855
2012-11-01 11:30:16 -07:00