1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-27 17:22:42 +01:00
Commit graph

82 commits

Author SHA1 Message Date
epriestley
6b86f81fe4 Increase the visibility of permanent task failures in task queue
Make permanent failures always reach the log.
Make `bin/worker execute` report exceptions properly.
2015-03-15 13:27:05 -07:00
epriestley
a3518e19a5 Merge GC daemon into Trigger daemon
Summary:
Fixes T7352. This reduces the memory footprint for instances by combining these two similar daemons into one daemon which handles the responsibilities of both.

The fit isn't 100% perfect here but it's pretty close, and the GC daemon is fairly trivial.

Test Plan:
  - Adjusted all the numbers to small numbers (5 second sleep, 120 second GC length).
  - Added a ton of logging.
  - Started trigger daemon.
    - Saw it run a GC cycle.
    - Saw it reschedule another cycle after 120 seconds (adjusted down from 4 hours).
  - Reverted all the logging/small numbers.
  - Ran `bin/phd start`, saw stable trigger daemon running.
  - Grepped for removed daemon class name.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7352

Differential Revision: https://secure.phabricator.com/D11872
2015-02-24 14:50:39 -08:00
epriestley
af303f458b Convert taskmasters to use an autoscale pool
Summary: Ref T7352. This is pretty straightforward. I renamed `phd.start-taskmasters` to `phd.taskmasters` for clarity.

Test Plan:
  - Ran `phd start`, `phd start --autoscale-reserve 0.25`, `phd restart --autoscale-reserve 0.25`, etc.
  - Examined PID file to see options were passed.
  - I'm defaulting this off (0 reserve) and making it a flag rather than an option because it's a very advanced feature which is probably not useful outside of instancing.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7352

Differential Revision: https://secure.phabricator.com/D11871
2015-02-24 14:50:38 -08:00
epriestley
48fc3126a1 Support autoscaling daemons in phd
Summary: Ref T7352. This supports passing autoscaling configuration to daemons, and adds `debug --autoscale`.

Test Plan: See D11711.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7352

Differential Revision: https://secure.phabricator.com/D11860
2015-02-24 14:50:34 -08:00
epriestley
2cd77b5b58 Improve taskmaster behavior on empty queues
Summary:
Right now, taskmasters on empty queues sleep for 30 seconds. With a default setup (4 taskmasters), this averages out to 7.5 seconds between the time you do anything that queues something and the time that the taskmasters start work on it.

On instances, which currently launch a smaller number of taskmasters, this wait is even longer.

Instead, sleep for the number of seconds that there are taskmasters, with a random offset. This makes the average wait to start a task from an empty queue 1 second, and the average maximum load of an empty queue also one query per second.

On loaded instances this doesn't matter, but this should dramatically improve behavior for less-loaded instances without any real tradeoffs.

Test Plan: Started several taskmasters, saw them jitter out of sync and then use short sleeps to give an empty queue about a 1s delay.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D11772
2015-02-16 11:30:49 -08:00
epriestley
6f90fbdef8 Send emails for email invites
Summary:
Ref T7152. Ref T3554.

  - When an administrator clicks "send invites", queue tasks to send the invites.
  - Then, actually send the invites.
  - Make the links in the invites work properly.
  - Also provide `bin/worker execute` to make debugging one-off workers like this easier.
  - Clean up some UI, too.

Test Plan:
We now get as far as the exception which is a placeholder for a registration workflow.

{F291213}

{F291214}

{F291215}

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T3554, T7152

Differential Revision: https://secure.phabricator.com/D11736
2015-02-11 06:06:09 -08:00
epriestley
d804598f17 Add some of a billing daemon skeleton
Summary:
Ref T6881. This adds the worker, and a script to make it easier to test. It doesn't actually invoice anything.

I'm intentionally allowing the script to double-bill since it makes testing way easier (by letting you bill the same period over and over again), and provides a tool for recovery if billing screws up.

(This diff isn't very interesting, just trying to avoid a 5K-line diff at the end.)

Test Plan: Used `bin/phortune invoice ...` to get the worker to print out some date ranges which it would theoretically invoice.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11577
2015-01-30 11:29:05 -08:00
epriestley
ed2a5a9a34 Fix PhabricatorWorkerTriggerQuery method visibility
Summary: I got these wrong and the test didn't trigger for some reason that I haven't looked into.

Test Plan: `arc unit --everything`

Reviewers: hach-que, btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D11453
2015-01-22 16:10:08 -08:00
epriestley
77bcbed9f9 Implement PolicyAwareQuery for triggers
Summary:
Ref T6881. I tried to cheat here by not implementing this, but we need it for destroying triggers directly with `bin/remove destroy`, since that needs to load them by PHID.

So, cheat slightly less. Implement PolicyAware but not CursorPagedPolicyAware.

Test Plan:
  - Used `bin/remove destroy` to destroy a trigger by PHID.
  - Browsed daemon console.
  - Ran trigger daemon.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11445
2015-01-20 13:32:43 -08:00
epriestley
934df0e735 Add bin/trigger, for testing event triggers
Summary:
Ref T6881. This makes it easier to fire a trigger and make sure it works properly. You can use the `--now` flag to travel through time, and test scheduling conditions with `--last` and `--next`. It will tell you when the trigger would reschedule.

Better than waiting 24 hours to see if things work.

Test Plan: Fired some backups, got useful output which made me think my code probably works correctly.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11438
2015-01-20 11:31:32 -08:00
epriestley
02eca684ae Add a call to predict the next event for a trigger
Summary: Ref T6881. This is useful to show a "Next backup: 2:30 AM" sort of thing without requring callers to know how triggers work internally.

Test Plan: Showed that kind of thing in Instances.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11437
2015-01-19 16:56:03 -08:00
epriestley
ef106d2979 Add order-by-ID to PhabricatorWorkerTriggerQuery
Summary:
Ref T6881. By design, the EXECUTION order only selects tasks which have been scheduled (since it performs a JOIN). This is inconsistent with other queries and problematic for withID/withPHID queries which may want to select an unscheduled task.

Switch to standard ID ordering by default.

Test Plan:
  - Instances console now finds unscheduled triggers.
  - Verified that all existing queries specify an explicit order.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11436
2015-01-19 16:55:52 -08:00
epriestley
cccdc48883 Implement PhabricatorDestructibleInterface for event triggers
Summary: Ref T6881. When stuff with triggers is destroyed, it should destroy the triggers.

Test Plan: Will test in Instances.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11435
2015-01-19 16:55:38 -08:00
epriestley
7cbbd7868f Add a "schedule task" trigger action
Summary: Ref T6881. Add a standard "just queue a task" trigger action; I expect almost all application code to use this.

Test Plan: Will test in Instances.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11429
2015-01-19 16:55:23 -08:00
epriestley
3860c56e85 Allow querying triggers by ID/PHID
Summary: Ref T6881. I just want to show trigger info in the instance management console.

Test Plan: Will test in Instances.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11428
2015-01-19 16:55:08 -08:00
epriestley
a988a1a043 Add a "daily routine" trigger clock for backups, etc.
Summary: Ref T6881. Before implementing subscriptions, I'm going to vet triggers by using them to do backups. Each instance will get a daily trigger for backups, and that should give us a smaller-scale test to catch issues and limitations, with more opportunities for something to go wrong since it fires more often.

Test Plan: Added unit tests.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11427
2015-01-19 16:54:23 -08:00
epriestley
19be32656f Implement clock/trigger infrastructure for scheduling actions
Summary:
Ref T6881. Hopefully, this is the hard part.

This adds a new daemon (the "trigger" daemon) which processes triggers, schedules them, and then executes them at the scheduled time. The design is a little complicated, but has these goals:

  - High resistance to race conditions: only the application writes to the trigger table; only the daemon writes to the event table. We won't lose events if someone saves a meeting at the same time as we're sending a reminder out for it.
  - Execution guarantees: scheduled events are guaranteed to execute exactly once.
  - Support for arbitrarily large queues: the daemon will make progress even if there are millions of triggers in queue. The cost to update the queue is proportional to the number of changes in it; the cost to process the queue is proportional to the number of events to execute.
  - Relatively good observability: you can monitor the state of the trigger queue reasonably well from the web UI.
  - Modular Infrastructure: this is a very low-level construct that Calendar, Phortune, etc., should be able to build on top of.

It doesn't have this stuff yet:

  - Not very robust to bad actions: a misbehaving trigger can stop the queue fairly easily. This is OK for now since we aren't planning to make it part of any other applications for a while. We do still get execute-exaclty-once, but it might not happen for a long time (until someone goes and fixes the queue), when we could theoretically continue executing other events.
  - Doesn't start automatically: normal users don't need to run this thing yet so I'm not starting it by default.
  - Not super well tested: I've vetted the basics but haven't run real workloads through this yet.
  - No sophisticated tooling: I added some basic stuff but it's missing some pieces we'll have to build sooner or later, e.g. `bin/trigger cancel` or whatever.
  - Intentionally not realtime: This design puts execution guarantees far above realtime concerns, and will not give you precise event execution at 1-second resolution. I think this is the correct goal to pursue architecturally, and certainly correct for subscriptions and meeting reminders. Events which execute after they have become irrelevant can simply decline to do anything (like a meeting reminder which executes after the meeting is over).

In general, the expectation for applications is:

  - When creating an object (like a calendar event) that needs to trigger a scheduled action, write a trigger (and save the PHID if you plan to update it later).
  - The daemon will process the event and schedule the action efficiently, in a race-free way.
  - If you want to move the action, update the trigger and the daemon will take care of it.
  - Your action will eventually dump a task into the task queue, and the task daemons will actually perform it.

Test Plan:
Using a test script like this:

```
<?php

require_once 'scripts/__init_script__.php';

$trigger = id(new PhabricatorWorkerTrigger())
  ->setAction(
    new PhabricatorLogTriggerAction(
      array(
        'message' => 'test',
      )))
  ->setClock(
    new PhabricatorMetronomicTriggerClock(
      array(
        'period' => 33,
      )))
  ->save();

var_dump($trigger);
```

...I queued triggers and ran the daemon:

  - Verified triggers fire;
  - verified triggers reschedule;
  - verified trigger events show up in the web UI;
  - tried different periods;
  - added some triggers while the daemon was running;
  - examined `phd debug` output for anything suspicious.

It seems to work in trivial use case, at least.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11419
2015-01-16 12:13:31 -08:00
epriestley
66975fa51b Implement "trigger clocks" for scheduling events
Summary:
Ref T6881. This will probably make more sense in a couple of diffs, but this is a class that implements scheduling/recurrence rules. Two rules are provided:

  - Trigger an event at a specific time (e.g., a meeting reminder notification).
  - Trigger an event on the Nth day of every month (e.g., a subscription bill).

At some point, we'll presumably add a rule for T2896 (maybe using the "RRULE" spec) so you can do stuff like "the second to last thursday of every month", etc., but we don't need that for now.

(The "Nth day of every month, or move it back if no such day exists" rule doesn't seem to be expressible with the "RRULE" format, so implementing that wouldn't give us a superset of this. I think this rule is correct and desirable for this purpose, though.)

Test Plan: Added and executed unit tests.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11403
2015-01-15 15:57:45 -08:00
epriestley
b9788fed00 Recover more cleanly from worker tasks with unconstructable classes
Summary:
This is unusual, but if `getWorkerInstance()` throws we end up with an undefined `$worker` when recovering from the exception.

Instead, handle this case slightly more gracefully.

The easiest way to hit this is to schedule a task for a worker that doesn't exist (or remove an existing worker, which is what I did to hit it).

Test Plan: Saw a more graceful error recovery; ran some normal successful tasks out of the queue.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D11413
2015-01-15 15:57:02 -08:00
Joshua Spence
daadf95537 Fix visibility of PhutilArgumentWorkflow::didConstruct methods
Summary: Ref T6822.

Test Plan: `grep`. This method is only called from within `PhutilArgumentWorkflow::__construct`.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11415
2015-01-16 07:42:07 +11:00
Joshua Spence
62dfcd1e55 Fix the visibility of PhutilDaemon::run methods
Summary: Ref T6822. This method is only called from `PhutilDaemon::execute()` and can be made `protected`.

Test Plan: See D11404.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11405
2015-01-16 06:59:29 +11:00
Joshua Spence
d6b882a804 Fix visiblity of LiskDAO::getConfiguration()
Summary: Ref T6822.

Test Plan: `grep`

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: hach-que, Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11370
2015-01-14 06:54:13 +11:00
epriestley
8c4f3edd8a Skip some repository checks in cluster enviornments
Summary:
Ref T2783. Currently, the repository edit page does some checks agaisnt the local system to look for binaries and files on disk. These checks don't make sense in a cluster environment.

Ideally, we could make a Conduit call to the host (e.g., add something like `diffusion.querysetupstatus`) to do these checks, but since they're pretty basic config things and cluster installs are advanced, it doesn't seem super worthwhile for now.

Test Plan: Saw fewer checks in a cluster repo.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T2783

Differential Revision: https://secure.phabricator.com/D11102
2014-12-31 11:50:35 -08:00
epriestley
ba4ebf28ad Allow archived tasks to be queried by object PHID and order by id
Summary: Ref T5402.

Test Plan:
  - Queried archived tasks.
  - Grepped for use sites and verified no other callsites are order-sensitive.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11089
2014-12-30 15:54:56 -08:00
Bob Trahan
9219645287 Daemons - add "objectPHID" to task tables.
Summary: Ref T5402. This more or less "fixes" it but there's probably some polish to do?

Test Plan:
stopped and started daemons. error logs look good.

ran bin/storage upgrade.  noted that `adjust` added the appropriate indices for active and archive task.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11044
2014-12-23 16:30:05 -08:00
Bob Trahan
a4474a4975 Daemons - introduce PhabricatorWorkerArchiveTaskQuery
Summary: Ref T5402. This cleans up some code and sets us up to use this sort of data more easily later.

Test Plan: viewed the daemon console from the web and the log of a specific archived daemon. both looked good. for other callsites looked really, really carefully.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T5402

Differential Revision: https://secure.phabricator.com/D11042
2014-12-23 15:45:42 -08:00
epriestley
7d9bda59a6 Prevent worker queue leases from exceeding 64 characters
Summary:
Ref T6742. Root cause of the issue:

  - Daemon was running on a machine with a very long host name, which produced a lease name which was longer than 64 characters.
  - MySQL wasn't set in STRICT_ALL_TABLES.
  - The daemon would `UPDATE .. SET leaseOwner = <very long string>` to lock a task, and MySQL would silently truncate.
  - The daemon would then try to select the locked task, but fail, because there's no matching lease owner.

To resolve this, use only the first 32 characters of the hostname. See IRC for more discussion.

Test Plan: Will confirm with reporter.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6742

Differential Revision: https://secure.phabricator.com/D10998
2014-12-17 11:10:01 -08:00
epriestley
139c63bd84 When a worker task fails permanently, log the reason
Summary: Ref T6238. This makes debugging permanent task failures easier (we log reasoning for temporary failures already, just not permanent ones).

Test Plan: Saw more useful permanent failure log.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6238

Differential Revision: https://secure.phabricator.com/D10945
2014-12-08 11:27:10 -08:00
epriestley
9a7383121d Move cancel/retry/free task queue actions to bin/worker
Summary:
Fixes T6702. Ref T3554. Currently, tasks can be cancelled, retried and freed from the web UI by any logged in user.

This isn't appreciably dangerous (I can't come up with a way that a user could do anything security-affecting), but I think I probably intended this to be admin-only, but these actions should move to the CLI anyway.

Move them to the CLI. Lay some groundwork for some future `bin/worker cancel --class SomeTaskClass`, but don't implement that yet.

Test Plan: Used `cancel`, `retry` and `free` from the CLI. Hit all the error/success states.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T3554, T6702

Differential Revision: https://secure.phabricator.com/D10939
2014-12-06 09:14:16 -08:00
epriestley
b5f7e9eec6 Reverse meaning of task priority column
Summary:
Ref T6615. Mixing ASC and DESC ordering on a multipart key makes it dramatically less effective (or perhaps totally ineffective).

Reverse the meaning of the `priority` column so it goes in the same direction as the `id` column (both ascending, lower values execute sooner).

Test Plan:
  - Queued 1.2M tasks with `bin/worker flood`.
  - Processed ~1 task/second with `bin/phd debug taskmaster` before patch.
  - Applied patch, took ~5 seconds for ~1.2M rows.
  - Processed ~100-200 tasks/second with `bin/phd debug taskmaster` after patch.
  - "Next in Queue" query on daemon page dropped from 1.5s to <1ms.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: aklapper, 20after4, epriestley

Maniphest Tasks: T6615

Differential Revision: https://secure.phabricator.com/D10895
2014-11-24 11:10:35 -08:00
epriestley
7e1c312183 Add bin/worker flood, for flooding the task queue with work
Summary: Ref T6615. Ref T3554. We need better tooling around the queue eventually, so start here.

Test Plan: Added 100K+ tasks locally with `bin/worker flood`. Executed some of them with `bin/phd debug taskmaster` (we already have a TestWorker, used in unit tests).

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T3554, T6615

Differential Revision: https://secure.phabricator.com/D10894
2014-11-24 11:10:15 -08:00
epriestley
914b8bb32c Fix daemon task queue to respect task priority
Summary:
Fixes an issue with T5336 / D9871. We did 99% of the work here but didn't actually turn on the priority sorting. The unit test passed by default, which didn't catch this.

  - Fix the unit test (it failed).
  - Fix the query (test now passes).
  - Add a "Next in Queue" element to the UI to make this kind of thing easier to spot/understand.

Test Plan: Ran unit test. Viewed "Next in Queue". Queued some tasks, flushed the queue. Web UI tracked the state sensibly.

Reviewers: joshuaspence, btrahan

Reviewed By: btrahan

Subscribers: cburroughs, epriestley

Differential Revision: https://secure.phabricator.com/D10766
2014-10-31 09:27:04 -07:00
epriestley
8fa8415c07 Automatically build all Lisk schemata
Summary:
Ref T1191. Now that the whole database is covered, we don't need to do as much work to build expected schemata. Doing them database-by-database was helpful in converting, but is just reudndant work now.

Instead of requiring every application to build its Lisk objects, just build all Lisk objects.

I removed `harbormaster.lisk_counter` because it is unused.

It would be nice to autogenerate edge schemata, too, but that's a little trickier.

Test Plan: Database setup issues are all green.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley, hach-que

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10620
2014-10-02 09:51:20 -07:00
epriestley
300172e799 Support AUTO_INCREMENT in bin/storage adjust
Summary:
Ref T1191. When changing the column type of an AUTO_INCREMENT column, we currently may lose the autoincrement attribute.

Instead, support it. This is a bit messy because AUTO_INCREMENT columns interact with PRIMARY KEY columns (tables may only have one AUTO_INCREMENT column, and it must be a primary key). We need to migrate in more phases to avoid this issue.

Introduce new `auto` and `auto64` types to represent autoincrement IDs.

Test Plan:
  - Saw autoincrement show up correctly in web UI.
  - Fixed an autoincrement issue on the XHProf storage table with `bin/storage adjust` safely.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10607
2014-10-01 08:24:51 -07:00
epriestley
4fcc634a99 Fix almost all remaining schemata issues
Summary:
Ref T1191. This fixes nearly every remaining blocker for utf8mb4 -- primarily, overlong keys.

Remaining issue is https://secure.phabricator.com/T1191#77467

Test Plan: I'll annotate inline.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley, hach-que

Maniphest Tasks: T6099, T6129, T6133, T6134, T6150, T6148, T6147, T6146, T6105, T1191

Differential Revision: https://secure.phabricator.com/D10601
2014-10-01 08:18:36 -07:00
epriestley
03519c53bb Mark questionable column nullability for later
Summary:
Ref T1191. Ref T6203. While generating expected schemata, I ran into these columns which seem to have sketchy nullability.

  - Mark most of them for later resolution (T6203). They work fine today and don't need to block T1191. Changing them can break the application, so we can't autofix them.
  - Forgive a couple of them that are sort-of reasonable or going to get wiped out.

Test Plan: Saw 94 remaining warnings.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: hach-que, epriestley

Maniphest Tasks: T1191, T6203

Differential Revision: https://secure.phabricator.com/D10593
2014-10-01 07:59:44 -07:00
epriestley
098d0d93d6 Generate expected schemata for User/People tables
Summary:
Ref T1191. Some notes here:

  - Drops the old LDAP and OAuth info tables. These were migrated to the ExternalAccount table a very long time ago.
  - Separates surplus/missing keys from other types of surplus/missing things. In the long run, my plan is to have only two notice levels:
    - Error: something we can't fix (missing database, table, or column; overlong key).
    - Warning: something we can fix (surplus anything, missing key, bad column type, bad key columns, bad uniqueness, bad collation or charset).
    - For now, retaining three levels is helpful in generating all the expected scheamta.

Test Plan:
  - Saw ~200 issues resolve, leaving ~1,300.
  - Grepped for removed tables.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10580
2014-10-01 07:36:47 -07:00
epriestley
7499cb24ce Generate expected schemata for Workers, XHProf, PHPAAST, Tokens, System, Slowvote
Summary: T1191. Nothing very notable here.

Test Plan: Saw more blue in web UI.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D10522
2014-09-19 05:45:24 -07:00
Joshua Spence
0151c38b10 Apply some autofix linter rules
Summary: Self-explanatory.

Test Plan: Eyeball it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin

Differential Revision: https://secure.phabricator.com/D10454
2014-09-10 06:55:05 +10:00
epriestley
9309723ac4 Send graceful shutdown signals to daemons in Phabricator
Summary:
Fixes T5855. Adds a `--graceful N` flag to `phd stop` and `phd restart`.

`phd` will send SIGINT, wait `N` seconds, SIGTERM, wait 15 seconds, and SIGKILL. By default, `N` is 15.

Test Plan:
  - Ran `bin/phd debug ...` and used `^C` to interrupt daemons. Saw graceful shutdown behavior, and abrupt termination on multiple `^C`.
  - Ran `bin/phd start`, `bin/phd stop` and `bin/phd restart` with `--graceful` set to various things, notably `0`. Saw graceful shutdowns on the CLI and in the web UI. With `0`, abrupt shutdowns.

Reviewers: btrahan, hach-que

Reviewed By: hach-que

Subscribers: epriestley

Maniphest Tasks: T5855

Differential Revision: https://secure.phabricator.com/D10228
2014-08-11 20:18:31 -07:00
Joshua Spence
9a679bf374 Allow worker tasks to have priorities
Summary: Fixes T5336. Currently, `PhabricatorWorkerLeaseQuery` is basically FIFO. It makes more sense for the queue to be a priority-queue, and to assign higher priorities to alerts (email and SMS).

Test Plan: Created dummy tasks in the queue (with different priorities). Verified that the priority field was set correctly in the DB and that the priority was shown on the `/daemon/` page. Started a `PhabricatorTaskmasterDaemon` and verified that the higher priority tasks were executed before lower priority tasks.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: epriestley, Korvin

Maniphest Tasks: T5336

Differential Revision: https://secure.phabricator.com/D9871
2014-07-12 03:02:06 +10:00
Joshua Spence
8756d82cf6 Remove @group annotations
Summary: I'm pretty sure that `@group` annotations are useless now... see D9855. Also fixed various other minor issues.

Test Plan: Eye-ball it.

Reviewers: #blessed_reviewers, epriestley, chad

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9859
2014-07-10 08:12:48 +10:00
Joshua Spence
0a62f13464 Change double quotes to single quotes.
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.

Test Plan: Eyeballed it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9431
2014-06-09 11:36:50 -07:00
Bob Trahan
61dd5ab6c1 Worker - supporting running queued tasks in process
Summary: Ref D8930. My "send test" for SMS was failing before this patch, and now it works nicely.

Test Plan: Used new code in D8930 that uses $this->queueTask() to get some work done and it got done in process

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: epriestley, Korvin

Differential Revision: https://secure.phabricator.com/D9018
2014-05-08 13:14:58 -07:00
epriestley
4a6d2e9c97 Allow tasks to yield to other tasks
Summary:
For Harbormaster tasks which want to poll or wait, this lets them say "try again a little later" without having to sleep and hold a queue slot.

This is basically the same as failing, except that we don't increment the failure counter. Instead, we just set the current lease to the correct length and then exit. The task will be retried after the lease expires.

Test Plan: Using both `bin/harbormaster` and `phd debug taskmaster`, ran a lot of waiting tasks through the queue, faking them to either yield or not yield in a controlled manner. The queue responded as expected, yielding tasks appropraitely and retrying them later.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D8792
2014-04-16 13:02:12 -07:00
epriestley
cb545856a9 Make task queue more robust against long-running tasks
Summary:
See discussion in D8773. Three small adjustments which should help prevent this kind of issue:

  - When queueing followup tasks, hold them on the worker until we finish the task, then queue them only if the work was successful.
  - Increase the default lease time from 60 seconds to 2 hours. Although most tasks finish in far fewer than 60 seconds, the daemons are generally stable nowadays and these short leases don't serve much of a purpose. I think they also date from an era where lease expiry and failure were less clearly distinguished.
  - Increase the default wait-after-failure from 60 seconds to 5 minutes. This largely dates from the MetaMTA era, where Facebook ran services with high failure rates and it was appropriate to repeatedly hammer them until things went through. In modern infrastructure, such failures are rare.

Test Plan:
  - Verified that tasks queued properly after the main task was updated.
  - Verified that leases default to 7200 seconds.
  - Intentionally failed a task and verified default 300 second wait before retry.
  - Removed all default leases shorter than 7200 seconds (there was only one).
  - Checked all the wait before retry implementations for anything much shorter than 5 minutes (they all seem reasonable).

Reviewers: btrahan, sowedance

Reviewed By: sowedance

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D8774
2014-04-15 08:42:02 -07:00
Joshua Spence
e11adc4ad7 Added some additional assertion methods.
Summary:
There are quite a few tests in Arcanist, libphutil and Phabricator that do something similar to `$this->assertEqual(false, ...)` or `$this->assertEqual(true, ...)`.

This is unnecessarily verbose and it would be cleaner if we had `assertFalse` and `assertTrue` methods.

Test Plan: I contemplated adding a unit test for the `getCallerInfo` method but wasn't sure if it was required / where it should live.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley

CC: Korvin, epriestley, aran

Differential Revision: https://secure.phabricator.com/D8460
2014-03-08 19:16:21 -08:00
epriestley
1650874004 Modernize Drydock CLI management of task execution
Summary:
Ref T2015. Currently, Drydock has a `wait-for-lease` workflow which is invoked in the background by the `lease` workflow.

The goal of this mechanism is to allow `bin/drydock lease` to print out logs as the lease is acquired. However, this predates the `runAllTasksInProcess` flags, and they provide a simpler and more robust way (potentially with `--trace` and `PhutilConsole`) to do synchronous execution and debug logging.

Simplify this whole mechanism: just run everything in-process in `bin/drydock lease`, and do logging via `--trace`. We could thread a `PhutilConsole` through things too, but this seems good enough for now.

Also various cleanup/etc.

Test Plan: Ran `bin/drydock lease`. Ran `bin/harbormaster build X --plan Y`, for `Y` being a Drydock-dependent build plan.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D7835
2013-12-27 13:15:12 -08:00
epriestley
c462713584 Minor cleanup for task rendering in Daemons
Summary:
Fixes two issues:

  - When rendering a task's details, we currently issue a policy-oblivious query. Instead, issue a policy-aware query.
  - The formatting is a little bit weird, with the top half in a box and the bottom half with an older style. Make them consistent.

Test Plan: Looked at the detail pages for several tasks in queue.

Reviewers: btrahan, chad

Reviewed By: chad

CC: aran

Differential Revision: https://secure.phabricator.com/D7812
2013-12-20 18:02:32 -08:00
epriestley
0ed281d25e Make taskmaster consumption of failed tasks more FIFO-ey
Summary:
Ref T1049. See discussion in D7745. We have some specific interest in this for D7745, but generally we want to consume tasks with expired leases in roughly FIFO order, just like we consume new tasks in roughly FIFO order. Currently, when we select an expired task we order them by `id`, but this is the original insert order, not lease expiration order. Instead, order by `leaseExpires`.

This query is actually much better than the old one was, since the WHERE part is `leaseExpries < VALUE`.

Test Plan: Ran `EXPLAIN` on the query. Ran a taskmaster in debug mode and saw it lease new and expired tasks successfully.

Reviewers: hach-que, btrahan

Reviewed By: hach-que

CC: aran

Maniphest Tasks: T1049

Differential Revision: https://secure.phabricator.com/D7746
2013-12-08 21:07:13 -08:00