phorge-phorge

mirror of https://we.phorge.it/source/phorge.git synced 2024-11-30 02:32:42 +01:00

Author	SHA1	Message	Date
epriestley	4b10bc2b64	Correct schema irregularities (including weird keys) with worker task tables Summary: Ref T13253. Fixes T6615. See that task for discussion. - Remove three keys which serve no real purpose: `dataID` doesn't do anything for us, and the two `leaseOwner` keys are unused. - Rename `leaseOwner_2` to `key_owner`. - Fix an issue where `dataID` was nullable in the active table and non-nullable in the archive table. In practice, //all// workers have data, so all workers have a `dataID`: if they didn't, we'd already fatal when trying to move tasks to the archive table. Just clean this up for consistency, and remove the ancient codepath which imagined tasks with no data. Test Plan: - Ran `bin/storage upgrade`, inspected tables. - Ran `bin/phd debug taskmaster`, worked through a bunch of tasks with no problems. Reviewers: amckinley Reviewed By: amckinley Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Maniphest Tasks: T13253, T6615 Differential Revision: https://secure.phabricator.com/D20175	2019-02-15 19:17:33 -08:00
epriestley	454a762562	Queue search indexing tasks at a new PRIORITY_INDEX, not PRIORITY_IMPORT Summary: Depends on D20175. Ref T12425. Ref T13253. Currently, importing commits can stall search index rebuilds, since index rebuilds use an older priority from before T11677 and weren't really updated for D16585. In general, we'd like to complete all indexing tasks before continuing repository imports. A possible exception is if you rebuild an entire index with `bin/search index --rebuild-the-world`, but we could queue those at a separate lower priority if issues arise. Test Plan: Ran some search indexing through the queue. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13253, T12425 Differential Revision: https://secure.phabricator.com/D20177	2019-02-15 14:16:28 -08:00
epriestley	11cf8f05b1	Remove "getApplicationTransactionObject()" from ApplicationTransactionInterface Summary: Depends on D19919. Ref T11351. This method appeared in D8802 (note that "get...Object" was renamed to "get...Transaction" there, so this method was actually "new" even though a method of the same name had existed before). The goal at the time was to let Harbormaster post build results to Diffs and have them end up on Revisions, but this eventually got a better implementation (see below) where the Harbormaster-specific code can just specify a "publishable object" where build results should go. The new `get...Object` semantics ultimately broke some stuff, and the actual implementation in Differential was removed in D10911, so this method hasn't really served a purpose since December 2014. I think that broke the Harbormaster thing by accident and we just lived with it for a bit, then Harbormaster got some more work and D17139 introduced "publishable" objects which was a better approach. This was later refined by D19281. So: the original problem (sending build results to the right place) has a good solution now, this method hasn't done anything for 4 years, and it was probably a bad idea in the first place since it's pretty weird/surprising/fragile. Note that `Comment` objects still have an unrelated method with the same name. In that case, the method ties the `Comment` storage object to the related `Transaction` storage object. Test Plan: Grepped for `getApplicationTransactionObject`, verified that all remaining callsites are related to `Comment` objects. Reviewers: amckinley Reviewed By: amckinley Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Maniphest Tasks: T11351 Differential Revision: https://secure.phabricator.com/D19920	2018-12-20 15:16:19 -08:00
epriestley	937e88c399	Remove obsolete, no-op implementations of "willRenderTimeline()" Summary: Depends on D19918. Ref T11351. In D19918, I removed all calls to this method. Now, remove all implementations. All of these implementations just `return $timeline`, only the three sites in D19918 did anything interesting. Test Plan: Used `grep willRenderTimeline` to find callsites, found none. Reviewers: amckinley Reviewed By: amckinley Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Maniphest Tasks: T11351 Differential Revision: https://secure.phabricator.com/D19919	2018-12-20 15:04:49 -08:00
epriestley	b2e91d2205	Move the "container updated" message for Buildables that build Diffs outside of the transaction Summary: Ref T13216. See PHI970. Ref T13054. See some discussion in T13216. When a Harbormaster Buildable object is first created for a Diff, it has no `containerPHID` since the revision has not yet been created. We later (after creating a revision) send the Buildable a message telling it that we've added a container and it should re-link the container object. Currently, we send this message in `applyExternalEffects()`, which runs inside the Differential transaction. If Harbormaster races quickly enough, it can read the `Diff` object before the transaction commits, and not see the container update. Add a `didCommitTransaction()` callback after the transactions commit, then move the message code there instead. Test Plan: - See T13216 for substantial evidence that this change is on the right track. - Before change: added `sleep(15)`, reproduced the issue reliably. - After change: unable to reproduce issue even with `sleep(15)` (the `containerPHID` always populates correctly). Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13216, T13054 Differential Revision: https://secure.phabricator.com/D19807	2018-11-16 12:34:06 -08:00
epriestley	98690ee326	Update many Phabricator queries for new %Q query semantics Summary: Depends on D19785. Ref T13217. This converts many of the most common clause construction pathways to the new %Q / %LQ / %LO / %LA / %LJ semantics. Test Plan: Browsed around a bunch, saw fewer warnings and no obvious behavioral errors. The transformations here are generally mechanical (although I did them by hand). Reviewers: amckinley Reviewed By: amckinley Subscribers: hach-que Maniphest Tasks: T13217 Differential Revision: https://secure.phabricator.com/D19789	2018-11-15 03:48:10 -08:00
epriestley	c32fa06266	Use phutil_microseconds_since(...) to simplify some timing arithmetic Summary: Depends on D19796. Simplify some timing code by using phutil_microseconds_since() instead of duplicate casting and arithmetic. Test Plan: Grepped for `1000000` to find these. Pulled, pushed, made a conduit call. This isn't exhaustive but it should be hard for these to break in a bad way since they're all just diagnostic. Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D19797	2018-11-08 16:46:32 -08:00
epriestley	44f0664d2c	Add a "lock log" for debugging where locks are being held Summary: Depends on D19173. Ref T13096. Adds an optional, disabled-by-default lock log to make it easier to figure out what is acquiring and holding locks. Test Plan: Ran `bin/lock log --enable`, `--disable`, `--name`, etc. Saw sensible-looking output with log enabled and daemons restarted. Saw no additional output with log disabled and daemons restarted. Maniphest Tasks: T13096 Differential Revision: https://secure.phabricator.com/D19174	2018-03-05 17:55:34 -08:00
epriestley	84df122085	When exporting more than 1,000 records, export in the background Summary: Depends on D18961. Ref T13049. Currently, longer exports don't give the user any feedback, and exports that take longer than 30 seconds are likely to timeout. For small exports (up to 1,000 rows) continue doing the export in the web process. For large exports, queue a bulk job and do them in the workers instead. This sends the user through the bulk operation UI and is similar to bulk edits. It's a little clunky for now, but you get your data at the end, which is far better than hanging for 30 seconds and then fataling. Test Plan: Exported small result sets, got the same workflow as before. Exported very large result sets, went through the bulk flow, got reasonable results out. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18962	2018-01-29 16:08:02 -08:00
epriestley	6d2d1d3a97	Add `bin/garbage compact-edges` to compact edges into the new format Summary: Depends on D18947. Ref T13051. This goes through transaction tables and compacts the edge storage into the slim format. I put this on `bin/garbage` instead of `bin/storage` because `bin/storage` has a lot of weird stuff about how it manages databases so that it can run before configuration (e.g., all the `--user`, `--password` type flags for configuring DB connections). Test Plan: Loaded an object with a bunch of transactions. Ran migration. Spot checked table for sanity. Loaded another copy of the object in the web UI, compared the two pages, saw no user-visible changes. Here's a concrete example of the migration effect -- old row: ``` ************************* 44. row *********************** id: 757 phid: PHID-XACT-PSTE-5gnaaway2vnyen5 authorPHID: PHID-USER-cvfydnwadpdj7vdon36z objectPHID: PHID-PSTE-5uj6oqv4kmhtr6ctwcq7 viewPolicy: public editPolicy: PHID-USER-cvfydnwadpdj7vdon36z commentPHID: NULL commentVersion: 0 transactionType: core:edge oldValue: 
{"PHID-PROJ-wh32nih7q5scvc5lvipv":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-wh32nih7q5scvc5lvipv","dateCreated":"1449170691","seq":"0","dataID":null,"data":[]},"PHID-PROJ-5r2ed5v27xrgltvou5or":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-5r2ed5v27xrgltvou5or","dateCreated":"1449170683","seq":"0","dataID":null,"data":[]},"PHID-PROJ-zfp44q7loir643b5i4v4":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-zfp44q7loir643b5i4v4","dateCreated":"1449170668","seq":"0","dataID":null,"data":[]},"PHID-PROJ-okljqs7prifhajtvia3t":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-okljqs7prifhajtvia3t","dateCreated":"1448902756","seq":"0","dataID":null,"data":[]},"PHID-PROJ-3cuwfuuh4pwqyuof2hhr":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-3cuwfuuh4pwqyuof2hhr","dateCreated":"1448899367","seq":"0","dataID":null,"data":[]},"PHID-PROJ-amvkc5zw2gsy7tyvocug":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-amvkc5zw2gsy7tyvocug","dateCreated":"1448833330","seq":"0","dataID":null,"data":[]}} newValue: 
{"PHID-PROJ-wh32nih7q5scvc5lvipv":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-wh32nih7q5scvc5lvipv","dateCreated":"1449170691","seq":"0","dataID":null,"data":[]},"PHID-PROJ-5r2ed5v27xrgltvou5or":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-5r2ed5v27xrgltvou5or","dateCreated":"1449170683","seq":"0","dataID":null,"data":[]},"PHID-PROJ-zfp44q7loir643b5i4v4":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-zfp44q7loir643b5i4v4","dateCreated":"1449170668","seq":"0","dataID":null,"data":[]},"PHID-PROJ-okljqs7prifhajtvia3t":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-okljqs7prifhajtvia3t","dateCreated":"1448902756","seq":"0","dataID":null,"data":[]},"PHID-PROJ-3cuwfuuh4pwqyuof2hhr":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-3cuwfuuh4pwqyuof2hhr","dateCreated":"1448899367","seq":"0","dataID":null,"data":[]},"PHID-PROJ-amvkc5zw2gsy7tyvocug":{"src":"PHID-PSTE-5uj6oqv4kmhtr6ctwcq7","type":"41","dst":"PHID-PROJ-amvkc5zw2gsy7tyvocug","dateCreated":"1448833330","seq":"0","dataID":null,"data":[]},"PHID-PROJ-tbowhnwinujwhb346q36":{"dst":"PHID-PROJ-tbowhnwinujwhb346q36","type":41,"data":[]},"PHID-PROJ-izrto7uflimduo6uw2tp":{"dst":"PHID-PROJ-izrto7uflimduo6uw2tp","type":41,"data":[]}} contentSource: {"source":"web","params":[]} metadata: {"edge:type":41} dateCreated: 1450197571 dateModified: 1450197571 ``` New row: ``` *********************** 44. 
row ************************* id: 757 phid: PHID-XACT-PSTE-5gnaaway2vnyen5 authorPHID: PHID-USER-cvfydnwadpdj7vdon36z objectPHID: PHID-PSTE-5uj6oqv4kmhtr6ctwcq7 viewPolicy: public editPolicy: PHID-USER-cvfydnwadpdj7vdon36z commentPHID: NULL commentVersion: 0 transactionType: core:edge oldValue: [] newValue: ["PHID-PROJ-tbowhnwinujwhb346q36","PHID-PROJ-izrto7uflimduo6uw2tp"] contentSource: {"source":"web","params":[]} metadata: {"edge:type":41} dateCreated: 1450197571 dateModified: 1450197571 ``` Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13051 Differential Revision: https://secure.phabricator.com/D18948	2018-01-29 11:34:40 -08:00
epriestley	3038d564a6	Allow bulk edits to be made silently if you have CLI access Summary: Fixes T13042. This hooks up the new "silent" mode from D18882 and makes it actually work. The UI (where we tell you to go run some command and then reload the page) is pretty clumsy, but should solve some problems for now and can be cleaned up eventually. The actual mechanics (timeline aggregation, Herald interaction, etc.) are on firmer ground. Test Plan: - Made a normal bulk edit, got mail and feed stories. - Made a silent bulk edit, no mail and no feed. - Saw "Silent Edit" marker in timeline for silent edits: {F5386245} Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13042 Differential Revision: https://secure.phabricator.com/D18883	2018-01-19 13:24:54 -08:00
Dmitri Iouchtchenko	9bd6a37055	Fix spelling Summary: Noticed a couple of typos in the docs, and then things got out of hand. Test Plan: - Stared at the words until my eyes watered and the letters began to swim on the screen. - Consulted a dictionary. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley, yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam Differential Revision: https://secure.phabricator.com/D18693	2017-10-09 10:48:04 -07:00
epriestley	e9208ed3da	Fix a spelling error in worker triggers Summary: This word is not spelled properly. Test Plan: Read the word. Reviewers: chad Reviewed By: chad Differential Revision: https://secure.phabricator.com/D18250	2017-07-20 14:20:44 -07:00
epriestley	3400f24c8b	Send permanent daemon failures to the log, even when not running in verbose mode Summary: Fixes T12803. An install is having difficulty diagnosing mail failures, and one component is that permanent task failures aren't reaching the log. It's reasonable to send these to the log even when "phd.verbose" is off. See T12803 for a rough review of when we generate these failures today. Test Plan: - Faked some exceptions. - Got a result in the log (P2058) with `phd.verbose` turned off. Reviewers: chad, amckinley Reviewed By: chad Maniphest Tasks: T12803 Differential Revision: https://secure.phabricator.com/D18106	2017-06-08 15:26:19 -07:00
epriestley	5c1e4488de	Remove all "Phabricator Bot" code Summary: Closes T7829 as wontfix. Closes T7965 as wontfix. Closes T7800 as wontfix. Closes T2731 as wontfix. Closes T1271 as wontfix. We aren't maintaining this at all (see, e.g., T7829) and a user reported a technically accurate security issue via HackerOne: <https://hackerone.com/reports/222870> Just throw it away until we get to the eventual Conpherence bot/API update and can do this stuff correctly. Test Plan: Grepped for `phabricatorbot`. Reviewers: chad Reviewed By: chad Maniphest Tasks: T7965, T7829, T7800, T2731, T1271 Differential Revision: https://secure.phabricator.com/D17756	2017-04-21 12:48:35 -07:00
epriestley	a41d158490	Only hibernate the Taskmaster after 15 seconds of inactivity Under some workloads, the taskmaster may hibernate and launch more rapidly than it should. Require 15 seconds of inactivity before hibernating. Also hibernate for longer. Auditors: chad	2017-03-25 05:01:32 -07:00
epriestley	2cda280cde	Make the default Trigger hibernation 3 minutes instead of 5 seconds The `min()` vs `max()` fix in D17560 meant that the Trigger daemon only hibernates for 5 seconds, so we do a full GC sweep every 5 seconds. This ends up eating a fair amount of CPU for no real benefit. The GC cursors should move to persistent storage, but just bump this default up in the meantime. Auditors: chad	2017-03-25 04:14:32 -07:00
epriestley	8b553d2f18	Allow taskmaster daemons to hibernate Summary: Ref T12298. Like PullLocal daemons, this allows the last daemon in the pool to hibernate if there's no work to be done, and awakens the pool when work arrives. Test Plan: - Ran `bin/phd debug task --trace`. - Saw the pool hibernate and look for tasks. - Commented on an object. - Saw the pool wake up and process the queue. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12298 Differential Revision: https://secure.phabricator.com/D17559	2017-03-24 13:51:37 -07:00
epriestley	f13637627d	Improve daemon "waiting" message, config reload behavior Summary: Ref T12298. Two minor daemon improvements: - Make the "waiting" message reflect hibernation. - Don't trigger a reload right after launching. Test Plan: - Read "waiting" message. - Ran "bin/phd start", didn't see an immediate SIGHUP in the log. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12298 Differential Revision: https://secure.phabricator.com/D17550	2017-03-24 08:32:08 -07:00
epriestley	9099485a71	Allow the PullLocal daemon to hibernate, and wake it when repositories need an update Summary: Ref T12298. This allows the PullLocal daemon to hibernate like the Trigger daemon, but automatically wakes it back up when it needs to do something. Test Plan: - Ran `bin/phd debug pulllocal --trace`. - Saw the daemon hibernate after doing a checkup on repositories. - Saw periodic queries to look for new update messages. - After clicking "Update Now" in the web UI to schedule an update, saw the daemon wake up immediately. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12298 Differential Revision: https://secure.phabricator.com/D17540	2017-03-23 10:52:28 -07:00
epriestley	90ec21f999	Add "--pool" and "--duration" flags to daemon CLI tools Summary: Ref T12331. These changes are intended to make it easier to debug T12331 since I'm having difficulty reproducing the issue locally. Test Plan: - Ran `bin/phd debug task --pool 4` and got an autoscaling pool. - Ran `bin/worker flood --duration 3` and got some 3-second-long tasks to execute with `bin/worker execute ...`. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12331 Differential Revision: https://secure.phabricator.com/D17431	2017-02-28 07:43:46 -08:00
epriestley	40cc403d23	Allow the Trigger daemon to hibernate, reducing processes to 0 Summary: Ref T12298. The trigger daemon already has routine long-term sleep, and few external events can impact when it should ideally wake up. The relevant events are: - Someone creates a new Nuance source (ideally, we should wake up right away and start polling it). - Someone creates a Calendar event about 16 minutes in the future (ideally, we should send them a reminder in about a minute). - Someone changes GC config to be extremely aggressive (ideally, we should immediately respect the change). None of these cases are very important. We don't hibernate for more than 3 minutes, so the worst case is that your Nuance source takes 3 minutes to start importing or your Calendar notification comes two minutes too late (13 minutes before the event instead of 15). This change makes GC slightly more CPU-expensive on average: currently, we do a GC sweep every 4 hours. After this change, we'll end up doing one every 3 minutes, because we lose the fact that we did a sweep recently when the daemon restarts. We could fix this by keeping track of when the last GC sweep was in the database, instead of in the Daemon process, but the cost of a sweep is normally very small so I don't plan to do this anytime soon. Test Plan: - Ran `bin/phd debug trigger`, saw daemon go through 3-minute hibernate + restart cycles. - Ran `bin/phd debug task`, saw daemon run normally. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12298 Differential Revision: https://secure.phabricator.com/D17408	2017-02-24 10:54:05 -08:00
Chad Little	bf44210dc8	Reduce application search engine results list for Dashboards Summary: Ref T10390. Simplifies dropdown by rolling out canUseInPanel in useless panels. Test Plan: Add a query panel, see fewer options. Reviewers: epriestley Reviewed By: epriestley Subscribers: Korvin Maniphest Tasks: T10390 Differential Revision: https://secure.phabricator.com/D17341	2017-02-22 12:42:43 -08:00
Jakub Vrana	a778151f28	Fix errors found by PHPStan Test Plan: Ran `phpstan analyze -a autoload.php phabricator/src`. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: Korvin, hach-que Differential Revision: https://secure.phabricator.com/D17371	2017-02-17 10:10:15 +00:00
Josh Cox	ac66522c2e	Add a flag to ./bin/worker to select tasks based on their failureCount Summary: I frequently run into a situation where I want to kill tasks that have accumulated a lot of failures regardless of what class they are. Or I'll want to kill every worker of a certain class but only if it has failed at least once. This change allows me to run `./bin/worker cancel --class <MYCLASS> --min-failure-count 5` to only kill tasks with at least 5 failed attempts. The `--min-failure-count N` argument can be used by itself as well as with `--class CLASSNAME`. I don't think it makes sense for it to work with `--id ID`, but I'm not dead set on that or anything. Test Plan: I ran the worker management workflow with and without the `--min-failure-count` argument and it worked as expected. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: Korvin, epriestley, yelirekim Differential Revision: https://secure.phabricator.com/D16906	2016-10-12 09:49:29 -04:00
epriestley	706c21375e	Remove empty implementations of `describeAutomaticCapabilities()` Summary: This has been replaced by `PolicyCodex` after D16830. Also: - Rebuild Celerity map to fix grumpy unit test. - Fix one issue on the policy exception workflow to accommodate the new code. Test Plan: - `arc unit --everything` - Viewed policy explanations. - Viewed policy errors. Reviewers: chad Reviewed By: chad Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam Differential Revision: https://secure.phabricator.com/D16831	2016-11-09 15:24:22 -08:00
epriestley	960c0be689	Fix some issues with Phabricator i18n string extraction Summary: Ref T5267. Fix one minor bug (paths were not being resolved properly) and one minor string issue (missing `%d` in a string). Test Plan: Extracted strings, got a cleaner result. Reviewers: chad Reviewed By: chad Maniphest Tasks: T5267 Differential Revision: https://secure.phabricator.com/D16808	2016-11-06 11:12:45 -08:00
epriestley	6b16f930c4	Automatically send (not-so-great) email notifications for upcoming events Summary: Ref T7931. This is still quite rough, but should technically send vaguely-useful email as part of the standard trigger infrastructure. Test Plan: Ran `bin/phd start`, created an event shortly, saw reminder email send in `bin/mail list-outbound`. Reviewers: chad Reviewed By: chad Maniphest Tasks: T7931 Differential Revision: https://secure.phabricator.com/D16784	2016-11-01 13:24:40 -07:00
epriestley	7678f412be	Hold a lock while collecting garbage Summary: Fixes T11771. Adds a lock around each GC process so we don't try to, e.g., delete old files on two machines at once just because they're both running trigger daemons. The other aspects of this daemon (actual triggers; nuance importers) already have separate locks. Test Plan: Ran `bin/phd debug trigger --trace`, saw daemon acquire locks and collect garbage. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11771 Differential Revision: https://secure.phabricator.com/D16739	2016-10-20 13:40:00 -07:00
epriestley	db2425b300	Do initial repository imports at a lower priority and finish importing commits before starting new ones Summary: Fixes T11677. This makes two minor adjustments to the repository import daemons: - The first step ("Message") now queues at a slightly-lower-than-default (for already-imported repositories) or very-low (for newly importing repositories) priority level. - The other steps now queue at "default" priority level. This is actually what they already did, but without this change their behavior would be to inherit the priority level of their parents. This has two effects: - When adding new repositories to an existing install, they shouldn't block other things from happening anymore. - The daemons will tend to start one commit and run through all of its steps before starting another commit. This makes progress through the queue more even and predictable. - Before, they did ALL the message tasks, then ALL the change tasks, etc. This works fine but is confusing/uneven/less-predictable because each type of task takes a different amount of time. Test Plan: - Added a new repository. - Saw all of its "message" steps queue at priority 4000. - Saw followups queue at priority 2000. - Saw progress generally "finish what you started" -- go through the queue one commit at a time, instead of one type of task at a time. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11677 Differential Revision: https://secure.phabricator.com/D16585	2016-09-21 16:41:01 -07:00
Josh Cox	8cdf1a890a	Updated the docs so chatbots can use the Conduit API Summary: Previously, the chatbot docs instructed users to get certificates for the conduit API and put the cert in a `conduit.cert` config key. In order to get the chatbot to work, I needed to instead get an API key and put it in the `conduit.token` config entry. Test Plan: Doc fix. Tried the new documented way and it worked. Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin, epriestley Differential Revision: https://secure.phabricator.com/D16443	2016-08-24 19:05:30 -04:00
Josh Cox	605210bc95	Make the chatbot obey the object name blacklist Summary: Fixes T11508. The config entry `remarkup.ignored-object-names` already contains a blacklist of object names that should be ignored in the web UI. This change makes that blacklist also apply to the chatbot. This makes it possible to have a chatbot ignore things like V1, V2, Q1 and any other phrases the user may not want to generate links to objects. Test Plan: Create objects (tasks, slowvotes, etc.) then mention the object names in chat (with the bot running). The bot should respond with helpful links to the given objects. Then add the object names to the blacklist through the config web UI. This apparently triggers the bot to restart itself. Then mention the object names in chat again. The bot should no longer respond with links because those object names have been added to the blacklist regex. Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: epriestley Maniphest Tasks: T11508 Differential Revision: https://secure.phabricator.com/D16442	2016-08-23 07:38:27 -05:00
epriestley	3bd0da0ec2	Add a missing table key to improve performance of "Recently Completed Tasks" query Summary: Fixes T11490. Currently, this query can not use a key and the table size may be quite large. Adjust the query so it can use a key for both selection and ordering, and add that key. Test Plan: Ran `EXPLAIN` on the old query in production, then added the key and ran `EXPLAIN` on the new query. Saw key in use, and "rows" examined drop from 29,273 to 15. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11490 Differential Revision: https://secure.phabricator.com/D16423	2016-08-19 11:53:09 -07:00
epriestley	ca78c1825a	When already running as the daemon user, don't "sudo" daemon commands Summary: The cluster synchronization code runs either actively (before returning a response to `git clone`, for example) or passively (routinely, as the daemons update repositories). The active sync runs as the web user (if running `git clone http://...`) or the VCS user (if running `git clone ssh://...`). But the passive sync runs as the daemon user. All of these sync processes need to run actual commands as the daemon user (`git fetch ...`). For the active ones, we must `sudo`. For the passive ones, we're already the right user. We run the same code, and end up trying to sudo to ourselves, which `sudo` isn't happy about by default. Depending on how `sudo` is configured and which users things are running as this might work anyway, but it's silly and if it doesn't work it requires you to go make non-obvious, weird config changes that are unintuitive and somewhat nonsensical. This is probably worse on the balance than adding a bit of complexity to the code. Instead, test which user we're running as. If it's already the right user, don't sudo. Test Plan: - Ran `bin/repository update --trace` as daemon user, saw no more `sudo`. - Ran a `git clone` to make sure that didn't break. Reviewers: chad, avivey Reviewed By: avivey Differential Revision: https://secure.phabricator.com/D16391	2016-08-11 16:41:19 -07:00
epriestley	5e3efca08a	In taskmaster daemons, only close connections which were not used recently Summary: Ref T11458. Depends on D16388. Currently, we're very aggressive about closing connections in the taskmaster daemons. This can end up taking up a lot of resources. In particular, because the outgoing port for outbound connections normally can not be reused for 60 seconds after a connection closes, we may exhaust outbound ports on the host if there's a big queue full of stuff that's being processed very quickly. At a minimum, we //always// are holding open a `worker` connection, which we always need again right away. So even in the best case we end up opening/closing this about once per second and each daemon takes up about ~60 outbound ports when it should take up ~1. So, make two adjustments: - First, only close connections which we haven't issued a query on in the last 60 seconds. This should prevent us from closing connections that we'll need again immediately in most cases. In the worst case, we shouldn't be eating up any extra ports under default TCP behavior. - Second, explicitly close connections. We were relying on implicit/GC behavior (maybe as a holdover from very long ago, before we got connection wrappers in place?), which probably did about the same thing but isn't as predictable and can't be profiled or instrumented. Test Plan: This is somewhat difficult to test completely convincingly in isolation since the problem behavior depends on production scales and the workload, and to some degree on configuration. I tested that this stuff basically works by adding logging to connect/close and running the daemons, verifying that they churned connections a lot before this change (e.g., ~1/s even at no load) and churn rarely afterward (e.g., almost never at no load). I ran some workload through them to make sure I didn't completely break anything. The best real test is just seeing how production responds. 
Current inbound/outbound connections on `secure001` are 1,200: ``` secure001 $ netstat -t | grep :mysql | wc -l 1164 ``` Current outbound from `repo001` are 18,600: ``` repo001 $ netstat -t | grep :mysql | wc -l 18663 ``` Reviewers: chad Reviewed By: chad Maniphest Tasks: T11458 Differential Revision: https://secure.phabricator.com/D16389	2016-08-11 12:03:56 -07:00
epriestley	4068ee2a75	Make permanent worker failures more user-friendly Summary: Ref T11309. In that task, a user misunderstood two parts of this error: - They took "exception" to mean "unexpected failure", when it was intended to mean "rare circumstance". - They interpreted the internal ID number of a commit to mean that Phabricator was malfunctioning. Make the language of this condition more direct, explaining what the situation means in greater detail. Additionally, we would previously re-throw this exception, which would make the daemon exit, wait a moment, and restart. This was normal and expected. When //unexpected// failures occur, it's important to do this: it prevents a daemon failing in a loop from causing too many side effects (e.g., limit of 1 email per 5 seconds instead of thousands per second). When expected, permanent failures occur, we do not need to do this: the task will not be retried. I just did it because it was slightly more consistent ("failures restart daemons") and we had few permanent failure types at the time. We have more now, and restarting the daemons generates some additional logs which have the potential to confuse. Cycling the daemon also (intentionally) reduces the rate at which we process tasks, which can be bad for permanent failures like "deleted commit" because users can delete a huge number of commits and possibly clog up the queue with cycle-after-failure actions. Test Plan: Tried to process a deleted commit, saw a new message: ``` 2016-07-11 9:30:22 AM [STDE] <VERB> PhabricatorTaskmasterDaemon Task 1428658 was cancelled: Commit "R55:6c46b7d0fb82a859ca3f87a95dc8dcceef8088c9" (with internal ID "282161") is no longer reachable from any branch, tag, or ref in this repository, so it will not be imported. This usually means that the branch the commit was on was deleted or overwritten. ``` Reviewers: chad Reviewed By: chad Maniphest Tasks: T11309 Differential Revision: https://secure.phabricator.com/D16268	2016-07-11 09:21:39 -07:00
epriestley	c510c925cf	Allow worker tasks to be cancelled by classname Summary: Ref T3554. Makes `bin/worker cancel --class <classname>` work (cancel all tasks with that type). This is useful in development if your queue is full of a bunch of gunk, and a need has occasionally arisen in production environments (usually "one option is cancel everything and move on"). Test Plan: Ran `bin/worker cancel` to cancel blocks of tasks by class name. Reviewers: chad Reviewed By: chad Maniphest Tasks: T3554 Differential Revision: https://secure.phabricator.com/D16267	2016-07-11 09:21:16 -07:00
Aviv Eyal	a3bb35e9d2	make Trigger Daemon sleep correctly when one-time triggers exist Summary: Trigger daemon is trying to find the next event to invoke before sleeping, but the query includes already-elapsed triggers. It then tries to sleep for 0 seconds. Test Plan: On a new instance, schedule a single trigger of type `PhabricatorOneTimeTriggerClock` to a very near time. Use top to see trigger daemon not going to 100% CPU once the event has elapsed. Reviewers: #blessed_reviewers, epriestley Subscribers: Korvin Differential Revision: https://secure.phabricator.com/D15750	2016-04-18 14:17:10 -07:00
epriestley	601aaa5a86	Modularize content sources Summary: Ref T10537. For Nuance, I want to introduce new sources (like "GitHub" or "GitHub via Nuance" or something) but this needs to modularize eventually. Split ContentSource apart so applications can add new content sources. Test Plan: This change has huge surface area, so I'll hold it until post-release. I think it's fairly safe (and if it does break anything, the breaks should be fatals, not anything subtle or difficult to fix), there's just no reason not to hold it for a few hours. - Viewed new module page. - Grepped for all removed functions/constants. - Viewed some transactions. - Hovered over timestamps to get content source details. - Added a comment via Conduit. - Added a comment via web. - Ran `bin/storage upgrade --namespace XXXXX --no-quickstart -f` to re-run all historic migrations. - Generated some objects with `bin/lipsum`. - Ran a bulk job on some tasks. - Ran unit tests. {F1190182} Reviewers: chad Reviewed By: chad Maniphest Tasks: T10537 Differential Revision: https://secure.phabricator.com/D15521	2016-03-26 11:59:45 -07:00
epriestley	de23ba0002	Fix a minor issue in Nuance which could cause the trigger daemon to poll too often Summary: Ref T10537. Currently, when you have at least two cursors, the daemon can poll too frequently when processing the last source because it never hits the end-of-list condition. Test Plan: - Ran `bin/phd debug trigger`. - Observed huge volumes of output before change as triggers fired as fast as possible. - Observed reasonable poll frequency after change. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10537 Differential Revision: https://secure.phabricator.com/D15464	2016-03-12 05:04:42 -08:00
epriestley	2a3c3b2b98	Provide `bin/nuance import` and ngram indexes for sources Summary: Ref T10537. More infrastructure: - Put a `bin/nuance` in place with `bin/nuance import`. This has no useful behavior yet. - Allow sources to be searched by substring. This supports `bin/nuance import --source whatever` so you don't have to dig up PHIDs. Test Plan: - Applied migrations. - Ran `bin/nuance import --source ...` (no meaningful effect, but works fine). - Searched for sources by substring in the UI. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10537 Differential Revision: https://secure.phabricator.com/D15436	2016-03-08 10:30:24 -08:00
epriestley	3f4cc3ad6e	Allow Nuances sources to provide import cursors Summary: Ref T10537. Some sources (like the future "GitHub Repository" source) need to poll remotes. - Provide a mechanism for sources to emit import cursors. - Hook them into the trigger daemon so they'll fire periodically. - Provide some storage. This diff does nothing useful or interesting, and is pure infrastructure. Test Plan: - Ran `bin/storage upgrade -f`, no adjustment issues. - Poked around Nuance. - Ran the trigger daemon, verified it didn't crash and checked for Nuance stuff to do. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10537 Differential Revision: https://secure.phabricator.com/D15435	2016-03-08 10:30:04 -08:00
epriestley	abb4c03b47	Remove shouldShowSubscribersProperty() from SubscribableInterface Summary: Every caller returns `true`. This was added a long time ago for Projects, but projects are no longer subscribable. I don't anticipate needing this in the future. Test Plan: Grepped for this method. Reviewers: chad Reviewed By: chad Differential Revision: https://secure.phabricator.com/D15409	2016-03-06 06:01:36 -08:00
Sébastien Santoro	a4db6f387d	Fix typo: discsussions → discussions Test Plan: Read again the sentence. Reviewers: joshuaspence, #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley Differential Revision: https://secure.phabricator.com/D15316	2016-02-21 01:51:03 -08:00
epriestley	5c2e49a812	Allow any user to watch any project they can see Summary: Ref T6183. Ref T10054. Historically, only members could watch projects because there were some weird special cases with policies. These policy issues have been resolved and Herald is generally powerful enough to do equivalent watches on most objects anyway. Also puts a "Watch Project" button on the feed panel to make the behavior and meaning more obvious. Test Plan: - Watched a project I was not a member of. - Clicked the feed watch/unwatch button. {F1064909} Reviewers: chad Reviewed By: chad Maniphest Tasks: T6183, T10054 Differential Revision: https://secure.phabricator.com/D15063	2016-01-19 19:38:30 -08:00
epriestley	96b1665eaa	Link "continue" action to confirm dialog in bulk jobs that are unconfirmed Summary: See Q266. Test Plan: Created a bulk job, clicked "Details" instead of "Confirm", clicked "Continue" to get back to confirmation dialog. Reviewers: chad Reviewed By: chad Differential Revision: https://secure.phabricator.com/D14985	2016-01-10 10:55:58 -08:00
epriestley	4bba3fd4c1	Fully modularize DestructionEngine Summary: Ref T9979. Convert all DestructionEngine behaviors to extensions. Test Plan: {F1033244} Destroyed an object, verifying: - Herald transcripts were destroyed; - edges were destroyed; - flags were destroyed; - tokens were destroyed; - transactions were destroyed; - worker tasks were cancelled. Reviewers: chad Reviewed By: chad Maniphest Tasks: T9979 Differential Revision: https://secure.phabricator.com/D14832	2015-12-21 17:03:44 -08:00
epriestley	e9af4f8970	Fix an issue where Drydock followup tasks would not queue if the main task failed Summary: Ref T9994. This fixes the first issue discussed on that task, which is that when a merge fails after "arc land", we would not clean up all the leases properly. Specifically, when a merge fails, we use `queueTask()` to schedule a followup task. This followup destroys the lease and frees the underlying resource. However, the default behavior of `queueTask()` is to //not queue tasks// if the parent task fails. This is a reasonable, safe behavior that was originally introduced in D8774, where it kept us from sending too much mail if a task did "send some mail" and then failed a little later on and got retried. Since I think the default behavior is correct, I just special cased the behavior for Drydock to make it queue even on failure. These are the only types of followup tasks we currently want to queue on main task failure. (It's possible that future Blueprints might want some kind of more specialized behavior, where some tasks queue only on success, but we can cross that bridge when we come to it.) Test Plan: - See T9994#149878 for test case setup. - I ran that test case again with this patch, and saw the followup task queue properly in the `--trace` log, a corresponding update task show up in `/daemon/`, and the lease get destroyed when I ran it a moment later. {F1029915} Reviewers: chad Reviewed By: chad Maniphest Tasks: T9994 Differential Revision: https://secure.phabricator.com/D14818	2015-12-18 08:17:04 -08:00
epriestley	b964f8873b	Fix daemon restart behavior to check once every 10 seconds Summary: This logic is flipped. Test Plan: - Before change: ran `bin/phd debug task`, saw queries to the config table every second. - After change: ran `bin/phd debug task`, saw queries to the config table every 10 seconds. Reviewers: chad, joshuaspence Reviewed By: chad, joshuaspence Differential Revision: https://secure.phabricator.com/D14542	2015-11-23 05:59:04 -08:00
epriestley	2e09a93dc1	Improve efficiency of worker task GC for huge loads Summary: Fixes T9808. An instance imported a very large repository, generating approximately 4 million tasks over the course of a few days. A week later, these tasks started expiring and became candidates for garbage collection. The GC works by deleting 100 rows at a time over and over again. It finds the rows it's going to delete by querying for old rows. Currently, this query generates a `WHERE dateCreated < X ORDER BY id DESC` query. This query cannot efficiently execute using a single key, because it relies on `dateCreated` order to find the rows, then on `id` order to sort them. With a table with 4M rows, this is slow. This would still be OK, except that the query has to execute a lot of times since it only deletes 100 rows each time. Particularly, it needs to execute a total of ~40K times. Instead, generate `WHERE dateCreated < X ORDER BY dateCreated DESC, id DESC`. This should have the same effect in general and the GC definitely doesn't care about the difference, but it should be more efficient at large scales. Test Plan: I had to `TRUNCATE` the problem table so I don't have a perfect repro to completely convincingly test this anymore. Both queries behave fine at small scales, which is why we haven't seen this before. I was able to run the newer query in production before I nuked the table and have it complete in a reasonable amount of time, while the old query hung longer than I wanted to wait (several minutes?). The query plan for the new query was also a good one, while the query plan for the old query was terrible. I loaded the daemon console and ran `bin/garbage collect --collector worker.tasks --trace`. I verified the queries looked reasonable and produced reasonable results in production. Reviewers: chad Reviewed By: chad Maniphest Tasks: T9808 Differential Revision: https://secure.phabricator.com/D14505	2015-11-17 17:05:10 -08:00

326 commits