Summary:
Depends on D19075. Ref T13073. If a lease acquires a resource but finds that the resource builds directly into a dead state (which can happen for any number of reasonable reasons), reset the lease and throw it back in the pool.
This isn't the lease's fault and it hasn't caused any side effects or done anything we can't undo, so we can safely reset it and throw it back in the pool.
Test Plan:
- Created a blueprint which throws from `allocateResource()` so that resources never activate.
- Tried to lease it with `bin/drydock lease ...`.
- Before patch: lease was broken and destroyed after a failed activation.
- After patch: lease was returned to the pool after a failed activation.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19076
Summary: Ref T13073. Depends on D19074. Update icons and UI for resource status.
Test Plan: Viewed resources in detail view and list view, saw better status icons.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19075
Summary:
Depends on D19073. Ref T13073. Give leases a normal header tag and try to wrangle their status constants a bit.
Also, try to capture the "status class" pattern a bit. Since we target PHP 5.2.3 we can't use `static::` so the actual subclass is kind of a mess. Not exactly sure if I want to stick with this or not. We could consider targeting PHP 5.3.0 instead to get `static::` / late static binding.
Test Plan: Viewed leases and lease lists, saw better and more conventional status information.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19074
Summary:
Depends on D19072. Ref T13073. Currently, you can leave leases stranded by using `^C` to interrupt the script. Handle signals and release leases on destruction if they haven't activated yet.
Also, print out more useful information before and after activation.
Test Plan: Mashed ^C while runnning `bin/drydock lease ... --trace`, saw the lease release.
Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19073
Summary: Depends on D19071. Ref T13073. While the daemons are supposedly doing things, show the user any logs they generate. There's often something relevant but unearthing it can be involved.
Test Plan: {F5427773}
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19072
Summary: Depends on D19070. Ref T13073. Some messages contain an interesting story or a clever anecdote. Respect newlines during rendering to preserve authorial intent.
Test Plan:
Viewed a message with linebreaks and could still read it.
{F5427754}
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19071
Summary:
Ref T13073. When a Blueprint says it will be able to allocate a resource but then throws an exception while attempting that allocation, we currently fail the lease permanently.
This is excessively harsh. This blueprint may have the best of intentions and have encountered a legitimately unforseeable failure (like a `vm.new` call to build a VM failed) and be able to succeed in the future.
Even if this blueprint is a dirty liar, other blueprints (or existing resources) may be able to satisfy the lease in the future.
Even if every blueprint is implemented incorrectly, leaving the lease alive lets it converge to success after the blueprints are fixed.
Instead of failing, log the issue and yield.
(In the future, it might make sense to distinguish more narrowly between "actually, all the resources are used up" and all other failure types, since the former is likely more routine and less concerning.)
Test Plan:
- Wrote a broken `Hoax` blueprint which always claims it can allocate but never actually allocates (just `throw` in `allocateResource()`).
- Used `bin/phd drydock lease` to acquire a Hoax lease.
- Before patch: lease abruptly failed permanently.
- After patch: lease yields after allocation fails.
{F5427747}
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19070
Summary:
Depends on D18848. Ref PHI243. This puts a bit of logic up front to figure out the blueprint type before we actually start editing it.
This implementation is a little messy but it keeps the API clean. Eventually, the implementation could probably go in the TransactionTypes so more code is shared, but I'd like to wait for a couple more of these first.
This capability probably isn't too useful, but just pays down a bit of technical debt from the caveat introduced in D18822.
Test Plan:
- Created a new blueprint with the API.
- Tried to create a blueprint without a "type" (got a helpful error).
- Created and edited blueprints via the web UI.
- Tried to change the "type" of an existing blueprint (got a helpful error).
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D18849
Summary: Ref PHI243. This is a followup to D18822, which added an edit-only `drydock.blueprint.edit`. By modularizing transactions (here) and then adding a "type" transaction (next change) I intend to remove the "edit-only" limitation and make this API method fully functional.
Test Plan: Created and edited blueprints via the web UI. Edited blueprints via the API. Disabled/enabled blueprints (currently web UI only).
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D18845
Summary:
Ref: https://admin.phacility.com/PHI243
Since our use case primarily focuses on transaction editing, this patch implements the `drydock.blueprint.edit` api method with the understanding that:
a) this is a work in progress
b) object editing is supported, but object creation is not yet implemented
Test Plan:
* updated existing blueprints via Conduit UI
* regression tested `maniphest.edit` by creating new and updating existing tasks
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, yelirekim, jcox
Differential Revision: https://secure.phabricator.com/D18822
Summary: Depends on D18734. See PHI176. We run this query on the main Drydock lease web UI, among other places. There is currently no `status` key which can satisfy it.
Test Plan:
Viewed Drydock lease page to get the query.
Ran ##explain SELECT * FROM `drydock_lease` WHERE (status IN ('pending', 'acquired', 'active')) ORDER BY `id` DESC LIMIT 101;## before and after the change.
I don't have a ton of leases locally so the un-key'd EXPLAIN isn't //that// bad, but still shows that we're getting a better key. Before:
```
mysql> explain SELECT * FROM `drydock_lease` WHERE (status IN ('pending', 'acquired', 'active')) ORDER BY `id` DESC LIMIT 101;
+----+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | drydock_lease | index | NULL | PRIMARY | 4 | NULL | 101 | Using where |
+----+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
```
After:
```
mysql> explain SELECT * FROM `drydock_lease` WHERE (status IN ('pending', 'acquired', 'active')) ORDER BY `id` DESC LIMIT 101;
+----+-------------+---------------+-------+---------------+------------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+------------+---------+------+------+---------------------------------------+
| 1 | SIMPLE | drydock_lease | range | key_status | key_status | 130 | NULL | 5 | Using index condition; Using filesort |
+----+-------------+---------------+-------+---------------+------------+---------+------+------+---------------------------------------+
1 row in set (0.00 sec)
```
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D18735
Summary: Noticed a couple of typos in the docs, and then things got out of hand.
Test Plan:
- Stared at the words until my eyes watered and the letters began to swim on the screen.
- Consulted a dictionary.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D18693
Summary:
Ref T2543. These are currently numeric values, like "0" and "3". I want to replace them with strings, like "accepted", and move definitions from Arcanist to Phabricator.
To set the stage for this, reduce the number of callsites where Phabricator invokes `ArcanistDifferentialRevisionStatus`.
This is just the easy ones. I'll hold this until the release cut.
Test Plan:
- Called `differential.find`.
- Called `differential.getrevision`.
- Called `differential.query`.
- Removed all reviewers from a revision, saw warning.
- Abandoned the no-reviewers revision, no more warning.
- Attached a revision to a task to get it to show the state icon with the status on a tooltip.
- Viewed revision bucketing on dashboard.
- Used `bin/search index` to reindex a revision.
- Hit the "Land Revision" endpoint.
I didn't explicitly test these cases:
- Doorkeeper Asana integration, since setup takes a thousand years.
- Disambiguation logic when multiple hashes match, since setup is also very involved.
- Releeph because it's Releeph.
Reviewers: chad
Reviewed By: chad
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T2543
Differential Revision: https://secure.phabricator.com/D18339
Summary:
This has been replaced by `PolicyCodex` after D16830. Also:
- Rebuild Celerity map to fix grumpy unit test.
- Fix one issue on the policy exception workflow to accommodate the new code.
Test Plan:
- `arc unit --everything`
- Viewed policy explanations.
- Viewed policy errors.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D16831
Summary:
This search engine ports cleanly to Conduit out of the box.
Ref T11694
Test Plan: called the API method from the console, browsed blueprints in the ui
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Maniphest Tasks: T11694
Differential Revision: https://secure.phabricator.com/D16593
Summary:
`DrydockAuthorizationSearchEngine` was being used solely to display authorizations for a specific blueprint from the web UI and consequently expected that callers set a specific blueprint before performing a query. Here we check to see if a blueprint has been set in cases where the engine could be operating from either Conduit or the web.
Ref T11694
Test Plan:
- called the API method from the console
- approved an authorization
- followed the "view all" link from a blueprint page
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Maniphest Tasks: T11694
Differential Revision: https://secure.phabricator.com/D16592
Summary: Fixes T11501. Let's you pass in a full PHUIIconView or just the icon name to give ObjectListItem a large icon.
Test Plan: Alamanac, Applications, Drydock, Settings, Search Typeahead, Config page...
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T11501
Differential Revision: https://secure.phabricator.com/D16421
Summary: Ref T10628. This moves everything else over. I'll clean up the cruft in the next diff.
Test Plan:
- Viewed Conduit API page, toggled tabs.
- Viewed Harbormaster build, toggled tabs.
- Viewed a Drydock lease, swapped tabs.
- Viewed a Drydock resource, swapped tabs.
- Viewed mail, swapped tabs.
- Grepped for `addPropertyList(...)`, looked for any remaining calls with a second argument.
- Also checked rSAAS for any calls, but we don't have anything there that uses tabs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10628
Differential Revision: https://secure.phabricator.com/D16207
Summary:
Ref T11034. This seems a little more promising. Two problems at the moment:
- This doesn't actually provide any useful information at all right now.
- Many object types have no profile images.
Test Plan:
{F1695254}
{F1695255}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11034
Differential Revision: https://secure.phabricator.com/D16155
Summary:
Ref T10748. These:
- Look nice.
- Hint at panel contents / effects.
- Hint which panels have been customized.
- Allow panels with issues or errors to be highlighted with an alert/attention icon.
Test Plan: {F1256156}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15836
Summary: Updates Console and Operations page.
Test Plan: Pull up Console, pull up status page
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D15615
Summary: Cleans up EditEngine, adds new layout to EditEngine and descendents
Test Plan: Test creating a new form, reordering, marking and unmarking defaults. View new forms.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D15531
Summary: Updates Drydock to use two column + curtain layouts.
Test Plan: Tested what I could get to, need @epriestley to run this locally for edge cases.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D15467
Summary:
Ref T10537. More infrastructure:
- Put a `bin/nuance` in place with `bin/nuance import`. This has no useful behavior yet.
- Allow sources to be searched by substring. This supports `bin/nuance import --source whatever` so you don't have to dig up PHIDs.
Test Plan:
- Applied migrations.
- Ran `bin/nuance import --source ...` (no meaningful effect, but works fine).
- Searched for sources by substring in the UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10537
Differential Revision: https://secure.phabricator.com/D15436
Summary: Ref T10093. Changes must be pushed to staging before they can be landed from the web.
Test Plan:
Changes must be pushed to staging before they can be landed from the web.
{F1161909}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10093
Differential Revision: https://secure.phabricator.com/D15427
Summary: Ref T10093. Show better errors when a commit fails because it has already been merged and when a fetch fails because the ref isn't present in the remote.
Test Plan:
{F1160794}
{F1160795}
Reviewers: chad
Reviewed By: chad
Subscribers: michaeljs1990, yelirekim
Maniphest Tasks: T10093
Differential Revision: https://secure.phabricator.com/D15420
Summary:
Ref T10246. Drydock still has a few outstanding issues, but it's mostly minor UI stuff and the documentation has reasonable caveats about it.
Broadly, I don't expect to make any major changes to Drydock in the future (i.e., all the fundamentals seem sound at this point) and it doesn't have any major technical debt or, like, obsolete APIs or anything.
Test Plan: Saw "Drydock" as not-a-prototype in Applications.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10246
Differential Revision: https://secure.phabricator.com/D15401
Summary: Ref T10457. Use modern controller and UI tech to build the list view and actions.
Test Plan:
- Viewed operation list.
- Viewed operation detail.
- Checked menus on mobile.
{F1139757}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10457
Differential Revision: https://secure.phabricator.com/D15393
Summary:
Ref T10457.
- Let blueprints be tagged so you can search and annotate them a little more easily.
- Give each blueprint type an optional icon to make things a little easier to parse visually.
Test Plan:
- Tagged blueprints.
- Searched by tags.
- Looked at nice little icons.
{F1139712}
Reviewers: chad
Reviewed By: chad
Subscribers: yelirekim
Maniphest Tasks: T10457
Differential Revision: https://secure.phabricator.com/D15392
Summary:
Ref T10457. Fixes T10024. This primarily just modernizes blueprints to use EditEngine.
This also fixes T10024, which was an issue with stored properties not being flagged correctly.
Also slightly improves typeaheads for blueprints (more information, disabled state).
Test Plan:
- Created and edited various types of blueprints.
- Set and removed limits.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10024, T10457
Differential Revision: https://secure.phabricator.com/D15390
Summary:
Ref T10457. The ngram indexing seems to be working well; extend it into Drydock.
Also clean up the list controller a little bit.
Test Plan:
- Ran migrations.
- Searched for blueprints by name.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10457
Differential Revision: https://secure.phabricator.com/D15389
Summary:
Ref T10449. Currently, we store classes (like "AlmanacClusterRepositoryServiceType") in the database.
Instead, store types (like "cluster.repository").
This is a small change, but types are a little more flexible (they let us freely reanme classes), a little cleaner (fewer magic strings in the codebase), and a little better for API usage (they're more human readable).
Make this minor usability change now, before we unprototype.
Also make services searchable by type.
Also remove old Almanac API endpoints.
Test Plan:
- Ran migration, verified all data migrated properly.
- Created, edited, rebound, and changed properties of services.
- Searched for services by service type.
- Reviewed available Conduit methods.
Reviewers: chad
Reviewed By: chad
Subscribers: yelirekim
Maniphest Tasks: T10449
Differential Revision: https://secure.phabricator.com/D15346
Summary:
Fixes T9762. Ref T10246.
**Disabling Bindings**: Previously, there was no formal way to disable bindings. The internal callers sometimes check some informal property on the binding, but this is a common need and deserves first-class support in the UI. Allow bindings to be disabled.
**Deleting Interfaces**: Previously, you could not delete interfaces. Now, you can delete unused interfaces.
Also some minor cleanup and slightly less mysterious documentation.
Test Plan: Disabled bindings and deleted interfaces.
Reviewers: chad
Reviewed By: chad
Subscribers: yelirekim
Maniphest Tasks: T9762, T10246
Differential Revision: https://secure.phabricator.com/D15345
Summary: Mostly for consistency, we're not using other forms of icons and this makes all classes that use an icon call it in the same way.
Test Plan: tested uiexamples, lots of other random pages.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D15125
Summary:
Ref T9994.
- Allow errors to be dismissed.
- Tailor messaging for closed/abandoned revisions.
- Reduce scare messaging on land dialog, since it's not really that scary anymore.
Test Plan:
- Dismissed errors.
- Hit new warnings.
- Wasn't as scared when landing.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9994
Differential Revision: https://secure.phabricator.com/D14886
Summary:
Fixes T10037. When we're building commit `aabbccdd`, we currently do this to check it out:
git reset --hard aabbccdd
However, this has an undesirable side effect of moving the current branch pointer to point at `aabbccdd`. The current branch pointer may be some totally different branch which `aabbccdd` is not part of, so this is confusing and misleading.
Instead, use `git reset --hard HEAD` to get the primary effect we want (destroying staged changes) and then `git checkout aabbccdd` to checkout the commit in a detached HEAD state.
Test Plan:
- Ran a build (a commit-focused operation) successfully.
- Verified working copy was pointed at a detached HEAD afterward:
```
builder@sbuild001:/var/drydock/workingcopy-167/repo/git-test-ii$ git status
HEAD detached at ffc7635
nothing to commit, working directory clean
```
- Ran a land (a branch-foused operation) successfully.
- Verified working copy was pointed at a branch afterward:
```
builder@sbuild001:/core/data/drydock/workingcopy-168/repo/git-test$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
```
Reviewers: chad
Reviewed By: chad
Subscribers: yelirekim
Maniphest Tasks: T10037
Differential Revision: https://secure.phabricator.com/D14850
Summary:
Fixes T9994. Currently, when Drydock can't allocate a new resource because some limit has been reached, it waits patiently for a resource to become available.
It is possible that no resource will ever become available. Particularly with "Working Copy" resources, the new lease may want a copy of `rB`, but the resource may already be maxed out on `rA`.
Right now, no process exists to automatically reclaim the unused `rA`.
When we encounter this situation, try to reclaim one of the other resources if it is just sitting there unused.
Specifically:
- Add a "reclaim" command which means "release this resource //if// it is completely unused".
- Add a `bin/drydock reclaim` to send this command to every active resource.
- When we try to acquire a resource and can't, but only because of some kind of limit / utilization problem, try to release an unused resource to free up some room.
Test Plan:
- Set "Working Copy" resource limit to 1.
- Ran "Test Configuration" in `rA`, which worked.
- Ran "Test Configuration" in `rB`, which hung forever.
- Applied patch.
- Ran "Test Configuration" in `rB`, saw it reclaim the `rA` resource, use the slot, then succeed.
- Ran "Test Configuration" in `rA` again, saw it grab the slot back.
- Ran `bin/drydock reclaim` and saw it reclaim a bunch of old orphaned resources.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9994
Differential Revision: https://secure.phabricator.com/D14819
Summary:
Ref T9994. This fixes the first issue discussed on that task, which is that when a merge fails after "arc land", we would not clean up all the leases properly.
Specifically, when a merge fails, we use `queueTask()` to schedule a followup task. This followup destroys the lease and frees the underlying resource.
However, the default behavior of `queueTask()` is to //not queue tasks// if the parent task fails. This is a reasonable, safe behavior that was originally introduced in D8774, where it kept us from sending too much mail if a task did "send some mail" and then failed a little later on and got retried.
Since I think the default behavior is correct, I just special cased the behavior for Drydock to make it queue even on failure. These are the only types of followup tasks we currently want to queue on main task failure.
(It's possible that future Blueprints might want some kind of more specialized behavior, where some tasks queue only on success, but we can cross that bridge when we come to it.)
Test Plan:
- See T9994#149878 for test case setup.
- I ran that test case again with this patch, and saw the followup task queue properly in the `--trace` log, a correspoinding update task show up in `/daemon/`, and the lease get destroyed when I ran it a moment later.
{F1029915}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9994
Differential Revision: https://secure.phabricator.com/D14818
Summary:
Ref T10004. After D14804, we get this behavior by default and no longer need to set it explicitly.
(If some endpoint did eventually need to set it explicitly, it could just change what it passes to `setHref()`, but I believe we currently have no such endpoints and do not foresee ever having any.)
Test Plan:
- As a logged out user, clicked various links in Differential, Maniphest, Files, etc., always got redirected to a sensible place after login.
- Grepped for `setObjectURI()`, `getObjectURI()` (there are a few remaining callsites, but to a different method with the same name in Doorkeeper).
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que
Maniphest Tasks: T10004
Differential Revision: https://secure.phabricator.com/D14805
Summary:
Fixes T9669. Two issues:
- We were using `repositoryPHIDs` instead of `blueprintPHIDs` for the list of allowed blueprints. Use the correct value.
- We weren't enforcing `allowedBlueprintPHIDs` fully correctly. We //did// require an authorization, so the net effect was correct in nearly all cases, but we could have selected from too large a pool in the case where the application itself was doing the authorization (e.g., from the command line).
Test Plan: Ran a build through Drydock/Harbormaster locally.
Reviewers: chad, tycho.tatitscheff
Reviewed By: chad, tycho.tatitscheff
Subscribers: tycho.tatitscheff
Maniphest Tasks: T9669
Differential Revision: https://secure.phabricator.com/D14368
Summary: This is //hilarious//.
Test Plan: Test icon on local install.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D14351
Summary: I didn't test the positive version of this -- the constant has value `2` but when we read it from the database it's `"2"` or whatever. Just do this for now and maybe someday we'll use strings.
Test Plan: will do production things
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D14352
Summary:
Ref T182. Make the disabled state of the button more accurately reflect whether clicking it will work.
Don't allow "land" to proceed unless the revision is accepted.
Test Plan: Saw button in disabled state, clicked it, got "only accepted revisions" message.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14350
Summary:
Ref T182. Ref T9252.
- Adds a "Test" repository operation that just runs `git status` to see if things work.
- Adds a button for it in Edit Repository.
- Shows operation status on the operation detail view to make this workflow work a little better.
- Adds a lot of words. Words words words words.
Test Plan:
- Tested repository operation.
- Read words.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182, T9252
Differential Revision: https://secure.phabricator.com/D14349
Summary:
Ref T182. When viewing a revision, if there are several error operations and then a success operation, we currently show the last error. This is misleading.
Instead, don't show anything if there's a success (this may require tuning eventually if you can land multiple times onto different branches or whatever, but should be reasonable for now).
Also make the table a little nicer, particularly for merge failure output.
Test Plan: {F910385}
Reviewers: chad, Mnkras
Reviewed By: Mnkras
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14348
Summary:
Ref T182. This command should never actually generate a commit because `--squash` prevents that, but `git` seems to sometimes hit a check for username/email configuration (maybe when merging a non-fastforward?).
Give it some dummy values to placate it. This command shouldn't commit anything so these values should never actually be used.
Test Plan: Landed rGITTESTd8c8643cb02bbe60048c6c206afc2940c760a77e.
Reviewers: chad, Mnkras
Reviewed By: Mnkras
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14347
Summary:
Ref T182. I lifted this logic out of `arc`, but the context is a little different there, and this option is too strict in "Land Revision".
Specifically, it prevents `git` from merging unless the merge is //strictly// a fast-foward, even with `--squash`. That means revisions can't merge unless they're rebased on the current `master`, even if they have no conflicts.
(This whole process will probably need additional refinement, but the behavior without this flag is more reasonable overall than the behavior with it for now.)
Test Plan: Will land stuff in production~~
Reviewers: chad, Mnkras
Reviewed By: Mnkras
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14346
Summary:
Ref T182. We just show "an error happened" right now. Improve this behavior.
This error handling chain is a bit ad-hoc for now but we can formalize it as we hit other cases.
Test Plan:
{F910247}
{F910248}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14343
Summary:
Ref T182. Couple of minor improvements here:
- Show the Drydock lease when viewing a Repository Operation detail screen. This just makes it easier to jump around between relevant objects.
- When tasks are waiting for a lease, awaken them when it breaks or is released, not just when it is acquired. This makes the queue move forward faster when errors occur.
Test Plan:
- Viewed a repository operation and saw a link to the lease.
- Did a bad land (intentional merge problem) and got an error in about ~3 seconds instead of ~17.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14341
Summary:
Ref T182.
- We just show the oldest operation right now, but we usually care about the oldest non-failure.
- Only query for actual land operations when rendering the revision operations dialog (maybe eventually we'll show more stuff?).
- For now, prevent multiple lands / repeated lands or queueing up lands while other lands are happening.
Test Plan: Landed a revision. Tried to land it more / again.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14338
Summary:
Ref T182. Currently, the "RepositoryLand" operation is responsible for performing merges when landing a revision.
However, we'd like to be able to perform these merges in a larger set of cases in the future. For example:
- After Releeph is revamped, when someone says "I want to merge bug fix X into stable branch Y", it would probably be nice to make that a Buildable and let tests run against it without requring that it actually be pushed anywhere.
- Same deal if we want a merge-from-Diffusion or cherry-pick-from-Diffusion operation.
- Similar deal if we want a "random web UI edits from Diffusion".
Move the merging part into WorkingCopy so more applications can share/use it in the future.
A big chunk of this is me making stuff up for now (the ol' undocumented dictionary full of arbitrary magic keys), but I anticipate formalizing it as we move along.
Test Plan: Pushed rGITTEST0d58eef3ce0fa5a10732d2efefc56aec126bc219 up from my local install via "Land Revision".
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14337
Summary:
Ref T9252. Right now, we have very strict limits on Drydock: one lease per host, and one working copy per working copy blueprint.
These are silly and getting in the way of using "Land Revision" more widely, since we need at least one working copy for each landable repository.
For now, just remove the host limit and put a simple limit on working copies. This might need to be fancier some day (e.g., limit working copies per-host) but it is generally reasonable for the use cases of today.
Also add a `--background` flag to make testing a little easier.
(Limits are also less important nowadays than they were in the past, because pools expand slowly now and we seem to have stamped out all the "runaway train" bugs where allocators go crazy and allocate a million things.)
Test Plan:
- With a limit of 5, ran 10 concurrent builds and saw them finish after allocating 5 total resources.
- Removed limit, raised taskmaster concurrency to 128, ran thousands of builds in blocks of 128 or 256.
- Saw Drydock gradually expand the pool, allocating a few more working copies at first and a lot of working copies later.
- Got ~256 builds in ~140 seconds, which isn't a breakneck pace or anything but isn't too bad.
- This stuff seems to be mostly bottlenecked on `sbuild` throttling inbound SSH connections. I haven't tweaked it.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14334
Summary:
Fixes T9519. Right now, build steps go straight from the build to the edit screen.
This means that there's no way to see their edit history or review details without edit permission. In particular, this makes it a bit harder to catch the Drydock Blueprint authorization warnings from T9519.
- Add a standard view screen.
- Add a little warning callout to blueprint authorizations.
This also does a bit of a touchup on the weird dropshadow element from T9586. Maybe not totally design-approved now but it's less ugly, at least.
Test Plan:
{F906695}
{F906696}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9519
Differential Revision: https://secure.phabricator.com/D14330
Summary:
Ref T182. Replace the total mess we had before with a sort-of-reasonable element.
This automatically updates using "javascript".
Test Plan:
{F901983}
{F901984}
Used "Land Revision", saw the land status go from "Waiting" -> "Working" -> "Landed" without having to mash reload over and over again.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14314
Summary: Ref T182. Nothing fancy, just make these slightly easier to work with.
Test Plan: {F884754}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14295
Summary: Ref T182. Make a reasonable attempt to get the commit message, author, and committer data correct.
Test Plan: BEHOLD: rGITTEST810b7f17cd0c909256a45d29a5062fcf417d0489
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14280
Summary:
Ref T9252. This fixes a bug from D14236. D14272 discusses the observable effects of the bug, primarily that the window for racing is widened from ~a few milliseconds to several minutes under our configuration.
This SQL query is missing a `GROUP BY` clause, so all of the resources get counted as having the same status (specifically, the alphabetically earliest status any resource had, I think). For test cases this often gets the right result since the number of resources may be small and they may all have the same status, but in production this isn't true. In particular, the allocator would sometimes see "35 destroyed resources" (or whatever), when the real counts were "32 destroyed resources + 3 pending resources".
Since this allocator behavior is soft/advisory this didn't cause any actual problems, per se (we do expect races here occasionally), it just made the race very very easy to hit. For example, Drydock in production currently has three pending working copy resources. Although we do expect this to be //possible//, getting 4 resources when the configured limit is 1 should be hard (not lightning strike / cosmic radiaion hard, but "happens once a year" hard).
Also exclude destroyed resources since we never care about them.
Test Plan:
Followed the plan from D14272 and restarted two Harbormaster workers at the same time.
After this patch was applied, they no longer created two different resources (we expect it to be possible for this to happen, just very hard).
We should still be able to force this race by putting something like `sleep(10)` right before the query, then `sleep(10)` right after it. That would prevent the allocators from seeing one another (so they would both think there were no other resources) and push us down the pathway where we exceed the soft limit.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14274
Summary:
Ref T9252. This is mostly a fix for an edge case from D14236. Here's the setup:
- There are no resources.
- A request for a new resource arrives.
- We build a new resource.
Now, if we were leasing an existing resource, we'd call `canAcquireLeaseOnResource()` before acquiring a lease on the new resource.
However, for new resources we don't do that: we just acquire a lease immediately. This is wrong, because we now allow and expect some resources to be unleasable when created.
In a more complex workflow, this can also produce the wrong result and leave the lease acquired sub-optimally (and, today, deadlocked).
Make the "can we acquire?" pathway consistent for new and existing resources, so we always do the same set of checks.
Test Plan:
- Started daemons.
- Deleted all working copy resources.
- Ran two working-copy-using build plans at the same time.
- Before this change, one would often [1] acquire a lease on a pending resource which never allocated, then deadlock.
- After this change, the same thing happens except that the lease remains pending and the work completes.
[1] Although the race this implies is allowed (resource pool limits are soft/advisory, and it is expected that we may occasionally run over them), it's MUCH easier to hit right now than I would expect it to be, so I think there's probably at least one more small bug here somewhere. I'll see if I can root it out after this change.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14272
Summary:
Ref T182. If 35 other things are configured completely correctly, make it remotely possible that this button may do something approximating the thing that the user wanted.
This primarily fleshes out the idea that "operations" (like landing, merging or cherry-picking) can have some beahavior, and when we run an operation we do whatever that behavior is instead of just running `git show`.
Broadly, this isn't too terrible because Drydock seems like it actually works properly for the most part (???!?!).
Test Plan: {F876431}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14270
Summary:
Ref T182. This doesn't do anything interesting yet and is mostly scaffolding, but here's roughly the workflow. From previous revision, you can configure "Repository Automation" for a repository:
{F875741}
If it's configured, a new "Land Revision" button shows up:
{F875743}
Once you click it you get a big warning dialog that it won't work, and then this shows up at the top of the revision (completely temporary/placeholder UI, some day a nice progress bar or whatever):
{F875747}
If you're lucky, the operation eventually sort of works:
{F875750}
It only runs `git show` right now, doesn't actually do any writes or anything.
Test Plan:
- Clicked "Land Revision".
- Watched `phd debug task`.
- Saw it log `git show` to output.
- Verified operation success in UI (by fiddling URL, no way to get there normally yet).
Reviewers: chad
Reviewed By: chad
Subscribers: revi
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14266
Summary:
Ref T182. This allows you to assign blueprints that a repository can use to perform working copy operations. Eventually, this will support "merge this" in Differential, etc.
This is just UI for now, with no material effects.
Most of this diff is just taking logic that was in the existing "Blueprints" CustomField and putting it in more general places so Diffusion (which does not use CustomFields) can also access it.
Test Plan:
- Configured repository automation for a repository.
- Removed repository automation for a repository.
Reviewers: chad
Reviewed By: chad
Subscribers: avivey
Maniphest Tasks: T182
Differential Revision: https://secure.phabricator.com/D14259
Summary:
Ref T9519. When acquiring leases on resources:
- Only consider resources created by authorized blueprints.
- Only consider authorized blueprints when creating new resources.
- Fail with a tailored error if no blueprints are allowed.
- Fail with a tailored error if missing authorizations are causing acquisition failure.
One somewhat-substantial issue with this is that it's pretty hard to figure out from the Harbormaster side. Specifically, the Build step UI does not show field value anywhere, so the presence of unapproved blueprints is not communicated. This is much more clear in Drydock. I'll plan to address this in future changes to Harbormaster, since there are other related/similar issues anyway.
Test Plan: {F872527}
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9519
Differential Revision: https://secure.phabricator.com/D14254
Summary:
Ref T9519. This is like 80% of the way there and doesn't fully work yet, but roughly shows the shape of things to come. Here's how it works:
First, there's a new custom field type for blueprints which works like a normal typeahead but has some extra logic. It's implemented this way to make it easy to add to Blueprints in Drydock and Build Plans in Harbormaster. Here, I've added a "Use Blueprints" field to the "WorkingCopy" blueprint, so you can control which hosts the working copies are permitted to allocate on:
{F869865}
This control has a bit of custom rendering logic. Instead of rendering a normal list of PHIDs, it renders an annotated list with icons:
{F869866}
These icons show whether the blueprint on the other size of the authorization has approved this object. Once you have a green checkmark, you're good to go.
On the blueprint side, things look like this:
{F869867}
This table shows all the objects which have asked for access to this blueprint. In this case it's showing that one object is approved to use the blueprint since I already approved it, but by default new requests come in here as "Authorization Requested" and someone has to go approve them.
You approve them from within the authorization detail screen:
{F869868}
You can use the "Approve" or "Decline" buttons to allow or prevent use of the blueprint.
This doesn't actually do anything yet -- objects don't need to be authorized in order to use blueprints quite yet. That will come in the next diff, I just wanted to get the UI in reasonable shape first.
The authorization also has a second piece of state, which is whether the request from the object is active or inactive. We use this to keep track of the authorization if the blueprint is (maybe temporarily) deleted.
For example, you might have a Build Plan that uses Blueprints A and B. For a couple days, you only want to use A, so you remove B from the "Use Blueprints: ..." field. Later, you can add B back and it will connect to its old authorization again, so you don't need to go re-approve things (and if you're declined, you stay declined instead of being able to request authorization over and over again). This should make working with authorizations a little easier and less labor intensive.
Stuff not in this diff:
- Actually preventing any allocations (next diff).
- Probably should have transactions for approve/decline, at least, at some point, so there's a log of who did approvals and when.
- Maybe should have a more clear/loud error state when no blueprints are approved?
- Should probably restrict the typeahead to specific blueprint types.
Test Plan:
- Added the field.
- Typed some stuff into it.
- Saw the UI update properly.
- Approved an authorization.
- Declined an authorization.
- Saw active authorizations on a blueprint page.
- Didn't see any inactive authroizations there.
- Clicked "View All Authorizations", saw all authorizations.
Reviewers: chad, hach-que
Reviewed By: chad
Maniphest Tasks: T9519
Differential Revision: https://secure.phabricator.com/D14251
Summary:
Ref T9252. I think there's a more complex version of this problem discussed elsewhere, but here's what we hit today:
- 5 commits land at the same time and trigger 5 builds.
- All of them go to acquire a working copy.
- Working copies have a limit of 1 right now, so 1 of them gets the lease on it.
- The other 4 all trigger allocation of //new// working copies. So now we have: 1 active, leased working copy and 4 pending, leased working copies.
- The 4 pending working copies will never activate without manual intervention, so these 4 builds are stuck forever.
To fix this, prevent WorkingCopies from giving out leases until they activate. So now the leases won't acquire until we know the working copy is good, which solves the first problem.
However, this creates a secondary problem:
- As above, all 5 go to acquire a working copy.
- One gets it.
- The other 4 trigger allocations, but no longer acquire leases. This is an improvement.
- Every time the leases update, they trigger another allocation, but never acquire. They trigger, say, a few thousand allocations.
- Eventually the first build finishes up and the second lease acquires the working copy. After some time, all of the builds finish.
- However, they generated an unboundedly large number of pending working copy resources during this time.
This is technically "okay-ish", in that it did work correctly, it just generated a gigantic mess as a side effect.
To solve this, at least for now, provide a mechanism to impose allocation rate limits and put a cap on the number of allocating resources of a given type. As hard-coded, this the greater of "1" or "25% of the active resources in the pool".
So if there are 40 working copies active, we'll start allocating up to 10 more and then cut new allocations off until those allocations get sorted out. This prevents us from getting runaway queues of limitless size.
This also imposes a total active working copy resource limit of 1, which incidentally also fixes the problem, although I expect to raise this soon.
These mechanisms will need refinement, but the basic idea is:
- Resources which aren't sure if they can actually activate should wait until they do activate before allowing leases to acquire them. I'm fairly confident this rule is a reasonable one.
- Then we limit how many bookkeeping side effects Drydock can generate once it starts encountering limits.
Broadly, some amount of mess is inevitable because Drydock is allowed to try things that might not work. In an extreme case we could prevent this mess by setting all these limits at "1" forever, which would degrade Drydock to effectively be a synchronous, blocking queue.
The idea here is to put some amount of slack in the system (more than zero, but less than infinity) so we get the performance benefits of having a parallel, asyncronous system without a finite, manageable amount of mess.
Numbers larger than 0 but less than infinity are pretty tricky, but I think rules like "X% of active resources" seem fairly reasonable, at least for resources like working copies.
Test Plan:
Ran something like this:
```
for i in `seq 1 5`; do sh -c '(./bin/harbormaster build --plan 10 rX... &) &'; done;
```
Saw 5 plans launch, acquire leases, proceed in an orderly fashion, and eventually finish successfully.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14236
Summary:
Ref T9252. Currently, Harbormaster and Drydock work like this in some cases:
# Queue a lease for activation.
# Then, a little later, save the lease PHID somewhere.
# When the target/resource is destroyed, destroy the lease.
However, something can happen between (1) and (2). In Drydock this window is very short and the "something" would have to be a lighting strike or something similar, but in Harbormaster we wait until the resource activates to do (2) so the window can be many minutes long. In particular, a user can use "Abort Build" during those many minutes.
If they do, the target is destroyed but it doesn't yet have a record of the artifact, so the artifact isn't cleaned up.
Make these things work like this instead:
# Create a new lease and pre-generate a PHID for it.
# Save that PHID as something that needs to be cleaned up.
# Queue the lease for activation.
# When the target/resource is destroyed, destroy the lease if it exists.
This makes sure there's no step in the process where we might lose track of a lease/resource.
Also, clean up and standardize some other stuff I hit.
Test Plan:
- Stopped daemons.
- Restarted a build in Harbormaster.
- Stepped through the build one stage at a time using `bin/worker execute ...`.
- After the lease was queued, but before it activated, aborted the build.
- Processed the Harbormaster side of things only.
- Saw the lease get destroyed properly.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14234
Summary:
Fixes T9494. This:
- Removes all the random GC.x.y.z config.
- Puts it all in one place that's locked and which you use `bin/garbage set-policy ...` to adjust.
- Makes every TTL-based GC configurable.
- Simplifies the code in the actual GCs.
Test Plan:
- Ran `bin/garbage collect` to collect some garbage, until it stopped collecting.
- Ran `bin/garbage set-policy ...` to shorten policy. Saw change in web UI. Ran `bin/garbage collect` again and saw it collect more garbage.
- Set policy to indefinite and saw it not collect garabge.
- Set policy to default and saw it reflected in web UI / `collect`.
- Ran `bin/phd debug trigger` and saw all GCs fire with reasonable looking queries.
- Read new docs.
{F857928}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9494
Differential Revision: https://secure.phabricator.com/D14219
Summary: Ref T9494. Depends on D14216. Remove 10 copies of this code.
Test Plan: Ran `arc unit --everything`, browsed Config > Modules, clicked around Herald / etc.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9494
Differential Revision: https://secure.phabricator.com/D14217
Summary:
Ref T9252. This primarily allows Harbormaster to request (and Drydock to fulfill) working copies with a patch from a staging area. Doing this means we can do builds on in-review changes from `arc diff`.
This is a little cobbled-together but should basically work.
Also fix some other issues:
- Yielded, awakend workers are fine to update but could complain.
- We can't log slot lock failures to resources if we don't end up saving them.
- Killing the transaction would wipe out the log.
- Fix some TODOs, etc.
Test Plan: Ran Harbormaster builds on a local revision.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14214
Summary:
Ref T9252. Long ago you sometimes manually created resources, so they had human-enterable names. However, users never make resources manually any more, so this field isn't really useful any more.
In particular, it means we write a lot of untranslatable strings like "Working Copy" to the database in the default locale. Instead, do the call at runtime so resource names are translatable.
Also clean up a few minor things I hit while kicking the tires here.
It's possible we might eventually want to introduce a human-choosable label so you can rename your favorite resources and this would just be a default name. I don't really have much of a use case for that yet, though, and I'm not sure there will ever be one.
Test Plan:
- Restarted a Harbormaster build, got a clean build.
- Released all leases/resources, restarted build, got a clean build with proper resource names.
Reviewers: hach-que, chad
Reviewed By: hach-que, chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14213
Summary:
Ref T9252. See companion change in D14211. This does the same thing for leases.
Particularly, most of the TODOs about error handling can just be removed because they'll do the right things by default now.
This and D14211 also move slot lock release to after resource destruction. This feels cleaner than trying to release early at release/break.
Test Plan: Restarted a Harbormaster build, got a clean build result. This needs more vetting but I'll clean up any issues as I hit them.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14212
Summary:
Ref T9252. Currently, error handling behavior isn't great and a lot of errors aren't dealt with properly. Try to improve this by making default behaviors better:
- Yields, slot lock exceptions, and aggregate or proxy exceptions containing an excpetion of these types turn into yields.
- All other exceptions are considered permanent failures. They break the resource and
This feels a little bit "magical" but I want to try to get the default behaviors to align reasonably well with expectations so that blueprints mostly don't need to have a ton of error handling. This will probably need at least some refinement down the road, but it's a reasonable rule for all exception/error conditions we currently have.
Test Plan: I did a clean build, but haven't vetted this super thoroughly. Next diff will do the same thing to leases, then I'll work on stabilizing this code better.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14211
Summary: Ref T9252. Add a bit more logging and improve some behaviors.
Test Plan: Restarted a build, got a good result.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14210
Summary:
Ref T9252. This is the same as D14201, but for lease stuff instead of resource stuff.
This one is a little heavier but still feels pretty reasonable to me at the end of the day (worker is <1K lines and has a ton of comment stuff).
Also fixes a few random bugs I hit in the task queue.
Test Plan:
- Restarted some Harbormaster builds, saw them go through cleanly.
- Released pre-activation resources/leases.
- Probably still kinda buggy but I'll iron the details out over time.
Logs are starting to look somewhat plausible:
{F855747}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14202
Summary:
Ref T9252. Currently, Drydock Leases and Resources have several workers:
- Resources: ResourceWorker, ResourceUpdateWorker, ResourceDestroyWorker
- Leases: AllocatorWorker, LeaseWorker, LeaseUpdateWorker, LeaseDestroyWorker
This is kind of a lot of stuff, and it creates some problems.
In particular, leases and resources in early lifecycle phases (pending/allocating/acquiring) can't process commands yet, because that code is only in the "UpdateWorker" classes. If they aren't able to move forward because of a bug, they also can't be released because they can't react to the release command until later in their lifecycle. This creates a soft hang where I have to go wipe stuff out of the database since there's no other way to get rid of it.
Instead, I want leases and resources to be releasable from any (pre-release / pre-destroy) phase of their lifecycle. To support this, all the workers before the "UpdateWorker" need to be able to process commands.
A second, similar issue is that logging and exception handling behaviors are underpowered right now. Elsewhere I began improving this, but ran into issues where all of the workers needed to share very similar exception code. Merging them will make this future change simpler.
This diff fixes this for resources: it merges the Worker, UpdateWorker and DestroyWorker logic into UpdateWorker and throws away the other two workers.
Test Plan: Nothing substantive yet, see next diff. I'll do the same thing for Leases, then test both more thoroughly.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14201
Summary: Ref T9252. Make log types modular so they can be translated and have complicated rendering logic if necessary (currently, none have this).
Test Plan: {F855330}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14198
Summary:
Ref T9252. Drydock logs are almost exclusively useful as a diagnostic tool for debugging immediate problems, so GC them fairly aggressively.
(I expect 99% of the usefulness of these logs to be within the first 24 hours, basically "why isn't my thing working". I can't really think of any cases where having old logs would be useful.)
Test Plan:
- Ran GC, saw it hit the log table (with no effect).
- Changed TTL from 30 days to 30 seconds, ran GC, saw it wipe recent logs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14197
Summary:
Ref T9252. Several general changes here:
- Moves logs to use PHIDs instead of IDs. This generally improves flexibility (for example, it's a lot easier to render handles).
- Adds `blueprintPHID` to logs. Although you can usually figure this out from the leasePHID or resourcePHID, it lets us query relevant logs on Blueprint views.
- Instead of making logs a top-level object, make them strictly a sub-object of Blueprints, Resources and Leases. So you go Drydock > Lease > Logs, etc., to get to logs.
- I might restore the "everything" view eventually, but it doesn't interact well with policies and I'm not sure it's very useful. A policy-violating `bin/drydock log` might be cleaner.
- Policy-wise, we always show you that logs exist, we just don't show you log content if it's about something you can't see. This is similar to seeing restricted handles in other applications.
- Instead of just having a message, give logs "type" + "data". This will let logs be more structured and translatable. This is similar to recent changes to Herald which seem to have worked well.
Test Plan:
Added some placeholder log writes, viewed those logs in the UI.
{F855199}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14196
Summary: Ref T9252. We're currently resetting to the local branch, but should be resetting to the origin branch.
Test Plan: Restarted a build, had it run `git show`, saw proper HEAD.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14194
Summary: Replaces D13687. Leases track an owner but don't currently show it.
Test Plan:
Looked at a lease.
{F851223}
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D14191
Summary:
Ref T9252. For building Phabricator itself, we need to have `libphutil/`, `arcanist/` and `phabricator/` next to one another on disk.
Expand the Drydock WorkingCopy resource so that it can have multiple repositories if the caller needs them.
I'm not sure if I'm going to put the actual config for this in Harbormaster or Drydock yet, but the WorkingCopy resource itself should work the same way in either case.
Test Plan: Restarted a Harbormaster build which leases a working copy, saw it build as expected.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14180
Summary:
Ref T9252. Currently, Harbormaster does this when trying to acquire a working copy:
- Ask for a working copy.
- Yield for 15 seconds.
- Check if we have a working copy yet.
That's OK, but Drydock takes ~1s to acquire a working copy lease if a resource is already available, so we end up doing this:
- T+0: Ask for a working copy.
- T+0: Yield for 15 seconds.
- T+1: Working copy lease activates.
- T+15: Working copy lease is used.
- T+16: Build finishes.
So we end up spending about 2 seconds doing work and 14 seconds sleeping.
One way to fix this would be to fiddle with the yield duration, so we yield for 1, 2, 4, ... seconds or something. This probably isn't a bad idea for longer leases (i.e., wait for 15, 30, 45 ... seconds or similar) but it implies a lot of churn for short leases.
Instead, let tasks "awaken" other tasks when they complete. The "awaken" operation means: if a task is in a yielded state (no failures, no owner, explicitly yielded, future expires time), pretend it only yielded until right now instead of whenever it really yielded to.
Basically, this rewrites history so that even though Harbormaster did a `yield(15)`, we pretend it did a `yield(4)` after we activate the lease if lease activation took 4 seconds.
If this misses, it's fine: we fall back to the normal yield behavior and things move forward normally a few seconds later.
If it hits, we get a more nimble process pretty cleanly.
Test Plan:
- Restarted a build plan (lease working copy + run `ls`) with this patch no-op'd, took about 16 seconds.
- Restarted a build plan with this patch active, took about 1 second.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14178
Summary: Ref T9252. Show the user when a resource or lease has a pending release command in queue.
Test Plan: Released a resource and lease from the web UI. In both cases, saw a "releasing" tag and the action disable.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14177
Summary:
Fixes T6569. This implements an expiry mechanism for Drydock resources which parallels the mechanism for leases.
A few things are missing that we'll probably need in the future:
- An "EXPIRES" command to update the expiration time. This would let resources be permanent while leased, then expire after, say, 24 hours without any leases.
- A callback like `shouldActuallyExpireRightNow()` for resources and leases that lets them decide not to expire at the last second.
- A callback like `didAcquireLease()` for resource blueprints, to parallel `didReleaseLease()`, letting them clear or extend their timer.
However, this stuff would mostly just let us tune behaviors, not really open up new capabilities.
Test Plan: Changed host resources to expire after 60 seconds, leased one, saw it vanish 60 seconds later.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T6569
Differential Revision: https://secure.phabricator.com/D14176
Summary: Ref T9252. This is still crude in a few ways but basically works, at least for commits.
Test Plan:
- Made a build plan with just this build step.
- Ran `bin/harbormaster build --plan 10 ...` on a commit.
- It actually built a working copy, leased it, took no action, and released the lease. MAGIC~~~
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14160
Summary: Ref T9252. This is the same as D14157, just for Resources and their leases.
Test Plan: Viewed a resource, saw only active leases, clicked "View All Leases", queried, clicked around, used crumbs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14158
Summary:
Ref T9252. Currently, Drydock blueprint pages:
- show all resources, even if there are a million;
- show resources in all states, although destroyed resources are usually uninteresting;
- have some junky `$pager` code.
Instead, show the few most recent active resources and link to a filtered resource view in ApplicationSearch.
Test Plan:
- Viewed some blueprints.
- Clicked "View All Resources".
- Saw all resources.
- Used query / crumbs / etc.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14157
Summary: Ref T9252. If you have a blueprint and you do not like that blueprint very much, you can disable it.
Test Plan: Disabled / enabled some blueprints.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14156
Summary:
Ref T9252. Move these to the more modern stuff to pick up ordering and interface support for free.
Also work around the blueprint / custom field integration a little more gracefully.
Test Plan: Searched for blueprints, resources and leases.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14155
Summary: Ref T9252. Resources always have a corresponding blueprint, and it makes sense to use the same policies for both.
Test Plan: Viewed resources in web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14154
Summary:
Ref T9252. Drydock currently uses integer statuses, but there's no reason for this (they don't need to be ordered) and it makes debugging them, working with them, future APIs, etc., more cumbersome.
Switch to string instead.
Also rename `STATUS_OPEN` to `STATUS_ACTIVE` and `STATUS_CLOSED` to `STATUS_RELEASED` for consistency. This makes resources and leases have more similar states, and gives resource states more accurate names.
Test Plan: Browsed web UI, grepped for changed constants, applied patch, inspected database.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14153
Summary:
Ref T9252. Leases currently have a `resourceID`, but this is a bit nonstandard and generally less flexible than giving them a `resourcePHID`.
In particular, a `resourcePHID` is easier to use when rendering interfaces, since you can get handles out of a PHID.
Add a PHID column, copy over all the PHIDs that correspond to existing IDs, then drop the ID column.
Test Plan:
- Browsed web UIs.
- Inspected database during/after migration.
- Grepped for `resourceID`.
- Allocated a new lease with `bin/drydock lease`.
Reviewers: chad, hach-que
Reviewed By: hach-que
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14151
Summary: Ref T9252. This is now more consistent (same as the equivalent Resource state) and accurate (leases can end up in this state a bunch of ways, including by expiring).
Test Plan: `grep`, browsed around web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14150
Summary: Ref T6569. If a lease is activated with an expiration date, schedule a task to try to clean it up after that time.
Test Plan:
- Used `bin/drydock lease ... --until ...` to activate a lease in the near future.
- Waited for a bit.
- Saw it expire and get destroyed at the scheduled time.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T6569
Differential Revision: https://secure.phabricator.com/D14148
Summary:
Ref T9252. This simplifies some Drydock code.
Most of this code relates to the old notion of Drydock being able to enumerate all the tasks it needs to complete in order to acquire a lease. The code has stepped back from this, since it's unnecessary, the queue is more powerful than it used to be, and it would be a lot of work to keep track of.
The ~only thing that should ever wait for leases in modern code is `bin/drydock lease`, and it's fine for it to just sit there sleeping, so this just does that.
This reduces the granularity of logging, but I'll address that separately in future logging-focused changes.
Test Plan: Used `bin/drydock lease` to acquire a lease, saw it acquire cleanly.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14147