Summary:
Depends on D19750. See T13210. The `bin/drydock lease` command makes it easier to request ad-hoc leases, but currently takes lease attributes in the form `--attributes x=y,a=b`.
This was okay for all leases at the time, but doesn't really work for modern WorkingCopy resources since they take a `repositories.map` which has a dictionary as a value. You can't specify that with `repositories.map=...`.
Instead, point `--attributes` at a JSON file or use `--attributes -` to read from stdin.
Test Plan: Used `--attributes` with a file and stdin to allocate working copy leases with repositories.
Reviewers: amckinley
Reviewed By: amckinley
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D19751
Summary: See PHI513. `fprintf()` takes `(thing, pattern, args, ...)` but we aren't passing a `pattern`, so if the command returns a "%" in the output we get an error.
Test Plan:
- Installed `bytes`, a great useful program which prints all the bytes, on my HoaxOS(tm) system (see D19102).
```
epriestley@orbital ~/dev/phabricator $ ./bin/drydock command --lease 76287 -- bytes # Before patch.
[2018-03-29 02:09:08] ERROR 2: fprintf(): Too few arguments at [/Users/epriestley/dev/core/lib/phabricator/src/applications/drydock/management/DrydockManagementCommandWorkflow.php:60]
arcanist(head=experimental, ref.master=b8c9c385a7f5, ref.experimental=925c60e7b837), corgi(head=master, ref.master=6371578c9d32), instances(head=master, ref.master=d983b9517924), ledger(head=master, ref.master=4da4a24b8779), libcore(), phabricator(head=hoax1, ref.master=b586ee065a75, ref.hoax1=f8d7480bbdd1, custom=4), phutil(head=master, ref.master=1ad42491e44a), secure(head=master, ref.master=988cf9bd7958), services(head=master, ref.master=6b3fb8d8dd0a)
#0 fprintf(resource, string) called at [<phabricator>/src/applications/drydock/management/DrydockManagementCommandWorkflow.php:60]
#1 DrydockManagementCommandWorkflow::execute(PhutilArgumentParser) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:441]
#2 PhutilArgumentParser::parseWorkflowsFull(array) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:333]
#3 PhutilArgumentParser::parseWorkflows(array) called at [<phabricator>/scripts/drydock/drydock_control.php:21]
epriestley@orbital ~/dev/phabricator $ ./bin/drydock command --lease 76287 -- bytes # After patch.
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
```
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D19264
Summary: Ref T13073. The new log output from `bin/drydock lease` currently uses HTML handle rendering, but should render to text.
Test Plan: Ran `bin/drydock lease` and saw normal text in log output. Viewed the same logs from the web UI and saw HTML.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19101
Summary:
Depends on D19076. Ref T13073. Blueprints are stored as an attribute and `setAttributes()` overwrites all attributes.
This is sorta junk but make it less obviously broken, at least.
Test Plan: Ran `bin/drydock lease --type working-copy --attributes x=y` without instantly getting a fatal about "no blueprint PHIDs".
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19077
Summary:
Depends on D19072. Ref T13073. Currently, you can leave leases stranded by using `^C` to interrupt the script. Handle signals and release leases on destruction if they haven't activated yet.
Also, print out more useful information before and after activation.
Test Plan: Mashed ^C while runnning `bin/drydock lease ... --trace`, saw the lease release.
Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19073
Summary: Depends on D19071. Ref T13073. While the daemons are supposedly doing things, show the user any logs they generate. There's often something relevant but unearthing it can be involved.
Test Plan: {F5427773}
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13073
Differential Revision: https://secure.phabricator.com/D19072
Summary:
Fixes T9994. Currently, when Drydock can't allocate a new resource because some limit has been reached, it waits patiently for a resource to become available.
It is possible that no resource will ever become available. Particularly with "Working Copy" resources, the new lease may want a copy of `rB`, but the resource may already be maxed out on `rA`.
Right now, no process exists to automatically reclaim the unused `rA`.
When we encounter this situation, try to reclaim one of the other resources if it is just sitting there unused.
Specifically:
- Add a "reclaim" command which means "release this resource //if// it is completely unused".
- Add a `bin/drydock reclaim` to send this command to every active resource.
- When we try to acquire a resource and can't, but only because of some kind of limit / utilization problem, try to release an unused resource to free up some room.
Test Plan:
- Set "Working Copy" resource limit to 1.
- Ran "Test Configuration" in `rA`, which worked.
- Ran "Test Configuration" in `rB`, which hung forever.
- Applied patch.
- Ran "Test Configuration" in `rB`, saw it reclaim the `rA` resource, use the slot, then succeed.
- Ran "Test Configuration" in `rA` again, saw it grab the slot back.
- Ran `bin/drydock reclaim` and saw it reclaim a bunch of old orphaned resources.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9994
Differential Revision: https://secure.phabricator.com/D14819
Summary:
Ref T9519. When acquiring leases on resources:
- Only consider resources created by authorized blueprints.
- Only consider authorized blueprints when creating new resources.
- Fail with a tailored error if no blueprints are allowed.
- Fail with a tailored error if missing authorizations are causing acquisition failure.
One somewhat-substantial issue with this is that it's pretty hard to figure out from the Harbormaster side. Specifically, the Build step UI does not show field value anywhere, so the presence of unapproved blueprints is not communicated. This is much more clear in Drydock. I'll plan to address this in future changes to Harbormaster, since there are other related/similar issues anyway.
Test Plan: {F872527}
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9519
Differential Revision: https://secure.phabricator.com/D14254
Summary: Ref T6569. If a lease is activated with an expiration date, schedule a task to try to clean it up after that time.
Test Plan:
- Used `bin/drydock lease ... --until ...` to activate a lease in the near future.
- Waited for a bit.
- Saw it expire and get destroyed at the scheduled time.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T6569
Differential Revision: https://secure.phabricator.com/D14148
Summary: Ref T9252. Some leases or resources may need to remove data, tear down VMs, etc., during cleanup. After they are released, queue a "destroy" phase for performing teardown.
Test Plan:
- Used `bin/drydock lease ...` to create a working copy lease.
- Used `bin/drydock release-lease` and `bin/drydock release-resource` to release the lease and then the working copy and host.
- Saw working copy and host get destroyed and cleaned up properly.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T6569, T9252
Differential Revision: https://secure.phabricator.com/D14144
Summary:
Ref T9252. Broadly, Drydock currently races on releasing objects from the "active" state. To reproduce this:
- Scatter some sleep()s pretty much anywhere in the release code.
- Release several times from web UI or CLI in quick succession.
Resources or leases will execute some release code twice or otherwise do inconsistent things.
(I didn't chase down a detailed reproduction scenario for this since inspection of the code makes it clear that there are no meaningful locks or mechanisms preventing this.)
Instead, add a Harbormaster-style command queue to resources and leases. When something wants to do a release, it adds a command to the queue and schedules a worker. The workers acquire a lock, then try to consume commands from the queue.
This guarantees that only one process is responsible for writes to active resource/leases.
This is the last major step to giving resources and leases a single writer during all states:
- Resource, Unsaved: AllocatorWorker
- Resource, Pending: ResourceWorker (Possible rename to "Allocated?")
- Resource, Open: This diff, ResourceUpdateWorker. (Likely rename to "Active").
- Resource, Closed/Broken: Future destruction worker. (Likely rename to "Released" / "Broken"; maybe remove "Broken").
- Resource, Destroyed: No writes.
- Lease, Unsaved: Whatever wants the lease.
- Lease, Pending: AllocatorWorker
- Lease, Acquired: LeaseWorker
- Lease, Active: This diff, LeaseUpdateWorker.
- Lease, Released/Broken: Future destruction worker (Maybe remove "Broken"?)
- Lease, Expired: No writes. (Likely rename to "Destroyed").
In most phases, we can already guarantee that there is a single writer without doing any extra work. This is more complicated in the "Active" case because the release buttons on the web UI, the release tools on the CLI, the lease requestor itself, the garbage collector, and any other release process cleaning up related objects may try to effect a release. All of these could race one another (and, in many cases, race other processes from other phases because all of these get to act immediately) as this code is currently written. Using a queue here lets us make sure there's only a single writer in this phase.
One thing which is notable is that whatever acquires a lease **can not write to it**! It is never the writer once it queues the lease for activation. It can not write to any resources, either. And, likewise, Blueprints can not write to resources while acquiring or releasing leases.
We may need to provide a mechinism so that blueprints and/or resource/lease holders get to attach some storage to resources/leases for bookkeeping. For example, a blueprint might need to keep some kind of cache on a resource to help it manage state. But I think we can cross that bridge when we come to it, and nothing else would need to write to this storage so it's technically straightforward to introduce such a mechanism if we need one.
Test Plan:
- Viewed buttons in web UI, checked enabled/disabled states.
- Clicked the buttons.
- Saw commands show up in the command queue.
- Saw some daemon stuff get scheduled.
- Ran CLI tools, saw commands get consumed and resources/leases release.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14143
Summary:
Ref T9253. For resources and leases that need to do something which takes a lot of time or requires waiting, allow them to allocate/acquire first and then activate later.
When we allocate a resource or acquire a lease, the blueprint can either activate it immediately (if all the work can happen quickly/inline) or activate it later. If the blueprint activates it later, we queue a worker to handle activating it.
Rebuild the "working copy" blueprint to work with this model: it allocates/acquires and activates in a separate step, once it is able to acquire a host.
Test Plan: With some power of imagination, brought up a bunch of working copies with `bin/drydock lease --type working-copy ...`
Reviewers: hach-que, chad
Reviewed By: hach-que, chad
Maniphest Tasks: T9253
Differential Revision: https://secure.phabricator.com/D14127
Summary: Ref T9253. Provide a meaningful command interface for Almanac hosts.
Test Plan:
Configued and leased a real host (`sbuild001.phacility.net`) and ran a command on it.
```
$ ./bin/drydock command --lease 90 -- ls /
bin
boot
core
dev
etc
home
initrd.img
lib
lib64
lost+found
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
vmlinuz
```
Reviewers: chad, hach-que
Reviewed By: chad, hach-que
Maniphest Tasks: T9253
Differential Revision: https://secure.phabricator.com/D14126
Summary:
Ref T9253. Broadly, this realigns Allocator behavior to be more consistent and straightforward and amenable to intended future changes.
This attempts to make language more consistent: resources are "allocated" and leases are "acquired".
This prepares for (but does not implement) optimistic "slot locking", as discussed in D10304. Although I suspect some blueprints will need to perform other locking eventually, this does feel like a good fit for most of the locking blueprints need to do.
In particular, I've made the blueprint operations on `$resource` and `$lease` objects more purposeful: they need to invoke an activator on the appropriate object to be implemented correctly. Before they invoke this activator method, they configure the object. In a future diff, this configuration will include specifying slot locks that the lease or resource must acquire. So the API will be something like:
$lease
->setActivateWhenAcquired(true)
->needSlotLock('x')
->needSlotLock('y')
->acquireOnResource($resource);
In the common case where slot locks are a good fit, I think this should make correct blueprint implementation very straightforward.
This prepares for (but does not implement) resources and leases which need significant setup steps. I've basically carved out two modes:
- The "activate immediately" mode, as here, immediately opens the resource or activates the lease. This is appropriate if little or no setup is required. I expect many leases to operate in this mode, although I expect many resources will operate in the other mode.
- The "allocate now, activate later" mode, which is not fully implemented yet. This will queue setup workers when the allocator exits. Overall, this will work very similarly to Harbormaster.
- This new structure makes it acceptable for blueprints to sleep as long as they want during resource allocation and lease acquisition, so long as they are not waiting on anything which needs to be completed by the queue. Putting a `sleep(15 * 60)` in your EC2Blueprint to wait for EC2 to bring a machine up will perform worse than using delayed activation, but won't deadlock the queue or block any locks.
Overall, this flow is more similar to Harbormaster's flow. Having consistency between Harbormaster's model and Drydock's model is good, and I think Harbormaster's model is also simply much better than Drydock's (what exists today in Drydock was implemented a long time ago, and we had more support and infrastructure by the time Harbormaster was implemented, as well as a more clearly defined problem).
The particular strength of Harbormaster is that objects always (or almost always, at least) have a single, clearly defined writer. Ensuring objects have only one writer prevents races and makes reasoning about everything easier.
Drydock does not currently have a clearly defined single writer, but this moves us in that direction. We'll probably need more primitives eventually to flesh this out, like Harbormaster's command queue for messaging objects which you can't write to.
This blueprint was originally implemented in D13843. This makes a few changes to the blueprint itself:
- A bunch of code from that (e.g., interfaces) doesn't exist yet.
- I let the blueprint have multiple services. This simplifies the code a little and seems like it costs us nothing.
This also removes `bin/drydock create-resource`, which no longer makes sense to expose. It won't get locking, leasing, etc., correct, and can not be made correct.
NOTE: This technically works but doesn't do anything useful yet.
Test Plan: Used `bin/drydock lease --type host` to acquire leases against these blueprints.
Reviewers: hach-que, chad
Reviewed By: hach-que, chad
Subscribers: Mnkras
Maniphest Tasks: T9253
Differential Revision: https://secure.phabricator.com/D14117
Summary: Ref T6822.
Test Plan: `grep`. This method is only called from within `PhutilArgumentWorkflow::__construct`.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11415
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.
Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.
Test Plan: Lint; ran some of the scripts.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7838
Summary:
Ref T2015. Moves a bunch of raw object loads into modern policy-aware queries.
Also straightens out the Log and Lease policies a little bit: there are legitimate states where these objects are not attached to a resource (particularly, while a lease is being acquired). Handle these more gracefully.
Test Plan: Lint / browsed stuff.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7836
Summary:
Ref T2015. Currently, Drydock has a `wait-for-lease` workflow which is invoked in the background by the `lease` workflow.
The goal of this mechanism is to allow `bin/drydock lease` to print out logs as the lease is acquired. However, this predates the `runAllTasksInProcess` flags, and they provide a simpler and more robust way (potentially with `--trace` and `PhutilConsole`) to do synchronous execution and debug logging.
Simplify this whole mechanism: just run everything in-process in `bin/drydock lease`, and do logging via `--trace`. We could thread a `PhutilConsole` through things too, but this seems good enough for now.
Also various cleanup/etc.
Test Plan: Ran `bin/drydock lease`. Ran `bin/harbormaster build X --plan Y`, for `Y` being a Drydock-dependent build plan.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7835
Summary: Ref T2015. Update DrydockLog for policy awareness and give it a policy query.
Test Plan: Browsed all the log interfaces.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7831
Summary: Ref T2015. DrydockLease predates widespread adoption of policies. Make it -- and its query -- policy aware.
Test Plan: Browsed leases from the web UI. Grepped for callsites.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7826
Summary:
//(this diff used to be about applying policies to blueprints)//
This restructures Drydock so that blueprints are instances in the DB, with an associated implementation class. Thus resources now have a `blueprintPHID` instead of `blueprintClass` and DrydockBlueprint becomes a DAO. The old DrydockBlueprint is renamed to DrydockBlueprintImplementation, and the DrydockBlueprint DAO has a `blueprintClass` column on it.
This now just implements CAN_VIEW and CAN_EDIT policies for blueprints, although they are probably not enforced in all of the places they could be.
Test Plan: Used the `create-resource` and `lease` commands. Closed resources and leases in the UI. Clicked around the new and old lists to make sure everything is still working.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T4111, T2015
Differential Revision: https://secure.phabricator.com/D7638
Summary:
This adds a Drydock blueprint for preallocated, remote hosts. This will be used by the Harbormaster interface to allow users to specify remote hosts that builds can be run on.
This adds a `canAllocateResource` method to Drydock blueprints; it is used to detect whether a blueprint can allocate a resource for the given type and attributes.
Test Plan:
Ran:
```
bin/drydock lease --type host --attributes remote=true,preallocated=true,host=192.168.56.101,port=22,user=james,keyfile=,path=C:\\Build\\,platform=windows
```
and saw the "C:\Build\<id>" folder appear on the remote Windows machine. Viewed the lease and resource in Drydock as well.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran, jamesr
Maniphest Tasks: T4111
Differential Revision: https://secure.phabricator.com/D7593
Summary: Permit the forcible release of Drydock leases. The implementation isn't very exciting for now.
Test Plan: Released leases via web and CLI.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D4181
Summary: This does nothing fancy, just closes the resource and releases/breaks leases. They'll get cleaned up in some to-be-written GC process.
Test Plan: Closed resources from web UI and CLI.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D3998
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary:
Tightens up a bunch of stuff:
- In `drydock lease`, pull and print logs so the user can see what's happening.
- Remove `DrydockAllocator`, which was a dumb class that did nothing. Move the tiny amount of logic it held directly to `DrydockLease`.
- Move `resourceType` from worker task metadata directly to `DrydockLease`. Other things (like the web UI) can be more informative with this information available.
- Pass leases to `allocateResource()`. We always allocate in response to a lease activation request, and the lease often has vital information. This also allows us to associate logs with leases correctly.
Test Plan: Ran `drydock lease --type host` and saw it perform a host allocation in EC2.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D3870
Summary: Add a bin/drydock symlink and break it into workflows. Nothing too special here.
Test Plan: Ran `bin/drydock wait-for-lease`, `bin/drydock lease`, `bin/drydock help`, etc.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D3867