Summary:
Ref T5822. This prepares for inline compression and garbage collection of build logs.
This reduces the API surface area and removes a log from the "wait" step that would just log a message every 15 seconds. (If this is actually useful, I think we find a better way to communicate it.)
Test Plan:
Ran a build, saw a log:
{F1136691}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T5822
Differential Revision: https://secure.phabricator.com/D15371
Summary:
Ref T9123. To run upstream builds in Harbormaster/Drydock, we need to be able to check out `libphutil`, `arcanist` and `phabricator` next to one another.
This adds an "Also Clone: ..." field to Harbormaster working copy build steps so I can type all three repos into it and get a proper clone with everything we need.
This is somewhat upstream-centric and a bit narrow, but I don't think it's totally unreasonable, and most of the underlying stuff is relatively general.
This adds some more typechecking and improves data/type handling for custom fields, too. In particular, it prevents users from entering an invalid/restricted value in a field (for example, you can't "Also Clone" a repository you don't have permission to see).
Test Plan: Restarted build, got a Drydock resource with multiple repositories in it.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9123
Differential Revision: https://secure.phabricator.com/D14183
Summary:
Ref T9252. Currently, Harbormaster does this when trying to acquire a working copy:
- Ask for a working copy.
- Yield for 15 seconds.
- Check if we have a working copy yet.
That's OK, but Drydock takes ~1s to acquire a working copy lease if a resource is already available, so we end up doing this:
- T+0: Ask for a working copy.
- T+0: Yield for 15 seconds.
- T+1: Working copy lease activates.
- T+15: Working copy lease is used.
- T+16: Build finishes.
So we end up spending about 2 seconds doing work and 14 seconds sleeping.
One way to fix this would be to fiddle with the yield duration, so we yield for 1, 2, 4, ... seconds or something. This probably isn't a bad idea for longer leases (i.e., wait for 15, 30, 45 ... seconds or similar) but it implies a lot of churn for short leases.
Instead, let tasks "awaken" other tasks when they complete. The "awaken" operation means: if a task is in a yielded state (no failures, no owner, explicitly yielded, future expires time), pretend it only yielded until right now instead of whenever it really yielded to.
Basically, this rewrites history so that even though Harbormaster did a `yield(15)`, we pretend it did a `yield(4)` after we activate the lease if lease activation took 4 seconds.
If this misses, it's fine: we fall back to the normal yield behavior and things move forward normally a few seconds later.
If it hits, we get a more nimble process pretty cleanly.
Test Plan:
- Restarted a build plan (lease working copy + run `ls`) with this patch no-op'd, took about 16 seconds.
- Restarted a build plan with this patch active, took about 1 second.
Reviewers: hach-que, chad
Reviewed By: chad
Maniphest Tasks: T9252
Differential Revision: https://secure.phabricator.com/D14178
Summary:
Fixes T7370. Two changes:
- Make the default to show nothing, instead of showing all the data. This is a better default because the data is sometimes sensitive. Workers should have to opt in to revealing it.
- For TargetWorkers, link to the target (technically the build, for now, since there's no dedicated target detail page).
Test Plan: {F698325}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T7370
Differential Revision: https://secure.phabricator.com/D13845
Summary:
Ref T8096. Fixes a few bugs and glitches.
- Set build completion time when handling a message.
- Format duration information in a more human-readable way.
- Use a table for build variables.
- Fix up container PHIDs on diffs (a touch hacky, should be OK for now though).
Test Plan: Browsed around the UI.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T8096
Differential Revision: https://secure.phabricator.com/D13382
Summary: Ref T6822. This method is only called from within the `PhabricatorWorker::executeTask()` and `PhabricatorWorker::scheduleTask()` methods.
Test Plan: `grep`ped for `->doWork`.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11406
Summary: Ref T5936. This implements build implementations aborting early when the build has since been restarted. Build steps now periodically poll to see if the build's current generation does not match their generation, and they throw a `HarbormasterBuildAbortedException` if that is the case.
Test Plan: Tested locally on my machine with the sleep build step.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Maniphest Tasks: T5936
Differential Revision: https://secure.phabricator.com/D10322
Summary:
Ref T1049. This keeps track of how long a build target takes to execute in Harbormaster and displays it in the build view page. I'm not sure whether "Started" is really that useful once the target has completed?
Also, I change the name of the time taken depending on whether or not the target has completed; if it's still in progress it's called "Elapsed" and if it's completed then it's "Duration". The primary reason for this is that "Duration" sounds like post tense, whereas "Elapsed" is current tense. I'm not sure whether this is okay or not?
Test Plan: Ran a Sleep build step and saw the target dates / times appear correctly.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: talshiri, epriestley, Korvin
Maniphest Tasks: T5824, T1049
Differential Revision: https://secure.phabricator.com/D10174
Summary: We've received feedback that the "core - exception" is incredibly confusing, to the point where developers see this and write off the build failure as a Phabricator error that is unrelated to their changes.
Test Plan: Ran a build with a `exit 1` run step, didn't see the "core - exception" appear.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D10090
Summary:
For Harbormaster tasks which want to poll or wait, this lets them say "try again a little later" without having to sleep and hold a queue slot.
This is basically the same as failing, except that we don't increment the failure counter. Instead, we just set the current lease to the correct length and then exit. The task will be retried after the lease expires.
Test Plan: Using both `bin/harbormaster` and `phd debug taskmaster`, ran a lot of waiting tasks through the queue, faking them to either yield or not yield in a controlled manner. The queue responded as expected, yielding tasks appropraitely and retrying them later.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D8792
Summary:
This hooks up all the pieces of the build pipeline so `harbormaster.sendmessage` actually works. Particularly:
- Candidate build steps (i.e., those which interact with external systems) can now "Wait for Message". This pauses them indefinitely when they complete, until something calls `harbormaster.sendmessage`.
- After processing a target, we check if we should move it to PASSED or WAITING.
- Before updating a build, we move WAITING targets with pending messages to either PASSED or FAILED.
- I added an explicit "Building" state, which doesn't affect workflows but communicates more information to human users.
A big part of this is avoiding races. I believe we get the correct behavior no matter which order events occur in:
- We update builds after targets complete and after we receive messages, so we're guaranteed to update once both these conditions are true. This means messages can't be lost (even if they arrive before a build completes).
- The minor changes to the build engine logic mean that firing additional build updates is always safe, no matter what the current state of the build is.
- The build itself is protected by a lock in the build engine.
- The target is not covered by an explicit lock, but for all states only the engine (waiting) //or// the worker (all other states) can interact with it. All of the interactions also move the target state forward to the same destination and have no other side effects.
- Messages are only consumed inside the engine lock, so they don't need an explicit lock.
Test Plan:
- Made an HTTP request wait after completion, then ran a pile of builds through it using `bin/harbormaster build` and the web UI.
- Passed and failed message-awaiting builds with `harbormaster.sendmessage`.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, zeeg
Differential Revision: https://secure.phabricator.com/D8788
Summary:
See discussion in D8773. Three small adjustments which should help prevent this kind of issue:
- When queueing followup tasks, hold them on the worker until we finish the task, then queue them only if the work was successful.
- Increase the default lease time from 60 seconds to 2 hours. Although most tasks finish in far fewer than 60 seconds, the daemons are generally stable nowadays and these short leases don't serve much of a purpose. I think they also date from an era where lease expiry and failure were less clearly distinguished.
- Increase the default wait-after-failure from 60 seconds to 5 minutes. This largely dates from the MetaMTA era, where Facebook ran services with high failure rates and it was appropriate to repeatedly hammer them until things went through. In modern infrastructure, such failures are rare.
Test Plan:
- Verified that tasks queued properly after the main task was updated.
- Verified that leases default to 7200 seconds.
- Intentionally failed a task and verified default 300 second wait before retry.
- Removed all default leases shorter than 7200 seconds (there was only one).
- Checked all the wait before retry implementations for anything much shorter than 5 minutes (they all seem reasonable).
Reviewers: btrahan, sowedance
Reviewed By: sowedance
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D8774
Summary:
Ref T1049. Fixes T4602. Moves all the funky field stuff to CustomField. Uses ApplicationTransactions to apply and record edits.
This makes "artifact" fields a little less nice (but still perfectly usable). With D8599, I think they're reasonable overall. We can improve this in the future.
All other field types are better (e.g., fixes weird bugs with "bool", fixes lots of weird behavior around required fields), and this gives us access to many new field types.
Test Plan:
Made a bunch of step edits. Here's an example:
{F133694}
Note that:
- "Required" fields work correctly.
- the transaction record is shown at the bottom of the page.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4602, T1049
Differential Revision: https://secure.phabricator.com/D8600
Summary:
Ref T2015. Several fixes:
- `checkForCancellation()` no longer exists, and isn't relevant for resumable stops. Throw it away for now.
- Fix an issue where a build could pass even if the final step failed.
- `phlog()` exceptions so they show up in `bin/harbormaster` and the daemon logs.
- Write an exception log if a step fails.
- Add a "throw an exception" step to debug this stuff more easily.
Test Plan:
- Grepped for `checkForCancellation()`.
- Ran a failing build where the final step caused the failure.
- Observed `phlog()` in `bin/harbormaster` output.
- Observed log in web UI:
{F101168}
Reviewers: btrahan, hach-que
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7935
Summary:
Ref T1049. Currently you can cancel a build, but now that we're tracking a lot more state we can stop, resume, and restart builds.
When the user issues a command against a build, I'm writing it into an auxiliary queue (`HarbormasterBuildCommand`) and then reading them out in the worker. This is mostly to avoid race messes where we try to `save()` the object in multiple places: basically, the BuildEngine is the //only// thing that writes to Build objects, and it holds a lock while it does it.
Test Plan:
- Created a plan which runs "sleep 2" a bunch of times in a row.
- Stopped, resumed, and restarted it.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, chad
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7892
Summary:
Ref T1049. Currently, the Harbormaster worker looks like this:
foreach (step) {
run_step(step);
}
This means steps can't ever be run in parallel. Instead, split it into two workers. The "Build" worker starts things off, and basically does:
update_build();
(We could theoretically do this in the original process because it should never take very long, but since there's a lock and it's a little bit complex it seemed cleaner to separate it.)
The "Target" worker runs an individual target (like a command, or an HTTP request, or whatever), then updates the build:
run_one_step(step);
update_build();
The new `update_build()` mechanism in `HarbormasterBuildEngine` does this, roughly:
figure_out_overall_status_of_all_steps();
if (build is done) { done(); }
if (build is fail) { fail(); }
foreach (step that is ready to run) {
queue_target_worker_for_step(step);
}
So, overall:
- The part of the code that updates Builds is completely separated from the part of the code that updates Targets.
- Targets can run in parallel.
Test Plan:
- Ran a bunch of builds via `bin/harbormaster build`.
- Ran a bunch of builds via web UI.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7890
Summary:
This adds LeaseHostBuildStepImplementation for getting leases on hosts in Drydock via Harbormaster. It stores the resulting lease in an artifact.
There is also a few bug fixes as well.
Test Plan: Created a build plan with a "Lease Host" build step. Ran the build plan and saw the build pass and the artifact in the database.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T1049, T4111
Differential Revision: https://secure.phabricator.com/D7706
Summary: This implements build targets as outlined in D7582. Build targets represent an instance of a build step particular to the build. Logs and artifacts have been adjusted to attach to build targets instead of build / build step pairs.
Test Plan: Ran builds and clicked around the interface. Everything seemed to work.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T4111, T1049
Differential Revision: https://secure.phabricator.com/D7703
Summary:
This uses an event listener to render the status of builds on their buildables. The revision and commit view now renders out the status of each of the builds.
Currently the revision controller has the results for the latest diff rendered out. We might want to show the status of previous diffs in the future, but for now I think the latest diff should do fine.
There's also a number of bug fixes in this diff, including a particularly nasty one where builds would have a build plan PHID generated for them, which resulted in handle lookups always returning invalid objects.
Test Plan: Ran builds against diffs and commits, saw them appear on the revision and commit view controllers.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7544
Summary:
Depends on D7519.
This implements support for build logs in Harbormaster. This includes support for appending to a log from the "Run Remote Command" build step.
It also adds the ability to cancel builds.
Currently the build view page doesn't update the logs live; I'm sure this can be achieved with Javelin, but I don't have enough experience with Javelin to actually make it poll from updates to content in the background.
{F79151}
{F79153}
{F79150}
{F79152}
Test Plan:
Tested this by setting up SSH on a Windows machine and using a Remote Command configured with:
```
C:\Windows\system32\cmd.exe /C cd C:\Build && mkdir Build_${timestamp} && cd Build_${timestamp} && git clone --recursive https://github.com/hach-que/Tychaia.git && cd Tychaia && Protobuild.exe && C:\Windows\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe Tychaia.Windows.sln
```
and observed the output of the build stream from the Windows machine into Phabricator.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7521
Summary: This implements an interface for adding new build steps, editing existing build steps and deleting build steps from build plans. It uses the settings definitions on the build implementation to work out what fields should be displayed on the edit page.
Test Plan:
See screenshots:
{F78529}
{F78532}
{F78528}
{F78531}
{F78527}
{F78530}
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7500
Summary:
Depends on D7498.
This implements support for a "build step implementation". Build steps have an associated class name (which makes the class in PHP) and a details field, which is serialized JSON (same as PhabricatorRepository).
This also implements a SleepBuildStepImplementation which just pauses the build for a specified period of seconds.
Test Plan:
Inserted a build step with `insert into harbormaster_buildstep (phid, buildPlanPHID, className, details, dateCreated, dateModified) values ('', 'PHID-HMCP-zkh5w6czfbfpk2gxwdeo', 'SleepBuildStepImplementation', '{"seconds":5}', NOW(), NOW());` (adjusting the build plan PHID as appropriate).
Started the daemon and applied the build plan to a buildable, and saw the daemon take a 5 second delay after creating `SleepBuildStepImplementation`.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran, chad
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7499
Summary: This implements a basic Harbormaster daemon that takes pending builds and builds them (currently just sleeps 15 seconds before moving to passed state). It also implements an interface to apply a build plan to a buildable, so that users can kick off builds for a buildable.
Test Plan: Ran `bin/phd debug PhabricatorHarbormasterBuildDaemon` and used the interface to start some builds by applying a build plan. Observed them move from 'pending' to 'building' to 'passed'.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T1049
Differential Revision: https://secure.phabricator.com/D7498
Summary: Ref T603. This swaps almost all queries against the repository table over to be policy aware.
Test Plan:
- Made an audit comment on a commit.
- Ran `save_lint.php`.
- Looked up a commit with `diffusion.getcommits`.
- Looked up lint messages with `diffusion.getlintmessages`.
- Clicked an external/submodule in Diffusion.
- Viewed main lint and repository lint in Diffusion.
- Completed and validated Owners paths in Owners.
- Executed dry runs via Herald.
- Queried for package owners with `owners.query`.
- Viewed Owners package.
- Edited Owners package.
- Viewed Owners package list.
- Executed `repository.query`.
- Viewed "Repository" tool repository list.
- Edited Arcanist project.
- Hit "Delete" on repository (this just tells you to use the CLI).
- Created a repository.
- Edited a repository.
- Ran `bin/repository list`.
- Ran `bin/search index rGTESTff45d13dffcfb3ea85b03aac8cc36251cacdf01c`
- Pushed and parsed a commit.
- Skipped all the Drydock stuff, as it it's hard to test and isn't normally reachable.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T603
Differential Revision: https://secure.phabricator.com/D7132
Summary:
This is very preliminary and doesn't actually do anything useful. In theory, it uses Drydock to check out a working copy and run tests. In practice, it's not actually capable of running any of our tests (because of complicated interdependency stuff), but does check out a working copy and //try// to run tests there.
Adds various sorts of utility methods to various things as well.
Test Plan: Ran `reparse.php --harbormaster --trace <commit>`, observed attempt to run tests via Drydock.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015, T1049
Differential Revision: https://secure.phabricator.com/D4215