Summary:
Ref T13658. This just scrubs some of the simple references from the codebase.
Most of what's left is in documentation which won't be relevant for a fork and/or which I need to separately revise (or more-or-less delete) at some point anyway.
I removed the "install RHEL" and "install Ubuntu" scripts outright since I don't have any reasonable way to test them and don't plan to maintain them.
Test Plan: Grepped for "phacility", "epriestley"; ran unit tests.
Reviewers: cspeckmim
Reviewed By: cspeckmim
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13658
Differential Revision: https://secure.phabricator.com/D21678
Summary:
Ref T13529. Now that instances can be renamed, an instance may have multiple valid SSH usernames and the preferred SSH username may not be the intenal instance name.
`PhacilitySiteSource` should already always set `diffusion.ssh-username` correctly, to the current preferred SSH username (which may be "new-name" after a rename from "old-name"), so we should never be able to reach this code without an accurate `diffusion.ssh-username` value available.
The code to resolve names into instances also already works for both "ssh old-name@..." and "ssh new-name@...".
So I believe this code has no beneficial effects and only causes harm: it may force us to return "old-name" when falling through would correctly return "new-name".
Test Plan:
- Previously: renamed an instance, then SSH'd to it using both the old and new names. Both work.
- Previously: verified that `diffusion.ssh-username` is set correctly after a rename.
- Verified that Diffusion "Clone" UI now shows "new-name" after an instance rename.
- The real question here is: does this break something I'm not thinking of? And the change probably has to go to production to answer that.
Maniphest Tasks: T13529
Differential Revision: https://secure.phabricator.com/D21259
Summary:
Ref T13216. When a repository is clustered, we run this cleanup code (to tell the repository to update, and log some timing information) on both nodes. Currently, we do slightly too much work, which is unnecessary and can be a bit confusing to human readers.
The double update message doesn't hurt anything, but there's no reason to write it twice.
Likewise, the second timing information update query doesn't do anything: there's no PushEvent object with the right identifier, so it just updates nothing. We don't need to run it, and it's confusing that we do.
Instead, only do these writes if we're actually the final node with the repository on it.
Test Plan: Added some logging, saw double writes/updates before the change and no doubles afterwards, with no other behavioral changes.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13216
Differential Revision: https://secure.phabricator.com/D19778
Summary:
The name of networks should be unique.
Also adds support for exact-name queries for AlamanacNetworks.
Test Plan: Applied migration with existing duplicates, saw networks renamed, attempted to add duplicates, got a nice error message.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D19379
Summary:
Ref T4292. Currently, we hold one big lock around the whole `bin/repository update` workflow.
When running multiple daemons on different hosts, this lock can end up being contentious. In particular, we'll hold it during `git fetch` on every host globally, even though it's only useful to hold it locally per-device (that is, it's fine/good/expected if `repo001` and `repo002` happen to be fetching from a repository they are observing at the same time).
Instead, split it into two locks:
- One lock is scoped to the current device, and held during pull (usually `git fetch`). This just keeps multiple daemons accidentally running on the same host from making a mess when trying to initialize or update a working copy.
- One lock is scoped globally, and held during discovery. This makes sure daemons on different hosts don't step on each other when updating the database.
If we fail to acquire either lock, assume some other process is legitimately doing the work and bail more quietly instead of fataling. In approximately 100% of cases where users have hit this lock contention, that was the case: some other daemon was running somewhere doing the work and the error didn't actually represent an issue.
If there's an actual problem, we still raise a diagnostically useful message if you run `bin/repository update` manually, so there are still tools to figure out that something is hung or whatever.
Test Plan:
- Ran `bin/repository update`, `pull`, `discover`.
- Added `sleep(5)`, forced processes to contend, got lock exceptions and graceful exit with diagnostic message.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15903
Summary:
Ref T4292. This consolidates code for figuring out which user we should connect to hosts with.
Also narrows a lock window.
Test Plan: Browsed Diffusion, pulled and pushed through an SSH proxy.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15754
Summary:
Ref T10756. When repositories are properly configured for the cluster (which is hard to set up today), be smart about which repositories are expected to exist on the current host, and only pull them.
This generally allows daemons to pretty much do the right thing no matter how many copies are running, although there may still be some lock contention issues that need to be sorted out.
Test Plan: {F1214483}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10756
Differential Revision: https://secure.phabricator.com/D15682
Summary:
Ref T6741. Ref T10246.
Root problem: to provide Drydock in the cluster, we need to expose Almanac, and doing so would let users accidentally or intentionally create a bunch of `repo006.phacility.net` devices/services which could conflict with the real ones we manage.
There's currently no way to say "you can't create anything named `*.blah.net`". This adds "namespaces", which let you do that (well, not yet, but they will after the next diff).
After the next diff, if you try to create `repo003.phacility.net`, but the namespace `phacility.net` already exists and you don't have permission to edit it, you'll be asked to choose a different name.
Also various modernizations and some new docs.
Test Plan:
- Created cool namespaces like `this.computer`.
- Almanac namespaces don't actually enforce policies yet.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T6741, T10246
Differential Revision: https://secure.phabricator.com/D15324
Summary:
Ref T5833. In some cases, we need to know if an Almanac device is the localhost or not, so we can either handle or forward the request.
To accomplish this, write a device ID when running `bin/almanac register`.
Using `--allow-key-reuse` and `--identify-as`, multiple devices are permitted to //authenticate// as one device but //identify// as different devices. In the Phacility cluster, this allows all the `repoXXX` machines to have one keypair (making key management much easier) but still work as separate devices. This is an advanced feature; normal installs with 1-3 hosts would just generate a key + device per host and identify/authenticate as the same device.
Test Plan: Ran commands with lots of flags like `PHACILITY_INSTANCE=local sudo -E ./bin/almanac register --device daemon.phacility.net --private-key ~/dev/core/conf/keys/daemon.key --force --allow-key-reuse --identify-as local001.phacility.net`. Got a good result from `AlmanacKeys::getDeviceID()` afterward.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5833
Differential Revision: https://secure.phabricator.com/D11452
Summary:
Ref T2783. This is basically a more refined version of D10400, which churned a bit on things like SSH key storage, the actual way the signing protocol shook out, etc.
- When Phabricator tries to make an intra-cluster service call as the omnipotent user, sign it with the host's device key.
- Add `bin/almanac register` to say "this host is X device, identified by private key Y". This stores the keypair locally, adds the public key to Almanac, and trusts it.
Net effect is that once a host has been registered, the daemons can make calls to other nodes as the omnipotent user. This is primarily necessary so they can access repository API methods on remote hosts.
Test Plan:
- Ran `bin/almanac register` with various valid and invalid inputs.
- Verified keys get generated/added/stored properly.
- Made a device-signed cluster Conduit call.
- Made a normal old user-signed cluster Conduit call.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T2783
Differential Revision: https://secure.phabricator.com/D11158
Summary:
Ref T6240. Some discussion in that task. In instance/cluster environments, daemons need to make Conduit calls that bypass policy checks.
We can't just let anyone add SSH keys with this capability to the web directly, because then an adminstrator could just add a key they own and start signing requests with it, bypassing policy checks.
Add a `bin/almanac trust-key --id <x>` workflow for trusting keys. Only trusted keys can sign requests.
Test Plan:
- Generated a user key.
- Generated a device key.
- Trusted a device key.
- Untrusted a device key.
- Hit the various errors on trust/untrust.
- Tried to edit a trusted key.
{F236010}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T6240
Differential Revision: https://secure.phabricator.com/D10878
Summary: Ref T5833. An interface is an IP (maybe v4, maybe v6) and port on a specified network (public internet, VPN, NAT block, etc).
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5833
Differential Revision: https://secure.phabricator.com/D10718
Summary: Ref T5833. The "uninteresting" part of this object is virtually identical to AlmanacService.
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5833
Differential Revision: https://secure.phabricator.com/D10714
Summary:
Ref T4209. This creates storage for public keys against authorized hosts, such that servers can be authorized to make Conduit calls as the omnipotent user.
Servers are registered into this system by running the following command once:
```
bin/almanac register
```
NOTE: This doesn't implement authorization between servers, just the storage of public keys.
Placing this against Almanac seemed like the most sensible place, since I'm imagining in future that the `register` command will accept more information (like the hostname of the server so it can be found in the service directory).
Test Plan: Ran `bin/almanac register` and saw the host (and public key information) appear in the database.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Maniphest Tasks: T4209
Differential Revision: https://secure.phabricator.com/D10400