Summary:
Depends on D19247. Ref T13109. When we receive an SSH request, generate a random unique ID for the request. Then thread it down through the process tree.
The immediate goal is to let the `ssh-exec` process coordinate with `commit-hook` process and log information about read and write lock wait times. Today, there's no way for `ssh-exec` to interact with the `PushEvent`, but this is the most helpful place to store this data for users.
Test Plan: Made pushes, saw the `PushEvent` table populate with a random request ID. Exported data and saw the ID preserved in the export.
Maniphest Tasks: T13109
Differential Revision: https://secure.phabricator.com/D19249
Summary:
See PHI305. Ref T13046.
The SSH workflows currently extend `PhabricatorManagementWorkflow` to benefit from sharing all the standard argument parsing code. Sharing the parsing code is good, but it also means they inherit a `getViewer()` method which returns the ommnipotent viewer.
This is appropriate for everything else which extends `ManagementWorkflow` (like `bin/storage`, `bin/auth`, etc.) but not appropriate for SSH workflows, which have a real user.
This caused a bug with the pull logs where `pullerPHID` was not recorded properly. We used `$this->getViewer()->getPHID()` but the correct code was `$this->getUser()->getPHID()`.
To harden this against future mistakes:
- Don't extend `ManagementWorkflow`. Extend `PhutilArgumentWorkflow` instead. We **only** want the argument parsing code.
- Rename `get/setUser()` to `get/setSSHUser()` to make them explicit.
Then, fix the pull log bug by calling `getSSHUser()` instead of `getViewer()`.
Test Plan:
- Pulled and pushed to a repository over SSH.
- Grepped all the SSH stuff for the altered symbols.
- Saw pulls record a valid `pullerPHID` in the pull log.
- Used `echo {} | ssh ... conduit conduit.ping` to test conduit over SSH.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13046
Differential Revision: https://secure.phabricator.com/D18912
Summary:
Ref T11939. IPv4 addresses can normally only be written in one way, but IPv6 addresses have several formats.
For example, the addresses "FFF::", "FfF::", "fff::", "0ffF::", "0fFf:0::", and "0FfF:0:0:0:0:0:0:0" are all the same address.
Normalize all addresses before writing them to logs, etc, so we store the most-preferred form ("fff::", above).
Test Plan:
Ran an SSH clone over IPv6:
```
$ git fetch ssh://local@::1/diffusion/26/locktopia.git
```
It worked; verified that address read out of `SSH_CLIENT` sensibly.
Faked my remote address as a non-preferred-form IPv6 address using `preamble.php`.
Failed to login, verified that the preferred-form version of the address appeared in the user activity log.
Made IPv6 requests over HTTP:
```
$ curl -H "Host: local.phacility.com" "http://[::1]/"
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11939
Differential Revision: https://secure.phabricator.com/D16987
Summary: Ref T11766. When users run `git pull` or similar, log the operation in the pull log.
Test Plan: Performed SSH pulls, got a log in the database. Today, this event log is purely diagnostic and has no UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11766
Differential Revision: https://secure.phabricator.com/D16738
Summary: Fixes T7034. Like HTTP, proxy requests to the correct host if a repository has an Almanac service host.
Test Plan: Ran VCS requests through the proxy.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7034
Differential Revision: https://secure.phabricator.com/D11543
Summary:
Ref T7034.
In a cluster environment, when a user connects with a VCS request over SSH (like `git pull`), the receiving server may need to proxy it to a server which can actually satisfy the request.
In order to proxy the request, we need to know which repository the user is interested in accessing.
Split the SSH workflow into two steps:
# First, identify the repository.
# Then, execute the operation.
In the future, this will allow us to put a possible "proxy the whole thing somewhere else" step in the middle, mirroring the behavior of Conduit.
This is trivially easy in `git` and `hg`. Both identify the repository on the commmand line.
This is fiendishly complex in `svn`, for the same reasons that hosting SVN was hard in the first place. Specifically:
- The client doesn't tell us what it's after.
- To get it to tell us, we have to send it a server capabilities string //first//.
- We can't just start an `svnserve` process and read the repository out after a little while, because we may need to proxy the request once we figure out the repository.
- We can't consume the client protocol frame that tells us what the client wants, because when we start the real server request it won't know what the client is after if it never receives that frame.
- On the other hand, we must consume the second copy of the server protocol frame that would be sent to the client, or they'll get two "HELLO" messages and not know what to do.
The approach here is straightforward, but the implementation is not trivial. Roughly:
- Start `svnserve`, read the "hello" frame from it.
- Kill `svnserve`.
- Send the "hello" to the client.
- Wait for the client to send us "I want repository X".
- Save the message it sent us in the "peekBuffer".
- Return "this is a request for repository X", so we can proxy it.
Then, to continue the request:
- Start the real `svnserve`.
- Read the "hello" frame from it and throw it away.
- Write the data in the "peekBuffer" to it, as though we'd just received it from the client.
- State of the world is normal again, so we can continue.
Also fixed some other issues:
- SVN could choke if `repository.default-local-path` contained extra slashes.
- PHP might emit some complaints when executing the commit hook; silence those.
Test Plan: Pushed and pulled repositories in SVN, Mercurial and Git.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7034
Differential Revision: https://secure.phabricator.com/D11541
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.
Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.
Test Plan: Lint; ran some of the scripts.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7838
Summary: Fixes T4241. Ref T4206. See T4241 for a description here. Generally, when we connect a fat pipe (`git-upload-pack`) to a narrow one (`git` over SSH) we currently read limitless data into memory. Instead, throttle reads until writes catch up. This is now possible because of the previous changes in this sequence.
Test Plan:
- Ran `git clone` and `git push` on the entire Wine repository.
- Observed CPU and memory usage.
- Memory usage was constant and low, CPU usage was high only during I/O (which is expected, since we have to actually do work, although thre might be room to further reduce this).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4241, T4206
Differential Revision: https://secure.phabricator.com/D7776
Summary: Ref T4189. Updates the Phabricator stuff to use the new, more sensible semantics from D7769. Basically, this works correctly now and doesn't need workarounds.
Test Plan: Pushed Wine repo in 1m13s.
Reviewers: btrahan, zeeg
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4189
Differential Revision: https://secure.phabricator.com/D7770
Summary:
Ref T2230. The SVN protocol has a sensible protocol format with a good spec here:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_ra_svn/protocol
Particularly, compare this statement to the clown show that is the Mercurial wire protocol:
> It is possible to parse an item without knowing its type in advance.
WHAT A REASONABLE STATEMENT TO BE ABLE TO MAKE ABOUT A WIRE PROTOCOL
Although it makes substantially more sense than Mercurial, it's much heavier-weight than the Git or Mercurial protocols, since it isn't distributed.
It's also not possible to figure out if a request is a write request (or even which repository it is against) without proxying some of the protocol frames. Finally, several protocol commands embed repository URLs, and we need to reach into the protocol and translate them.
Test Plan: Ran various SVN commands over SSH (`svn log`, `svn up`, `svn commit`, etc).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7556
Summary:
Ref T2230. This is substantially more complicated than Git, but mostly because Mercurial's protocol is a like 50 ad-hoc extensions cobbled together. Because we must decode protocol frames in order to determine if a request is read or write, 90% of this is implementing a stream parser for the protocol.
Mercurial's own parser is simpler, but relies on blocking reads. Since we don't even have methods for blocking reads right now and keeping the whole thing non-blocking is conceptually better, I made the parser nonblocking. It ends up being a lot of stuff. I made an effort to cover it reasonably well with unit tests, and to make sure we fail closed (i.e., reject requests) if there are any parts of the protocol I got wrong.
A lot of the complexity is sharable with the HTTP stuff, so it ends up being not-so-bad, just very hard to verify by inspection as clearly correct.
Test Plan:
- Ran `hg clone` over SSH.
- Ran `hg fetch` over SSH.
- Ran `hg push` over SSH, to a read-only repo (error) and a read-write repo (success).
Reviewers: btrahan, asherkin
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7553
Summary:
Ref T2230. In Git, we can determine if a command is read-only or read/write from the command itself, but this isn't the case in Mercurial or SVN.
For Mercurial and SVN, we need to proxy the protocol that's coming over the wire, look at each request from the client, and then check if it's a read or a write. To support this, provide a more flexible version of `passthruIO`.
The way this will work is:
- The SSH IO channel is wrapped in a `ProtocolChannel` which can parse the the incoming stream into message objects.
- The `willWriteCallback` will look at those messages and determine if they're reads or writes.
- If they're writes, it will check for write permission.
- If we're good to go, the message object is converted back into a byte stream and handed to the underlying command.
Test Plan: Executed `git clone`, `git clone --depth 3`, `git push` (against no-write repo, got error), `git push` (against valid repo).
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, asherkin, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7551
Summary: Like D7423, but for SSH.
Test Plan: Ran `git clone ssh://...`, got a clone.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7424
Summary:
- Add web UI for configuring SSH hosting.
- Route git reads (`git-upload-pack` over SSH).
Test Plan:
>>> orbital ~ $ git clone ssh://127.0.0.1/
Cloning into '127.0.0.1'...
Exception: Unrecognized repository path "/". Expected a path like "/diffusion/X/".
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/X/
Cloning into 'X'...
Exception: No repository "X" exists!
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/MT/
Cloning into 'MT'...
Exception: This repository is not available over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/P/
Cloning into 'P'...
Exception: TODO: Implement serve over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7421
Summary:
- Build "sshd-auth" (for authentication) and "sshd-exec" (for command execution) binaries. These are callable by "sshd-vcs", located [[https://github.com/epriestley/sshd-vcs | in my account on GitHub]]. They are based on precursors [[https://github.com/epriestley/sshd-vcs-glue | here on GitHub]] which I deployed for TenXer about a year ago, so I have some confidence they at least basically work.
- The problem this solves is that normally every user would need an account on a machine to connect to it, and/or their public keys would all need to be listed in `~/.authorized_keys`. This is a big pain in most installs. Software like Gitosis/Gitolite solve this problem by giving you an easy way to add public keys to `~/.authorized_keys`, but this is pretty gross.
- Roughly, instead of looking in `~/.authorized_keys` when a user connects, the patched sshd instead runs `echo <public key> | sshd-auth`. The `sshd-auth` script looks up the public key and authorizes the matching user, if they exist. It also forces sshd to run `sshd-exec` instead of a normal shell.
- `sshd-exec` receives the authenticated user and any command which was passed to ssh (like `git receive-pack`) and can route them appropriately.
- Overall, this permits a single account to be set up on a server which all Phabricator users can connect to without any extra work, and which can safely execute commands and apply appropriate permissions, and disable users when they are disabled in Phabricator and all that stuff.
- Build out "sshd-exec" to do more thorough checks and setup, and delegate command execution to Workflows (they now exist, and did not when I originally built this stuff).
- Convert @btrahan's conduit API script into a workflow and slightly simplify it (ConduitCall did not exist at the time it was written).
The next steps here on the Repository side are to implement Workflows for Git, SVN and HG wire protocols. These will mostly just proxy the protocols, but also need to enforce permissions. So the approach will basically be:
- Implement workflows for stuff like `git receive-pack`.
- These workflows will implement enough of the underlying protocol to determine what resource the user is trying to access, and whether they want to read or write it.
- They'll then do a permissons check, and kick the user out if they don't have permission to do whatever they are trying to do.
- If the user does have permission, we just proxy the rest of the transaction.
Next steps on the Conduit side are more simple:
- Make ConduitClient understand "ssh://" URLs.
Test Plan: Ran `sshd-exec --phabricator-ssh-user epriestley conduit differential.query`, etc. This will get a more comprehensive test once I set up sshd-vcs.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T603, T550
Differential Revision: https://secure.phabricator.com/D4229