Summary:
Ref T2230. When fully set up, we have up to three users who all need to write into the repositories:
- The webserver needs to write for HTTP receives.
- The SSH user needs to write for SSH receives.
- The daemons need to write for "git fetch", "git clone", etc.
These three users don't need to be different, but in practice they are often not likely to all be the same user. If for no other reason, making them all the same user requires you to "git clone httpd@host.com", and installs are likely to prefer "git clone git@host.com".
Using three different users also allows better privilege separation. Particularly, the daemon user can be the //only// user with write access to the repositories. The webserver and SSH user can accomplish their writes through `sudo`, with a whitelisted set of commands. This means that even if you compromise the `ssh` user, you need to find a way to escallate from there to the daemon user in order to, e.g., write arbitrary stuff into the repository or bypass commit hooks.
This lays some of the groundwork for a highly-separated configuration where the SSH and HTTP users have the fewest privileges possible and use `sudo` to interact with repositories. Some future work which might make sense:
- Make `bin/phd` respect this (require start as the right user, or as root and drop privileges, if this configuration is set).
- Execute all `git/hg/svn` commands via sudo?
Users aren't expected to configure this yet so I haven't written any documentation.
Test Plan:
Added an SSH user ("dweller") and gave it sudo by adding this to `/etc/sudoers`:
dweller ALL=(epriestley) SETENV: NOPASSWD: /usr/bin/git-upload-pack, /usr/bin/git-receive-pack
Then I ran git pushes and pulls over SSH via "dweller@localhost". They successfully interacted with the repository on disk as the "epriestley" user.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7589
Summary:
Ref T2230. Very rarely, even though we've flushed the connection and sent all the data, we'll close the connection before Git is happy with it and it will flip out with an error like this:
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
This is hard to reproduce because it depends on the order of read/write operations we can't directly control. I only saw it about 2% of the time, by just running `git pull` over and over again.
Waiting for Git to close its side of the connection seems to fix it.
Test Plan: Ran `git clone` a ton of times without seeing the error again. Ran `git push` a ton of times with new commits.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7558
Summary:
Ref T2230. The SVN protocol has a sensible protocol format with a good spec here:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_ra_svn/protocol
Particularly, compare this statement to the clown show that is the Mercurial wire protocol:
> It is possible to parse an item without knowing its type in advance.
WHAT A REASONABLE STATEMENT TO BE ABLE TO MAKE ABOUT A WIRE PROTOCOL
Although it makes substantially more sense than Mercurial, it's much heavier-weight than the Git or Mercurial protocols, since it isn't distributed.
It's also not possible to figure out if a request is a write request (or even which repository it is against) without proxying some of the protocol frames. Finally, several protocol commands embed repository URLs, and we need to reach into the protocol and translate them.
Test Plan: Ran various SVN commands over SSH (`svn log`, `svn up`, `svn commit`, etc).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7556
Summary:
Ref T2230. In Git, we can determine if a command is read-only or read/write from the command itself, but this isn't the case in Mercurial or SVN.
For Mercurial and SVN, we need to proxy the protocol that's coming over the wire, look at each request from the client, and then check if it's a read or a write. To support this, provide a more flexible version of `passthruIO`.
The way this will work is:
- The SSH IO channel is wrapped in a `ProtocolChannel` which can parse the the incoming stream into message objects.
- The `willWriteCallback` will look at those messages and determine if they're reads or writes.
- If they're writes, it will check for write permission.
- If we're good to go, the message object is converted back into a byte stream and handed to the underlying command.
Test Plan: Executed `git clone`, `git clone --depth 3`, `git push` (against no-write repo, got error), `git push` (against valid repo).
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, asherkin, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7551
Summary:
Ref T2350. Fixes T2231.
- Adds log flags around discovery.
- Adds message flags for "needs update". This is basically an out-of-band hint to the daemons that a repository should be pulled sooner than normal. We set the flag when users push a revision, and expose a Conduit method that `arc land` will be able to use.
Test Plan: See screenshots.
Reviewers: btrahan, chad
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2350, T2231
Differential Revision: https://secure.phabricator.com/D7467
Summary: Looks like this is pretty straightforward; same as the reads except mark it as needing PUSH.
Test Plan: Ran `git push`, pushed over SSH to a hosted repo.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7425