Summary:
Fixes T13192. See PHI1015. When you deactivate a repository, we currently stop serving it.
This creates a problem for intracluster sync, since new nodes can't sync it. If nothing else, this means that if you "ship of theseus" your cluster and turn nodes over one at a time, you will eventually lose the entire repository. Since that's clearly a bad outcome, support sync.
Test Plan:
Testing this requires a "real" cluster, so I mostly used `secure`.
I deactivated rGITTEST and ran this on `secure002`:
```
./bin/repository thaw --demote secure002.phacility.net --force GITTEST && ./bin/repository update GITTEST
```
Before the patch, this failed:
```
[2019-01-31 19:40:37] EXCEPTION: (CommandException) Command failed with error #128!
COMMAND
git fetch --prune -- 'ssh://172.30.0.64:22/diffusion/GITTEST/' '+refs/*:refs/*'
STDOUT
(empty)
STDERR
Warning: Permanently added '172.30.0.64' (RSA) to the list of known hosts.
phabricator-ssh-exec: This repository ("rGITTEST") is not available over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
```
After applying (a similar patch to) this patch to `secure001`, the sync worked.
I'll repeat this test with the actual patch once this deploys to `secure`.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13192
Differential Revision: https://secure.phabricator.com/D20077
Summary:
Depends on D19356. Fixes T10883. Ref T13120.
- Add a "writable" property to the bindings, defaulting to "true" with a nice dropdown.
- When selecting hosts, allow callers to request a writable host.
- If the caller wants a writable host, only return hosts if they're writable.
- In SVN and Mercurial, we sometimes return only writable hosts when we //could// return read-only hosts, but figuring out if these request are read-only or read-write is currently tricky. Since these repositories can't really cluster yet, this shouldn't matter too much today.
Test Plan:
- Without any config changes, viewed repositories via web UI and pushed/pulled via SSH and HTTP.
- Made all nodes in the cluster read-only by disabling "writable", pulled and hit the web UI (worked), tried to push via SSH and HTTP (got errors about read-only).
- Put everything back, pulled and pushed.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13120, T10883
Differential Revision: https://secure.phabricator.com/D19357
Summary:
Depends on D19355. Ref T10883. Ref T13120. Rather than adding a million parameters here, wrap the selector-parameters in an `$options`.
The next change adds a new "writable" option to support forcing selection of writable hosts.
Test Plan: Pulled and pushed via HTTP and SSH, viewed repositories via Diffusion.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13120, T10883
Differential Revision: https://secure.phabricator.com/D19356
Summary:
Depends on D19247. Ref T13109. When we receive an SSH request, generate a random unique ID for the request. Then thread it down through the process tree.
The immediate goal is to let the `ssh-exec` process coordinate with `commit-hook` process and log information about read and write lock wait times. Today, there's no way for `ssh-exec` to interact with the `PushEvent`, but this is the most helpful place to store this data for users.
Test Plan: Made pushes, saw the `PushEvent` table populate with a random request ID. Exported data and saw the ID preserved in the export.
Maniphest Tasks: T13109
Differential Revision: https://secure.phabricator.com/D19249
Summary:
Depends on D18915. Ref T13046.
- Distinguish between HTTP and HTTPS.
- Use more constants and fewer magical strings.
- For HTTP responses, give them better type information and more helpful UI behaviors.
Test Plan: Pulled over SSH and HTTP. Reviewed resulting logs from the web UI. Hit errors like missing/invalid credentials.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13046
Differential Revision: https://secure.phabricator.com/D18917
Summary:
See PHI305. Ref T13046.
The SSH workflows currently extend `PhabricatorManagementWorkflow` to benefit from sharing all the standard argument parsing code. Sharing the parsing code is good, but it also means they inherit a `getViewer()` method which returns the ommnipotent viewer.
This is appropriate for everything else which extends `ManagementWorkflow` (like `bin/storage`, `bin/auth`, etc.) but not appropriate for SSH workflows, which have a real user.
This caused a bug with the pull logs where `pullerPHID` was not recorded properly. We used `$this->getViewer()->getPHID()` but the correct code was `$this->getUser()->getPHID()`.
To harden this against future mistakes:
- Don't extend `ManagementWorkflow`. Extend `PhutilArgumentWorkflow` instead. We **only** want the argument parsing code.
- Rename `get/setUser()` to `get/setSSHUser()` to make them explicit.
Then, fix the pull log bug by calling `getSSHUser()` instead of `getViewer()`.
Test Plan:
- Pulled and pushed to a repository over SSH.
- Grepped all the SSH stuff for the altered symbols.
- Saw pulls record a valid `pullerPHID` in the pull log.
- Used `echo {} | ssh ... conduit conduit.ping` to test conduit over SSH.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13046
Differential Revision: https://secure.phabricator.com/D18912
Summary:
Fixes T11938.
Note that there's a subcase here: if you `hg clone` or `svn checkout` a short `/source/` URI that ends in `.git`, we miss the lookup and don't get this far, so you still get a generic error message.
Hopefully it is clear enough on its own that `proto://.../blah.git` is, in fact, a Git repository, since it says ".git" at the end.
If that doesn't prove to be true, we can be more surgical about this.
Test Plan:
```
$ git clone ssh://local@localvault.phacility.com/source/quack.notgit/
Cloning into 'quack.notgit'...
phabricator-ssh-exec: This repository ("quack.notgit") is not a Git repository. Use "hg" to interact with this repository.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
```
```
$ hg clone ssh://local@localvault.phacility.com/source/phabx
remote: phabricator-ssh-exec: This repository ("phabx") is not a Mercurial repository. Use "git" to interact with this repository.
abort: no suitable response from remote hg!
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11938
Differential Revision: https://secure.phabricator.com/D16976
Summary: Fixes T4245. When a repository has a short name, use `/source/shortname/` as its primary URI.
Test Plan:
- Cloned Git repositories from shortnames via HTTP and SSH.
- Cloned Mercurial repositories from shortnames via HTTP and SSH.
- Cloned Subversion repositories from shortnames via SSH.
- Browsed Git, Mercurial and Subversion repositories.
- Added and removed short names to various repositories.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4245
Differential Revision: https://secure.phabricator.com/D16851
Summary: Ref T11766. When users run `git pull` or similar, log the operation in the pull log.
Test Plan: Performed SSH pulls, got a log in the database. Today, this event log is purely diagnostic and has no UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11766
Differential Revision: https://secure.phabricator.com/D16738
Summary:
Ref T10748. This needs more extensive testing and is sure to have some rough edges, but seems to basically work so far.
Throwing this up so I can work through it more deliberately and make notes.
Test Plan:
- Ran migration.
- Used `bin/repository list` to list existing repositories.
- Used `bin/repository update <repository>` to update various repositories.
- Updated a migrated, hosted Git repository.
- Updated a migrated, observed Git repository.
- Converted an observed repository into a hosted repository by toggling the I/O mode of the URI.
- Conveted a hosted repository into an observed repository by toggling it back.
- Created and activated a new empty hosted Git repository.
- Created and activated an observed Git repository.
- Updated a mirrored repository.
- Cloned and pushed over HTTP.
- Tried to HTTP push a read-only repository.
- Cloned and pushed over SSH.
- Tried to SSH push a read-only repository.
- Updated several Mercurial repositories.
- Updated several Subversion repositories.
- Created and edited repositories via the API.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15842
Summary:
Ref T10860. This allows us to recover if the connection to the database is lost during a push.
If we lose the connection to the master database during a push, we would previously freeze the repository. This is very safe, but not very operator-friendly since you have to go manually unfreeze it.
We don't need to be quite this aggressive about freezing things. The repository state is still consistent after we've "upgraded" the lock by setting `isWriting = 1`, so we're actually fine even if we lost the global lock.
Instead of just freezing the repository immediately, sit there in a loop waiting for the master to come back up for a few minutes. If it recovers, we can release the lock and everything will be OK again.
Basically, the changes are:
- If we can't release the lock at first, sit in a loop trying really hard to release it for a while.
- Add a unique lock identifier so we can be certain we're only releasing //our// lock no matter what else is going on.
- Do the version reads on the same connection holding the lock, so we can be sure we haven't lost the lock before we do that read.
Test Plan:
- Added a `sleep(10)` after accepting the write but before releasing the lock so I could run `mysqld stop` and force this issue to occur.
- Pushed like this:
```
$ echo D >> record && git commit -am D && git push
[master 707ecc3] D
1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 254 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
BEGIN SLEEP
```
- Here, I stopped `mysqld` from the CLI in another terminal window.
```
END SLEEP
# CRITICAL. Failed to release cluster write lock!
# The connection to the master database was lost while receiving the write.
# This process will spend 300 more second(s) attempting to recover, then give up.
```
- Here, I started `mysqld` again.
```
# RECOVERED. Link to master database was restored.
# Released cluster write lock.
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
2cbf87c..707ecc3 master -> master
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15792
Summary:
Ref T4292. Before we write or read a hosted, clustered Git repository over SSH, check if another version of the repository exists on another node that is more up-to-date.
If such a version does exist, fetch that version first. This allows reads and writes of any node to always act on the most up-to-date code.
Test Plan: Faked my way through this and got a fetch via `bin/repository update`; this is difficult to test locally and needs more work before we can put it in production.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15757
Summary:
Ref T4292. This consolidates code for figuring out which user we should connect to hosts with.
Also narrows a lock window.
Test Plan: Browsed Diffusion, pulled and pushed through an SSH proxy.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15754
Summary: Ref T4245. Consolidates the URI parsing/rewriting logic so that repositories can be served from either `/diffusion/XYZ/` or `/diffusion/123/`, over both HTTP and SSH.
Test Plan:
- Pulled a Git repository by ID and callsign over HTTP and SSH.
- Pulled a Mercurial repository by ID and callsign over HTTP and SSH.
- Pulled a Subversion repository by ID and callsign over SSH (no HTTP support for SVN).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4245
Differential Revision: https://secure.phabricator.com/D15302
Summary: Fixes T7894.
Test Plan:
```
$ git pull
phabricator-ssh-exec: You do not have permission to access the Diffusion application, so you can not interact with repositories over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
```
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T7894
Differential Revision: https://secure.phabricator.com/D12555
Summary: Fixes T7034. Like HTTP, proxy requests to the correct host if a repository has an Almanac service host.
Test Plan: Ran VCS requests through the proxy.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7034
Differential Revision: https://secure.phabricator.com/D11543
Summary:
Ref T7034.
In a cluster environment, when a user connects with a VCS request over SSH (like `git pull`), the receiving server may need to proxy it to a server which can actually satisfy the request.
In order to proxy the request, we need to know which repository the user is interested in accessing.
Split the SSH workflow into two steps:
# First, identify the repository.
# Then, execute the operation.
In the future, this will allow us to put a possible "proxy the whole thing somewhere else" step in the middle, mirroring the behavior of Conduit.
This is trivially easy in `git` and `hg`. Both identify the repository on the commmand line.
This is fiendishly complex in `svn`, for the same reasons that hosting SVN was hard in the first place. Specifically:
- The client doesn't tell us what it's after.
- To get it to tell us, we have to send it a server capabilities string //first//.
- We can't just start an `svnserve` process and read the repository out after a little while, because we may need to proxy the request once we figure out the repository.
- We can't consume the client protocol frame that tells us what the client wants, because when we start the real server request it won't know what the client is after if it never receives that frame.
- On the other hand, we must consume the second copy of the server protocol frame that would be sent to the client, or they'll get two "HELLO" messages and not know what to do.
The approach here is straightforward, but the implementation is not trivial. Roughly:
- Start `svnserve`, read the "hello" frame from it.
- Kill `svnserve`.
- Send the "hello" to the client.
- Wait for the client to send us "I want repository X".
- Save the message it sent us in the "peekBuffer".
- Return "this is a request for repository X", so we can proxy it.
Then, to continue the request:
- Start the real `svnserve`.
- Read the "hello" frame from it and throw it away.
- Write the data in the "peekBuffer" to it, as though we'd just received it from the client.
- State of the world is normal again, so we can continue.
Also fixed some other issues:
- SVN could choke if `repository.default-local-path` contained extra slashes.
- PHP might emit some complaints when executing the commit hook; silence those.
Test Plan: Pushed and pulled repositories in SVN, Mercurial and Git.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7034
Differential Revision: https://secure.phabricator.com/D11541
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary:
Via HackerOne. In regular expressions, "$" matches "end of input, or before terminating newline". This means that the expression `/^A$/` matches two strings: `"A"`, and `"A\n"`.
When we care about this, use `\z` instead, which matches "end of input" only.
This allowed registration of `"username\n"` and similar.
Test Plan:
- Grepped codebase for all calls to `preg_match()` / `preg_match_all()`.
- Fixed the ones where this seemed like it could have an impact.
- Added and executed unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: aran, epriestley
Differential Revision: https://secure.phabricator.com/D8516
Summary: Ref T4195. Stores remote address and protocol in the logs, where possible.
Test Plan: Pushed some stuff, looked at the log, saw data.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7711
Summary:
Ref T4189. T4189 describes most of the intent here:
- When updating hosted repositories, sync a pre-commit hook into them instead of doing a `git fetch`.
- The hook calls into Phabricator. The acting Phabricator user is sent via PHABRICATOR_USER in the environment. The active repository is sent via CLI.
- The hook doesn't do anything useful yet; it just veifies basic parameters, does a little parsing, and exits 0 to allow the commit.
Test Plan:
- Performed Git pushes and pulls over SSH and HTTP.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4189
Differential Revision: https://secure.phabricator.com/D7682
Summary:
Ref T2230. When fully set up, we have up to three users who all need to write into the repositories:
- The webserver needs to write for HTTP receives.
- The SSH user needs to write for SSH receives.
- The daemons need to write for "git fetch", "git clone", etc.
These three users don't need to be different, but in practice they are often not likely to all be the same user. If for no other reason, making them all the same user requires you to "git clone httpd@host.com", and installs are likely to prefer "git clone git@host.com".
Using three different users also allows better privilege separation. Particularly, the daemon user can be the //only// user with write access to the repositories. The webserver and SSH user can accomplish their writes through `sudo`, with a whitelisted set of commands. This means that even if you compromise the `ssh` user, you need to find a way to escallate from there to the daemon user in order to, e.g., write arbitrary stuff into the repository or bypass commit hooks.
This lays some of the groundwork for a highly-separated configuration where the SSH and HTTP users have the fewest privileges possible and use `sudo` to interact with repositories. Some future work which might make sense:
- Make `bin/phd` respect this (require start as the right user, or as root and drop privileges, if this configuration is set).
- Execute all `git/hg/svn` commands via sudo?
Users aren't expected to configure this yet so I haven't written any documentation.
Test Plan:
Added an SSH user ("dweller") and gave it sudo by adding this to `/etc/sudoers`:
dweller ALL=(epriestley) SETENV: NOPASSWD: /usr/bin/git-upload-pack, /usr/bin/git-receive-pack
Then I ran git pushes and pulls over SSH via "dweller@localhost". They successfully interacted with the repository on disk as the "epriestley" user.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7589
Summary:
Ref T2230. The SVN protocol has a sensible protocol format with a good spec here:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_ra_svn/protocol
Particularly, compare this statement to the clown show that is the Mercurial wire protocol:
> It is possible to parse an item without knowing its type in advance.
WHAT A REASONABLE STATEMENT TO BE ABLE TO MAKE ABOUT A WIRE PROTOCOL
Although it makes substantially more sense than Mercurial, it's much heavier-weight than the Git or Mercurial protocols, since it isn't distributed.
It's also not possible to figure out if a request is a write request (or even which repository it is against) without proxying some of the protocol frames. Finally, several protocol commands embed repository URLs, and we need to reach into the protocol and translate them.
Test Plan: Ran various SVN commands over SSH (`svn log`, `svn up`, `svn commit`, etc).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7556
Summary:
Ref T2230. In Git, we can determine if a command is read-only or read/write from the command itself, but this isn't the case in Mercurial or SVN.
For Mercurial and SVN, we need to proxy the protocol that's coming over the wire, look at each request from the client, and then check if it's a read or a write. To support this, provide a more flexible version of `passthruIO`.
The way this will work is:
- The SSH IO channel is wrapped in a `ProtocolChannel` which can parse the the incoming stream into message objects.
- The `willWriteCallback` will look at those messages and determine if they're reads or writes.
- If they're writes, it will check for write permission.
- If we're good to go, the message object is converted back into a byte stream and handed to the underlying command.
Test Plan: Executed `git clone`, `git clone --depth 3`, `git push` (against no-write repo, got error), `git push` (against valid repo).
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, asherkin, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7551
Summary: Like D7423, but for SSH.
Test Plan: Ran `git clone ssh://...`, got a clone.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7424
Summary:
- Add web UI for configuring SSH hosting.
- Route git reads (`git-upload-pack` over SSH).
Test Plan:
>>> orbital ~ $ git clone ssh://127.0.0.1/
Cloning into '127.0.0.1'...
Exception: Unrecognized repository path "/". Expected a path like "/diffusion/X/".
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/X/
Cloning into 'X'...
Exception: No repository "X" exists!
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/MT/
Cloning into 'MT'...
Exception: This repository is not available over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
>>> orbital ~ $ git clone ssh://127.0.0.1/diffusion/P/
Cloning into 'P'...
Exception: TODO: Implement serve over SSH.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7421