Summary: Fixes T12243. That error occured due to network flakiness with some mounted filesystems so I'm not sure how best to simulate it. But you can look and see that the PhutilProxyException does indeed expect an exception as its second arg.
Test Plan: Look at method signature... look at callsite... now back at the method. Smile and nod.
Reviewers: #blessed_reviewers, yelirekim, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Maniphest Tasks: T12243
Differential Revision: https://secure.phabricator.com/D17335
Summary:
Fixes T12087. When transitioning into a clustered configuration for the first time, the documentation recommends using a one-device cluster as a transitional step.
However, installs may not do this for whatever reason, and we aren't as clear as we could be in warning about clusterizing directly into a multi-device cluster.
Roughly, when you do this, we end up believing that working copies exist on several different devices, but have no information about which copy or copies are up to date. //Usually// they all were already synchronized and are all up to date, but we can't make this assumption safely without risking data.
Instead, we err on the side of caution, and require a human to tell us which copy we should consider to be up-to-date, using `bin/repository thaw --promote`.
Test Plan:
```
$ ./bin/repository clusterize rLOCKS --service repos001.phacility.net
Service "repos001.phacility.net" is actively bound to more than one device
(local002.local, local001.phacility.net).
If you clusterize a repository onto this service it will be unclear which
devices have up-to-date copies of the repository. This leader/follower
ambiguity will freeze the repository. You may need to manually promote a
device to unfreeze it. See "Ambiguous Leaders" in the documentation for
discussion.
Continue anyway? [y/N]
```
Read other changes.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12087
Differential Revision: https://secure.phabricator.com/D17169
Summary:
Fixes T11940. In 2.11.0, Git has made a change so that newly-pushed changes are held in a temporary area until the hook accepts or rejects them.
This magic temporary area is only readable if the appropriate `GIT_ENVIRONMENTAL_MAGIC` variables are available. When executing `git` commands, pass them through from the calling context.
We're intentionally conservative about which variables we pass, and with good reason (see "httpoxy" in T11359). I think this continues to be the correct default behavior.
Test Plan:
- Upgraded to Git 2.11.0.
- Tried to push over SSH, got a hook error.
- Applied patch.
- Pulled and pushed over SSH.
- Pulled and pushed over HTTP.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11940
Differential Revision: https://secure.phabricator.com/D16988
Summary:
Ref T11044. Few issues here:
- The `PhutilProxyException` is missing an argument (hit this while in read-only mode).
- The `$ref_key` is unused.
- When you add a new master to an existing cluster, we can incorrectly apply `.php` patches which we should not reapply. Instead, mark them as already-applied.
Test Plan:
- Poked this locally, but will initialize `secure004` as an empty master to be sure.
Reviewers: chad, avivey
Reviewed By: avivey
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16916
Summary:
Fixes T11590. Currently, we incorrectly consider cluster repository versions that are (or were) on devices which are no longer part of the active cluster service when building this status screen.
Instead, ignore them. This is just a display bug; the actual `ClusterEngine` already had similar logic.
Test Plan:
- Added a bad leader record to `repository_workingcopyversion`.
- Before patch, got a bad "Partial (1w)" sync:
{F1802292}
- After patch, got a good "Sycnchronized":
{F1802293}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11590
Differential Revision: https://secure.phabricator.com/D16492
Summary:
I converted this call incorrectly in D16092. We should pass the `PhutilURI` object, not the string version of it.
Specifically, this resulted in hitting an error like this if a replica needed synchronization:
```
[2016-08-11 21:22:37] EXCEPTION: (InvalidArgumentException) Argument 1 passed to DiffusionCommandEngine::setURI() must be an instance of PhutilURI, string given, called in...
#0 PhutilErrorHandler::handleError(integer, string, string, integer, array) called at [<phabricator>/src/applications/diffusion/protocol/DiffusionCommandEngine.php:52]
#1 DiffusionCommandEngine::setURI(string) called at [<phabricator>/src/applications/diffusion/protocol/DiffusionRepositoryClusterEngine.php:601]
...
```
Test Plan: Clusterized an observed repository, demoted a node, ran `bin/repository update Rxxx` to update, saw no typehint fatal.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D16390
Summary: Ref T10227. When we perform `git` http operations (fetch, mirror) check if we should use a proxy; if we should, set `http_proxy` or `https_proxy` in the environment to make `git` have `curl` use it.
Test Plan:
- Configured a proxy extension to run stuff through a local instance of Charles.
- Ran `repository pull` and `repository mirror`.
- Saw `git` HTTP requests route through the proxy.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10227
Differential Revision: https://secure.phabricator.com/D16092
Summary:
Ref T4292. For hosted, clustered repositories we have a good way to increment the internal version of the repository: every time a user pushes something, we increment the version by 1.
We don't have a great way to do this for observed/remote repositories because when we `git fetch` we might get nothing, or we might get some changes, and we can't easily tell //what// changes we got.
For example, if we see that another node is at "version 97", and we do a fetch and see some changes, we don't know if we're in sync with them (i.e., also at "version 97") or ahead of them (at "version 98").
This implements a simple way to version an observed repository:
- Take the head of every branch/tag.
- Look them up.
- Pick the biggest internal ID number.
This will work //except// when branches are deleted, which could cause the version to go backward if the "biggest commit" is the one that was deleted. This should be OK, since it's rare and the effects are minor and the repository will "self-heal" on the next actual push.
Test Plan:
- Created an observed repository.
- Ran `bin/repository update` and observed a sensible version number appear in the version table.
- Pushed to the remote, did another update, saw a sensible update.
- Did an update with no push, saw no effect on version number.
- Toggled repository to hosted, saw the version reset.
- Simulated read traffic to out-of-sync node, saw it do a remote fetch.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15986
Summary:
Fixes T11020. I think this resolves things -- `$new_version` (set above) should be used, not `$new_log` directly.
Specifically, we would get into trouble if the initial push failed for some reason (working copy not initialized yet, commit hook rejected, etc).
Test Plan: Made a bad push to a new repository. Saw it freeze before the patch and succeed afterwards.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11020
Differential Revision: https://secure.phabricator.com/D15969
Summary: Fixes T10941. This avoids a confusing dead end when configuring Subversion hosting, where `svnserve` will fail to execute hooks if the CWD isn't readable by the vcs-user.
Test Plan:
- Updated and committed in a hosted SVN repository.
- Ran some git operations, too.
- @dpotter confirmed this locally in T10941.
Reviewers: chad
Reviewed By: chad
Subscribers: dpotter
Maniphest Tasks: T10941
Differential Revision: https://secure.phabricator.com/D15879
Summary:
Ref T10748. Ref T10366. Allows users to set credential for new URIs.
- Ref T7221. Our handling of the "git://" protocol is currently incorrect. This protocol is not authenticated, but is considered an SSH protocol. In the new UI, it is considered an anonymous/unauthenticated protocol instead.
- Ref T10241. This fixes the `PassphraseCredentialControl` so it doesn't silently edit the value if the current value is not visible to you and/or not valid.
Test Plan:
Performed a whole lot of credential edits, removals, and adjustments. I'll give this additional vetting before cutting over to it.
{F1253207}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T7221, T10241, T10366, T10748
Differential Revision: https://secure.phabricator.com/D15829
Summary:
Ref T10860. This allows us to recover if the connection to the database is lost during a push.
If we lose the connection to the master database during a push, we would previously freeze the repository. This is very safe, but not very operator-friendly since you have to go manually unfreeze it.
We don't need to be quite this aggressive about freezing things. The repository state is still consistent after we've "upgraded" the lock by setting `isWriting = 1`, so we're actually fine even if we lost the global lock.
Instead of just freezing the repository immediately, sit there in a loop waiting for the master to come back up for a few minutes. If it recovers, we can release the lock and everything will be OK again.
Basically, the changes are:
- If we can't release the lock at first, sit in a loop trying really hard to release it for a while.
- Add a unique lock identifier so we can be certain we're only releasing //our// lock no matter what else is going on.
- Do the version reads on the same connection holding the lock, so we can be sure we haven't lost the lock before we do that read.
Test Plan:
- Added a `sleep(10)` after accepting the write but before releasing the lock so I could run `mysqld stop` and force this issue to occur.
- Pushed like this:
```
$ echo D >> record && git commit -am D && git push
[master 707ecc3] D
1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 254 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
BEGIN SLEEP
```
- Here, I stopped `mysqld` from the CLI in another terminal window.
```
END SLEEP
# CRITICAL. Failed to release cluster write lock!
# The connection to the master database was lost while receiving the write.
# This process will spend 300 more second(s) attempting to recover, then give up.
```
- Here, I started `mysqld` again.
```
# RECOVERED. Link to master database was restored.
# Released cluster write lock.
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
2cbf87c..707ecc3 master -> master
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15792
Summary:
Ref T10860. At least in Git over SSH, we can freely echo a bunch of stuff to stderr and Git will print it to the console, so we can tell users what's going on.
This should make debugging, etc., easier. We could tone this down a little bit once things are more stable if it's a little too chatty.
Test Plan:
```
$ echo D >> record && git commit -am D && git push
[master ca5efff] D
1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 256 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
8616189..ca5efff master -> master
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15791
Summary: Ref T10860. This doesn't change anything, it just separates all this stuff out of `PhabricatorRepository` since I'm planning to add a bit more state to it and it's already pretty big and fairly separable.
Test Plan: Pulled, pushed, browsed Diffusion.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15790
Summary:
Ref T4292. We currently synchronize hosted, clustered, Git repositories when we receive an SSH pull or push.
Additionally:
- Synchronize before HTTP reads and writes.
- Synchronize reads before Conduit requests.
We could relax Conduit eventually and allow Diffusion to say "it's OK to give me stale data".
We could also redirect some set of these actions to just go to the up-to-date host instead of connecting to a random host and synchronizing it. However, this potentially won't work as well at scale: if you have a larger number of servers, it sends all of the traffic to the leader immediately following a write. That can cause "thundering herd" issues, and isn't efficient if replicas are in different geographical regions and the write just went to the east coast but most clients are on the west coast. In large-scale cases, it's better to go to the local replica, wait for an update, then serve traffic from it -- particularly given that writes are relatively rare. But we can finesse this later once things are solid.
Test Plan:
- Pushed and pulled a Git repository over HTTP.
- Browsed a Git repository from the web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15758
Summary:
Ref T4292. When you run `git fetch` and connect to, say, `repo001.west.company.com`, we'll look at the current version of the repository in other nodes in the cluster.
If `repo002.east.company.com` has a newer version of the repository, we'll fetch that version first, then respond to your request.
To do this, we need to run `git fetch repo002.east.company.com ...` and have that connect to the other host and be able to fetch data.
This change allows us to run `PHABRICATOR_AS_DEVICE=1 git fetch ...` to use device credentials to do this fetch. (Device credentials are already supported and used, they just always connect as a user right now, but these fetches should be doable without having a user. We will have a valid user when you run `git fetch` yourself, but we won't have one if the daemons notice that a repository is out of date and want to update it, so the update code should not depend on having a user.)
Test Plan:
```
$ PHABRICATOR_AS_DEVICE=1 ./bin/ssh-connect local.phacility.com
Warning: Permanently added 'local.phacility.com' (RSA) to the list of known hosts.
PTY allocation request failed on channel 0
phabricator-ssh-exec: Welcome to Phabricator.
You are logged in as device/daemon.phacility.net.
You haven't specified a command to run. This means you're requesting an interactive shell, but Phabricator does not provide an interactive shell over SSH.
Usually, you should run a command like `git clone` or `hg push` rather than connecting directly with SSH.
Supported commands are: conduit, git-lfs-authenticate, git-receive-pack, git-upload-pack, hg, svnserve.
Connection to local.phacility.com closed.
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15755
Summary:
Ref T4292. Ref T10366. Depends on D15751. Today, generating repository commands is purely a function of the repository, so they use protocols and credentials based on the repository configuration.
For example, a repository with an SSH "remote URI" always generate SSH "remote commands".
This needs to change in the future:
- After T10366, repositories won't necessarily just have one type of remote URI. They can only have one at a time still, but the repository itself won't change based on which one is currently active.
- For T4292, I need to generate intracluster commands, regardless of repository configuration. These will have different protocols and credentials.
Prepare for these cases by separating out command construction, so they'll be able to generate commands in a more flexible way.
Test Plan:
- Added unit tests.
- Browsed diffusion.
- Ran `bin/phd debug pull` to pull a bunch of repos.
- Ran daemons.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292, T10366
Differential Revision: https://secure.phabricator.com/D15752
Summary:
If Mercurial 3.4+ is used to host repositories in Phabricator, any clients using 3.5+ will receive an exception after the bundle is pushed up. Clients will also fail to update phases for changesets pushed up.
Before directly responding to mercurial clients with all capabilities, this change filters out the 'bundle2' capability so the client negotiates using a legacy bundle wire format instead.
Test Plan:
Server: Mercurial 3.5
Client: Mercurial 3.4
Test with both HTTP and SSH protocols:
1. Create a local commit on client
2. Push commit to server
3. Verify the client emits something like:
```
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 1 changes to 1 files
```
Closes T9450
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin
Maniphest Tasks: T9450
Differential Revision: https://secure.phabricator.com/D14241
Summary: All classes should extend from some other class. See D13275 for some explanation.
Test Plan: `arc unit`
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13283
Summary: Applied some more linter fixes that I previously missed because my global `arc` install was out-of-date.
Test Plan: Will run `arc unit` on another host.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9443
Summary: If you push a large binary and the data crosses multiple data frames, we can end up in a loop in the parser.
Test Plan:
After this change, I was able to push a 95MB binary in 7s, which seems reasonable:
>>> orbital ~/repos/INIS $ svn st
A large2.bin
>>> orbital ~/repos/INIS $ ls -alh
total 390648
drwxr-xr-x 6 epriestley admin 204B Dec 18 17:14 .
drwxr-xr-x 98 epriestley admin 3.3K Dec 16 11:19 ..
drwxr-xr-x 7 epriestley admin 238B Dec 18 17:14 .svn
-rw-r--r-- 1 epriestley admin 80B Dec 18 15:07 README
-rw-r--r-- 1 epriestley admin 95M Dec 18 16:53 large.bin
-rw-r--r-- 1 epriestley admin 95M Dec 18 17:14 large2.bin
>>> orbital ~/repos/INIS $ time svn commit -m 'another large binary'
Adding (bin) large2.bin
Transmitting file data .
Committed revision 25.
real 0m7.215s
user 0m5.327s
sys 0m0.407s
>>> orbital ~/repos/INIS $
There may be room to improve this by using `PhutilRope`.
Reviewers: wrotte, btrahan, wotte
Reviewed By: wotte
CC: aran
Differential Revision: https://secure.phabricator.com/D7798
Summary:
Ref T2230. The SVN protocol has a sensible protocol format with a good spec here:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_ra_svn/protocol
Particularly, compare this statement to the clown show that is the Mercurial wire protocol:
> It is possible to parse an item without knowing its type in advance.
WHAT A REASONABLE STATEMENT TO BE ABLE TO MAKE ABOUT A WIRE PROTOCOL
Although it makes substantially more sense than Mercurial, it's much heavier-weight than the Git or Mercurial protocols, since it isn't distributed.
It's also not possible to figure out if a request is a write request (or even which repository it is against) without proxying some of the protocol frames. Finally, several protocol commands embed repository URLs, and we need to reach into the protocol and translate them.
Test Plan: Ran various SVN commands over SSH (`svn log`, `svn up`, `svn commit`, etc).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7556
Summary:
Ref T2230. This is substantially more complicated than Git, but mostly because Mercurial's protocol is a like 50 ad-hoc extensions cobbled together. Because we must decode protocol frames in order to determine if a request is read or write, 90% of this is implementing a stream parser for the protocol.
Mercurial's own parser is simpler, but relies on blocking reads. Since we don't even have methods for blocking reads right now and keeping the whole thing non-blocking is conceptually better, I made the parser nonblocking. It ends up being a lot of stuff. I made an effort to cover it reasonably well with unit tests, and to make sure we fail closed (i.e., reject requests) if there are any parts of the protocol I got wrong.
A lot of the complexity is sharable with the HTTP stuff, so it ends up being not-so-bad, just very hard to verify by inspection as clearly correct.
Test Plan:
- Ran `hg clone` over SSH.
- Ran `hg fetch` over SSH.
- Ran `hg push` over SSH, to a read-only repo (error) and a read-write repo (success).
Reviewers: btrahan, asherkin
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7553
Summary: Ref T2230. This is easily the worst thing I've had to write in a while. I'll leave some notes inline.
Test Plan: Ran `hg clone http://...` on a hosted repo. Ran `hg push` on the same. Changed sync'd both ways.
Reviewers: asherkin, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7520