Summary: There were a number of places that were generating nonsense queries for both hosted and non-hosted subversion repositories.
Test Plan: Attempted several activities in Diffusion with both a hosted and non-hosted subversion repository, including viewing various types of diffs and raw files.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Differential Revision: https://secure.phabricator.com/D7799
Summary: If you push a large binary and the data crosses multiple data frames, we can end up in a loop in the parser.
Test Plan:
After this change, I was able to push a 95MB binary in 7s, which seems reasonable:
>>> orbital ~/repos/INIS $ svn st
A large2.bin
>>> orbital ~/repos/INIS $ ls -alh
total 390648
drwxr-xr-x 6 epriestley admin 204B Dec 18 17:14 .
drwxr-xr-x 98 epriestley admin 3.3K Dec 16 11:19 ..
drwxr-xr-x 7 epriestley admin 238B Dec 18 17:14 .svn
-rw-r--r-- 1 epriestley admin 80B Dec 18 15:07 README
-rw-r--r-- 1 epriestley admin 95M Dec 18 16:53 large.bin
-rw-r--r-- 1 epriestley admin 95M Dec 18 17:14 large2.bin
>>> orbital ~/repos/INIS $ time svn commit -m 'another large binary'
Adding (bin) large2.bin
Transmitting file data .
Committed revision 25.
real 0m7.215s
user 0m5.327s
sys 0m0.407s
>>> orbital ~/repos/INIS $
There may be room to improve this by using `PhutilRope`.
Reviewers: wrotte, btrahan, wotte
Reviewed By: wotte
CC: aran
Differential Revision: https://secure.phabricator.com/D7798
Summary: Ref T4195. Same as D7793, but for mercurial. (As usual, SVN needs some goofy nonsense instead, so the next diff will just make this field work.)
Test Plan: Ran `reparse.php` on Git and Mercurial commits, var_dump'd the output and it looked correct.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7795
Summary: Ref T4195. I need to issue this command from the pre-commit hook to get commit bodies for hooks.
Test Plan: Ran `reparse.php --message --trace` and dumped the $ref, which looked correct.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7793
Summary: We currently have a lot of calls to `addCrumb(id(new PhabricatorCrumbView())->...)` which can be expressed much more simply with a convenience method. Nearly all crumbs are only textual.
Test Plan:
- This was mostly automated, then I cleaned up a few unusual sites manually.
- Bunch of grep / randomly clicking around.
Reviewers: btrahan, chad
Reviewed By: btrahan
CC: hach-que, aran
Differential Revision: https://secure.phabricator.com/D7787
Summary: Ref T4195. Add Mercurial support to the content hook phase.
Test Plan:
Here are some `commit` push logs for a Mercurial repo:
{F90689}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7792
Summary: Ref T4195. Adds support for diff content rules.
Test Plan: Pushed SVN and Git changes through, saw them generate reasonable transcripts. Mercurial still isn't hooked up to this phase.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7791
Summary: Ref T4195. This doesn't provide any interesting fields yet (content, affected paths, commit message) but fires the hook correctly.
Test Plan: Added a blocking hook and saw it fire.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7789
Summary:
Ref T615. Ref T4237. With `--debug`, Mercurial will echo an "ignoring untrusted configuration option" warning **to stdout** if `.hgrc` has the wrong owner.
However, we need `--debug` to make `{parents}` usable, at least until the patches I got into the upstream are widely deployed. So after getting `--debug` output, strip off any leading warnings.
These warnings should always be in English, at least, since we set `LANG` explicitly.
Test Plan: Unit tests. @asherkin, maybe you can confirm this? I can't actually get the warning, but I think my `hg` in PATH is just a bit out of date.
Reviewers: asherkin, btrahan
Reviewed By: asherkin
CC: asherkin, aran
Maniphest Tasks: T615, T4237
Differential Revision: https://secure.phabricator.com/D7784
Summary: Ref T4195. SVN has no such thing as refs (I was thinking about writing a quasi-ref anyway like `HEAD: r23 -> r24`, but I'm not sure it would actually be useful). And content is very easy to build.
Test Plan: Pushed some stuff to SVN, got logs from it.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7766
Summary: Ref T4195. This doesn't actually work like I thought it did: it only fires locally, when you run `hg tag`. Mercurial tags are also weird and basically don't make any sense and everyone should use bookmarks instead. We could implement some flavor of this eventually, but I'd like to see users request it first. They can implement their own with content-based hooks once those work, anyway.
Test Plan: This code didn't do anything.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7765
Summary:
Ref T4195. This pulls the central logic of HookEngine up one level and makes all the git stuff genrate PushLogs.
In future diffs, everything will generate PushLogs and we can hand those off to Herald.
Test Plan:
Pushed a pile of valid/invalid stuff:
{F89256}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7761
Summary: Fixes T4224. If you `git merge-base A B`, and they have //no// ancestor, the command exits with an error. Assume errors mean "no ancestry" and continue.
Test Plan: Completely rewrite a repository with a `--force` push.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4224
Differential Revision: https://secure.phabricator.com/D7756
Summary: We run `git` on a different port than 22, so would like to reflect this change in the UI.
Test Plan: Set diffusion.ssh-port in settings, then make sure it's reflected on the Diffusion repository Clone URI.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran, dctrwatson
Differential Revision: https://secure.phabricator.com/D7755
Summary: Fixes T4223. The output of `ls-tree` is partially delimited by spaces
and partially delimited by `\t`. The code I added in D7744 to help debug the
issue in T4159 doesn't work properly for files with 7 or more bytes in their
filesize, because the internals use `%7s`.
Auditors: btrahan
Summary: Most checks were actually in place, but `ExecFuture` throws a `CommandException` which wasn't taken into account.
Test Plan: look at the first command and no longer saw an exception. Also, other commits worked as well.
Reviewers: richardvanvelzen
Reviewed By: richardvanvelzen
CC: krisbuist, Korvin, epriestley, aran
Differential Revision: https://secure.phabricator.com/D7730
Summary: This locks push logs down a little bit and makes them slightly more administrative. Primarily, don't show IPs to googlebot, etc.
Test Plan: Viewed push logs as edit and non-edit users.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D7722
Summary:
Ref T4195. Like the previous diffs, these both create a useful log and give us an object to hand off to Herald.
Surface this information in Diffusion, too, and clean things up a little bit.
Test Plan: {F87565}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7718
Summary: Ref T4195. Add UI options to filter push logs by pusher and repository. Add a link from the repository view page to the push logs.
Test Plan: Viewed a hosted repository, clicked logs link, saw logs. Filtered lgos by repo/pusher.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7713
Summary: Ref T4195. Stores remote address and protocol in the logs, where possible.
Test Plan: Pushed some stuff, looked at the log, saw data.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7711
Summary:
Ref T4195. This log serves two purposes:
- It's a log, so you can see what happened. Particularly, in Git/Hg, there is no other way to tell:
- Who //pushed// a change (vs committed / authored)?
- When was a change pushed?
- What was the old value of some tag/branch before someone destroyed it?
- We can hand these objects off to Herald to implement pre-commit rules.
This is a very basic implementation, but gets some data written and has a basic UI for it.
Test Plan: {F87339}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7705
Summary:
Ref T4189. This doesn't add any rules yet, but does all the heavy lifting to figure out what's changed and put it in a consuamble (if somewhat ad-hoc) datastructure, which lists all the ref and tag modifications and all the new commits in a consistent way.
From here, it should be fairly straightforward to add top-level rules (e.g., ff pushes only).
Test Plan: Output is huge, see comments.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4189
Differential Revision: https://secure.phabricator.com/D7687
Summary: Ref T4189. Fixes T2066. Mercurial has a //lot// of hooks so I'm not 100% sure this is all we need to install (we may need separate hooks for tags/bookmarks) but it should cover most of what we're after at least.
Test Plan:
- `bin/repository pull`'d a Mercurial repo and got a hook install.
- Pushed to a Mercurial repository over SSH and HTTP, with good/bad hooks. Saw hooks fire.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2066, T4189
Differential Revision: https://secure.phabricator.com/D7685
Summary:
Ref T4189. This adds SVN support, which was a little more messy than I though. Principally, we can not use `PHABRICATOR_USER` for Subversion, because it strips away the entire environment for "security reasons".
Instead, use `--tunnel-user` plus `svnlook author` to figure out the author.
Also fix "ssh://" clone URIs, which needs to be "svn+ssh://".
Test Plan:
- Made SVN commits through the hook.
- Made Git commits, too, to make sure I didn't break anything.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4189
Differential Revision: https://secure.phabricator.com/D7683
Summary:
Ref T4189. T4189 describes most of the intent here:
- When updating hosted repositories, sync a pre-commit hook into them instead of doing a `git fetch`.
- The hook calls into Phabricator. The acting Phabricator user is sent via PHABRICATOR_USER in the environment. The active repository is sent via CLI.
- The hook doesn't do anything useful yet; it just veifies basic parameters, does a little parsing, and exits 0 to allow the commit.
Test Plan:
- Performed Git pushes and pulls over SSH and HTTP.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4189
Differential Revision: https://secure.phabricator.com/D7682
Summary: This was broken in rP51fb1ca16d7f.
Test Plan: Imported a repository with file:/// location, it worked.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Differential Revision: https://secure.phabricator.com/D7636
Summary: Fixes T2230. This isn't a total walk in the park to configure, but should work for early adopters now.
Test Plan: Read documentation, browsed UI.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7634
Summary:
Ref T4038. This adds everything except the actual pushing part for mirrors.
This isn't the most beautiful or sophisticated UI, but I want get the authoritative repositories self-hosted and get users beta-ing hosting as soon as possible. We can do transactions, etc., later on.
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4038
Differential Revision: https://secure.phabricator.com/D7632
Summary: Fixes T4122. Ref T2230. Instead of storing credentials on each repository, store them in Passphrase. This allows easy creation/management of many repositories which share credentials.
Test Plan:
- Upgraded repositories.
- Created and edited repositories.
- Pulled HTTP and SSH repositories.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230, T4122
Differential Revision: https://secure.phabricator.com/D7629
Summary:
Ref T2230. When fully set up, we have up to three users who all need to write into the repositories:
- The webserver needs to write for HTTP receives.
- The SSH user needs to write for SSH receives.
- The daemons need to write for "git fetch", "git clone", etc.
These three users don't need to be different, but in practice they are often not likely to all be the same user. If for no other reason, making them all the same user requires you to "git clone httpd@host.com", and installs are likely to prefer "git clone git@host.com".
Using three different users also allows better privilege separation. Particularly, the daemon user can be the //only// user with write access to the repositories. The webserver and SSH user can accomplish their writes through `sudo`, with a whitelisted set of commands. This means that even if you compromise the `ssh` user, you need to find a way to escallate from there to the daemon user in order to, e.g., write arbitrary stuff into the repository or bypass commit hooks.
This lays some of the groundwork for a highly-separated configuration where the SSH and HTTP users have the fewest privileges possible and use `sudo` to interact with repositories. Some future work which might make sense:
- Make `bin/phd` respect this (require start as the right user, or as root and drop privileges, if this configuration is set).
- Execute all `git/hg/svn` commands via sudo?
Users aren't expected to configure this yet so I haven't written any documentation.
Test Plan:
Added an SSH user ("dweller") and gave it sudo by adding this to `/etc/sudoers`:
dweller ALL=(epriestley) SETENV: NOPASSWD: /usr/bin/git-upload-pack, /usr/bin/git-receive-pack
Then I ran git pushes and pulls over SSH via "dweller@localhost". They successfully interacted with the repository on disk as the "epriestley" user.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7589
Summary:
Ref T4039. This fixes an issue where a user with the ability to create repositories could view repositories he is otherwise not permitted to see, by following these steps:
- Suppose you want to see repository "A".
- Create a repository with the same VCS, called "B".
- Edit the local path, changing "/var/repo/B" to "/var/repo/A".
- Now it points at a working copy of a repository you can't see.
- Although you won't be able to make it through discovery (the pull will fail with the wrong credentials), you can read some information out of the repository directly through the Diffusion UI, probably?
I'm not sure this was really practical to execute since there are a bunch of sanity checks along most/all of the major pathways, but lock it down since normal users shouldn't be editing it anyway. In the best case, this would make a mess.
Test Plan: {F81391}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4039
Differential Revision: https://secure.phabricator.com/D7580
Summary:
Small step forward which improves existing stuff or lays groudwork for future stuff:
- Currently, to check for email verification, we have to single-query the email address on every page. Instead, denoramlize it into the user object.
- Migrate all the existing users.
- When the user verifies an email, mark them as `isEmailVerified` if the email is their primary email.
- Just make the checks look at the `isEmailVerified` field.
- Add a new check, `isUserActivated()`, to cover email-verified plus disabled. Currently, a non-verified-but-not-disabled user could theoretically use Conduit over SSH, if anyone deployed it. Tighten that up.
- Add an `isApproved` flag, which is always true for now. In a future diff, I want to add a default-on admin approval queue for new accounts, to prevent configuration mistakes. The way it will work is:
- When the queue is enabled, registering users are created with `isApproved = false`.
- Admins are sent an email, "[Phabricator] New User Approval (alincoln)", telling them that a new user is waiting for approval.
- They go to the web UI and approve the user.
- Manually-created accounts are auto-approved.
- The email will have instructions for disabling the queue.
I think this queue will be helpful for new installs and give them peace of mind, and when you go to disable it we have a better opportunity to warn you about exactly what that means.
Generally, I want to improve the default safety of registration, since if you just blindly coast through the path of least resistance right now your install ends up pretty open, and realistically few installs are on VPNs.
Test Plan:
- Ran migration, verified `isEmailVerified` populated correctly.
- Created a new user, checked DB for verified (not verified).
- Verified, checked DB (now verified).
- Used Conduit, People, Diffusion.
Reviewers: btrahan
Reviewed By: btrahan
CC: chad, aran
Differential Revision: https://secure.phabricator.com/D7572
Summary: We don't actually support this yet, so hide the configuration.
Test Plan: Edited branches for an hg repo.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D7563
Summary:
Ref T2230. As far as I can tell, getting SVN working over HTTP is incredibly complicated. It's all DAV-based and doesn't appear to have any kind of binary we can just execute and pass requests through to. Don't support it for now.
- Disable it in the UI.
- Make sure all the error messages are reasonable.
Test Plan: Tried to HTTP an SVN repo. Tried to clone a Git repo with SVN, got a good error message.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7562
Summary:
Ref T2230. Very rarely, even though we've flushed the connection and sent all the data, we'll close the connection before Git is happy with it and it will flip out with an error like this:
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
This is hard to reproduce because it depends on the order of read/write operations we can't directly control. I only saw it about 2% of the time, by just running `git pull` over and over again.
Waiting for Git to close its side of the connection seems to fix it.
Test Plan: Ran `git clone` a ton of times without seeing the error again. Ran `git push` a ton of times with new commits.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7558
Summary:
Ref T2230. The SVN protocol has a sensible protocol format with a good spec here:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_ra_svn/protocol
Particularly, compare this statement to the clown show that is the Mercurial wire protocol:
> It is possible to parse an item without knowing its type in advance.
WHAT A REASONABLE STATEMENT TO BE ABLE TO MAKE ABOUT A WIRE PROTOCOL
Although it makes substantially more sense than Mercurial, it's much heavier-weight than the Git or Mercurial protocols, since it isn't distributed.
It's also not possible to figure out if a request is a write request (or even which repository it is against) without proxying some of the protocol frames. Finally, several protocol commands embed repository URLs, and we need to reach into the protocol and translate them.
Test Plan: Ran various SVN commands over SSH (`svn log`, `svn up`, `svn commit`, etc).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7556
Summary:
Ref T2230. This is substantially more complicated than Git, but mostly because Mercurial's protocol is a like 50 ad-hoc extensions cobbled together. Because we must decode protocol frames in order to determine if a request is read or write, 90% of this is implementing a stream parser for the protocol.
Mercurial's own parser is simpler, but relies on blocking reads. Since we don't even have methods for blocking reads right now and keeping the whole thing non-blocking is conceptually better, I made the parser nonblocking. It ends up being a lot of stuff. I made an effort to cover it reasonably well with unit tests, and to make sure we fail closed (i.e., reject requests) if there are any parts of the protocol I got wrong.
A lot of the complexity is sharable with the HTTP stuff, so it ends up being not-so-bad, just very hard to verify by inspection as clearly correct.
Test Plan:
- Ran `hg clone` over SSH.
- Ran `hg fetch` over SSH.
- Ran `hg push` over SSH, to a read-only repo (error) and a read-write repo (success).
Reviewers: btrahan, asherkin
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7553
Summary: Ref T2230. Fixes T4079. As it turns out, this is Git being weird. See comments for some detials about what's going on here.
Test Plan: Created shallow and deep Git clones.
Reviewers: hach-que, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4079, T2230
Differential Revision: https://secure.phabricator.com/D7554
Summary:
Ref T2230. In Git, we can determine if a command is read-only or read/write from the command itself, but this isn't the case in Mercurial or SVN.
For Mercurial and SVN, we need to proxy the protocol that's coming over the wire, look at each request from the client, and then check if it's a read or a write. To support this, provide a more flexible version of `passthruIO`.
The way this will work is:
- The SSH IO channel is wrapped in a `ProtocolChannel` which can parse the the incoming stream into message objects.
- The `willWriteCallback` will look at those messages and determine if they're reads or writes.
- If they're writes, it will check for write permission.
- If we're good to go, the message object is converted back into a byte stream and handed to the underlying command.
Test Plan: Executed `git clone`, `git clone --depth 3`, `git push` (against no-write repo, got error), `git push` (against valid repo).
Reviewers: btrahan
Reviewed By: btrahan
CC: hach-que, asherkin, aran
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7551