Summary:
Ref T13432. Git has a "diff.suppressBlankEmpty" config option which makes it emit nonstandard diffs with trimmed trailing whitespace on unchanged blank lines.
Currently, we don't parse these diffs correctly. Even if we do in the future, emitting a more standard diff is desirable.
Explicitly disable this option when executing "git diff" so we build more standard diffs.
Test Plan:
- Configured this option.
- Modified a file with a blank line in it without changing the blank line, got this goofy display diff:
{F6985234}
- Applied patch, rediffed the same change, saw "-c diff.suppressBlankEmpty" in "--trace" output and got this sensible diff:
{F6985235}
Maniphest Tasks: T13432
Differential Revision: https://secure.phabricator.com/D20877
Summary:
See PHI904. Ref T13210. Ref T13209. Currently, we have an `hg cat` construction which attempts to pass a literal `%p` to Mercurial. This fails because you can't pass `%` through `%s` outside of `wilds`.
It also uses `%C` to pass a list of file paths. This is broadly unsafe and can cause command execution if you modify a file named, e.g., `; rm -rf xyz` or similar. I think it would be difficult to turn this into an attack but it's fairly bad. This dates from D5144 in 2013.
Test Plan: With this patch, created D19757 which has valid binary data (see F5962134).
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13210, T13209
Differential Revision: https://secure.phabricator.com/D19758
Summary:
See https://discourse.phabricator-community.org/t/arc-not-supporting-git-2-17-1/.
When treating it a large file binary, we try to get the "old" and "new" content using `git ls-tree` and `cat-file`.
If the file is new or deleted, there is no old file, so we try to work with filename `null`.
Under git < 2.17.1, that gets treated as `git ls-tree -- .`, which falls in the next condition under "no such path".
In git 2.18, etc, this is an error.
Explicitly bail out if there is no filename.
Test Plan: Add a new, large (>4Mb) file, arc diff.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D19513
Summary:
Filenames are last in `git status --porcelain=2` lines; they
are not escaped in any way, despite the fields being
whitespace-delimited. `explode` thus happily chops apart filenames
with spaces in them, causing later git operations to operate only on
the filename up to the first space.
Split the lines into the right number of elements -- in all cases,
this is one more than the index we're using, since filenames come last.
Test Plan:
Altering a file with a space in its path, and running `arc diff -a`.
Added tests.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D19389
Summary:
Currently, `arc` on `git` uses the following commands to examine the
state of the working tree and history; example times for a no-op diff in a
165k-file working tree are also shown:
```
1) git diff --no-ext-diff --no-textconv --submodule=short --raw 'fb062d4ecce5d9c1786b7bfc8a0dedf6b11fdd96' --
= 1,722,514 us
2a) git diff --no-ext-diff --no-textconv --submodule=short --raw 'HEAD' --
= 1,715,507 us
2b) git ls-files --others --exclude-standard
= 2,359,202 us
3) git diff-files --name-only
= 1,333,274 us
```
Steps (2a) and (2b) are run concurrently; this results in a total elapsed
wallclock time of approximately 5.4 seconds. This is inefficient -- all four of
the above steps must both load the index and examine the working copy, which may
be slow operations when large repositories are used. Additionally, none of the
effort of those stat calls on the working tree, or load time of the index, is
shared across the processes.
Step (1) is called from `getCommitRangeStatus`, which was split out in D4095; it
is currently never called on its own, only ever from `getWorkingCopyStatus`,
where it it combined with `getUncommittedStatus`. The current behavior of the
method is to return the set of changes //either// in local commits //or//
uncommitted in the working tree, which duplicates work that
`getUncommittedStatus` is intended to do. Changing the behavior of this method
(in Git, and other VCSes) to only examine _committed_ status seems both inline
with the name of the method and the original description of it in D4095 -- and
also serves to make it much faster, as it is an operation that need not inspect
the working tree at all.
Steps (2a), (2b), and (3) attempt to gather the state of the working copy, and
as such are all I/O bound but must examine nearly identical data. For git
2.11.0 and higher, we can instead rely on the machine-parseable `git status
--porcelain=2` format, which provides the information from all of these commands
at once. It also allows additional performance improvements, as `git status`
has been the focus of several optimizations in the latest versions of git (the
untracked cache and fsmonitor services, for instance), which are not available
in the lower-level `diff`, `ls-files`, and `diff-files` commands.
This has the added benefit of fixing a bug noticed in T9455, in that uncommitted
or unstaged changes in modules can now be detected, regardless of if they also
have changed their base commit. It further resolves a bug where `.gitmodules`
appeared to have unstaged changes, when in reality the unstaged changes were in
submodules elsewhere in the tree.
For backwards compatibility with versions of git < 2.11.0, the old code is left
in place. It is possible that the simpler output from v1 of `git status
--porcelain` would also suffice for some of the above benefits, but the payoff
of parsing yet another format is deemed insufficient; users wishing improved
performance should simply upgrade `git`.
Alltogether, these result in the following, for a no-op diff in a
165k-working-file tree:
```
1) git diff --no-ext-diff --no-textconv --submodule=short --raw 'fb062d4ecce5d9c1786b7bfc8a0dedf6b11fdd96' HEAD --
= 9,227 us
2) git status --porcelain=2 -z
= 739,964 us
```
...for a total of 749ms, an improvement of 4.7s.
Depends on D18841.
Test Plan: Existing tests.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D18842
Summary:
This adds tests that detail the current behavior of `arc` in
the presence of `git` submodules.
Test Plan: No behavior change; wrote the tests such that they pass.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D18841
Test Plan:
Removed a submodule with `diff.submodule` set to `log`, saw
`arc diff` error; with this change, it no longer does.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Maniphest Tasks: T10881
Differential Revision: https://secure.phabricator.com/D17327
Summary:
`git ls-remote` has an unusual way to indicate a URL was not
found: echoing back user input
```
$ git ls-remote --get-url does_not_exist
does_not_exist
$ echo $?
0
```
`getRemoteURI` handles checking for remotes other than 'origin', but
the error handling always matched against the string 'origin'
regardless of remote name.
Test Plan:
With a git config along the lines of:
```
[remote "my_special_name"]
url = ssh://secure@secure.phabricator.com/diffusion/ARC/arcanist.git
fetch = +refs/heads/*:refs/remotes/my_special_name/*
[branch "master"]
remote = github
merge = refs/heads/master
[remote "github"]
# url = git@github.com:phacility/arcanist.git
fetch = +refs/heads/*:refs/remotes/github/*
```
and running in a branch tracking `master` (github). `arc which` would
(without this diff) show:
```
The remote URI for this working copy is "github".
```
With this diff, `arc which` correctly shows:
```
Unable to determine the remote URI for this repository.
```
When diffing against a tracking branch with a propertly configured
remote (the happy path), `arc which` still correctly identifies the
remote URI:
```
The remote URI for this working copy is
"ssh://secure@secure.phabricator.com/diffusion/ARC/arcanist.git".
```
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, chad, epriestley
Differential Revision: https://secure.phabricator.com/D17110
Summary:
`parseGitRawDiff` dealt with the `A`, `M`, and `D` status flags from
`git diff --raw`, for file additions, modifications, and deletions
respectively. However, it failed to cope with `C` and `R` flags, for
copies and renames. Git version 2.9 and above default to resolving
renames, even in `git diff --raw` output, making this lack of support
only salient now (though users with Git's `diff.rename` set
encountered it previously).
Those two flags differ from the other three in that they offer both the
source and destination filename, separated by a tab. As
`parseGitRawDiff` was not aware of this property, it returned a
"filename" of `"oldfile\tnewfile"`. This is surfaced in several
places, including as passed to linters as a filename to check.
Needless to say, this file is nearly guaranteed to never exist on
disk.
Detect both the `C` and `R` flag types, and generate either a file
addition, or a pair of addition/deletion entries.
Test Plan:
Renamed a file, with a linter that printed each file it was
called with.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: jboning, Korvin
Differential Revision: https://secure.phabricator.com/D16387
Summary:
When doing svn copy, or svn mv, a SynthenticAdditionDiff is generated.
If the path is a directory, an error will occur when checking the
mime-type of the directory. Immediately after the properties check,
the function returns null if the path is a directory. Move this
check to before the properties check to avoid exiting with an error.
```
Command failed with error #1!
COMMAND
svn propget 'svn:mime-type' '/home/trasz/svn/ports/cad/py-pycam'@
STDOUT
(empty)
STDERR
svn: warning: W200017: Property 'svn:mime-type' not found on '/home/trasz/svn/ports/cad/py-pycam@'
svn: E200000: A problem occurred; see other errors for details
(Run with `--trace` for a full exception trace.)
```
Test Plan: Created differentials of changes with `svn copy` and `svn mv`
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin
Tags: #subversion
Differential Revision: https://secure.phabricator.com/D15985
Summary:
`arc which` is currently broken for svn repos.
Fixes T6635
(I think I independently wrote an identical change to yours)
Test Plan: `∴../../p/arcanist/bin/arc which` on a svn repo.
Reviewers: aik099, epriestley, #blessed_reviewers
Reviewed By: aik099, epriestley, #blessed_reviewers
Subscribers: aik099, Korvin
Maniphest Tasks: T6635
Differential Revision: https://secure.phabricator.com/D15922
Summary:
This is mostly just a personal quality-of-life fix. I run this command fairly often and having it return a little faster is nice.
This replaces a `git show` for each individual branch with a big `git for-each-ref` which we were already running anyway. This is quite a bit faster.
This command also occasionally hangs or segfaults for me while executing the huge pile of subprocesses. This is unreliable to reproduce, probably some bug in some PHP extension I have, and likely hard to narrow down, and this approach is better in every way anyway.
Test Plan:
- Ran `arc branch` in Git, observed faster output (in my `phabricator/`, about 2000ms -> 1200ms).
- Ran `arc feature` in Mercurial.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D15735
Summary:
Fixes T9953.
- "-c" was introduced in 1.7.2.
- "--no-color" has existed forever as far as I can tell.
- "--no-column" was introducd in 1.7.11, but there was nothing that needed to be disabled before that (hopefully).
Test Plan:
- Ran `arc which --trace` and observed a reasonable `git branch` command with correct output.
- Ran `arc which --trace` with a faked older Git version, observed command omit `--no-column`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9953
Differential Revision: https://secure.phabricator.com/D14735
Summary:
Instead of blindly assuming that "origin" is the repository that
arcanist should communicate with, use the remote that is configured
for the branch in git.
Test Plan:
Used `arc which` with a branch with no upstream, an
origin/master upstream, and an upstream/master upstream -- the last of
which is being used to create and land this diff.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: joshuaspence, Korvin
Differential Revision: https://secure.phabricator.com/D14530
Summary:
Fixes T9661. Users can construct arbitrarily long chains from the remote, like:
(remote) origin/master -> (local) cascade-a -> (local) cascade-b -> (local) cascade-c -> (local) cascade-d
When a user lands "cascade-d" onto "origin/master", we should pull A, B and C if they aren't ahead of the remote.
If a user lands "cascade-d" onto itself, we should pull A, B, and C if they aren't ahead of the remote, then reset D to the remote.
We also find this chain if the last component of it is connected by the local branch having the same name as the remote branch (typical for "master") instead of an actual connection through tracking brnaches.
Test Plan: See comment below.
Reviewers: chad
Reviewed By: chad
Subscribers: edibiase
Maniphest Tasks: T9661
Differential Revision: https://secure.phabricator.com/D14361
Summary:
Ref T9661. I need to reuse this to fix the complex workflow described in T9661 where we need to follow multiple paths to the upstream and cascade updates across them.
Pull the logic into a separate class to make this easier and less copy/pastey.
This shouldn't change any behavior.
Test Plan: Ran `arc land --preview` from detached head, remote-tracking branch, non-tracking branch, local-tracking branch. Selection of target/remote seemed correct in all cases.
Reviewers: chad
Reviewed By: chad
Subscribers: edibiase
Maniphest Tasks: T9661
Differential Revision: https://secure.phabricator.com/D14360
Summary:
Fixes T9543. Fixes T9658. Ref T3855.
Major functional change is that you can have a sequence of branches like:
origin/master -> notmaster -> feature1
...where they track each other, but you named your local master something else. Currently, we resolve only one level of upstreams, so we try to land onto "notmaster" in this case, which is wrong.
Instead, keep resolving upstreams until we either hit a cycle, don't have another upstream to look at, or find someting in a remote. In this case we'll eventually find "origin/master" and select "origin" as the remote and "master" as the target.
Other minor changes:
- Make this selection process explicit.
- Make the help 3000x longer.
Also fix a bug where we could incorrectly try to tell Differential to update awith `--preview`.
Test Plan:
- Landed from a tag.
- Landed from a tracking branch.
- Landed from an nth-degree tracking branch.
- Tried to land from a local branch with a cycle in upstreams.
- Landed with --remote and --onto.
- Read `arc help land`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9658, T3855, T9543
Differential Revision: https://secure.phabricator.com/D14357
Summary:
Fixes T9455. Depends on D14136. When you have a dirty submodule:
$ nano submodule/file.c # save changes
...we currently ask you to make a commit when you run `arc diff`, which is meaningless and misleading.
Instead, prompt the user separately.
This behavior isn't perfect but I think it's about the best we can do within reason.
Test Plan:
- Ran `arc diff` in a working copy with uncommitted submodule changes only, got new prompt.
- Ran `arc diff` in a working copy with submodule base commit changes only, got old (correct) prompt.
- Ran `arc diff` in a working copy with both, got only old prompt (which is incomplete, but reasonable/meaningful).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9455
Differential Revision: https://secure.phabricator.com/D14137
Summary:
'git ls-remote --get-url' is more correct, but younger and less
supported. This commit tempers previous optimism about its availability,
improving support for users of older git packages.
Test Plan:
* Set `git config url.xttps.insteadOf https` rewrite rule.
* Ran `arc which` with git 1.7.5 in `$PATH`, saw rewritten configured remote.
* Ran `arc which` with git 1.7.4 in `$PATH`, saw configured remote.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D13998
Summary:
This guards against stability issues with the output format of 'git
blame' (such as git config, localization (ref T5554) or future changes).
For example, `git config blame.blankboundary true` breaks `arc cover`
before this patch.
Test Plan:
* Set `git config blame.blankboundary true` on a test repo.
* Ran `arc cover`. It failed with an exception ("Bad blame?").
* Applied this patch.
* `arc cover` works.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin
Maniphest Tasks: T5554
Differential Revision: https://secure.phabricator.com/D13993
Summary:
Ref T5554. Both the current branch name (if on a branch), as well as the
list of all local branches, can be retrieved without having to parse the
output from "git branch".
Unfortunately, there seems to be no git plumbing for "get list of
branches containing this commit" yet.
(see http://marc.info/?l=git&m=141408477614635&w=2)
For that case, this commit whitelists the output from "git branch" using
the known valid branch names from "git for-each-ref".
Test Plan:
Set up a test repo with this structure:
```
| * Commit B1, on branch "subfeature"
| /
| * Commit A1, on branch "feature"
|/
* Commit M1, on branch "master"
|
```
In `subfeature`, I tried:
* `arc which --base 'git:branch-unique(master)'`
* `arc feature`
After that, I detached my HEAD (don't worry, I got better) and tried again.
Nothing looked broken.
(Tested with git 1.7.2.5 and 2.5.0.)
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin
Maniphest Tasks: T5554
Differential Revision: https://secure.phabricator.com/D13989
Summary:
Ref T5554. This makes git remote URL detection locale-agnostic.
The previously suggested `git config remote.origin.url` command does
almost the same, but does not support the URL rewriting features in
git-config (`url.<base>.insteadOf`).
This one does, although it has the unintuitive behavior of just printing
the passed remote name when the remote does not exist, or even when
called outside a git repo.
Test Plan:
* Switched to non-english locale in which git has a translation.
* Ran `arc which` on the Arcanist repo. It could not determine the remote URI.
* Applied patch, `arc which` found the URI.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: johnny-bit, Korvin
Maniphest Tasks: T5554
Differential Revision: https://secure.phabricator.com/D13983
Summary: All base classes should extend from `Phobject` or some other classes. See D13275 for some explanation.
Test Plan: `arc unit`
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13281
Summary: Ref T7604. Remove "arcanist projects" from `ArcanistWorkingCopy` and a few other callsites. Depends on D12999.
Test Plan: Can't really think of how to test this.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Maniphest Tasks: T7604
Differential Revision: https://secure.phabricator.com/D12945
Summary: I found a few strings that I had missed, using a mostly-broken-but-somewhat-okay custom linter ruler (https://secure.phabricator.com/differential/diff/30988/).
Test Plan: Intense eyeballing.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: aurelijus, Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12888
Summary: Ref T7977. The `ArcanistTestCase` class is pointless and can be replaced by `ArcanistPhutilTestCase`. Furthermore, it sorta makes sense to just rename `ArcanistPhutilTestCase` to `PhutilTestCase`. Depends on D12664 and D12666.
Test Plan: `arc unit`
Reviewers: avivey, #blessed_reviewers, epriestley
Reviewed By: avivey, #blessed_reviewers, epriestley
Subscribers: aurelijus, Korvin, epriestley
Maniphest Tasks: T7977
Differential Revision: https://secure.phabricator.com/D12665
Summary: Ref T5781. `git show .` works like HEAD, but that isn't what `arc browse .` means.
Test Plan: Ran `arc browse .` with a repository at a published commit.
Reviewers: chad, btrahan, avive, avivey
Reviewed By: avivey
Subscribers: epriestley, avivey
Maniphest Tasks: T5781
Differential Revision: https://secure.phabricator.com/D10197
Summary: Ref T5781. This makes things like `arc browse master` work (but they open the commit, not a revision).
Test Plan: Ran `arc browse master`.
Reviewers: csilvers, btrahan
Reviewed By: btrahan
Subscribers: epriestley, spicyj
Maniphest Tasks: T5781
Differential Revision: https://secure.phabricator.com/D10143
Summary:
Fixes T5555. Normally, when we `svn diff subdir/`, we use `--depth empty` to get only changes for the directory itself (usually, property changes).
However, this flag has no effect if the directory is newly added.
Adjust the diff parser so that if two sets of hunks are specified for a single file in a raw diff, we let the last one win instead of including both. This approach is a broadly more reasonable interpretation of these diffs.
Test Plan:
- Added a new file in a new subdirectory in Subversion.
- Ran `arc diff --only`.
- No double file content in resulting diff.
- Added unit test.
- There's fairly comprehensive unit test coverage for this stuff.
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T5555
Differential Revision: https://secure.phabricator.com/D9921
Summary:
These are documented as being identical, but `git diff a b` works if `a` is a tree (for example, `4b825d...`, the empty tree hash), but `git diff a..b` does not.
Particularly, with the `a..b` form, `arc diff --base arc:empty` does not work. With the `a b` form, it does.
Test Plan: Ran `arc diff --base arc:empty` in a repository and got a diff.
Reviewers: btrahan, talshiri
Reviewed By: talshiri
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9898
Summary: We manually quote this in a couple of places. That works fine on Lunix, but does not work on Windows. Instead, explicitly parameterize the command so the correct quoting rules are applied for the OS.
Test Plan: See IRC; windows user had issues fixed by this. `arc:upstream` still works locally.
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9868
Summary: I'm pretty sure that `@group` annotations are useless now... I believe that they were originally used by Diviner?
Test Plan: Eye-balled it.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin, aurelijus
Differential Revision: https://secure.phabricator.com/D9855
Summary: There were two callsites which needed some information from the username. Only one worked "correctly", causing `arc diff` to not amend commits anymore because the author could not be parser.
Test Plan: run `arc diff` with changes
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9797
Summary:
There's some sort of race inside `git` here, where the `git diff-files` command exits with different results some of the time when run in parallel with `git ls-files` or `git diff` (running either command was sufficient to trigger the race).
Run it separately to avoid the race.
I poked around the `git` source a little bit but quickly lost interest given that the issue seems fixed and this workaround is essentially reasonable.
Test Plan: Ran test 20x in a row without failures.
Reviewers: hach-que
Reviewed By: hach-que
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9616
Summary:
This leads to information being lost when others do `arc patch` because the name is used as the email address.
For example:
username = Richard van Velzen
Would give:
'authorName' => null,
'authorEmail' => 'Richard van Velzen'
Test Plan:
ran it through my head a couple of times, and tested it with the common options which all gave the expected result:
'rvanvelzen@company.com',
'Richard van Velzen',
'Richard van Velzen <rvanvelzen@company.com>',
'Richard van Velzen rvanvelzen@company.com',
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9605