1
0
Fork 0
mirror of https://we.phorge.it/source/arcanist.git synced 2025-01-10 23:01:04 +01:00
No description
Find a file
Alex Vandiver 9144658e69 Accelerate working tree operations in git
Summary:
Currently, `arc` on `git` uses the following commands to examine the
state of the working tree and history; example times for a no-op diff in a
165k-file working tree are also shown:

```
1)  git diff --no-ext-diff --no-textconv --submodule=short --raw 'fb062d4ecce5d9c1786b7bfc8a0dedf6b11fdd96' --
  = 1,722,514 us

2a) git diff --no-ext-diff --no-textconv --submodule=short --raw 'HEAD' --
  = 1,715,507 us

2b) git ls-files --others --exclude-standard
  = 2,359,202 us

3)  git diff-files --name-only
  = 1,333,274 us
```

Steps (2a) and (2b) are run concurrently; this results in a total elapsed
wallclock time of approximately 5.4 seconds.  This is inefficient -- all four of
the above steps must both load the index and examine the working copy, which may
be slow operations when large repositories are used.  Additionally, none of the
effort of those stat calls on the working tree, or load time of the index, is
shared across the processes.

Step (1) is called from `getCommitRangeStatus`, which was split out in D4095; it
is currently never called on its own, only ever from `getWorkingCopyStatus`,
where it it combined with `getUncommittedStatus`.  The current behavior of the
method is to return the set of changes //either// in local commits //or//
uncommitted in the working tree, which duplicates work that
`getUncommittedStatus` is intended to do.  Changing the behavior of this method
(in Git, and other VCSes) to only examine _committed_ status seems both inline
with the name of the method and the original description of it in D4095 -- and
also serves to make it much faster, as it is an operation that need not inspect
the working tree at all.

Steps (2a), (2b), and (3) attempt to gather the state of the working copy, and
as such are all I/O bound but must examine nearly identical data.  For git
2.11.0 and higher, we can instead rely on the machine-parseable `git status
--porcelain=2` format, which provides the information from all of these commands
at once.  It also allows additional performance improvements, as `git status`
has been the focus of several optimizations in the latest versions of git (the
untracked cache and fsmonitor services, for instance), which are not available
in the lower-level `diff`, `ls-files`, and `diff-files` commands.

This has the added benefit of fixing a bug noticed in T9455, in that uncommitted
or unstaged changes in modules can now be detected, regardless of if they also
have changed their base commit.  It further resolves a bug where `.gitmodules`
appeared to have unstaged changes, when in reality the unstaged changes were in
submodules elsewhere in the tree.

For backwards compatibility with versions of git < 2.11.0, the old code is left
in place.  It is possible that the simpler output from v1 of `git status
--porcelain` would also suffice for some of the above benefits, but the payoff
of parsing yet another format is deemed insufficient; users wishing improved
performance should simply upgrade `git`.

Alltogether, these result in the following, for a no-op diff in a
165k-working-file tree:

```
1) git diff --no-ext-diff --no-textconv --submodule=short --raw 'fb062d4ecce5d9c1786b7bfc8a0dedf6b11fdd96' HEAD --
 = 9,227 us
2) git status --porcelain=2 -z
 = 739,964 us
```

...for a total of 749ms, an improvement of 4.7s.

Depends on D18841.

Test Plan: Existing tests.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D18842
2017-12-23 20:12:50 -08:00
bin Added ArcanistTextLinter::LINT_BOF_WHITESPACE and ArcanistTextLinter::LINT_EOF_WHITESPACE 2014-01-13 18:05:42 -08:00
externals Don't bundle PEP8 2015-05-20 07:03:22 +10:00
resources Add some more spelling corrections to Spelling linter 2014-12-22 16:08:02 -08:00
scripts Reduce the strength of "arc executing on arc" from an error to a warning 2017-07-21 12:09:59 -07:00
src Accelerate working tree operations in git 2017-12-23 20:12:50 -08:00
.arcconfig Set "history.immutable" to "false" explicitly in .arcconfig in Arcanist 2016-08-03 08:13:09 -07:00
.arclint Fold ArcanistPhutilXHPASTLinter into ArcanistXHPASTLinter 2015-11-13 07:08:40 +11:00
.arcunit Rough version of configuration driven unit test engine 2015-08-11 06:54:16 +10:00
.editorconfig Test XHPAST linter rules in isolation 2015-11-19 08:57:23 +11:00
.gitignore Update .gitignore. 2014-06-14 11:44:38 -07:00
LICENSE Fix text lint issues 2015-04-07 18:09:27 +10:00
NOTICE Remove duplicate newline 2014-07-17 08:25:22 +10:00
README.md Move README to Markdown 2015-04-13 13:01:16 -07:00

Arcanist is the command-line tool for Phabricator. It allows you to interact with Phabricator installs to send code for review, download patches, transfer files, view status, make API calls, and various other things. You can read more in the User Guide

For more information about Phabricator, see http://phabricator.org/.

LICENSE

Arcanist is released under the Apache 2.0 license except as otherwise noted.