1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-12-05 21:26:14 +01:00
Commit graph

166 commits

Author SHA1 Message Date
epriestley
d6fd365704 Correct Diffusion browse behavior when visiting a path URI with no trailing slash
Summary:
See PHI1983. Ref T13599. Ref T13589. Currently, if you browse to a path browse URI in Diffusion without a trailing slash (`/browse/master/src`), you get a nonsensical view (the directory as a single item).

Be more precise in how "git ls-tree" arguments are constructed.

Test Plan: Visited files and directories in the browse view, with and without trailing slashes. Saw improved behavior for directories with no trailing slash and reasonable behavior in all other cases.

Maniphest Tasks: T13599, T13589

Differential Revision: https://secure.phabricator.com/D21528
2021-01-28 08:52:58 -08:00
epriestley
888604c9dd Fix a "setExternalURI()" fatal while browsing directories with submodules
Summary:
Ref T13595. See that task for discussion.

D21511 renamed the iteration variable here (previously "$path") but did not rename this use of it.

Test Plan:
  - In Diffusion, browsed a directory with a submodule.
    - Before: "setExternalURI()" fatal in conduit call.
    - After: directory listing including submodule.

Maniphest Tasks: T13595

Differential Revision: https://secure.phabricator.com/D21520
2021-01-26 09:14:21 -08:00
epriestley
bafe8d1bbd Correct Git repository browse behavior for differences in "ls-tree" output
Summary:
Ref T13589. The output for "git ls-tree commit:path" (the old invocation) and "git ls-tree commit -- path" (the new invocation) differs: the latter emits absolute paths.

Update the code to account for this difference in behavior.

Test Plan:
  - Browsed a non-root directory in a Git repository in Diffusion.
  - Before: saw absolute paths.
  - After: saw relative paths.

Maniphest Tasks: T13589

Differential Revision: https://secure.phabricator.com/D21519
2021-01-25 09:13:36 -08:00
epriestley
0e28105ff7 Further correct and disambigutate ref selectors passed to Git on the CLI
Summary:
Ref T13589. In D21510, not every ref selector got touched, and this isn't a valid construction in Git:

```
$ git ls-tree ... -- ''
```

Thus:

  - Disambiguate more (all?) ref selectors.
  - Correct the construction of "git ls-tree" when there is no path.
  - Clean some stuff up: make the construction of some flags and arguments more explicit, get rid of a needless "%C", prefer "%Ls" over acrobatics, etc.

Test Plan: Browsed/updated a local Git repository. (This change is somewhat difficult to test exhaustively, as evidenced by the "ls-tree" issue in D21510.)

Maniphest Tasks: T13589

Differential Revision: https://secure.phabricator.com/D21511
2021-01-20 12:07:14 -08:00
epriestley
ea9cb0b625 Disambiguate Git ref selectors in some Git command line invocations
Summary: Ref T13589. See that task for discussion.

Test Plan: Executed most commands via "bin/conduit" or in isolation.

Maniphest Tasks: T13589

Differential Revision: https://secure.phabricator.com/D21510
2021-01-13 12:31:28 -08:00
epriestley
e454c3dafe Wrap all direct access to author/committer properties on "CommitData"
Summary: Ref T13552. Currently, various callers read raw properties off "CommitData" directly. Wrap these in accessors to support storage changes which persist "CommitRef" information instead.

Test Plan:
- Ran "diffusion.querycommits", saw the same data before and after.
- Looked at a commit, saw authorship information and date.
- Viewed tags in a repository, saw author information.
- Ran "rebuild-identities", saw no net effect.
- Grepped for callers to "getCommitDetail(...)".

Maniphest Tasks: T13552

Differential Revision: https://secure.phabricator.com/D21448
2020-09-15 17:36:39 -07:00
epriestley
7d6874d9f0 Turn "bypassCache" into a no-op in "diffusion.querycommits"
Summary: Ref T13552. The internal caller for this now uses "internal.commit.search", which is always authority-reading. No legitimate external caller should rely on the behavior of "bypassCache"; no-op it to simplify behavior.

Test Plan: Called "diffusion.querycommits", saw the same data as before.

Maniphest Tasks: T13552

Differential Revision: https://secure.phabricator.com/D21447
2020-09-15 17:36:39 -07:00
epriestley
a9506097ea Add "internal.commit.search" to replace the cache bypass mode of "diffusion.querycommits"
Summary:
Ref T13552. Commit parsers currently invoke a special mode of "diffusion.querycommits", which is an older frozen method.

The replacement, "diffusion.commit.search", is not really appropriate for low-level access. This mode of having a single method which operates in "cache" or "non-cache" modes also ends up in a lot of unnecessary field shuffling.

Provide "internal.commit.search" as a modern equivalent that returns a "DiffusionCommitRef"-compatible structure.

Test Plan: Executed "internal.commit.search", got sensible low-level commit results.

Maniphest Tasks: T13552

Differential Revision: https://secure.phabricator.com/D21443
2020-09-15 17:36:38 -07:00
epriestley
a745055813 Lift Diffusion Conduit call proxying to the root level of Conduit
Summary:
Ref T13552. Some Diffusion conduit calls may only be served by a node which hosts a working copy on disk, so they're proxied if received by a different node.

This capability is currently bound tightly to "DiffusionRequest", which is a bundle of context parameters used by some Diffusion calls. However, call proxying is not fundamentally a Diffusion behavior.

I want to perform proxying on a "*.search" call which does not use the "DiffusionRequest" parameter bundle. Lift proxying to the root level of Conduit.

Test Plan: Browsed diffusion in a clusterized repsository.

Maniphest Tasks: T13552

Differential Revision: https://secure.phabricator.com/D21442
2020-09-15 17:36:37 -07:00
Arturas Moskvinas arturas@uber.com
8cc6fe465c Fix diffusion.branchquery returning dictionary instead of array when branches are filtered out
Summary:
`diffusion.branchquery` can return dictionary instead of array if some branches are filtered out.
Eg.:
```
{
  "result": [
    {
      "shortName": "master",
      "commitIdentifier": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
      "refType": "branch",
      "rawFields": {
        "objectname": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
        "objecttype": "commit",
```
might become:
```
{
  "result": {
    "1": {
      "shortName": "master",
      "commitIdentifier": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
      "refType": "branch",
      "rawFields": {
        "objectname": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
        "objecttype": "commit",

```
Reproduction - find repository which has couple of branches, setup to track only some of them, execute `diffusion.branchquery` API call - result is dictionary instead of array

Test Plan: Apply patch, execution `diffusion.branchquery` call - result is no longer dictionary if it was one before

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D20973
2020-02-12 11:50:22 -08:00
Austin McKinley
2f313a0e0d Remove "unstable" status and T2784-specific warning message
Summary: Ref T2784. These are lookin' pretty stable. Subclasses like `DiffusionGetLintMessagesConduitAPIMethod` have their warnings about unstable methods, so just remove this warning in the base class.

Test Plan: Loaded `/conduit`, observed lack of unstable warnings. Only unstable methods are now `diffusion.getlintmessages`, `diffusion.looksoon`, and `diffusion.updatecoverage`.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Maniphest Tasks: T2784

Differential Revision: https://secure.phabricator.com/D20651
2019-07-15 14:05:26 -07:00
epriestley
a82a2b8459 Simplify non-bare working copy rules for the new "git fetch" strategy
Summary:
Ref T13280. In D20421, I changed our observe strategy to `git fetch <uri>` in all cases.

This doesn't work in an ancient, non-bare repository if `master` is checked out and `master` is also fetch: `git` refuses to overwrite the local ref unless we pass `--update-head-ok`. Pass this flag.

Also, remove some code which examines branches and tags in a special way for non-bare working copies. The old `git fetch <origin>` code without explicit revsets meant that `refs/remotes/orgin/heads/xyz` got updated instead of `refs/heads/xyz`. We now update our local refs in all cases (bare and non-bare) so we can throw away this special casing.

Test Plan:
  - Replaced a modern bare working copy with a non-bare working copy by explicitly using `git clone` without `--bare`.
  - Ran `bin/repository update`, hit the `--update-head-ok` error. Applied the patch, got a clean update.
  - Used the "repository.branchquery" API method...
    - ...with "contains" to trigger the "git branch" case. Got identical results after removing the special casing.
    - ...without "contains" to trigger the "low level ref" case. Got identical results after removing the special casing.
  - Grepped for `isWorkingCopyBare()`. The only remaining callsites deal with hook paths, and genuinely need to be specialized.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13280

Differential Revision: https://secure.phabricator.com/D20450
2019-04-19 07:06:10 -07:00
epriestley
34e90d8f51 Clean up a few "%Q" stragglers in SVN repository browsing code
Summary: See <https://discourse.phabricator-community.org/t/unsafe-raw-string-warning-for-svn-in-diffusion/2469>.

Test Plan: Browed a Subversion repository in Diffusion. These are all reachable from the main landing page if the repository has commits/files, I think. Before change: errors in log; after change: no issues.

Reviewers: amckinley

Reviewed By: amckinley

Differential Revision: https://secure.phabricator.com/D20244
2019-03-05 11:30:55 -08:00
epriestley
933462b487 Continue cleaning up queries in the wake of changes to "%Q"
Summary: Depends on D19810. Ref T13217. Ref T13216. I mostly used `grep implode | grep OR` and `grep implode | grep AND` to find these -- not totally exhaustive but should be a big chunk of the callsites that are missing `%LO` / `%LA`.

Test Plan:
These are tricky to test exhaustively, but I made an attempt to hit most of them:

- Browsed Almanac interfaces.
- Created/browsed Calendar events.
- Enabled/disabled/showed the lock log.
- Browsed repositories.
- Loaded Facts UI.
- Poked at Multimeter.
- Used typeahead for users and projects.
- Browsed Phriction.
- Ran various fulltext searches.

Not sure these are reachable:

- All the lint stuff might be dead/unreachable/nonfunctional?

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim

Maniphest Tasks: T13217, T13216

Differential Revision: https://secure.phabricator.com/D19814
2018-11-16 12:49:44 -08:00
epriestley
e26c4bddab Replace magical "branch" behavior in "diffusion.branchquery" with an explicit "patterns"
Summary:
See PHI958. Ref T13210. Previously, see PHI720.

The use case for the magic in PHI720 involves multiple patterns, and no parameter can be passed to `branch` that will result in multiple patterns being passed to `git`.

Replace the implicit magic with an explicit `patterns` parameter.

This whole thing is a bit shaky but probably isn't hurting anything.

Test Plan:
  - Ran query with no `patterns`.
  - Ran query with invalid `patterns`, got readable error.
  - Ran query with various valid `patterns` (plain branch name, globs with "?" and "*"), got sensible results.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13210

Differential Revision: https://secure.phabricator.com/D19771
2018-11-14 14:50:53 -08:00
epriestley
da40f80741 Update PhabricatorLiskDAO::chunkSQL() for new %Q semantics
Summary:
Ref T13217. This method is slightly tricky:

  - We can't safely return a string: return an array instead.
  - It no longer makes sense to accept glue. All callers use `', '` as glue anyway, so hard-code that.

Then convert all callsites.

Test Plan: Browsed around, saw fewer "unsafe" errors in error log.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13217

Differential Revision: https://secure.phabricator.com/D19784
2018-11-13 08:59:18 -08:00
epriestley
3574a55a95 Deprecate Conduit method "diffusion.getrecentcommitsbypath"
Summary:
See D19558. This method has no callers and just wraps `diffusion.historyquery`, since D5960 (2013).

This was introduced in D315 (which didn't make it out of FB, I think) inside Facebook for unclear purposes in 2011.

Test Plan: Grepped for callers, found none.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: artms

Differential Revision: https://secure.phabricator.com/D19559
2018-08-03 09:48:58 -07:00
Arturas Moskvinas
356b2781bc Gracefully fail request if non existing callsign is passed to getrecentcommitsbypath instead of crashing
Summary:
`diffusion.getrecentcommitsbypath` fails with 500 error when non existing callsign is passed:
```
>>> UNRECOVERABLE FATAL ERROR <<<

Call to a member function getCommit() on null

```

Expected Behavior:
Return more graceful error notifying caller that such callsign/repository does not exist

Reproduction steps:
Open conduit: https://secure.phabricator.com/conduit/method/diffusion.getrecentcommitsbypath/
Enter:
callsign: "obviouslynotexisting"
path: "/random"
Click call method

Test Plan: after applying patch - call no longer fails with 500s

Reviewers: Pawka, epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19558
2018-08-02 19:49:10 +03:00
epriestley
8ab8c390b7 If "branch" is provided to "diffusion.branchquery", use it as the "<pattern>" argument to "git branch --contains ..."
Summary:
Ref T13151. See PHI720. If you want to test if commit X appears on specific branch Y, `git branch --contains X -- Y` is faster than (effectively) `git branch --contains X | grep Y`.

Since this call has a "branch" parameter anyway, use it as the pattern argument if provided.

Test Plan:
  - Called the API method with no parameters, got all branches.
  - Called the API method with `master`, got just master.
  - Called the API method with `maste*`, got master. This behavior is not officially supported and may change in the future.
  - Viewed a commit, still saw all branches.
    - Grepped for `diffusion.branchquery` and verified that no remaining callsites pass a default "branch" parameter.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13151

Differential Revision: https://secure.phabricator.com/D19499
2018-06-22 17:38:19 -07:00
epriestley
5b640a434c Support an "Ancestors Of: ..." constraint in commit queries
Summary:
Ref T13137. See PHI609. An install would like to filter audit requests on a particular branch, e.g. "master".

This is difficult in the general case because we can not apply this constraint efficiently under every conceivable data shape, but we can do a reasonable job in most practical cases.

See T13137#238822 for more detailed discussion on the approach here.

This is a bit rough, but should do the job for now.

Test Plan:
- Filtered commits by various branches, e.g. "master"; "lfs". Saw correct-seeming results.
- Stubbed out the "just list everything" path to hit the `diffusion.internal.ancestors` path, saw the same correct-seeming results.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13137

Differential Revision: https://secure.phabricator.com/D19431
2018-05-08 15:51:42 -07:00
epriestley
a5bbadbaba Fix another Git 2.16.0 CLI compatibility issue
Summary:
This command also needs a "." instead of an empty string now.

(This powers the file browser typeahead in Diffusion.)

Test Plan: Will test in production since there's still no easy 2.16 installer for macOS.

Reviewers: amckinley

Reviewed By: amckinley

Differential Revision: https://secure.phabricator.com/D19010
2018-02-07 17:54:39 -08:00
epriestley
162563d40b Move the fix for Git 2.16.0 from the "Mercurial" part of the code to the "Git" part of the code
Summary: Ref T13050. Oh boy. Both of them run `grep`!

Test Plan: Will push again.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13050

Differential Revision: https://secure.phabricator.com/D18945
2018-01-26 13:48:35 -08:00
epriestley
a80d1e7e7d Pass "." to git grep to satisfy "all paths" for Git 2.16.0
Summary:
Ref T13050. See <https://discourse.phabricator-community.org/t/issues-with-git-2-16-0/1004/2>.

`secure` picked up 2.16.0 so this reproduces now: <https://secure.phabricator.com/source/phabricator/browse/master/?grep=dog>

Test Plan: Will push.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13050

Differential Revision: https://secure.phabricator.com/D18944
2018-01-26 13:38:18 -08:00
epriestley
7cbbe2ccf7 When users browse to a submodule path in Diffusion explicitly, don't fatal
Summary: Ref T13030. See PHI254. This behavior could be cleaner than I've made it, but it fixes the "this is totally broken" issue, replacing a fatal/exception with an informative (just not terribly useful) page.

Test Plan:
  - Added a submodule to a repository.
  - In Diffusion, clicked some other file next to the submodule, then edited the URI to the submodule path instead.
    - Before patch: fatal.
    - After patch: relatively useful message about this being a submodule.

Note that it's normally hard to hit this URI directly. In the browse view, submodules are marked up as directories and linked to a separate submodule resolution flow.

{F5321524}

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13030

Differential Revision: https://secure.phabricator.com/D18831
2017-12-18 09:18:22 -08:00
epriestley
a989dd181d Fix Mercurial commit history ordering
Summary:
See <https://discourse.phabricator-community.org/t/diffusion-observed-mercurial-repository-history-broken/825>.

In D18769, I rewrote this from using the `--branch` flag (which is unsafe and does not function on branches named `--config=x.y` and such).

However, this rewrite accidentally changed the result order, which impacted Mercurial commit hisotry lists and graphs. Swap the order of the constraints so we get newest-to-oldest again, as expected.

Test Plan: Viewed a Mercurial repository's history graph, saw sensible chronology after the patch.

Reviewers: amckinley

Reviewed By: amckinley

Differential Revision: https://secure.phabricator.com/D18817
2017-12-05 12:12:44 -08:00
epriestley
a7921a4448 Filter and reject "--config" and "--debugger" flags to Mercurial in any position
Summary:
Ref T13012. These flags can be exploited by attackers to execute code remotely. See T13012 for discussion and context.

Additionally, harden some Mercurial commands where possible (by using additional quoting or embedding arguments in other constructs) so they resist these flags and behave properly when passed arguments with these values.

Test Plan:
  - Added unit tests.
  - Verified "--config" and "--debugger" commands are rejected.
  - Verified more commands now work properly even with branches and files named `--debugger`, although not all of them do.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13012

Differential Revision: https://secure.phabricator.com/D18769
2017-11-10 08:42:07 -08:00
epriestley
85011a46d0 Bail out of PhabricatorRepositoryGraphCache more aggressively after cache fills
Summary:
Ref PHI109. Ref T11786. We currently test elapsed time every 64 iterations (since iterations are normally very fast), but at least one install is seeing the page timeout after 30 seconds.

One reason could be that cache fills may occur, and are likely to be much slower than normal iterations. In an extreme case, we could do 64 cache fills before checking the time. Tweak thing so that we always check the time after doing a cache fill, regardless of how many iterations have elapsed since the last attempt.

Additionally, this API method currently accepts an arbitrary number of paths, but implicitly limits each cache query to 500ms. If more than 60 paths are passed, this may exceed 30s. Only let the cache churn for a maximum of 10s across all paths.

If this is more the latter issue than the former, this might replace the GraphCache timeouts with `git` timeouts, but at least our understanding of what's going on here will improve.

Test Plan: This is difficult to test convincingly locally, since I can't reproduce the original issue. It still works after these changes, but it worked fine before these changes too.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T11786

Differential Revision: https://secure.phabricator.com/D18692
2017-10-06 14:12:58 -07:00
epriestley
d5163d0143 Write patterns to "git grep" on stdin instead of passing them with "-e"
Summary:
Fixes T12807. Some shells may apparently mangle/strip UTF8 characters? Just dodge this whole problem by sending the pattern over stdin rather than actually figuring out the particulars.

Related tasks, like T7339 and T5554, discuss finding broader fixes for this class of issue, and this definitely isn't exactly a fully legitimate fix, but in many cases (as here) we can reasonably just avoid the problem rather than actually fixing it, at least for a long time.

Test Plan: Searched for emoji and non-emoji locally, but this worked fine (on OSX) for me before the patch too.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T12807

Differential Revision: https://secure.phabricator.com/D18105
2017-06-08 15:25:55 -07:00
Jakub Vrana
9f3cde4db7 Fix errors found by PHPStan
Test Plan: None.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D17377
2017-02-18 09:24:56 +00:00
Jakub Vrana
a778151f28 Fix errors found by PHPStan
Test Plan: Ran `phpstan analyze -a autoload.php phabricator/src`.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D17371
2017-02-17 10:10:15 +00:00
epriestley
2e3e078358 Remove "diffusion.createcomment" Conduit API method
Summary: Ref T10978. This was introduced in D6923 in 2013 as a deprecated method (before methods were extensible) and has only ever been deprecated. It no longer works after D17250 (despite my mistaken claim there that we never had an API for actions), and has been superceded by `diffusion.commit.edit` which is a modern, fully-power method.

Test Plan: Viewed Conduit console, no longer saw method.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10978

Differential Revision: https://secure.phabricator.com/D17254
2017-01-26 12:57:15 -08:00
epriestley
19525ed81a Add diffusion.commit.search Conduit API method
Summary: Ref T10978. This is bare bones, but the SearchEngine is at least mostly in reasonable shape now, so get it in place and freeze the old stuff. I previously froze `audit.query`, which did much the same thing.

Test Plan: Issued some queries with the API, technically got results back.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10978

Differential Revision: https://secure.phabricator.com/D17194
2017-01-12 13:23:29 -08:00
epriestley
2941b34acb Add "diffusion.commit.edit", a v3 edit API endpoint for commits
Summary: Ref T10978. This currently does almost nothing, but gets it in place so I can add stuff to it.

Test Plan: Made a comment on a commit using the API.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10978

Differential Revision: https://secure.phabricator.com/D17178
2017-01-11 10:38:14 -08:00
Aviv Eyal
43f9927a38 Compare two branches
Summary:
This shows the commits list only (Actual `git diff` will show up at a later date).
The inputs are left as text-fields, to allow the form to accept anything that can be resolved. The form is GET, to allow sharing URIs.

The conduit method response array is compatible with that of `diffusion.historyquery`, to make it easy to build
the "history" table.

The hardest part here was, of course, Naming. I think "from" and "onto" are unconfusing, and I'm fairly confident that the "to merge"
instructions are in sync with the actual content of the page.

Test Plan: Look at several "compare" views, with various values of "from" and "onto".

Reviewers: #blessed_reviewers!, epriestley

Subscribers: caov297, 20after4, Sam2304, reardencode, baileyb, chad, Korvin

Maniphest Tasks: T929

Differential Revision: https://secure.phabricator.com/D15330
2016-12-05 16:25:49 -08:00
Mukunda Modell
c0bf08058b Check for empty output from git ls-tree
Summary: Fixes T10155

Test Plan: View an empty repository in diffusion, check for the exception.
See T10155 for steps to reproduce

Reviewers: epriestley

Subscribers:
2016-09-10 06:02:48 -05:00
epriestley
c55de86f0e Return Diffusion diffs through Files, not directly over Conduit
Summary:
Fixes T10423. Ref T11524. This changes `diffusion.rawdiffquery` to return a file PHID instead of a blob of data.

This is better in general, but particularly better for huge diffs (as in T10423) and diffs with non-utf8 data (as in T10423).

Test Plan:
  - Used `bin/differential extract` to extract a latin1 diff, got a clean diff.
  - Used `bin/repository reparse --herald` to rerun herald on a latin1 diff, got a clean result.
  - Pushed latin1 diffs to test commit hooks.
  - Triggered the the too large / too slow logic.
  - Viewed latin1 diffs in Diffusion.
  - Used "blame past this change" in Diffusion to hit the `before` logic.

Reviewers: chad

Reviewed By: chad

Subscribers: eadler

Maniphest Tasks: T10423, T11524

Differential Revision: https://secure.phabricator.com/D16460
2016-08-27 09:11:03 -07:00
epriestley
771579496f Make logic for streaming VCS stuff directly to Files more reusable
Summary:
Ref T11524. Ref T10423. Earlier, I converted `diffusion.filecontentquery` to put the actual file content in Files, then return a PHID for the file, instead of trying to send the content over Conduit.

In T11524, we have a similar set of problems with diffs that contain non-UTF8 data (and, in T10423, diffs that are simply enormous).

I want to provide an API method to do the same sort of thing with diff output (like from `git diff`), so we call the method, it shoves the data in Files, and then we go pull it out of Files.

To support this, take the "shove the output of a Future into Files" logic and put it in a new base `FileFuture` query. This will let me make `RawDiffQuery` share the logic more easily.

Test Plan: Browsed Diffusion, ran `diffusion.filecontentquery` to fetch file content.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10423, T11524

Differential Revision: https://secure.phabricator.com/D16458
2016-08-27 09:10:20 -07:00
epriestley
d44a5fa933 In Git, only use "--find-copies-harder" on small diffs
Summary:
Ref T10423. This flag can cause `git diff` to take an enormously long time (the problem case was a 5M line, 20K file commit).

Instead:

  - Run without the flag first.
  - If that shows that the diff is definitely small, try again with the flag.
  - If that works, return the slower, better output.
  - If the fast diff affects too many paths or generating the slow diff takes too long, return the faster, slightly worse output.

The quality of the output differs in how well Git is able to detect "M" and "C" (moves and copies of files).

For example, if you copy `src/` to `srcpro/`, the fast output may not show that you copied files. The slow output will.

I think this is rarely useful for large copies anyway: it's interesting if a 1-2 file diff is a copy, but usually obvious/uninteresting if a 500-file diff is a copy.

Test Plan:
  - Ran `bin/repository reparse --change rXnnn` on Git changes.
  - Saw fast and slow commands execute normally.
  - Tried on a large diff, saw only the fast command execute.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10423

Differential Revision: https://secure.phabricator.com/D16266
2016-07-10 08:03:57 -07:00
epriestley
2a1393c008 Fix impropery history graph trace in Mercurial
Summary: Fixes T11267. This data was coming back weird (in reverse order relative to the graph itself). Previously it worked OK anyway, but the new logic is a little more sensitive to the input.

Test Plan: Viewed a Mercurial repository with linear history, saw linear history.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T11267

Differential Revision: https://secure.phabricator.com/D16229
2016-07-04 10:24:14 -07:00
epriestley
f5f784f4c1 Version clustered, observed repositories in a reasonable way (by largest discovered HEAD)
Summary:
Ref T4292. For hosted, clustered repositories we have a good way to increment the internal version of the repository: every time a user pushes something, we increment the version by 1.

We don't have a great way to do this for observed/remote repositories because when we `git fetch` we might get nothing, or we might get some changes, and we can't easily tell //what// changes we got.

For example, if we see that another node is at "version 97", and we do a fetch and see some changes, we don't know if we're in sync with them (i.e., also at "version 97") or ahead of them (at "version 98").

This implements a simple way to version an observed repository:

  - Take the head of every branch/tag.
  - Look them up.
  - Pick the biggest internal ID number.

This will work //except// when branches are deleted, which could cause the version to go backward if the "biggest commit" is the one that was deleted. This should be OK, since it's rare and the effects are minor and the repository will "self-heal" on the next actual push.

Test Plan:
  - Created an observed repository.
  - Ran `bin/repository update` and observed a sensible version number appear in the version table.
  - Pushed to the remote, did another update, saw a sensible update.
  - Did an update with no push, saw no effect on version number.
  - Toggled repository to hosted, saw the version reset.
  - Simulated read traffic to out-of-sync node, saw it do a remote fetch.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15986
2016-05-30 09:53:01 -07:00
epriestley
da599386f6 Add diffusion.uri.edit for creating and editing repository URIs
Summary: Ref T10748. Brings the rest of the transactions to EditEngine, supports creating via API.

Test Plan:
  - Created a URI via API.
  - Created a URI via web.
  - Tried to apply sneaky transactions, got rejected with good error messages. <_< >_>

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15821
2016-04-29 13:55:48 -07:00
epriestley
dc75b4bd06 Move all cluster locking logic to a separate class
Summary: Ref T10860. This doesn't change anything, it just separates all this stuff out of `PhabricatorRepository` since I'm planning to add a bit more state to it and it's already pretty big and fairly separable.

Test Plan: Pulled, pushed, browsed Diffusion.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10860

Differential Revision: https://secure.phabricator.com/D15790
2016-04-25 11:20:29 -07:00
epriestley
711f13660e Synchronize working copies before doing a "bypassCache" commit read
Summary:
Ref T4292. When the daemons make a query for repository information, we need to make sure the working copy on disk is up to date before we serve the response, since we might not have the inforamtion we need to respond otherwise.

We do this automatically for almost all Diffusion methods, but this particular method is a little unusual and does not get this check for free. Add this check.

Test Plan:
  - Made this code throw.
  - Ran `bin/repository reparse --message ...`, saw the code get hit.
  - Ran `bin/repository lookup-user ...`, saw this code get hit.
  - Made this code not throw.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15783
2016-04-22 08:11:43 -07:00
epriestley
d87c500002 Synchronize (hosted, clustered, Git) repositories over Conduit + HTTP
Summary:
Ref T4292. We currently synchronize hosted, clustered, Git repositories when we receive an SSH pull or push.

Additionally:

  - Synchronize before HTTP reads and writes.
  - Synchronize reads before Conduit requests.

We could relax Conduit eventually and allow Diffusion to say "it's OK to give me stale data".

We could also redirect some set of these actions to just go to the up-to-date host instead of connecting to a random host and synchronizing it. However, this potentially won't work as well at scale: if you have a larger number of servers, it sends all of the traffic to the leader immediately following a write. That can cause "thundering herd" issues, and isn't efficient if replicas are in different geographical regions and the write just went to the east coast but most clients are on the west coast. In large-scale cases, it's better to go to the local replica, wait for an update, then serve traffic from it -- particularly given that writes are relatively rare. But we can finesse this later once things are solid.

Test Plan:
  - Pushed and pulled a Git repository over HTTP.
  - Browsed a Git repository from the web UI.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15758
2016-04-19 13:05:45 -07:00
epriestley
575c01373e Extract repository command construction from Repositories
Summary:
Ref T4292. Ref T10366. Depends on D15751. Today, generating repository commands is purely a function of the repository, so they use protocols and credentials based on the repository configuration.

For example, a repository with an SSH "remote URI" always generate SSH "remote commands".

This needs to change in the future:

  - After T10366, repositories won't necessarily just have one type of remote URI. They can only have one at a time still, but the repository itself won't change based on which one is currently active.
  - For T4292, I need to generate intracluster commands, regardless of repository configuration. These will have different protocols and credentials.

Prepare for these cases by separating out command construction, so they'll be able to generate commands in a more flexible way.

Test Plan:
  - Added unit tests.
  - Browsed diffusion.
  - Ran `bin/phd debug pull` to pull a bunch of repos.
  - Ran daemons.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292, T10366

Differential Revision: https://secure.phabricator.com/D15752
2016-04-19 04:51:48 -07:00
epriestley
adf42db5ea Trivially implement RepositoryEditEngine and API methods
Summary: Ref T10748. Ref T10337. This technically implements this stuff, but it does not do anything useful yet. This skips all the hard stuff.

Test Plan:
  - Technically used `diffusion.repository.search` to get repository information.
  - Technically used `diffusion.repository.edit` to change a repository name.
  - Used `editpro/` to edit a repository name.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10337, T10748

Differential Revision: https://secure.phabricator.com/D15736
2016-04-17 16:02:13 -07:00
June Rhodes
7150aa8e19 Use Conduit in PhabricatorRepositoryGitCommitChangeParserWorker
Summary:
Ref T2783.  This allows this worker to run on a machine different to the one that stores the repository, by routing the execution of Git over Conduit calls.

This API method is super gross, but fixing it isn't straightforward and it runs into other complicated considerations. We can fix it later; for now, just define it as "internal" to limit how much mess this creates.

"Internal" methods do not appear on the console.

Test Plan: Ran `bin/repository reparse --change <commit> --trace` on several commits, saw daemons make a Conduit call instead of running a `git` command.

Reviewers: hach-que, chad

Reviewed By: chad

Subscribers: joshuaspence, Korvin, epriestley

Maniphest Tasks: T2783

Differential Revision: https://secure.phabricator.com/D11874
2016-04-14 04:53:03 -07:00
epriestley
601aaa5a86 Modularize content sources
Summary:
Ref T10537. For Nuance, I want to introduce new sources (like "GitHub" or "GitHub via Nuance" or something) but this needs to modularize eventually.

Split ContentSource apart so applications can add new content sources.

Test Plan:
This change has huge surface area, so I'll hold it until post-release. I think it's fairly safe (and if it does break anything, the breaks should be fatals, not anything subtle or difficult to fix), there's just no reason not to hold it for a few hours.

- Viewed new module page.
- Grepped for all removed functions/constants.
- Viewed some transactions.
- Hovered over timestamps to get content source details.
- Added a comment via Conduit.
- Added a comment via web.
- Ran `bin/storage upgrade --namespace XXXXX --no-quickstart -f` to re-run all historic migrations.
- Generated some objects with `bin/lipsum`.
- Ran a bulk job on some tasks.
- Ran unit tests.

{F1190182}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10537

Differential Revision: https://secure.phabricator.com/D15521
2016-03-26 11:59:45 -07:00
Vlad Albulescu
130e1d1f68 Unbreak regex filename search
Summary:
D9087 adds a nice typeahead but breaks the existing regex
search by quoting the pattern. Ideally, this change won't break the
typeahead, which as far as I can tell doesn't use the `pattern`
argument.

Test Plan:
Not yet.
RFC as to whether this change makes sense, will fix my local setup and resend if so.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D15500
2016-03-20 10:15:21 -07:00
epriestley
58c2141ffd Fix limit calculation for largeish Mercurial repsositories
Summary:
Fixes T10304. In Mercurial, we must enumerate the whole file tree. Currently, we incorrectly count files within directories (which won't be shown) toward the "100 file" limit at top level, so directories with more than 100 subpaths are truncated improperly.

This is approxiately the same as @richardvanvelzen's fix.

Test Plan: Viewed a large Mercurial repository, saw a complete directory listing.

Reviewers: chad

Reviewed By: chad

Subscribers: richardvanvelzen

Maniphest Tasks: T10304

Differential Revision: https://secure.phabricator.com/D15282
2016-02-16 14:14:59 -08:00