1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-15 19:32:40 +01:00
Commit graph

54 commits

Author SHA1 Message Date
epriestley
02d7bb8604 Add "bin/repository mark-reachable" for fixing commit reachability flags
Summary:
Ref T9028. This corrects the reachability of existing commits in a repository.

In particular, it can be used to mark deleted commits as unreachable.

Test Plan:
  - Ran it on a bad repository, with bad args, etc.
  - Ran it on a clean repo, got no changes.
  - Marked a reachable commit as unreachable, ran script, got it marked reachable.
  - Started deleting tags and branches from the local working copy while running the script, saw greater parts of the repository get marked unreachable.
  - Pulled repository again, everything automatically revived.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T9028

Differential Revision: https://secure.phabricator.com/D16132
2016-06-16 11:21:17 -07:00
epriestley
1c73ad6a1b Make repository daemon locks more granular and forgiving
Summary:
Ref T4292. Currently, we hold one big lock around the whole `bin/repository update` workflow.

When running multiple daemons on different hosts, this lock can end up being contentious. In particular, we'll hold it during `git fetch` on every host globally, even though it's only useful to hold it locally per-device (that is, it's fine/good/expected if `repo001` and `repo002` happen to be fetching from a repository they are observing at the same time).

Instead, split it into two locks:

  - One lock is scoped to the current device, and held during pull (usually `git fetch`). This just keeps multiple daemons accidentally running on the same host from making a mess when trying to initialize or update a working copy.
  - One lock is scoped globally, and held during discovery. This makes sure daemons on different hosts don't step on each other when updating the database.

If we fail to acquire either lock, assume some other process is legitimately doing the work and bail more quietly instead of fataling. In approximately 100% of cases where users have hit this lock contention, that was the case: some other daemon was running somewhere doing the work and the error didn't actually represent an issue.

If there's an actual problem, we still raise a diagnostically useful message if you run `bin/repository update` manually, so there are still tools to figure out that something is hung or whatever.

Test Plan:
  - Ran `bin/repository update`, `pull`, `discover`.
  - Added `sleep(5)`, forced processes to contend, got lock exceptions and graceful exit with diagnostic message.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15903
2016-05-13 05:17:27 -07:00
epriestley
984dff0ae3 Provide a more consistent, mostly relaxed severity for updating non-cluster repositories on cluster devices
Summary:
Fixes T10940. Two issues currently:

First, `PullLocal` deamon refuses to update non-cluster repositories on cluster devices. However, this is surprising/confusing/bad because as soon as you enroll a repository host in the cluster, most of the repositories on it stop working until you `clusterize` them. This is especially confusing because the documentation gives you a very nice, gradual walkthrough about going through things slowly and being able to check your work at every step, but we really drop you off a bit of a cliff here. The workflow implied by the documentation is a desirable one.

This operation is generally only unsafe/problematic if the daemon would be creating a //new// working copy. If a working copy already exists, we can reasonably guess that it's almost certainly because you've enrolled a previously un-clustered host into a new cluster. This allows the nice, gradual workflow the documentation describes to proceed as expected, without any weird surprises.

Instead of refusing to update these repositories, only refuse to update them if updating would create a new working copy. This should make transitioning much smoother without any meaningful reduction in safety.

Second, the lower-level `bin/repository update`, `refs`, `mirror`, etc., commands don't apply this same check. However, these commands are potentially just as dangerous. Use the same code to do a similar check there, making sure we only operate on repositories that are either expected to be on the current device, or which already exist here.

Test Plan:
  - Ran `bin/phd debug pull`, saw diagnostic information choose to update most repositories (including some non-cluster repositories) but properly skip non-cluster repositories that do not exist locally.
  - Ran `bin/repository update`, etc., saw the command apply consistent rules to the rules applied by `PullLocal` and refuse to update non-local repositories it would need to create.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10940

Differential Revision: https://secure.phabricator.com/D15902
2016-05-12 15:51:14 -07:00
epriestley
29d1115037 Swap Repository Edit UI to new code
Summary:
Ref T10748. This needs more extensive testing and is sure to have some rough edges, but seems to basically work so far.

Throwing this up so I can work through it more deliberately and make notes.

Test Plan:
- Ran migration.
- Used `bin/repository list` to list existing repositories.
- Used `bin/repository update <repository>` to update various repositories.
- Updated a migrated, hosted Git repository.
- Updated a migrated, observed Git repository.
- Converted an observed repository into a hosted repository by toggling the I/O mode of the URI.
- Conveted a hosted repository into an observed repository by toggling it back.
- Created and activated a new empty hosted Git repository.
- Created and activated an observed Git repository.
- Updated a mirrored repository.
- Cloned and pushed over HTTP.
- Tried to HTTP push a read-only repository.
- Cloned and pushed over SSH.
- Tried to SSH push a read-only repository.
- Updated several Mercurial repositories.
- Updated several Subversion repositories.
- Created and edited repositories via the API.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15842
2016-05-04 16:19:57 -07:00
epriestley
42eaa88f80 Cut mirroring over to new URIs
Summary:
Ref T10748. This migrates and swaps mirroring to `PhabricatorRepositoryURI`, obsoleting `PhabricatorRepositoryMirror`.

This prevents you from editing, adding or disabling mirrors unless you know a secret URI (until the UI cuts over fully), but existing mirroring is not affected.

Test Plan:
  - Added a mirroring URI to an old repository.
  - Verified it worked with `bin/repository mirror`.
  - Migrated forward.
  - Verified it still worked with `bin/repository mirror`.
  - Wow, mirroring: https://github.com/epriestley/locktopia-mirror

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15841
2016-05-04 16:16:16 -07:00
epriestley
34e870819c Remove "bin/repository edit" workflow
Summary:
Ref T10748. In D14250#158181, I accepted this conditional on removing it once Conduit could handle it.

Conduit can now handle it, or at least will be able to as soon as T10748 cuts over.

Test Plan: Grepped for `repository edit`.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10748

Differential Revision: https://secure.phabricator.com/D15839
2016-05-04 16:10:49 -07:00
epriestley
dd2b10b8f8 Guarantee repositories have unique local paths
Summary:
Ref T4039. Long ago these were more freely editable and there were some security concerns around creating a repository, then setting its local path to point somewhere it shouldn't.

Local paths are no longer editable so there's no real reason we need to provide a uniqueness guarantee anymore, but you could still make a mistake with `bin/repository move-paths` by accident, and it's a little cleaner to pull them out into their own column with a key.

(We still don't -- and, largely can't -- guarantee that two paths aren't //equivalent// since one might be symlinked to the other, or symlinked only on some hosts, or whatever, but the primary value here is as a sanity check that you aren't goofing things up and pointing a bunch of repositories at the same working copy by mistake.)

Test Plan:
  - Ran migrations.
  - Grepped for `local-path`.
  - Listed and moved paths with `bin/repository`.
  - Created a new repository, verified its local path populated correctly.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4039

Differential Revision: https://secure.phabricator.com/D15837
2016-05-04 16:09:52 -07:00
epriestley
dc3a13c5e8 Add bin/repository clusterize and document setup and migration for clusters
Summary: Ref T4292. This provides at least some sort of hint about how to set up cluster repositories.

Test Plan:
  - Read documentation.
  - Ran `bin/repository clusterize` to add + remove clusters.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15798
2016-04-26 10:07:17 -07:00
epriestley
550a82d438 Fix two minor formatting issues with bin/repository move-paths
Summary: This gets over-escaped instead of bolded right now, but I only ever hit it when exporting/importing and never both cleaning it up.

Test Plan: Ran `bin/repository move-paths`, saw bolded "Move" instead of ANSI escape sequences.

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15797
2016-04-25 12:29:15 -07:00
epriestley
711f13660e Synchronize working copies before doing a "bypassCache" commit read
Summary:
Ref T4292. When the daemons make a query for repository information, we need to make sure the working copy on disk is up to date before we serve the response, since we might not have the inforamtion we need to respond otherwise.

We do this automatically for almost all Diffusion methods, but this particular method is a little unusual and does not get this check for free. Add this check.

Test Plan:
  - Made this code throw.
  - Ran `bin/repository reparse --message ...`, saw the code get hit.
  - Ran `bin/repository lookup-user ...`, saw this code get hit.
  - Made this code not throw.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4292

Differential Revision: https://secure.phabricator.com/D15783
2016-04-22 08:11:43 -07:00
epriestley
bd4fb3c9fa Implement bin/repository thaw for unfreezing cluster repositories
Summary:
Ref T10751. Add support tooling for manually prying your way out of trouble if disaster strikes.

Refine documentation, try to refer to devices as "devices" more consistently instead of sometimes calling them "nodes".

Test Plan: Promoted and demoted repository devices with `bin/repository thaw`.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15768
2016-04-20 10:45:58 -07:00
epriestley
8d6488f290 Fix a typo in bin/repository help update
Summary: Fixes T10741. The workflow is `refs`, not `ref`.

Test Plan: o.O

Reviewers: chad, cspeckmim

Reviewed By: cspeckmim

Maniphest Tasks: T10741

Differential Revision: https://secure.phabricator.com/D15652
2016-04-07 05:39:37 -07:00
epriestley
601aaa5a86 Modularize content sources
Summary:
Ref T10537. For Nuance, I want to introduce new sources (like "GitHub" or "GitHub via Nuance" or something) but this needs to modularize eventually.

Split ContentSource apart so applications can add new content sources.

Test Plan:
This change has huge surface area, so I'll hold it until post-release. I think it's fairly safe (and if it does break anything, the breaks should be fatals, not anything subtle or difficult to fix), there's just no reason not to hold it for a few hours.

- Viewed new module page.
- Grepped for all removed functions/constants.
- Viewed some transactions.
- Hovered over timestamps to get content source details.
- Added a comment via Conduit.
- Added a comment via web.
- Ran `bin/storage upgrade --namespace XXXXX --no-quickstart -f` to re-run all historic migrations.
- Generated some objects with `bin/lipsum`.
- Ran a bulk job on some tasks.
- Ran unit tests.

{F1190182}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10537

Differential Revision: https://secure.phabricator.com/D15521
2016-03-26 11:59:45 -07:00
epriestley
edcc3232aa Remove calls to getCallsign() in bin/repository scripts
Summary:
Ref T4245. Prepare these scripts for a callsign-free world. This also makes them more flexible and easier to use.

The following are now valid ways to identify a repository for these scripts: ID (`3`), PHID (`PHID-REPO-wxyz`), R<ID> (`R3`), r<CALLSIGN> (`rSKYNET`), CALLSIGN (`SKYNET`).

In the future, a human-readable label (`skynet`) may also become valid.

Test Plan:
- Ran `bin/repository reparse --all ...` with `rX`, `X`, `3`, `R3`.
- Ran `bin/repository reparse --change ...` with `rXaaa`, including short versions.
- Ran `bin/repository update ...` with `rX`, `X`, `3`, `R3`.
- Ran `bin/repository refs ...` with various identifiers.
- Ran `bin/repository pull ...` with various identifiers.
- Ran `bin/repository mirror ...` with various identifiers.
- Ran `bin/repository mark-imported ...` with various identifiers.
- Ran `bin/repository list`.
- Ran `bin/repository importing ...` with various identifiers and examined output.
- Ran `bin/repository edit ...` with various identifiers.
- Ran `bin/repository discover ...` with various identifiers.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4245

Differential Revision: https://secure.phabricator.com/D14924
2016-01-02 04:23:26 -08:00
Chad Horohoe
e80970eba0 Allow editing hosting policies via command line
Summary:
Exposes the serve-over-http and serve-over-ssh options for a repository
to the `bin/repository edit` endpoint.

Test Plan: Ran `bin/repository` with the new options over several hundred repos

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin, chasemp, 20after4, epriestley

Differential Revision: https://secure.phabricator.com/D14250
2015-11-04 10:15:42 -08:00
Joshua Spence
c35b564f4d Various translation improvements
Summary: Depends on D14070.

Test Plan: Eyeball it.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D14073
2015-11-03 07:02:46 +11:00
epriestley
fc72b000f0 Add repository list-paths and repository move-paths
Summary: Ref T7149. These formalize the local path adjustment step for cluster imports, rather than relying on an ad-hoc script.

Test Plan: Used `list-paths` and `move-paths` to list and move repositories.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7149

Differential Revision: https://secure.phabricator.com/D13621
2015-07-16 14:11:33 -07:00
epriestley
04516d256b Add an "--importing" flag to bin/repository reparse
Summary: Fixes T6839. Sometimes, worker tasks go astray for whatever reason. This automates the step of `bin/repository importing | xargs | mangle mangle | bin/repostiory reparse`.

Test Plan: Ran various flavors of the command, got good looking results.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6839

Differential Revision: https://secure.phabricator.com/D13362
2015-06-20 05:25:44 -07:00
Joshua Spence
f47e69c015 Mark some strings for translation
Summary: Add some more `pht`izations.

Test Plan: Eyeball it.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: hach-que, Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D13200
2015-06-09 23:06:52 +10:00
Joshua Spence
1b12249b2c Fix some format strings
Summary: These format strings use `%d` instead of `%s`.

Test Plan: Eyeball it.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12996
2015-05-25 21:29:30 +10:00
Joshua Spence
36e2d02d6e phtize all the things
Summary: `pht`ize a whole bunch of strings in rP.

Test Plan: Intense eyeballing.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: hach-que, Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12797
2015-05-22 21:16:39 +10:00
epriestley
fad75f939d Improve messaging around repository locks
Summary:
Fixes T6958. Ref T7484.

  - When we collide on a lock in `bin/repository update`, explain what that means.
  - GlobalLock currently uses a "lock name" which is different from the lock's actual name. Don't do this. There's a small chance this fixes T7484, but I don't have high hopes.

Test Plan: Ran `bin/repository update X` in two windows really fast, got the new message in one of them.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6958, T7484

Differential Revision: https://secure.phabricator.com/D12574
2015-04-27 10:25:53 -07:00
epriestley
2b3d3cf7e4 Enforce that global locks have keys shorter than 64 characters
Summary:
Fixes T7484. There's a bunch of spooky mystery here but the current behavior can probably cause problems in at least some situations.

Also moves a couple callsigns to monograms (see T4245).

Test Plan:
  - Faked a short lock length to hit the exception.
  - Updated normally.
  - Grepped for other use sites, none seemed suspicious or likely to overflow the lock length.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7484

Differential Revision: https://secure.phabricator.com/D12263
2015-04-02 13:42:22 -07:00
epriestley
e9886c4353 Fix an issue where we would try to release an unheld lock
Summary: Fixes T7484. If the lock failed, we'd still try to unlock it, which is incorrect.

Test Plan: Ran two `bin/repository update X` in different windows, got proper LockException instead of indirect symptomatic "not locked by this process" exception.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7484

Differential Revision: https://secure.phabricator.com/D12253
2015-04-01 17:37:46 -07:00
epriestley
35c55f7ddf Improve visibility of repository credential errors
Summary:
Fixes T7310. We have a whole mechanism for surfacing update errors, but only surface actual update errors, not pull errors.

Instead, surface pull errors too.

Then format them a little more nicely.

Test Plan: {F309769}

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T7310

Differential Revision: https://secure.phabricator.com/D11821
2015-02-19 10:32:25 -08:00
Joshua Spence
aaf8d73ec7 Fix pht method calls
Summary: Ref T7046. This is mainly a proof-of-concept for D11661.

Test Plan: `arc lint`

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T7046

Differential Revision: https://secure.phabricator.com/D11680
2015-02-10 18:57:45 +11:00
Joshua Spence
daadf95537 Fix visibility of PhutilArgumentWorkflow::didConstruct methods
Summary: Ref T6822.

Test Plan: `grep`. This method is only called from within `PhutilArgumentWorkflow::__construct`.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11415
2015-01-16 07:42:07 +11:00
Bob Trahan
648fa2e1bc Repositories - Move scripts/repository/reparse.php to bin/repository reparse
Summary:
Fixes T5966. Accomplishes a few things

 - see title
 - adds a force-autoclose flag and the plumbing for it
 - removes references to some HarborMaster thing that used to key off commits and seems long dead, but forgotten :/

Test Plan:
ran a few commands. These first three had great success:

`./repository reparse --all FIRSTREPO --message --change  --herald --owners`
`./repository reparse --all FIRSTREPO --message --change  --herald --owners --min-date yesterday`
`./repository reparse --all FIRSTREPO --message --change  --herald --owners --min-date yesterday --force-autoclose`

...and these next two showed me some errors as expected:

`./repository reparse --all FIRSTREPO --message --change  --herald --owners --min-date garbagedata`
`./repository reparse --all GARBAGEREPO --message --change  --herald --owners`

Also, made a diff in a repository with autoclose disabled and commited the diff. Later, reparse the diff with force-autoclose. Verified the diff closed and that the reason "why" had the proper message text.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: joshuaspence, epriestley, Korvin

Maniphest Tasks: T5966

Differential Revision: https://secure.phabricator.com/D10492
2015-01-06 11:42:15 -08:00
epriestley
4f4dc9c83e Update PhabricatorRepositoryManagementLookupUsersWorkflow to use ConduitCall
Summary:
Ref T2783.

This updates PhabricatorRepositoryManagementLookupUsersWorkflow to use ConduitCall to retrieve information about the commit.

Test Plan:
Ran `bin/repository lookup-users rTESTe9683b64d3283f0b2d355fdbf231bc918b5ac0ab --trace` and saw the information returned (by making a request to `diffusion.querycommits` as the omnipotent user, signed with the device key).

Mucked with `cluster.addresses` and saw requests rejected.

Reviewers: hach-que, btrahan

Reviewed By: btrahan

Subscribers: Krenair, epriestley, Korvin

Maniphest Tasks: T2783

Differential Revision: https://secure.phabricator.com/D10403
2015-01-02 15:13:57 -08:00
epriestley
ec9eaabfbd Allow repo updates to recover after force push + garbage collection
Summary:
Fixes T5839. If a repository has been force pushed and garbage collected, we might have a ref cursor in the database which still points at the old commit (which no longer exists).

We'll then run a command like `git log <new hash> --not <old hash>` to figure out which commits are newly pushed, and this will bomb out because `<old hash>` is invalid.

Instead, validate all the `<old hash>` values before we try to make use of them.

Test Plan:
  - Forced a repository into a bad state by mucking with the datbase, generating a reproducible failure similar to the one in T5839.
  - Applied patch.
  - `bin/repository update <callsign> --trace` filtered the bad commit and put the repository into the right state.
  - Saw new commits recognized correctly.
  - Ran `bin/repository update <callsign>` for a Mercurial and SVN repo as a sanity check.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T5839

Differential Revision: https://secure.phabricator.com/D10226
2014-08-12 12:25:24 -07:00
Joshua Spence
0a62f13464 Change double quotes to single quotes.
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.

Test Plan: Eyeballed it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9431
2014-06-09 11:36:50 -07:00
Joshua Spence
1503840cd9 Batch up SQL operations in the ./bin/repository parents script.
Summary: Fixes T5255. Currently the `./bin/repository parents` workflow is quite slow. Batching up the SQL operations should make the workflow //seem// much faster.

Test Plan: Not yet tested.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin

Maniphest Tasks: T5255

Differential Revision: https://secure.phabricator.com/D9361
2014-06-03 11:27:57 -07:00
Joshua Spence
0d03bbe43c Allow repositories to be deleted using ./bin/remove.
Summary: Currently, repositories can be deleted using `./bin/repository delete`. It makes sense to expose this operate to the `./bin/remove` script as well, for consistency.

Test Plan: Deleted a repository with `./bin/remove rTEST`.

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: epriestley, Korvin

Differential Revision: https://secure.phabricator.com/D9350
2014-06-02 17:11:58 -07:00
Joshua Spence
2f668493a0 Don't attempt to discover parents commits for untracked branchs.
Summary: Fixes T5195. Currently, the `./bin/repository parents` workflow doesn't respect tracked branches and will attempt to build parents caches for all branches.

Test Plan: For at least one of our repositories, this patch fixes the `Unknown commit` exception. Unfortunately, it doesn't seem to completely solve this problem though, but I suspect that this is due to commits that were overwritten with a `git push --force` or similar.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin

Maniphest Tasks: T5195

Differential Revision: https://secure.phabricator.com/D9322
2014-05-29 12:02:37 -07:00
epriestley
e4ea092f60 Implement a chunked, APC-backed graph cache
Summary:
Ref T2683. This is a refinement and simplification of D5257. In particular:

  - D5257 only cached the commit chain, not path changes. This meant that we had to go issue an awkward query (which was slow on Facebook's install) periodically while reading the cache. This was reasonable locally but killed performance at FB scale. Instead, we can include path information in the cache. It is very rare that this is large except in Subversion, and we do not need to use this cache in Subversion. In other VCSes, the scale of this data is quite small (a handful of bytes per commit on average).
  - D5257 required a large, slow offline computation step. This relies on D9044 to populate parent data so we can build the cache online at will, and let it expire with normal LRU/LFU/whatever semantics. We need this parent data for other reasons anyway.
  - D5257 separated graph chunks per-repository. This change assumes we'll be able to pull stuff from APC most of the time and that the cost of switching chunks is not very large, so we can just build one chunk cache across all repositories. This allows the cache to be simpler.
  - D5257 needed an offline cache, and used a unique cache structure. Since this one can be built online it can mostly use normal cache code.
  - This also supports online appends to the cache.
  - Finally, this has a timeout to guarantee a ceiling on the worst case: the worst case is something like a query for a file that has never existed, in a repository which receives exactly 1 commit every time other repositories receive 4095 commits, on a cold cache. If we hit cases like this we can bail after warming the cache up a bit and fall back to asking the VCS for an answer.

This cache isn't perfect, but I believe it will give us substantial gains in the average case. It can often satisfy "average-looking" queries in 4-8ms, and pathological-ish queries in 20ms on my machine; `hg` usually can't even start up in less than 100ms. The major thing that's attractive about this approach is that it does not require anything external or complicated, and will "just work", even producing reasonble improvements for users without APC.

In followups, I'll modify queries to use this cache and see if it holds up in more realistic workloads.

Test Plan:
  - Used `bin/repository cache` to examine the behavior of this cache.
  - Did some profiling/testing from the web UI using `debug.php`.
  - This //appears// to provide a reasonable fast way to issue this query very quickly in the average case, without the various issues that plagued D5257.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley, jhurwitz

Maniphest Tasks: T2683

Differential Revision: https://secure.phabricator.com/D9045
2014-05-12 11:47:23 -07:00
epriestley
95eab2f3b0 Record parent relationships when discovering commits
Summary:
Ref T4455. This adds a `repository_parents` table which stores `<childCommitID, parentCommitID>` relationships.

For new commits, it is populated when commits are discovered.

For older commits, there's a `bin/repository parents` script to rebuild the data.

Right now, there's no UI suggestion that you should run the script. I haven't come up with a super clean way to do this, and this table will only improve performance for now, so it's not important that we get everyone to run the script right away. I'm just leaving it for the moment, and we can figure out how to tell admins to run it later.

The ultimate goal is to solve T2683, but solving T4455 gets us some stuff anyway (for example, we can serve `diffusion.commitparentsquery` faster out of this cache).

Test Plan:
  - Used `bin/repository discover` to discover new commits in Git, SVN and Mercurial repositories.
  - Used `bin/repository parents` to rebuild Git and Mercurial repositories (SVN repos just exit with a message).
  - Verified that the table appears to be sensible.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: jhurwitz, epriestley

Maniphest Tasks: T4455

Differential Revision: https://secure.phabricator.com/D9044
2014-05-12 11:47:22 -07:00
epriestley
118c696f72 Separate repository updates from the pull daemon
Summary:
Ref T4605. Currently, the PullLocal daemon is responsible for two relatively distinct things:

  - scheduling repository updates; and
  - actually updating repositories.

Move the "actually updating" part into a new `bin/repository update` command, which basically runs the pull, discover, refs and mirror commands. This will let the parent process focus on scheduling in a more understandable way and update multiple repositories at once. It also makes it easier to debug and understand update behavior since the non-scheduling pipeline can be run separately.

Test Plan:
  - Ran `update --trace` on SVN, Mercurial and Git repos.
  - Ran PullLocal daemon for a while without issues.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T4605

Differential Revision: https://secure.phabricator.com/D8780
2014-04-16 13:00:29 -07:00
epriestley
dd944f7d83 Separate repository mirroring into an Engine and provide bin/repository mirror
Summary:
Ref T4338. Currently, there's no diagnostic command to execute mirroring (so I can't give users an easy command to run), and it's roughly the last piece of real logic left in the PullLocal daemon.

Separate mirroring out, and provide `bin/repository mirror`.

Test Plan:
  - Ran `bin/repository mirror` to mirror a repository.
  - Ran PullLocalDaemon and verified it also continued mirroring normally.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T4338

Differential Revision: https://secure.phabricator.com/D8066
2014-01-25 14:01:58 -08:00
epriestley
be59578794 Fix bin/repository importing for CLOSEABLE flag
Summary: After adding the CLOSEABLE flag, `bin/repository importing` reports CLOSEABLE commits with no information. Just don't report these, for consistency with the old behavior. Adding flags to show them might be nice at some point if we run into issues where that output would be useful.

Test Plan: Ran `bin/repositroy importing` on a repository with CLOSEABLE commits.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Differential Revision: https://secure.phabricator.com/D8024
2014-01-21 14:09:51 -08:00
epriestley
8520d9e070 Move more discovery responsibilities into DiscoveryEngine
Summary:
Ref T4327. This moves the last pieces of discovery responsibility out of the PullLocal daemon and into the DiscoveryEngine.

(This makes it easier to discover repositories in unit tests in the future, since we don't need to build a PullLocal daemon and can just build a DiscoveryEngine.)

Test Plan: Ran `phd debug pulllocal`. Ran `repostory discover`.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T4327

Differential Revision: https://secure.phabricator.com/D7997
2014-01-17 16:09:24 -08:00
epriestley
f4b9efe256 Introduce ref cursors for repository parsing
Summary:
Ref T4327. I want to make change parsing testable; one thing which is blocking this is that the Git discovery process is still part of `PullLocal` daemon instead of being part of `DiscoveryEngine`. The unit test stuff which I want to use for change parsing relies on `DiscoveryEngine` to discover repositories during unit tests.

The major reason git discovery isn't part of `DiscoveryEngine` is that it relies on the messy "autoclose" logic, which we never implemented for Mercurial. Generally, I don't like how autoclose was implemented: it's complicated and gross and too hard to figure out and extend.

Instead, I want to do something more similar to what we do for pushes, which is cleaner overall. Basically this means remembering the old branch heads from the last time we parsed a repository, and figuring out what's new by comparing the old and new branch heads. This should give us several advantages:

  - It should be simpler to understand than the autoclose stuff, which is pretty mind-numbing, at least for me.
  - It will let us satisfy branch and tag queries cheaply (from the database) instead of having to go to the repository. We could also satisfy some ref-resolve queries from the database.
  - It should be easier to extend to Mercurial.

This implements the basics -- pretty much a table to store the cursors, which we update only for Git for now.

Test Plan:
  - Ran migration.
  - Ran `bin/repository discover X --trace --verbose` on various repositories with branches and tags, before and after modifying pushes.
  - Pushed commits to a git repo.
  - Looked at database tables.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T4327

Differential Revision: https://secure.phabricator.com/D7982
2014-01-17 11:48:53 -08:00
epriestley
e397103bf2 Extend all "ManagementWorkflow" classes from a base class
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.

Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.

Test Plan: Lint; ran some of the scripts.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2015

Differential Revision: https://secure.phabricator.com/D7838
2013-12-27 13:15:40 -08:00
epriestley
d667b12206 Provide a standalone query for resolution of commit author/committer into Phabricator users
Summary:
Ref T4195. To implement the "Author" and "Committer" rules, I need to resolve author/committer strings into Phabricator users.

The code to do this is currently buried in the daemons. Extract it into a standalone query.

I also added `bin/repository lookup-users <commit>` to test this query, both to improve confidence I'm getting this right and to provide a diagnostic command for users, since there's occasionally some confusion over how author/committer strings resolve into valid users.

Test Plan:
I tested this using `bin/repository lookup-users` and `reparse.php --message` on Git, Mercurial and SVN commits. Here's the `lookup-users` output:

  >>> orbital ~/devtools/phabricator $ ./bin/repository lookup-users rINIS3
  Examining commit rINIS3...
  Raw author string: epriestley
  Phabricator user: epriestley (Evan Priestley   )
  Raw committer string: null
  >>> orbital ~/devtools/phabricator $ ./bin/repository lookup-users rPOEMS165b6c54f487c8
  Examining commit rPOEMS165b6c54f487...
  Raw author string: epriestley <git@epriestley.com>
  Phabricator user: epriestley (Evan Priestley   )
  Raw committer string: epriestley <git@epriestley.com>
  Phabricator user: epriestley (Evan Priestley   )
  >>> orbital ~/devtools/phabricator $ ./bin/repository lookup-users rINIH6d24c1aee7741e
  Examining commit rINIH6d24c1aee774...
  Raw author string: epriestley <hg@yghe.net>
  Phabricator user: epriestley (Evan Priestley   )
  Raw committer string: null
  >>> orbital ~/devtools/phabricator $

The `reparse.php` output was similar, and all VCSes resolved authors correctly.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T1731, T4195

Differential Revision: https://secure.phabricator.com/D7801
2013-12-19 11:05:17 -08:00
epriestley
f5ca647d2c Add bin/repository edit for CLI repository editing
Summary:
Ref T4039. This is mostly to deal with that, to prevent the security issues associated with mutable local paths. The next diff will lock them in the web UI.

I also added a confirmation prompt to `bin/repository delete`, which was a little scary without one.

See one comment inline about the `--as` flag. I don't love this, but when I started adding all the stuff we'd need to let this transaction show up as "Administrator" it quickly got pretty big.

Test Plan: Ran `bin/repository edit ...`, saw an edit with a transaction show up on the web UI.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T4039

Differential Revision: https://secure.phabricator.com/D7579
2013-11-13 11:26:05 -08:00
epriestley
bd29784a32 Add an administrative bin/repository importing command to list importing commits
Summary: Ref T4068. Adds a command to list all commits in an "importing" status. This will allow users to use `reparse.php` to diagnose and repair issues.

Test Plan:
  - Ran `bin/repository importing P`, etc.
  - Used `reparse.php` to reparse some commit stages and saw status update correctly.
  - Ran on a repo with no importing commits.
  - Ran with `... --simple | xargs`, which saves us having to put an `awk` or something in there for users.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T4068

Differential Revision: https://secure.phabricator.com/D7515
2013-11-06 11:26:41 -08:00
epriestley
e3a5ab1f8c Add an administrative bin/repository mark-imported command
Summary:
Ref T4068. In some cases like that one, I anticipate a repository not fully importing when a handful of random commits are broken. In the long run we should just deal with that properly, but in the meantime provide an administrative escape hatch so you can mark the repository as imported and get it running normally.

The major reason to do this is that Herald, Feed, Harbormaster, etc., won't activate until a repository is "imported".

Test Plan:
  - Tried to mark an imported repository as imported, got an "already imported" message.
  - Same for not-imported.
  - Marked a repository not-imported.
  - Marked a repository imported.
  - Marked a repository not-imported, then waited for the daemons to mark it imported again automatically.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran, kbrownlees

Maniphest Tasks: T4068

Differential Revision: https://secure.phabricator.com/D7514
2013-11-06 11:26:24 -08:00
epriestley
79abe6653e Remove PhabricatorRepository::loadAllByPHIDOrCallsign()
Summary: Ref T603. Move to real Query classes.

Test Plan:
  - Ran `phd debug pull X` (where `X` does not match a repository).
  - Ran `phd debug pull Y` (where `Y` does match a repository).
  - Ran `phd debug pull`.
  - Ran `repository pull`.
  - Ran `repository pull X`.
  - Ran `repository pull Y`.
  - Ran `repository discover`.
  - Ran `repository delete`.
  - Ran `grep`.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T603

Differential Revision: https://secure.phabricator.com/D7137
2013-09-26 12:36:24 -07:00
epriestley
c467cc464f Make most repository reads policy-aware
Summary: Ref T603. This swaps almost all queries against the repository table over to be policy aware.

Test Plan:
  - Made an audit comment on a commit.
  - Ran `save_lint.php`.
  - Looked up a commit with `diffusion.getcommits`.
  - Looked up lint messages with `diffusion.getlintmessages`.
  - Clicked an external/submodule in Diffusion.
  - Viewed main lint and repository lint in Diffusion.
  - Completed and validated Owners paths in Owners.
  - Executed dry runs via Herald.
  - Queried for package owners with `owners.query`.
  - Viewed Owners package.
  - Edited Owners package.
  - Viewed Owners package list.
  - Executed `repository.query`.
  - Viewed "Repository" tool repository list.
  - Edited Arcanist project.
  - Hit "Delete" on repository (this just tells you to use the CLI).
  - Created a repository.
  - Edited a repository.
  - Ran `bin/repository list`.
  - Ran `bin/search index rGTESTff45d13dffcfb3ea85b03aac8cc36251cacdf01c`
  - Pushed and parsed a commit.
  - Skipped all the Drydock stuff, as it it's hard to test and isn't normally reachable.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T603

Differential Revision: https://secure.phabricator.com/D7132
2013-09-25 16:54:48 -07:00
epriestley
e7fde9a77c Make repository discovery partially testable
Summary:
Ref T2784. Begins pulling discovery into Engines and covering it with tests. In particular:

  - Discovery is currently a one-shot process where we find all the new commits and write them to the database in one go. Split it apart so we find and return the new commits first, then write them to the database separately. This makes things simpler and more testable.
  - This diff only brings SVN into an engine (and only the "find the commits" part), since it's simpler than Git or Mercurial.
  - Creates a base Engine class and moves common functionality there.
  - Restores the `--verbose` flag to `repository pull`.

Test Plan: Added unit tests. Ran `bin/repository discover`. Ran `bin/phd debug pulllocal`.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T2784

Differential Revision: https://secure.phabricator.com/D5906
2013-05-12 19:08:37 -07:00
epriestley
105dd1899c Make repository pulls testable
Summary:
Ref T2784. This moves us toward being able to test the background and Conduit pipelines for repositories. In particular:

  - Separate the logic for pulling repositories (`git pull`, `hg pull`) out of `PhabricatorRepositoryPullLocalDaemon` and put it in `PhabricatorRepositoryPullEngine`. This allows repositories to be pulled directly without invoking the daemons.
  - Add tests for the engine, including a future-looking base test case.
  - Add basic `PhutilDirectoryFixture`-based repositories.

Next steps:

  # Do the same for repo discovery.
  # Then we can start writing tests against specific Conduit methods.

Test Plan: Ran unit tests. Ran `bin/repository pull` on SVN, Hg and Git repositories. Ran `bin/phd debug pulllocal`.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran, nh

Maniphest Tasks: T2784

Differential Revision: https://secure.phabricator.com/D5904
2013-05-12 19:05:52 -07:00