1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-11-24 15:52:41 +01:00
Commit graph

1053 commits

Author SHA1 Message Date
epriestley
a0d4b6da4b Support (but do not actually enable) a maximum file size limit for Git repositories
Summary:
Depends on D19816. Ref T13216. See PHI908. See PHI750. In a few cases, users have pushed multi-gigabyte files full of various things that probably shouldn't be version controlled. This tends to create various headaches.

Add support for limiting the maximum size of any object. Specifically, we:

  - list all the objects each commit touches;
  - check their size after the commit applies;
  - if it's over the limit, reject the commit.

This change doesn't actually hook the limit up (the limit is always "0", i.e. unlimited), and doesn't have Mercurial or SVN support. The actual parser bit would probably be better in some other `Query/Parser` class eventually, too. But it at least roughly works.

Test Plan:
Changed the hard-coded limit to other values, tried to push stuff, got sensible results:

```
$ echo pew >> magic_missile.txt && git commit -am pew && git push
[master 98d07af] pew
 1 file changed, 1 insertion(+)
# Push received by "local.phacility.net", forwarding to cluster host.
# Acquiring write lock for repository "spellbook"...
# Acquired write lock immediately.
# Acquiring read lock for repository "spellbook" on device "local.phacility.net"...
# Acquired read lock immediately.
# Device "local.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local.phacility.net".
Counting objects: 49, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (48/48), done.
Writing objects: 100% (49/49), 3.44 KiB | 1.72 MiB/s, done.
Total 49 (delta 30), reused 0 (delta 0)
remote: +---------------------------------------------------------------+
remote: |      * * * PUSH REJECTED BY EVIL DRAGON BUREAUCRATS * * *     |
remote: +---------------------------------------------------------------+
remote:              \
remote:               \                    ^    /^
remote:                \                  / \  // \
remote:                 \   |\___/|      /   \//  .\
remote:                  \  /V  V  \__  /    //  | \ \           *----*
remote:                    /     /  \/_/    //   |  \  \          \   |
remote:                    @___@`    \/_   //    |   \   \         \/\ \
remote:                   0/0/|       \/_ //     |    \    \         \  \
remote:               0/0/0/0/|        \///      |     \     \       |  |
remote:            0/0/0/0/0/_|_ /   (  //       |      \     _\     |  /
remote:         0/0/0/0/0/0/`/,_ _ _/  ) ; -.    |    _ _\.-~       /   /
remote:                     ,-}        _      *-.|.-~-.           .~    ~
remote:   *     \__/         `/\      /                 ~-. _ .-~      /
remote:    \____(Oo)            *.   }            {                   /
remote:    (    (..)           .----~-.\        \-`                 .~
remote:    //___\\  \ DENIED!  ///.----..<        \             _ -~
remote:   //     \\                ///-._ _ _ _ _ _ _{^ - - - - ~
remote:
remote:
remote: OVERSIZED FILE
remote: This repository ("spellbook") is configured with a maximum individual file size limit, but you are pushing a change ("98d07af863e799509e7c3a639404d216f9fc79c7") which causes the size of a file ("magic_missile.txt") to exceed the limit. The commit makes the file 317 bytes long, but the limit for this repository is 1 bytes.
remote:
# Released cluster write lock.
To ssh://local.phacility.com/source/spellbook.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'ssh://epriestley@local.phacility.com/source/spellbook.git'
```

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: joshuaspence

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19817
2018-11-20 08:04:17 -08:00
epriestley
cb033673b6 Unify intracluster sync and Drydock working copy construction timeouts as a repository "copy time limit"
Summary:
Depends on D19814. Ref T13216. See PHI885. For various eldritch reasons, `git fetch` can hang. Although we'd probably like to fix this with `git fetch --require-sustained-network-transfer-rate=512KB/5s` or similar, that flag doesn't exist and we don't have a reasonable way to build it.

Short of that, move toward formalizing a repository "copy time limit": the longest amount of time anything may spend trying to make a copy of this repository.

This grows out of the existing intracluster sync limit, which is effectively the same thing. Here, apply it to `git clone` and `git fetch` in Drydock working copy construction, too. A future change may make it configurable.

Test Plan:
  - Set the limit to 0.001.
  - Tried to build and lease working copies, got sensible timeout errors (see D19815).

```
<Activation Failed> Lease activation failed: [CommandException] Command killed by timeout after running for more than 0.001 seconds.
COMMAND
ssh '-o' 'LogLevel=quiet' '-o' 'StrictHostKeyChecking=no' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'BatchMode=yes' -l '********' -p '2222' -i '********' '127.0.0.1' -- '(cd '\''/var/drydock/workingcopy-163/repo/spellbook/'\'' && git clean -d --force && git fetch && git reset --hard)'
```

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19816
2018-11-16 13:08:12 -08:00
epriestley
86fd204148 Fix all query warnings in "arc unit --everything"
Summary:
Ref T13216. Ref T13217. Depends on D19800. This fixes all of the remaining query warnings that pop up when you run "arc unit --everything".

There's likely still quite a bit of stuff lurking around, but hopefully this covers a big set of the most common queries.

Test Plan: Ran `arc unit --everything`. Before change: lots of query warnings. After change: no query warnings.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13217, T13216

Differential Revision: https://secure.phabricator.com/D19801
2018-11-15 03:51:25 -08:00
epriestley
64b52b9952 Make SELECT construction in PolicyAwareQuery safer
Summary: Depends on D19784. Ref T13217. Reduce uses of unsafe `%Q` in SELECT construction.

Test Plan: This reduces the number of safety warnings when loading Phabricator home from ~900 to ~800.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13217

Differential Revision: https://secure.phabricator.com/D19785
2018-11-14 15:32:09 -08:00
epriestley
da40f80741 Update PhabricatorLiskDAO::chunkSQL() for new %Q semantics
Summary:
Ref T13217. This method is slightly tricky:

  - We can't safely return a string: return an array instead.
  - It no longer makes sense to accept glue. All callers use `', '` as glue anyway, so hard-code that.

Then convert all callsites.

Test Plan: Browsed around, saw fewer "unsafe" errors in error log.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13217

Differential Revision: https://secure.phabricator.com/D19784
2018-11-13 08:59:18 -08:00
epriestley
315d857a8a Add a basic web UI for intracluster sync logs
Summary: Depends on D19798. Ref T13216. This puts at least a basic UI on top of sync logs.

Test Plan:
Viewed logs from the web UI and exported data. Note that these syncs are somewhat simulated since I my local cluster is somewhat-faked (i.e., not actually multiple machines).

{F5995899}

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19799
2018-11-10 04:58:36 -08:00
epriestley
1d7c960531 Put push log "hookWait" to data export and add all wait values to UI
Summary:
Depends on D19797. Ref T13216.

  - Put the new `hookWait` in the export and the UI.
  - Put the existing waits in the UI, not just the export.
  - Make order consistent: host, write, read, hook (this is the order the timers start in).

Test Plan: Pushed some stuff, viewed web UI and saw sensible numbers, exported data and got the same values.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19798
2018-11-10 04:47:38 -08:00
epriestley
2a7ac8e388 Make "bin/repository thaw" workflow more clear when devices are disabled
Summary:
Ref T13216. See PHI943. If autoscale lightning strikes all your servers at once and destroys them, the path to recovery can be unclear. You're "supposed" to:

  - demote all the devices;
  - disable the bindings;
  - bind the new servers;
  - put whatever working copies you can scrape up back on disk;
  - promote one of the new servers.

However, the documentation is a bit misleading (it was sort of written with "you lost one or two devices" in mind, not "you lost every device") and demote-before-disable is unnecessary and slightly risky if servers come back online. There's also a missing guardrail before the promote step which lets you accidentally skip the demotion step and end up in a confusing state. Instead:

  - Add a guard rail: when you try to promote a new server, warn if inactive devices still have versions and tell the user to demote them.
  - Allow demotion of inactive devices: the order "disable, demote" is safer and more intuitive than "demote, disable" and there's no reason to require the unintuitive order.
  - Make the "cluster already has leaders" message more clear.
  - Make the documentation more clear.

Test Plan:
  - Bound a repository to two devices.
  - Wrote to A to make it a leader, then disabled it (simulating a lightning strike).
  - Tried to promote B. Got a new, useful error ("demote A first").
  - Demoted A (before: error about demoting inactive devices; now: works fine).
  - Promoted B. This worked.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19793
2018-11-10 04:46:46 -08:00
epriestley
b12e92e6e2 Add timing information for commit hooks to push logs
Summary:
Depends on D19779. Ref T13216. The push logs currently record the "hostWait", which is roughly "locking + subprocess cost". We also record locking separately, so we can figure out "subprocess cost" alone by subtracting the lock costs.

However, the subprocess (normally `git receive-pack`) runs hooks, and we don't have an easy way to figure out how much time was spent doing actual `git` stuff vs spent doing commit hook processing. This would have been useful in diagnosing at least one recent issue.

Track at least a rough hook cost and record it in the push logs.

Test Plan: Pushed to a repository, saw a reasonable hook cost appear in the database table.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19780
2018-11-08 06:00:26 -08:00
epriestley
966db4d38e Add an intracluster synchronization log for cluster repositories
Summary:
Depends on D19778. Ref T13216. See PHI943, PHI889, et al.

We currently have a push log and a pull log, but do not separately log intracluster synchronization events. We've encountered several specific cases where having this kind of log would be helpful:

  - In PHI943, an install was accidentally aborting locks early. Having timing information in the sync log would let us identify this more quickly.
  - In PHI889, an install hit an issue with `MaxStartups` configuration in `sshd`. A log would let us identify when this is an issue.
  - In PHI889, I floated a "push the linux kernel + fetch timeout" theory. A sync log would let us see sync/fetch timeouts to confirm this being a problem in practice.
  - A sync log will help us assess, develop, test, and monitor intracluster routing sync changes (likely those in T13211) in the future.

Some of these events are present in the pull log already, but only if they make it as far as running a `git upload-pack` subprocess (not the case with `MaxStartups` problems) -- and they can't record end-to-end timing.

No UI yet, I'll add that in a future change.

Test Plan:
  - Forced all operations to synchronize by adding `|| true` to the version check.
  - Pulled, got a sync log in the database.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19779
2018-11-07 18:24:20 -08:00
epriestley
51073b972e Try to route cluster writes to nodes which won't need to synchronize first
Summary:
Ref T13109. Ref T13202. See PHI905. See PHI889. When we receive a write to a repository cluster, we currently send it to a random writable node.

Instead, we can prefer:

  - the node currently holding the write lock; or
  - any node which is already up to date.

These should simply be better nodes to take writes in all cases. The write lock is global for the repository, so there's no scaling benefit to spreading writes across different nodes, and these particular nodes will be able to accept the write more quickly.

Test Plan:
  - This is observable by using `fprintf(STDERR, "%s\n", ...)` in the logic, then running `git push`. I'd like to pull this routing logic out of `PhabricatorRepository` at some point, probably into a dedicated `ClusterURIQuery` sort of class, but that is a larger change.
  - Added some `fprintf(...)` stuff around which nodes were being selected.
  - Added a `sleep(10)` after grabbing the write lock.
  - In one window, pushed. Then pushed in a second window.
    - Saw the second window select the lock holder as the write target based on it currently holding the lock.
    - Without a concurrent push, saw pushes select up-to-date nodes based on their up-to-date-ness.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: joshuaspence, timhirsh

Maniphest Tasks: T13202, T13109

Differential Revision: https://secure.phabricator.com/D19734
2018-10-17 08:08:25 -07:00
Austin McKinley
8065433ee8 Migrate DiffusionBlameController to use repo identities
Summary:
Now on the blame page, identities get `avatar.png` and there are little tooltips that show a few characters of the committer identity string.

Also add a default icon for repo identities.

Test Plan: Loaded some blame pages for files touched by users with and without repo identities attached.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19587
2018-09-26 14:45:58 -07:00
epriestley
7db265cd5d Parameterize the repository read and write locks
Summary:
Ref T13202. See PHI889. Update the read and write locks to the modern parameterized verison, which handles hashing/normalization and can store better logs.

This parameterized mode was added in D19173 and has been used successfully for some time, but not all locks have switched over to it yet.

Test Plan:
- Added an `fprintf(STDERR, $full_name)` to the lock code.
- Pulled a repository.
- Saw sensible lock name on stdout before "acquired read lock...".
- Additional changes in this patch series will vet this more completely.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13202

Differential Revision: https://secure.phabricator.com/D19701
2018-09-24 15:14:28 -07:00
epriestley
40d5d5c984 Remove "mailKey" from "PhabricatorRepositoryCommit"
Summary: Ref T13197. Ref T13065. This continues the gradual purge of dedicated "mailKey" columns in favor of shared infrastructure.

Test Plan:
  - Ran migration.
  - Visually inspected database.
  - Grepped for `mailKey`.
  - Added some comments, saw the daemons generate corresponding mail.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13197, T13065

Differential Revision: https://secure.phabricator.com/D19670
2018-09-15 07:54:15 -07:00
epriestley
8268abcb78 Enrich "diffusion.commit.search" with identity, status, and message information
Summary:
Depends on D19657. Ref T13197. See PHI841.

This enriches the results from `diffusion.commit.search` with information similar to the information returned by the "commits" attachment from `differential.diff.search`.

Also include unreachable, imported, message, audit status, and repository PHID.

Test Plan: Called `diffusion.commit.search` and reviewed the results, which looked sensible.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13197

Differential Revision: https://secure.phabricator.com/D19658
2018-09-12 12:45:44 -07:00
epriestley
c7e7b63f15 Rename "PhabricatorAuditCommitStatusConstants" to "DiffusionCommitAuditStatus"; remove "MODERN_"
Summary:
Depends on D19656. Ref T13197. See PHI851.

  - This class is now a real object, so get rid of the "Constants" part of the name.
  - Rename it for greater consistency with other modern objects.
  - Get rid of the `MODERN_` tag now that the old constants are gone.

Test Plan: Bunch of `grep`, browsed around.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13197

Differential Revision: https://secure.phabricator.com/D19657
2018-09-12 12:44:43 -07:00
epriestley
d63281cc54 Migrate remaining Audit database status constants
Summary: Depends on D19652. Ref T13197. See PHI851. This migrates the actual `auditStatus` on Commits, and older status transactions.

Test Plan:
  - Ran migrations.
  - Spot-checked the database for sanity.
  - Ran some different queries, got unchanged results from before migration.
  - Reviewed historic audit state transactions, and accepted/raised concern on new audits. All state transactions appeared to generate properly.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13197

Differential Revision: https://secure.phabricator.com/D19655
2018-09-12 12:21:27 -07:00
epriestley
853a816b3c Continue converting Audit constants, allowing the Query to handle either strings or integers
Summary: Ref T13197. We're almost ready to migrate: let the Query accept either older integer values or new string values. Then move some callsites to use strings.

Test Plan: Called `audit.query`, browsed audits, audited commits.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13197

Differential Revision: https://secure.phabricator.com/D19650
2018-09-10 14:46:47 -07:00
epriestley
bae8a95114 Continue replacing Commit/Audit status checks with object-based checks
Summary: Ref T13195. See PHI851. Continuing down the path toward replacing these legacy numeric constants with more modern string constants.

Test Plan:
- Raised concern, requested verification, verified.
- Looked at commit hovercard with audit status.
- Viewed header on a commit page.
- (Didn't test the Doorkeeper stuff since it requires linking to Asana and seems unlikely to break.)

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13195

Differential Revision: https://secure.phabricator.com/D19647
2018-09-10 11:20:31 -07:00
epriestley
a1ce23b9f5 Introduce an AuditStatus object for commits and move some callsites to it
Summary:
Ref T13195. See PHI851. Add an object, analogous to the `DifferentialRevisionStatus` object, to handle audit status management.

This will primarily make it easier to swap storage over to strings later, but also cleans things up a bit.

Test Plan: Viewed audit/commit lists, saw sensible state icons. Ran `bin/audit synchronize`, got sensible output.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13195

Differential Revision: https://secure.phabricator.com/D19646
2018-09-07 10:20:04 -07:00
epriestley
632cafec88 Pass commit authorship information to Buildkite
Summary:
Fixes T12251. Ref T13189. See PHI610. The difficulty here is that we don't want to disclose Phabricator account information to Buildkite. We're comfortable disclosing information from `git`, etc.

  - For commits, use the Identity to provide authorship information from Git.
  - For revisions, use the local commit information on the Diff to provide the Git/Mercurial/etc author of the HEAD commit.

Test Plan:
  - Built commits and revisions in Buildkite via Harbormaster.
  - I can't actually figure out how to see author information on the Buildkite side, but the values look sane when dumped locally.

Reviewers: amckinley

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13189, T12251

Differential Revision: https://secure.phabricator.com/D19614
2018-08-27 12:52:11 -07:00
Austin McKinley
5c4c593af3 Update DiffusionLastModifiedController to use identities
Summary: Ref T12164. Updates another controller to use identities.

Test Plan:
Pretty ad-hoc, but loaded the main pages of several different repos with and without repo identities. I'm not totally convinced the `author` from this data structure is actually being used:
```
$return = array(
  'commit'    => $modified,
  'date'      => $date,
  'author'    => $author,
  'details'   => $details,
);
```

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Maniphest Tasks: T12164

Differential Revision: https://secure.phabricator.com/D19580
2018-08-17 12:24:21 -07:00
epriestley
24d4445845 Remove pointless requireCapabilities() method from PhabricatorRepositoryEditor
Summary: Depends on D19582. Ref T13164. It's not possible to reach the editor without passing through a CAN_EDIT check, and it shouldn't be necessarily to manually specify that edits require CAN_EDIT by default.

Test Plan: Grepped for `RepositoryEditor`, verified that all callsites pass through a CAN_EDIT check.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13164

Differential Revision: https://secure.phabricator.com/D19583
2018-08-16 10:53:42 -07:00
Austin McKinley
cc1def6cea Remove some array typehints for passing around
Summary: See discussion at https://secure.phabricator.com/D19492#241996

Test Plan: Refreshed a few Diffusion tabs; nothing broke.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19578
2018-08-13 16:07:56 -07:00
Austin McKinley
3b05e920e0 Start changing DiffusionCommitController to use identities
Summary: Depends on D19491.

Test Plan: Viewed some commits where the identity was mapped to a user and another that wasn't; saw the header render either a link to the user or the identity object.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19492
2018-08-13 15:23:31 -07:00
Austin McKinley
2951694c27 Correctly spell 'committer'
Summary: It's a funny word. h/t @joshuaspence

Test Plan: Inspection of correct spelling.

Reviewers: epriestley, joshuaspence

Reviewed By: joshuaspence

Subscribers: Korvin, joshuaspence

Differential Revision: https://secure.phabricator.com/D19570
2018-08-09 17:52:43 -07:00
epriestley
06380e8079 Allow push events to be filtered by which Herald rule blocked the push
Summary: Depends on D19555. Ref T13164. See PHI765. An install is interested in getting a sense of the impact of a particular blocking rule, which seems reasonable. Support filtering for pushes blocked by a particular rule or set of rules.

Test Plan: {F5776385}

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13164

Differential Revision: https://secure.phabricator.com/D19556
2018-08-01 17:38:12 -07:00
epriestley
d8834377be When a Herald rule blocks a push, show which rule fired in the push log UI
Summary:
Ref T13164. See PHI765. We currently show "Rejected: Herald" in the push log UI, but don't show which rule rejected a push.

We store this data, and it's potentially useful: either for hunting down a particular issue, or for getting a general sense of how often a reject rule is triggering (maybe because you want to tune how aggressive it is).

Show this data in the web UI, and include it in the data export payload.

Test Plan:
  - Pushed to a hosted repository so that I got blocked by a Herald rule.
  - Viewed the push logs in the web UI, now saw which rule triggered things.
  - Exported logs to CSV, saw Herald rule PHIDs in the data.

{F5776211}

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13164

Differential Revision: https://secure.phabricator.com/D19555
2018-08-01 17:33:50 -07:00
Austin McKinley
9db5ad3476 Allow null identities to be attached to commit objects
Summary: I landed D19491 a little aggressively, so allow this field to be null until after the migration goes out.

Test Plan: Loaded commits without identity objects; did not get any errors.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19496
2018-06-20 08:35:36 -07:00
Austin McKinley
05f333dfba Attach identities to commits and users to identities
Summary: Ref T12164. Make it easier to work with identity objects by attaching them to commits and attaching users to identities.

Test Plan: Loaded some commits with `->needIdentities(true)` and checked the resulting objects.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Maniphest Tasks: T12164

Differential Revision: https://secure.phabricator.com/D19491
2018-06-18 15:31:41 -07:00
Austin McKinley
787c59744b Correctly attach users to identities
Summary: This never worked.

Test Plan: Ran `bin/repository rebuild-identities` and viewed identity objects with `currentEffectiveUserID`s and no longer got errors about attempting to attach `null` objects instead of `PhabricatorUser` objects.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19495
2018-06-18 15:21:11 -07:00
epriestley
1459fb3037 Make re-running rebuild-identities a bit faster and add a little progress information
Summary:
Ref T13151. Ref T12164. Two small tweaks:

  - If we aren't actually going to change anything, just skip the writes. This makes re-running/resuming a lot faster (~20x, locally).
  - Print when we touch a commit so there's some kind of visible status.

This is just a small quality-of-life tweak that I wrote anyway while investigating T13152, and will make finishing off db024, db025 and db010 manually a little easier.

Test Plan:
  - Set `authorIdentityPHID` + `committerIdentityPHID` to `NULL`.
  - Ran `rebuild-identities`, saw status information.
  - Ran `rebuild-identiites` again, saw it go faster with status information.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13151, T12164

Differential Revision: https://secure.phabricator.com/D19484
2018-06-12 13:18:54 -07:00
Austin McKinley
b8b2d1672d Prevent creation of empty repository identities
Summary: Fixes issue reported in https://secure.phabricator.com/rPf191a66490b194785fae28c062b71be99bb14584#43240

Test Plan: Imported an SVN repo, observed clean import instead of daemon exception.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin

Differential Revision: https://secure.phabricator.com/D19466
2018-06-05 16:13:59 -07:00
Austin McKinley
2f6784ee1c Add workflow to create repository identities
Summary:
Depends on D19443. Creates a workflow for populating the new identity table by iterating over commits, either one repo at a time or all at once. Locally caches identities to avoid fetching them `inf` times. An actual migration that invokes this workflow will come in another revision that won't land until at least next week.

Performance is ~2k commits in 4.9s on my local machine.

Test Plan: Ran locally a few times with a few different states of the `repository_identity` table.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: jcox, Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam

Differential Revision: https://secure.phabricator.com/D19446
2018-05-31 07:29:57 -07:00
Austin McKinley
fe5fde5910 Assign RepositoryIdentity objects to commits
Summary: Depends on D19429. Depends on D19423. Ref T12164. This creates new columns `authorIdentityPHID` and `committerIdentityPHID` on commit objects and starts populating them. Also adds the ability to explicitly set an Identity's assignee to "unassigned()" to null out an incorrect auto-assign. Adds more search functionality to identities. Also creates a daemon task for handling users adding new email address and attempts to associate unclaimed identities.

Test Plan: Imported some repos, watched new columns get populated. Added a new email address for a previous commit, saw daemon job run and assign the identity to the new user. Searched for identities in various and sundry ways.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T12164

Differential Revision: https://secure.phabricator.com/D19443
2018-05-31 07:28:23 -07:00
Austin McKinley
f191a66490 Add controllers/search/edit engine functionality to RepositoryIdentity
Summary: Depends on D19423. Ref T12164. Adds controllers capable of listing and editing `PhabricatorRepositoryIdentity` objects. Starts creating those objects when commits are parsed.

Test Plan: Reparsed some revisions, observed objects getting created in the database. Altered some `Identity` objects using the controllers and observed effects in the database. No attempts made to validate behavior under "challenging" author/committer strings.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T12164

Differential Revision: https://secure.phabricator.com/D19429
2018-05-31 07:03:25 -07:00
Austin McKinley
cd84e53c44 Begin building out RepositoryIdentity indirection layer
Summary: Ref T12164. Start building initial objects for managing `RepositoryIdentity` objects. This won't land until much more of the infrastructure is in place.

Test Plan: Ran `bin/storage upgrade` and observed expected table.

Reviewers: epriestley

Reviewed By: epriestley

Subscribers: Korvin, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T12164

Differential Revision: https://secure.phabricator.com/D19423
2018-05-31 07:01:16 -07:00
Aviv Eyal
7281300446 Allow number in generated clone uri
Summary:
See https://discourse.phabricator-community.org/t/numerical-characters-are-stripped-from-diffusion-git-repository-name-in-the-uri/

Digits are often considered reasonable characters.

Test Plan: Looked at an ascii table.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, Sam2304, epriestley

Differential Revision: https://secure.phabricator.com/D19447
2018-05-11 16:18:06 +00:00
epriestley
843bfb4fd8 Add a "commits" attachment to "differential.diff.search" for retrieving local commit information
Summary:
Ref T13124. See PHI593.

When you `arc diff` in a Git or Mercurial repository, we upload some information about the local commits in your working copy which the change was generated from.

In the future (for example, with T1508) we may increase the prominence of this feature.

Provide a stable way to read this information back via the API. This roughly mirrors the information we provide about commits in "diffusion.commit.search", although the latter is less fleshed-out today.

Test Plan: Used `differential.diff.search` to retrieve commit information about Git, Mercurial, and Subversion diffs. (There's no info for Subversion, but it doesn't crash or anything.)

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13124

Differential Revision: https://secure.phabricator.com/D19386
2018-04-19 17:25:06 -07:00
epriestley
6556536d06 Allow repository cluster bindings to be marked as not "writable", making them read-only
Summary:
Depends on D19356. Fixes T10883. Ref T13120.

  - Add a "writable" property to the bindings, defaulting to "true" with a nice dropdown.
  - When selecting hosts, allow callers to request a writable host.
  - If the caller wants a writable host, only return hosts if they're writable.
  - In SVN and Mercurial, we sometimes return only writable hosts when we //could// return read-only hosts, but figuring out if these request are read-only or read-write is currently tricky. Since these repositories can't really cluster yet, this shouldn't matter too much today.

Test Plan:
  - Without any config changes, viewed repositories via web UI and pushed/pulled via SSH and HTTP.
  - Made all nodes in the cluster read-only by disabling "writable", pulled and hit the web UI (worked), tried to push via SSH and HTTP (got errors about read-only).
  - Put everything back, pulled and pushed.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13120, T10883

Differential Revision: https://secure.phabricator.com/D19357
2018-04-12 16:10:36 -07:00
epriestley
7c7e6d555b Give getAlmanacServiceURI() an "options" parameter to prepare for read-only devices
Summary:
Depends on D19355. Ref T10883. Ref T13120. Rather than adding a million parameters here, wrap the selector-parameters in an `$options`.

The next change adds a new "writable" option to support forcing selection of writable hosts.

Test Plan: Pulled and pushed via HTTP and SSH, viewed repositories via Diffusion.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13120, T10883

Differential Revision: https://secure.phabricator.com/D19356
2018-04-12 16:10:12 -07:00
epriestley
09c6d42b95 Mostly make blame work with DocumentEngine
Summary: Ref T13105. This needs refinement but blame sort of works again, now.

Test Plan: Viewed files in Diffusion and Files; saw blame in Diffusion when viewing in source mode.

Reviewers: mydeveloperday

Reviewed By: mydeveloperday

Maniphest Tasks: T13105

Differential Revision: https://secure.phabricator.com/D19309
2018-04-09 04:48:21 -07:00
epriestley
1fde4a9450 Move Diffusion browse rendering to DocumentEngine, breaking almost all features
Summary:
Ref T13105. This breaks about 9,000 features but moves Diffusion to DocumentEngine for rendering. See T13105 for a more complete list of all the broken stuff.

But you can't bake a software without breaking all the features every time you make a change, right?

Test Plan: Viewed various files in Diffusion, used DocumentEngine features like highlighting and rendering engine selection.

Reviewers: mydeveloperday

Reviewed By: mydeveloperday

Subscribers: mydeveloperday

Maniphest Tasks: T13105

Differential Revision: https://secure.phabricator.com/D19302
2018-04-09 04:46:26 -07:00
epriestley
51461f18c1 When publishing buildables in Differential, ignore autobuilds (local lint and unit)
Summary:
Depends on D19280. Ref T13110. Although Harbormaster cares about all builds, Differential does not practically care about local lint and unit results in determining build status.

In Differential, orient publishing around "remote builds" instead of "builds".

This does not yet change any of the draft logic, it just makes the timeline story use newer logic.

Test Plan: Used `bin/harbormaster publish` (with some guard-clause removal) to publish some buildables to revisions without anything crashing.

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13110

Differential Revision: https://secure.phabricator.com/D19281
2018-04-03 11:02:12 -07:00
epriestley
95c9d403f4 Make objects implementing BuildableInterface produce a BuildableEngine
Summary:
Ref T13110. Currently, build status is published the same way for every Buildable by the BuildEngine.

I want to change this to delegate publishing to each Buildable, particularly so that Differential may use more detailed rules for handling builds and drafts.

Rather than add additional methods to the existing `BuildableInterface`, add an engine generator method instead. This is a pattern which has seen more use recently (e.g., in Ferret) and lets us pay a little more upfront to pull complex pieces of logic out of the main class and let them use inheritence more easily. If we had Traits that might cover this to some degree.

I'd expect to eventually reduce the size of `BuildableInterface` and move the `CircleCI` and `BuildKite` interfaces so that the `BuildableEngine` implements them instead of the main object.

Here, this new engine does nothing and is never instantiated. In upcoming changes, publishing logic will move into it so that Differential can handle publishing differently.

Test Plan: Ran `arc liberate`, loaded pages, grepped for `BuildableInterface`.

Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13110

Differential Revision: https://secure.phabricator.com/D19278
2018-04-03 10:57:51 -07:00
epriestley
66392e5b8b Add a rough "bin/repository unpublish" workflow to attempt to cleanup improperly published repositories
Summary:
Ref T13114. See PHI514. This makes some attempt to undo the damage caused by incorrectly publishing a repository.

Don't run this.

Test Plan: Yikes.

Maniphest Tasks: T13114

Differential Revision: https://secure.phabricator.com/D19271
2018-03-30 08:46:11 -07:00
epriestley
f583406ba9 Drop uniqueness constraint on PushEvent request ID
Summary: See <https://discourse.phabricator-community.org/t/pushing-to-mercurial-repository-fails/1275/1>. Mercurial may invoke hooks multiple times per push.

Test Plan: Pushed to Mercurial, saw key constraint failure.

Differential Revision: https://secure.phabricator.com/D19257
2018-03-26 07:02:15 -07:00
epriestley
df3c937dab Record lock timing information on PushEvents
Summary:
Depends on D19249. Ref T13109. Add timing information to the `PushEvent`:

  - `writeWait`: Time spent waiting for a write lock.
  - `readWait`: Time spent waiting for a read lock.
  - `hostWait`: Roughly, total time spent on the leaf node.

The primary goal here is to see if `readWait` is meaningful in the wild. If it is, that motivates smarter routing, and the value of smarter routing can be demonstrated by looking for a reduction in read wait times.

Test Plan: Pushed some stuff, saw reasonable timing values in the table. Saw timing information in "Export Data".

Maniphest Tasks: T13109

Differential Revision: https://secure.phabricator.com/D19250
2018-03-22 13:46:01 -07:00
epriestley
69bff489d4 Generate a random unique "Request ID" for SSH requests so processes can coordinate better
Summary:
Depends on D19247. Ref T13109. When we receive an SSH request, generate a random unique ID for the request. Then thread it down through the process tree.

The immediate goal is to let the `ssh-exec` process coordinate with `commit-hook` process and log information about read and write lock wait times. Today, there's no way for `ssh-exec` to interact with the `PushEvent`, but this is the most helpful place to store this data for users.

Test Plan: Made pushes, saw the `PushEvent` table populate with a random request ID. Exported data and saw the ID preserved in the export.

Maniphest Tasks: T13109

Differential Revision: https://secure.phabricator.com/D19249
2018-03-22 13:44:30 -07:00
epriestley
859b274970 Provide more information to users during git push while waiting for write locks
Summary:
Ref T13109. Make it slightly more clear what the scope of the write and read locks are, and slightly more clear that we're actively acquiring locks, not just sitting around waiting.

While waiting on another writer, show who we're waiting on so you can walk over to their desk and glare at them.

Test Plan:
Added `sleep(15)` after `willWrite()`. Pushed in two windows. Saw new, more informative messages. In the second window, saw the new guidance:

> # Waiting for hector to finish writing (on device "repo1.local.phacility.net" for 11s)...

Reviewers: asherkin

Reviewed By: asherkin

Subscribers: asherkin

Maniphest Tasks: T13109

Differential Revision: https://secure.phabricator.com/D19247
2018-03-22 13:42:18 -07:00