Summary:
Fixes T9276. Fixes T8650. The story so far:
- We once published build updates to Revisions.
- An unrelated fix (D10911) sent them to the Diffs instead of Revisions, which isn't useful, since you can't see a diff's timeline anywhere.
- This also caused a race condition, where the RevisionEditor and DiffEditor would update the diff simultaneously (T8650).
- The diff update was just disabled to avoid the race (part of D13441).
- Instead, allow the updates to go somewhere else. In this case, we send commit updates to the commit but send diff updates to the revision so you can see 'em.
- Since everything will be using the revision editor now, we should either get proper lock behavior for free or it should be easy to add if something whack is still happening.
- Overall, this should pretty much put us back in working order like we were before D10911.
This behavior is undoubtedly refinable, but this should let us move forward.
Test Plan:
Saw a build failure in timeline:
{F2304575}
Reviewers: chad
Reviewed By: chad
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T9276, T8650
Differential Revision: https://secure.phabricator.com/D17139
Summary:
Fixes T12062. Like the commits from the year 3500, you can artificially build commits with no date information.
We could explicitly store these as `null` to fully respect the underlying datastore. However, I think it's very unlikely that these commits are intentional/meaningful or that this is valuable.
Additionally, "git show" interprets these commits as "Jan 1, 1970". Just store a `0` to mimic its behavior.
Test Plan:
- Following the process in T11537#192019, artificially created a commit with //no// date information (I deleted all date information from the message).
- Used `git show` / `git log --format ...` to inspect it: "Jan 1, 1970" on `git show`, no information at all on `%aD`, `%aT`, etc.
- Pushed it.
- Saw exception for trying to insert empty string into epoch colum from `bin/repository update`.
- Applied patch.
- Got a clean import.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12062
Differential Revision: https://secure.phabricator.com/D17136
Summary:
Fetches cause output in `/var/tmp/phd/log/daemons.log` as
follows:
```
PHLOG: 'Unexpected output while updating repository "rREPONAME": No entry for terminal type "unknown";
using dumb terminal settings.
' at [/path/to/phabricator/src/applications/repository/daemon/PhabricatorRepositoryPullLocalDaemon.php:455]
```
These warnings come from PHP itself. Silence these warnings by providing a
known value for `TERM` before shelling out to the PHP script.
See also D9744 (reverted in D11644) and T4990/T7119, which are a similar issue,
but in the pre-receive hooks, not the pull daemons.
Test Plan:
Enabled in production, observed errors to be silenced and
no SSH hangs
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D17100
Summary:
Ref T11954. This is kind of complex and I'm not sure I want to actually land it, but it gives us a fairly good improvement for clustered repositories so I'm leaning toward moving forward.
When we make (or receive) clustered repository requests, we must first load a bunch of stuff out of Almanac to figure out where to send the request (or if we can handle the request ourselves).
This involves several round trip queries into Almanac (service, device, interfaces, bindings, properties) and generally is fairly slow/expensive. The actual data we get out of it is just a list of URIs.
Caching this would be very easy, except that invalidating the cache is difficult, since editing any binding, property, interface, or device may invalidate the cache for indirectly connected services and repositories.
To address this, introduce `PhabricatorCacheEngine`, which is an extensible engine like `PhabricatorDestructionEngine` for propagating cache updates. It has two modes:
- Discover linked objects (that is: find related objects which may need to have caches invalidated).
- Invalidate caches (that is: nuke any caches which need to be nuked).
Both modes are extensible, so third-party code can build repository-dependent caches or whatever. This may be overkill but even if Almanac is the only thing we use it for it feels like a fairly clean solution to the problem.
With `CacheEngine`, make any edit to Almanac stuff propagate up to the Service, and then from the Service to any linked Repositories.
Once we hit repositories, invalidate their caches when Almanac changes.
Test Plan:
- Observed a 20-30ms performance improvement with `ab -n 100`.
- (The main page making Conduit calls also gets a performance improvement, although that's a little trickier to measure directly.)
- Added debugging code to the cache engine stuff to observe the linking and invalidation phases.
- Made invalidation throw; verified that editing properties, bindings, etc, properly invalidates the cache of any indirectly linked repositories.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11954
Differential Revision: https://secure.phabricator.com/D17000
Summary: Ref T929. We've made some UI updates since D15330.
Test Plan: {F2079125}
Reviewers: avivey, chad
Reviewed By: chad
Maniphest Tasks: T929
Differential Revision: https://secure.phabricator.com/D16990
Summary:
Fixes T11940. In 2.11.0, Git has made a change so that newly-pushed changes are held in a temporary area until the hook accepts or rejects them.
This magic temporary area is only readable if the appropriate `GIT_ENVIRONMENTAL_MAGIC` variables are available. When executing `git` commands, pass them through from the calling context.
We're intentionally conservative about which variables we pass, and with good reason (see "httpoxy" in T11359). I think this continues to be the correct default behavior.
Test Plan:
- Upgraded to Git 2.11.0.
- Tried to push over SSH, got a hook error.
- Applied patch.
- Pulled and pushed over SSH.
- Pulled and pushed over HTTP.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11940
Differential Revision: https://secure.phabricator.com/D16988
Summary:
Fixes T11902.
- Periods now work in short names.
- If you try to name something ".git", no dice.
Test Plan:
- Tried to name something "quack.git", was politely rejected.
- Named something "quack.notgit", and it worked fine.
- Cloned Mercurial and Git repositories over SSH with ".git" and non-".git" variants without hitting any issues.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11902
Differential Revision: https://secure.phabricator.com/D16908
Summary: Fixes T11866. This got converted wrong when doing the `/source/` stuff.
Test Plan: Browsed the root directory of a Subversion repository in Diffusion.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11866
Differential Revision: https://secure.phabricator.com/D16860
Summary: Fixes T4245. When a repository has a short name, use `/source/shortname/` as its primary URI.
Test Plan:
- Cloned Git repositories from shortnames via HTTP and SSH.
- Cloned Mercurial repositories from shortnames via HTTP and SSH.
- Cloned Subversion repositories from shortnames via SSH.
- Browsed Git, Mercurial and Subversion repositories.
- Added and removed short names to various repositories.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4245
Differential Revision: https://secure.phabricator.com/D16851
Summary:
This has been replaced by `PolicyCodex` after D16830. Also:
- Rebuild Celerity map to fix grumpy unit test.
- Fix one issue on the policy exception workflow to accommodate the new code.
Test Plan:
- `arc unit --everything`
- Viewed policy explanations.
- Viewed policy errors.
Reviewers: chad
Reviewed By: chad
Subscribers: hach-que, PHID-OPKG-gm6ozazyms6q6i22gyam
Differential Revision: https://secure.phabricator.com/D16831
Summary:
Fixes T11705. I did not realize that `ON DUPLICATE KEY UPDATE` was order-dependent, so the "reset" clause of this `IF(...)` never actually worked.
Reorder it so we check if we're changing the message type //first//, then actually change the message type.
This makes the count reset properly when a failing repository succeeds, or a working repository fails.
Test Plan:
- On `master`, forced a working repository to fail a `bin/repository update`, saw the message change types (expected) but keep the old count (wrong!).
- With this patch, repeated the process and saw the count reset properly.
- Ran the patch, verified counts reset to 0.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11705
Differential Revision: https://secure.phabricator.com/D16623
Summary:
Fixes T11677. This makes two minor adjustments to the repository import daemons:
- The first step ("Message") now queues at a slightly-lower-than-default (for already-imported repositories) or very-low (for newly importing repositories) priority level.
- The other steps now queue at "default" priority level. This is actually what they already did, but without this change their behavior would be to inherit the priority level of their parents.
This has two effects:
- When adding new repositories to an existing install, they shouldn't block other things from happening anymore.
- The daemons will tend to start one commit and run through all of its steps before starting another commit. This makes progress through the queue more even and predictable.
- Before, they did ALL the message tasks, then ALL the change tasks, etc. This works fine but is confusing/uneven/less-predictable because each type of task takes a different amount of time.
Test Plan:
- Added a new repository.
- Saw all of its "message" steps queue at priority 4000.
- Saw followups queue at priority 2000.
- Saw progress generally "finish what you started" -- go through the queue one commit at a time, instead of one type of task at a time.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11677
Differential Revision: https://secure.phabricator.com/D16585
Summary:
Ref T11665. Currently, when a repository hits an error, we retry it after 15s. This is correct if the error was temporary/transient/config-related (e.g., bad network or administrator setting up credentials) but not so great if the error is long-lasting (completely bad authentication, invalid URI, etc), as it can pile up to a meaningful amount of unnecessary load over time.
Instead, record how many times in a row we've hit an error and adjust backoff behavior: first error is 15s, then 30s, 45s, etc.
Additionally, when computing the backoff for an empty repository, use the repository creation time as though it was the most recent commit. This is a good proxy which gives us reasonable backoff behavior.
This required removing the `CODE_WORKING` messages, since they would have reset the error count. We could restore them (as a different type of message), but I think they aren't particularly useful since cloning usually doesn't take too long and there's more status information avilable now than there was when this stuff was written.
Test Plan:
- Ran `bin/phd debug pull`.
- Saw sensible, increasing backoffs selected for repositories with errors.
- Saw sensible backoffs selected for empty repositories.
Reviewers: chad
Maniphest Tasks: T11665
Differential Revision: https://secure.phabricator.com/D16575
Summary:
Ref T11665. Fixes T7865. When we restart the daemons, the repository pull daemon currently resets the cooldowns on all of its pulls. This can generate a burst of initial load when restarting a lot of instance daemons (as in the Phacility cluster), described in T7865. This smooths things out so that recent pulls are considered, and any repositories which were waiting keep waiting.
Somewhat counterintuitively, hosted repositories write `TYPE_FETCH` status messages, so this should work equally well for hosted and observed repositories.
This also paves the way for better backoff behavior on repository errors, described in T11665. The error backoff now uses the same logic that the standard backoff does. The next change will make backoff computation consider recent errors.
(This is technically too large for repositories which have encountered one error and have a low commit rate, but I'll fix that in the following change; this is just a checkpoint on the way there.)
Test Plan: Ran `bin/phd debug pull`, saw the daemon compute reasonable windows based on previous pull activity.
Reviewers: chad
Maniphest Tasks: T7865, T11665
Differential Revision: https://secure.phabricator.com/D16574
Summary:
Ref T11650. Currently, we load packages and then discard the archived ones.
However, this gets "dominion" rules (where a more-general package gives up ownership if a more-specific package exists) wrong if the more-specific package is archived: we incorrectly give up ownership.
Instead, just ignore these packages completely when loading affected packages. This is slightly simpler.
(There are technically two pieces of code we have to do this for, which should be a single piece of code but which haven't yet been unified.)
Test Plan:
- Created packages:
- Package A, on "/" (strong dominion, autoreview).
- Package B, on "/x/" (weak dominion, autoreview).
- Package C, on "/x/y" (archived, autoreview).
- Create a revision affecting "/x/y".
- Saw correct path ownership in table of contents ("B", strongest package only).
- Saw correct autoreview behavior (A + B).
- (Prior to patch, in `master`, reproduced the problem behaviors described in T11650, with bad dominion rules and failure to autoreview B.)
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11650
Differential Revision: https://secure.phabricator.com/D16564
Summary: Ref T11559. This makes managing large numbers of repositories slightly easier.
Test Plan: {F1796119}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11559
Differential Revision: https://secure.phabricator.com/D16472
Summary:
Fixes T10423. Ref T11524. This changes `diffusion.rawdiffquery` to return a file PHID instead of a blob of data.
This is better in general, but particularly better for huge diffs (as in T10423) and diffs with non-utf8 data (as in T10423).
Test Plan:
- Used `bin/differential extract` to extract a latin1 diff, got a clean diff.
- Used `bin/repository reparse --herald` to rerun herald on a latin1 diff, got a clean result.
- Pushed latin1 diffs to test commit hooks.
- Triggered the the too large / too slow logic.
- Viewed latin1 diffs in Diffusion.
- Used "blame past this change" in Diffusion to hit the `before` logic.
Reviewers: chad
Reviewed By: chad
Subscribers: eadler
Maniphest Tasks: T10423, T11524
Differential Revision: https://secure.phabricator.com/D16460
Summary:
Fixes T11537. See that task for discussion.
Although we could accommodate these faithfully, it requires a huge migration and affects one repository on one install which was written with buggy tools.
At least for now, just replace out-of-32-bit-range epoch values with the current time, which is often somewhat close to the real value.
Test Plan:
- Following the instructions in T11537, created commits in 40,000 AD.
- Tried to import them, reproducing the "epoch" database issue.
- Applied the patch.
- Successfully imported future-commits, with some liberties around commit dates. Note that author date (not stored in an `epoch` column) is still shown faithfully:
{F1789302}
Reviewers: chad, avivey
Reviewed By: avivey
Maniphest Tasks: T11537
Differential Revision: https://secure.phabricator.com/D16456
Summary:
Ref T11522. This tries to reduce the cost of rewriting a repository by making handles smarter about rewritten commits.
When a handle references an unreachable commit, try to load a rewrite hint for the commit. If we find one, change the handle name to "OldHash > NewHash" to provide a strong hint that the commit was rewritten and that copy/pasting the old hash (say, to the CLI) won't work.
I think this notation isn't totally self-evident, but users can click it to see the big error message on the page, and it's at least obvious that something weird is going on, which I think is the important part.
Some possible future work:
- Not sure this ("Recycling Symbol") is the best symbol? Seems sort of reasonable but mabye there's a better one.
- Putting this information directly on the hovercard could help explain what this means.
Test Plan: {F1780719}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11522
Differential Revision: https://secure.phabricator.com/D16437
Summary:
Ref T11522. When a commit is no longer reachable from any branch/tag, we currently show a "this has been deleted" message.
Instead, go further: check if there is a "rewritten" hint pointing at a commit the current commit was rewritten into. If we find one, show a message about that instead.
(This isn't super pretty, just getting it working for now. I expect to revisit this UI in T9713 if we don't get to it before that.)
Test Plan: {F1780703}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11522
Differential Revision: https://secure.phabricator.com/D16436
Summary: Ref T11522. This migrates any "badcommit" data (which probably only exists at Facebook and on 1-2 other installs in the wild) to the new "hint" table.
Test Plan:
- Wrote some bad commit annotations to the badcommit table.
- Viewed them in the web UI and used `bin/repository reparse --change ...` to reparse them. Saw "this is bad" messages.
- Ran migration, verified that valid "badcommit" rows were successfully migrated to become "hint" rows.
- Viewed the new web UI and re-parsed the change, saw "unreadable commit" messages.
- Viewed a good commit; reparsed a good commit.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11522
Differential Revision: https://secure.phabricator.com/D16435
Summary:
Ref T11522. This provides storage for tracking rewritten commits (new feature) and unreadable commits (existing feature, but really hacky).
This doesn't do anything yet, just adds a table and a CLI tool for updating it. I'll document the tool once it works. You just pipe in some JSON, but I need to document the format.
Test Plan:
- Piped JSON for "none", "rewritten" and "unreadable" hints into `bin/repository hint`.
- Examined the database to see that the table was written properly.
- Tried to pipe bad JSON in, invalid hint types, etc. Got reasonable human-readable error messages.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11522
Differential Revision: https://secure.phabricator.com/D16434
Summary:
Ref T7148. The automated export process runs this via daemon, which can't answer "Y" to this prompt. Let it "--force" instead.
(Some of my test instances didn't have any repositories, which is why I didn't catch this sooner.)
Test Plan: Ran `bin/repository move-paths --force ...`, saw change applied without a prompt.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T7148
Differential Revision: https://secure.phabricator.com/D16426
Summary:
Fixes T11453. Currently, commit message summaries are limited to 80 bytes. This may only be 20-40 characters for CJK languages or langauges with Cyrillic script.
Increase storage size to 255, then truncate to the shorter of 255 bytes or 80 glyphs. This preserves the same behavior for latin languages, but is less tight for Russian, etc.
Some minor additional changes:
- Provide a way to ask "how much data fits in this column?" so we don't have to duplicate column lengths across summary checks or UI errors like "title too long".
- Remove the `text80` datatype, since no other columns use it and we have no use cases (or likely use cases) for it.
Test Plan:
- Made a commit with a Cyrillic title, saw reasonable summarization in UI:
{F1757522}
- Added and ran unit tests.
- Grepped for removed `SUMMARY_MAX_LENGTH` constant.
- Grepped for removed `text80` data type.
Reviewers: avivey, chad
Reviewed By: avivey
Subscribers: avivey
Maniphest Tasks: T11453
Differential Revision: https://secure.phabricator.com/D16385
Summary:
Fixes T11309. When checking if a repository was fully imported, we incorrectly allow unreachable, un-imported commits to prevent the repository from moving to "Imported".
This can happen if you delete branches from a repository while it is importing.
Instead, ignore unreachable commits when checking for remaining imports, and when reporting status via `bin/repository importing`.
Test Plan:
- Stopped daemons.
- Created a new repository and activated it.
- Ran `bin/repository update Rxx`.
- Deleted a branch in the repository.
- Ran `bin/repository update Rxx`.
- Ran daemons to flush queue.
Now:
- Ran `bin/repository importing`. Old behavior: showed unreachable commits as importing. New behavior: does not show unreachable commits.
- Ran `bin/repository update`. Old behavior: failed to move repository to "imported" status. New behavior: correctly moves repository to "imported" status.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11309
Differential Revision: https://secure.phabricator.com/D16269
Summary:
Ref T11309. In that task, a user misunderstood two parts of this error:
- They took "exception" to mean "unexpected failure", when it was intended to mean "rare circumstance".
- They intereted the internal ID number of a commit to mean that Phabricator was malfunctioning.
Make the language of this condition more direct, explaining what the situation means in greater detail.
Additionally, we would previously re-throw this exception, which would make the daemon exit, wait a moment, and restart. This was normal and expected.
When //unexpected// failures occur, it's important do to this: it prevents a daemon failing in a loop from causing too many side effects (e.g., limit of 1 email per 5 seconds instead of thousands per second).
When expected, permanent failures occur, we do not need to do this: the task will not be retried. I just did it because it was slightly more consistent ("failures restart daemons") and we had few permanent failure types at the time.
We have more now, and restarting the daemons generates some additional logs which have the potential to confuse. Cycling the daemon also (intentionally) reduces the rate at which we process tasks, which can be bad for permanent failures like "deleted commit" because users can delete a huge number of commits and possibly clog up the queue with cycle-after-failure actions.
Test Plan:
Tried to process a deleted commit, saw a new message:
```
2016-07-11 9:30:22 AM [STDE] <VERB> PhabricatorTaskmasterDaemon Task 1428658 was cancelled: Commit "R55:6c46b7d0fb82a859ca3f87a95dc8dcceef8088c9" (with internal ID "282161") is no longer reachable from any branch, tag, or ref in this repository, so it will not be imported. This usually means that the branch the commit was on was deleted or overwritten.
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11309
Differential Revision: https://secure.phabricator.com/D16268
Summary: See Z2352#28072. Expose this flag to allow callers to take actions after an import finishes, which is generally reasonable.
Test Plan: Ran query from console, saw `isImporting` flag in results.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D16247
Summary:
Fixes T11269. The basic issue is that `git log` in an empty repository exits with an error message.
Prior to recent Git (2.6?), this message reads:
> fatal: bad default revision 'HEAD'
This message was somewhat recently changed by <ce11360467>. After that, it reads:
> fatal: your current branch 'master' does not have any commits yet
This change isn't //technically// a //complete// fix because you could still hit this issue like this:
- Create an empty repository.
- Push some stuff to `master`.
- Delete `master`.
However, this is very rare and even in this case the repository will fix itself once you push something again. We can try to fix that if any users ever actually hit it.
Test Plan:
- Created a new empty Git repository.
- Ran `bin/repository update Rxx`.
- Before patch: "git log" error because of the empty repository.
- After patch: clean update.
- Also ran `repository update` on a non-empty repository.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11269
Differential Revision: https://secure.phabricator.com/D16234
Summary:
Ref T4788. Fixes T9232. This moves the "search for stuff to attach to this object" flow away from hard-coding and legacy constants and toward something more modular and flexible.
It also adds an "Edit Commits..." action to Maniphest, resolving T9232. The behavior of the search for commits isn't great right now, but it will improve once these use real ApplicationSearch.
Test Plan: Edited a tasks' related commits, mocks, tasks, etc.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4788, T9232
Differential Revision: https://secure.phabricator.com/D16189
Summary:
Ref T11208. See that task for a more detailed description of revprops.
This allows revprop changes in a hosted Subversion repository if the repository has the "allow dangerous changes" flag set.
In the future, we could expand this into real Herald support, but the only use case we have for now is letting `svnsync` work.
Test Plan:
Edited revprops with `svn propset --revprop -r 2 propkey propvalue repositoryuri`:
- Tried before patch, got a "configure a commit hook" error.
- Tried after patch, got a "dangerous change" error.
- Allowed dangerous changes.
- Did a revprop edit.
- Prevented dangerous changes.
- Got an error again.
- Made a normal commit to an SVN repository.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11208
Differential Revision: https://secure.phabricator.com/D16174
Summary:
Fixes T11180. In Git, it's possible to tag a tag (????). When you do, we try to log the tag-object, which automatically resolves to the commit and fails.
Just skip these. If "A" points at "B" which points at "C", it's fine to ignore "A" and "B" since we'll get the same stuff when we process "C".
Test Plan:
- Tagged a tag.
- Pushed it.
- Discovered it.
- Before patch: got exception similar to the one in T11180.
- After patch: got tag-tag skipped. Also got slightly better error messages.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11180
Differential Revision: https://secure.phabricator.com/D16149
Summary:
Ref T9028. Mostly, this gives them a strikethru style.
(I think this is probably the right definition of "closed" for commits. Another definition might be "audited", but I don't think completing audits really "closes" a commit.)
Test Plan: {F1689662}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9028
Differential Revision: https://secure.phabricator.com/D16135
Summary: Ref T9028. Ref T6878. This rule should probably be refined in the long term, but for now just ignore "phabricator/diff/12424" and similar staging area tags.
Test Plan: Ran `bin/repository discover --verbose` on a repository with staging area refs, saw Phabricator ignore those refs as untracked.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T6878, T9028
Differential Revision: https://secure.phabricator.com/D16134
Summary:
Ref T9028. This allows us to detect when commits are unreachable:
- When a ref (tag, branch, etc) is moved or deleted, store the old thing it pointed at in a list.
- After discovery, go through the list and check if all the stuff on it is still reachable.
- If something isn't, try to follow its ancestors back until we find something that is reachable.
- Then, mark everything we found as unreachable.
- Finally, rebuild the repository summary table to correct the commit count.
Test Plan:
- Deleted a ref, ran `pull` + `refs`, saw oldref in database.
- Ran `discover`, saw it process the oldref, mark the unreachable commit, and update the summary table.
- Visited commit page, saw it properly marked.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9028
Differential Revision: https://secure.phabricator.com/D16133
Summary:
Ref T9028. This corrects the reachability of existing commits in a repository.
In particular, it can be used to mark deleted commits as unreachable.
Test Plan:
- Ran it on a bad repository, with bad args, etc.
- Ran it on a clean repo, got no changes.
- Marked a reachable commit as unreachable, ran script, got it marked reachable.
- Started deleting tags and branches from the local working copy while running the script, saw greater parts of the repository get marked unreachable.
- Pulled repository again, everything automatically revived.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9028
Differential Revision: https://secure.phabricator.com/D16132
Summary:
Ref T9028. This improves the daemon behavior for unreachable commits. There is still no way for commits to become marked unreachable on their own.
- When a daemon encounters an unreachable commit, fail permanently.
- When we revive a commit, queue new daemons to process it (since some of the daemons might have failed permanently the first time around).
- Before doing a step on a commit, check if the step has already been done and skip it if it has. This can't happen normally, but will soon be possible if a commit is repeatedly deleted and revived very quickly.
- Steps queued with `bin/repository reparse ...` still execute normally.
Test Plan:
- Used `bin/repository reparse` to run every step, verified they all mark the commit with the proper flag.
- Faked the `reparse` exception in the "skip step" code, used `repository reparse` to skip every step.
- Marked a commit as unreachable, ran `discover`, saw daemons queue for it.
- Ran daemons with `bin/worker execute --id ...`, saw them all skip + queue the next step.
- Marked a commit as unreachable, ran `bin/repository reparse` on it, got permanent failures immediately for each step.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9028
Differential Revision: https://secure.phabricator.com/D16131
Summary:
Ref T9028. This is the easy part of dealing with deleted commits:
- Add a flag for unreachable commits (nothing sets this flag yet).
- Ignore unreachable commits when querying for known commits during discovery, so we pretend they do not exist.
- When recording a commit, try just reviving an existing unreachable commit first. If that works, bail out.
Test Plan:
- Artificially marked a commit as unreachable with raw SQL.
- Verified it said "deleted: unreachable" in the UI.
- Ran `repository discover --trace --verbose`.
- Saw the discovery process ignore the commit when filling the cache.
- Saw the discovery process revive the commit instead of trying to record it again.
- Web UI now shows the commit as normal.
- Running `repository discover` again doesn't make any further changes.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9028
Differential Revision: https://secure.phabricator.com/D16130
Summary:
Ref T9028. Fixes T6878. Currently, we only fetch and discover branches. This is fine 99% of the time but sometimes commits are pushed to just a tag, e.g.:
```
git checkout <some hash>
nano file.c
git commit -am '...'
git tag wild-wild-west
git push origin wild-wild-west
```
Through a similar process, commits can also be pushed to some arbitrary named ref (we do this for staging areas).
With the current rules, we don't fetch tag refs and won't discover these commits.
Change the rules so:
- we fetch all refs; and
- we discover ancestors of all refs.
Autoclose rules for tags and arbitrary refs are just hard-coded for now. We might make these more flexible in the future, or we might do forks instead, or maybe we'll have to do both.
Test Plan:
Pushed a commit to a tag ONLY (`vegetable1`).
<cf508b8de6>
On `master`, prior to the change:
- Used `update` + `refs` + `discover`.
- Verified tag was not fetched with `git for-each-ref` in local working copy and the web UI.
- Verified commit was not discovered using the web UI.
With this patch applied:
- Used `update`, saw a `refs/*` fetch instead of a `refs/heads/*` fetch.
- Used `git for-each-ref` to verify that tag fetched.
- Used `repository refs`.
- Saw new tag appear in the tags list in the web UI.
- Saw new refcursor appear in refcursor table.
- Used `repository discover --verbose` and examine refs for sanity.
- Saw commit row appear in database.
- Saw commit skeleton appear in web UI.
- Ran `bin/phd debug task`.
- Saw commit fully parse.
{F1689319}
Reviewers: chad
Reviewed By: chad
Subscribers: avivey
Maniphest Tasks: T6878, T9028
Differential Revision: https://secure.phabricator.com/D16129
Summary:
Ref T11137. This class is removed in D16099. Depends on D16099.
`PhutilURI` now attempts to "just work" with Git-style URIs, so at least in theory we can just delete all of this code and pretend it does not exist.
(I've left "Display URI" and "Effective URI" as distinct, at least for now, because I think the distinction may be relevant in the future even though it isn't right now, and to keep this diff small, although I may go remove one after I think about this for a bit.)
Test Plan:
- Created a new Git repository with a Git URI.
- Pulled/updated it, which now works correctly and should resolve the original issue in T11137.
- Verified that daemons now align the origin to a Git-style URI with a relative path, which should resolve the original issue in T11004.
- Grepped for `PhutilGitURI`.
- Also grepped in `arcanist/`, but found no matches, so no patch for that.
- Checked display/conduit URIs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11137
Differential Revision: https://secure.phabricator.com/D16100
Summary: Ref T10227. When we perform `git` http operations (fetch, mirror) check if we should use a proxy; if we should, set `http_proxy` or `https_proxy` in the environment to make `git` have `curl` use it.
Test Plan:
- Configured a proxy extension to run stuff through a local instance of Charles.
- Ran `repository pull` and `repository mirror`.
- Saw `git` HTTP requests route through the proxy.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10227
Differential Revision: https://secure.phabricator.com/D16092
Summary:
Fixes T11082. Currently, the `/123/` and `/CALLSIGN/` versions of the URI get the same score.
Also the scores are backwards.
Test Plan:
- Added `getPublicCloneURI()` output to repository listing.
- Before patch, saw a repository with a callsign list a less-preferred ID-based URI.
- After patch, saw the repository list the more-preferred callsign-based URI.
Reviewers: chad
Reviewed By: chad
Subscribers: nikolay.metchev
Maniphest Tasks: T11082
Differential Revision: https://secure.phabricator.com/D16008
Summary:
Ref T4292. For hosted, clustered repositories we have a good way to increment the internal version of the repository: every time a user pushes something, we increment the version by 1.
We don't have a great way to do this for observed/remote repositories because when we `git fetch` we might get nothing, or we might get some changes, and we can't easily tell //what// changes we got.
For example, if we see that another node is at "version 97", and we do a fetch and see some changes, we don't know if we're in sync with them (i.e., also at "version 97") or ahead of them (at "version 98").
This implements a simple way to version an observed repository:
- Take the head of every branch/tag.
- Look them up.
- Pick the biggest internal ID number.
This will work //except// when branches are deleted, which could cause the version to go backward if the "biggest commit" is the one that was deleted. This should be OK, since it's rare and the effects are minor and the repository will "self-heal" on the next actual push.
Test Plan:
- Created an observed repository.
- Ran `bin/repository update` and observed a sensible version number appear in the version table.
- Pushed to the remote, did another update, saw a sensible update.
- Did an update with no push, saw no effect on version number.
- Toggled repository to hosted, saw the version reset.
- Simulated read traffic to out-of-sync node, saw it do a remote fetch.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15986
Summary:
Fixes T11030. Fixes T11032.
- Allow HTTP access to "Public" repositories even if `diffusion.allow-http-auth` is disabled.
- If you run Phabricator on an unusual port (???) use that port as the default when generating HTTP URIs.
Test Plan:
- Faked `phabricator.base-uri` to an unusual port, saw repository HTTP URI generate with an unusual port.
- Disabled `diffusion.allow-http-auth`, confirmed that toggling view policy between "public" and "users" activated or deactivated HTTP clone URI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11030, T11032
Differential Revision: https://secure.phabricator.com/D15973
Summary:
Ref T10939. Fixes T10174. We can currently trigger "uninteresting" auditors in two ways:
- Packages with auditing disabled ("NONE" audits).
- Packages with auditing enabled, but they don't need an audit (e.g., author is a pacakge owner; "NOT REQUIRED" audits).
These audits aren't interesting (we only write them so we can list "commits in this package" from other UIs) but right now they take up the audit slot. In particular:
- They show in the UI, but are generally useless/confusing nowadays. The actual table of contents does a better job of just showing "which packages do these paths belong to" now, and shows all packages for each path.
- They block Herald from adding real auditors.
Change this:
- Don't show uninteresting auditors.
- Let Herald upgrade uninteresting auditors into real auditors.
Test Plan:
- Ran `bin/repository reparse --owners <commit> --force`, and `--herald` to trigger Owners and Herald rules.
- With a package with auditing disabled, triggered a "None" audit and saw it no longer appear in the UI with the patch applied.
- With a package with auditing disabled, added a Herald rule to trigger an audit. With the patch, saw it go through and upgrade the audit to "Audit Required".
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10174, T10939
Differential Revision: https://secure.phabricator.com/D15940
Summary:
Ref T10939. Fixes T10181. This slightly simplifies, then documents the auditing rules, which haven't been updated for a while. In particular:
- If an owner authored the change, never audit.
- Examine all reviewers to determine reviewer audit status, not just the first reviewer.
- Simplify some of the loading code a bit.
Test Plan:
- Ran `bin/repository reparse --owners <commit> --force` to trigger this stuff.
- Verified that the web UI did reasonable things with resulting audits.
- Read documentation.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10181, T10939
Differential Revision: https://secure.phabricator.com/D15939
Summary:
Ref T10939. This is just a bug. I thought this was what was described in T10174 but that's actually talking about something completely different.
Also make a `<select />` slightly easier to use.
Test Plan:
- Created a package with auditing enabled.
- Pushed a change.
- Saw audit trigger.
- Disabled the package, pushed a change.
- Before patch: saw audit trigger improperly.
- After patch: restarted daemons, then saw audit correctly not trigger.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10939
Differential Revision: https://secure.phabricator.com/D15907
Summary:
Ref T4292. Currently, we hold one big lock around the whole `bin/repository update` workflow.
When running multiple daemons on different hosts, this lock can end up being contentious. In particular, we'll hold it during `git fetch` on every host globally, even though it's only useful to hold it locally per-device (that is, it's fine/good/expected if `repo001` and `repo002` happen to be fetching from a repository they are observing at the same time).
Instead, split it into two locks:
- One lock is scoped to the current device, and held during pull (usually `git fetch`). This just keeps multiple daemons accidentally running on the same host from making a mess when trying to initialize or update a working copy.
- One lock is scoped globally, and held during discovery. This makes sure daemons on different hosts don't step on each other when updating the database.
If we fail to acquire either lock, assume some other process is legitimately doing the work and bail more quietly instead of fataling. In approximately 100% of cases where users have hit this lock contention, that was the case: some other daemon was running somewhere doing the work and the error didn't actually represent an issue.
If there's an actual problem, we still raise a diagnostically useful message if you run `bin/repository update` manually, so there are still tools to figure out that something is hung or whatever.
Test Plan:
- Ran `bin/repository update`, `pull`, `discover`.
- Added `sleep(5)`, forced processes to contend, got lock exceptions and graceful exit with diagnostic message.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15903
Summary:
Fixes T10940. Two issues currently:
First, `PullLocal` deamon refuses to update non-cluster repositories on cluster devices. However, this is surprising/confusing/bad because as soon as you enroll a repository host in the cluster, most of the repositories on it stop working until you `clusterize` them. This is especially confusing because the documentation gives you a very nice, gradual walkthrough about going through things slowly and being able to check your work at every step, but we really drop you off a bit of a cliff here. The workflow implied by the documentation is a desirable one.
This operation is generally only unsafe/problematic if the daemon would be creating a //new// working copy. If a working copy already exists, we can reasonably guess that it's almost certainly because you've enrolled a previously un-clustered host into a new cluster. This allows the nice, gradual workflow the documentation describes to proceed as expected, without any weird surprises.
Instead of refusing to update these repositories, only refuse to update them if updating would create a new working copy. This should make transitioning much smoother without any meaningful reduction in safety.
Second, the lower-level `bin/repository update`, `refs`, `mirror`, etc., commands don't apply this same check. However, these commands are potentially just as dangerous. Use the same code to do a similar check there, making sure we only operate on repositories that are either expected to be on the current device, or which already exist here.
Test Plan:
- Ran `bin/phd debug pull`, saw diagnostic information choose to update most repositories (including some non-cluster repositories) but properly skip non-cluster repositories that do not exist locally.
- Ran `bin/repository update`, etc., saw the command apply consistent rules to the rules applied by `PullLocal` and refuse to update non-local repositories it would need to create.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10940
Differential Revision: https://secure.phabricator.com/D15902
Summary: Fixes T10948. Ref T10923. Make these rules a little more thorough and document their behavior.
Test Plan: Looked at Diffusion clone URIs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10923, T10948
Differential Revision: https://secure.phabricator.com/D15887
Summary: Fixes T10815. We already recovered reasonably from this for cluster repositories, but not for non-cluster repositories.
Test Plan:
- Viewed cluster and non-cluster empty Git repository.
- Viewed cluster and non-cluster empty Mercurial repository.
- Viewed cluster and non-clsuter empty hosted SVN repository.
- Viewed cluster and non-cluster empty observed SVN repository.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10815
Differential Revision: https://secure.phabricator.com/D15878
Summary:
Ref T10923. Fixes T9554.
When hosting a repository, we currently have a heuristic that tries to detect when you're doing an initial import: if you push more than 7 commits to an empty repository, it counts as an import and we disable mail/feed/etc.
Do something similar for observed repositories: if the repository is empty and we discover more than 7 commits, switch to import mode until we catch up.
This should align behavior with user expectation more often when juggling hosted vs imported repositories.
Test Plan:
- Created a new hosted repository.
- Activated it and allowed it to fully import.
- Added an "Observe URI".
- Saw it automatically drop into "Importing" mode until the import completed.
- Swapped it back to hosted mode.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9554, T10923
Differential Revision: https://secure.phabricator.com/D15877
Summary:
Ref T10923. When regenerating the URI index for a repository, index every URI.
- Also, make the index slightly stricter (domain + path instead of just path). Excluding the domain made more sense when we were generating only first-party URIs.
- Make the index smarter about `/diffusion/123/` URIs.
- Show normalized URIs in `diffusion.repository.search` results.
Test Plan:
- Ran migration.
- Verified sensible-looking results in database.
- Searched for a repository URI by first-party clone URI.
- Searched for a repository URI by mirror URI.
- Used `diffusion.repository.search` to get information about repository URIs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10923
Differential Revision: https://secure.phabricator.com/D15876
Summary: Ref T10923. Walk users through the "create, configure, activate" workflow a little better and set expectations more clearly.
Test Plan:
- Created a new repository, saw new UI help.
- Activated repository, saw onboarding help disappear.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10923
Differential Revision: https://secure.phabricator.com/D15875
Summary:
Ref T10923.
- Hide "Automation", "Staging" and "Branches" in repositories where they do nothing.
- Fix SVN SSH URIs to read "svn+ssh://" and have proper paths.
Test Plan:
- Verified irrelevant sections did not appear in Subversion in Manage UI.
- Checked out a new hosted SVN repository.
Reviewers: chad, avivey
Reviewed By: avivey
Maniphest Tasks: T10923
Differential Revision: https://secure.phabricator.com/D15874
Summary: Ref T10748. We're returning a `PhabricatorRepositoryURI` here but the code expects an actual `PhutilURI`.
Test Plan:
This should clear this up in production:
```
Daemon 180278 STDE [Wed, 04 May 2016 23:25:16 +0000] [2016-05-04 23:25:16] EXCEPTION: (PhutilProxyException) Error while executing Task ID 1677075. {>} (RuntimeException) Object of class PhabricatorRepositoryURI could not be converted to string at [<phutil>/src/error/PhutilErrorHandler.php:205]
Daemon 180278 STDE [Wed, 04 May 2016 23:25:16 +0000] arcanist(head=master, ref.master=c58f1b9a2507), libcore(), phabricator(head=master, ref.master=29d1115037b8), phutil(head=master, ref.master=0709cd5cfc26), services(head=master, ref.master=04ae8c8f8e3b)
Daemon 180278 STDE [Wed, 04 May 2016 23:25:16 +0000] #0 <#2> PhutilErrorHandler::handleError(integer, string, string, integer, array) called at [<phabricator>/src/applications/repository/storage/PhabricatorRepository.php:1200]
Daemon 180278 STDE [Wed, 04 May 2016 23:25:16 +0000] #1 <#2> PhabricatorRepository::getPublicCloneURI() called at [<phabricator>/src/applications/repository/storage/PhabricatorRepositoryCommit.php:395]
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15844
Summary:
Ref T10748. This needs more extensive testing and is sure to have some rough edges, but seems to basically work so far.
Throwing this up so I can work through it more deliberately and make notes.
Test Plan:
- Ran migration.
- Used `bin/repository list` to list existing repositories.
- Used `bin/repository update <repository>` to update various repositories.
- Updated a migrated, hosted Git repository.
- Updated a migrated, observed Git repository.
- Converted an observed repository into a hosted repository by toggling the I/O mode of the URI.
- Conveted a hosted repository into an observed repository by toggling it back.
- Created and activated a new empty hosted Git repository.
- Created and activated an observed Git repository.
- Updated a mirrored repository.
- Cloned and pushed over HTTP.
- Tried to HTTP push a read-only repository.
- Cloned and pushed over SSH.
- Tried to SSH push a read-only repository.
- Updated several Mercurial repositories.
- Updated several Subversion repositories.
- Created and edited repositories via the API.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15842
Summary:
Ref T10748. This migrates and swaps mirroring to `PhabricatorRepositoryURI`, obsoleting `PhabricatorRepositoryMirror`.
This prevents you from editing, adding or disabling mirrors unless you know a secret URI (until the UI cuts over fully), but existing mirroring is not affected.
Test Plan:
- Added a mirroring URI to an old repository.
- Verified it worked with `bin/repository mirror`.
- Migrated forward.
- Verified it still worked with `bin/repository mirror`.
- Wow, mirroring: https://github.com/epriestley/locktopia-mirror
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15841
Summary:
Ref T10748. In D14250#158181, I accepted this conditional on removing it once Conduit could handle it.
Conduit can now handle it, or at least will be able to as soon as T10748 cuts over.
Test Plan: Grepped for `repository edit`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15839
Summary:
Ref T10748. This has had many problems for a long time, can't create hosted repositories, can't create cluster repositories, etc. It is obsoleted by `diffusion.repository.edit`. Remove it.
(Right now `diffusion.repository.edit` isn't a strict replacement, but it will be as soon as the URI stuff cuts over.)
Test Plan: Grepped for `repository.create`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15838
Summary:
Ref T4039. Long ago these were more freely editable and there were some security concerns around creating a repository, then setting its local path to point somewhere it shouldn't.
Local paths are no longer editable so there's no real reason we need to provide a uniqueness guarantee anymore, but you could still make a mistake with `bin/repository move-paths` by accident, and it's a little cleaner to pull them out into their own column with a key.
(We still don't -- and, largely can't -- guarantee that two paths aren't //equivalent// since one might be symlinked to the other, or symlinked only on some hosts, or whatever, but the primary value here is as a sanity check that you aren't goofing things up and pointing a bunch of repositories at the same working copy by mistake.)
Test Plan:
- Ran migrations.
- Grepped for `local-path`.
- Listed and moved paths with `bin/repository`.
- Created a new repository, verified its local path populated correctly.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4039
Differential Revision: https://secure.phabricator.com/D15837
Summary:
Ref T10748. Ref T10366. Allows users to set credential for new URIs.
- Ref T7221. Our handling of the "git://" protocol is currently incorrect. This protocol is not authenticated, but is considered an SSH protocol. In the new UI, it is considered an anonymous/unauthenticated protocol instead.
- Ref T10241. This fixes the `PassphraseCredentialControl` so it doesn't silently edit the value if the current value is not visible to you and/or not valid.
Test Plan:
Performed a whole lot of credential edits, removals, and adjustments. I'll give this additional vetting before cutting over to it.
{F1253207}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T7221, T10241, T10366, T10748
Differential Revision: https://secure.phabricator.com/D15829
Summary:
Ref T10748.
- Allow users to add new URIs by clicking a button instead of knowing a secret URI.
- Validate that URIs are actually valid URIs.
- Add enable/disable action and strings.
Test Plan:
- Created a new URI.
- Tried to create a nonsense URI, created a good URI.
- Enabled/disabled a URI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15825
Summary: Ref T10748. Adds a "uris" attachment with URI information.
Test Plan: Queried URI information via Conduit, saw reasonable looking information.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15822
Summary: Ref T10748. Brings the rest of the transactions to EditEngine, supports creating via API.
Test Plan:
- Created a URI via API.
- Created a URI via web.
- Tried to apply sneaky transactions, got rejected with good error messages. <_< >_>
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15821
Summary: Ref T10748. Ref T10366. This documents how everything is planned to work shortly.
Test Plan: Read documentation.
Reviewers: chad
Reviewed By: chad
Subscribers: eadler, scode
Maniphest Tasks: T10366, T10748
Differential Revision: https://secure.phabricator.com/D15817
Summary:
Ref T10748.
- New View page for repository URIs.
- Make display and I/O behavior (observe, mirror, read, read/write) editable.
- Add a bunch of checks to prevent you from completely screwing up a repository by making it writable from a bunch of differnet sources.
Test Plan:
{F1249866}
{F1249867}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15816
Summary:
Ref T10748. Ref T10366. This adds a new EditEngine, EditController, Editor, Query, and Transaction for RepositoryURIs.
None of these really do anything helpful yet, and these URIs are still unused in the actual application.
Test Plan: {F1249794}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10366, T10748
Differential Revision: https://secure.phabricator.com/D15815
Summary:
Ref T10748. Allow the new EditEngine workflow to create repositories by giving the user a modal repository type choice upfront.
(The rest of this flow is still confusing/weird, though.)
Test Plan:
- Created a new repository.
{F1249626}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15813
Summary: Ref T10748. This brings the "Actions" items (publish/notify + autoclose enabled) into the new UI.
Test Plan:
- Edited this stuff via EditEngine and Conduit.
- Viewed via new Manage UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15811
Summary: Ref T10748. Port this, add EditEngine support, add some type validation to the transaction.
Test Plan:
- Edited via EditEngine.
- Edited via Conduit.
- Viewed via Management UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15808
Summary: Ref T4292. This provides at least some sort of hint about how to set up cluster repositories.
Test Plan:
- Read documentation.
- Ran `bin/repository clusterize` to add + remove clusters.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15798
Summary: This gets over-escaped instead of bolded right now, but I only ever hit it when exporting/importing and never both cleaning it up.
Test Plan: Ran `bin/repository move-paths`, saw bolded "Move" instead of ANSI escape sequences.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D15797
Summary:
Ref T10860. This allows us to recover if the connection to the database is lost during a push.
If we lose the connection to the master database during a push, we would previously freeze the repository. This is very safe, but not very operator-friendly since you have to go manually unfreeze it.
We don't need to be quite this aggressive about freezing things. The repository state is still consistent after we've "upgraded" the lock by setting `isWriting = 1`, so we're actually fine even if we lost the global lock.
Instead of just freezing the repository immediately, sit there in a loop waiting for the master to come back up for a few minutes. If it recovers, we can release the lock and everything will be OK again.
Basically, the changes are:
- If we can't release the lock at first, sit in a loop trying really hard to release it for a while.
- Add a unique lock identifier so we can be certain we're only releasing //our// lock no matter what else is going on.
- Do the version reads on the same connection holding the lock, so we can be sure we haven't lost the lock before we do that read.
Test Plan:
- Added a `sleep(10)` after accepting the write but before releasing the lock so I could run `mysqld stop` and force this issue to occur.
- Pushed like this:
```
$ echo D >> record && git commit -am D && git push
[master 707ecc3] D
1 file changed, 1 insertion(+)
# Push received by "local001.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "local001.phacility.net"...
# Acquired read lock immediately.
# Device "local001.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "local001.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 254 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
BEGIN SLEEP
```
- Here, I stopped `mysqld` from the CLI in another terminal window.
```
END SLEEP
# CRITICAL. Failed to release cluster write lock!
# The connection to the master database was lost while receiving the write.
# This process will spend 300 more second(s) attempting to recover, then give up.
```
- Here, I started `mysqld` again.
```
# RECOVERED. Link to master database was restored.
# Released cluster write lock.
To ssh://local@localvault.phacility.com/diffusion/26/locktopia.git
2cbf87c..707ecc3 master -> master
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15792
Summary: Ref T10860. This doesn't change anything, it just separates all this stuff out of `PhabricatorRepository` since I'm planning to add a bit more state to it and it's already pretty big and fairly separable.
Test Plan: Pulled, pushed, browsed Diffusion.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10860
Differential Revision: https://secure.phabricator.com/D15790
Summary:
Ref T4292. Sometimes, we may not have a working copy for a repository. The easiest way to get into this condition is to deactivate a repository.
We could try to clone + fetch in this case, but that's kind of complex, and there's an easy command that administrators can run manually. For now, just tell them to do that.
This affects the inactive repositories on `secure`, like rGITCOINS.
Test Plan: Removed working copy, got message.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15786
Summary:
Ref T4292. When the daemons make a query for repository information, we need to make sure the working copy on disk is up to date before we serve the response, since we might not have the inforamtion we need to respond otherwise.
We do this automatically for almost all Diffusion methods, but this particular method is a little unusual and does not get this check for free. Add this check.
Test Plan:
- Made this code throw.
- Ran `bin/repository reparse --message ...`, saw the code get hit.
- Ran `bin/repository lookup-user ...`, saw this code get hit.
- Made this code not throw.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15783
Summary:
Ref T10751. Add support tooling for manually prying your way out of trouble if disaster strikes.
Refine documentation, try to refer to devices as "devices" more consistently instead of sometimes calling them "nodes".
Test Plan: Promoted and demoted repository devices with `bin/repository thaw`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10751
Differential Revision: https://secure.phabricator.com/D15768
Summary:
Ref T10751. Make the UI more useful and explain what failure states mean and how to get out of them.
The `bin/repository thaw` command does not exist yet, I'll write that soon.
Test Plan: {F1238241}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10751
Differential Revision: https://secure.phabricator.com/D15766
Summary: Ref T10751. This cleans this up so it's a little more modern, and fixes a possible bad access on the log detail page.
Test Plan: Viewed push log list, viewed push log detail.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10751
Differential Revision: https://secure.phabricator.com/D15765
Summary:
Ref T4292. Right now, repository versions only get marked when a write happens.
This potentially creates a problem: if I pushed all the sync code to `secure` and enabled `secure002` as a repository host, the daemons would create empty copies of all the repositories on that host.
Usually, this would be fine. Most repositories have already received a write on `secure001`, so that working copy has a verison and is a leader.
However, when a write happened to a rarely-used repository (say, rKEYSTORE) that hadn't received any write recently, it might be sent to `secure002` randomly. Now, we'd try to figure out if `secure002` has the most up-to-date copy of the repository or not.
We wouldn't be able to, since we don't have any information about which node has the data on it, since we never got a write before. The old code could guess wrong and decide that `secure002` is a leader, then accept the write. Since this would bump the version on `secure002`, that would //make// it an authoritative leader, and `secure001` would synchronize from it passively (or on the next read or write), which would potentially destroy data.
Instead:
- Refuse to continue in situations like this.
- When a repository is on exactly one device, mark it as a leader with version "0".
- When a repository is created into a cluster service, mark its version as "0" on all devices (they're all leaders, since the repository is empty).
This should mean that we won't lose data no matter how much weird stuff we run into.
Test Plan:
- In single-node mode, used `repository update` to verify that `0` was written properly.
- With multiple nodes, used `repository update` to verify that we refuse to continue.
- Created a new repository, verified versions were initialized correctly.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15761
Summary: Ref T4292. When we write a push log, also log which node received the request.
Test Plan: {F1230467}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15759
Summary:
Ref T4292. We currently synchronize hosted, clustered, Git repositories when we receive an SSH pull or push.
Additionally:
- Synchronize before HTTP reads and writes.
- Synchronize reads before Conduit requests.
We could relax Conduit eventually and allow Diffusion to say "it's OK to give me stale data".
We could also redirect some set of these actions to just go to the up-to-date host instead of connecting to a random host and synchronizing it. However, this potentially won't work as well at scale: if you have a larger number of servers, it sends all of the traffic to the leader immediately following a write. That can cause "thundering herd" issues, and isn't efficient if replicas are in different geographical regions and the write just went to the east coast but most clients are on the west coast. In large-scale cases, it's better to go to the local replica, wait for an update, then serve traffic from it -- particularly given that writes are relatively rare. But we can finesse this later once things are solid.
Test Plan:
- Pushed and pulled a Git repository over HTTP.
- Browsed a Git repository from the web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15758
Summary:
Ref T4292. Before we write or read a hosted, clustered Git repository over SSH, check if another version of the repository exists on another node that is more up-to-date.
If such a version does exist, fetch that version first. This allows reads and writes of any node to always act on the most up-to-date code.
Test Plan: Faked my way through this and got a fetch via `bin/repository update`; this is difficult to test locally and needs more work before we can put it in production.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15757
Summary:
Ref T4292. This consolidates code for figuring out which user we should connect to hosts with.
Also narrows a lock window.
Test Plan: Browsed Diffusion, pulled and pushed through an SSH proxy.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15754
Summary:
Ref T4292. Ref T10366. Depends on D15751. Today, generating repository commands is purely a function of the repository, so they use protocols and credentials based on the repository configuration.
For example, a repository with an SSH "remote URI" always generate SSH "remote commands".
This needs to change in the future:
- After T10366, repositories won't necessarily just have one type of remote URI. They can only have one at a time still, but the repository itself won't change based on which one is currently active.
- For T4292, I need to generate intracluster commands, regardless of repository configuration. These will have different protocols and credentials.
Prepare for these cases by separating out command construction, so they'll be able to generate commands in a more flexible way.
Test Plan:
- Added unit tests.
- Browsed diffusion.
- Ran `bin/phd debug pull` to pull a bunch of repos.
- Ran daemons.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292, T10366
Differential Revision: https://secure.phabricator.com/D15752
Summary: Ref T4292. This will let the UI and future `bin/repository` tools give administrators more tools to understand problems when reporting or resolving them.
Test Plan:
- Pushed fully clean repository.
- Pushed previously-pushed repository.
- Forced write to abort, inspected useful information in the database.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15748
Summary:
Ref T4292. Small fixes:
- There was a bug with the //first// write, where we'd write 1 but expect 0. Fix this.
- Narrow the window where we hold the `isWriting` lock: we don't need to wait for the client to finish.
- Release the lock even if something throws.
- Use a more useful variable name.
Test Plan:
- Made new writes to a fresh cluster repository.
- Made sequential writes.
- Made concurrent writes.
- Made good writes and bad writes.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15747
Summary:
Fixes T10830.
- The return code from `storage adjust` did not propagate correct.
- There was one column issue which I missed the first time around because I had a bunch of unrelated stuff locally.
Test Plan:
- Ran `bin/storage upgrade -f` with failures, used `echo $?` to make sure it exited nonzero.
- Got fully clean `bin/storage adjust` by dropping all my extra local tables.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10830
Differential Revision: https://secure.phabricator.com/D15746
Summary:
Fixes T10830. Ref T10366. I wasn't writing to this table yet so I didn't build it, but the fact that `bin/storage adjust` would complain slipped my mind.
- Add the table.
- Make the tests run `adjust`. This is a little slow (a few extra seconds) but we could eventually move some steps like this to run server-side only.
Test Plan: Ran `bin/storage upgrade -f`, got a clean `adjust`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10366, T10830
Differential Revision: https://secure.phabricator.com/D15744
Summary:
Ref T10748. This supports more transaction types in the modern editor and improves validation so Conduit benefits.
You can technically create repositories via `diffusion.repository.edit` now, although they aren't very useful.
Test Plan:
- Used `diffusion.repository.edit` to create and edit repositories.
- Used `/editpro/` to edit repositories.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10748
Differential Revision: https://secure.phabricator.com/D15740
Summary: Ref T10748. Ref T10366. No support for editing and no impact on the UI, but get some of the basics in place.
Test Plan: {F1223279}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10366, T10748
Differential Revision: https://secure.phabricator.com/D15742
Summary: Ref T10748. Ref T10337. This technically implements this stuff, but it does not do anything useful yet. This skips all the hard stuff.
Test Plan:
- Technically used `diffusion.repository.search` to get repository information.
- Technically used `diffusion.repository.edit` to change a repository name.
- Used `editpro/` to edit a repository name.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10337, T10748
Differential Revision: https://secure.phabricator.com/D15736
Summary:
This nearly works but I didn't have time to get back to it and it isn't stable enough to turn on in the cluster yet.
We have enough other stuff going out this week, so just disable it before `stable` gets cut. Should be ready by next week if things go well.
Test Plan: Fetched a Git SSH repo locally.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D15731
Summary:
Ref T2783. This allows this worker to run on a machine different to the one that stores the repository, by routing the execution of Git over Conduit calls.
This API method is super gross, but fixing it isn't straightforward and it runs into other complicated considerations. We can fix it later; for now, just define it as "internal" to limit how much mess this creates.
"Internal" methods do not appear on the console.
Test Plan: Ran `bin/repository reparse --change <commit> --trace` on several commits, saw daemons make a Conduit call instead of running a `git` command.
Reviewers: hach-que, chad
Reviewed By: chad
Subscribers: joshuaspence, Korvin, epriestley
Maniphest Tasks: T2783
Differential Revision: https://secure.phabricator.com/D11874
Summary: Fixes T10789. If we aren't configured with a device, we never grabbed a lock in the first place, and should not expect one to be held.
Test Plan: Pushed non-cluster-configured Git SSH repository.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10789
Differential Revision: https://secure.phabricator.com/D15692
Summary:
Ref T4292. This mostly implements the locking/versioning logic for multi-master repositories. It is only active on Git SSH pathways, and doesn't actually do anything useful yet: it just does bookkeeping so far.
When we read (e.g., `git fetch`) the logic goes like this:
- Get the read lock (unique to device + repository).
- Read all the versions of the repository on every other device.
- If any node has a newer version:
- Fetch the newer version.
- Increment our version to be the same as the version we fetched.
- Release the read lock.
- Actually do the fetch.
This makes sure that any time you do a read, you always read the most recently acknowledged write. You may have to wait for an internal fetch to happen (this isn't actually implemented yet) but the operation will always work like you expect it to.
When we write (e.g., `git push`) the logic goes like this:
- Get the write lock (unique to the repository).
- Do all the read steps so we're up to date.
- Mark a write pending.
- Do the actual write.
- Bump our version and mark our write finished.
- Release the write lock.
This allows you to write to any replica. Again, you might have to wait for a fetch first, but everything will work like you expect.
There's one notable failure mode here: if the network connection between the repository node and the database fails during the write, the write lock might be released even though a write is ongoing.
The "isWriting" column protects against that, by staying locked if we lose our connection to the database. This will currently "freeze" the repository (prevent any new writes) until an administrator can sort things out, since it'd dangerous to continue doing writes (we may lose data).
(Since we won't actually acknowledge the write, I think, we could probably smooth this out a bit and make it self-healing //most// of the time: basically, have the broken node rewind itself by updating from another good node. But that's a little more complex.)
Test Plan:
- Pushed changes to a cluster-mode repository.
- Viewed web interface, saw "writing" flag and version changes.
- Pulled changes.
- Faked various failures, got sensible states.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15688
Summary:
Ref T4292. This adds some very basic cluster/device data to the new management view. Nothing interesting yet.
Also deal with disabled bindings a little more cleanly.
Test Plan: {F1214619}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15685
Summary:
Ref T10756. When repositories are properly configured for the cluster (which is hard to set up today), be smart about which repositories are expected to exist on the current host, and only pull them.
This generally allows daemons to pretty much do the right thing no matter how many copies are running, although there may still be some lock contention issues that need to be sorted out.
Test Plan: {F1214483}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10756
Differential Revision: https://secure.phabricator.com/D15682