Summary:
Via HackerOne. There are two attacks here:
- Configuring mirroring to a `file://` URI to place files on disk or overwrite another repository. This is not particularly severe.
- Configuring cloning from a `file://` URI to read repositories you should not have access to. This is more severe.
Historically, repository creation and editing explicitly supported `file://` URIs to deal with use cases where you had something else managing repositories on the same machine. Since there were no permissions, repository management was admin-only, and you couldn't mirror, this was fine.
As we've evolved, this use case is a tiny minority use case and the security implications of `file://` URIs overwhelm the utility it provides. Prevent the use of `file://` URIs. Existing configured repositories won't stop working, you just can't add any new ones.
Also prevent `localPath` from being set via Conduit (see T4039).
Test Plan:
- Tried to create a `file://` repository.
- Tried to create a `file://` mirror.
- Tried to create a `file://` repository via Conduit.
- Created a non-`file://` repository.
- Created a non-`file://` mirror.
- Created a non-`file://` repository via Conduit.
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9513
Summary: Ref T4986. Instead of requiring users to know the name of an application search engine class, let them select from a list.
Test Plan:
Created a new panel.
{F165468}
Reviewers: chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T4986
Differential Revision: https://secure.phabricator.com/D9500
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary: This went smoother than expeced. Makes the rounded Card the default, also tweaked selected state a little.
Test Plan:
Test UIExamples, Maniphest, Home, Differential, Harbormaster, Audit. Everything seems normal
{F163971}
{F163973}
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9408
Summary: Ref T4045. Ref T5179. Send all new writes into the modern store.
Test Plan:
- Created a diff.
- Verified it went to the modern store.
- Destroyed a revision, verified hunks were destroyed.
- Also unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9293
Summary:
Ref T4045. Ref T5179. Hunk storage has two major issues:
- It's utf8, but actual diffs are binary.
- It's huge and can't be compressed or archived.
This introduces a second datastore which solves these problems: by recording hunk encoding, supporting compression, and supporting alternate storage. There's no actual compression or storage support yet, but there's space in the table for them.
Since nothing actually uses hunk IDs, it's fine to have these tables exist at the same time and use the same IDs. We can migrate data between the tables gradually without requiring downtime or disrupting installs.
Test Plan:
- There are no writes to the new table yet.
- The only effect this has is making us issue one extra query when looking for hunks.
- Observed the query issue, but everything else continue working fine.
- Created a new diff.
- Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9290
Summary: Ref T5179. Ref T4045. Continue reducing the number of direct hunk loads we perform.
Test Plan: Pushed a closing commit, used `scripts/repository/reparse.php --message ...` to trigger this logic, got a sensible/accurate result.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4045, T5179
Differential Revision: https://secure.phabricator.com/D9288
Summary: Fixes T5255. Currently the `./bin/repository parents` workflow is quite slow. Batching up the SQL operations should make the workflow //seem// much faster.
Test Plan: Not yet tested.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Maniphest Tasks: T5255
Differential Revision: https://secure.phabricator.com/D9361
Summary: Currently, repositories can be deleted using `./bin/repository delete`. It makes sense to expose this operate to the `./bin/remove` script as well, for consistency.
Test Plan: Deleted a repository with `./bin/remove rTEST`.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9350
Summary: Fixes T5226. It's rare (but possible) for a commit to have the same parent more than once in Git.
Test Plan: Ran `bin/repository parents` on a normal repository.
Reviewers: joshuaspence
Reviewed By: joshuaspence
Subscribers: epriestley
Maniphest Tasks: T5226
Differential Revision: https://secure.phabricator.com/D9344
Summary:
This probably needs some tweaks, but the idea is to make it easier to browse and access applications without necessarily needing them to be on the homepage.
Open to feedback.
Test Plan:
(This screenshot merges "Organization", "Communication" and "Core" into a single "Core" group. We can't actually do this yet because it wrecks the homepage.)
{F160052}
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T5176
Differential Revision: https://secure.phabricator.com/D9297
Summary: Fixes T5195. Currently, the `./bin/repository parents` workflow doesn't respect tracked branches and will attempt to build parents caches for all branches.
Test Plan: For at least one of our repositories, this patch fixes the `Unknown commit` exception. Unfortunately, it doesn't seem to completely solve this problem though, but I suspect that this is due to commits that were overwritten with a `git push --force` or similar.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Maniphest Tasks: T5195
Differential Revision: https://secure.phabricator.com/D9322
Summary: Point everything at the new canonical URI.
Test Plan: `grep`
Reviewers: chad
Reviewed By: chad
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D9328
Summary:
Ref T4994. This stuff works:
- You can dump a blob of coverage information into `diffusion.updatecoverage`. This wipes existing coverage information and replaces it.
- It shows up when viewing files.
- It shows up when viewing commits.
This stuff does not work:
- When viewing files, the Javascript hover interaction isn't tied in yet.
- We always show this information, even if you're behind the commit where it was generated.
- You can't do incremental updates.
- There's no aggregation at the file (this file has 90% coverage), diff (the changes in this commit are 90% covered), or directory (the code in this directory has 90% coverage) levels yet.
- This is probably not the final form of the UI, storage, or API, so you should expect occasional changes over time. I've marked the method as "Unstable" for now.
Test Plan:
- Ran `save_lint.php` to check for collateral damage; it worked fine.
- Ran `save_lint.php` on a new branch to check creation.
- Published some fake coverage information.
- Viewed an affected commit.
- Viewed an affected file.
{F151915}
{F151916}
Reviewers: chad, btrahan
Reviewed By: btrahan
Subscribers: jhurwitz, epriestley, zeeg
Maniphest Tasks: T5044, T4994
Differential Revision: https://secure.phabricator.com/D9022
Summary: Fixes T3044. We currently don't add these to the index.
Test Plan: Made a unique inline comment on a commit, then searched for it.
Reviewers: btrahan, chad
Reviewed By: chad
Subscribers: epriestley
Maniphest Tasks: T3044
Differential Revision: https://secure.phabricator.com/D9170
Summary: Ref T5058. The use of "enum" is confusing; we mean "choose one of these specific string constants". Make this more clear.
Test Plan: Viewed each call from the web UI.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5058
Differential Revision: https://secure.phabricator.com/D9127
Summary:
Allows you to quickly search for files within a repository. Roughly:
- We build a big tree of everything and ship it to the client.
- The client implements a bunch of Sublime-ish magic to find paths.
Test Plan: {F154007}
Reviewers: chad, btrahan
Reviewed By: btrahan
Subscribers: epriestley, zeeg
Differential Revision: https://secure.phabricator.com/D9087
Summary: Ref T4986. Move push logs to a View, then have all the stuff that needs to use it use that View.
Test Plan: Viewed push logs and transaction detail in Diffusion. Created a panel.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4986
Differential Revision: https://secure.phabricator.com/D9104
Summary: Ref T2683. This is a small optimization, but it has low complexity: don't rebuild a bucket more than once in the same request, since it will almost always be the same. Bucket rebuilds are pretty cheap, but this saves a few queries.
Test Plan:
- After discovering (but before parsing) a commit, viewed its browse view. Verified that this patch causes us to perform only one bucket rebuild, and therefore reduces the number of queries we issue.
- Parsed the commit and viewed the browse view again, got successful rebuild and then fills from cache.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T2683
Differential Revision: https://secure.phabricator.com/D9055
Summary: Ref T2683. Normally not a big deal, but if a readme has some codeblocks missing the cache can slow things down.
Test Plan:
- Verified we hit the cache.
- Verified TOC still works.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T5028, T2683
Differential Revision: https://secure.phabricator.com/D9049
Summary:
Ref T2683. At least locally, browse views are now nearly instantaneous, even in Mercurial. We also fall back to what we were doing before if we miss or take too long, so this shouldn't make things very much worse even in extreme cases.
For a local `hg` repo, the time we spend pulling browse stuff has dropped from ~3,000ms to ~20ms. This is probably atypical, but not completely crazy or rigged or anything.
Test Plan: Viewed Git, Subversion and Mercurial repositories and observed dramatically better performance in Git and Mercurial as they took advantage of the cache.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, jhurwitz
Maniphest Tasks: T2683
Differential Revision: https://secure.phabricator.com/D9047
Summary:
Ref T2683. This is a refinement and simplification of D5257. In particular:
- D5257 only cached the commit chain, not path changes. This meant that we had to go issue an awkward query (which was slow on Facebook's install) periodically while reading the cache. This was reasonable locally but killed performance at FB scale. Instead, we can include path information in the cache. It is very rare that this is large except in Subversion, and we do not need to use this cache in Subversion. In other VCSes, the scale of this data is quite small (a handful of bytes per commit on average).
- D5257 required a large, slow offline computation step. This relies on D9044 to populate parent data so we can build the cache online at will, and let it expire with normal LRU/LFU/whatever semantics. We need this parent data for other reasons anyway.
- D5257 separated graph chunks per-repository. This change assumes we'll be able to pull stuff from APC most of the time and that the cost of switching chunks is not very large, so we can just build one chunk cache across all repositories. This allows the cache to be simpler.
- D5257 needed an offline cache, and used a unique cache structure. Since this one can be built online it can mostly use normal cache code.
- This also supports online appends to the cache.
- Finally, this has a timeout to guarantee a ceiling on the worst case: the worst case is something like a query for a file that has never existed, in a repository which receives exactly 1 commit every time other repositories receive 4095 commits, on a cold cache. If we hit cases like this we can bail after warming the cache up a bit and fall back to asking the VCS for an answer.
This cache isn't perfect, but I believe it will give us substantial gains in the average case. It can often satisfy "average-looking" queries in 4-8ms, and pathological-ish queries in 20ms on my machine; `hg` usually can't even start up in less than 100ms. The major thing that's attractive about this approach is that it does not require anything external or complicated, and will "just work", even producing reasonble improvements for users without APC.
In followups, I'll modify queries to use this cache and see if it holds up in more realistic workloads.
Test Plan:
- Used `bin/repository cache` to examine the behavior of this cache.
- Did some profiling/testing from the web UI using `debug.php`.
- This //appears// to provide a reasonable fast way to issue this query very quickly in the average case, without the various issues that plagued D5257.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, jhurwitz
Maniphest Tasks: T2683
Differential Revision: https://secure.phabricator.com/D9045
Summary:
Ref T4455. This adds a `repository_parents` table which stores `<childCommitID, parentCommitID>` relationships.
For new commits, it is populated when commits are discovered.
For older commits, there's a `bin/repository parents` script to rebuild the data.
Right now, there's no UI suggestion that you should run the script. I haven't come up with a super clean way to do this, and this table will only improve performance for now, so it's not important that we get everyone to run the script right away. I'm just leaving it for the moment, and we can figure out how to tell admins to run it later.
The ultimate goal is to solve T2683, but solving T4455 gets us some stuff anyway (for example, we can serve `diffusion.commitparentsquery` faster out of this cache).
Test Plan:
- Used `bin/repository discover` to discover new commits in Git, SVN and Mercurial repositories.
- Used `bin/repository parents` to rebuild Git and Mercurial repositories (SVN repos just exit with a message).
- Verified that the table appears to be sensible.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: jhurwitz, epriestley
Maniphest Tasks: T4455
Differential Revision: https://secure.phabricator.com/D9044
Summary: Ref T4986. I think this is the last of the easy ones, there are about 10 not-quite-so-trivial ones left.
Test Plan:
- Viewed app results.
- Created panels.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4986
Differential Revision: https://secure.phabricator.com/D9025
Summary: Ref T4986. These are mostly mechanical now, I skipped a couple of slightly tricky ones. Still a bunch to go.
Test Plan:
For each engine:
- Viewed the application;
- created a panel to issue the query.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4986
Differential Revision: https://secure.phabricator.com/D9017
Summary:
This is just a general review of config options, to reduce the amount of damage a rogue administrator (without host access) can do. In particular:
- Fix some typos.
- Lock down some options which would potentially let a rogue administrator do something sketchy.
- Most of the new locks relate to having them register a new service account, then redirect services to their account. This potentially allows them to read email.
- Lock down some general disk stuff, which could be troublesome in combination with other vulnerabilities.
Test Plan:
- Read through config options.
- Tried to think about how to do evil things with each one.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D8928
Summary:
Fixes T4917. Currently, if a user doesn't have access to, e.g., Phriction, they still get a checkbox in the search results to search for Wiki Documents. Those results will be filtered anyway, so this is confusing at best.
Instead, bind PHID types to applications. This is a relatively tailored fix; some areas for potential future work:
- Go through every PHID type and bind them all to applications. Vaguely nice to have, but doesn't get us anything for now.
- If no searchable application is installed, we don't show you an error state. This isn't currently possible ("People" is always installed) but in the interest of generality we could throw an exception or something at least.
- The elasticserach thing could probably constrain types to visible types, but we don't have a viewer there easily right now.
Test Plan: Uninstalled Phriction, saw the checkbox vanish.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4917
Differential Revision: https://secure.phabricator.com/D8904
Summary: Fixes T4919. There's some special casing in Diffusion for CAN_PUSH right now, just accommodate that until things get more general.
Test Plan: Viewed a repository edit screen with a custom policy transaction. Clicked the link to view it.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4919
Differential Revision: https://secure.phabricator.com/D8898
Summary:
Sometimes a commit can be huge (like a branch cut in FB www which could have more than half a million files touched). It will generate some emails with size more than 30M, and it will take quite a while to just sort the files and to send out.
Put a hard limit here to avoid such cases. Probably only matters for FB right now, but still even for a small repo with several thousand files, it is a waste to send them all out. Not sure if there is any cleaner way to do it though.
Test Plan: Tried it in FB installtion.
Reviewers: lifeihuang, epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D8889
Summary: ...also kills off "PhabricatorAuditCommitQuery" and "PhabricatorAuditQuery", by moving the work to "DiffusionCommitQuery". Generally cleans up some code around the joint on this too. Also provides policies for audit requests, which is basically the policy for the underlying commit. Fixes T4715. (For the TODO I added about files, I just grabbed T4713.)
Test Plan:
Audit: verified the three default views all showed the correct things, including highligthing. did some custom queries and got the correct results.
Diffusion: verified "blame view" still worked. verified paths were highlighted for packages i owned.
Home: verified audit boxes showed up with proper commits w/ audits
bin/audit: played around with it via --dry-run and got the right audits back
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: chad, epriestley, Korvin
Maniphest Tasks: T4715
Differential Revision: https://secure.phabricator.com/D8805
Summary: Throwing this up for testing, swapped out all icons in timeline for their font equivelants. Used better icons where I could as well. We should feel free to use more / be fun with the icons when possible since there is no penalty anymore.
Test Plan: I browsed many, not all, timelines in my sandbox and in IE8. Some of these were just swagged, but I'm expecting we'll do more SB testing before landing.
Reviewers: btrahan, epriestley
Reviewed By: epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D8827
Summary:
Ref T4605. When figuring out how long to wait to update a repository, factor in when it was last pushed. For rarely updated repositories, wait longer between updates.
(A slightly funky thing about this is that empty repos update every 15 seconds, but that seems OK for the moment.)
Test Plan:
Ran `bin/phd debug pulllocal` and saw sensible calculations and output:
```
...
<VERB> PhabricatorRepositoryPullLocalDaemon Last commit to repository "rPOEMS" was 1,239,608 seconds ago; considering a wait of 6,198 seconds before update.
>>> [79] <query> SELECT * FROM `repository` r ORDER BY r.id DESC
<<< [79] <query> 514 us
>>> [80] <query> SELECT * FROM `repository_statusmessage` WHERE statusType = 'needs-update'
<<< [80] <query> 406 us
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rINIH" is not due for an update for 8,754 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rDUCK" is not due for an update for 14 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rMTESTX" is not due for an update for 21,598 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rQWER" is not due for an update for 14 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rBT" is not due for an update for 13 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rSVNX" is not due for an update for 21,598 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rINIG" is not due for an update for 13 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rHGTEST" is not due for an update for 21,598 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rBTX" is not due for an update for 14 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rGX" is not due for an update for 13 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rMTX" is currently updating.
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rPOEMS" is not due for an update for 6,198 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rPHU" is currently updating.
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rSVN" is not due for an update for 21,598 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rPHY" is currently updating.
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rGTEST" is not due for an update for 21,598 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rINIS" is not due for an update for 6,894 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rARCLINT" is not due for an update for 21,599 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rLPHX" is not due for an update for 1,979 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rARC" is not due for an update for 1,824 second(s).
<VERB> PhabricatorRepositoryPullLocalDaemon Repository "rINIHG" is not due for an update for 21,599 second(s).
...
```
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4605
Differential Revision: https://secure.phabricator.com/D8782
Summary:
Ref T4605. Fixes T3466. The major change here is that we now run up to four simultaneous updates. This should ease cases where, e.g., one very slow repository was blocking other repositories. It also tends to increase load; the next diff will introduce smart backoff for cold repositories to ease this.
The rest of this is just a ton of logging so I can IRC debug these things by having users run them in `phd debug pulllocal` mode.
For T3466:
- You now have to hit four simultaneous hangs to completely block the update process.
- Importing repository updates are killed after 4 hours.
- Imported repository updates are killed after 15 minutes.
Test Plan:
- Ran `phd debug pulllocal` and observed sensible logs and behavior.
- Interrupted daemon from sleeps and processing with `diffusion.looksoon`.
- Ran with various `--not`, `--no-discovery` flags.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T3466, T4605
Differential Revision: https://secure.phabricator.com/D8785
Summary:
Ref T4605. Before discovering branches, try to prefill the cache in bulk. For repositories with large numbers of branches, this allows us to issue dramatically fewer queries.
(Before D8780, this cache was usually held across discovery events, so being able to fill it cheaply was not as relevant.)
Test Plan: Ran discovery on Git, Mercurial and SVN repositories. Observed fewer queries for Git/Mercurial.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4605
Differential Revision: https://secure.phabricator.com/D8781
Summary:
Ref T4605. Currently, the PullLocal daemon is responsible for two relatively distinct things:
- scheduling repository updates; and
- actually updating repositories.
Move the "actually updating" part into a new `bin/repository update` command, which basically runs the pull, discover, refs and mirror commands. This will let the parent process focus on scheduling in a more understandable way and update multiple repositories at once. It also makes it easier to debug and understand update behavior since the non-scheduling pipeline can be run separately.
Test Plan:
- Ran `update --trace` on SVN, Mercurial and Git repos.
- Ran PullLocal daemon for a while without issues.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4605
Differential Revision: https://secure.phabricator.com/D8780
Summary:
See discussion in D8773. Three small adjustments which should help prevent this kind of issue:
- When queueing followup tasks, hold them on the worker until we finish the task, then queue them only if the work was successful.
- Increase the default lease time from 60 seconds to 2 hours. Although most tasks finish in far fewer than 60 seconds, the daemons are generally stable nowadays and these short leases don't serve much of a purpose. I think they also date from an era where lease expiry and failure were less clearly distinguished.
- Increase the default wait-after-failure from 60 seconds to 5 minutes. This largely dates from the MetaMTA era, where Facebook ran services with high failure rates and it was appropriate to repeatedly hammer them until things went through. In modern infrastructure, such failures are rare.
Test Plan:
- Verified that tasks queued properly after the main task was updated.
- Verified that leases default to 7200 seconds.
- Intentionally failed a task and verified default 300 second wait before retry.
- Removed all default leases shorter than 7200 seconds (there was only one).
- Checked all the wait before retry implementations for anything much shorter than 5 minutes (they all seem reasonable).
Reviewers: btrahan, sowedance
Reviewed By: sowedance
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D8774
Summary:
Recently we see issues with huge commits (branch cuts for www) where people received hundreds of emails for the same commit. By checking all the active and archived tasks related to such commits, I saw the following pattern:
- The commit itself is marked as importStatus = 15 which means all the processing was actually done;
- In archived tasks, I see one PhabricatorRepositorySvnCommitMessageParserWorker, one PhabricatorRepositorySvnCommitChangeParserWorker, followed by many PhabricatorRepositoryCommitHeraldWorker, which means that the PhabricatorRepositoryCommitOwnersWorker (who schedule those herald tasks) was never done;
- PhabricatorRepositoryCommitOwnersWorker is always active (for days) with failureCount = 0;
- In daemon log I see a lot of lease expire exception for PhabricatorRepositoryCommitOwnersWorker.
So to me it looks like the following happened:
- Everything is fine until we schedule the PhabricatorRepositoryCommitOwnersWorker
- PhabricatorRepositoryCommitOwnersWorker actually successfully finished but its running time exceed 60s. Before it finishes, it scheduled the PhabricatorRepositoryCommitHeraldWorker task
- When we try to archive it, the lease expiration exception happened. As a result, it stayed active and will be picked up immediately since it is in the head of the queue
- The two steps above repeat forever until we kill it
I am not sure why we want to check lease expiration when we are archiving the task. For now I am giving the worker a little more time since parsing half million affected path needs some time..
Test Plan: Patched in our production and it worked.
Reviewers: lifeihuang, JoelB, #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D8773
Summary: wasn't working due to some type issues. Fixes T4756. I also made it display nicer while I was debugging this.
Test Plan: created a herald rule to block changes that added refs. git tag -a "test" -m "test test"; git push origin test got me blocked!
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: epriestley, Korvin
Maniphest Tasks: T4756
Differential Revision: https://secure.phabricator.com/D8724
Summary:
Fixes T4736. Currently, we incorrectly skip the `writeImportStatusFlag()` call if publishing is disabled (the `herald-disabled`) check. This means we don't flag the commit as imported, and don't move the pipeline forward correctly.
Instead, we only want to skip the owners stuff, not the pipeline stuff. Move that to a method.
(Also fix a nearby TODO now that we have a permanent failure exception.)
Test Plan:
- Used `scripts/repository/reparse.php --owners ...` to execute this code, fiddled with things to hit both the disabled and enabled branches and verified the flag stuff is still reached.
- Faked the exceptions and made sure they raise correctly.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4736
Differential Revision: https://secure.phabricator.com/D8715
Summary: ...also link to commits we know about in "Local Commits" and "Revision Update History" tables. Fixes T4585.
Test Plan: made a repo. made a diff (foo) and committed it (bar). made a new diff that was comprised of two local commits. noted links to (bar) in various commit hashes as expected
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: epriestley, Korvin, chad
Maniphest Tasks: T4585
Differential Revision: https://secure.phabricator.com/D8679
Summary:
Fixes T4677. Implements a "send an email" pre-receive action, which sends push summaries.
For use cases where features are often pushed as a large number of commits (e.g., checkpoint commits are retained), using commit emails means users get a ton of email. Instead, this allows you to get an email about a push, which summarizes what changed.
Overall, this is basically the same as commit email, but more suitable for some workflows.
Test Plan:
Wrote some rules, then made a bunch of pushes. Got email like this:
{F134929}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4677
Differential Revision: https://secure.phabricator.com/D8618
Summary:
Ref T4677. This shows a more detailed view of an entire "git push", "hg push", or "svn commit".
This is mostly to give push summary emails a reasonable, stable URI to link to for T4677.
Test Plan:
- Pushed into SVN, Git and Mercurial.
- Viewed partial and imported event records.
{F134864}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4677
Differential Revision: https://secure.phabricator.com/D8616
Summary:
Ref T4677. Currently, we record individual actions in a push as PhabricatorRepositoryPushLogs, but tie them together only loosely with a `transactionKey`.
Provide a real PushEvent object, and move some of the denormalized fields to it. This primarily just gives us more robust infrastructure for building, e.g., email about pushes, for T4677, since we can act on real PHIDs rather than passing awkward identifiers around.
Test Plan:
- Performed migration.
- Looked at database for consistency.
- Browsed/queried push logs.
- Pushed a bunch of stuff.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4677
Differential Revision: https://secure.phabricator.com/D8615
Summary: Ref T4590. Ref T1049. This is primarily intended to support HTTP auth in Harbormaster.
Test Plan: Added a field, edited it, etc.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T4590, T1049
Differential Revision: https://secure.phabricator.com/D8607
Summary: Fixes T4600. If there's also a revision, the variable "$message" gets overwritten. groan~
Test Plan: Pushed a commit with "Fixes T123" and a revision, saw it parse on the first try.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: chrisbolt, aran, epriestley
Maniphest Tasks: T4600
Differential Revision: https://secure.phabricator.com/D8519
Summary:
Via HackerOne. In regular expressions, "$" matches "end of input, or before terminating newline". This means that the expression `/^A$/` matches two strings: `"A"`, and `"A\n"`.
When we care about this, use `\z` instead, which matches "end of input" only.
This allowed registration of `"username\n"` and similar.
Test Plan:
- Grepped codebase for all calls to `preg_match()` / `preg_match_all()`.
- Fixed the ones where this seemed like it could have an impact.
- Added and executed unit tests.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: aran, epriestley
Differential Revision: https://secure.phabricator.com/D8516
Summary:
Currently, disabling Herald only disables feed, notifications and email. Historically, audits didn't really create external effects so it made sense for Herald to only partially disable itself.
With the advent of Harbormaster/Build Plans, it makes more sense for Herald to just stop doing anything. When this option is disabled, stop all audit/build/publish/feed/email actions for the repository.
Test Plan: Ran `scripts/repository/reparse.php --herald`, etc.
Reviewers: dctrwatson, btrahan
Reviewed By: btrahan
Subscribers: aran, epriestley
Differential Revision: https://secure.phabricator.com/D8509
Summary:
- Fixes T4588.
- See D8501.
- Adds a "Tags" field for Herald commit emails.
- Fixes a bug in `tagsquery` when filtering by commit name.
- Make `tagsquery` just return nothing instead of fataling against Mercurial/Subversion.
Test Plan: Used `bin/repository/reparse.php --herald` to exercise this code.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: aran, epriestley
Maniphest Tasks: T4588
Differential Revision: https://secure.phabricator.com/D8502
Summary:
Ref T4588. Request from @zeeg. Adds a "BRANCHES" field to commit emails, so the branches the commit appears on are shown.
I've implemented this with CustomField, but in a very light way.
Test Plan: Used `scripts/repository/reparse.php --herald` to generate mail, got a BRANCHES section where applicable.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: aran, epriestley, zeeg
Maniphest Tasks: T4588
Differential Revision: https://secure.phabricator.com/D8501
Summary:
Ref T2222.
- Removes `DifferentialTasksAttacher`, which has had no callsites for a very long time.
- Moves `differential.getrevisioncomments` off `DifferentialCommentQuery`.
- Moves Releeph churn field off `DifferentialCommentQuery`.
- Removes dead code in `DifferentialRevisionViewController`.
- Removes `DifferentialException` (no references).
- Removes `DifferentialRevision->loadComments()` (no callsites).
- Removes `DifferentialRevision->loadReviewedBy()` (all callsites updated).
- Removes `DifferentialCommentQuery` (all callsites updated).
Test Plan: Mostly a lot of `grep`.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2222
Differential Revision: https://secure.phabricator.com/D8476
Summary: Ref T2222. Straightforward, just breaks a needless dependency.
Test Plan: Pushed and parsed a commit with "Auditors" in it.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2222
Differential Revision: https://secure.phabricator.com/D8473
Summary: Ref T2222. There's some magic here, just port it forward in a mostly-reasonable way. This could use some refinement eventually.
Test Plan: Pushed commits with "Fixes" and "Ref" language, used `reparse.php` to trigger the new code. Saw expected updates in Maniphest.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2222
Differential Revision: https://secure.phabricator.com/D8471
Summary:
There are quite a few tests in Arcanist, libphutil and Phabricator that do something similar to `$this->assertEqual(false, ...)` or `$this->assertEqual(true, ...)`.
This is unnecessarily verbose and it would be cleaner if we had `assertFalse` and `assertTrue` methods.
Test Plan: I contemplated adding a unit test for the `getCallerInfo` method but wasn't sure if it was required / where it should live.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Differential Revision: https://secure.phabricator.com/D8460
Summary: Ref T2222. When we discover a commit associated with a revision, close it using modern transactions.
Test Plan: {F123848}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2222
Differential Revision: https://secure.phabricator.com/D8441
Summary:
For imported SVN repositories with an "Import Only" path, we produce a `/path/to/root/` URI, but should produce `/path/to/root/then/to/import/only/`.
As it is, the URI instructs the user to check out the whole repository.
Also, don't show the "Clone As" fragment in the URI for remote repositories, and prevent it from being edited for nonhosted repositories. This is generally more consistent with user expectation.
Test Plan:
- Created a remote SVN repository with "Import Only", saw path include it.
- Verified no "Clone As" options, no "Clone As" in URI.
- Switched it to hosted, saw "Clone As" options appear and work properly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, staticshock
Differential Revision: https://secure.phabricator.com/D8375
Summary:
Fixes T4414. Currently, when we discover a new repository, we do something like this:
foreach (branch) {
foreach (commit on this branch) {
do_something();
}
}
In cases where there are a lot of branches which mostly just branch `master`, this leads to us doing roughly `O(branches * commits)` work.
We have a `commitCache` to prevent this, but it has two problems:
- It only fills out of the DB, and we do this whole thing before writing to the DB, which is the entire point.
- It has a fixed size (2048) and on initial discovery we're usually dealing with way more commits than that.
Instead, also stop doing work if we hit a commit which is known already.
Test Plan:
- Added `print` on the number of discovered refs and number of unique refs.
- Ran `bin/repository discover --repair X` on a repo with several branches.
- Before the patch, got 397 refs and 135 unique refs.
- After the patch, got 135 refs and 135 unique refs.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4414
Differential Revision: https://secure.phabricator.com/D8374
Summary:
See IRC. I'm having trouble figuring out what's going on with b4taylor's report, but fix two possible issues:
# The commit query is missing a `repositoryID`, which could cause issues if you import two copies of the same repository.
# I think we may try to close commits on untracked branches right now, as long as they aren't excluded by other autoclose rules.
Test Plan: Ran `bin/repository refs` on a few repos.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, brennantaylor
Differential Revision: https://secure.phabricator.com/D8373
Summary: Ref T2222. This is obsolete and no longer used. We could deduce it from transactions or commits in modern Phabricator if we wanted it. We may implement a more general mechanism for T4434.
Test Plan: `grep`
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2222
Differential Revision: https://secure.phabricator.com/D8330
Summary:
@mbishopim3 reported an issue in IRC:
> mbishopim3: epriestley: "Error updating working copy: Commit "" has not been discovered yet! Run discovery before updating refs." any ideas?
I can't reproduce it and it went away for him, but one theory is that we're getting here and git/hg are spitting out nothing, which we incorrectly parse as `array("")` when we intend `array()`.
Test Plan:
Pushed some new commits, ran `bin/repositoy refs X`, got expected results.
I can't actually reproduce the bug, but this might fix it and appears to make the code more correct.
Reviewers: btrahan
Reviewed By: btrahan
CC: mbishopim3, aran
Differential Revision: https://secure.phabricator.com/D8326
Summary:
Ref T1191. I believe we only have three meaningful binary fields across all applications:
- The general cache may contain gzipped content.
- The file storage blob may contain arbitrary binary content.
- The Passphrase secret can store arbitrary binary data (although it currently never does).
This adds Lisk config for binary fields, and uses `%B` where necessary.
Test Plan:
- Added and executed unit tests.
- Forced file uploads to use MySQL, uploaded binaries.
- Disabled the CONFIG_BINARY on the file storage blob and tried again, got an appropraite failure.
- Tried to register with an account containing a G-Clef, and was stopped before the insert.
Reviewers: btrahan, arice
Reviewed By: arice
CC: arice, chad, aran
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D8316
Summary:
Ref T3886. Ref T418. For fields like "Summary" and "Test Plan" where changes can't be summarized in one line, allow CustomField to provide a "(Show Details)" link and render a diff.
Also consolidate some of the existing copy/paste, and simplify this featuer slightly now that we've move to dialogs.
Test Plan:
{F115918}
- Viewed "description"-style field changes in phlux, pholio, legalpad, maniphest, differential, ponder (questions), ponder (answers), and repositories.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3886, T418
Differential Revision: https://secure.phabricator.com/D8284
Summary:
Error:
Fatal error: Call to a member function getRefType() on a non-object in /opt/phabricator/phabricator/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php on line 197
Test Plan: No more error in daemon.log afterwards
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Differential Revision: https://secure.phabricator.com/D8278
Summary: Fixes T4443. Plug VCS passwords into the shared key stretching. They don't use any real stretching now (I anticipated doing something like T4443 eventually) so we can just migrate them into stretching all at once.
Test Plan:
- Viewed VCS settings.
- Used VCS password after migration.
- Set VCS password.
- Upgraded VCS password by using it.
- Used VCS password some more.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4443
Differential Revision: https://secure.phabricator.com/D8272
Summary: See <https://github.com/facebook/phabricator/pull/510>.
After the repository rework, the Create New Repository link in Repositories goes straight to creating a phabricator hosted repo(diffusion/create), rather than the chooser create/import (diffusion/new)
I updated it to point to diffusion/new the same as the New Repository link in Diffusion.
As a side note, the Repositories page could probably use the Crumbs treatment.
Reviewed by: epriestley
Summary: See <https://github.com/facebook/phabricator/issues/501>. I think the issue here is that we created a foreign stub for commit `X-1`, probably because commit `X` was created by running `svn cp y x`.
Test Plan: I'll write a separate test for this before I land it. Huge pain to test.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8133
Summary:
Ref T3102
In diffusion, add "In Any Project" to search options.
Test Plan: use it.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T3102
Differential Revision: https://secure.phabricator.com/D8113
Summary:
Ref T4175. This allows these URIs to all be valid for Git and Mercurial:
/diffusion/X/
/diffusion/X/anything.git
/diffusion/X/anything/
This mostly already works, it just needed a few tweaks.
Test Plan: Cloned git and hg working copies using HTTP and SSH.
Reviewers: btrahan, chad
Reviewed By: chad
CC: aran
Maniphest Tasks: T4175
Differential Revision: https://secure.phabricator.com/D8098
Summary:
Ref T4175.
- Add a configurable name for the clone-as directory, so you can have "Bits & Pieces" clone as "bits~n~pieces/" or simliar.
- By default, use "reasonable" heruistics to choose such a name.
- Generate a copy/pasteable clone commmand with this directory name.
Test Plan: Looked at some repositories.
Reviewers: btrahan, chad
Reviewed By: chad
CC: aran
Maniphest Tasks: T4175
Differential Revision: https://secure.phabricator.com/D8097
Summary:
Hosted repositories have muddied this distinction somewhat. In some cases, we only want to use the real remote URI, and the call is only relevant for imported repositories.
In other cases, we want the URI we'd plug into `git clone`.
Move this logic into `PhabricatorRepository` and make the distinction more clear.
Test Plan: Viewed SVN, Git, and Mercurial hosted and remote repositories, all the URIs looked reasonable.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, dctrwatson
Differential Revision: https://secure.phabricator.com/D8096
Summary:
Fixes T4357. The Mercurial commit hooks enforce dangerous change protection, but the UI doesn't actually let you configure it.
This was just an oversight (I never went back and enabled it) -- allow it to be configured in the UI.
Test Plan: Clicked "Edit Repository" on a hosted Mercurial repository, saw option to enable dangerous changes.
Reviewers: btrahan, chad
Reviewed By: chad
CC: aran
Maniphest Tasks: T4357
Differential Revision: https://secure.phabricator.com/D8108
Summary:
Ref T4338. Mercurial exits with exit code 1 and "no changes found" in stdout
when there's no changes. I've split up the `pushRepositoryToMirror` to make
the code a tad more readable.
It isn't perfect, but it works for me.
Test Plan:
pushed some changes to my hosted repo. Saw them appearing in the
mirrored repo
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T4338
Differential Revision: https://secure.phabricator.com/D8107
Summary:
I derped this up, and it passed my tests because the two URIs are the same for hosted repositories.
Match repositories against their remote URI (like `https://github.com/user/repo.git`), not their display URI (like `/diffusion/X/`).
Test Plan: i r dums
Reviewers: talshiri, btrahan
Reviewed By: talshiri
CC: aran
Differential Revision: https://secure.phabricator.com/D8082
Summary:
Moves away from ArcanistProjects:
- Adds storage for diffs to be directly associated with a repository (instead of indirectly, through arcanist projects). Not really populated yet.
- Drops `parentRevisionID`, which is obsoleted by the "Depends On" edge. This is not exposed in the UI anywhere and doesn't do anything. Resolves TODO.
Test Plan: Ran storage upgrades, browsed around, lots of `grep`.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8072
Summary: This helps us move away from arcanist projects in normal use, by allowing us to identify repositories based on the remote URI.
Test Plan: Used `repository.query` to query some repos by remote URI.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8069
Summary: Expose more options on `repository.query`. My broad goal here is to move forward on suppressing Arcanist Projects.
Test Plan: Used Conduit console to execute queries.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8068
Summary: Ref T4338. Currently, if you have several mirrors and the first one fails, we won't try the other mirrors (since we'll throw and that will take us out of the mirroring process). Instead, try each mirror even if one fails, and then throw an AggregateException with all the failures.
Test Plan:
- Ran `bin/repository mirror` normally.
- Faked an exception, ran again, got the AggregateException I expected.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4338
Differential Revision: https://secure.phabricator.com/D8067
Summary:
Ref T4338. Currently, there's no diagnostic command to execute mirroring (so I can't give users an easy command to run), and it's roughly the last piece of real logic left in the PullLocal daemon.
Separate mirroring out, and provide `bin/repository mirror`.
Test Plan:
- Ran `bin/repository mirror` to mirror a repository.
- Ran PullLocalDaemon and verified it also continued mirroring normally.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4338
Differential Revision: https://secure.phabricator.com/D8066
Summary:
I'm not sure if this is desired functionality, but we happen to need
mirroring of our repository which is not hosted by phabricator, and as far as
I can tell the mirroring code does not depend on the hosting code.
Test Plan:
Setup mirror with SSH credentials to github, pushed changes to
elsewhere hosted repository, commits got mirrored to github.
Reviewers: epriestley
Reviewed By: epriestley
CC: Korvin, epriestley, aran
Maniphest Tasks: T4338
Differential Revision: https://secure.phabricator.com/D7637
Summary: See IRC. Currently, if a fetch fails, we may repair the remote URI incorrectly, replacing it with one without any credentials. Instead, retain credentials.
Test Plan: Faked it locally, got le-boom to verify on IRC.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8055
Summary: After adding the CLOSEABLE flag, `bin/repository importing` reports CLOSEABLE commits with no information. Just don't report these, for consistency with the old behavior. Adding flags to show them might be nice at some point if we run into issues where that output would be useful.
Test Plan: Ran `bin/repositroy importing` on a repository with CLOSEABLE commits.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8024
Summary: Fixes T3238. Ref T4327. Although the instructions are fairly clear on this, it's easy to miss them. Make sure the root the user enters matches the real root.
Test Plan: Added unit tests. Used `bin/repository discover` to hit the check explicitly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3238, T4327
Differential Revision: https://secure.phabricator.com/D8020
Summary: Ref T4327. This adds a subpath/partial repository where we import only a subdirectory, and adds tests for it.
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8019
Summary:
Ref T4327. Adds additional tests:
- Property changes on the root directory (which has special handling).
- Copying a file from a previous revision (this is `svn cp svn://server.com/root/some_file@123 some_file`).
- "Multicopying" a file: this is where a file is copied to several places and moved in the same commit.
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8018
Summary:
Ref T4327. Adds additional SVN tests to cover:
- replacing a file with a file;
- replacing a file with a directory;
- removing a directory with files in it;
- adding a directory with files;
- copying a directory and adding and deleting files inside it.
Test Plan: Ran unit tests. One of these has a slightly irregular parse, see inline.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8017
Summary:
Ref T4327. Adds the same basic battery of tests to the SVN parser as the other parsers have.
cowsay {{{ THIS IS AN AMAZING DIFF }}}
Test Plan:
Ran unit tests against SVN change parsing.
bwahaha
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8016
Summary: Ref T4327. This adds a set of test cases for the Mercurial parser. These tests are nearly identical to the git cases, except that the Mercurial parser can't figure out symlinks right now.
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8014
Summary: Ref T4327. Adds some basic tests to the Git parser for a set of common operations (add, change, move, copy, directory, symlink, propchange).
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8012
Summary:
Ref T4327. There are a bunch of other probably-related tasks too, some linked there.
We have some rare/unusual bugs in the change parsers, mostly in Subversion, but it's terrifying to touch them because they're complicated and fragile and have no test coverage.
To fix this stuff, I want to make them more testable. In particular, they basically end with this big INSERT right now. Instead, I'm going to make them return objects representing the data to be inserted, then have the common infrastructure do the insert. This gives us two benefits:
- Reduced code duplication on the insert;
- we can stop before the insert and have unit tests examine the objects.
This swaps the Git parser over, but doesn't swap the hg/svn parsers yet. I'll do those separately, the SVN one looks a bit tricky.
Test Plan:
- Used `scripts/repository/reparse.php` to reparse a Git commit, with `--trace`. Verified it looked the same as before and the SQL that was executed seemed reasonable.
- Did the same for `hg` / `svn` commits, to make sure I didn't derp anything. These aren't expected to do anything differently.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7980
Summary:
Currently, we can sit in a sleep() for up to 15 seconds (or longer, in some cases), even if there's a parse request.
Try polling the DB every second to see if we should wake up instead. This might or might not produce significant improvements, but seems OK to try and see.
Also a small fix for logging branch names with `%` in them, etc.
Test Plan: Ran `phd debug pulllocal` and pushed commits, saw them parse within a second.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, dctrwatson
Differential Revision: https://secure.phabricator.com/D7998
Summary:
Ref T4327. This moves the last pieces of discovery responsibility out of the PullLocal daemon and into the DiscoveryEngine.
(This makes it easier to discover repositories in unit tests in the future, since we don't need to build a PullLocal daemon and can just build a DiscoveryEngine.)
Test Plan: Ran `phd debug pulllocal`. Ran `repostory discover`.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7997
Summary:
Ref T4327. Consolidates and simplifies infrastructure:
- Moves Git discovery into DiscoveryEngine.
- Collapses a bunch of the Git and Mercurial code related to stream discovery.
- Removes all cach code from PullLocal daemon (it's no longer called).
- Adds basic unit tests for Git and Mercurial discovery.
Various cleanup:
- Makes GitStream and MercurialStream extend a common base.
- Improves performance of MercurialStream in some cases, by requiring fewer commits be output and parsed.
- Makes mirroring exceptions easier to debug.
- Fixes discovery of Mercurial repositories with multiple branch heads.
- Adds some missing `pht()`.
Test Plan:
I tested this fairly throughly because I think this phase is complete:
- Made new repositories in multiple VCSes and did full imports.
- Particularly, I reimported Arcanist to make sure that TODO was resolved. I think it was related to the toposort stuff.
- Pushed commits to multiple VCSes.
- Pushed commits to a non-close branch, then pushed a merge commit. Observed commits import initially as non-close, then get flagged for close.
- Started full daemons and resolved various minor issues that showed up in the daemon log until everything ran cleanly.
- Basically spent about 30 minutes banging on this in every way I could think of to try to break it. I found and fixed some minor stuff, but it seems solid.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7987
Summary:
Ref T4327. Simplify the git discovery process so I can move it to the DiscoveryEngine, so I can make change parsing testable.
In particular:
- As an optimization, we process closeable branches ("master") first, then process uncloseable branches ("epriestley-devel"). This means that in the common case we can insert a commit as closeable immediately when it is discovered, the first pass through the pipeline will get it right, and the "ref update" step will never need to do any meaningful work.
- Commits which do not initially appear on a closeable branch, but later move to one (via merges or ref moves) will now be caught in the ref update step, have the closeable flag set, and have a message step re-queued.
- We no longer need to do a separate discovery step on closable branches.
- We no longer need to keep track of `seenOnBranches`.
Test Plan: Ran discovery on repositories after pushing commits, got reasonable results.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7985
Summary:
Ref T4327. This provides a more general RefCursor-based way to identify closeable commits, and moves away from the messy `seenOnBranches` stuff. Basically:
- When a closeable ref (like the branch "master") is updated, query the VCS for all commits which are ancestors of the new ref, but not ancestors of the existing closeable heads we previously knew about. This is basically all the commits which have been merged or moved onto the closeable ref.
- Take these commits and set the "closeable" flag on the ones which don't have it yet, then queue new tasks to reprocess them.
I haven't removed the old stuff yet, but will do that shortly.
Test Plan:
- Ran `bin/repository discover` and `bin/repository refs` on a bunch of different VCSes and VCS states. The effects seemed correct (e.g., new commits were marked as closeable).
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7984