Summary: Ref T7094. This makes the underlying class take a $user parameter, and then the worker just hands it an omnipotent user. Said underyling class is the benefactor of a small re-factor, dropping one query per-use, though the single query that now remains is policy-based so maybe its a wash or even worse. Still, gotta love one less query.
Test Plan:
a little tricky to test so some extra thought instead
basic acceptance test with `bin/repository reparse --change rValidHashHere` -- it worked!
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T7094
Differential Revision: https://secure.phabricator.com/D11653
Summary:
Ref T2783. I think this served two purposes:
- Improving performance in cases where we "know" a repository is local.
- Preventing loops.
It is now obsolete:
- After D11476, refs can almost always resolve on a fast path.
- As T2783 moves forward, we can usually no longer know when a repository is local without actually looking it up -- almost everything is allowed to run anywhere.
- The cluster behavior in D11475 now prevents loops.
Test Plan: `grep`, browsed around. This didn't really do much of anything anymore.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T2783
Differential Revision: https://secure.phabricator.com/D11477
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.
Test Plan: Eyeballed it.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9431
Summary:
See discussion in D8773. Three small adjustments which should help prevent this kind of issue:
- When queueing followup tasks, hold them on the worker until we finish the task, then queue them only if the work was successful.
- Increase the default lease time from 60 seconds to 2 hours. Although most tasks finish in far fewer than 60 seconds, the daemons are generally stable nowadays and these short leases don't serve much of a purpose. I think they also date from an era where lease expiry and failure were less clearly distinguished.
- Increase the default wait-after-failure from 60 seconds to 5 minutes. This largely dates from the MetaMTA era, where Facebook ran services with high failure rates and it was appropriate to repeatedly hammer them until things went through. In modern infrastructure, such failures are rare.
Test Plan:
- Verified that tasks queued properly after the main task was updated.
- Verified that leases default to 7200 seconds.
- Intentionally failed a task and verified default 300 second wait before retry.
- Removed all default leases shorter than 7200 seconds (there was only one).
- Checked all the wait before retry implementations for anything much shorter than 5 minutes (they all seem reasonable).
Reviewers: btrahan, sowedance
Reviewed By: sowedance
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D8774
Summary: See <https://github.com/facebook/phabricator/issues/501>. I think the issue here is that we created a foreign stub for commit `X-1`, probably because commit `X` was created by running `svn cp y x`.
Test Plan: I'll write a separate test for this before I land it. Huge pain to test.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D8133
Summary:
Ref T4327. Adds additional SVN tests to cover:
- replacing a file with a file;
- replacing a file with a directory;
- removing a directory with files in it;
- adding a directory with files;
- copying a directory and adding and deleting files inside it.
Test Plan: Ran unit tests. One of these has a slightly irregular parse, see inline.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8017
Summary:
Ref T4327. Adds the same basic battery of tests to the SVN parser as the other parsers have.
cowsay {{{ THIS IS AN AMAZING DIFF }}}
Test Plan:
Ran unit tests against SVN change parsing.
bwahaha
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8016
Summary: Ref T4327. This adds a set of test cases for the Mercurial parser. These tests are nearly identical to the git cases, except that the Mercurial parser can't figure out symlinks right now.
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8014
Summary: Ref T4327. Adds some basic tests to the Git parser for a set of common operations (add, change, move, copy, directory, symlink, propchange).
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D8012
Summary:
Ref T4327. There are a bunch of other probably-related tasks too, some linked there.
We have some rare/unusual bugs in the change parsers, mostly in Subversion, but it's terrifying to touch them because they're complicated and fragile and have no test coverage.
To fix this stuff, I want to make them more testable. In particular, they basically end with this big INSERT right now. Instead, I'm going to make them return objects representing the data to be inserted, then have the common infrastructure do the insert. This gives us two benefits:
- Reduced code duplication on the insert;
- we can stop before the insert and have unit tests examine the objects.
This swaps the Git parser over, but doesn't swap the hg/svn parsers yet. I'll do those separately, the SVN one looks a bit tricky.
Test Plan:
- Used `scripts/repository/reparse.php` to reparse a Git commit, with `--trace`. Verified it looked the same as before and the SQL that was executed seemed reasonable.
- Did the same for `hg` / `svn` commits, to make sure I didn't derp anything. These aren't expected to do anything differently.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4327
Differential Revision: https://secure.phabricator.com/D7980
Summary: Fixes T3857. Earlier work made this trivial and just left product questions, which I've answered by requiring the daemons to run on reasonable installs.
Test Plan: Ran `bin/search index` and `bin/search index --background`. Observed indexes write in the former case and tasks queue in the latter case. Commented with a unique string on a revision and searched for it a moment later, got exactly one result (that revision), verifying that reindexing works correctly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3857
Differential Revision: https://secure.phabricator.com/D7966
Summary: Ref T4195. Even though we use `svnlook` in the hook itself, I need this query elsewhere, so provide it and merge the classes into one which does the right thing.
Test Plan:
- Used `reparse.php` to reparse messages for Git, SVN and Mercurial commits, using `var_dump()` to examine the commit refs for sanity.
- Used `reparse.php` to reparse changes for an SVN commit.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4195
Differential Revision: https://secure.phabricator.com/D7800
Summary:
Ref T2230. SVN has some weird rules about path construction. Particularly, if you're missing a "/" in the remote URI right now, the change parsing step doesn't build the right paths.
Instead, build the right paths more intelligently.
Test Plan: Added and executed unit tests. Imported an SVN repo.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, jpeffer
Maniphest Tasks: T2230
Differential Revision: https://secure.phabricator.com/D7590
Summary:
See <https://github.com/facebook/phabricator/issues/425>. There are some ways that the change parsers may not reach `finishParse()`, but we now need them to in order to mark the commit imported, advance the progress bar, and eventually kick the repository out of IMPORTING status.
Take all the copy/pasted code in the parsers and move it into the parent. Specifically, this is:
- Printing a status message about starting a parse;
- checking for bad commits;
- queueing the next parse stage; and
- marking the import step complete.
Test Plan: Used `reparse.php --change` to reparse Git, SVN and Mercurial repos.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D7470
Summary:
Relocated files aren't treated as newly created files by the worker. This
can lead to the worker trying to look up information about deleted files
in the wrong location.
Test Plan: See T4030
Reviewers: #blessed_reviewers, epriestley
Reviewed By: epriestley
CC: epriestley, aran
Maniphest Tasks: T4030
Differential Revision: https://secure.phabricator.com/D7432
Summary:
Fixes T3416. Fixes T1733.
- Adds a flag to the commit table showing whether or not we have parsed it.
- The flag is set to `0` initially when the commit is discovered.
- The flag is set to `1` when the changes are parsed.
- The UI can now use the flag to distinguish between "empty commit" and "commit which we haven't imported changes for yet".
- Simplify rendering code a little bit.
- Fix an issue with the Message parser for empty commits.
- There's a key on the flag so we can do `SELECT * FROM repository_commit WHERE repositoryID = %d AND importStatus = 0 LIMIT 1` soon, to determine if a repository is fully imported or not. This will let us improve the UI (Ref T776, Ref T3217).
Test Plan:
- Ran `bin/storage upgrade -f`.
- Created an empty commit.
- Without the daemons running, ran `bin/repository pull GTEST` and `bin/repository discover GTEST`.
- Viewed web UI to get the first screenshot ("Still Importing...").
- Ran the message and change steps with `scripts/repository/reparse.php`.
- Viewed web UI to get the second screenshot ("Empty Commit").
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T776, T1733, T3416, T3217
Differential Revision: https://secure.phabricator.com/D7428
Summary: Ref T603. This swaps almost all queries against the repository table over to be policy aware.
Test Plan:
- Made an audit comment on a commit.
- Ran `save_lint.php`.
- Looked up a commit with `diffusion.getcommits`.
- Looked up lint messages with `diffusion.getlintmessages`.
- Clicked an external/submodule in Diffusion.
- Viewed main lint and repository lint in Diffusion.
- Completed and validated Owners paths in Owners.
- Executed dry runs via Herald.
- Queried for package owners with `owners.query`.
- Viewed Owners package.
- Edited Owners package.
- Viewed Owners package list.
- Executed `repository.query`.
- Viewed "Repository" tool repository list.
- Edited Arcanist project.
- Hit "Delete" on repository (this just tells you to use the CLI).
- Created a repository.
- Edited a repository.
- Ran `bin/repository list`.
- Ran `bin/search index rGTESTff45d13dffcfb3ea85b03aac8cc36251cacdf01c`
- Pushed and parsed a commit.
- Skipped all the Drydock stuff, as it it's hard to test and isn't normally reachable.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T603
Differential Revision: https://secure.phabricator.com/D7132
Summary: Ref T3377. MySQL ignores indexes if we hand it mismatched datatypes. This seems colossally dumb, but give it what it expects.
Test Plan: wat
Reviewers: wez, btrahan
Reviewed By: wez
CC: aran
Maniphest Tasks: T3377
Differential Revision: https://secure.phabricator.com/D6201
Summary: Ref T2784. This one was a wee bit complicated. Had to add PhabricatorUser and concept of initFromConduit (or not) to DiffusionRequest.
Test Plan: foreach repo, visited CALLSIGN and clicked a commit and verified they laoded correctly. Hacked code to hit NOT via Conduit and repeated tests to great success.
Reviewers: epriestley
Reviewed By: epriestley
CC: chad, aran, Korvin
Maniphest Tasks: T2784
Differential Revision: https://secure.phabricator.com/D5928
Summary: If there is a change in SVN root (perhaps properties change) then we try to list its parent (which doesn't exist) and mark the root itself as deleted.
Test Plan: Parsed SVN commit with property change of root.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D5709
Summary:
The other `true` is correct (and I think we can fix the scaling issues) but this one should be an indirect change. This prevents the branch from appearing in the history of every file.
(I misread this code and gave @vrana some bad advice originally. This is //actually// consistent with Mercurial and Git.)
Test Plan: Partial revert. I'll make this stuff testable.
Reviewers: nh, vrana
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D4742
Test Plan: HHVM currently core dumps so I didn't test it.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D4343
Summary:
The search indexing API has several problems right now:
- Always runs in-process.
- It would be nice to push this into the task queue for performance. However, the API currently passses an object all the way through (and some indexers depend on preloaded object attributes), so it can't be dumped into the task queue at any stage since we can't serialize it.
- Being able to use the task queue will also make rebuilding indexes faster.
- Instead, make the API phid-oriented.
- No uniform indexing API.
- Each "Editor" currently calls SomeCustomIndexer::indexThing(). This won't work with AbstractTransactions. The API is also just weird.
- Instead, provide a uniform API.
- No uniform CLI.
- We have `scripts/search/reindex_everything.php`, but it doesn't actually index everything. Each new document type needs to be separately added to it, leading to stuff like D3839. Third-party applications can't provide indexers.
- Instead, let indexers expose documents for indexing.
- Not application-oriented.
- All the indexers live in search/ right now, which isn't the right organization in an application-orietned view of the world.
- Instead, move indexers to applications and load them with SymbolLoader.
Test Plan:
- `bin/search index`
- Indexed one revision, one task.
- Indexed `--type TASK`, `--type DREV`, etc., for all types.
- Indexed `--all`.
- Added the word "saboteur" to a revision, task, wiki page, and question and then searched for it.
- Creating users is a pain; searched for a user after indexing.
- Creating commits is a pain; searched for a commit after indexing.
- Mocks aren't currently loadable in the result view, so their indexing is moot.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: 20after4, aran
Maniphest Tasks: T1991, T2104
Differential Revision: https://secure.phabricator.com/D4261
Summary:
Because of the way PhabricatorOwnersPackage works, the code to save package
changes when parsing commit changes was raising a few undefined index errors,
and was throwing an exception trying to call a method on null (because not
all of the phids related to the package had their handles loaded). This fix
doesn't load the missing handles, it just avoids trying to use them.
Test Plan:
./scripts/repository/reparse.php on a commit with path changes that triggered
a package change
Reviewers: epriestley, vrana
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D4236
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary:
Currently, when taskmasters complete a task it is immediately deleted. This prevents us from doing some general things, like:
- Supporting the idea of permanent failure (e.g., after N failures just stop trying).
- Showing the user how fast taskmasters are completing tasks.
- Showing the user how long tasks took to complete.
Having better visibility into this is important to Drydock, which builds on the task system. Also, generally buff debug output for task execution.
Test Plan: Ran `bin/phd debug taskmaster`. Ran `bin/phd debug garbage`. Queued some tasks via various systems.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D3852
Summary: If a change copies some file `A` to `B` and also edits `A`, we currently record this as an indirect change and don't show the edits to `A` in the diff. Instead, record these as direct changes.
Test Plan: Created two commits, one which copied `A` to `B` without modifying `A` and one which copied `A` to `B` and modified A. Viewed both commits in Diffusion. The unmodified commit did not show `A`, and the modified commit did (with the correct changes).
Reviewers: btrahan, vrana
Reviewed By: vrana
CC: champo, aran
Differential Revision: https://secure.phabricator.com/D3120
Summary:
- `kill_init.php` said "Moving 1000 files" - I hope that this is not some limit in `FileFinder`.
- [src/infrastructure/celerity] `git mv utils.php map.php; git mv api/utils.php api.php`
- Comment `phutil_libraries` in `.arcconfig` and run `arc liberate`.
NOTE: `arc diff` timed out so I'm pushing it without review.
Test Plan:
/D1234
Browsed around, especially in `applications/repository/worker/commitchangeparser` and `applications/` in general.
Auditors: epriestley
Maniphest Tasks: T1103
Summary:
we were parsing the git log output slightly incorrectly and over-exploding on spaces. we also needed to escape the path %20 stuff`.
Not sure if there's something fancy to do given folks should reparse their repos if they are impacted by this issue.
Test Plan:
made a directory with spaces and some dummy revisions. observed diffs wouldn't load and links broken.
with patch, ran scripts/reparse.php for pertinent revisions and diffs loaded and links weren't broken.
Reviewers: floatinglomas, epriestley
Reviewed By: epriestley
CC: aran, Koolvin
Maniphest Tasks: T1252
Differential Revision: https://secure.phabricator.com/D2510
Summary:
D2457 and D2459 made some changes here. This little guy just needed to be touched so it could be arc lint'd to have the proper updated include paths
Sorry to spam reviewers; Evan is in Diablo 3 somewhere. :D
Test Plan: no more fatal on test everything implemented (arc unit src/infrastructure/__tests__/)
Reviewers: epriestley, jungejason, nh, csilvers
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2474
Summary:
This continues work started at D2215.
Files moved from deleted directory were marked as Copied Here instead of Moved Here.
Test Plan: Reparsed two commits which was previously wrong, now correct.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran
Maniphest Tasks: T1114
Differential Revision: https://secure.phabricator.com/D2229
Summary: This is not perfect. Moved files are reported as deleted but I'm happy with it.
Test Plan: Reparsed two commits which was previously wrong, now semi-correct.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran
Maniphest Tasks: T1114
Differential Revision: https://secure.phabricator.com/D2215
Summary:
Currently, we use "git log" to detect the change list for all commits, but this produces no output for merge commits.
Instead, parse them as changes against the first parent (the merge destination). This produces generally sensible/expected behavior, and is consistent with what GitHub does.
We need to special-case the first commit because it doesn't have parents.
NOTE: This is a parser change so you need to run `./scripts/repository/reparse.php --all <callsign> --change` to reparse merge commits in already-imported repositories after updating.
Test Plan: Reparsed a merge commit, a non-merge commit, and the first commit in the Phabricator repository.
Reviewers: btrahan, gschmidt
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T961
Differential Revision: https://secure.phabricator.com/D1985
Summary: Last of the big final patches. Left a few debatable classes (12 out of about 400) that I'll deal with individually eventually.
Test Plan: Ran testEverythingImplemented.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T795
Differential Revision: https://secure.phabricator.com/D1881
Summary:
svn 1.7 changed their xml format slightly, they now have a
##<?xml version="1.0" encoding="UTF-8"?>## tag instead of
##<?xml version="1.0"?>##. This relaxes matching this tag.
Test Plan: ./scripts/repository/reparse.php rE521979 --change
Reviewers: epriestley, jungejason
Reviewed By: epriestley
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1866
Summary: We currently parse these as directory changes and discard them. Instead, parse them as a new "SUBMODULE" type of change.
Test Plan:
- Reparsed a commit which changes submodules and verified it parses correctly.
- Reparsed a commit which adds submodules and verified it parses correctly.
Reviewers: btrahan, kdeggelman
Reviewed By: btrahan
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1815
Summary:
enable herald commit rules to have access to auditing info.
Note that the new herald condition I added contains info for the
packages. I thought about using a simpler herald condition like
"Requires audit is true or false" and let it work together with the
existing "Affected package contains any of the package". It doesn't work
because we need the info about the package to decide if the commit
requires audit, but the herald conditions work separately.
Test Plan:
- A commit requiring auditing was detected by a herald rule that checks
the auditing status
- A commit not requiring auditing was not detected by a herald rule
which checks auditing status, but was detected by a rule which doesn't
check the auditing status
Reviewers: epriestley, nh
Reviewed By: epriestley
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1399
Summary:
For each commit, find the affected packages, and provide a way to
search by package.
Test Plan:
create commits that touch and don't touch two packages, and verify
that they display correctly in all the UI pages.
Reviewers: epriestley, blair, nh, tuomaspelkonen
Reviewed By: epriestley
CC: benmathews, aran, epriestley, btrahan, jungejason, mpodobnik, prithvi
Maniphest Tasks: T83
Differential Revision: 1208
Summary:
We need to issue all commands as $repository->junk() so we can pick up
credentials. Some of this stuff predates that change landing.
(I removed the "https" vs "svn+ssh" fallback code since it's specific to
Facebook, affected a tiny number of commits, is basically an SVN bug with UTF-8
handling and HTTP support, and doesn't make sense in the general case. The user
has the tools they need to force it via "reparse.php" if it's really an issue.)
Test Plan: Created new authenticated-remote mercurial and git repositories and
pulled/discovered them with credentials.
Reviewers: Makinde, jungejason, nh, tuomaspelkonen, aran
Reviewed By: Makinde
CC: aran, Makinde
Differential Revision: 970
Summary: Change import script plus almost all the view stuff. Still some rough
edges but this seems to mostly work. Blame is currently unsupported but I think
everything else works properly.
Test Plan:
Imported the hg repository itself. It doesn't immediately seem completely
broken. Here are some screens:
https://secure.phabricator.com/file/view/PHID-FILE-1438b71cc7c4a2eb4569/https://secure.phabricator.com/file/view/PHID-FILE-3cec4f72f39e7de2d041/https://secure.phabricator.com/file/view/PHID-FILE-2ea4883f160e8e5098f9/https://secure.phabricator.com/file/view/PHID-FILE-35f751a36ebf65399ade/
All the parsers were able to churn through it without errors.
Ran the new "reparse.php" script in various one-commit and repository modes.
Browsed/imported some git repos for good measure.
NOTE: The hg repository is only 15,000 commits and around 1,000 files.
Performance is okay but hg doesn't provide performant, native APIs to get some
data efficiently so we have to do some dumb stuff. If some of these interfaces
are cripplingly slow or whatever, let me know and we can start bundling some
Mercurial extensions with Arcanist.
Reviewers: Makinde, jungejason, nh, tuomaspelkonen, aran
Reviewed By: Makinde
CC: aran, Makinde, epriestley
Differential Revision: 960