Summary:
To make it easier to monitor daemons, let's store their current state
(running, died, exited, or unknown) to the db. The purpose of this is to
provide more information on the daemon console about the status of daemons,
especially when they are running on multiple machines. This is mostly backend
work, with only a few frontend changes. (It is also dependent on a change
to libphutil.)
These changes will make dead or stuck daemons more obvious, and will allow
more work on the frontend to hide daemons (and logs) that have exited cleanly,
i.e. ones we don't care about any more.
Test Plan:
- run db migration, check in db that all daemons were marked as exited
- start up a daemon, check in db that it is marked as running
- open web interface, check that daemon is listed as running
- after daemon has been running for a little bit, check in db that dateModified
is being updated (indicating daemon is properly sending heartbeat)
- kill -9 daemon (but don't run bin/phd yet), and check that db still shows it
as running
- edit daemon db entry to show it as being on a different host, and backdate
dateModified field by 3 minutes, and check the web ui to show that the status
is unknown.
- change db entry to have proper host, check in web ui that daemon status is
displayed as dead. Check db to see that the status was saved.
- run bin/phd stop, and see that the formerly dead daemon is now exited.
Reviewers: epriestley, vrana
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3126
Summary:
- Include symbols in main typeahead results.
- Simplify the symbol query a bit and extend PhabricatorOffsetPagedQuery. There was some stuff around language ranking that I got rid of, I think the theory there was that mapping file extensions to languages might not work in general but I think it works fine in practice, and we have more config stuff now around guessing languages and getting the mappings right.
- Make it easier to debug the typeahead by showing the results in page format for non-ajax requests.
- When we have too many results, show only the top few of each type.
Depends on D3116, D3117
Test Plan: Used typeahead, got symbols in results. Hit endpoint with non-ajax, got useful debug view.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1569
Differential Revision: https://secure.phabricator.com/D3118
Summary:
Each query takes over 2 seconds in FBCODE.
I didn't find a way how to speed them up.
There's also no easy way how to parallelize them at least.
So AJAX is the last instance.
Test Plan: Loaded commit with one branch and no tag.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3112
Summary:
- Looks better (can probably still use some tweaks), especially search.
- Moves logout from weird footer location to main menu.
- Reactive: on tablets and phones, the menu adjusts to remain useful.
- Fixed position on desktops for future side nav changes.
- Adds an icon header thing that's currently hard-coded but will be application-driven soon.
Test Plan: Used menu on desktop, tablet, phone, logged in / logged out, toggled darkconsole. Will add some screenshots.
Reviewers: btrahan, chad
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1569
Differential Revision: https://secure.phabricator.com/D3105
Summary:
Support placeholder text for inputs. We currently don't use this because it requires JS and doesn't degrade (no JS means you have zero idea what the input is for if it isn't separately labeled) but there are some cases where intent is obvious from context (for example, the search input in the menu bar, which is fairly obvious on its own and will soon have a magnifying glass icon) and in such cases it's much prettier and saves a bunch of space over an explicit label. Add a behavior so we can add placeholders where they make sense.
This implementation is somewhat sanity-checked agianst the two jQuery placeholder implementations I was able to google:
https://github.com/danielstocks/jQuery-Placeholder/https://github.com/mathiasbynens/jquery-placeholder
Since we don't currently have any uses cases, I haven't included support for making JS access to the `value` work, for password inputs, or for dynamically altering the placeholder.
Test Plan: Played around with the placeholder in the UI example in various browsers and couldn't break it.
Reviewers: btrahan, chad
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1569
Differential Revision: https://secure.phabricator.com/D3103
Summary: For any count fact, allow a chart to be drawn. INCREDIBLY POWERFUL DATA ANALYSIS PLATFORM.
Test Plan: Drew a chart of object counts. Drew the Maniphest burn chart.
Reviewers: vrana, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3099
Summary:
- Add PhabricatorApplication. This is a general class that I have grand designs for, but used here to allow applications to provide objects for analysis by the facts appliction.
- Add FactCursors, to keep track of where iterators are.
- Make the daemon do something sort of useful.
- Add `bin/fact cursors` for showing and managing objects and cursors.
- Add some options to `bin/fact analyze`.
Test Plan:
- `bin/fact cursors`, `bin/fact cursors --reset DifferentialRevision`, `bin/fact cursors --reset X`
- `bin/fact analyze`, `bin/fact analyze --all`, `bin/fact analyze --iterator DifferentialRevision --skip-aggregates`
- `bin/phd debug fact`
Reviewers: vrana, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3098
Summary: Not totally sure about this but I think it's okay?
Test Plan: Loaded /fact/, got a more readable page.
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3090
Summary:
Some facts are aggregations of other facts. For example, we may compute how many times each macro is used in each object as a "raw fact":
Dnnn uses macro "psyduck" 6 times.
But we want to present this data in aggregate form, e.g. "order macros by popularity". We can do this at runtime and it probably won't be too awful a query, but we can also aggregate it cheaply:
Macro "psyduck" is used 3920 times across all objects.
...and then do a query like "select macros ordered by usage".
"Aggregate" facts support facts like this. The aggregate facts I've implemented are:
- Count of all objects.
- Count of objects of type X.
- Last time facts were updated.
These clearly fit the "aggregate" facts template well. I'm not 100% sure macros do. We can use this table to answer a question like "What are the most popular macros, ordered by use?" We can also use it to answer a question like "What are the most popular macros in the last 6 months?", if we build a specific fact for that. But we can't use it to answer a question like "What are the most popular macros between times X and Y?". Maybe that's important; maybe not.
This seems like a good fit for at least some types of facts.
I'll de-magic the keys a bit in the next diff.
Test Plan: Ran the engines and got some aggregated facts about other facts.
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3089
Summary:
Basic "Fact" application with some storage, part of a daemon, and a control binary.
= Goals =
The general idea is that we have various statistics we'd like to compute, like the frequency of image macros, reviewer responsiveness, task close rates, etc. Computing these on page load is expensive and messy. By building an ETL pipeline and running it in a daemon, we can precompute statistics and just pull them out of "stats" tables.
One way to do this is just to completely hard-code everything, e.g. have a daemon that runs every hour which issues a big-ass query and dumps results into a table per-fact or per fact-group. But this has a bunch of drawbacks: adding new stuff to the pipeline is a pain, various fact aggregators can't share much code, updates are slow and expensive, we can never build generic graphs on top of it, etc.
I'm hoping to build an ETL pipeline which is generic enough that we can use it for most things we're interested in without needing schema changes, and so that installs can use it also without needing schema changes, while still being specific enough that it's fast and we can build useful stuff on top of it. I'm not sure if this will actually work, but it would be cool if it does so I'm starting pretty generally and we'll see how far I get. I haven't built this exact sort of thing before so I might be way off.
I'm basing the whole thing on analyzing entire objects, not analyzing changes to objects. So each part of the pipeline is handed an object and told "analyze this", not handed a change. It pretty much deletes all the old data about that thing and then writes new data. I think this is simpler to implement and understand, and it protects us from all sorts of weird issues where we end up with some kind of garbage in the DB and have to wipe the whole thing.
= Facts =
The general idea is that we extract "facts" out of objects, and then the various view interfaces just report those facts. This change has on type of fact, a "raw fact", which is directly derived from an object. These facts are concerete and relate specifically to the object they are derived from. Some examples of such facts might be:
D123 has 9 comments.
D123 uses macro "psyduck" 15 times.
D123 adds 35 lines.
D123 has 5 files.
D123 has 1 object.
D123 has 1 object of type "DREV".
D123 was created at epoch timestamp 89812351235.
D123 was accepted by @alincoln at epoch timestamp 8397981839.
The fact storage looks like this:
<factType, objectPHID, objectA, valueX, valueY, epoch>
Currently, we supprot one optional secondary key (like a user PHID or macro PHID), two optional integer values, and an optional timestamp. We might add more later. Each fact type can use these fields if it wants. Some facts use them, others don't. For instance, this diff adds a "N:*" fact, which is just the count of total objects in the system. These facts just look like:
<"N:*", "PHID-xxxx-yyyy", ...>
...where all other fields are ignored. But some of the more complex facts might look like:
<"DREV:accept", "PHID-DREV-xxxx", "PHID-USER-yyyy", ..., ..., nnnn> # User 'yyyy' accepted at epoch 'nnnn'.
<"FILE:macro", "PHID-DREV-xxxx", "PHID-MACR-yyyy", 17, ..., ...> # Object 'xxxx' uses macro 'yyyy' 17 times.
Facts have no uniqueness constraints. For @vrana's reviewer responsiveness stuff, we can insert multiple rows for each reviewer, e.g.
<"DREV:reviewed", "PHID-DREV-xxxx", "PHID-USER-yyyy", nnnn, ..., mmmm> # User 'yyyy' reviewed revision 'xxxx' after 'nnnn' seconds at 'mmmm'.
The second value (valueY) is mostly because we need it if we sample anything (valueX = observed value, valueY = sample rate) but there might be other uses. We might need to add "objectB" at some point too -- currently we can't represent a fact like "User X used macro Y on revision Z", so it would be impossible to compute macro use rates //for a specific user// based on this schema. I think we can start here though and see how far we get.
= Aggregated Facts =
These aren't implemented yet, but the idea is that we can then take the "raw facts" and compute derived/aggregated/rollup facts based on the raw fact table. For example, the "count" fact can be aggregated to arrive at a count of all objects in the system. This stuff will live in a separate table which does have uniqueness constraints, and come in the next diff.
We might need some kind of time series facts too, not sure about that. I think most of our use cases today are covered by raw facts + aggregated facts.
Test Plan: Ran `bin/fact` commands and verified they seemed to do reasonable things.
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran, majak
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3078
Summary: This iterator processes objects that have been updated.
Test Plan:
Ran this test script:
$cursor = null;
$table = new DifferentialRevision();
while (true) {
$iterator = new PhabricatorFactsUpdateIterator($table, $cursor);
foreach ($iterator as $new_cursor => $update) {
echo "{$new_cursor} => D".$update->getID()."\n";
$cursor = $new_cursor;
}
echo "Zzz...\n";
sleep(5);
}
Verified it iterated over every object and then stopped. Made a comment on a differenial revision, verified it iterated over the object after 15 seconds.
Reviewers: vrana, btrahan
Reviewed By: vrana
CC: aran
Maniphest Tasks: T1562
Differential Revision: https://secure.phabricator.com/D3077
Summary: Also move the other tests up so they'll trigger when this stuff is touched.
Test Plan: liberate
Reviewers: nh, btrahan, vrana
Reviewed By: nh
CC: aran
Maniphest Tasks: T1283
Differential Revision: https://secure.phabricator.com/D3074
Summary:
The final goal is to display reviewers response time on homepage.
This is a building block for it.
The algorithm is quite strict - it doesn't count simple comment as response because reviewers would be able to cheat with comments like "I'm overwhelmed right now and will review next week".
We are more liberate in Phabricator where reviewers response with comments without changing the status quite often but I'm not trying to improve response times in Phabricator so this is irrelevant.
Reviewers in Facebook changes status more often (to clean their queue) so I follow this approach.
There is currently no way to track reviewers silently added and removed in Edit Revision but it's not a big deal.
The algorithm doesn't track commandeered revision, there's a TODO for it.
Response times are put in two buckets: `$reviewed` and `$not_reviewed`.
`$reviewed` contains reviewers who took action, `$not_reviewed` contains reviewers who didn't respond on time.
I will probably compute average time from `$reviewed` and raise it for those `$not_reviewed` that are higher than this average.
The idea is to not favor reviewers who were only lucky for being in a group with someone fast.
Test Plan: Passed test.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3062
Summary: We pull "retries" and a doc link from PhabricatorEnv directly. Break these dependencies so the classes can move to libphutil.
Test Plan: Browsed site, triggered a schema exception and verified I still got the useful footer text.
Reviewers: btrahan, vrana
Reviewed By: vrana
CC: aran
Maniphest Tasks: T1283
Differential Revision: https://secure.phabricator.com/D3053
Summary: created a PhabricatorInlineCommentPreviewController so controllers in Diffusion and Differential respectively just have to handle the URI mapping and data loading like good little controllers.
Test Plan:
left inline comments on commits, deleted inline commits, submitted inline comments -- all worked well
did the same on some diffs
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1176
Differential Revision: https://secure.phabricator.com/D3034
Summary:
- Add edges for this relationship.
- Use edges to store this data.
- Migrate old data.
- Fix some warnings with generating feed stories about Aux and Edge transactions.
- Fix a task-task edge issue with "Create Subtask".
Test Plan:
- Migrated data, verified reivsions showed up.
- Attached and detached tasks to revisions and vice versa.
- Created a new revision with attached tasks.
- Created a subtask.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D3018
Summary:
blogs are collections of posts. a blog also has metadata like a name, description and "bloggers" that can edit the metadata of the blog and contribute posts.
changes include the post edit flow where bloggers can now select which blogs to publish to. also made various small tweaks throughout the UI to make things sensical and clean as the concept of blogs is introduced.
there's edges powering this stuff. bloggers <=> blogs and posts <=> blogs in particular.
Test Plan:
made blogs, deleted blogs, tried to make blogs with no bloggers. all went well.
verified ui to publish only showed up for public posts, published posts to blogs, un-published posts to blogs, re-published posts to blogs, deleted posts and verified they disappeared from blogs.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1373
Differential Revision: https://secure.phabricator.com/D3003
Summary:
- Use edges to store "X depends on Y" information in Maniphest.
- Show both "Depends On" and "Dependent Tasks".
- Migrate all the old edges.
Test Plan:
- Added some relationships, migrated, verified they were preserved.
- Added some new valid relationships, verified tasks got updated with sensible transactions and sent reasonable emails.
- Tried to add a cycle, got an ugly but effective error.
Reviewers: vrana, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1162
Differential Revision: https://secure.phabricator.com/D3006
Summary: See D3006, D3007. Make it easier to do migrations like that without holding all results in memory.
Test Plan:
Ran this code with an artificially small page size (2):
foreach (new LiskMigrationIterator(new DifferentialRevision()) as $rev) {
echo "Revision ".$rev->getID()."\n";
}
Verified each revision as loaded and processed.
Reviewers: vrana, btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1162
Differential Revision: https://secure.phabricator.com/D3008
Summary: This is a minor quality-of-life improvement to prevent D2968 from being as nasty as it is.
Test Plan: Ran unit tests; generated Differential, Maniphest and Diffusion emails and verified the bodies looked sensible.
Reviewers: btrahan, vrana
Reviewed By: vrana
CC: aran
Maniphest Tasks: T931
Differential Revision: https://secure.phabricator.com/D2986
Summary: a silly thing because I was bored
Test Plan: and I said "arc call-conduit", and there were comments
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1500
Differential Revision: https://secure.phabricator.com/D2971
Summary:
The immediate issue this addresses is T1366, adding a rendering cache to Phriction. For wiki pages with code blocks especially, rerendering them each time is expensive.
The broader issue is that out markup caches aren't very good right now. They have three major problems:
**Problem 1: the data is stored in the wrong place.** We currently store remarkup caches on objects. This means we're always loading it and passing it around even when we don't need it, can't genericize cache management code (e.g., have one simple script to drop/GC caches), need to update authoritative rows to clear caches, and can't genericize rendering code since each object is different.
To solve this, I created a dedicated cache database that I plan to move all markup caches to use.
**Problem 2: time-variant rules break when cached.** Some rules like `**bold**` are time-invariant and always produce the same output, but some rules like `{Tnnn}` and `@username` are variant and may render differently (because a task was closed or a user is on vacation). Currently, we cache the raw output, so these time-variant rules get locked at whatever values they had when they were first rendered. This is the main reason Phriction doesn't have a cache right now -- I wanted `{Tnnn}` rules to reflect open/closed tasks.
To solve this, I split markup into a "preprocessing" phase (which does all the parsing and evaluates all time-invariant rules) and a "postprocessing" phase (which evaluates time-variant rules only). The preprocessing phase is most of the expense (and, notably, includes syntax highlighting) so this is nearly as good as caching the final output. I did most of the work here in D737 / D738, but we never moved to use it in Phabricator -- we currently just do the two operations serially in all cases.
This diff splits them apart and caches the output of preprocessing only, so we benefit from caching but also get accurate time-variant rendering.
**Problem 3: cache access isn't batched/pipelined optimally.** When we're rendering a list of markup blocks, we should be able to batch datafetching better than we do. D738 helped with this (fetching is batched within a single hunk of markup) and this improves batching on cache access. We could still do better here, but this is at least a step forward.
Also fixes a bug with generating a link in the Phriction history interface ($uri gets clobbered).
I'm using PHP serialization instead of JSON serialization because Remarkup does some stuff with non-ascii characters that might not survive JSON.
Test Plan:
- Created a Phriction document and verified that previews don't go to cache (no rows appear in the cache table).
- Verified that published documents come out of cache.
- Verified that caches generate/regenerate correctly, time-variant rules render properly and old documents hit the right caches.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1366
Differential Revision: https://secure.phabricator.com/D2945
Summary:
Certain types of things we should be storing in edges (notably, Task X depends on Task Y) should always be acyclic. Allow `PhabricatorEdgeEditor` to enforce this, since we can't correctly enforce it outside of the editor without being vulnerable to races.
Each edge type can be marked acyclic. If an edge type is acyclic, we perform additional steps when writing new edges of that type:
- We acquire a global lock on the edge type before performing any reads or writes. This ensures we can't produce a cycle as a result of a race where two edits add edges which independently do not produce a cycle, but do produce a cycle when combined.
- After performing writes but before committing transactions, we load the edge graph for each acyclic type and verify that it is, in fact, acyclic. If we detect cycles, we abort the edit.
- When we're done, we release the edge type locks.
This is a relatively high-complexity change, but gives us a simple way to flag an edge type as acyclic in a robust way.
Test Plan: Ran unit tests.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1162
Differential Revision: https://secure.phabricator.com/D2940
Summary:
A later diff adds unit tests against edges, but we need real objects to connect with edges. Add some trivial objects to the Harbormaster database to compliment the similar HarbormasterScratchTable.
On its own, this does nothing interesting.
Test Plan: Built unit tests on this in a followup.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1162
Differential Revision: https://secure.phabricator.com/D2937
Summary:
accessibility covers not only a given post but also the various "published" views.
to keep the code relative clean, this diff also splits up the post list controller logic quite a bit. this also feels like good preparation for some other work around introducing "blogs" which are collections of published posts from bloggers with some fancy features around that.
Test Plan: clicked around various parts of the Phame application as a logged in user, a logged in user with no personal posts, and without any user logged in at all. various views all seemed reasonable.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1373
Differential Revision: https://secure.phabricator.com/D2898
Summary:
- See D2741.
- When EdgeEditor performs edits, emit events.
- Listen for Maniphest edge events and save the changes as transactions.
- Do all this in a reasonably generic way that won't take too much rewriting as we use edges more generally.
Test Plan: Attached and detached commits from tasks, saw reasonable-looking transactions spring into existence.
Reviewers: btrahan, davidreuss
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D2906
Summary: This is both only partially complete (supports Maniphest only) and somewhat overcomplicated (includes support for applying similar algorithms to Feed), but provides runtime aggregation of notifications.
Test Plan: {F13502}
Reviewers: btrahan, jungejason
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D2884
Summary:
Diff of diffs display changes between new versions of file.
This is bad after rebase because there can be many unrelated changes so it is hard to spot the real change.
This diff unhighlights the lines that were added or removed in rebase.
The changes are still visible (they can be sometimes relevant) but very subtle.
Test Plan:
# Add, change and delete line. Display diff.
# Add and change some lines in parent. Rebase. Display diff. Display diff of diff.
# Change and add some lines. Display diff. Display diff to first diff. Display diff to second diff.
Reviewers: epriestley, jungejason
Reviewed By: jungejason
CC: jungejason, aran, Korvin
Differential Revision: https://secure.phabricator.com/D2761
Summary:
- Allow clients to query for specific closed statuses (invalid, resolved, wontfix, etc), not just "closed" tasks.
- Rename this method to maniphest.query and deprecate maniphest.find as an alias to maniphest.query, for API consistency.
Test Plan: Ran queries for all tasks, "wontfix" tasks, closed tasks.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D2887
Summary:
This adds:
1) A new "arc:lint-postponed" diff property which stores a list of
lint names that are postponed and a finishpostponedlinters conduit
method which removes linters from this list. Postponed linters are
shown in the lint details.
2) A updatelintmessages conduit message, which adds additional lint
messages to the "arc:lint" diff property.
In combination, this provides very basic support for running
asynchronous static analysis tools. When the diff is being created,
a list of asynchronous static analysis runs can be added to the
diff's postponed linters list. As these postponed linters finish
up, then can report new lint messages back to the diff then mark
themselves as complete.
The client is currently responsible for filtering the lint messages
by things like affected lines and files.
Test Plan:
Used conduit call API to add lint messages and remove postponed
linters from a test diff.
Reviewers: epriestley, vrana, nh, jungejason
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1332
Differential Revision: https://secure.phabricator.com/D2792
Summary: Implementation is a little crazy but this seems to work as advertised.
Test Plan: Acquired locks with "lock.php". Verified they held as long as the process reamined open and released properly on kill -9, ^C, etc.
Reviewers: nh, jungejason, vrana, btrahan, Girish, edward
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1400
Differential Revision: https://secure.phabricator.com/D2864
Summary: I will also need `getRemovedLines()` so refactor this first.
Test Plan:
New test case.
Viewed uncached diff.
Verified that the only callsite of `getAddedLines()` trims lines.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D2875
Summary: show project profile image on pertinent edit page. also add a "Use Default Image" checkbox for both project and user profiles. Also added a function for projects to get the profile picture to prevent some copy + paste action.
Test Plan: set my user profile and project profile image. clicked "Use Default Image" and got the default image back.
Reviewers: epriestley, floatinglomas
Reviewed By: floatinglomas
CC: aran, Korvin
Maniphest Tasks: T1307
Differential Revision: https://secure.phabricator.com/D2852
Summary:
Improve performance of large discovery tasks in Git by using subprocess streaming, like we do for Mercurial.
Basically, we save the cost of running many `git log` commands by running one big `git log` command but only parsing as much of it as we need to. This is pretty complicated, but we more or less need it for mercurial (which has ~100ms of 'hg' overhead instead of ~5ms of 'git' overhead) so we're already committed to most of the complexity costs. The git implementation is much simpler than the hg implementation because we don't need to handle all the weird parent rules (git gives us to them easily).
Test Plan:
Before, `discover --repair` on Phabricator took 35s:
real 0m35.324s
user 0m13.364s
sys 0m21.088s
Now 7s:
real 0m7.236s
user 0m2.436s
sys 0m3.444s
Note that most of the time is spent inserting rows after discover, the actual speedup of the git discovery part is much larger (subjectively, it runs in less than a second now, from ~28 seconds before).
Also ran discover/pull on single new commits in normal cases to verify that nothing broke in the common case.
Reviewers: jungejason, nh, vrana
Reviewed By: vrana
CC: aran
Maniphest Tasks: T1401
Differential Revision: https://secure.phabricator.com/D2851
Summary:
Nothing new or exciting here yet, just moving the random scripts/repositories/ things to bin/repository. Also add `repository list`.
(Console stuff comes from D2841.)
Test Plan: Ran `repository list`, `repository pull`, `repository discover`, `repository discover --verbose`, `repository help`.
Reviewers: jungejason, vrana
Reviewed By: vrana
CC: aran
Differential Revision: https://secure.phabricator.com/D2849
Summary:
"differential.creatediff" requires a mostly-parsed diff, but there's no reason we can't make DifferentialDiffs out of raw diffs.
This mainly serves to lower the adoption barrier if getting "arc" distributed is too much of a hassle.
Test Plan: Made a diff out of a raw block of diff text.
Reviewers: ffx, vrana, btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D2751