phorge-phorge

mirror of https://we.phorge.it/source/phorge.git synced 2025-02-26 05:29:06 +01:00

Author	SHA1	Message	Date
epriestley	8a2863e3f7	Change the "can see remote address?" policy to "is administrator?" everywhere Summary: Depends on D18970. Ref T13049. Currently, the policy for viewing remote addresses is: - In activity logs: administrators. - In push and pull logs: users who can edit the corresponding repository. This sort of makes sense, but is also sort of weird. Particularly, I think it's kind of hard to understand and predict, and hard to guess that this is the behavior we implement. The actual implementation is complex, too. Instead, just use the rule "administrators can see remote addresses" consistently across all applications. This should generally be more strict than the old rule, because administrators could usually have seen everyone's address in the activity logs anyway. It's also simpler and more expected, and I don't really know of any legit use cases for the "repository editor" rule. Test Plan: Viewed pull/push/activity logs as non-admin. Saw remote addresses as an admin, and none as a non-admin. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18971	2018-01-30 15:45:23 -08:00
epriestley	75bc86589f	Add date range filtering for activity, push, and pull logs Summary: Ref T13049. This is just a general nice-to-have so you don't have to export a 300MB file if you want to check the last month of data or whatever. Test Plan: Applied filters to all three logs, got appropriate date-range result sets. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18970	2018-01-30 15:36:22 -08:00
epriestley	213eb8e93d	Define common ID and PHID export fields in SearchEngine Summary: Ref T13049. All exportable objects should always have these fields, so make them builtins. This also sets things up for extensions (like custom fields). Test Plan: Exported user data, got the same export as before. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13049 Differential Revision: https://secure.phabricator.com/D18951	2018-01-29 15:17:00 -08:00
epriestley	a79bb55f3f	Support CSV, JSON, and tab-separated text as export formats Summary: Depends on D18919. Ref T13046. Adds some simple modular exporters. Test Plan: Exported pull logs in each format. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046 Differential Revision: https://secure.phabricator.com/D18934	2018-01-26 11:16:52 -08:00
epriestley	c0b8e4784b	Add a basic, general-purpose export workflow for all objects with SearchEngine support Summary: Depends on D18918. Ref T13046. Ref T5954. Pull logs can currently be browsed in the web UI, but this isn't very powerful, especially if you have thousands of them. Allow SearchEngine implementations to define exportable fields so that users can "Use Results > Export Data" on any query. In particular, they can use this workflow to download a file with pull logs. In the future, this can replace the existing "Export to Excel" feature in Maniphest. For now, we hard-code JSON as the only supported datatype and don't actually make any effort to format the data properly, but this leaves room to add more exporters (CSV, Excel) and data type awareness (integer casting, date formatting, etc) in the future. For sufficiently large result sets, this will probably time out. At some point, I'll make this use the job queue (like bulk editing) when the export is "large" (affects more than 1K rows?). Test Plan: Downloaded pull logs in JSON format. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046, T5954 Differential Revision: https://secure.phabricator.com/D18919	2018-01-26 11:15:59 -08:00
epriestley	e6a9db56a9	Add a basic view for repository pull logs Summary: Depends on D18912. Ref T13046. Add a UI to browse the existing pull log table. The actual log still has some significant flaws, but get the basics working. Test Plan: {F5391909} Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13046 Differential Revision: https://secure.phabricator.com/D18914	2018-01-23 14:10:10 -08:00
epriestley	6a62797056	Fix some issues with Diffusion file data limits Summary: See <https://discourse.phabricator-community.org/t/files-created-from-repository-contents-slightly-over-one-chunk-in-size-are-truncated-to-exactly-one-chunk-in-size/988/1>. Three issues here: - When we finish reading `git cat-file ...` or whatever, we can end up with more than one chunk worth of bytes left in the internal buffer if the read is fast. Use `while` instead of `if` to make sure we write the whole buffer. - Limiting output with `setStdoutSizeLimit()` isn't really a reliable way to limit the size if we're also reading from the buffer. It's also pretty indirect and confusing. Instead, just let the `FileUploadSource` explicitly implement a byte limit in a straightforward way. - We weren't setting the time limit correctly on the main path. Overall, this could cause >4MB files to "write" as 4MB files, with the rest of the file left in the UploadSource buffer. Since these files were technically under the limit, they could return as valid. This was intermittent. Test Plan: - Pushed a ~4.2MB file. - Reloaded Diffusion a bunch, sometimes saw the `while/if` buffer race and produce a 4MB file with a prompt to download it. (Other times, the buffer worked right and the page just says "this file is too big, sorry"). - Applied patches. - Reloaded Diffusion a bunch, no longer saw bad behavior or truncated files. Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D18885	2018-01-22 11:52:37 -08:00
epriestley	a7921a4448	Filter and reject "--config" and "--debugger" flags to Mercurial in any position Summary: Ref T13012. These flags can be exploited by attackers to execute code remotely. See T13012 for discussion and context. Additionally, harden some Mercurial commands where possible (by using additional quoting or embedding arguments in other constructs) so they resist these flags and behave properly when passed arguments with these values. Test Plan: - Added unit tests. - Verified "--config" and "--debugger" commands are rejected. - Verified more commands now work properly even with branches and files named `--debugger`, although not all of them do. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T13012 Differential Revision: https://secure.phabricator.com/D18769	2017-11-10 08:42:07 -08:00
epriestley	157f47cd14	Rewrite CommitQuery to use UNION for performance Summary: Ref T12680. See PHI167. See that task for discussion. Rewrite `DiffusionCommitQuery` to work more like `DifferentialRevisionQuery`, and use a UNION to find "all revisions you need to audit OR respond to". I tried to get this working a little more cleanly than RevisionQuery does, and can probably simplify that now. Test Plan: Poked at the UI locally without hitting any apparent issues, but my local data is pretty garbage at this point. I'll take a look at how the query plans work on `secure`. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T12680 Differential Revision: https://secure.phabricator.com/D18722	2017-10-23 10:32:24 -07:00
epriestley	65f13b156f	Improve "refengine" performance for testing large numbers of Mercurial branches Summary: See PHI158. In the RefEngine, we test if any old branch positions have been removed from the repository. This is uncommon (but not impossible) in Mercurial, and corresponds to users deleting branches in Git. Currently, we end up running `hg log` for each position, in parallel. Because of Python's large startup overhead, this can be resource intensive for repositories with a large number of branches. We have to do this in the general case because the caller may be asking us to resolve `tip`, `newfeature`, `tip~3`, `9`, etc. However, in the specific case where the refs are 40-digit hashes, we can bulk resolve them if they exist, like this: ``` hg log ... --rev (abcd or def0 or ab12 or ...) ``` In the general case, we could probably do less of this than we currently do (instead of testing all old heads, we could prune the list by removing commits which we know are still pointed to by current heads) but that's a slightly more involved change and the effect here is already dramatic. Test Plan: Verified that CPU usage drops from ~110s -> ~0.9s: Before: ``` epriestley@orbital ~/dev/phabricator $ time ./bin/repository refs nss Updating refs in "nss"... Done. real 0m14.676s user 1m24.714s sys 0m21.645s ``` After: ``` epriestley@orbital ~/dev/phabricator $ time ./bin/repository refs nss Updating refs in "nss"... Done. real 0m0.861s user 0m0.882s sys 0m0.213s ``` - Manually resolved `blue`, `tip`, `9`, etc., got expected results. - Tried to resolve invalid hashes, got expected result (no resolution). Reviewers: amckinley Reviewed By: amckinley Differential Revision: https://secure.phabricator.com/D18717	2017-10-20 11:09:14 -07:00
Dmitri Iouchtchenko	9bd6a37055	Fix spelling Summary: Noticed a couple of typos in the docs, and then things got out of hand. Test Plan: - Stared at the words until my eyes watered and the letters began to swim on the screen. - Consulted a dictionary. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley, yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam Differential Revision: https://secure.phabricator.com/D18693	2017-10-09 10:48:04 -07:00
epriestley	8982e3e52d	Update major RefCursor callsites to work properly with RefPosition Summary: Ref T11823. This is the meaty part of the change, and updates `RefEngine` to use separate RefCursor (for names) and RefPosition (for actual commit positions) tables. I'll hold this whole series until after the release cut so it has some time to bake on `secure` to look for issues. It's also not a huge problem if there are bugs here since these tables are just caches anyway, although they do feed into some other things, and obviously it's never good to have bugs. Test Plan: - This logic can be invoked directly with `bin/repository refs <repository> --trace --verbose`. - Ran that on unchanged repositories, new branches, removed branches, and modified branches. Saw appropriate output and cursor positions. - Ran on a Mercurial repository to test the close/open logic, saw it correct the open/closed state of incorrect positions. - Browsed around Diffusion in various repositories. Reviewers: amckinley Reviewed By: amckinley Maniphest Tasks: T11823 Differential Revision: https://secure.phabricator.com/D18614	2017-09-15 10:21:32 -07:00
epriestley	2e36653965	Reduce callsites to "ArcanistDifferentialRevisionStatus" in Phabricator Summary: Ref T2543. These are currently numeric values, like "0" and "3". I want to replace them with strings, like "accepted", and move definitions from Arcanist to Phabricator. To set the stage for this, reduce the number of callsites where Phabricator invokes `ArcanistDifferentialRevisionStatus`. This is just the easy ones. I'll hold this until the release cut. Test Plan: - Called `differential.find`. - Called `differential.getrevision`. - Called `differential.query`. - Removed all reviewers from a revision, saw warning. - Abandoned the no-reviewers revision, no more warning. - Attached a revision to a task to get it to show the state icon with the status on a tooltip. - Viewed revision bucketing on dashboard. - Used `bin/search index` to reindex a revision. - Hit the "Land Revision" endpoint. I didn't explicitly test these cases: - Doorkeeper Asana integration, since setup takes a thousand years. - Disambiguation logic when multiple hashes match, since setup is also very involved. - Releeph because it's Releeph. Reviewers: chad Reviewed By: chad Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam Maniphest Tasks: T2543 Differential Revision: https://secure.phabricator.com/D18339	2017-08-09 11:04:52 -07:00
epriestley	f48f2dae9f	Move Phabricator to use PhutilBinaryAnalyzer and show binary versions Summary: Fixes T12942. - Adds binary version and path information to {nav Config > Version Information}. - Replaces old code all over the place with new consolidated code. Test Plan: {F5073531} Also faked some cases of missing binaries, bad versions, etc. Reviewers: chad Reviewed By: chad Maniphest Tasks: T12942 Differential Revision: https://secure.phabricator.com/D18306	2017-08-01 07:14:48 -07:00
epriestley	45b386596e	Make the Files "TTL" API more structured Summary: Ref T11357. When creating a file, callers can currently specify a `ttl`. However, it's ambiguous what you're supposed to pass, and some callers get it wrong. For example, to mean "this file expires in 60 minutes", you might pass either of these: - `time() + phutil_units('60 minutes in seconds')` - `phutil_units('60 minutes in seconds')` The former means "60 minutes from now". The latter means "1 AM, January 1, 1970". In practice, because the GC normally runs only once every four hours (at least, until recently), and all the bad TTLs are cases where files are normally accessed immediately, these 1970 TTLs didn't cause any real problems. Split `ttl` into `ttl.relative` and `ttl.absolute`, and make sure the values are sane. Then correct all callers, and simplify out the `time()` calls where possible to make switching to `PhabricatorTime` easier. Test Plan: - Generated an SSH keypair. - Viewed a changeset. - Viewed a raw diff. - Viewed a commit's file data. - Viewed a temporary file's details, saw expiration date and relative time. - Ran unit tests. - (Didn't really test Phragment.) Reviewers: chad Reviewed By: chad Subscribers: hach-que Maniphest Tasks: T11357 Differential Revision: https://secure.phabricator.com/D17616	2017-04-04 16:16:28 -07:00
Jakub Vrana	9f3cde4db7	Fix errors found by PHPStan Test Plan: None. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley Differential Revision: https://secure.phabricator.com/D17377	2017-02-18 09:24:56 +00:00
Jakub Vrana	a778151f28	Fix errors found by PHPStan Test Plan: Ran `phpstan analyze -a autoload.php phabricator/src`. Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: Korvin, hach-que Differential Revision: https://secure.phabricator.com/D17371	2017-02-17 10:10:15 +00:00
Josh Cox	1b8b64aae6	Stop calling the undefined `withIsTag` method Summary: This just cleans up a method call that was missed in D15986. It's been causing fatal errors in one of our workflows. Test Plan: Grep'd for other instances of `withIsTag` and didn't find any Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley, yelirekim Differential Revision: https://secure.phabricator.com/D17299	2016-12-14 14:56:40 -05:00
epriestley	4890d66795	Excluded authored commits from "Ready to Audit"; handle unreachable commits better Summary: Ref T10978. I'm inching toward cleaning up our audit state. Two issues are: - Authored commits show up in "Ready to Audit", but should not. - Unreachable commits (like that stack of unsquashed stuff) show up too, but we don't really care about them. Kick authored stuff out of the "Ready to Audit" bucket and hide unreachable commits by default, with constraints for filtering. Also give them a closed/disabled/strikethru style. Test Plan: - Viewed audit buckets. - Searched for reachable/unreachable commits. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10978 Differential Revision: https://secure.phabricator.com/D17279	2017-01-31 13:37:05 -08:00
epriestley	bcbd4035fd	Remove several pieces of audit-related code Summary: Ref T10978. This code (mostly related to the old ADD_AUDIT transaction and some to the "store English text in the database" audit reasons) is no longer reachable. Test Plan: Grepped for removed symbols: - withAuditStatus - getActionNameMap (unrelated callsites exist) - getActionName (unrelated callsites exist) - getActionPastTenseVerb - addAuditReason - getAuditReasons - auditReasonMap Also audited some commits. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10978 Differential Revision: https://secure.phabricator.com/D17267	2017-01-30 15:26:26 -08:00
epriestley	5e7a091737	Write an explicit edge for commit membership in packages Summary: Ref T10978. Currently, during commit import, we write an "Audit Not Required" auditor for commits which don't require an audit. This auditor is used to power the "Commits in this package" query in Owners. This conflates audits and commit/package membership. I think it might even predate edges. Code needs to dance around this mess and we get the wrong result in some cases, since auditors are now editable. Instead, write an explicit edge which just says "this commit is part of such-and-such packages". Then use that to run the query. Logical! I'll issue guidance on this but I'm not migrating it, since it fixes itself going forward and only really affects the UI in Owners. Test Plan: - Ran `bin/audit update-owners` with various arguments. - Viewed packages in web UI, saw them load the proper commits. - Queried by packages in Diffusion explicitly. - Clicked the "View All" link in Owners and got to the right search UI. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10978 Differential Revision: https://secure.phabricator.com/D17264	2017-01-30 15:23:34 -08:00
epriestley	97cac83e9b	Add a "Needs Verification" state to Audit Summary: Fixes T2393. This allows authors to explicitly say "I think I fixed everything, please accept my commit now thank you". Also improves behavior of "re-accept" and "re-reject" after new auditors you have authority over get added. Test Plan: - Kicked a commit back and forth between an author and auditor by alternately using "Request Verification" and "Raise Concern". - Verified it showed up properly in bucketing for both users. - Accepted, added a project, accepted again (works now; didn't before). - Audited on behalf of projects / packages. Reviewers: chad Reviewed By: chad Maniphest Tasks: T2393 Differential Revision: https://secure.phabricator.com/D17252	2017-01-25 13:08:59 -08:00
epriestley	903e37a21b	Show yellow "draft" bubble in Audit Summary: Fixes T6660. Uses the new stuff in Audit to build an EditEngine-aware icon. Test Plan: {F2364304} Reviewers: chad Reviewed By: chad Maniphest Tasks: T6660 Differential Revision: https://secure.phabricator.com/D17208	2017-01-16 10:28:59 -08:00
epriestley	a635da68d4	Provide bucketing for commits in Audit Summary: Fixes T9430. Fixes T9362. Fixes T9544. This changes the default view of Audit to work like Differential, where commits you need to audit or respond to are shown in buckets. This is a bit messy and probably needs some followups. This stuff has changed from a compatibility viewpoint: - The query works differently now (but in a better, modern way), so existing saved queries will need to be updated. - I've removed the counters from the home page instead of updating them, since they're going to get wiped out by ProfileMenu soon anyway. - When bucketed queries return too many results (more than 1,000) we now show a warning about it. This isn't greaaaat but it seems good enough for now. Test Plan: {F2351123} Reviewers: chad Reviewed By: chad Maniphest Tasks: T9430, T9362, T9544 Differential Revision: https://secure.phabricator.com/D17192	2017-01-12 12:04:05 -08:00
epriestley	5f26dd9b66	Use futures to improve clustered repository main page performance Summary: Ref T11954. In cluster configurations, we get repository information by making HTTP calls over Conduit. These are slower than local calls, so clustering imposes a performance penalty. However, we can use futures and parallelize them so that clustering actually improves overall performance. When not running in clustered mode, this just makes us run stuff inline. Test Plan: - Browsed Git, Mercurial and Subversion repositories. - Locally, saw a 700ms wall time page drop to 200ms. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11954 Differential Revision: https://secure.phabricator.com/D17009	2016-12-08 07:26:32 -08:00
epriestley	9329e6a12d	Stop doing an excessive amount of work in `diffusion.rawdiffquery` Ref T11665. Without `-n 1`, this logs the ENTIRE history of the repository. We actually get the right result, but this is egregiously slow. Add `-n 1` to return only one result. It appears that I wrote this wrong way back in 2011, in D953. This query is rarely used (until recently) which is likely why it has escaped notice for so long. Test Plan: Used Conduit console to execute `diffusion.rawdiffquery`. Got the same results but spent 8ms instead of 200ms executing this command, in a very small repository.	2016-09-20 06:00:31 -07:00
epriestley	c55de86f0e	Return Diffusion diffs through Files, not directly over Conduit Summary: Fixes T10423. Ref T11524. This changes `diffusion.rawdiffquery` to return a file PHID instead of a blob of data. This is better in general, but particularly better for huge diffs (as in T10423) and diffs with non-utf8 data (as in T11524). Test Plan: - Used `bin/differential extract` to extract a latin1 diff, got a clean diff. - Used `bin/repository reparse --herald` to rerun herald on a latin1 diff, got a clean result. - Pushed latin1 diffs to test commit hooks. - Triggered the too large / too slow logic. - Viewed latin1 diffs in Diffusion. - Used "blame past this change" in Diffusion to hit the `before` logic. Reviewers: chad Reviewed By: chad Subscribers: eadler Maniphest Tasks: T10423, T11524 Differential Revision: https://secure.phabricator.com/D16460	2016-08-27 09:11:03 -07:00
epriestley	771579496f	Make logic for streaming VCS stuff directly to Files more reusable Summary: Ref T11524. Ref T10423. Earlier, I converted `diffusion.filecontentquery` to put the actual file content in Files, then return a PHID for the file, instead of trying to send the content over Conduit. In T11524, we have a similar set of problems with diffs that contain non-UTF8 data (and, in T10423, diffs that are simply enormous). I want to provide an API method to do the same sort of thing with diff output (like from `git diff`), so we call the method, it shoves the data in Files, and then we go pull it out of Files. To support this, take the "shove the output of a Future into Files" logic and put it in a new base `FileFuture` query. This will let me make `RawDiffQuery` share the logic more easily. Test Plan: Browsed Diffusion, ran `diffusion.filecontentquery` to fetch file content. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10423, T11524 Differential Revision: https://secure.phabricator.com/D16458	2016-08-27 09:10:20 -07:00
epriestley	be235301d0	When commits have a "rewritten" hint, try to show that in handles in other applications Summary: Ref T11522. This tries to reduce the cost of rewriting a repository by making handles smarter about rewritten commits. When a handle references an unreachable commit, try to load a rewrite hint for the commit. If we find one, change the handle name to "OldHash > NewHash" to provide a strong hint that the commit was rewritten and that copy/pasting the old hash (say, to the CLI) won't work. I think this notation isn't totally self-evident, but users can click it to see the big error message on the page, and it's at least obvious that something weird is going on, which I think is the important part. Some possible future work: - Not sure this ("Recycling Symbol") is the best symbol? Seems sort of reasonable but maybe there's a better one. - Putting this information directly on the hovercard could help explain what this means. Test Plan: {F1780719} Reviewers: chad Reviewed By: chad Maniphest Tasks: T11522 Differential Revision: https://secure.phabricator.com/D16437	2016-08-24 09:35:19 -07:00
epriestley	e4c4724afd	Migrate the "badcommit" table to use the less-hacky "hint" mechanism Summary: Ref T11522. This migrates any "badcommit" data (which probably only exists at Facebook and on 1-2 other installs in the wild) to the new "hint" table. Test Plan: - Wrote some bad commit annotations to the badcommit table. - Viewed them in the web UI and used `bin/repository reparse --change ...` to reparse them. Saw "this is bad" messages. - Ran migration, verified that valid "badcommit" rows were successfully migrated to become "hint" rows. - Viewed the new web UI and re-parsed the change, saw "unreadable commit" messages. - Viewed a good commit; reparsed a good commit. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11522 Differential Revision: https://secure.phabricator.com/D16435	2016-08-24 09:32:59 -07:00
epriestley	8a4fbcd8c0	Provide a new "hint" table for weird commits (rewritten, unreadable) Summary: Ref T11522. This provides storage for tracking rewritten commits (new feature) and unreadable commits (existing feature, but really hacky). This doesn't do anything yet, just adds a table and a CLI tool for updating it. I'll document the tool once it works. You just pipe in some JSON, but I need to document the format. Test Plan: - Piped JSON for "none", "rewritten" and "unreadable" hints into `bin/repository hint`. - Examined the database to see that the table was written properly. - Tried to pipe bad JSON in, invalid hint types, etc. Got reasonable human-readable error messages. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11522 Differential Revision: https://secure.phabricator.com/D16434	2016-08-24 09:31:46 -07:00
epriestley	28eb562899	Ignore unrecognized refs in "refs/remotes/" Summary: Ref T9028. When selecting refs, pretend refs in "refs/remotes/" that we don't otherwise recognize don't exist, since it looks like these are probably remotes //of the remote// we're observing, and who knows what state they're in. Test Plan: Used `bin/repository discover --verbose` to verify that these named refs no longer appear in the list. Reviewers: chad, joshuaspence Reviewed By: joshuaspence Maniphest Tasks: T9028 Differential Revision: https://secure.phabricator.com/D16136	2016-06-16 16:03:36 -07:00
epriestley	2949905c04	Fetch and discover all Git ref types, not just branches Summary: Ref T9028. Fixes T6878. Currently, we only fetch and discover branches. This is fine 99% of the time but sometimes commits are pushed to just a tag, e.g.: ``` git checkout <some hash> nano file.c git commit -am '...' git tag wild-wild-west git push origin wild-wild-west ``` Through a similar process, commits can also be pushed to some arbitrary named ref (we do this for staging areas). With the current rules, we don't fetch tag refs and won't discover these commits. Change the rules so: - we fetch all refs; and - we discover ancestors of all refs. Autoclose rules for tags and arbitrary refs are just hard-coded for now. We might make these more flexible in the future, or we might do forks instead, or maybe we'll have to do both. Test Plan: Pushed a commit to a tag ONLY (`vegetable1`). <`cf508b8de6`> On `master`, prior to the change: - Used `update` + `refs` + `discover`. - Verified tag was not fetched with `git for-each-ref` in local working copy and the web UI. - Verified commit was not discovered using the web UI. With this patch applied: - Used `update`, saw a `refs/` fetch instead of a `refs/heads/` fetch. - Used `git for-each-ref` to verify that tag fetched. - Used `repository refs`. - Saw new tag appear in the tags list in the web UI. - Saw new refcursor appear in refcursor table. - Used `repository discover --verbose` and examined refs for sanity. - Saw commit row appear in database. - Saw commit skeleton appear in web UI. - Ran `bin/phd debug task`. - Saw commit fully parse. {F1689319} Reviewers: chad Reviewed By: chad Subscribers: avivey Maniphest Tasks: T6878, T9028 Differential Revision: https://secure.phabricator.com/D16129	2016-06-16 11:20:05 -07:00
epriestley	f5f784f4c1	Version clustered, observed repositories in a reasonable way (by largest discovered HEAD) Summary: Ref T4292. For hosted, clustered repositories we have a good way to increment the internal version of the repository: every time a user pushes something, we increment the version by 1. We don't have a great way to do this for observed/remote repositories because when we `git fetch` we might get nothing, or we might get some changes, and we can't easily tell //what// changes we got. For example, if we see that another node is at "version 97", and we do a fetch and see some changes, we don't know if we're in sync with them (i.e., also at "version 97") or ahead of them (at "version 98"). This implements a simple way to version an observed repository: - Take the head of every branch/tag. - Look them up. - Pick the biggest internal ID number. This will work //except// when branches are deleted, which could cause the version to go backward if the "biggest commit" is the one that was deleted. This should be OK, since it's rare and the effects are minor and the repository will "self-heal" on the next actual push. Test Plan: - Created an observed repository. - Ran `bin/repository update` and observed a sensible version number appear in the version table. - Pushed to the remote, did another update, saw a sensible update. - Did an update with no push, saw no effect on version number. - Toggled repository to hosted, saw the version reset. - Simulated read traffic to out-of-sync node, saw it do a remote fetch. Reviewers: chad Reviewed By: chad Maniphest Tasks: T4292 Differential Revision: https://secure.phabricator.com/D15986	2016-05-30 09:53:01 -07:00
epriestley	e81637a6c6	Fix some issues with the "Explain Why" dialog Summary: Ref T11051. This is still not as clear as it should be, but is at least working as intended now. I believe this part of the code just never worked. The test plan on D10489 didn't specifically cover it. Test Plan: Did this sort of thing in a repository: ``` $ git checkout -b featurex $ echo x >> y $ git commit -am wip $ arc diff ``` Then I simulated just pushing it (this flow is a little more involved than necessary): ``` $ arc land --hold $ git commit --amend $ # remove all metadata -- particularly, "Differential Revision"! $ git push HEAD:master ``` I got a not-great but more-useful dialog: {F1667318} Prior to this change, the hash match was incorrectly not reported at all. Reviewers: chad Reviewed By: chad Maniphest Tasks: T11051 Differential Revision: https://secure.phabricator.com/D15989	2016-05-30 09:52:35 -07:00
epriestley	3fdb1a2bc4	Improve behavior for not-yet-created non-cluster repositories Summary: Fixes T10815. We already recovered reasonably from this for cluster repositories, but not for non-cluster repositories. Test Plan: - Viewed cluster and non-cluster empty Git repository. - Viewed cluster and non-cluster empty Mercurial repository. - Viewed cluster and non-cluster empty hosted SVN repository. - Viewed cluster and non-cluster empty observed SVN repository. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10815 Differential Revision: https://secure.phabricator.com/D15878	2016-05-11 06:38:53 -07:00
epriestley	575c01373e	Extract repository command construction from Repositories Summary: Ref T4292. Ref T10366. Depends on D15751. Today, generating repository commands is purely a function of the repository, so they use protocols and credentials based on the repository configuration. For example, a repository with an SSH "remote URI" always generates SSH "remote commands". This needs to change in the future: - After T10366, repositories won't necessarily just have one type of remote URI. They can only have one at a time still, but the repository itself won't change based on which one is currently active. - For T4292, I need to generate intracluster commands, regardless of repository configuration. These will have different protocols and credentials. Prepare for these cases by separating out command construction, so they'll be able to generate commands in a more flexible way. Test Plan: - Added unit tests. - Browsed diffusion. - Ran `bin/phd debug pull` to pull a bunch of repos. - Ran daemons. Reviewers: chad Reviewed By: chad Maniphest Tasks: T4292, T10366 Differential Revision: https://secure.phabricator.com/D15752	2016-04-19 04:51:48 -07:00
epriestley	b07a524b4b	Fix resolution of commits in SVN repositories without callsigns Summary: Fixes T10721. When trying to load commits by identifier, we would take some bad pathways in Subversion if the repository had no callsign and end up missing the commits. Fix this logic so it works for either callsigns (e.g., if passed `rXyyy`) or with PHIDs if passed repositories. Test Plan: - Viewed SVN commit in a Subversion repository with no callsign. - Added a callsign, looked at it again. - Viewed non-SVN commits in callsign and non-callsign repositories. Reviewers: chad Reviewed By: chad Maniphest Tasks: T10721 Differential Revision: https://secure.phabricator.com/D15607	2016-04-04 09:44:36 -07:00
epriestley	b51a859636	Allow diffusion.filecontentquery to load data for arbitrarily large files Summary: Fixes T10186. After D14970, `diffusion.filecontentquery` puts the content in a file and returns the file PHID. However, it does this in a way that doesn't go through the chunking engine, so it will fail for files larger than the chunk threshold (generally, 8MB). Instead, stream the file from the underlying command directly into chunked storage. Test Plan: - Made a commit including a really big file: `4dcd4c492b` - Used `diffusion.filecontentquery` to load file content. - Parsed/imported commit locally. - Used `diffusion.filecontentquery` to load content for smaller files (README, etc). Reviewers: chad Reviewed By: chad Maniphest Tasks: T10186 Differential Revision: https://secure.phabricator.com/D15072	2016-01-21 09:52:43 -08:00
epriestley	a061bd2d09	Parse and display commit authorship date in Git in Diffusion Summary: Fixes T8826. Git tracks an "author date", which may be different from the "committed date". We don't currently extract/show this; do so. Test Plan: {F1059235} Reviewers: chad Reviewed By: chad Maniphest Tasks: T8826 Differential Revision: https://secure.phabricator.com/D14995	2016-01-11 09:32:37 -08:00
epriestley	d1fb2f7fb9	Make `diffusion.filecontentquery` return file PHIDs instead of raw content Summary: Fixes T9319. Proxied requests (e.g., in the cluster) for binary files (like images) currently fail because we can not return binary data over Conduit in JSON. Although Conduit will eventually support binary-safe encodings, a cleaner approach to this is just to return a `filePHID` instead of the raw content. This is generally faster and more flexible, and gives us more opportunities to add caching later. After making the call, the client pulls the file data separately. We also no longer need to return a complex data structure because we don't do blame over this call any longer. Test Plan: - Viewed images in Diffusion. - Viewed READMEs in Diffusion. - Used `bin/differential attach-commit rX Dy` to hit attach pathway. Reviewers: chad Reviewed By: chad Maniphest Tasks: T9319 Differential Revision: https://secure.phabricator.com/D14970	2016-01-08 09:29:16 -08:00
epriestley	449da36c2f	Use a path digest when building blame cache keys Keys have a maximum length of 128, and long paths could cause key lengths to exceed this. Auditors: chad	2016-01-06 19:12:57 -08:00
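
The path-digest idea in the commit above can be illustrated with a minimal sketch. This is not the Phabricator implementation: the function name `blame_cache_key`, the `blame:` key layout, and the choice of SHA-256 are all assumptions made for illustration. Only the 128-character key limit comes from the commit message.

```python
import hashlib

# Limit stated in the commit message; everything else here is hypothetical.
MAX_KEY_LENGTH = 128


def blame_cache_key(repository_id: int, commit: str, path: str) -> str:
    """Build a cache key whose length does not depend on the path length."""
    # Replace the (arbitrarily long) path with a fixed-length hex digest.
    path_digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    key = f"blame:{repository_id}:{commit}:{path_digest}"
    if len(key) > MAX_KEY_LENGTH:
        # Only reachable with an unusually long repository ID or commit string.
        raise ValueError("cache key exceeds maximum length")
    return key
```

Because the digest has a fixed length of 64 hex characters, a deeply nested path produces a key of the same length as a short one.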
Fabian Stelzer	e8d3071452	Implement a git blame cache Summary: Ref T2450. Ref T2453. Add a repository_blamecache table and cache git blame information Test Plan: View files in Diffusion with enabled blame Reviewers: fabe, chad, #blessed_reviewers Reviewed By: chad, #blessed_reviewers Subscribers: joshuaspence, epriestley Maniphest Tasks: T2453, T2450 Differential Revision: https://secure.phabricator.com/D10600	2016-01-06 18:43:30 -08:00
epriestley	0759b84d77	Improve construction of commit queries from blame lookups Summary: Ref T2450. File blame tends to have the same commit a lot of times, and we don't do lookups like this efficiently right now. In particular, for a file like `__phutil_library_map__.php`, we would issue a query with ~9,000 clauses like this: ``` (repositoryID = 1 AND commitIdentifier LIKE "XYZ%") ``` ...but only a few hundred of those identifiers were unique. Instead, issue only one clause per unique identifier. MySQL also seems to do a little better on "commitIdentifier = X" if we have the full hash, so special case that slightly. Test Plan: - Issuing a query for only unique identifiers dropped the cost from 400ms to 100ms locally. - Swapping to `=` if we have the full hash dropped the cost from 100ms to 75ms locally. Reviewers: chad Reviewed By: chad Maniphest Tasks: T2450 Differential Revision: https://secure.phabricator.com/D14962	2016-01-06 18:43:04 -08:00
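
The two optimizations described in the commit above (one clause per unique identifier, and `=` instead of `LIKE` for full hashes) can be sketched as follows. The helper name `build_commit_clauses`, the `%s` placeholder style, and the assumption that a full hash is 40 characters long are illustrative choices, not taken from the Phabricator source.

```python
# Assumed length of a full commit hash (true for SHA-1 hashes in Git).
FULL_HASH_LENGTH = 40


def build_commit_clauses(repository_id, identifiers):
    """Return (where_sql, params) with one clause per unique identifier."""
    clauses = []
    params = []
    # dict.fromkeys drops duplicates while preserving first-seen order.
    for identifier in dict.fromkeys(identifiers):
        if len(identifier) == FULL_HASH_LENGTH:
            # Full hash: an exact match is possible.
            clauses.append("(repositoryID = %s AND commitIdentifier = %s)")
            params.extend([repository_id, identifier])
        else:
            # Abbreviated hash: fall back to a prefix match.
            # Assumes identifiers are hex strings with no LIKE wildcards.
            clauses.append("(repositoryID = %s AND commitIdentifier LIKE %s)")
            params.extend([repository_id, identifier + "%"])
    return " OR ".join(clauses), params
```

For a blame result in which thousands of lines map to a few hundred commits, the number of clauses then scales with the number of distinct commits rather than with the number of lines.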
epriestley	9728c65e93	Drive blame generation through `diffusion.blame` Summary: Ref T2450. Ref T9319. This is still a bit messy, but not quite so bad as it was: instead of using a single call to get both blame information and file content, use `diffusion.blame` for blame information. This will make optimizations to both blame and file content easier. Test Plan: Viewed a bunch of blame (color on/off, blame on/off). Reviewers: chad Reviewed By: chad Maniphest Tasks: T2450, T9319 Differential Revision: https://secure.phabricator.com/D14958	2016-01-06 09:24:21 -08:00
epriestley	f561dc172d	Implement a dedicated "diffusion.blame" API method Summary: Fixes T2451. Several motivations here, from strongest to weakest: - Currently, getting blame and file content are closely entwined. This makes fixing T9319 more difficult, and I want to fix it. I want to separate blame from content so there's more flexibility in how we approach this issue. - This makes pursuing T2450 easier, if it turns out to be a meaningful win. - If we can get a win on blame performance, we can do `arc blame` eventually if we want. Test Plan: - Blamed in SVN, Git and Mercurial. Reviewers: chad Reviewed By: chad Maniphest Tasks: T2451 Differential Revision: https://secure.phabricator.com/D14957	2016-01-06 09:24:03 -08:00
epriestley	bcfd6bdd81	Move various other callsites away from callsigns Summary: Ref T4245. These mostly relate to building URIs. Test Plan: Tried to hunt down as many of these in the UI as I could. Some are a bit tricky but they should be low-risk. Reviewers: chad Reviewed By: chad Maniphest Tasks: T4245 Differential Revision: https://secure.phabricator.com/D14933	2016-01-04 06:54:42 -08:00
Aviv Eyal	724f6ddda5	return this in DiffusionCommitQuery Test Plan: chain another call after this Reviewers: #blessed_reviewers, epriestley Reviewed By: #blessed_reviewers, epriestley Subscribers: epriestley Differential Revision: https://secure.phabricator.com/D14364	2015-10-28 23:25:41 +00:00
Christopher Speck	812c41a18a	Conditionally use `hg files` vs. `hg locate` depending on version of Mercurial Summary: In Mercurial 3.2 the `locate` command was deprecated in favor of `files` command. This change updates the DiffusionLowLevelMercurialPathsQuery command to conditionally use `locate` or `files` based on the version of Mercurial used. Closes T7375 Test Plan: My test/develop Phabricator instance is set up to run Mercurial 3.5.1. The test procedure to verify valid file listings are being returned: 1. I navigated to `http://192.168.0.133/conduit/method/diffusion.querypaths/` 2. I populated the following fields: - path: `"/"` - commit: `"d721d5b57fc9ef72e47ff9d4e0c583d74a46590c"` - callsign: `"HGTEST"` 3. I submitted request and verified that result contained all files in the repository: ``` { "0": "README", "1": "alpha/beta/trifle", "2": "test/Chupacabra.cow", "3": "test/socket.ks" } ``` I repeated the above steps after setting up Mercurial 2.6.2, which I installed in the following manner: 1. I downloaded Mercurial 2.6.2 source and ran `make local` which will only compile it to work from its own directory (`/opt/mercurial-2.6.2`) 2. I linked `/usr/local/bin/hg -> /opt/mercurial-2.6.2/hg` (there's also a `/usr/bin/hg` which is a link to `/usr/local/bin/hg`) 3. I navigated to my home directory and verified that `hg --version` returns 2.6.2. 4. I restarted phabricator services (probably unnecessary). With the Multimeter application active 1. I verified that `/usr/local/bin/hg` referred to version 2.6 2. I ran the same conduit call from the conduit application 3. I verified that `http://192.168.0.133/multimeter/?type=2&group=label` incremented values for `bin.hg locate`. 4. I swapped out mercurial versions for 3.5.1 5. I ran the same conduit call from the conduit application 6. I verified that `http://192.168.0.133/multimeter/?type=2&group=label` incremented values for `bin.hg files` Reviewers: epriestley, #blessed_reviewers Reviewed By: epriestley, #blessed_reviewers Subscribers: Korvin Maniphest Tasks: T7375 Differential Revision: https://secure.phabricator.com/D14253	2015-10-12 17:50:26 -07:00
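
The version check described in the Mercurial commit above might look roughly like the sketch below. The function name `mercurial_list_command` is hypothetical, and the exact threshold used by the real change is an assumption: the commit message says only that `locate` was deprecated in favor of `files` in Mercurial 3.2.

```python
import re


def mercurial_list_command(version_output: str) -> str:
    """Pick the hg subcommand for listing files from `hg --version` output."""
    # Take the first "major.minor" pair found in the version string.
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("could not find a version number")
    version = (int(match.group(1)), int(match.group(2)))
    # Assumed cutoff: use `files` from 3.2 onward, `locate` before that.
    return "files" if version >= (3, 2) else "locate"
```

Comparing `(major, minor)` tuples rather than raw strings avoids misordering versions such as 3.10 and 3.2.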
epriestley	b2e89a9e48	Fix several error handling issues with Subversion commits in Diffusion Summary: Ref T9513. I checked this briefly but didn't do a very thorough job of it. - Don't try to query merges for Subversion, since it doesn't support them. - Fix up "existsquery" to work properly (and efficiently) for both hosted and imported repositories. - Fix up "parentsquery" to have similar behavior on invalid commits to other VCSes (throw an exception). Test Plan: - No more merges warning on SVN. - Hosted SVN gets the right exists result now. - Visiting "r23980283789287" now 404's instead of "not parsed yet". Reviewers: chad Reviewed By: chad Maniphest Tasks: T9513 Differential Revision: https://secure.phabricator.com/D14239	2015-10-05 15:57:41 -07:00

297 commits