Summary:
Ref T13511. Ferret functions currently define "aliases", and some applications override the default aliases.
This probably isn't really the right model, since it means the available function aliases in global search depend on the types of documents you're searching for. This isn't fundamentally unworkable but is kind of weird.
Regardless, these don't actually work. Searching for "description:x" is a syntax error.
Since they don't work, it's a good bet no one is relying on them. Just get rid of them until there's a clearer argument for the feature.
Test Plan: Grepped for "getFunctionMap", got no other hits. Ran some queries with the alias functions, got syntax errors.
Maniphest Tasks: T13511
Differential Revision: https://secure.phabricator.com/D21130
Summary:
Ref T13501. Depends on D21127. With the "prefix" behavior removed in D21127, we now have two virtually identical copies of the same code.
The newer one in Ferret is better: it slices utf8 correctly and is slightly more efficient on large inputs. Pull it out and make all callers call into it.
Test Plan:
- Grepped for all affected symbols.
- Ran `bin/search index --force ...` to reindex various objects (tasks, files).
- Searched for things in the UI.
Maniphest Tasks: T13501
Differential Revision: https://secure.phabricator.com/D21128
Summary:
Ref T13501. The older ngram code has some "prefix" behavior that tries to handle cases where a user issues a very short (one or two character) query.
This code doesn't work, presumably never worked, and can not be made to work (or, at least, I don't see a way, and am fairly sure one does not exist).
If the user searches for "xy", we can find trigrams in the form "xy*" using the index, but not in the form "*xy". The code makes a misguided effort to look for " xy", but this will only find "xy" in words that begin with "xy", like "xylophone".
For example, searching Files for "om" does not currently find "random.txt".
Remove this behavior. Without engaging the trigram index, these queries fall back to an unidexed "LIKE" table scan, but that's about the best we can do.
Test Plan: Searched for "om", hit "random.txt".
Maniphest Tasks: T13501
Differential Revision: https://secure.phabricator.com/D21127
Summary: Ref T13511. This function does nothing interesting and has no callers.
Test Plan: Grepped for callers.
Maniphest Tasks: T13511
Differential Revision: https://secure.phabricator.com/D21126
Summary:
Ref T13507. We currently compress normal responses, but do not compress file data responses because most files we serve are images and already compressed.
However, there are some cases where large files may be highly compressible (e.g., huge XML files stored in LFS) and we can benefit from compressing responses.
Make a reasonable guess about whether compression is beneficial and enable compression if we guess it is.
Test Plan:
- Used `curl ...` to download an image with `Accept-Encoding: gzip`. Got raw image data in the response (as expected, because we don't expect images to be worthwhile to recompress).
- Used `curl ...` to download a text file with `Accept-Encoding: gzip`. Got a compressed response. Decompressed the response into the original file.
Maniphest Tasks: T13507
Differential Revision: https://secure.phabricator.com/D21125
Summary:
See <https://hackerone.com/reports/850114>.
An attacker with administrator privileges can configure "notification.servers" to connect to internal services, either directly or with chosen parameters by selecting an attacker-controlled service and having it issue a "Location" redirect.
Generally, we allow this attack to occur. The same administrator can use an authentication provider or a VCS repository to perform the same attack, and we can't reasonably harden these workflows without breaking things that users expect to be able to do.
There's no reason this particular variation of the attack needs to be allowable, though, and the current behavior isn't consistent with how other similar things work.
- Hide the "notification.servers" configuration, which also locks it. This is similar to other modern service/server configuration.
- Don't follow redirects on these requests. Aphlict should never issue a "Location" header, so if we encounter one something is misconfigured. Declining to follow this header likely makes the issue easier to debug.
Test Plan:
- Viewed configuration in web UI.
- Configured a server that "Location: ..." redirects, got a followed redirect before and a failure afterward.
{F7365973}
Differential Revision: https://secure.phabricator.com/D21123
Summary:
Ref T13507. Now that we handle processing of "Content-Encoding: gzip" headers by default, this setup check can get a decompressed body back. Since it specifically wants a raw body back, disable this behavior.
Also, "@" a couple things which can get in the way if they fail now that error handling is more aggressive about throwing on warnings.
Test Plan: Ran setup check after other changes in T13507, got clean result.
Maniphest Tasks: T13507
Differential Revision: https://secure.phabricator.com/D21122
Summary: Ref T13507. If we believe the server can accept "Content-Encoding: gzip" requests, make the claim in an "X-Conduit-Capabilities" header in responses. Clients can use request compression on subsequent requests.
Test Plan: See D21119 for the client piece.
Maniphest Tasks: T13507
Differential Revision: https://secure.phabricator.com/D21120
Summary: Ref T13507. See that task for discussion.
Test Plan: Faked different response behaviors and hit both variations of this error.
Maniphest Tasks: T13507
Differential Revision: https://secure.phabricator.com/D21116
Summary:
See PHI1692. Currently, the Aphlict log is ridiculously verbose. As an initial pass at improving this:
- When starting in "debug" mode, pass "--debug=1" to Node.
- In Node, separate logging into "log" (lower-volume, more-important messages) and "trace" (higher-volume, less-important messages).
- Only print "trace" messages in "debug" mode.
Test Plan: Ran Aphlict in debug and non-debug modes. Behavior unchanged in debug mode, but log has more sensible verbosity in non-debug mode.
Differential Revision: https://secure.phabricator.com/D21115
Summary:
See PHI1692. Currently, it's hard to get a local profile or "--trace" of some Diffusion API methods, since they always proxy via HTTP -- even if the local node can serve the request.
This always-proxy behavior is intentional (so we always go down the same code path, to limit surprises) but inconvenient when debugging. Allow an operator to connect to a node which can serve a request and issue a `--local` call to force in-process execution.
This makes it straightforward to "--trace" or "--xprofile" the call.
Test Plan: Ran `bin/conduit call ...` with and without `--local` using a Diffusion method on a clustered repository. Without `--local`, saw proxy via HTTP. With `--local`, saw in-process execution.
Differential Revision: https://secure.phabricator.com/D21114
Summary: Ref T13509. In `title:big "red" dog`, keep "title" sticky across all three terms, since this seems like it's probably the best match for intent.
Test Plan: Added unit tests; ran unit tests.
Maniphest Tasks: T13509
Differential Revision: https://secure.phabricator.com/D21111
Summary:
Ref T13509. Since `title:- cat` is now ambiguous, forbid spaces after operators.
Also, forbid spaces inside operators, although this has no effect today.
Test Plan: Added unit tests, ran unit tests.
Maniphest Tasks: T13509
Differential Revision: https://secure.phabricator.com/D21109
Summary:
Ref T13509. Currently, functions are "sticky", but this stickness is in the query execution layer.
Instead:
- move stickiness to the query compiler; and
- make it so that functions are not sticky if their arguments are quoted.
For example:
- `title:x y` previously meant `title:x title:y` (and still does). The "title:" is sticky.
- `title:"x" y` previously meant `title:x title:y`. It now means `title:x all:y`. The "title:" is not sticky because the argument is quoted.
Test Plan: Added unit tests, ran unit tests.
Maniphest Tasks: T13509
Differential Revision: https://secure.phabricator.com/D21108
Summary: Ref T13509. Parse "xyz:-" as "xyz is absent" and "xyz:~" as "xyz is present". These are new operators which the compiler emits separately from "not" and "substring".
Test Plan: Added unit tests, ran unit tests.
Maniphest Tasks: T13509
Differential Revision: https://secure.phabricator.com/D21107
Summary:
Ref T13509. Certain query tokens like `title:=""` are currently accepted by the parser but discarded, and have no impact on the query. This isn't desirable.
Instead, require that tokens making an assertion about field content must be nonempty.
Test Plan: Added unit tests, made them pass.
Maniphest Tasks: T13509
Differential Revision: https://secure.phabricator.com/D21106
Summary: Ref T13490. This simplifies some client behavior in the general case.
Test Plan: Called API method, saw URIs.
Maniphest Tasks: T13490
Differential Revision: https://secure.phabricator.com/D21105
Summary:
See PHI1697. If a diff is not attached to a revision (for example, if it was created with "arc diff --only"), but is attached to a repository, it is supposed to be visible only to users who can see that repository.
It currently skips this extended policy check and may incorrectly be visible to too many users.
(Once a diff is attached to a revision, this rule is enforced properly via the revision policy.)
Test Plan:
- Set repository R to be visible only to Alice.
- As Alice, created a diff from a working copy of repository R with "arc diff --only".
- As Bailey, viewed the diff.
- Before: visible diff.
- After: policy exception (as expected).
Differential Revision: https://secure.phabricator.com/D21103
Summary: Ref T13490. This simplifies mostly-theoretical cases where you're accessing Phabricator via arc-over-ssh and the Conduit protocol + domain may differ from the production protocol + domain.
Test Plan: Called API via web UI, saw sensible URI values in results.
Maniphest Tasks: T13490
Differential Revision: https://secure.phabricator.com/D21102
Summary: Ref T13505. See that task for details. When a class has exactly one "@task" block, this API returns a string. Some day, this should be made more consistent.
Test Plan: Viewed a class with exactly one "@task", no more fatal. Viewed classes with zero and more than one "@task" attributes, got clean renderings.
Maniphest Tasks: T13505
Differential Revision: https://secure.phabricator.com/D21062
Summary: Ref T13505. See that task for discussion.
Test Plan: Ran `diviner generate` locally, found a page fataling on this `strlen()`, applied patch, got a sketchy but not-broken page.
Maniphest Tasks: T13505
Differential Revision: https://secure.phabricator.com/D21061
Summary: See PHI1684. Expose the published state of the "Done" checkbox to the API.
Test Plan: Made API calls on a comment in all four states, got correct published states via the API in all cases.
Differential Revision: https://secure.phabricator.com/D21059
Summary:
See <https://discourse.phabricator-community.org/t/upgrade-from-sep-30-2016/3702/>. A user performing an upgrade from 2016 to 2020 ran into an issue where this setup query is overheating.
This is likely caused by too many rows changing state during query execution, but the particulars aren't important since this setup check isn't too critical and will catch the issue eventually. It's fine to just move on if this query fails for any reason.
Test Plan: Forced the query to overheat, loaded setup issues, got overheating fatal. Applied patch, no more fatal.
Differential Revision: https://secure.phabricator.com/D21057
Summary:
See PHI1688. If many refs with a large amount of shared ancestry are deleted from a repository, we can spend much longer than necessary marking their mutual ancestors as unreachable over and over again.
For example, if refs A, B and C all point near the head of an obsolete "develop" branch and have about 1K shared commits reachable from no other refs, deleting all three refs will lead to us performing 3,000 mark-as-unreachable operations (once for each "<ref, commit>" pair).
Instead, we can stop exploring history once we reach an already-unreachable commit.
Test Plan:
- Destroyed 7 similar refs simultaneously.
- Ran `bin/repository refs`, saw 7 entries appear in the `oldref` table.
- Ran `bin/repository discover` with some debugging statements added, saw sensible-seeming behavior which didn't double-mark any newly-unreachable refs.
Differential Revision: https://secure.phabricator.com/D21056
Summary:
Depends on D21053. Ref T11968. Three things have changed:
- Overseers can no longer use FutureIterator to continue execution of an arbitrary list of futures from any state. Use FuturePool instead.
- Same with repository daemons.
- Probably (?) fix an API change in the Harbormaster exec future.
Test Plan:
- Ran "bin/phd debug task" and "bin/phd debug pull", no longer saw Future-management related errors.
- The Harbormaster future is easiest to test by just seeing if production works once this change is deployed there.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T11968
Differential Revision: https://secure.phabricator.com/D21054
Summary: Jira allows creating projects which contain number in names, phabricator will not allow such projects but it should
Test Plan: Pasted URL with Jira project which contain number in project name and it was parsed and resolved properly in phabricator
Reviewers: epriestley, Pawka, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D21040
Summary:
Ref T13493. Google returns a lower-quality account identifier ("email") and a higher-quality account identifier ("id"). We currently read only "email".
Change the logic to read both "email" and "id", so that if Google ever moves away from "email" the transition will be a bit easier.
Test Plan: Linked/unlinked a Google account, looked at the external account identifier table.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21028
Summary:
Fixes T5028. Older versions of Git (apparently, from before 2010) did not provide a way to extract the raw body of a commit message from "git log", so we approximate it with "subject" and "wrapped body".
In newer versions of Git, the raw body can be extracted exactly.
Adjust how we extract messages based on the version of Git, and try to be more faithful to edge cases: particularly, be more careful to extract the correct number of trailing newlines.
Test Plan:
- Added "var_dump()" + "die(1)" later in this method, then pushed various commit messages. Used "&& false" to force execution down the old path (either path should work in modern Git).
- Observed more faithful extraction of messages, including a more faithful extraction of the number of trailing newlines. Extraction is fully faithful if we can go down the "%B" path, which we should be able to in nearly all modern cases.
- Not all messages extract faithfully or consistently across the old and new versions, but the old extraction is destructive so this is likely about as close as we can realistically ever get.
Maniphest Tasks: T5028
Differential Revision: https://secure.phabricator.com/D21027
Summary:
Depends on D21022. Ref T13493. The JIRA API has changed from using "key" to identify users to using "accountId".
By reading both identifiers, this linkage "just works" if you run against an old version of JIRA, a new version of JIRA, or an intermediate version of JIRA.
It also "just works" if you run old JIRA, upgrade to intermediate JIRA, everyone refreshes their link at least once, then you upgrade to new JIRA.
This is a subset of cases and does not include "sudden upgrade to new JIRA", but it's strictly better than the old behavior for all cases it covers.
Test Plan: Linked, unlinked, and logged in with JIRA. Looked at the "ExternalAccountIdentifier" table and saw a sensible value.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21023
Summary: Depends on D21019. Ref T13493. There are no more barriers to removing readers and writers of "accountID"; the new "ExternalAccountIdentity" table can replace it completely.
Test Plan: Linked and unlinked OAuth accounts, logged in with OAuth accounts, tried to double-link OAuth accounts, grepped for affected symbols.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21022
Summary:
Depends on D21018. Ref T13493. Ref T6703. The "ExternalAccount" table has a unique key on `<accountType, accountDomain, accountID>` but this no longer matches our model of reality and changes in this sequence end writes to `accountID`.
Remove this key.
Then, remove all readers of `accountType` and `accountDomain` (and all nontrivial writers) because none of these callsites are well-aligned with plans in T6703.
This change has no user-facing impact today: all the rules about linking/unlinking/etc remain unchanged, because other rules currently prevent creation of more than one provider with a given "accountType".
Test Plan:
- Linked an OAuth1 account (JIRA).
- Linked an OAuth2 account (Asana).
- Used `bin/auth refresh` to cycle OAuth tokens.
- Grepped for affected symbols.
- Published an Asana update.
- Published a JIRA link.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13493, T6703
Differential Revision: https://secure.phabricator.com/D21019
Summary: Depends on D21017. Ref T13493. Update the Asana integration so it reads the "ExternalAccountIdentifier" table instead of the old "accountID" field.
Test Plan: Linked an Asana account, used `bin/feed republish` to publish activity to Asana.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21018
Summary:
Depends on D21015. When we sync an external account and get a list of account identifiers, write them to the database.
Nothing reads them yet and we still write "accountId", this just prepares us for reads.
Test Plan: Linked, refreshed, unlinked, and re-linked an external account. Peeked at the database and saw a sensible-looking row.
Differential Revision: https://secure.phabricator.com/D21016
Summary: Depends on D21014. Ref T13493. Make these objects all use destructible interfaces and destroy sub-objects appropriately.
Test Plan:
- Used `bin/remove destroy --trace ...` to destroy a provider, a user, and an external account.
- Observed destruction of sub-objects, including external account identifiers.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21015
Summary:
Depends on D21013. Ref T13493. When users log in with most providers, the provider returns an "ExternalAccount" identifier (like an Asana account GUID) and the workflow figures out where to go from there, usually a decision to try to send the user to registration (if the external account isn't linked to anything yet) or login (if it is).
In the case of password providers, the password is really a property of an existing account, so sending the user to registration never makes sense. We can bypass the "external identifier" indirection layer and just say "username -> internal account" instead of "external GUID -> internal mapping -> internal account".
Formalize this so that "AuthProvider" can generate either a "map this external account" value or a "use this internal account" value.
This stops populating "accountID" on "password" "ExternalAccount" objects, but this was only an artifact of convenience. (These records don't really need to exist at all, but there's little harm in going down the same workflow as everything else for consistency.)
Test Plan: Logged in with a username/password. Wiped the external account table and repeated the process.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21014
Summary:
Depends on D21012. Ref T13493. Currently, auth adapters return a single identifier for each external account.
Allow them to return more than one identifier, to better handle cases where an API changes from providing a lower-quality identifier to a higher-quality identifier.
On its own, this change doesn't change any user-facing behavior.
Test Plan: Linked and unlinked external accounts.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21013
Summary:
Ref T13493. This check was introduced in D4647, but the condition can never be reached in modern Phabricator because the table has a unique key on `<accountType, accountDomain, accountID>` -- so no row can ever exist with the same value for that tuple but a different ID.
(I'm not entirely sure if it was reachable in D4647 either.)
Test Plan: Used `SHOW CREATE TABLE` to look at keys on the table and reasoned that this block can never have any effect.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21012
Summary:
Depends on D21010. Ref T13493. External accounts may have multiple different unique identifiers, most often when v1 of the API makes a questionable choice (and provies a mutable, non-unique, or PII identifier) and v2 of the API uses an immutable, unique, random identifier.
Allow Phabricator to store multiple identifiers per external account.
Test Plan: Storage only, see followup changes.
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21011
Summary:
Ref T13493. The "AuthAccountView" UI element currently exposes raw account ID values, but I'm trying to make these many-to-one.
This isn't terribly useful as-is, so get rid of it. This element could use a design refresh in general.
Test Plan: Viewed the UI element in "External Accounts".
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21010
Summary:
See PHI1647, which asks for "vscode://" to be a configurable protocol on hosted Phacility instances.
I made the configuration editable in D21008, but this can reasonably just come upstream too.
Test Plan: Viewed config in Config, set my editor URI to `vscode://blahblah`.
Differential Revision: https://secure.phabricator.com/D21009
Summary:
Ref T13493. I'm updating callers to `getAccountID()` to prepare to move it to a separate table.
This callsite once supported this flow:
- External users with no accounts send mail to `bugs@`.
- This creates tasks in Maniphest.
- They're CC'd when the tasks are updated.
However, after T12237 we never actually send this mail (since their addresses are necessarily unverified).
I left this code in in case this needed to be revisited, but it hasn't been an issue. Just remove it and treat these users as undeliverable.
Test Plan: As a cursory test for nothing being horribly broken, sent some object mail.
Maniphest Tasks: T13493
Differential Revision: https://secure.phabricator.com/D21007
Summary:
Ref T13395. No library with this name loads any more, so we can't version check it.
(Ideally, the version check stuff would be more graceful when it fails now, since it's required to load "Config" after I moved it off a separate page.)
Test Plan: Loaded "Config".
Maniphest Tasks: T13395
Differential Revision: https://secure.phabricator.com/D20995
Summary: Fixes T2543. This mode has been a stable prototype for a very long time now; promote it so "--draft" can promote out of "experimental" in Arcanist.
Test Plan: See T2543 for discussion.
Maniphest Tasks: T2543
Differential Revision: https://secure.phabricator.com/D20983
Summary: Ref T13395. Moves a small amount of remaining "libphutil/" code into "phabricator/" and stops us from loading "libphutil/".
Test Plan: Browsed around; there are likely remaining issues.
Maniphest Tasks: T13395
Differential Revision: https://secure.phabricator.com/D20981
Summary: Ref T13395. Move cache classes, syntax highlighters, other markup classes, and sprite sheets to Phabricator.
Test Plan: Attempted to find any callers for any of this stuff in libphutil or Arcanist and couldn't.
Maniphest Tasks: T13395
Differential Revision: https://secure.phabricator.com/D20977
Summary:
`diffusion.branchquery` can return dictionary instead of array if some branches are filtered out.
Eg.:
```
{
"result": [
{
"shortName": "master",
"commitIdentifier": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
"refType": "branch",
"rawFields": {
"objectname": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
"objecttype": "commit",
```
might become:
```
{
"result": {
"1": {
"shortName": "master",
"commitIdentifier": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
"refType": "branch",
"rawFields": {
"objectname": "2817b0d8f79748ddfad0220c46d9b20bea34f460",
"objecttype": "commit",
```
Reproduction - find repository which has couple of branches, setup to track only some of them, execute `diffusion.branchquery` API call - result is dictionary instead of array
Test Plan: Apply patch, execution `diffusion.branchquery` call - result is no longer dictionary if it was one before
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D20973
Summary: Ref T10635. An install with large blocks of remarkup (4MB) in test details is reporting slow page rendering. This is expected, but I've mostly given up on fighting this unless I absolutely have to. Degrade the interface more aggressively.
Test Plan:
- Submitted a large block of test details in remarkup format.
- Before patch: they rendered inline.
- After patch: degraded display.
- Verified small blocks are not changed.
{F7180727}
{F7180728}
Subscribers: PHID-OPKG-gm6ozazyms6q6i22gyam
Maniphest Tasks: T10635
Differential Revision: https://secure.phabricator.com/D20970
Summary: Ref T13486. Currently, "Zarbo" sorts above "alice", but this isn't expected for a list of (mostly) human usernames.
Test Plan: Loaded a task with subscribers with mixed-case usernames.
Maniphest Tasks: T13486
Differential Revision: https://secure.phabricator.com/D20969