Summary:
Ref T13000. This adds support for tracking "common" ngrams, which occur in too many documents to be useful as part of the ngram index.
If an ngram is listed in the "common" table, it won't be written when indexing documents, or queried for when searching for them.
In this change, nothing actually writes to the "common" table. I'll start writing to the table in a followup change.
Specifically, I plan to do this:
- A new GC process updates the "common" table periodically, by writing ngrams which appear in more than X% of documents to it, for some value of X, if there are at least a minimum number of documents (maybe like 4,000).
- A new GC process deletes ngrams that have been added to the common table from the existing indexes.
Hopefully, this will pare down the ngrams index to something reasonable over time without requiring any manual tuning.
Test Plan:
- Ran some queries and indexes.
- Manually inserted ngrams `xxx` and `yyy` into the ngrams table, searched and indexed, saw them ignored as viable ngrams for search/index.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13000
Differential Revision: https://secure.phabricator.com/D18672
Summary:
Ref T12987. I was focused on the RefCursor table and overlooked that we need some care on this key.
It's currently possible to run `bin/storage upgrade --no-adjust`, then start Phabricator, and end up with duplicate records in this table. If you try to run `bin/storage adjust` later, it will try to add the unique key but fail. This is unusual for normal installs (they usually do not use `--no-adjust`) but we do it in the cluster and I did this exact thing on `secure`.
Normally, to avoid this, when a new table with a unique key is introduced, we also add a migration to explicitly add that key.
This is mostly harmless in this case. Fix this mistake (force the table to contain only unique rows; add the key) and try using `LOCK TABLES` to make this atomic. If this doesn't cause problems we can use this in similar situations in the future.
The "alter table may unlock things" warning comes from here:
https://dev.mysql.com/doc/refman/5.7/en/lock-tables.html
It seems like it's fine to issue `UNLOCK TABLES` even if you don't have any locks, so I think this script should always do the right thing now, regardless of ALTER TABLE unlocking or not unlocking tables.
Test Plan: Ran `bin/storage upgrade -f`, saw table end up in the right state. I'll also check this on `secure`, where the starting state is a little messier.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T12987
Differential Revision: https://secure.phabricator.com/D18623
Summary:
Ref T11823. This change isn't standalone, but prepares for the more involved code change by dropping obsolete columns from the RefCursor table and adding the unique key we need to prevent the ambiguous/duplicate refs issue.
This data was moved to the RefPosition table in D18612.
Test Plan: Ran storage upgrade. See next revision for more substantial testing of this change series.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T11823
Differential Revision: https://secure.phabricator.com/D18613
Summary:
Ref T11823. This populates the new RefPosition table based on the existing RefCursor table, and deletes now-duplicate rows in the RefCursor table so the next change can add a unique key.
This change is not standalone, and there need to be separate code updates. I have a rough version of that written, but this migration needs to happen first to test it.
I'll hold this whole series of changes until after the release cut and until the code is updated.
Test Plan: Ran migration, spot-checked database tables. Saw redundant rows remove and correct-looking rows populated into the new RefPosition table.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T11823
Differential Revision: https://secure.phabricator.com/D18612
Summary:
Ref T11823. Currently, we have a "RefCursor" table which stores rows like `<branch or tag name, commit it is pointing at>` with some more data.
Because Mercurial can have a single branch pointing at several different places, this table must allow multiple rows with the same branch or tag name.
Among other things, this means there isn't a single PHID which can be used to identify a branch name in a stable way. However, we have several UIs where we want to be able to do this.
Some specific examples where we run into trouble: in Mercurial, if there are 5 heads for "default", that means there are 5 phids. And currently, if someone deletes a branch, we lose the PHID for it. Instead, we'd rather retain it so the whole world doesn't break if you accidentally delete a branch and then fix it a little later.
(I'll likely hold this until the rest of the logic is fleshed out a little more in followup changes.)
Test Plan: Ran `bin/storage upgrade`, saw the table get created without warnings.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T11823
Differential Revision: https://secure.phabricator.com/D18602
Summary:
Ref T12819. This is shipping, so issue upgrade guidance to instruct installs to rebuild the index.
Also generate a new `quickstart.sql` since we haven't regenerated in a bit and there's been a large amount of table churn fairly recently.
Test Plan: Ran `bin/storage upgrade`, saw guidance notification in UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18594
Summary: Ref T12819. More ferret engine support.
Test Plan: Indexed and searched commits and repositories.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18572
Summary: Ref T12819. Support for Pholio.
Test Plan: Indexed and searched mocks.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18569
Summary: Ref T12819. Adds ferret engine support for Calendar events.
Test Plan: Indexed and queried calendar events.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18568
Summary: Ref T12819. Adds Ferret engine support.
Test Plan: Indexed and searched for documents.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18567
Summary: Ref T12819. Adds support for projects.
Test Plan: Indexed and searched for projects.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18566
Summary: Ref T12819. Mostly straightforward, with a couple of minor query modernization things.
Test Plan: Indexed and searched for posts and blogs.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18565
Summary: Ref T12819. Same deal as before, but smaller diffs after D18559.
Test Plan: Indexed and searched for packages.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18564
Summary: Ref T12819. Adds Ferret support to Passphrase.
Test Plan: Indexed credentials, searched for credentials.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18556
Summary: Ref T12819. Adds Ferret engine support to initiatives.
Test Plan: Indexed and searched for initiatives.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18555
Summary:
Ref T12819. Adds support for indexing user accounts so they appear in global fulltext results.
Also, always rank users ahead of other results.
Test Plan: Indexed users. Searched for a user, got that user.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18552
Summary: Ref T12819. Adds storage and indexing for the Ferret engine to Differential.
Test Plan: Ran `bin/search index D123 --force`, saw indexes appear in database. No UI/user impact yet.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18540
Summary:
Ref T12819. Ferret currently does substring search, but this is not the default mode users expect: when you search for the "RICO" act, you do not expect to find documents containing "apRICOt" even though "RICO" is a substring.
To support term search, index the corpus as a list of terms with puncutation removed and whitespace normalized so the engine can match against it.
Test Plan:
Ran `storage upgrade`, ran `search index`, saw sensible database results:
```
rawCorpus: This is the task description.
Hark! Whom'st'dve eaten this "food" shall surely ~perish~?? #blessed
normalCorpus: thi the task descript hark whom dve eaten food shall sure perish bless
termCorpus: This is the task description Hark Whom'st'dve eaten this food shall surely perish blessed
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18498
Summary:
Ref T12819. This addresses two issues:
- One practical issue is that right now, if you search for "dog cat", and they appear in different fields (for example, "dog" appears ONLY in the title, while "cat" appears ONLY in a comment) we won't find the document. This is somewhat rare -- usually, if "dog" appears in the title, it's also repeated in the description -- but I think clearly a bug. To attack this, start automatically creating a virtual "ALL" field with the full document text which we'll use as the primary thing we match against.
- For fields which may occur more than once -- today, only comments -- aggregate them all into one big "all of the text" row instead of writing one row per comment. This partly addresses the first point ("dog" in one comment and "cat" in a different comment won't be found) and partly makes some of the query gymnastics easier.
Test Plan:
Ran `bin/storage upgrade`, ran `bin/search index <Txxx>`, saw sensible corpus values in the database:
```
mysql> select * from maniphest_task_ffield\G
*************************** 1. row ***************************
id: 3
documentID: 1981
fieldKey: full
rawCorpus: This is the task title
This is the task description.
normalCorpus: thi the task titl
thi the task descript
*************************** 2. row ***************************
id: 4
documentID: 1981
fieldKey: titl
rawCorpus: This is the task title
normalCorpus: thi the task titl
*************************** 3. row ***************************
id: 5
documentID: 1981
fieldKey: body
rawCorpus: This is the task description.
normalCorpus: thi the task descript
3 rows in set (0.00 sec)
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819
Differential Revision: https://secure.phabricator.com/D18497
Summary:
Ref T12819. I gave this stuff a sweet code name because all the terms related to "fulltext" and "search" already mean 5 different things. It, uh, ferrets out documents for you?
I'm building this to work a lot like the existing ngram index, which seems to work pretty well. If this sticks, it will auto-resolve the join issue (in T12443) by letting us do the entire thing locally in a JOIN and thus dodge a lot of mess.
This index gets built alongside other indexes, but only shows up in the UI if you have prototypes enabled. If you do, it appears under the existing fulltext field in Maniphest. No existing functionality is affected or disrupted.
NOTE: The query engine half of this is still EXTREMELY primitive, and this probably performs worse than the existing field for now. If this doesn't show obvious signs of being awful on `secure` I'll improve that in followup changes.
Test Plan:
Indexed my tasks, ran some simple queries, got the results I wanted, even for queries "ko", "k", "v0.1".
{F5147746}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12819, T12443
Differential Revision: https://secure.phabricator.com/D18484
Summary: Just deletes the view code until I have time to better plan this out, or just not ship.
Test Plan: Visit Phame post on public logged out page, view count doesnt cause transaction fatal.
Reviewers: epriestley
Reviewed By: epriestley
Spies: Korvin
Differential Revision: https://secure.phabricator.com/D18475
Summary:
Ref T12956. After this change, individual users will no longer be able to modify builtin queries on a user-by-user basis: they will always appear at the bottom of the list, under their personal queries, and can only be managed by administrators.
To support this, clean up the old rows which could be hanging around from before: delete any personal saved queries where the saved query is a builtin query.
To ease this transition, try to pin the query we're deleting //if// the user had reordered things to put it on top.
Test Plan:
- Ran the migration, saw no changes in the UI but fewer rows.
- Went back to `master`, reordered queries to put a builtin one on top.
- Ran the migration.
- Saw that builtin one drop to the bottom (since it can't be on top anymore) but be pinned, preserving the behavior of `/maniphest/`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12956
Differential Revision: https://secure.phabricator.com/D18464
Summary:
Ref T12956. Currently, when you visit `/maniphest/` (or any other ApplicationSearch application) we execute the first query in the list by default.
In T12956, I plan to make changes so that personal queries are always first, then global/builtin queries. Without changing the "default query" rule, this will make it harder to have, for example, some custom queries in Differential but still run a global query like "Active" by default. To make this work, you'd have to save a personal copy of the "Active" query, then put it at the top.
This feels a bit cumbersome and this rule is kind of implicit and a little weird anyway. To make this work a little better as we make changes here, add an explicit pinning action, like the one we have in Project ProfileMenus.
You can now explicitly choose a query to make default.
Test Plan:
- Browsed without pinning anything, saw normal behavior.
- Pinned queries, viewed `/maniphest/`, saw a non-initial query selected by default.
- Pinned a query, deleted it, nothing exploded.
{F5098484}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12956
Differential Revision: https://secure.phabricator.com/D18422
Summary: This adds a very very basic view count to Phame, so bloggers can get some idea which posts are more popular than others. Anything more than this I think should be Facts or Google Analytics.
Test Plan: Write a new post, see post count. Reload page, post count goes up. Archive post, post count stays the same.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Differential Revision: https://secure.phabricator.com/D18446
Summary:
Ref T2543. This updates and migrates the status change transactions:
- All storage now records the modern modular transaction ("differential.revision.status"), not the obsolete non-modular transaction ("differential:status").
- All storage now records the modern constants ("accepted"), not the obsolete numeric values ("2").
Test Plan:
- Selected all the relevant rows before/after migration, data looked sane.
- Browsed around, reviewed timelines, no changes after migration.
- Changed revision states, saw appropriate new transactions in the database and timeline rendering.
- Grepped for `differential:status`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T2543
Differential Revision: https://secure.phabricator.com/D18419
Summary:
Ref T2543. Rewrites all the storage to use constants.
Note that transactions still use legacy values, I'll migrate and update them separately.
Test Plan:
- Ran migration.
- Browsed around, changed revision states, viewed dashboard, etc.
- Selected `DISTINCT()` and `GROUP_CONCAT()` of the `status` field in the database, saw sane/expected before and after values.
- Verified that old Conduit methods still return numeric constants for compatibility.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T2543
Differential Revision: https://secure.phabricator.com/D18418
Summary: Ref T2543. This migrates existing saved queries so they use the right modern values for the new tokenizer control, introduced in D18393.
Test Plan:
- Saved a query with "Abandoned" selected as the status in the old "<select />", prior to D18393.
- Upgraded to D18393, which broke the query (it no longer selected any status filter).
- Ran the migration to fix things.
- Saw the query now execute with "Abandoned" selected in the tokenizer, preseving the original behavior accurately.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T2543
Differential Revision: https://secure.phabricator.com/D18394
Summary: Fixes T12124. Changes `ManiphestEditEngine` to populate the select using priority keywords instead of the integer value. Marks `maniphest.querystatuses` as frozen. Adds a new Conduit method for fetching potential task statuses.
Test Plan: Created tasks and changed their priorities, observed that transactions in the DB still have the same type (integers as strings). Invoked `maniphest.update` with `priority => '90'` and observed that it still works. Invoked `maniphest.edit` with `priority => 'unbreak'` and observed that it now works.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T12124
Differential Revision: https://secure.phabricator.com/D18111
Summary: Builds out some images to use to identify repositories. Fixes T12825.
Test Plan:
Try setting custom, built in, and null images.
{F4998175}
{F4998192}
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T12825
Differential Revision: https://secure.phabricator.com/D18116
Summary:
See D18037. The migration there may cause us to write new file records as a side effect.
Ideally, we would rewrite that migration to not ever have this kind of side effect. However, that would make it much more complicated, and it's already very complicated.
Instead, retroactively expand the size of this field before `storage adjust` does it, so it has the right size by the time we hit the migration in D18037.
Test Plan:
@chad, can you `arc patch` this and see if it works?
It's possible that it will get us about five lines deeper and then we'll just hit another similar exception, and that this isn't really a viable way forward.
Reviewers: chad, amckinley
Reviewed By: amckinley
Subscribers: amckinley, chad
Differential Revision: https://secure.phabricator.com/D18107
Summary: Does the UI work that's part of T12234 and adds migrations for both of the old-style duplicate transactions.
Test Plan:
- Started with a clean DB.
- Checked out really old code that marks tasks as dupes using comments.
- Made a bunch of tasks and closed some as dupes. Made a bunch of additional comments.
- Checked out D10427 and did a `storage upgrade`.
- Made a bunch more new tasks and dupes.
- Snapshotted DB.
- Ran migration repeatedly until all expected edges showed up in the `phabricator_maniphest.edge`table.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T12234
Differential Revision: https://secure.phabricator.com/D18037
Summary: Fixes T12505. `PhabricatorProjectsMembershipIndexEngineExtension->materializeProject()` was incorrectly bailing early for milestone objects, which prevented milestone members from being calculated correctly. This was causing problems where (for example) an Owners package owned by a milestone wasn't being satisfied when a member of the milestone approved a revision.
Test Plan: Invoked migration, observed that a user's milestones correctly showed up when searched for. Also observed that accepting a revision on behalf of a milestone now satisfies Owners rules.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T12505
Differential Revision: https://secure.phabricator.com/D18033
Summary:
Ref T12738. This makes clicking "Throw In Trash" technically do something, sort of.
In Nuance, the default mode of operation for actions is asynchronous -- so you don't have to wait for a response from Twitter or GitHub after you mash the "send default reply tweet" / "close this pull request with a nice response" button and can move directly to the next item instead.
In the future, some operations will attempt to apply synchronously (e.g., local actions like "ignore this item forever"). This fakes our way through that for now.
There's also no connection to the action actually doing anything yet, but I'll probably rig that up next.
Test Plan: {F4975227}
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12738
Differential Revision: https://secure.phabricator.com/D18010
Summary: Ref T12738. Some of the Nuance "form" workflows currently fatal after work on the GitHub stuff. Try to make everything stop fataling, at least.
Test Plan: Using "Complaints Form" no longer fatals, and now lodges a complaint instead.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12738
Differential Revision: https://secure.phabricator.com/D18007
Summary: Also changes access modifiers on `PhabricatorProjectTransactionEditor` and sets up `storage` for `applyExternalEffects`.
Test Plan: Created new projects, attempted to create without name, with too long of a name, and with a name that conflicts with other projects and observed expected errors.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Maniphest Tasks: T12673
Differential Revision: https://secure.phabricator.com/D17947
Summary:
Fixes T12623. Adds new modular transactions to Slowvote. Also converts
the `shuffle` column to `bool` for consistency with other boolean-ish columns.
Test Plan:
Create a new vote, modified everything that could be modified from the web UI,
observed expected timeline.
Example timeline: {F4938843}
Example transaction values in DB: {F4938850}
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T12623
Differential Revision: https://secure.phabricator.com/D17830
Summary:
- Change column type from `sort128` to `sort`.
- Remove `originalName`. This column is unused. Long ago, we used it to generate a `Thread-Topic` header for mail, but just use PHIDs now (the value just needs to be stable for a given object, users normally don't see it).
Test Plan:
- Created a package with a beautifully long name. Magnificent!
- Grepped for `originalName` / `getOriginalName()`, found no Owners hits.
- Verified that there isn't any name-length validation code to remove.
{F4925637}
Reviewers: chad, amckinley
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D17798
Summary:
Depends on D17785. Fixes T12635. There was a bug where users could verify their primary email without getting the "isEmailVerified" flag set on their accounts.
D17785 fixes this bug. This change migrates affected account to fix their state, now that they can't get in trouble any more (hopefully).
Test Plan:
- Explicitly removed this flag from a bunch of accounts.
- Ran migration, saw the accounts get fixed.
- Ran migration again (`storage upgrade --apply ...`), saw the accounts not get touched.
- We have 117 affected accounts on `secure`, so I'll verify that this fixes them.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12635
Differential Revision: https://secure.phabricator.com/D17786
Summary:
The quickstart SQL hasn't been regenrated in a while, mildly impacting unit test and instance startup times.
- Use `bin/storage quickstart` to regenerate quickstart.
- Manually set the FULLTEXT tables back to `MyISAM` until we deal with T11741.
Test Plan:
- Saw database setup drop from ~10,500ms to ~7,500ms locally.
- Visually inspected diff, changes looked expected.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D17775
Summary:
Fixes T12628. After later changes to `PhabricatorFile`, this migration no longer runs if you upgrade through it to a recent `HEAD` while your data has some room images.
Since this isn't critical and has been available for ~6 months, I just nuked it as a first pass. I can find a more careful approach which lets us continue to run this migration instead if you're hesitant to skip this step, although it may be a little involved.
In 95% of cases we avoid this by updating the storage table as it existed at the time the migraiton ran, but Files are much too complicated for that to be realistic.
Test Plan: Ran `bin/storage upgrade -f --apply phabricator:20161005.conpherence.image.2.php`, saw it do nothing.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T12628
Differential Revision: https://secure.phabricator.com/D17770
Summary:
Ref T11476. This is a bit hacky, but makes `Application` extend `LiskDAO` so we can apply transactions to it with an `Editor` class.
Also fixes schema stuff so builds should produce a clean bill of health again.
This might only get you slightly further, yell if you run into more trouble.
Test Plan:
- Ran `bin/storage upgrade -f` and got no warnings.
- Browsed around, nothing exploded?
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T11476
Differential Revision: https://secure.phabricator.com/D17738
Summary: Part of the groundwork for T11476.
Test Plan: ran `./bin/storage upgrade` and observed expected DB tables
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin
Maniphest Tasks: T11476
Differential Revision: https://secure.phabricator.com/D17736
Summary:
Pathway to D17685. This column is (mostly) a denormalization of `dateModified` on the thread.
Just use a JOIN instead.
This isn't //exactly// the same: we'll bump threads to the top now for non-message changes (e.g., a topic or title change). That seems fine, but we could put a `lastMessageDate` on Thread later if we want to refine it.
Also got rid of a lot of other unused stuff. There's a big garbage TODO here, I'll fix that in the next change.
Test Plan:
- Grepped for `dateTouched`.
- Grepped for `participantCursor`.
- Grepped for `ConpherenceParticipantQuery::LIMIT`.
- Looked for callsites to `setOrder()`, found none.
- Added a message to an older thread, saw it bump up to the top.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D17731
Summary:
Pathway to D17685. This column is a very complicated cache of: is participant.messageCount equal to thread.messageCount?
We can just ask this question with a JOIN instead and simplify things dramatically.
Test Plan:
- Ran migration.
- Browsed around.
- Sent a message, saw unread count go up.
- Read the message, saw unread count go down.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D17730
Summary: Pathway to D17685. Nothing reads this field and it has no use or value.
Test Plan:
- Ran migration.
- Grepped for `behindTransactionPHID`.
Reviewers: chad
Reviewed By: chad
Differential Revision: https://secure.phabricator.com/D17729