Summary:
Ref T11741. This makes everything work if we switch to InnoDB, but never actually switches yet.
Since the default minimum word length (3) and stopword list (36 common English words) in InnoDB are generally pretty reasonable, I just didn't add any setup advice for them. I figure we're better off with simpler setup until we identify some real problem that the builtin stopwords create.
Test Plan: Swapped the `false` to `true`, ran `storage adjust`, got InnoDB fulltext indexes, searched for stuff, got default "AND" behavior.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11741
Differential Revision: https://secure.phabricator.com/D16942
Summary:
Ref T11741. On recent-enough versions of MySQL, we would prefer to use InnoDB for fulltext indexes instead of MyISAM.
Allow `bin/storage adjust` to read actual and expected table engines, and apply adjustments as necessary.
We have one existing bad table that uses the wrong engine, `metamta_applicationemail`. This change corrects that table.
Test Plan:
- Ran `bin/storage upgrade`.
- Saw the adjustment phase apply this change properly:
```
>>>[463] <query> ALTER TABLE `local_metamta`.`metamta_applicationemail` COLLATE = 'utf8mb4_bin', ENGINE = 'InnoDB'
```
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11741
Differential Revision: https://secure.phabricator.com/D16941
Summary: Depends on D16847. Ref T11044. This updates the remaining storage-related workflows from the CLI to accommodate multiple masters.
Test Plan:
- Configured multiple masters.
- Ran all `bin/storage` workflows.
- Ran `arc unit --everything`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16848
Summary:
Depends on D16115. Ref T11044. In the brave new world of multiple masters, we need to check the schemata on each master when looking for missing storage patches, keys, schema changes, etc.
This realigns all the "check out what's up with that schema" calls to work for multiple hosts, and updates the web UI to include a "Server" column and allow you to browse per-server.
This doesn't update `bin/storage`, so it breaks things on its own (and unit tests probably won't pass). I'll update that in the next change.
Test Plan: Configured local environment in cluster mode with multiple masters, saw both hosts' status reported in web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T11044
Differential Revision: https://secure.phabricator.com/D16847
Summary:
Fixes T11453. Currently, commit message summaries are limited to 80 bytes. This may only be 20-40 characters for CJK languages or langauges with Cyrillic script.
Increase storage size to 255, then truncate to the shorter of 255 bytes or 80 glyphs. This preserves the same behavior for latin languages, but is less tight for Russian, etc.
Some minor additional changes:
- Provide a way to ask "how much data fits in this column?" so we don't have to duplicate column lengths across summary checks or UI errors like "title too long".
- Remove the `text80` datatype, since no other columns use it and we have no use cases (or likely use cases) for it.
Test Plan:
- Made a commit with a Cyrillic title, saw reasonable summarization in UI:
{F1757522}
- Added and ran unit tests.
- Grepped for removed `SUMMARY_MAX_LENGTH` constant.
- Grepped for removed `text80` data type.
Reviewers: avivey, chad
Reviewed By: avivey
Subscribers: avivey
Maniphest Tasks: T11453
Differential Revision: https://secure.phabricator.com/D16385
Summary: Fixes T10259. There was no real reason to do this `ip2long()` stuff in the first place -- it's very slightly smaller, but won't work with ipv6 and the savings are miniscule.
Test Plan:
- Ran migration.
- Viewed logs in web UI.
- Pulled and pushed.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10259
Differential Revision: https://secure.phabricator.com/D15165
Summary:
Ref T10195. Distinguish between "database does not exist" and "database exists, you just don't have permission to access it".
We can't easily get this information out of INFORMATION_SCHEMA but can just `SHOW TABLES IN ...` every database that looks like it's missing and then look at the error code.
Test Plan:
- Created a user `limited` with limited access.
- Ran `bin/storage adjust`.
- Got hopefully more helpful messages about access problems, instead of "Missing" errors.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10195
Differential Revision: https://secure.phabricator.com/D15079
Summary:
Ref T10010. This implements technical groundwork for subprojects. Specifically, it implements policy rules like Phriction:
- to see a project, you must be able to see all of its parents (and the project itself).
- you can edit a project if you can edit any of its parents (or the project itself).
To facilitiate this, we load all project ancestors when querying projects so we can do the view/edit checks.
This does NOT yet implement:
- proper membership rules for these projects (up next);
- any kind of UI to let users create subprojects.
Test Plan:
- Added unit tests.
- Executed unit tests.
- Browsed Projects (no change in behavior is expected).
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T10010
Differential Revision: https://secure.phabricator.com/D14861
Summary:
Ref T9979. This uses ngrams (specifically, trigrams) to build a reasonably efficient index for substring matching. Specifically, for a package like "Example", with ID 123, we store rows like this:
```
< ex, 123>
<exa, 123>
<xam, 123>
<amp, 123>
<mpl, 123>
<ple, 123>
<le , 123>
```
When the user searches for `exam`, we join this table for packages with tokens `exa` and `xam`. MySQL can do this a lot more efficiently than it can process a `LIKE "%exam%"` query against a huge table.
When the user searches for a one-letter or two-letter string, we only search the beginnings of words. This is probably what they want, the only thing we can do quickly, and a reasonable/expected behavior for typeaheads.
Test Plan:
- Ran storage upgrades and search indexer.
- Searched for stuff with "name contains".
- Used typehaead and got sensible results.
- Searched for `aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz` and saw only 16 joins.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14846
Summary: Use `PhutilClassMaQuery` instead of `PhutilSymbolLoader`, mostly for consistency. Depends on D13588.
Test Plan: Poked around a bunch of pages.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13589
Summary:
Fixes T7422. We'll currently choose a "binary" charset with a "utf8_general_ci" collation on "sort" columns on older MySQL, which seems to be causing problems.
Choose "utf8" in this case instead.
(I attempted to simplify the logic, too, but that's the only actual change.)
Test Plan: Went back and forth with `--disable-utf8mb4` on `storage adjust`, but this is version dependent so I'm not 100% sure it's the right fix.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T7422
Differential Revision: https://secure.phabricator.com/D11928
Summary: Ref T6822.
Test Plan: Visual inspection. These methods are only called from within the `PhabricatorConfigStorageSchema` class.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11247
Summary:
Ref T1191. Currently if a developer forgot to specify a column type, `storage adjust` aborts explosively mid-stream. Instead:
- Make this a formal error with an unambiugous name/description instead of something you sort of infer by seeing "<unknown>".
- Make this error prevent generation of adjustment warnings, so we don't try to `ALTER TABLE t CHANGE COLUMN c <unknown>`, which is nonsense.
- When schemata errors exist, surface them prominiently in `storage adjust`.
Overall:
- Once `storage upgrade` runs `storage adjust` automatically (soon), this will make it relatively difficult to miss these errors.
- Letting these errors slip through no longer escalates into a more severe issue.
Test Plan:
Commented out the recent `mailKey` spec and ran `storage adjust`:
```
$ ./bin/storage adjust --force
Verifying database schemata...
Found no adjustments for schemata.
Target Error
phabricator2_phriction.phriction_document.mailKey Column Has No Specification
SCHEMATA ERRORS
The schemata have serious errors (detailed above) which the adjustment
workflow can not fix.
If you are not developing Phabricator itself, report this issue to the
upstream.
If you are developing Phabricator, these errors usually indicate that your
schema specifications do not agree with the schemata your code actually
builds.
```
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10771
Summary:
Ref T1191. Notable stuff:
- Adds `--disable-utf8mb4` to `bin/storage` to make it easier to test what things will (approximately) do on old MySQL. This isn't 100% perfect but should catch all the major stuff. It basically makes us pretend the server is an old server.
- Require utf8mb4 to dump a quickstart.
- Fix some issues with quickstart generation, notably special casing the FULLTEXT handling.
- Add an `--unsafe` flag to `bin/storage adjust` to let it truncate data to fix schemata.
- Fix some old patches which don't work if the default table charset is utf8mb4.
Test Plan:
- Dumped a quickstart.
- Loaded the quickstart with utf8mb4.
- Loaded the quickstart with `--disable-utf8mb4` (verified that we get binary columns, etc).
- Adjusted schema with `--disable-utf8mb4` (got a long adjustment with binary columns, some truncation stuff with weird edge case test data).
- Adjusted schema with `--disable-utf8mb4 --unsafe` (got truncations and clean adjust).
- Adjusted schema back without `--disable-utf8mb4` (got a long adjustment with utf8mb4 columns, some invalid data on truncated utf8).
- Adjusted schema without `--disable-utf8mb4`, but with `--unsafe` (got truncations on the invalid data).
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10757
Summary:
Ref T1191. Although I fixed some of the mutations earlier (in D10598), I missed the column mutations under old versions of MySQL. In particular, this isn't valid:
- `ALTER TABLE ... MODIFY columnName VARCHAR(64) COLLATE binary`
Issue the permitted version of this instead, which is:
- `ALTER TABLE ... MODIFY columnName VARBINARY(64)`
Also fixed an issue where a clean schema had the wrong nullability for a column in the draft table. Force it to the expected nullability.
The other trick here is around the one column with a FULLTEXT index on it, which needs a little massaging.
Test Plan:
- Forced my local install to return `false` for utf8mb4 support.
- Did a clean adjust into `binary` columns.
- Poked around, added emoji to things.
- Reverted the fake check and did a clean adjust into `utf8mb4` columns.
- Emoji survived.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: fabe, epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10627
Summary:
Ref T1191. Now that the whole database is covered, we don't need to do as much work to build expected schemata. Doing them database-by-database was helpful in converting, but is just reudndant work now.
Instead of requiring every application to build its Lisk objects, just build all Lisk objects.
I removed `harbormaster.lisk_counter` because it is unused.
It would be nice to autogenerate edge schemata, too, but that's a little trickier.
Test Plan: Database setup issues are all green.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley, hach-que
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10620
Summary:
Ref T1191. When changing the column type of an AUTO_INCREMENT column, we currently may lose the autoincrement attribute.
Instead, support it. This is a bit messy because AUTO_INCREMENT columns interact with PRIMARY KEY columns (tables may only have one AUTO_INCREMENT column, and it must be a primary key). We need to migrate in more phases to avoid this issue.
Introduce new `auto` and `auto64` types to represent autoincrement IDs.
Test Plan:
- Saw autoincrement show up correctly in web UI.
- Fixed an autoincrement issue on the XHProf storage table with `bin/storage adjust` safely.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10607
Summary:
Ref T1191. Currently, the `quickstart.sql` gets generated in a pretty manual fashion. This is a pain, and will become more of a pain in the world of utf8mb4.
Provide a workflow which does upgrade + adjust + dump + destroy, then massages the output to produce a workable `quickstart.sql`.
Test Plan: Inspected output; I'll test this more throughly before actually generating a new quickstart, but that's some ways away.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10603
Summary:
Ref T1191. For most text columns, we either don't care if "a" and "A" are the same, or we expect them to be different (for example: keys, domains, secrets, etc). Default text columns to the `_bin` collation so they are compared by strict character value. This is safer in cases where we aren't sure.
For some text columns, we allow the user to sort by the column in the UI (like Maniphest task titles) or we do care that "A" and "a" are the same (for example: project names). Introduce a new class of virtual data types, the "sort..." types, to cover these columns. These are like the "text..." types but use sorting collations which treat "A" and "a" the same.
Test Plan:
- Made an effort to identify all columns where the UI relies on database collation.
- Ran `bin/storage adjust` and cleared all warnings.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: beng, epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10602
Summary:
Ref T1191. This was useful for annotating everything but we no longer need it; there are just two types of issues now:
- Error: stuff we can't fix (missing or surplus tables/database/columns, bad column nullability).
- Warning: stuff we can fix (column types, character sets, collations, missing or surplus keys, incorrectly defined keys, bad key uniqueness).
Test Plan: Saw 3,399 warnings and 0 errors.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10594
Summary:
Ref T1191. Ref T6203. While generating expected schemata, I ran into these columns which seem to have sketchy nullability.
- Mark most of them for later resolution (T6203). They work fine today and don't need to block T1191. Changing them can break the application, so we can't autofix them.
- Forgive a couple of them that are sort-of reasonable or going to get wiped out.
Test Plan: Saw 94 remaining warnings.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: hach-que, epriestley
Maniphest Tasks: T1191, T6203
Differential Revision: https://secure.phabricator.com/D10593
Summary:
Ref T1191. We have several keys on `<x, y, id>`. When `id` is an auto-increment primary key, I believe this is exactly equivalent to a key on `<x, y>`, because the leaf nodes are implicitly sorted by `id`. We omit the implicit `id` elsewhere.
It would be nice to drop the `id` bit for consistency, but it's not doing any harm and this doesn't need to block the primary work of T1191.
Test Plan: Saw slightly fewer warnings.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10592
Summary:
Ref T1191.
- Adds definitions for missing keys and keys with wrong uniqueness. Generally, I defined these before fixing the key query to actually pull all keys and support uniqueness.
- Moves "key uniqueness" to note severity; this is fixable (probably?) and there are no remaining issues.
- Moves "Missing Key" to note severity; missing keys are fixable and all remaining missing keys are really missing (either missing edge keys, or missing PHID keys):
{F210089}
- Moves "Surplus Key" to note seveirty; surplus keys are fixable all remaining surplus keys are really surplus (duplicate key in Harbormaster, key on unused column in Worker):
{F210090}
Test Plan:
- Vetted missing/surplus/unique messages.
- 146 issues remaining.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10590
Summary:
Ref T1191. Nothing too notable here:
- Allow a Lisk object to specify that there's no expectation that a table exists. We have one Harbormaster object and one Token object like this.
- Removed BuildPlanTransactionComment because it's currently unused.
Test Plan:
- Saw ~200 fewer warnings; just ~800 left.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10583
Summary:
Ref T1191.
- Removes ponder comment table. This was migrated a very long time ago.
Test Plan:
- Grepped for removed table.
- Saw ~100 fewer issues in web UI.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10582
Summary:
Ref T1191. Notes:
- Drops the project affiliation table. This is a very old membership table which was migrated to edges.
- Drops the subproject table. This is a very old table for a removed feature.
Test Plan:
- Grepped for dropped tables.
- Saw ~100 fewer setup issues.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10581
Summary:
Ref T1191. Some notes here:
- Drops the old LDAP and OAuth info tables. These were migrated to the ExternalAccount table a very long time ago.
- Separates surplus/missing keys from other types of surplus/missing things. In the long run, my plan is to have only two notice levels:
- Error: something we can't fix (missing database, table, or column; overlong key).
- Warning: something we can fix (surplus anything, missing key, bad column type, bad key columns, bad uniqueness, bad collation or charset).
- For now, retaining three levels is helpful in generating all the expected scheamta.
Test Plan:
- Saw ~200 issues resolve, leaving ~1,300.
- Grepped for removed tables.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10580
Summary: Ref T1191. Handful of minor things here (T6150, T6149, T6148, T6147, T6146) but nothing very noteworthy.
Test Plan: Viewed web UI, saw fewer errors.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10527
Summary:
Ref T1191.
- Adds support for custom fields.
- Adds support for partial indexes (indexes on a prefix of a column).
- Drops old auxiliary storage table: this was moved to custom field storage about a year ago.
- Drops old project table: this was moved to edges about two months ago.
Test Plan:
- Viewed web UI, saw fewer issues.
- Used `grep` to verify no readers/writers for storage or project table.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10526
Summary:
Ref T1191. Three parts:
- The old way of getting key information only got primary / unique / foreign keys, not all keys. Use `SHOW INDEXES` to get all keys instead.
- Track key uniqueness and raise warnings about it.
- Add a new "all issues" view to show an expanded, flat view of all issues. This is just an easier way to get a list so you don't have to dig around in the hierarchical view.
Test Plan:
{F206351}
{F206352}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10525
Summary:
Ref T1191.
- There was a varchar(50) column. I changed it to `text64`, since this length is unusual.
- There was an int(3) column. I changed it to `int32`, since this length is unusual.
Test Plan: Ran migrations, saw warnings disappear from config tool.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10524
Summary: T1191. Nothing very notable here.
Test Plan: Saw more blue in web UI.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D10522
Summary:
Ref T1191. Notable:
- Allowed objects to remove default columns (some feed tables have no `id`).
- Added a "note" severity and moved all the charset stuff down to that to make progress more clear.
Test Plan:
Trying to make the whole thing blue...
{F205970}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10519
Summary: Ref T1191. Fills in some more of the databases. Nothing very notable here. I didn't encounter any issues or overlong keys.
Test Plan: Used web UI to click around and verify expected schemata match up against actual schemata well.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10516
Summary: Ref T1191. This fills in some more features and gets audit and auth nearly generating reasonable expected schemata.
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10500
Summary:
Ref T1191. The major issue motivation here is that InnoDB keys have a maximum length of 767 bytes. When we move `utf8` colums to `utf8mb4` columns, they'll jump from 3 bytes per character to 4 bytes per character, which may make some indexes too long. Add key schema to help spot this.
Also add nullability since it doesn't hurt.
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10499
Summary:
Ref T1191. This lays some groundwork for generating the expected schemata, so we can compare them to the actual schemata and produce a meaningful diff.
- In general, each application will subclass `PhabricatorConfigSchemaSpec` and provide a definition of the tables it expects.
- This class has helper methods to mostly-automatically build table definitions for Lisk and (in the future) edges.
- When building expected schema, we specify a "data type", like "epoch". This is the type of data the application stores in the column, from the application's point of view. The SchemaSpec converts this into the best avilable storage type: for example, "text" will translate to `utf8mb4` if it's availalbe, or `binary` if not. This gives us a layer of indirection to insulate us from craziness.
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10497
Summary:
Ref T1191. This builds on the "view of the database as it exists" by building a view of the database as it is expected to exist (this is mostly empty for now) and comparing the two. We now render a view of the "comparison schema", which is the actual schema merged with the expected schema and annotated with the differences.
(I'm merging them like this because it makes it easier to handle both "missing" and "surpulus" warnings in a consistent way. If we tried to annotate just the actual or expected schema, the absence of components which are expected to exist is messy to handle.)
Test Plan: See screenshots.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10496
Summary:
Ref T1191. Plan here is:
- Build a tool showing the current schemata status (this diff).
- Have it compare the current status to the desired status (partly here, mostly in future diffs).
- Then add a migration tool, and eventually a setup issue to tell people to run it.
Test Plan:
Reviewed current schemata.
{F204492}
{F204493}
{F204494}
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T1191
Differential Revision: https://secure.phabricator.com/D10494