1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2025-02-28 22:49:16 +01:00
Commit graph

20 commits

Author SHA1 Message Date
epriestley
7c5b5327c8 Use stemming in the MySQL fulltext search engine
Summary:
Ref T6740. When we index a document, also save a copy of the stemmed version.

When querying, search the combined corpus for the terms.

(We may need to tune this a bit later since it's possible for literal, quoted terms to match in the stemmed section, but I think this wil rarely cause issues in practice.)

A downside here is that search sort of breaks if you upgrade into this and don't reindex. I wasn't able to find a way to issue the query that remained compatible with older indexes and didn't have awful performance, so my plan is:

  - Put this on `secure`.
  - Rebuild the index.
  - If things look good after a couple of days, add a way that we can tell people they need to rebuild the search index with a setup warning.

We might get some reports between now and then, but if this is super awful we should know by the end of the weekend.

Test Plan:
WOW AMAZING

{F2021466}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T6740

Differential Revision: https://secure.phabricator.com/D16947
2016-11-25 15:30:50 -08:00
epriestley
d54c14c644 If InnoDB FULLTEXT is available, use it for for fulltext indexes
Summary: Ref T11741. I'll wait until the release cut to land this; it just adds a test for InnoDB FULLTEXT being available instead of always returning `false`.

Test Plan:
  - Ran with InnoDB fulltext locally for a day and a half without issues.
  - Ran `bin/storage upgrade`, saw it detect InnoDB fulltext.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T11741

Differential Revision: https://secure.phabricator.com/D16946
2016-11-25 15:29:14 -08:00
epriestley
48a34eced2 Prepare for InnoDB FULLTEXT support
Summary:
Ref T11741. This makes everything work if we switch to InnoDB, but never actually switches yet.

Since the default minimum word length (3) and stopword list (36 common English words) in InnoDB are generally pretty reasonable, I just didn't add any setup advice for them. I figure we're better off with simpler setup until we identify some real problem that the builtin stopwords create.

Test Plan: Swapped the `false` to `true`, ran `storage adjust`, got InnoDB fulltext indexes, searched for stuff, got default "AND" behavior.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T11741

Differential Revision: https://secure.phabricator.com/D16942
2016-11-25 15:18:26 -08:00
epriestley
ff3333548f Create and populate a stopwords table for InnoDB fulltext indexes to use in the future
Summary:
Ref T11741. InnoDB uses a stopwords table instead of a stopwords file.

During `storage upgrade`, synchronize the table from the stopwords file on disk.

Test Plan:
  - Ran `storage upgrade`.
  - Ran `select * from stopwords`, saw stopwords.
  - Added some garbage to the table.
  - Ran `storage upgrade`, saw it remove it.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T11741

Differential Revision: https://secure.phabricator.com/D16940
2016-11-25 15:13:08 -08:00
epriestley
a956047989 Use PhutilQueryCompiler in Phabricator fulltext search
Summary:
Ref T11741. Fixes T10642. Parse and compile user queries with a consistent ruleset, then submit queries to the backend using whatever ruleset MySQL is configured with.

This means that `ft_boolean_syntax` no longer needs to be configured (we'll just do the right thing in all cases).

This should improve behavior with RDS immediately (T10642), and allow us to improve behavior with InnoDB in the future (T11741).

Test Plan:
  - Ran various queries in the UI, saw the expected results.
  - Ran bad queries, got useful errors.
  - Searched threads in Conpherence.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10642, T11741

Differential Revision: https://secure.phabricator.com/D16939
2016-11-25 14:46:10 -08:00
epriestley
163f2c4262 Refine available filters and defaults for relationship selection
Summary:
Ref T4788. Fixes T10703.

In the longer term I want to put this on top of ApplicationSearch, but that's somewhat complex and we're at a fairly good point to pause this feature for feedback.

Inch toward that instead: provide more appropriate filters and defaults without rebuilding the underlying engine. Specifically:

  - No "assigned" for commits (barely makes sense).
  - No "assigned" for mocks (does not make sense).
  - Default to "open" for parent tasks, subtasks, close as duplicate, and merge into.

Also, add a key to the `search_document` table to improve the performance of the "all open stuff of type X" query. "All Open Tasks" is about 100x faster on my machine with this key.

Test Plan:
  - Clicked all object relationships, saw more sensible filters and defaults.
  - Saw "open" query about 100x faster locally (300ms to 3ms).

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T4788, T10703

Differential Revision: https://secure.phabricator.com/D16202
2016-06-30 11:51:36 -07:00
Joshua Spence
d6b882a804 Fix visiblity of LiskDAO::getConfiguration()
Summary: Ref T6822.

Test Plan: `grep`

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: hach-que, Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11370
2015-01-14 06:54:13 +11:00
epriestley
d67b7f0f47 Correct column mutations for old versions of MySQL
Summary:
Ref T1191. Although I fixed some of the mutations earlier (in D10598), I missed the column mutations under old versions of MySQL. In particular, this isn't valid:

  - `ALTER TABLE ... MODIFY columnName VARCHAR(64) COLLATE binary`

Issue the permitted version of this instead, which is:

  - `ALTER TABLE ... MODIFY columnName VARBINARY(64)`

Also fixed an issue where a clean schema had the wrong nullability for a column in the draft table. Force it to the expected nullability.

The other trick here is around the one column with a FULLTEXT index on it, which needs a little massaging.

Test Plan:
  - Forced my local install to return `false` for utf8mb4 support.
  - Did a clean adjust into `binary` columns.
  - Poked around, added emoji to things.
  - Reverted the fake check and did a clean adjust into `utf8mb4` columns.
  - Emoji survived.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: fabe, epriestley

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10627
2014-10-02 14:44:22 -07:00
epriestley
e4e5a2004f Use sort, not text, for search index table
Summary: Ref T1191. The index's case sensitivity depends on the column type. Using `text` makes the search case-sensitive, which is not desirable.

Test Plan: After adjustment, searched for "PROJECTS" and found hits against "projects".

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10619
2014-10-02 09:50:23 -07:00
epriestley
4fcc634a99 Fix almost all remaining schemata issues
Summary:
Ref T1191. This fixes nearly every remaining blocker for utf8mb4 -- primarily, overlong keys.

Remaining issue is https://secure.phabricator.com/T1191#77467

Test Plan: I'll annotate inline.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley, hach-que

Maniphest Tasks: T6099, T6129, T6133, T6134, T6150, T6148, T6147, T6146, T6105, T1191

Differential Revision: https://secure.phabricator.com/D10601
2014-10-01 08:18:36 -07:00
epriestley
2880732a49 Generate expected schemata for Search
Summary:
Ref T1191. Notable:

  - Drops a very old saved query table. See comments inline: plan was to remove it after a year. It's been ~a year and two weeks.
  - This has our only fulltext index. I'm not supporting that formally for now, but left a note.
  - This has our only MyISAM table. I'm not supporting that explicitly for now, but it shouldn't affect anything. I may deal with this in the future.
  - These tables don't actually write directly via Lisk, so there's some fiddling to get the schemata right.

Test Plan: Down to ~250 warnings. No more surplus databases or tables.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T1191

Differential Revision: https://secure.phabricator.com/D10589
2014-10-01 07:53:35 -07:00
Joshua Spence
8756d82cf6 Remove @group annotations
Summary: I'm pretty sure that `@group` annotations are useless now... see D9855. Also fixed various other minor issues.

Test Plan: Eye-ball it.

Reviewers: #blessed_reviewers, epriestley, chad

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9859
2014-07-10 08:12:48 +10:00
vrana
ef85f49adc Delete license headers from files
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).

We are removing the headers for these reasons:

- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.

This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).

Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.

Reviewers: epriestley, davidrecordon

Reviewed By: epriestley

CC: aran, Korvin

Maniphest Tasks: T2035

Differential Revision: https://secure.phabricator.com/D3886
2012-11-05 11:16:51 -08:00
vrana
6cc196a2e5 Move files in Phabricator one level up
Summary:
- `kill_init.php` said "Moving 1000 files" - I hope that this is not some limit in `FileFinder`.
- [src/infrastructure/celerity] `git mv utils.php map.php; git mv api/utils.php api.php`
- Comment `phutil_libraries` in `.arcconfig` and run `arc liberate`.

NOTE: `arc diff` timed out so I'm pushing it without review.

Test Plan:
/D1234
Browsed around, especially in `applications/repository/worker/commitchangeparser` and `applications/` in general.

Auditors: epriestley

Maniphest Tasks: T1103
2012-06-01 12:32:44 -07:00
epriestley
09c8af4de0 Upgrade phabricator to libphutil v2
Summary: Mechanical changes from D2588. No "Class.php" moves yet.

Test Plan: See D2588.

Reviewers: vrana, btrahan, jungejason

Reviewed By: vrana

CC: aran

Maniphest Tasks: T1103

Differential Revision: https://secure.phabricator.com/D2591
2012-05-30 14:26:29 -07:00
epriestley
d0af617818 Add "final" to (almost) everything else
Summary: Last of the big final patches. Left a few debatable classes (12 out of about 400) that I'll deal with individually eventually.

Test Plan: Ran testEverythingImplemented.

Reviewers: btrahan

Reviewed By: btrahan

CC: aran, epriestley

Maniphest Tasks: T795

Differential Revision: https://secure.phabricator.com/D1881
2012-03-13 16:21:04 -07:00
epriestley
4bec2579d5 Some documentation updates. 2011-09-14 08:02:31 -07:00
epriestley
b8e08f34f7 Provide an indirection layer between documents and the search engine
Summary:
In preparation for adding another search engine (see T355):

  - Rename "executor" to "engine".
  - Move all engine-specific operations into the engine. Specifically, this
means that indexing moves out of the document store and into the engine (it was
sort of silly where it was before).
  - Split choice of an engine into an overridable "selector" class, a base API,
and a concrete MySQL implementation (just like storage engine selection).
  - Make all callers go through the indirection layer.

The default selector just unconditionally selects the MySQL engine, but now
(with D786) I can build an Elastic Search engine and you guys can build a
multi-target engine if you want and I don't get there fast enough.

Test Plan:
  - Created a new document (task).
  - Searched for and found it.
  - Viewed index reconstruction.

Reviewed By: jungejason
Reviewers: jungejason, amckinley, tuomaspelkonen, aran
CC: aran, jungejason, epriestley
Differential Revision: 788
2011-08-08 11:43:05 -07:00
epriestley
505c82236d Improve search functionality.
Summary:

Test Plan:

Reviewers:

CC:
2011-02-18 17:16:00 -08:00
epriestley
147d2e2e3d Rought cut of search.
Summary: Botched this pretty badly in git so we'll see how much I broke. :/

Test Plan:

Reviewers:

CC:
2011-02-14 15:34:20 -08:00