Summary:
Ref T9979. There are currently some hacks around Conpherence indexing: it does not really use the fulltext index, but its own specialized index. However, it's kind of hacked up so it can get reindexed by the normal indexing pipeline.
Lift it up into IndexEngine, instead of FulltextEngine. Specifically, the new stuff is going to look like this:
- IndexEngine: Rebuild all indexes.
- ConpherenceIndexExtension: Rebuild thread indexes.
- ProjectMemberIndexExtension: Rebuild project membership views.
- NgramIndexExtension: Rebuild ngram indexes.
- FulltextIndexExtension / FulltextEngine: Rebuild fulltext indexes, a special type of index.
- FulltextCommentExtension: Rebuild comment fulltext indexes.
- FulltextProjectExtension: Rebuild project fulltext indexes.
- etc.
Most of this is at least sort-of-in-place as of this diff, although some of the part in the middle is still pretty rough.
Test Plan:
- Made a unique comment in a Conpherence thread.
- Used `bin/search index --force` to rebuild the index.
- Searched for the comment.
- Found the thread.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14841
Summary:
Ref T9890. Ref T9979. Several adjacent goals:
- The `SearchEngine` vs `ApplicationSearchEngine` thing is really confusing. There are also a bunch of confusing class names and class relationships within the fulltext indexing. I want to rename these classes to be more standard (`IndexEngine`, `IndexEngineExtension`, etc). Rename `SearchIndexer` to `IndexEngine`. A future change will rename `SearchEngine`.
- Add the index locks described in T9890.
- Structure things a little more normally so future diffs can do the "EngineExtension" thing more cleanly.
Test Plan:
Indexing:
- Renamed a task to have a unique word in the title.
- Ran `bin/search index Txxx`.
- Searched for unique word.
- Found task.
Locking:
- Added a `sleep(10)` after the `lock()` call.
- Ran `bin/search index Txxx` in two windows.
- Saw first one lock, sleep 10 seconds, index.
- Saw second one give up temporarily after failing to grab the lock.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9890, T9979
Differential Revision: https://secure.phabricator.com/D14834
Summary: Fixes T5991. If //all requested documents// failed to index, consider this a catastrophic failure and exit with an error code.
Test Plan:
- Ran `bin/search index --type TASK`, observed successful exit despite a small number of un-indexable documents.
- Ran `bin/search index PHID-TASK-xxx` for an invalid task, observed exception on exit after complete failure.
- Ran normal indexing through daemons.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T5991
Differential Revision: https://secure.phabricator.com/D14174
Summary: Use `PhutilClassMaQuery` instead of `PhutilSymbolLoader`, mostly for consistency. Depends on D13588.
Test Plan: Poked around a bunch of pages.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13589
Summary: Remove the `PhabricatorDefaultSearchEngineSelector` class. This is quite similar to D12053.
Test Plan: Went to `/view/PhabricatorSearchApplication/` and saw the storage engine configuration. Set `search.elastic.host` and saw the highlighted storage engine change.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12670
Summary: Derped this up in D11234.
Test Plan: Ran `bin/search index --all`.
Reviewers: joshuaspence
Reviewed By: joshuaspence
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D11273
Summary:
Ref T3165. Builds a dedicated index for Conpherence to avoid scale/policy filtering concerns.
- This is pretty one-off but I think it's generally OK.
- There's no UI for it.
- `ConpherenceFulltextQuery` is very low-level. You would need to do another query on the PHIDs it returns to actually show anything to the user.
- The `previousTransactionPHID` is so you can load chat context efficiently. Specifically, if you want to show results like this:
> previous line of context
> **line of chat that matches the query**
> next line of context
...you can read the previous lines out of `previousTransactionPHID` directly, and the next lines by issuing one query with `WHERE previousTransactionPHID IN (...)`.
I'm not 100% sure this is useful, but it seemed like a reasonable thing to provide, since there's no way to query this efficiently otherwise and I figure a lot of chat might make way more sense with a couple of lines of context.
Test Plan:
- Indexed a thread manually (whole thing indexed).
- Indexed a thread by updating it (just the new comment indexed).
- Wrote a hacky test script and got reasonable-looking query results.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T3165
Differential Revision: https://secure.phabricator.com/D11234
Summary:
When the index does not exist and auto_create_index isn't
enabled, running ./bin/index results in a failure. That's
T5990
Instead create an index properly. This also allows us to do
nice things like do a proper mapping and analysis like for
substring matching like outlined by @fabe in T6552.
Test Plan:
Deleted and created index multiple times to verify
proper index creation and usage.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: Korvin, manybubbles, chasemp, fabe, epriestley
Differential Revision: https://secure.phabricator.com/D10955
Summary: I'm pretty sure that `@group` annotations are useless now... see D9855. Also fixed various other minor issues.
Test Plan: Eye-ball it.
Reviewers: #blessed_reviewers, epriestley, chad
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9859
Summary: Currently, the `./bin/search index` script produces a lot of output (one line for every indexed object). Instead, use a `PhutilConsoleProgressBar` to indicate progress. This is much less verbose and gives a real indication of how long the script should take to complete.
Test Plan: Ran `./bin/search index` and verified that a progress bar was output.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D9364
Summary: Fixes T3857. Earlier work made this trivial and just left product questions, which I've answered by requiring the daemons to run on reasonable installs.
Test Plan: Ran `bin/search index` and `bin/search index --background`. Observed indexes write in the former case and tasks queue in the latter case. Commented with a unique string on a revision and searched for it a moment later, got exactly one result (that revision), verifying that reindexing works correctly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3857
Differential Revision: https://secure.phabricator.com/D7966
Summary:
Ref T2015. Not directly related to Drydock, but I've wanted to do this for a bit.
Introduce a common base class for all the workflows in the scripts in `bin/*`. This slightly reduces code duplication by moving `isExecutable()` to the base, but also provides `getViewer()`. This is a little nicer than `PhabricatorUser::getOmnipotentUser()` and gives us a layer of indirection if we ever want to introduce more general viewer mechanisms in scripts.
Test Plan: Lint; ran some of the scripts.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2015
Differential Revision: https://secure.phabricator.com/D7838
Summary:
Part one of a large and complicated plot:
- The last filter for Maniphest "pro" queries is "Group By".
- This is currently executed in a convoluted and ridiculous way, loading massive amounts of data.
- The primary reason it works like it does is that we don't have a project name index available in Maniphest, so we can't sort in the DB.
- So, I want to provide a name index to Maniphest and push this work to the DB.
To do that, my plan is:
- Index projects in Search.
- Add a "did update index" event.
- Have Maniphest listen for it.
- When projects are updated, update their indexes in Maniphest.
- Rewrite the giant mess of "group by: project" to be somewhat reasonable.
- This may also extend to some future "group by: assignee".
This is the first small step down this path, which just indexes projects in search.
Test Plan: Ran `bin/search index --type project`, then searched for projects.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Differential Revision: https://secure.phabricator.com/D6955
Summary: Ref T2715. This only ever supported like 10% of object types; get rid of it in favor of the new infra.
Test Plan:
- Ran `bin/search index D12`; `bin/search index <some valid phid>`, `bin/search index derp`.
- Turned off Search jump, searched for `D12`.
- Used `phid.lookup`.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2715
Differential Revision: https://secure.phabricator.com/D6519
Summary: Fixes T2651. This could be futher generalized but it's a bit out of the way.
Test Plan: See chatlog.
Reviewers: chad
Reviewed By: chad
CC: aran
Maniphest Tasks: T2651
Differential Revision: https://secure.phabricator.com/D5236
Summary: Mostly applies a new call spacing rule; also a few things that have slipped through via pull requests and such
Test Plan: `find src/ -type f -name '*.php' | xargs -n16 arc lint --output summary --apply-patches`
Reviewers: chad
Reviewed By: chad
CC: aran
Differential Revision: https://secure.phabricator.com/D5002
Summary:
The search indexing API has several problems right now:
- Always runs in-process.
- It would be nice to push this into the task queue for performance. However, the API currently passses an object all the way through (and some indexers depend on preloaded object attributes), so it can't be dumped into the task queue at any stage since we can't serialize it.
- Being able to use the task queue will also make rebuilding indexes faster.
- Instead, make the API phid-oriented.
- No uniform indexing API.
- Each "Editor" currently calls SomeCustomIndexer::indexThing(). This won't work with AbstractTransactions. The API is also just weird.
- Instead, provide a uniform API.
- No uniform CLI.
- We have `scripts/search/reindex_everything.php`, but it doesn't actually index everything. Each new document type needs to be separately added to it, leading to stuff like D3839. Third-party applications can't provide indexers.
- Instead, let indexers expose documents for indexing.
- Not application-oriented.
- All the indexers live in search/ right now, which isn't the right organization in an application-orietned view of the world.
- Instead, move indexers to applications and load them with SymbolLoader.
Test Plan:
- `bin/search index`
- Indexed one revision, one task.
- Indexed `--type TASK`, `--type DREV`, etc., for all types.
- Indexed `--all`.
- Added the word "saboteur" to a revision, task, wiki page, and question and then searched for it.
- Creating users is a pain; searched for a user after indexing.
- Creating commits is a pain; searched for a commit after indexing.
- Mocks aren't currently loadable in the result view, so their indexing is moot.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: 20after4, aran
Maniphest Tasks: T1991, T2104
Differential Revision: https://secure.phabricator.com/D4261