Summary:
Depends on D19503. Ref T13151. See PHI719. If you have something like a script which updates an object in a loop, we can end up queueing many search reindex tasks.
These tasks may reasonably contend for the lock, especially if the object is larger (lots of text and/or lots of comments) and indexing takes a few seconds.
This isn't concerning, and the indexers should converge to good behavior quickly once the updates stop.
Today, they'll spew a bunch of serious-looking lock exceptions into the log. Instead, just yield so it's more clear that there's (normally) no cause for concern here.
Test Plan: Ran `bin/search index Txxx --force` on a large object in multiple windows with a 0 second lock, saw an explicit yield instead of a lock exception.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13151
Differential Revision: https://secure.phabricator.com/D19504
Summary:
Ref T13151. See PHI719. One minor hiccup in debugging the issue (which ended up being "revision has 100K comments") was that the `SearchWorker` did not show which object it was indexing.
Add `'objectPHID'` to the queue call so you can see which object is affected from the web UI.
Test Plan:
- Stopped daemons.
- Used `bin/search index D123 --background` to queue a search task.
- Viewed task details in web UI from `/daemon/`.
- Before change: no indication of which object was being indexed.
- After change: page helpfully shows that the task is indexing D123.
Reviewers: amckinley
Reviewed By: amckinley
Maniphest Tasks: T13151
Differential Revision: https://secure.phabricator.com/D19502
Summary: Ref T9979. This simplifies/standardizes the code a bit, but mostly gives us more consistent class names and structure.
Test Plan:
- Used `bin/search index --type ...` to index documents of every indexable type.
- Searched for documents by unique text, found them.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14842
Summary:
Ref T9979. There are currently some hacks around Conpherence indexing: it does not really use the fulltext index, but its own specialized index. However, it's kind of hacked up so it can get reindexed by the normal indexing pipeline.
Lift it up into IndexEngine, instead of FulltextEngine. Specifically, the new stuff is going to look like this:
- IndexEngine: Rebuild all indexes.
- ConpherenceIndexExtension: Rebuild thread indexes.
- ProjectMemberIndexExtension: Rebuild project membership views.
- NgramIndexExtension: Rebuild ngram indexes.
- FulltextIndexExtension / FulltextEngine: Rebuild fulltext indexes, a special type of index.
- FulltextCommentExtension: Rebuild comment fulltext indexes.
- FulltextProjectExtension: Rebuild project fulltext indexes.
- etc.
Most of this is at least sort-of-in-place as of this diff, although some of the part in the middle is still pretty rough.
Test Plan:
- Made a unique comment in a Conpherence thread.
- Used `bin/search index --force` to rebuild the index.
- Searched for the comment.
- Found the thread.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14841
Summary:
Ref T9890. Ref T9979. Several adjacent goals:
- The `SearchEngine` vs `ApplicationSearchEngine` thing is really confusing. There are also a bunch of confusing class names and class relationships within the fulltext indexing. I want to rename these classes to be more standard (`IndexEngine`, `IndexEngineExtension`, etc). Rename `SearchIndexer` to `IndexEngine`. A future change will rename `SearchEngine`.
- Add the index locks described in T9890.
- Structure things a little more normally so future diffs can do the "EngineExtension" thing more cleanly.
Test Plan:
Indexing:
- Renamed a task to have a unique word in the title.
- Ran `bin/search index Txxx`.
- Searched for unique word.
- Found task.
Locking:
- Added a `sleep(10)` after the `lock()` call.
- Ran `bin/search index Txxx` in two windows.
- Saw first one lock, sleep 10 seconds, index.
- Saw second one give up temporarily after failing to grab the lock.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9890, T9979
Differential Revision: https://secure.phabricator.com/D14834
Summary: Fixes T5991. If //all requested documents// failed to index, consider this a catastrophic failure and exit with an error code.
Test Plan:
- Ran `bin/search index --type TASK`, observed successful exit despite a small number of un-indexable documents.
- Ran `bin/search index PHID-TASK-xxx` for an invalid task, observed exception on exit after complete failure.
- Ran normal indexing through daemons.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T5991
Differential Revision: https://secure.phabricator.com/D14174
Summary: Ref T6822. This method is only called from within the `PhabricatorWorker::executeTask()` and `PhabricatorWorker::scheduleTask()` methods.
Test Plan: `grep`ped for `->doWork`.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T6822
Differential Revision: https://secure.phabricator.com/D11406
Summary:
Ref T3165. Builds a dedicated index for Conpherence to avoid scale/policy filtering concerns.
- This is pretty one-off but I think it's generally OK.
- There's no UI for it.
- `ConpherenceFulltextQuery` is very low-level. You would need to do another query on the PHIDs it returns to actually show anything to the user.
- The `previousTransactionPHID` is so you can load chat context efficiently. Specifically, if you want to show results like this:
> previous line of context
> **line of chat that matches the query**
> next line of context
...you can read the previous lines out of `previousTransactionPHID` directly, and the next lines by issuing one query with `WHERE previousTransactionPHID IN (...)`.
I'm not 100% sure this is useful, but it seemed like a reasonable thing to provide, since there's no way to query this efficiently otherwise and I figure a lot of chat might make way more sense with a couple of lines of context.
Test Plan:
- Indexed a thread manually (whole thing indexed).
- Indexed a thread by updating it (just the new comment indexed).
- Wrote a hacky test script and got reasonable-looking query results.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T3165
Differential Revision: https://secure.phabricator.com/D11234
Summary: Fixes T3857. Earlier work made this trivial and just left product questions, which I've answered by requiring the daemons to run on reasonable installs.
Test Plan: Ran `bin/search index` and `bin/search index --background`. Observed indexes write in the former case and tasks queue in the latter case. Commented with a unique string on a revision and searched for it a moment later, got exactly one result (that revision), verifying that reindexing works correctly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3857
Differential Revision: https://secure.phabricator.com/D7966