Summary:
Fixes T9890. This allows IndexExtensions to emit an object version.
Before we build indexes, we check if the indexed version is the same as the current version. If it is, we just don't call that extension.
T9890 has a case where this is useful: a script went crazy and posted thousands of comments to a single task.
Without versioning, that results in the same comments being indexed over and over again. With versioning, most of the queue could just exit without doing any work.
Test Plan:
- Added a `sleep(1)` to the actual indexing, used `bin/search index --background` to queue up a lot of tasks, ran them with `bin/phd debug task`, saw them complete very quickly with only one actual index operation performed.
- Used `bin/search index --trace` and `bin/search index --trace --background` to observe the behavior of queries against the index version store, which looked sensible.
- Made comments/transactions, saw versions update.
- Used `bin/remove destroy`, verified index versions were purged.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9890
Differential Revision: https://secure.phabricator.com/D14845
Summary:
Ref T9979. I picked this name long before the advent of modern "Engine" architecture and it ended up being pretty confusing.
Rename "SearchEngine" (currently: mysql or elasticsearch, used to store and query fulltext indexes) to "FulltextStorageEngine" to make it more clear what it does and disambituate it from ApplicationSearch, which also has a bunch of stuff called "SearchEngine", "SearchEngineExtension", etc.
Test Plan: Grepped for `phabricatorsearchengine`.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14843
Summary: Ref T9979. This simplifies/standardizes the code a bit, but mostly gives us more consistent class names and structure.
Test Plan:
- Used `bin/search index --type ...` to index documents of every indexable type.
- Searched for documents by unique text, found them.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14842
Summary:
Ref T9979. There are currently some hacks around Conpherence indexing: it does not really use the fulltext index, but its own specialized index. However, it's kind of hacked up so it can get reindexed by the normal indexing pipeline.
Lift it up into IndexEngine, instead of FulltextEngine. Specifically, the new stuff is going to look like this:
- IndexEngine: Rebuild all indexes.
- ConpherenceIndexExtension: Rebuild thread indexes.
- ProjectMemberIndexExtension: Rebuild project membership views.
- NgramIndexExtension: Rebuild ngram indexes.
- FulltextIndexExtension / FulltextEngine: Rebuild fulltext indexes, a special type of index.
- FulltextCommentExtension: Rebuild comment fulltext indexes.
- FulltextProjectExtension: Rebuild project fulltext indexes.
- etc.
Most of this is at least sort-of-in-place as of this diff, although some of the part in the middle is still pretty rough.
Test Plan:
- Made a unique comment in a Conpherence thread.
- Used `bin/search index --force` to rebuild the index.
- Searched for the comment.
- Found the thread.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14841
Summary:
Ref T9979. This is currently hard-coded but can be done in a generic way.
This has one minor behavioral changes: answer text is no longer included in the question text index in Ponder. I'm not planning to accommodate that for now since I don't want to dig this hole any deeper than I already have. This behavior should be different anyway (e.g., index the answer, then show the question in the results or something).
Test Plan:
- Put a unique word in a Maniphest comment.
- Searched for the word.
- Found the task.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14837
Summary:
Ref T9979. This event had one weird callsite and no known third-party callers. It can be done more cleanly as an extension, now.
This index is used to allow us to "Group By: Project" in Maniphest without joining into the Projects database.
Test Plan:
- Ran a query with "Group By: Project" in Maniphest.
- Renamed project "Apples" to "Zebras".
- Reloaded page.
- UI properly moved "Zebras" tasks to the bottom of the list.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14836
Summary: Ref T9979. This is going to become `FulltextEngine`, but pave the way for that by pulling extensions out of it.
Test Plan:
{F1036624}
- Used `bin/search index Txxx`, saw projects, subscribers and custom fields rebuild in the index.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9979
Differential Revision: https://secure.phabricator.com/D14835
Summary:
Ref T9890. Ref T9979. Several adjacent goals:
- The `SearchEngine` vs `ApplicationSearchEngine` thing is really confusing. There are also a bunch of confusing class names and class relationships within the fulltext indexing. I want to rename these classes to be more standard (`IndexEngine`, `IndexEngineExtension`, etc). Rename `SearchIndexer` to `IndexEngine`. A future change will rename `SearchEngine`.
- Add the index locks described in T9890.
- Structure things a little more normally so future diffs can do the "EngineExtension" thing more cleanly.
Test Plan:
Indexing:
- Renamed a task to have a unique word in the title.
- Ran `bin/search index Txxx`.
- Searched for unique word.
- Found task.
Locking:
- Added a `sleep(10)` after the `lock()` call.
- Ran `bin/search index Txxx` in two windows.
- Saw first one lock, sleep 10 seconds, index.
- Saw second one give up temporarily after failing to grab the lock.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9890, T9979
Differential Revision: https://secure.phabricator.com/D14834
Summary: Fixes T5991. If //all requested documents// failed to index, consider this a catastrophic failure and exit with an error code.
Test Plan:
- Ran `bin/search index --type TASK`, observed successful exit despite a small number of un-indexable documents.
- Ran `bin/search index PHID-TASK-xxx` for an invalid task, observed exception on exit after complete failure.
- Ran normal indexing through daemons.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T5991
Differential Revision: https://secure.phabricator.com/D14174
Summary: Use `PhutilClassMaQuery` instead of `PhutilSymbolLoader`, mostly for consistency. Depends on D13588.
Test Plan: Poked around a bunch of pages.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13589
Summary: All classes should extend from some other class. See D13275 for some explanation.
Test Plan: `arc unit`
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Differential Revision: https://secure.phabricator.com/D13283
Summary: Ref T8441. I want to use `PhabricatorSearchField` for a better, more useful object.
Test Plan: `grep`, `arc lint`
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T8441
Differential Revision: https://secure.phabricator.com/D13168
Summary: Remove the `PhabricatorDefaultSearchEngineSelector` class. This is quite similar to D12053.
Test Plan: Went to `/view/PhabricatorSearchApplication/` and saw the storage engine configuration. Set `search.elastic.host` and saw the highlighted storage engine change.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: Korvin, epriestley
Differential Revision: https://secure.phabricator.com/D12670
Summary:
Ref T3165. Builds a dedicated index for Conpherence to avoid scale/policy filtering concerns.
- This is pretty one-off but I think it's generally OK.
- There's no UI for it.
- `ConpherenceFulltextQuery` is very low-level. You would need to do another query on the PHIDs it returns to actually show anything to the user.
- The `previousTransactionPHID` is so you can load chat context efficiently. Specifically, if you want to show results like this:
> previous line of context
> **line of chat that matches the query**
> next line of context
...you can read the previous lines out of `previousTransactionPHID` directly, and the next lines by issuing one query with `WHERE previousTransactionPHID IN (...)`.
I'm not 100% sure this is useful, but it seemed like a reasonable thing to provide, since there's no way to query this efficiently otherwise and I figure a lot of chat might make way more sense with a couple of lines of context.
Test Plan:
- Indexed a thread manually (whole thing indexed).
- Indexed a thread by updating it (just the new comment indexed).
- Wrote a hacky test script and got reasonable-looking query results.
Reviewers: btrahan
Reviewed By: btrahan
Subscribers: epriestley
Maniphest Tasks: T3165
Differential Revision: https://secure.phabricator.com/D11234
Summary: Ref T5402. This more or less "fixes" it but there's probably some polish to do?
Test Plan:
stopped and started daemons. error logs look good.
ran bin/storage upgrade. noted that `adjust` added the appropriate indices for active and archive task.
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, epriestley
Maniphest Tasks: T5402
Differential Revision: https://secure.phabricator.com/D11044
Summary: Fixes T3189. Now if you say #projects in a commit message they will associate nicely with the commit. Also we record transactions about all this project editing fun.
Test Plan: tested migration by associating some projects with commits and verifying they still showed up post migration. tested adding / removing projects by doing so from the UI, noting transactions written nicely as well
Reviewers: epriestley
Reviewed By: epriestley
Subscribers: Korvin, epriestley
Projects: #projects
Maniphest Tasks: T3189
Differential Revision: https://secure.phabricator.com/D10877
Summary:
Ref T4896. Depends on D10056. Moves search indexing to standard infrastructure.
Also, fixes a bug where inline comments would not be indexed.
Test Plan: Used `bin/search index ... --trace` to view index construction of a commit, saw all the comments and inlines get indexed.
Reviewers: btrahan, joshuaspence
Reviewed By: joshuaspence
Subscribers: epriestley
Maniphest Tasks: T4896
Differential Revision: https://secure.phabricator.com/D10057
Summary: Fixes T5336. Currently, `PhabricatorWorkerLeaseQuery` is basically FIFO. It makes more sense for the queue to be a priority-queue, and to assign higher priorities to alerts (email and SMS).
Test Plan: Created dummy tasks in the queue (with different priorities). Verified that the priority field was set correctly in the DB and that the priority was shown on the `/daemon/` page. Started a `PhabricatorTaskmasterDaemon` and verified that the higher priority tasks were executed before lower priority tasks.
Reviewers: epriestley, #blessed_reviewers
Reviewed By: epriestley, #blessed_reviewers
Subscribers: epriestley, Korvin
Maniphest Tasks: T5336
Differential Revision: https://secure.phabricator.com/D9871
Summary: I'm pretty sure that `@group` annotations are useless now... see D9855. Also fixed various other minor issues.
Test Plan: Eye-ball it.
Reviewers: #blessed_reviewers, epriestley, chad
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley, Korvin, hach-que
Differential Revision: https://secure.phabricator.com/D9859
Summary:
Ref T2222. Ref T3886. Ref T418. A few changes:
- CustomField can now index into global search.
- Use CustomField fields instead of older custom fields for Differential global search. (This slightly breaks any custom fields which exist, but they are presumably very rare, and probably do not exist; this break is also very mild.)
- Automatically perform CustomField and Subscribable indexing on applicable object types.
Test Plan: Used `bin/search index` to reindex a bunch of stuff, then searched for it. Debug-dumped abstract documents to inspect them.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T418, T3886, T2222
Differential Revision: https://secure.phabricator.com/D8346
Summary:
Ref T4379. Fixes T4359. Currently, `bin/search index` does not rebuild CustomField indexes. This is because they aren't really part of the main search index. However, from a user's point of view this is by far the most logical place to look for index rebuilds, and it's straightforward for us to write into this secondary store.
At some point, it might be nice to let you specify fields as "fulltext" too, although no one has asked for that yet. We could then dump the text of those fields into the fulltext index. Ref T418.
Test Plan: Used `bin/search index --type proj --trace`, etc., and examination of the database to verify that indexes rebuilt. Reindexed users, tasks, projects.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4359, T418, T4379
Differential Revision: https://secure.phabricator.com/D8185
Summary:
Ref T4365. Drive primary search through ApplicationSearch instead of through a bunch of custom nonsense. Notably, this allows you to save searches, notably.
The one thing this doesn't do -- which I'd like it to -- is carry your query text across searches. When you search for "quack", I want to overwrite the query in your default filter and give you those results, so you can turn the search into an "Open Tasks" search by default by reordering the queries. I'll probably do that next. It feels a little hacky but I want to try it out.
Test Plan: {F106932}
Reviewers: btrahan, chad
Reviewed By: btrahan
CC: aran, bigo
Maniphest Tasks: T4365
Differential Revision: https://secure.phabricator.com/D8123
Summary: Ref T4365. It's not practical to cursor-page all engines; allow main search engines to be offset-paged. Basically, this comes down to setting a flag and then doing a couple of tiny things differently.
Test Plan: Used this two diffs from now.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T4365
Differential Revision: https://secure.phabricator.com/D8121
Summary: Fixes T3857. Earlier work made this trivial and just left product questions, which I've answered by requiring the daemons to run on reasonable installs.
Test Plan: Ran `bin/search index` and `bin/search index --background`. Observed indexes write in the former case and tasks queue in the latter case. Commented with a unique string on a revision and searched for it a moment later, got exactly one result (that revision), verifying that reindexing works correctly.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3857
Differential Revision: https://secure.phabricator.com/D7966
Summary:
See discussion in D6955. Provide an event for applications and users to update secondary search indexes.
Facebook: I don't recall exactly how all the search stuff is rigged up, but this might provide a more practical / less fragile alternative. I think it publishes into ElasticSearch now, and then intern somehow handles the result merge at display time, implictly relying on Phabricator's storage format? A cleaner approach might be to publish a secondary "intern" index in a standard format.
Test Plan: Ran `bin/search index --type proj --trace`, saw events fire.
Reviewers: btrahan
Reviewed By: btrahan
CC: FacebookPOC, aran
Differential Revision: https://secure.phabricator.com/D6956
Summary: Ref T3578. Get indexing back, and try to simplify it a bit.
Test Plan: Rebuilt QUES and MOCK indexes with `bin/search`. Created question with unique string, verified it appeared as a result. Added an answer with a unique string, got it as a result too.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T3578
Differential Revision: https://secure.phabricator.com/D6619
Summary: Ref T2715. Had to start loading status information in the query class. Debated trying to clean up some of the attach / load stuff but decided to just add status under the new paradigm for now.
Test Plan: phid.query also made a status and checked that out. also played in conpherence.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2715
Differential Revision: https://secure.phabricator.com/D6585
Summary: Ref T2715. Switch Ponder to the new IDs.
Test Plan: Ran `phid.lookup`; `phid.query`. Grepped for old constant
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2715
Differential Revision: https://secure.phabricator.com/D6554
Summary: Ref T2715. Switch Maniphest to the new stuff.
Test Plan: Used `phid.query`; `phid.lookup` to load objects.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2715
Differential Revision: https://secure.phabricator.com/D6516
Summary: Ref T2716. Ref T2715. Move CMIT to use Application PHIDs. Nothing too special here, but I consolidated some code into DiffusionCommitQuery. Depends on D6514.
Test Plan: Browsed Diffusion. Browsed Differential/Maniphest with linked commits. Used jump nav; used `phid.lookup` and `phid.query`. Used remarkup for Git and SVN repos. Grepped for PHID_TYPE_CMIT.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T2715, T2716
Differential Revision: https://secure.phabricator.com/D6515
Summary:
T2345
getConfig throws an Exception when the key does not exist.
Also removes dead code that throws an Exception.
Test Plan:
Reloaded the Phabricator home page. In the process, found
2 Exceptions thrown due to nonexistent keys. After addressing these problems,
the home page loads without Exceptions.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D4541
Summary:
The search indexing API has several problems right now:
- Always runs in-process.
- It would be nice to push this into the task queue for performance. However, the API currently passses an object all the way through (and some indexers depend on preloaded object attributes), so it can't be dumped into the task queue at any stage since we can't serialize it.
- Being able to use the task queue will also make rebuilding indexes faster.
- Instead, make the API phid-oriented.
- No uniform indexing API.
- Each "Editor" currently calls SomeCustomIndexer::indexThing(). This won't work with AbstractTransactions. The API is also just weird.
- Instead, provide a uniform API.
- No uniform CLI.
- We have `scripts/search/reindex_everything.php`, but it doesn't actually index everything. Each new document type needs to be separately added to it, leading to stuff like D3839. Third-party applications can't provide indexers.
- Instead, let indexers expose documents for indexing.
- Not application-oriented.
- All the indexers live in search/ right now, which isn't the right organization in an application-orietned view of the world.
- Instead, move indexers to applications and load them with SymbolLoader.
Test Plan:
- `bin/search index`
- Indexed one revision, one task.
- Indexed `--type TASK`, `--type DREV`, etc., for all types.
- Indexed `--all`.
- Added the word "saboteur" to a revision, task, wiki page, and question and then searched for it.
- Creating users is a pain; searched for a user after indexing.
- Creating commits is a pain; searched for a commit after indexing.
- Mocks aren't currently loadable in the result view, so their indexing is moot.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: 20after4, aran
Maniphest Tasks: T1991, T2104
Differential Revision: https://secure.phabricator.com/D4261
Summary: ...and do so for a few fields -- summary, test plan, and revert plan.
Test Plan: added NATASHA and BULLWINKLE to summary and test plan of existing diff. Diff showed up in search!
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T654
Differential Revision: https://secure.phabricator.com/D3915
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary:
We have two troubles with this script:
# Our revisions and commits don't fit in the memory. (Our tasks do :-).)
# Reindexing revisions is slow.
Test Plan: Ran it.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3483
Summary:
We have some issues with Elastic search (or maybe it's SMC) causing that indexing sporadically doesn't work.
Throwing in indexing stops the workflow and is annoying.
Not indexing doesn't have fatal consequences for the user and we can (and probably should) postpone it.
Test Plan: Thrown, looked at log.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3350
Summary:
Ponder is similar in spirit to the Wiki tool, but uses a Q&A
format and up/downvotes to signal user sentiment. Popular
questions are moved to the top of the feed on a 5-minute
cycle based on age (younger is better) and vote count (higher
is better).
Pre-apologies for noob diff.
Test Plan:
- `./bin/phd list` Should include `PonderHeatDaemon`; phd launch it
if necessary.
- Navigate to /ponder/ ; observe sanity when adding questions,
voting on them, and adding answers.
- Confirm that questions and answers are linkable using Q5 / Q5#A5 formatted object links.
- Confirm that searching for Ponder Questions works using built-in
search.
Feedback on code / schema / whatever organization very welcome.
Reviewers: nh, vrana, epriestley
Reviewed By: epriestley
CC: gmarcotte, aran, Korvin, starruler
Differential Revision: https://secure.phabricator.com/D3136
Test Plan:
$ ./reindex_all_users.php
Search for me in open documents.
Search for @epriestley in open documents.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3216
Summary:
- Commit detail view
- List of projects
- "edit" action which takes the user to a simple form where they can only add / remove projects.
- Integrated the project relationship into the commit search indexer
- fixed a bug from D790; it seems you must select the column if you're going to join against it later. Without this change searching for author or projectfails 100% for me.
Test Plan: added and removed projects. verified appropriate projects showed up in detail and edit view. searched for commits by project and found the ones I was supposed to...!
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T1614
Differential Revision: https://secure.phabricator.com/D3189
Summary:
- `kill_init.php` said "Moving 1000 files" - I hope that this is not some limit in `FileFinder`.
- [src/infrastructure/celerity] `git mv utils.php map.php; git mv api/utils.php api.php`
- Comment `phutil_libraries` in `.arcconfig` and run `arc liberate`.
NOTE: `arc diff` timed out so I'm pushing it without review.
Test Plan:
/D1234
Browsed around, especially in `applications/repository/worker/commitchangeparser` and `applications/` in general.
Auditors: epriestley
Maniphest Tasks: T1103
Summary: Only inlines were indexed (contrary to what comment claims).
Test Plan: Index one revision, check database.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2359
Summary: I thought that this will be fun but the elasticsearch API is horrible and the documentation is poor.
Test Plan:
Search for:
- string
- author
- author, owner
- string, author
- open
- string, open, author
- string, exclude
- several authors, several owners
- nothing
- probably all other combinations
Normally, such an exhaustive test plan wouldn't be required but each combination requires a completely different query.
Reviewers: epriestley, jungejason
Reviewed By: epriestley
CC: aran, Koolvin, btrahan
Differential Revision: https://secure.phabricator.com/D2298
Summary: "Committed" is SVN-specific language, and confusing in Git and Mercurial. Use neutral language instead.
Test Plan: Inspection.
Reviewers: btrahan, Makinde, vrana, jungejason
Reviewed By: vrana
CC: aran
Maniphest Tasks: T909
Differential Revision: https://secure.phabricator.com/D2087
Summary: Last of the big final patches. Left a few debatable classes (12 out of about 400) that I'll deal with individually eventually.
Test Plan: Ran testEverythingImplemented.
Reviewers: btrahan
Reviewed By: btrahan
CC: aran, epriestley
Maniphest Tasks: T795
Differential Revision: https://secure.phabricator.com/D1881
Summary: Add audit information to the commit search index.
Test Plan: Updated a commit, searched for terms in its comments, got hits.
Reviewers: btrahan, jungejason
Reviewed By: jungejason
CC: aran, epriestley
Maniphest Tasks: T904
Differential Revision: https://secure.phabricator.com/D1696
Phabricator
Summary: ...this breaks without D1328. Used good ole "codemod" to do this
work, with lots of manual edits around 80 chars.
Test Plan: clicked around phabricator tool suite, particular differential, a
bunch
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1351