Summary:
The search indexing API has several problems right now:
- Always runs in-process.
- It would be nice to push this into the task queue for performance. However, the API currently passses an object all the way through (and some indexers depend on preloaded object attributes), so it can't be dumped into the task queue at any stage since we can't serialize it.
- Being able to use the task queue will also make rebuilding indexes faster.
- Instead, make the API phid-oriented.
- No uniform indexing API.
- Each "Editor" currently calls SomeCustomIndexer::indexThing(). This won't work with AbstractTransactions. The API is also just weird.
- Instead, provide a uniform API.
- No uniform CLI.
- We have `scripts/search/reindex_everything.php`, but it doesn't actually index everything. Each new document type needs to be separately added to it, leading to stuff like D3839. Third-party applications can't provide indexers.
- Instead, let indexers expose documents for indexing.
- Not application-oriented.
- All the indexers live in search/ right now, which isn't the right organization in an application-orietned view of the world.
- Instead, move indexers to applications and load them with SymbolLoader.
Test Plan:
- `bin/search index`
- Indexed one revision, one task.
- Indexed `--type TASK`, `--type DREV`, etc., for all types.
- Indexed `--all`.
- Added the word "saboteur" to a revision, task, wiki page, and question and then searched for it.
- Creating users is a pain; searched for a user after indexing.
- Creating commits is a pain; searched for a commit after indexing.
- Mocks aren't currently loadable in the result view, so their indexing is moot.
Reviewers: btrahan, vrana
Reviewed By: btrahan
CC: 20after4, aran
Maniphest Tasks: T1991, T2104
Differential Revision: https://secure.phabricator.com/D4261
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).
We are removing the headers for these reasons:
- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.
This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).
Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.
Reviewers: epriestley, davidrecordon
Reviewed By: epriestley
CC: aran, Korvin
Maniphest Tasks: T2035
Differential Revision: https://secure.phabricator.com/D3886
Summary:
We have two troubles with this script:
# Our revisions and commits don't fit in the memory. (Our tasks do :-).)
# Reindexing revisions is slow.
Test Plan: Ran it.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Korvin
Differential Revision: https://secure.phabricator.com/D3483
Summary: There are currently two files, but all scripts require both of them,
which is clearly silly. In the longer term I want to rewrite all of this init
stuff to be more structured (e.g., merge webroot/index.php and __init_script__
better) but this reduces the surface area of the ad-hoc "include files" API we
have now, at least.
Test Plan:
- Grepped for __init_env__.php (no hits)
- Ran a unit test (to test unit changes)
- Ran a daemon (to test daemon changes)
Reviewers: jungejason, nh, tuomaspelkonen, aran
Reviewed By: jungejason
CC: aran, jungejason
Differential Revision: 976
Summary:
create the indexer for commit
Test Plan:
run the reindex_one_commit.php against one existing commit
and it is returned in search result on the webpage; run parse_one_commit
against another commit; modified reindex_everything.php to let it only
parse one commit after loading all commits.
Reviewed By: epriestley
Reviewers: epriestley, aran
CC: aran, jungejason, epriestley, debow
Differential Revision: 490