mirror of
https://we.phorge.it/source/phorge.git
synced 2025-01-22 20:51:10 +01:00
Adjust and wordsmith Search documentation
Summary: Ref T12450. General adjustments:

- Try to make "Cluster: Search" more about "stuff in common + types" instead of pretty much all being Elastic-specific, so we can add Solr or whatever later.
- Provide guidance about rebuilding indexes after making a change.
- Simplify the basic examples, then provide a more advanced example at the end.
- Really try to avoid suggesting anyone configure Elasticsearch ever for any reason.

Test Plan: Read documents, previewed in remarkup.

Reviewers: chad, 20after4

Reviewed By: 20after4

Maniphest Tasks: T12450

Differential Revision: https://secure.phabricator.com/D17602
This commit is contained in:
parent
64234535e3
commit
287e708c4d
2 changed files with 191 additions and 69 deletions
|
@ -47,7 +47,7 @@ will have on availability, resistance to data loss, and scalability.
|
|||
| **SSH Servers** | Minimal | Low | No Risk | Low
|
||||
| **Web Servers** | Minimal | **High** | No Risk | Moderate
|
||||
| **Notifications** | Minimal | Low | No Risk | Low
|
||||
| **Fulltext Search** | Moderate | **High** | Minimal Risk | Moderate
|
||||
| **Fulltext Search** | Minimal | Low | No Risk | Low
|
||||
|
||||
See below for a walkthrough of these services in greater detail.
|
||||
|
||||
|
@ -241,26 +241,14 @@ For details, see @{article:Cluster: Notifications}.
|
|||
Cluster: Fulltext Search
|
||||
========================
|
||||
|
||||
At a certain scale, you may begin to bump up against the limitations of MySQL's
|
||||
built-in fulltext search capabilities. We have seen this with very large
|
||||
installations with several million objects in the database and very many
|
||||
simultaneous requests. At this point you may consider adding Elasticsearch
|
||||
hosts to your cluster to reduce the load on your MySQL hosts.
|
||||
Configuring search services is relatively simple and has no pre-requisites.
|
||||
|
||||
Elasticsearch has the ability to spread the load across multiple hosts and can
|
||||
handle very large indexes by sharding.
|
||||
By default, Phabricator uses MySQL as a fulltext search engine, so deploying
|
||||
multiple database hosts will effectively also deploy multiple fulltext search
|
||||
hosts.
|
||||
|
||||
Search does not involve any risk of data loss because it's always possible to
|
||||
rebuild the search index from the original database objects. This process can
|
||||
be very time consuming, however, especially when the database grows very large.
|
||||
|
||||
With multiple Elasticsearch hosts, you can survive the loss of a single host
|
||||
with minimal disruption as Phabricator will detect the problem and direct
|
||||
queries to one of the remaining hosts.
|
||||
|
||||
Phabricator supports writing to multiple indexing servers. This simplifies
|
||||
Elasticsearch upgrades and makes it possible to recover more quickly from
|
||||
problems with the search index.
|
||||
Search indexes can be completely rebuilt from the database, so there is no
|
||||
risk of data loss no matter how fulltext search is configured.
|
||||
|
||||
For details, see @{article:Cluster: Search}.
|
||||
|
||||
|
|
|
@ -4,73 +4,207 @@
|
|||
Overview
|
||||
========
|
||||
|
||||
You can configure Phabricator to connect to one or more fulltext search clusters
|
||||
running either Elasticsearch or MySQL. By default and without further
|
||||
configuration, Phabricator will use MySQL for fulltext search. This will be
|
||||
adequate for the vast majority of users. Installs with a very large number of
|
||||
objects or specialized search needs can consider enabling Elasticsearch for
|
||||
better scalability and potentially better search results.
|
||||
You can configure Phabricator to connect to one or more fulltext search
|
||||
services.
|
||||
|
||||
By default, Phabricator will use MySQL for fulltext search. This is suitable
|
||||
for most installs. However, alternate engines are supported.
|
||||
|
||||
|
||||
Configuring Search Services
|
||||
===========================
|
||||
|
||||
To configure an Elasticsearch service, use the `cluster.search` configuration
|
||||
option. A typical Elasticsearch configuration will probably look similar to
|
||||
the following example:
|
||||
To configure search services, adjust the `cluster.search` configuration
|
||||
option. This option contains a list of one or more fulltext search services,
|
||||
like this:
|
||||
|
||||
```lang=json
[
  {
    "type": "...",
    "hosts": [
      ...
    ],
    "roles": {
      "read": true,
      "write": true
    }
  }
]
```
|
||||
|
||||
When a user makes a change to a document, Phabricator writes the updated
|
||||
document into every configured, writable fulltext service.
|
||||
|
||||
When a user issues a query, Phabricator tries configured, readable services
|
||||
in order until it is able to execute the query successfully.
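
The write fan-out and read failover described above can be sketched in a few lines of Python. This is only an illustrative model, not Phabricator's implementation: `FakeService`, `write_document`, and `run_query` are hypothetical names, and the source does not say what happens when a write fails, so the sketch simply lets that error propagate.

```python
class FakeService:
    """Hypothetical in-memory stand-in for a fulltext service."""

    def __init__(self, name, read=True, write=True, healthy=True):
        self.name = name
        self.read = read
        self.write = write
        self.healthy = healthy
        self.documents = {}

    def index(self, doc_id, text):
        if not self.healthy:
            raise ConnectionError(self.name + " is unreachable")
        self.documents[doc_id] = text

    def search(self, term):
        if not self.healthy:
            raise ConnectionError(self.name + " is unreachable")
        return sorted(d for d, t in self.documents.items() if term in t)


def write_document(services, doc_id, text):
    """Publish the document to every writable service; return their names."""
    written = []
    for service in services:
        if service.write:
            service.index(doc_id, text)
            written.append(service.name)
    return written


def run_query(services, term):
    """Try readable services in configured order until one succeeds."""
    for service in services:
        if not service.read:
            continue
        try:
            return service.name, service.search(term)
        except ConnectionError:
            continue  # fall through to the next readable service
    raise RuntimeError("no readable service could execute the query")


services = [
    FakeService("elastic"),
    FakeService("elastic-new", read=False),
    FakeService("mysql"),
]
written = write_document(services, "T1", "rebuild the search index")

# Simulate the first service going down, then query.
services[0].healthy = False
answered_by, results = run_query(services, "search")
```

In this run the document is written to all three services, and the query is answered by the last one because the first is down and the second is not readable.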
|
||||
|
||||
These options are supported by all service types:
|
||||
|
||||
| Key | Description |
|---|---|
| `type` | Constant identifying the service type, like `mysql`.
| `roles` | Dictionary of role settings, for enabling reads and writes.
| `hosts` | List of hosts for this service.
|
||||
|
||||
Some service types support additional options.
|
||||
|
||||
Available Service Types
|
||||
=======================
|
||||
|
||||
These service types are supported:
|
||||
|
||||
| Service | Key | Description |
|---|---|---|
| MySQL | `mysql` | Default MySQL fulltext index.
| Elasticsearch | `elasticsearch` | Use an external Elasticsearch service
|
||||
|
||||
|
||||
Fulltext Service Roles
|
||||
======================
|
||||
|
||||
These roles are supported:
|
||||
|
||||
| Role | Key | Description
|---|---|---|
| Read | `read` | Allows the service to be queried when users search.
| Write | `write` | Allows documents to be published to the service.
|
||||
|
||||
|
||||
Specifying Hosts
|
||||
================
|
||||
|
||||
The `hosts` key should contain a list of dictionaries, each specifying the
|
||||
details of a host. A service should normally have one or more hosts.
|
||||
|
||||
When an option is set at the service level, it serves as a default for all
|
||||
hosts. It may be overridden by changing the value for a particular host.
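
As a rough illustration of this inheritance rule, the sketch below flattens a service definition into one dictionary per host. It assumes a simple shallow merge in which host values win; `resolve_hosts` and the `example.com` host names are hypothetical and are not part of Phabricator.

```python
def resolve_hosts(service):
    """Return one flat dict per host, using service-level values as defaults."""
    # Everything except the structural keys is treated as an inheritable option.
    defaults = {k: v for k, v in service.items() if k not in ("type", "hosts")}
    resolved = []
    for host in service.get("hosts", []):
        merged = dict(defaults)
        merged.update(host)  # host-level values override service-level ones
        resolved.append(merged)
    return resolved


service = {
    "type": "elasticsearch",
    "port": 9200,
    "protocol": "http",
    "hosts": [
        {"host": "search001.example.com"},
        {"host": "search002.example.com", "port": 9201},
    ],
}
hosts = resolve_hosts(service)
```

Here the first host inherits port `9200` from the service, while the second keeps its own port `9201` and still inherits the protocol.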
|
||||
|
||||
|
||||
Service Type: MySQL
|
||||
==============
|
||||
|
||||
The `mysql` service type does not require any configuration, and does not
|
||||
need to have hosts specified. This service uses the built-in database to
|
||||
index and search documents.
|
||||
|
||||
A typical `mysql` service configuration looks like this:
|
||||
|
||||
```lang=json
|
||||
{
|
||||
"cluster.search": [
|
||||
{
|
||||
"type": "elasticsearch",
|
||||
"hosts": [
|
||||
{
|
||||
"host": "127.0.0.1",
|
||||
"roles": { "write": true, "read": true }
|
||||
}
|
||||
],
|
||||
"port": 9200,
|
||||
"protocol": "http",
|
||||
"path": "/phabricator",
|
||||
"version": 5
|
||||
},
|
||||
],
|
||||
"type": "mysql"
|
||||
}
|
||||
```
|
||||
|
||||
Supported Options
|
||||
-----------------
|
||||
| Key | Type |Comments|
|
||||
|`type` | String |Engine type. Currently, 'elasticsearch' or 'mysql'|
|
||||
|`protocol`| String |Either 'http' or 'https'|
|
||||
|`port`| Int |The TCP port that Elasticsearch is bound to|
|
||||
|`path`| String |The path portion of the url for phabricator's index.|
|
||||
|`version`| Int |The version of Elasticsearch server. Supports either 2 or 5.|
|
||||
|`hosts`| List |A list of one or more Elasticsearch host names / addresses.|
|
||||
|
||||
Host Configuration
|
||||
------------------
|
||||
Each search service must have one or more hosts associated with it. Each host
|
||||
entry consists of a `host` key, a dictionary of roles and can optionally
|
||||
override any of the options that are valid at the service level (see above).
|
||||
Service Type: Elasticsearch
|
||||
======================
|
||||
|
||||
Currently supported roles are `read` and `write`. These can be individually
|
||||
enabled or disabled on a per-host basis. A typical setup might include two
|
||||
elasticsearch clusters in two separate datacenters. You can configure one
|
||||
cluster for reads and both for writes. When one cluster is down for maintenance
|
||||
you can simply swap the read role over to the backup cluster and then proceed
|
||||
with maintenance without any service interruption.
|
||||
The `elasticsearch` service type supports these options:
|
||||
|
||||
| Key | Description |
|---|---|
| `protocol` | Either `"http"` (default) or `"https"`.
| `port` | Elasticsearch TCP port.
| `version` | Elasticsearch version, either `2` or `5` (default).
| `path` | Path for the index. Defaults to `/phabricator`. Advanced.
|
||||
|
||||
A typical `elasticsearch` service configuration looks like this:
|
||||
|
||||
```lang=json
{
  "type": "elasticsearch",
  "hosts": [
    {
      "protocol": "http",
      "host": "127.0.0.1",
      "port": 9200
    }
  ]
}
```
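
Before saving a configuration like this, it can help to sanity-check it against the option table above. The following is a minimal sketch under stated assumptions: `check_elasticsearch_service` is a hypothetical helper, not a Phabricator tool, and it only checks the allowed values documented here (protocol, version, and a plausible TCP port range).

```python
ALLOWED_PROTOCOLS = {"http", "https"}
ALLOWED_VERSIONS = {2, 5}


def check_elasticsearch_service(service):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if service.get("type") != "elasticsearch":
        problems.append("type must be 'elasticsearch'")
    hosts = service.get("hosts") or []
    if not hosts:
        problems.append("at least one host is expected")
    for index, host in enumerate(hosts):
        # Host-level values take precedence over service-level ones.
        protocol = host.get("protocol", service.get("protocol", "http"))
        version = host.get("version", service.get("version", 5))
        port = host.get("port", service.get("port"))
        if "host" not in host:
            problems.append("hosts[%d] has no 'host' key" % index)
        if protocol not in ALLOWED_PROTOCOLS:
            problems.append("hosts[%d] has unsupported protocol %r" % (index, protocol))
        if version not in ALLOWED_VERSIONS:
            problems.append("hosts[%d] has unsupported version %r" % (index, version))
        if port is not None and not (isinstance(port, int) and 0 < port < 65536):
            problems.append("hosts[%d] has invalid port %r" % (index, port))
    return problems


good = {
    "type": "elasticsearch",
    "hosts": [{"protocol": "http", "host": "127.0.0.1", "port": 9200}],
}
bad = {
    "type": "elasticsearch",
    "hosts": [{"host": "127.0.0.1", "protocol": "ftp", "version": 3}],
}
good_problems = check_elasticsearch_service(good)
bad_problems = check_elasticsearch_service(bad)
```

The `good` entry mirrors the typical configuration above and produces no problems; the `bad` entry is flagged once for its protocol and once for its version.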
|
||||
|
||||
Monitoring Search Services
|
||||
==========================
|
||||
|
||||
You can monitor fulltext search in {nav Config > Search Servers}. This interface
|
||||
shows you a quick overview of services and their health.
|
||||
You can monitor fulltext search in {nav Config > Search Servers}. This
|
||||
interface shows you a quick overview of services and their health.
|
||||
|
||||
The table on this page shows some basic stats for each configured service,
|
||||
followed by the configuration and current status of each host.
|
||||
|
||||
NOTE: This page runs its diagnostics //from the web server that is serving the
|
||||
request//. If you are recovering from a disaster, the view this page shows
|
||||
may be partial or misleading, and two requests served by different servers may
|
||||
see different views of the cluster.
|
||||
|
||||
Rebuilding Indexes
|
||||
==================
|
||||
|
||||
After adding new search services, you will need to rebuild document indexes
|
||||
on them. To do this, first initialize the services:
|
||||
|
||||
```
phabricator/ $ ./bin/search init
```
|
||||
|
||||
This will perform index setup steps and other one-time configuration.
|
||||
|
||||
To populate documents in all indexes, run this command:
|
||||
|
||||
```
phabricator/ $ ./bin/search index --force --background --type all
```
|
||||
|
||||
This initiates an exhaustive rebuild of the document indexes. To get a more
|
||||
detailed list of indexing options available, run:
|
||||
|
||||
```
phabricator/ $ ./bin/search help index
```
|
||||
|
||||
|
||||
Advanced Example
|
||||
================
|
||||
|
||||
This is a more advanced example which shows a configuration with multiple
|
||||
different services in different roles. In this example:
|
||||
|
||||
- Phabricator is using an Elasticsearch 2 service as its primary fulltext
|
||||
service.
|
||||
- An Elasticsearch 5 service is online, but only receiving writes.
|
||||
- The MySQL service is serving as a backup if Elasticsearch fails.
|
||||
|
||||
This particular configuration may not be very useful. It is primarily
|
||||
intended to show how to configure many different options.
|
||||
|
||||
|
||||
```lang=json
[
  {
    "type": "elasticsearch",
    "version": 2,
    "hosts": [
      {
        "host": "elastic2.mycompany.com",
        "port": 9200,
        "protocol": "http"
      }
    ]
  },
  {
    "type": "elasticsearch",
    "version": 5,
    "hosts": [
      {
        "host": "elastic5.mycompany.com",
        "port": 9789,
        "protocol": "https",
        "roles": {
          "read": false,
          "write": true
        }
      }
    ]
  },
  {
    "type": "mysql"
  }
]
```
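
To see how the roles in this example play out, the sketch below works out which services would be read from and which would receive writes. It assumes that both roles are enabled when `roles` is omitted, which is inferred from the description of this example rather than stated outright; `effective_roles` is a hypothetical helper, and the embedded JSON is a copy of the configuration with valid comma placement.

```python
import json

CONFIG = json.loads("""
[
  {
    "type": "elasticsearch",
    "version": 2,
    "hosts": [
      {"host": "elastic2.mycompany.com", "port": 9200, "protocol": "http"}
    ]
  },
  {
    "type": "elasticsearch",
    "version": 5,
    "hosts": [
      {
        "host": "elastic5.mycompany.com",
        "port": 9789,
        "protocol": "https",
        "roles": {"read": false, "write": true}
      }
    ]
  },
  {"type": "mysql"}
]
""")


def effective_roles(service):
    """Yield (label, roles) per host; a service without hosts yields one entry."""
    service_roles = {"read": True, "write": True}  # assumed default
    service_roles.update(service.get("roles", {}))
    for host in service.get("hosts") or [{}]:
        roles = dict(service_roles)
        roles.update(host.get("roles", {}))  # host-level roles override
        yield host.get("host", service["type"]), roles


readers = []
writers = []
for service in CONFIG:
    for label, roles in effective_roles(service):
        if roles["read"]:
            readers.append(label)
        if roles["write"]:
            writers.append(label)

print("read order:", readers)
print("write targets:", writers)
```

Under these assumptions, queries try the Elasticsearch 2 host first and fall back to MySQL, while all three services receive writes.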
|
||||
|
|