mirror of https://we.phorge.it/source/phorge.git synced 2025-02-02 09:58:24 +01:00

Adjust and wordsmith Search documentation

Summary:
Ref T12450. General adjustments:

  - Try to make "Cluster: Search" more about "stuff in common + types" instead of pretty much all being Elastic-specific, so we can add Solr or whatever later.
  - Provide guidance about rebuilding indexes after making a change.
  - Simplify the basic examples, then provide a more advanced example at the end.
  - Really try to avoid suggesting anyone configure Elasticsearch ever for any reason.

Test Plan: Read documents, previewed in remarkup.

Reviewers: chad, 20after4

Reviewed By: 20after4

Maniphest Tasks: T12450

Differential Revision: https://secure.phabricator.com/D17602
epriestley 2017-04-02 12:55:38 -07:00
parent 64234535e3
commit 287e708c4d
2 changed files with 191 additions and 69 deletions


@ -47,7 +47,7 @@ will have on availability, resistance to data loss, and scalability.
| **SSH Servers** | Minimal | Low | No Risk | Low | **SSH Servers** | Minimal | Low | No Risk | Low
| **Web Servers** | Minimal | **High** | No Risk | Moderate | **Web Servers** | Minimal | **High** | No Risk | Moderate
| **Notifications** | Minimal | Low | No Risk | Low | **Notifications** | Minimal | Low | No Risk | Low
| **Fulltext Search** | Moderate | **High** | Minimal Risk | Moderate | **Fulltext Search** | Minimal | Low | No Risk | Low
See below for a walkthrough of these services in greater detail. See below for a walkthrough of these services in greater detail.
@ -241,26 +241,14 @@ For details, see @{article:Cluster: Notifications}.
Cluster: Fulltext Search Cluster: Fulltext Search
======================== ========================
At a certain scale, you may begin to bump up against the limitations of MySQL's Configuring search services is relatively simple and has no pre-requisites.
built-in fulltext search capabilities. We have seen this with very large
installations with several million objects in the database and very many
simultaneous requests. At this point you may consider adding Elasticsearch
hosts to your cluster to reduce the load on your MySQL hosts.
Elasticsearch has the ability to spread the load across multiple hosts and can By default, Phabricator uses MySQL as a fulltext search engine, so deploying
handle very large indexes by sharding. multiple database hosts will effectively also deploy multiple fulltext search
hosts.
Search does not involve any risk of data lost because it's always possible to Search indexes can be completely rebuilt from the database, so there is no
rebuild the search index from the original database objects. This process can risk of data loss no matter how fulltext search is configured.
be very time consuming, however, especially when the database grows very large.
With multiple Elasticsearch hosts, you can survive the loss of a single host
with minimal disruption as Phabricator will detect the problem and direct
queries to one of the remaining hosts.
Phabricator supports writing to multiple indexing servers. This Simplifies
Elasticsearch upgrades and makes it possible to recover more quickly from
problems with the search index.
For details, see @{article:Cluster: Search}. For details, see @{article:Cluster: Search}.


@ -4,73 +4,207 @@
Overview Overview
======== ========
You can configure phabricator to connect to one or more fulltext search clusters You can configure Phabricator to connect to one or more fulltext search
running either Elasticsearch or MySQL. By default and without further services.
configuration, Phabricator will use MySQL for fulltext search. This will be
adequate for the vast majority of users. Installs with a very large number of By default, Phabricator will use MySQL for fulltext search. This is suitable
objects or specialized search needs can consider enabling Elasticsearch for for most installs. However, alternate engines are supported.
better scalability and potentially better search results.
Configuring Search Services Configuring Search Services
=========================== ===========================
To configure an Elasticsearch service, use the `cluster.search` configuration To configure search services, adjust the `cluster.search` configuration
option. A typical Elasticsearch configuration will probably look similar to option. This option contains a list of one or more fulltext search services,
the following example: like this:
```lang=json
[
{
"type": "...",
"hosts": [
...
],
"roles": {
"read": true,
"write": true
}
}
]
```
When a user makes a change to a document, Phabricator writes the updated
document into every configured, writable fulltext service.
When a user issues a query, Phabricator tries configured, readable services
in order until it is able to execute the query successfully.
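The write and read behavior described above can be sketched in a few lines. This is an illustrative model only, not Phabricator's implementation: the `send` and `execute` callbacks, the `ServiceUnavailable` exception, and the assumption that an omitted role defaults to enabled are all hypothetical.

```python
class ServiceUnavailable(Exception):
    """Hypothetical error raised when a service cannot handle a request."""


def role_enabled(service, role):
    # Assumption: a role that is not listed is treated as enabled.
    return service.get("roles", {}).get(role, True)


def write_document(services, document, send):
    """Publish a document to every configured, writable service."""
    written = []
    for service in services:
        if role_enabled(service, "write"):
            send(service, document)
            written.append(service["type"])
    return written


def run_query(services, query, execute):
    """Try readable services in configured order until one succeeds."""
    for service in services:
        if not role_enabled(service, "read"):
            continue
        try:
            return execute(service, query)
        except ServiceUnavailable:
            # Fall through to the next readable service in the list.
            continue
    raise ServiceUnavailable("no readable service could execute the query")
```

In this model, the order of the list matters only for reads: an earlier readable service is always preferred, and later ones act as fallbacks.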
These options are supported by all service types:
| Key | Description |
|---|---|
| `type` | Constant identifying the service type, like `mysql`.
| `roles` | Dictionary of role settings, for enabling reads and writes.
| `hosts` | List of hosts for this service.
Some service types support additional options.
Available Service Types
=======================
These service types are supported:
| Service | Key | Description |
|---|---|---|
| MySQL | `mysql` | Default MySQL fulltext index.
| Elasticsearch | `elasticsearch` | Use an external Elasticsearch service.
Fulltext Service Roles
======================
These roles are supported:
| Role | Key | Description
|---|---|---|
| Read | `read` | Allows the service to be queried when users search.
| Write | `write` | Allows documents to be published to the service.
Specifying Hosts
================
The `hosts` key should contain a list of dictionaries, each specifying the
details of a host. A service should normally have one or more hosts.
When an option is set at the service level, it serves as a default for all
hosts. It may be overridden by changing the value for a particular host.
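As a rough illustration of how service-level defaults and per-host overrides combine, the following sketch resolves the effective options for each host. The function name `effective_hosts` is hypothetical, and the shallow merge (a host value replaces the service value for the same key) is an assumption about the behavior rather than a description of Phabricator's code.

```python
def effective_hosts(service):
    """Return one option dictionary per host, with service defaults applied."""
    # Everything except "type" and "hosts" is treated as a service-level default.
    defaults = {
        key: value
        for key, value in service.items()
        if key not in ("type", "hosts")
    }

    resolved = []
    for host in service.get("hosts", []):
        options = dict(defaults)  # start from the service-level defaults
        options.update(host)      # host-level values take precedence
        resolved.append(options)
    return resolved
```

For example, a service that sets `"port": 9200` once and lists two hosts, one of which sets its own `"port"`, would resolve to the default port for the first host and the overridden port for the second.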
Service Type: MySQL
==============
The `mysql` service type does not require any configuration, and does not
need to have hosts specified. This service uses the builtin database to
index and search documents.
A typical `mysql` service configuration looks like this:
```lang=json ```lang=json
{ {
"cluster.search": [ "type": "mysql"
{
"type": "elasticsearch",
"hosts": [
{
"host": "127.0.0.1",
"roles": { "write": true, "read": true }
}
],
"port": 9200,
"protocol": "http",
"path": "/phabricator",
"version": 5
},
],
} }
``` ```
Supported Options
-----------------
| Key | Type |Comments|
|`type` | String |Engine type. Currently, 'elasticsearch' or 'mysql'|
|`protocol`| String |Either 'http' or 'https'|
|`port`| Int |The TCP port that Elasticsearch is bound to|
|`path`| String |The path portion of the url for phabricator's index.|
|`version`| Int |The version of Elasticsearch server. Supports either 2 or 5.|
|`hosts`| List |A list of one or more Elasticsearch host names / addresses.|
Host Configuration Service Type: Elasticsearch
------------------ ======================
Each search service must have one or more hosts associated with it. Each host
entry consists of a `host` key, a dictionary of roles and can optionally
override any of the options that are valid at the service level (see above).
Currently supported roles are `read` and `write`. These can be individually The `elasticsearch` service type supports these options:
enabled or disabled on a per-host basis. A typical setup might include two
elasticsearch clusters in two separate datacenters. You can configure one | Key | Description |
cluster for reads and both for writes. When one cluster is down for maintenance |---|---|
you can simply swap the read role over to the backup cluster and then proceed | `protocol` | Either `"http"` (default) or `"https"`.
with maintenance without any service interruption. | `port` | Elasticsearch TCP port.
| `version` | Elasticsearch version, either `2` or `5` (default).
| `path` | Path for the index. Defaults to `/phabricator`. Advanced.
A typical `elasticsearch` service configuration looks like this:
```lang=json
{
"type": "elasticsearch",
"hosts": [
{
"protocol": "http",
"host": "127.0.0.1",
"port": 9200
}
]
}
```
Monitoring Search Services Monitoring Search Services
========================== ==========================
You can monitor fulltext search in {nav Config > Search Servers}. This interface You can monitor fulltext search in {nav Config > Search Servers}. This
shows you a quick overview of services and their health. interface shows you a quick overview of services and their health.
The table on this page shows some basic stats for each configured service, The table on this page shows some basic stats for each configured service,
followed by the configuration and current status of each host. followed by the configuration and current status of each host.
NOTE: This page runs its diagnostics //from the web server that is serving the
request//. If you are recovering from a disaster, the view this page shows Rebuilding Indexes
may be partial or misleading, and two requests served by different servers may ==================
see different views of the cluster.
After adding new search services, you will need to rebuild document indexes
on them. To do this, first initialize the services:
```
phabricator/ $ ./bin/search init
```
This will perform index setup steps and other one-time configuration.
To populate documents in all indexes, run this command:
```
phabricator/ $ ./bin/search index --force --background --type all
```
This initiates an exhaustive rebuild of the document indexes. To get a more
detailed list of indexing options available, run:
```
phabricator/ $ ./bin/search help index
```
Advanced Example
================
This is a more advanced example which shows a configuration with multiple
different services in different roles. In this example:
- Phabricator is using an Elasticsearch 2 service as its primary fulltext
service.
- An Elasticsearch 5 service is online, but only receiving writes.
- The MySQL service is serving as a backup if Elasticsearch fails.
This particular configuration may not be very useful. It is primarily
intended to show how to configure many different options.
```lang=json
[
{
"type": "elasticsearch",
"version": 2,
"hosts": [
{
"host": "elastic2.mycompany.com",
"port": 9200,
"protocol": "http"
}
]
},
{
"type": "elasticsearch",
"version": 5,
"hosts": [
{
"host": "elastic5.mycompany.com",
"port": 9789,
"protocol": "https",
"roles": {
"read": false,
"write": true
}
}
]
},
{
"type": "mysql"
}
]
```