@title Cluster: Repositories
@group cluster

Configuring Phabricator to use multiple repository hosts.

Overview
========

If you use Git, you can deploy Phabricator with multiple repository hosts,
configured so that each host is readable and writable. The advantages of doing
this are:

- you can completely survive the loss of repository hosts;
- reads and writes can scale across multiple machines; and
- read and write performance across multiple geographic regions may improve.

This configuration is complex, and many installs do not need to pursue it.

This configuration is not currently supported with Subversion or Mercurial.


How Reads and Writes Work
=========================

Phabricator repository replicas are multi-master: every node is readable and
writable, and a cluster of nodes can (almost always) survive the loss of any
arbitrary subset of nodes so long as at least one node is still alive.

Phabricator maintains an internal version for each repository, and increments
it when the repository is mutated.

Before responding to a read, replicas make sure their version of the repository
is up to date (no node in the cluster has a newer version of the repository).
If it isn't, they block the read until they can complete a fetch.

Before responding to a write, replicas obtain a global lock, perform the same
version check and fetch if necessary, then allow the write to continue.

Additionally, repositories passively check other nodes for updates and
replicate changes in the background. After you push a change to a repository,
it will usually spread passively to all other repository nodes within a few
minutes.

Even if passive replication is slow, the active replication makes acknowledged
changes sequential to all observers: after a write is acknowledged, all
subsequent reads are guaranteed to see it. The system does not permit stale
reads, and you do not need to wait for a replication delay to see a consistent
view of the repository no matter which node you ask.
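
The read and write rules above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Phabricator's implementation: the `ToyCluster` class and its method names are hypothetical, a "fetch" is reduced to copying a version number, and an in-process lock stands in for the global write lock.

```python
import threading


class ToyCluster:
    """Toy model of versioned replicas (hypothetical, for illustration)."""

    def __init__(self, nodes):
        # Version currently on disk on each node.
        self.versions = {node: 0 for node in nodes}
        # Stands in for the global write lock.
        self.lock = threading.Lock()

    def newest(self):
        return max(self.versions.values())

    def sync(self, node):
        # "Fetch": bring a stale node up to the newest version in the cluster.
        self.versions[node] = self.newest()

    def read(self, node):
        # Reads block until the node is up to date, then are served.
        self.sync(node)
        return self.versions[node]

    def write(self, node):
        # Writes take the lock, run the same version check, then mutate.
        with self.lock:
            self.sync(node)
            self.versions[node] += 1
            return self.versions[node]


cluster = ToyCluster(["A", "B"])
cluster.write("A")
cluster.write("A")
print(cluster.versions)   # {'A': 2, 'B': 0}: B has not replicated yet
print(cluster.read("B"))  # 2: the read forces B to catch up first
```

In this model a read from `B` can never return a version older than an acknowledged write to `A`, which is the "no stale reads" property described above.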


HTTP vs HTTPS
=============

Intracluster requests (from the daemons to repository servers, or from
webservers to repository servers) are permitted to use HTTP, even if you have
set `security.require-https` in your configuration.

It is common to terminate SSL at a load balancer and use plain HTTP beyond
that, and the `security.require-https` feature is primarily focused on making
client browser behavior more convenient for users, so it does not apply to
intracluster traffic.

Using HTTP within the cluster leaves you vulnerable to attackers who can
observe traffic within a datacenter, or observe traffic between datacenters.
This is normally very difficult, but within reach for state-level adversaries
like the NSA.

If you are concerned about these attackers, you can terminate HTTPS on
repository hosts and bind to them with the "https" protocol. Just be aware that
the `security.require-https` setting won't prevent you from making
configuration mistakes, as it doesn't cover intracluster traffic.

Other mitigations are possible, but securing a network against the NSA and
similar agents of other rogue nations is beyond the scope of this document.


Repository Hosts
================

Repository hosts must run a complete, fully configured copy of Phabricator,
including a webserver. They must also run a properly configured `sshd`.

If you are converting existing hosts into cluster hosts, you may need to
revisit @{article:Diffusion User Guide: Repository Hosting} and make sure
the system user accounts have all the necessary `sudo` permissions. In
particular, cluster devices need `sudo` access to `ssh` so they can read
device keys.

Generally, these hosts will run the same set of services and configuration that
web hosts run. If you prefer, you can overlay these services and put web and
repository services on the same hosts. See @{article:Clustering Introduction}
for some guidance on overlaying services.

When a user requests information about a repository that can only be satisfied
by examining a repository working copy, the webserver receiving the request
will make an HTTP service call to a repository server which hosts the
repository to retrieve the data it needs. It will use the result of this query
to respond to the user.


Setting up Cluster Services
===========================

To set up clustering, first register the devices that you want to use as part
of the cluster with Almanac. For details, see @{article:Cluster: Devices}.

NOTE: Once you create a service, new repositories will immediately allocate
on it. You may want to disable repository creation during initial setup.

Once the hosts are registered as devices, you can create a new service in
Almanac:

- First, register at least one device according to the device clustering
  instructions.
- Create a new service of type **Phabricator Cluster: Repository** in
  Almanac.
- Bind this service to all the interfaces on the device or devices.
- For each binding, add a `protocol` key with one of these values:
  `ssh`, `http`, `https`.

For example, a service might look like this:

- Service: `repos001.mycompany.net`
  - Binding: `repo001.mycompany.net:80`, `protocol=http`
  - Binding: `repo001.mycompany.net:2222`, `protocol=ssh`

The service itself has a `closed` property. You can set this to `true` to
disable new repository allocations on this service (for example, if it is
reaching capacity).
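
To make the shape of a service concrete, the example above can be written out as plain data with a small sanity check. This is a sketch only: the dictionary layout and the helper names are hypothetical and do not reflect how Almanac stores services internally. Only the `protocol` values and the `closed` property come from the text above.

```python
ALLOWED_PROTOCOLS = {"ssh", "http", "https"}

# The example service from above, written as plain data (hypothetical layout).
service = {
    "name": "repos001.mycompany.net",
    "closed": False,  # set to True to stop new repository allocations
    "bindings": [
        {"interface": "repo001.mycompany.net:80", "protocol": "http"},
        {"interface": "repo001.mycompany.net:2222", "protocol": "ssh"},
    ],
}


def binding_problems(service):
    """Return a human-readable problem for each binding with a bad protocol."""
    problems = []
    for binding in service["bindings"]:
        protocol = binding.get("protocol")
        if protocol not in ALLOWED_PROTOCOLS:
            problems.append(f"{binding['interface']}: bad protocol {protocol!r}")
    return problems


def accepts_new_repositories(service):
    """A closed service keeps its repositories but takes no new ones."""
    return not service["closed"]


print(binding_problems(service))          # []
print(accepts_new_repositories(service))  # True
```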


Migrating to Clustered Services
===============================

To convert existing repositories on an install into cluster repositories, you
will generally perform these steps:

- Register the existing host as a cluster device.
- Configure a single host repository service using //only// that host.

This puts you in a transitional state where repositories on the host can work
as either on-host repositories or cluster repositories. You can move forward
from here slowly and make sure services still work, with a quick path back to
safety if you run into trouble.

To move forward, migrate one repository to the service and make sure things
work correctly. If you run into issues, you can back out by migrating the
repository off the service.

To migrate a repository onto a cluster service, use this command:

```
$ ./bin/repository clusterize <repository> --service <service>
```

To migrate a repository back off a service, use this command:

```
$ ./bin/repository clusterize <repository> --remove-service
```

This command only changes how Phabricator connects to the repository; it does
not move any data or make any complex structural changes.

When Phabricator needs information about a non-clustered repository, it just
runs a command like `git log` directly on disk. When Phabricator needs
information about a clustered repository, it instead makes a service call to
another server, asking that server to run `git log` instead.

In a single-host cluster the server will make this service call to itself, so
nothing will really change. But this //is// an effective test for most
possible configuration mistakes.

If your canary repository works well, you can migrate the rest of your
repositories when ready (you can use `bin/repository list` to quickly get a
list of all repository monograms).
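
If you have many repositories, you may want to generate the `clusterize` commands rather than typing them. The sketch below only builds and prints command lines (a dry run). The monograms and the service name are hypothetical placeholders; substitute the monograms reported by `bin/repository list` and your own service name.

```python
import shlex


def clusterize_commands(monograms, service):
    """Build one `clusterize` command line per repository (dry run only)."""
    return [
        ["./bin/repository", "clusterize", monogram, "--service", service]
        for monogram in monograms
    ]


# Hypothetical monograms and service name, for illustration only.
monograms = ["rCANARY", "rTOOLS", "rWEB"]
for argv in clusterize_commands(monograms, "repos001.mycompany.net"):
    print(shlex.join(argv))
```

Printing first lets you review the list before anything changes. To actually run a command, you could pass each `argv` list to `subprocess.run(argv, check=True)` from the `phabricator/` directory.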

Once all repositories are migrated, you've reached a stable state and can
remain here as long as you want. This state is sufficient to convert daemons,
SSH, and web services into clustered versions and spread them across multiple
machines if those goals are more interesting.

Obviously, your single-device "cluster" will not be able to survive the loss of
the single repository host, but you can take as long as you want to expand the
cluster and add redundancy.

After creating a service, you do not need to `clusterize` new repositories:
they will automatically allocate onto an open service.

When you're ready to expand the cluster, continue below.


Expanding a Cluster
===================

To expand an existing cluster, follow these general steps:

- Register new devices in Almanac.
- Add bindings to the new devices to the repository service, also in Almanac.
- Start the daemons on the new devices.

For instructions on configuring and registering devices, see
@{article:Cluster: Devices}.

As soon as you add active bindings to a service, Phabricator will begin
synchronizing repositories and sending traffic to the new device. You do not
need to copy any repository data to the device: Phabricator will automatically
synchronize it.

If you have a large amount of repository data, you may want to help this
process along by copying the repository directory from an existing cluster
device before bringing the new host online. This is optional, but can reduce
the amount of time required to fully synchronize the cluster.

You do not need to synchronize the most up-to-date data or stop writes during
this process. For example, loading the most recent backup snapshot onto the new
device will substantially reduce the amount of data that needs to be
synchronized.


Contracting a Cluster
=====================

If you want to remove working devices from a cluster (for example, to take
hosts down for maintenance), first do this for each device:

- Change the `writable` property on the bindings to "Prevent Writes".
- Wait a few moments until the cluster synchronizes (see
  "Monitoring Services" below).

This will ensure that the device you're about to remove is not the only cluster
leader, even if the cluster is receiving a high write volume. You can skip this
step if the device isn't working properly to start with.
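
The "only cluster leader" condition can be illustrated with a small check. This is a sketch under assumptions: the version numbers are hypothetical values copied by hand from the monitoring screens, and the function names are not part of Phabricator.

```python
def leaders(versions):
    """Devices holding the highest version of one repository."""
    if not versions:
        return set()
    newest = max(versions.values())
    return {device for device, version in versions.items() if version == newest}


def repositories_at_risk(cluster, device):
    """Repositories for which `device` is currently the only leader."""
    return sorted(
        repo for repo, versions in cluster.items()
        if leaders(versions) == {device}
    )


# Hypothetical snapshot: repository -> {device: version on disk}.
cluster = {
    "rA": {"repo001": 7, "repo002": 7},
    "rB": {"repo001": 4, "repo002": 3},
}
print(repositories_at_risk(cluster, "repo001"))  # ['rB']
```

Here `rB` has a change that exists only on `repo001`, so removing `repo001` before the cluster synchronizes would leave `rB` without a leader.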

Once you've stopped writes and waited for synchronization (or if the hosts are
not working in the first place) do this for each device:

- Disable the bindings from the service to the device in Almanac.

If you are removing a device because it failed abruptly (or removing several
devices at once; or you skip the "Prevent Writes" step), it is possible that
some repositories will have lost all their leaders. See "Loss of Leaders" below
to understand and resolve this.

If you want to put the hosts back in service later:

- Enable the bindings again.
- Change `writable` back to "Allow Writes".

This will restore the cluster to the original state.


Monitoring Services
===================

You can get an overview of repository cluster status from the
{nav Config > Repository Servers} screen. This table shows a high-level
overview of all active repository services.

**Repos**: The number of repositories hosted on this service.

**Sync**: Synchronization status of repositories on this service. This is an
at-a-glance view of service health, and can show these values:

- **Synchronized**: All nodes are fully synchronized and have the latest
  version of all repositories.
- **Partial**: All repositories either have at least two leaders, or have
  a very recent write which is not expected to have propagated yet.
- **Unsynchronized**: At least one repository has changes which are
  only available on one node and were not pushed very recently. Data may
  be at risk.
- **No Repositories**: This service has no repositories.
- **Ambiguous Leader**: At least one repository has an ambiguous leader.
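
One way to read the status values above is as a classification over per-repository versions. The sketch below is an interpretation, not the actual logic behind the screen: the data layout, the `RECENT_SECONDS` cutoff, and the precedence between statuses are all assumptions.

```python
RECENT_SECONDS = 300  # assumed cutoff for "very recent"; the real value may differ


def sync_status(repos):
    """Classify a service from a list of per-repository snapshots.

    Each snapshot is {"versions": {device: int or None},
                      "seconds_since_write": int}; None means "unknown".
    """
    if not repos:
        return "No Repositories"
    worst = "Synchronized"
    for repo in repos:
        known = [v for v in repo["versions"].values() if v is not None]
        if not known:
            # No device has a recorded version, so no leader can be named.
            return "Ambiguous Leader"
        leader_count = known.count(max(known))
        if leader_count == len(repo["versions"]):
            continue  # every device has the newest version
        recent = repo["seconds_since_write"] <= RECENT_SECONDS
        if leader_count >= 2 or recent:
            if worst == "Synchronized":
                worst = "Partial"
        else:
            worst = "Unsynchronized"
    return worst


print(sync_status([{"versions": {"a": 3, "b": 2}, "seconds_since_write": 10}]))
# Partial
```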

If this screen identifies problems, you can drill down into repository details
to get more information about them. See the next section for details.


Monitoring Repositories
=======================

You can get a more detailed view of the current status of a specific repository
on cluster devices in {nav Diffusion > (Repository) > Manage Repository >
Cluster Configuration}.

This screen shows all the configured devices which are hosting the repository
and the available version on each device.

**Version**: When a repository is mutated by a push, Phabricator increases
an internal version number for the repository. This column shows which version
is on disk on the corresponding device.

After a change is pushed, the device which received the change will have a
larger version number than the other devices. The change should be passively
replicated to the remaining devices after a brief period of time, although this
can take a while if the change was large or the network connection between
devices is slow or unreliable.

You can click the version number to see the corresponding push logs for that
change. The logs contain details about what was changed, and can help you
identify if replication is slow because a change is large or for some other
reason.

**Writing**: This shows that the device is currently holding a write lock. This
normally means that it is actively receiving a push, but can also mean that
there was a write interruption. See "Write Interruptions" below for details.

**Last Writer**: This column identifies the user who most recently pushed a
change to this device. If the write lock is currently held, this user is
the user whose change is holding the lock.

**Last Write At**: When the most recent write started. If the write lock is
currently held, this shows when the lock was acquired.


Cluster Failure Modes
=====================

There are three major cluster failure modes:

- **Write Interruptions**: A write started but did not complete, leaving
  the disk state and cluster state out of sync.
- **Loss of Leaders**: None of the devices with the most up-to-date data
  are reachable.
- **Ambiguous Leaders**: The internal state of the repository is unclear.

Phabricator can detect these issues, and responds by freezing the repository
(usually preventing all reads and writes) until the issue is resolved. These
conditions are normally rare and very little data is at risk, but Phabricator
errs on the side of caution and requires decisions which may result in data
loss to be confirmed by a human.

The next sections cover these failure modes and appropriate responses in
more detail. In general, you will respond to these issues by assessing the
situation and then possibly choosing to discard some data.
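
The three failure modes can be told apart from a few observable facts. The sketch below is a rough illustration, not Phabricator's detection logic: the function name and inputs are hypothetical, "unknown version" is modeled as `None`, and a write interruption is assumed to mean that the device holding the write lock is unreachable.

```python
def failure_mode(versions, reachable, write_lock_holder=None):
    """Classify one repository's state, or return None if it looks healthy.

    versions: device -> version number, or None if unknown
    reachable: set of devices that can currently be contacted
    write_lock_holder: device holding the write lock, if any
    """
    # Assumption: a lock held by an unreachable device is an interrupted write.
    if write_lock_holder is not None and write_lock_holder not in reachable:
        return "Write Interruption"
    known = {d: v for d, v in versions.items() if v is not None}
    if not known:
        # No version information at all, so no leader can be identified.
        return "Ambiguous Leaders"
    newest = max(known.values())
    leader_devices = {d for d, v in known.items() if v == newest}
    if not leader_devices & set(reachable):
        return "Loss of Leaders"
    return None


print(failure_mode({"X": 5, "Y": 4}, reachable={"Y"}))  # Loss of Leaders
```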


Write Interruptions
===================

A repository cluster can be put into an inconsistent state by an interruption
in a brief window during and immediately after a write. This looks like this:

- A change is pushed to a server.
- The server acquires a write lock and begins writing the change.
- During or immediately after the write, lightning strikes the server
  and destroys it.

Phabricator can not commit changes to a working copy (stored on disk) and to
the global state (stored in a database) atomically, so there is necessarily a
narrow window between committing these two different states when some tragedy
can befall a server, leaving the global and local views of the repository state
possibly divergent.

In these cases, Phabricator fails into a frozen state where further writes
are not permitted until the failure is investigated and resolved. When a
repository is frozen in this way it remains readable.

You can use the monitoring console to review the state of a frozen repository
with a held write lock. The **Writing** column will show which device is
holding the lock, and whoever is named in the **Last Writer** column may be
able to help you figure out what happened by providing more information about
what they were doing and what they observed.

Because the push was not acknowledged, it is normally safe to resolve this
issue by demoting the device. Demoting the device will undo any changes
committed by the push, and they will be lost forever.

However, the user should have received an error anyway, and should not expect
their push to have worked. Still, data is technically at risk and you may want
to investigate further and try to understand the issue in more detail before
continuing.

There is no way to explicitly keep the write, but if it was committed to disk
you can recover it manually from the working copy on the device (for example,
by using `git format-patch`) and then push it again after recovering.

If you demote the device, the in-process write will be thrown away, even if it
was complete on disk. To demote the device and release the write lock, run this
command:

```
phabricator/ $ ./bin/repository thaw <repository> --demote <device>
```

{icon exclamation-triangle, color="yellow"} Any committed but unacknowledged
data on the device will be lost.
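
The effect of a demotion on the cluster state can be pictured with a toy model. This is a sketch under assumptions, not what `bin/repository thaw` does internally: the state layout is hypothetical, and demotion is modeled as forgetting the device's version record and releasing its write lock.

```python
def demote(state, device):
    """Toy demotion: drop the device's version record and release its lock."""
    versions = {d: v for d, v in state["versions"].items() if d != device}
    lock = None if state["write_lock"] == device else state["write_lock"]
    return {"versions": versions, "write_lock": lock}


def is_frozen(state):
    # In this sketch the lock holder is assumed to be dead, so a held
    # write lock means the repository is frozen for writes.
    return state["write_lock"] is not None


state = {"versions": {"repo001": 5, "repo002": 5}, "write_lock": "repo001"}
after = demote(state, "repo001")
print(is_frozen(state), is_frozen(after))  # True False
print(after["versions"])                   # {'repo002': 5}
```

The interrupted write never reached the recorded versions in this model, which mirrors why an unacknowledged push is normally safe to discard.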


Loss of Leaders
===============

A more straightforward failure condition is the loss of all servers in a
cluster which have the most up-to-date copy of a repository. This looks like
this:

- There is a cluster setup with two devices, X and Y.
- A new change is pushed to server X.
- Before the change can propagate to server Y, lightning strikes server X
  and destroys it.

Here, all of the "leader" devices with the most up-to-date copy of the
repository have been lost. Phabricator will freeze the repository and refuse to
serve requests because it can not serve reads consistently and can not accept
new writes without data loss.

The most straightforward way to resolve this issue is to restore any leader to
service. The change will be able to replicate to other devices once a leader
comes back online.

If you are unable to restore a leader or unsure that you can restore one
quickly, you can use the monitoring console to review which changes are
present on the leaders but not present on the followers by examining the
push logs.

If you are comfortable discarding these changes, you can instruct Phabricator
that it can forget about the leaders by doing this:

- Disable the service bindings to all of the leader devices so they are no
  longer part of the cluster.
- Then, use `bin/repository thaw` to `--demote` the leaders explicitly.

To demote a device, run this command:

```
phabricator/ $ ./bin/repository thaw rXYZ --demote repo002.corp.net
```

{icon exclamation-triangle, color="red"} Any data which is only present on
the demoted device will be lost.

If you do this, **you will lose unreplicated data**. You will discard any
changes on the affected leaders which have not replicated to other devices
in the cluster.
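
Before demoting, it can help to work out how many versions you are about to discard. The sketch below assumes that versions are consecutive integers and that each one corresponds to a push you can look up in the push logs. The function name and the example numbers are hypothetical.

```python
def versions_at_risk(versions, lost_devices):
    """Version numbers that exist only on the lost devices."""
    surviving = [v for d, v in versions.items() if d not in lost_devices]
    lost = [v for d, v in versions.items() if d in lost_devices]
    if not lost:
        return []
    # Everything above the newest surviving version would be discarded.
    floor = max(surviving, default=0)
    return list(range(floor + 1, max(lost) + 1))


# Device X received two pushes that never reached Y, then was destroyed.
print(versions_at_risk({"X": 12, "Y": 10}, lost_devices={"X"}))  # [11, 12]
```

An empty result means the followers already have everything the lost leaders had, so demoting discards nothing in this model.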


Ambiguous Leaders
=================

Repository clusters can also freeze if the leader devices are ambiguous. This
can happen if you replace an entire cluster with new devices suddenly, or make
a mistake with the `--demote` flag. This may arise from some kind of operator
error, like these:

- Someone accidentally uses `bin/repository thaw ... --demote` to demote
  every device in a cluster.
- Someone accidentally deletes all the version information for a repository
  from the database by making a mistake with a `DELETE` or `UPDATE` query.
- Someone accidentally disables all of the devices in a cluster, then adds
  entirely new ones before repositories can propagate.

If you are moving repositories into cluster services, you can also reach this
state if you use `clusterize` to associate a repository with a service that is
bound to multiple active devices. In this case, Phabricator will not know which
device or devices have up-to-date information.

When Phabricator can not tell which device in a cluster is a leader, it freezes
the cluster because it is possible that some devices have less data and others
have more, and if it chooses a leader arbitrarily it may destroy some data
which you would prefer to retain.

To resolve this, you need to tell Phabricator which device has the most
up-to-date data and promote that device to become a leader. If you know all
devices have the same data, you are free to promote any device.

If you promote a device, **you may lose data** if you promote the wrong device
and some other device really had more up-to-date data. If you want to double
check, you can examine the working copies on disk before promoting by
connecting to the machines and using commands like `git log` to inspect state.

Once you have identified a device which has data you're happy with, use
`bin/repository thaw` to `--promote` the device. The data on the chosen
device will become authoritative:

```
phabricator/ $ ./bin/repository thaw rXYZ --promote repo002.corp.net
```

{icon exclamation-triangle, color="red"} Any data which is only present on
**other** devices will be lost.
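
When choosing which device to promote, it can help to compare what you found on each device. The sketch below groups devices by the branch heads you recorded by hand (for example, from `git log` on each machine). The device names, branch names, and hashes are hypothetical, and the helper is not part of Phabricator.

```python
def promotion_candidates(heads_by_device):
    """Group devices whose recorded branch heads are identical."""
    groups = {}
    for device, heads in heads_by_device.items():
        # frozenset makes the branch -> hash mapping usable as a dict key.
        groups.setdefault(frozenset(heads.items()), []).append(device)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical observations: device -> {branch: head commit}.
observed = {
    "repo001": {"master": "a1b2c3"},
    "repo002": {"master": "a1b2c3"},
    "repo003": {"master": "9f8e7d"},
}
print(promotion_candidates(observed))
# [['repo001', 'repo002'], ['repo003']]
```

A single group means every device agrees, so any of them is a safe choice. More than one group means you need to decide which state to keep before running `--promote`.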


Backups
=======

Even if you configure clustering, you should still consider retaining separate
backup snapshots. Replicas protect you from data loss if you lose a host, but
they do not let you rewind time to recover from data mutation mistakes.

If something issues a `--force` push that destroys branch heads, the mutation
will propagate to the replicas.

You may be able to manually restore the branches by using tools like the
Phabricator push log or the Git reflog, so it is less important to retain
repository snapshots than database snapshots, but it is still possible for
data to be lost permanently, especially if you don't notice the problem for
some time.

Retaining separate backup snapshots will improve your ability to recover more
data more easily in a wider range of disaster situations.


Next Steps
==========

Continue by:

- returning to @{article:Clustering Introduction}.