mirror of https://we.phorge.it/source/phorge.git (synced 2025-01-10 14:51:06 +01:00)
Write "Why does Phabricator need so many databases?"
Summary: We will sell you as many new databases as you want, cheap! Just $1 per database!

Test Plan: (O).(O)

Reviewers: chad

Reviewed By: chad

Differential Revision: https://secure.phabricator.com/D15249
parent 50b8815e44
commit f5c8a2fb18
2 changed files with 134 additions and 4 deletions
@@ -28,11 +28,10 @@ Databases
 =========

 Each Phabricator application has its own database. The names are prefixed by
-`phabricator_` (this is configurable). This design has two advantages:
+`phabricator_` (this is configurable).

-- Each database is easier to comprehend and to maintain.
-- We don't do cross-database joins so each database can live on its own
-machine. This gives us flexibility in sharding data later.
+Phabricator uses a separate database for each application. To understand why,
+see @{article:Why does Phabricator need so many databases?}.

 Connections
 ===========

131 src/docs/flavor/so_many_databases.diviner Normal file
@@ -0,0 +1,131 @@
@title Why does Phabricator need so many databases?
@group lore

Phabricator uses about 60 databases (and we may have added more by the time you
read this document). This sometimes comes as a surprise, since you might assume
it would only use one database.

The approach we use is designed to work at scale for huge installs with many
thousands of users. We care a lot about working well for large installs, and
about scaling up gracefully to meet the needs of growing organizations. We want
small startups to be able to install Phabricator and have it grow with them as
they expand to many thousands of employees.

A cost of this approach is that it makes Phabricator more difficult to install
on shared hosts which require a lot of work to create or authorize access to
each database. However, Phabricator does a lot of advanced or complex things
which are difficult to configure or manage on shared hosts, and we don't
recommend installing it on a shared host. The install documentation explicitly
discourages installing on shared hosts.

Broadly, in cases where we must choose between operating well at scale for
growing organizations and installing easily on shared hosts, we prioritize
operating at scale.


Listing Databases
=================

You can get a full list of the databases Phabricator needs with `bin/storage
databases`. It will look something like this:

```
$ /core/lib/phabricator/bin/storage databases
secure_audit
secure_calendar
secure_chatlog
secure_conduit
secure_countdown
secure_daemon
secure_differential
secure_draft
secure_drydock
secure_feed
...<dozens more databases>...
```

Roughly, each application has its own database, and then there are some
databases which support internal systems or shared infrastructure.
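
If you want to work with this list in a script, a small parser is usually enough. The sketch below is illustrative only: it assumes the output is one database name per line, as in the sample above, and the helper names `parse_database_list` and `application_name` are made up for this example, not part of Phabricator.

```python
from typing import List


def parse_database_list(output: str) -> List[str]:
    """Return database names from text shaped like the listing above."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, a pasted shell prompt line, and elision markers.
        if not line or line.startswith("$") or line.startswith("..."):
            continue
        names.append(line)
    return names


def application_name(database: str, namespace: str) -> str:
    """Strip the storage namespace prefix, e.g. `secure_audit` -> `audit`."""
    prefix = namespace + "_"
    if not database.startswith(prefix):
        raise ValueError("%r is not in namespace %r" % (database, namespace))
    return database[len(prefix):]


# Sample text copied from the listing above (assumed format).
SAMPLE = """\
$ /core/lib/phabricator/bin/storage databases
secure_audit
secure_calendar
secure_chatlog
...<dozens more databases>...
"""

databases = parse_database_list(SAMPLE)
applications = [application_name(name, "secure") for name in databases]
print(applications)
```

The namespace is passed in explicitly because it is configurable, so the script does not have to guess where the prefix ends.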


Operating at Scale
==================

This storage design is aimed at large installs that may need more than one
physical database server to handle the load the install generates.

The primary reason we use a database per application is to allow large installs
to scale up by spreading database load across more hardware. A large
organization with many thousands of active users may find themselves limited by
the capacity of a single database backend.

If so, they can launch a second backend, move some applications over to it, and
continue piling on more users.

This can't continue forever, but provides a substantial amount of headroom for
large installs to spread the workload across more hardware and continue scaling
up.

To make this possible, we put each application in its own database and use
database boundaries to enforce the logical constraints that the application
must have in order for this to work. For example, we cannot perform joins
between separable tables, because they may not be on the same hardware.
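
To illustrate what giving up cross-database joins means in practice, here is a minimal sketch of combining rows in application code instead. It is not Phabricator's code: the two in-memory SQLite connections merely stand in for two separate database servers, and the `user` and `task` tables are hypothetical.

```python
import sqlite3

# Two independent connections stand in for two database servers, so a SQL
# JOIN between them is impossible by construction.
users_db = sqlite3.connect(":memory:")
tasks_db = sqlite3.connect(":memory:")

# Hypothetical schemas, for illustration only.
users_db.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany(
    "INSERT INTO user (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)

tasks_db.execute(
    "CREATE TABLE task (id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT)"
)
tasks_db.executemany(
    "INSERT INTO task (id, owner_id, title) VALUES (?, ?, ?)",
    [(10, 1, "Fix login"), (11, 2, "Write docs"), (12, 1, "Review diff")],
)


def tasks_with_owner_names():
    """Load tasks from one database, then look up owners in the other."""
    tasks = tasks_db.execute(
        "SELECT id, owner_id, title FROM task ORDER BY id"
    ).fetchall()
    owner_ids = sorted({owner_id for _, owner_id, _ in tasks})
    if not owner_ids:
        return []
    # Second query, against the other database, keyed by the collected IDs.
    placeholders = ", ".join("?" for _ in owner_ids)
    rows = users_db.execute(
        "SELECT id, name FROM user WHERE id IN (%s)" % placeholders,
        owner_ids,
    ).fetchall()
    names = dict(rows)
    return [(title, names.get(owner_id)) for _, owner_id, title in tasks]


print(tasks_with_owner_names())
```

The cost is a second round trip, but neither query cares which machine the other database lives on.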

Establishing boundaries with application databases is a simple, straightforward
way to partition storage and make administrative operations like spreading load
realistic.
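
One way to picture moving some applications to a second backend is a lookup from database name to host. The sketch below is a thought experiment, not how Phabricator is configured: the host names, the `PLACEMENT` table, and both helper functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical host names; nothing here is Phabricator configuration.
DEFAULT_HOST = "db001.example.internal"

# Hypothetical placement: databases listed here have moved to a second backend.
PLACEMENT = {
    "secure_differential": "db002.example.internal",
    "secure_feed": "db002.example.internal",
}


def host_for(database: str) -> str:
    """Return the host serving `database`, falling back to the default."""
    return PLACEMENT.get(database, DEFAULT_HOST)


def databases_by_host(databases: Iterable[str]) -> Dict[str, List[str]]:
    """Group database names by the host that serves them."""
    grouped = defaultdict(list)
    for name in databases:
        grouped[host_for(name)].append(name)
    return dict(grouped)


layout = databases_by_host(
    ["secure_audit", "secure_differential", "secure_feed"]
)
for host, names in sorted(layout.items()):
    print(host, names)
```

Because the unit of placement is a whole database, moving an application is a change to one entry in the table rather than a change to individual tables or queries.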


Ease of Development
===================

This design is also easier for us to work with, and easier for users who
want to work with the raw database data to understand and interact with.

We have a large number of tables (more than 400) and we cannot reasonably
reduce the number of tables very much (each table generally represents some
meaningful type of object in some application). It's easier to develop with
tables which are organized into separate application databases, just like it's
easier to work with a large project if you organize source files into
directories.

If you aren't developing Phabricator and never look at the data in the
database, you probably don't benefit from this organization. However, if you
are a developer or want to extend Phabricator or look under the hood, it's
easier to find what you're looking for and work with the tables and data when
they're organized by application.


Databases Have No Cost
======================

In almost all cases, creating databases has zero cost, just like organizing
source code into directories has zero cost.

Even if we didn't derive enormous benefits from this approach at scale, there
is little reason //not// to organize storage like this.

There are a handful of administrative tasks which are very slightly more
complex to perform on multiple databases, but these are all either automated
with `bin/storage` or easy to build on top of the list of databases emitted by
`bin/storage databases`.

For example, you can dump all the databases with `bin/storage dump`, and you
can destroy all the databases with `bin/storage destroy`.

As mentioned above, an exception to this is that if you're installing on a
shared host and need to jump through hoops to individually authorize access to
each database, databases do cost something.

However, this cost is an artificial cost imposed by the selected environment,
and this is only the first of many issues you'll run into trying to install and
run Phabricator on a shared host. These issues are why we strongly discourage
using shared hosts, and recommend against them in the install guide.
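
As an example of building on top of the database list, the sketch below turns a list of names into one MySQL-style `GRANT` statement per database. It is a rough illustration under stated assumptions: the account name, the host, and the choice of `ALL PRIVILEGES` are hypothetical, and `grant_statements` is not a Phabricator tool.

```python
from typing import Iterable, List


def grant_statements(databases: Iterable[str], user: str, host: str) -> List[str]:
    """Build one GRANT statement per database (MySQL-style syntax assumed)."""
    statements = []
    for name in databases:
        # Backtick-quote the identifier, doubling any embedded backticks.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(
            "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';" % (quoted, user, host)
        )
    return statements


# Hypothetical account and host, with names taken from the sample listing.
for statement in grant_statements(
    ["secure_audit", "secure_calendar"], "phabricator", "localhost"
):
    print(statement)
```

The function only produces text, so you can review the statements before running them through whatever interface your database host provides.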


Next Steps
==========

Continue by:

- learning more about databases in @{article:Database Schema}.