2012-04-18 00:06:40 +02:00
|
|
|
@title Database Schema
|
|
|
|
@group developer
|
|
|
|
|
|
|
|
This document describes key components of the database schema and should answer
|
|
|
|
questions like how to store new types of data.
|
|
|
|
|
|
|
|
= Database System =
|
|
|
|
|
|
|
|
Phabricator uses MySQL with InnoDB engine. The only exception is the
|
|
|
|
`search_documentfield` table which uses MyISAM because MySQL doesn't support
|
|
|
|
fulltext search in InnoDB.
|
|
|
|
|
|
|
|
Let us know if you need to use other database system: @{article:Give Feedback!
|
|
|
|
Get Support!}.
|
|
|
|
|
|
|
|
= PHP Drivers =
|
|
|
|
|
|
|
|
Phabricator supports [[ http://www.php.net/book.mysql | MySQL ]] and
|
|
|
|
[[ http://www.php.net/book.mysqli | MySQLi ]] PHP extensions. Most installations
|
|
|
|
use MySQL but MySQLi should work equally well.
|
|
|
|
|
|
|
|
= Databases =
|
|
|
|
|
|
|
|
Each Phabricator application has its own database. The names are prefixed by
|
|
|
|
`phabricator_`. This design has two advantages:
|
|
|
|
|
|
|
|
* Each database is easier to comprehend and to maintain.
|
|
|
|
* We don't do cross-database joins so each database can live on its own machine
|
|
|
|
which is useful for load-balancing.
|
|
|
|
|
|
|
|
= Connections =
|
|
|
|
|
|
|
|
Phabricator specifies if it will use any opened connection just for reading or
|
|
|
|
also for writing. This allows opening write connections to master and read
|
|
|
|
connections to slave in master/slave replication. It is useful for
|
|
|
|
load-balancing.
|
|
|
|
|
|
|
|
= Tables =
|
|
|
|
|
|
|
|
Each table name is prefixed by its application. For example, Differential
|
|
|
|
revisions are stored in database `phabricator_differential` and table
|
|
|
|
`differential_revision`. This duplicity allows easy recognition of the table in
|
|
|
|
DarkConsole (see @{article:Using DarkConsole}) and other places.
|
|
|
|
|
|
|
|
The exception is tables which share the same schema over different databases
|
|
|
|
such as `edge`.
|
|
|
|
|
2013-01-24 01:59:14 +01:00
|
|
|
We use lower-case table names with words separated by underscores. The reason is
|
|
|
|
that MySQL can be configured (with `lower_case_table_names`) to lower-case the
|
|
|
|
table names anyway.
|
2012-04-18 00:06:40 +02:00
|
|
|
|
|
|
|
= Column Names =
|
|
|
|
|
|
|
|
Phabricator uses camelCase names for columns. The main advantage is that they
|
|
|
|
directly map to properties in PHP classes.
|
|
|
|
|
|
|
|
Don't use MySQL reserved words (such as `order`) for column names.
|
|
|
|
|
|
|
|
= Data Types =
|
|
|
|
|
|
|
|
Phabricator uses `int unsigned` columns for storing dates instead of `date` or
|
|
|
|
`datetime`. We don't need to care about time-zones in both MySQL and PHP because
|
|
|
|
of it. The other reason is that PHP internally uses numbers for storing dates.
|
|
|
|
|
|
|
|
Phabricator uses UTF-8 encoding for storing all text data. We use
|
|
|
|
`utf8_general_ci` collation for free-text and `utf8_bin` for identifiers.
|
|
|
|
|
2012-04-20 00:13:52 +02:00
|
|
|
We don't use the `enum` data type because each change to the list of possible
|
|
|
|
values requires altering the table (which is slow with big tables). We use
|
|
|
|
numbers (or short strings in some cases) mapped to PHP constants instead.
|
|
|
|
|
2012-04-18 00:06:40 +02:00
|
|
|
= JSON =
|
|
|
|
|
|
|
|
Some data don't require structured access - you don't need to filter or order by
|
|
|
|
them. We store these data as text fields in JSON format. This approach has
|
|
|
|
several advantages:
|
|
|
|
|
|
|
|
* If we decide to add another unstructured field then we don't need to alter the
|
|
|
|
table (which is slow for big tables in MySQL).
|
|
|
|
* Table structure is not cluttered by fields which could be unused most of the
|
|
|
|
time.
|
|
|
|
|
|
|
|
An example of such usage can be found in column
|
|
|
|
`differential_diffproperty.data`.
|
|
|
|
|
|
|
|
= Primary Keys =
|
|
|
|
|
|
|
|
Most tables have auto-increment column named `id`. However creating such column
|
|
|
|
is not required for tables which are not usually directly referenced (such as
|
|
|
|
tables expressing M:N relations). Example of such table is
|
|
|
|
`differential_relationship`.
|
|
|
|
|
|
|
|
= Indexes =
|
|
|
|
|
|
|
|
Create all indexes necessary for fast query execution in most cases. Don't
|
|
|
|
create indexes which are not used. You can analyze queries @{article:Using
|
|
|
|
DarkConsole}.
|
|
|
|
|
2013-04-06 08:02:06 +02:00
|
|
|
Older MySQL versions are not able to use indexes for tuple search:
|
|
|
|
`(a, b) IN ((%s, %d), (%s, %d))`. Use `AND` and `OR` instead:
|
|
|
|
`((a = %s AND b = %d) OR (a = %s AND b = %d))`.
|
|
|
|
|
2012-04-18 00:06:40 +02:00
|
|
|
= Foreign Keys =
|
|
|
|
|
|
|
|
We don't use InnoDB's foreign keys because our application is so great that
|
|
|
|
no inconsistencies can arise. It will just slow us down.
|
|
|
|
|
|
|
|
= PHIDs =
|
|
|
|
|
|
|
|
Each globally referencable object in Phabricator has its associated PHID
|
|
|
|
(Phabricator ID) which serves as a global identifier. We use PHIDs for
|
2012-11-09 05:44:45 +01:00
|
|
|
referencing data in different databases.
|
2012-04-18 00:06:40 +02:00
|
|
|
|
|
|
|
We use both autoincrementing IDs and global PHIDs because each is useful in
|
|
|
|
different contexts. Autoincrementing IDs are chronologically ordered and allow
|
|
|
|
us to construct short, human-readable object names (like D2258) and URIs. Global
|
|
|
|
PHIDs allow us to represent relationships between different types of objects in
|
2014-07-12 16:45:33 +02:00
|
|
|
a homogeneous way.
|
2012-04-18 00:06:40 +02:00
|
|
|
|
2012-10-05 00:26:16 +02:00
|
|
|
For example, the concept of "subscribers" is more powerfully done with PHIDs
|
2014-07-10 00:12:48 +02:00
|
|
|
because we could theoretically have users, projects, teams, and more all as
|
|
|
|
"subscribers" of other objects. Using an ID column we would need to add a
|
2012-11-09 05:44:45 +01:00
|
|
|
"type" column to avoid ID collision; using PHIDs does not require this
|
2012-10-05 00:26:16 +02:00
|
|
|
additional column.
|
|
|
|
|
2012-05-12 00:05:15 +02:00
|
|
|
= Transactions =
|
|
|
|
|
|
|
|
Transactional code should be written using transactions. Example of such code is
|
|
|
|
inserting multiple records where one doesn't make sense without the other or
|
|
|
|
selecting data later used for update. See chapter in @{class:LiskDAO}.
|
|
|
|
|
2012-04-18 00:06:40 +02:00
|
|
|
= Advanced Features =
|
|
|
|
|
|
|
|
We don't use MySQL advanced features such as triggers, stored procedures or
|
|
|
|
events because we like expressing the application logic in PHP more than in SQL.
|
|
|
|
Some of these features (especially triggers) can also cause big confusion.
|
|
|
|
|
|
|
|
Avoiding these advanced features is also good for supporting other database
|
|
|
|
systems (which we don't support anyway).
|
|
|
|
|
|
|
|
= Schema Denormalization =
|
|
|
|
|
|
|
|
Phabricator uses schema denormalization for performance reasons sparingly. Try
|
|
|
|
to avoid it if possible.
|
|
|
|
|
2012-10-05 00:26:16 +02:00
|
|
|
= Changing the Schema =
|
|
|
|
|
|
|
|
There are three simple steps to update the schema:
|
|
|
|
|
2012-11-09 05:44:45 +01:00
|
|
|
# Create a `.sql` file in `resources/sql/patches/`. This file should:
|
2014-07-12 16:45:33 +02:00
|
|
|
- Contain the appropriate MySQL commands to update the schema.
|
2014-07-10 00:12:48 +02:00
|
|
|
- Be named as `YYYYMMDD.patchname.ext`. For example, `20130217.example.sql`.
|
|
|
|
- Use `${NAMESPACE}` rather than `phabricator` for database names.
|
|
|
|
- Use `COLLATE utf8_bin` for any columns that are to be used as identifiers,
|
|
|
|
such as PHID columns. Otherwise, use `COLLATE utf8_general_ci`.
|
|
|
|
- Name all indexes so it is possible to delete them later.
|
2012-11-09 05:44:45 +01:00
|
|
|
# Edit `src/infrastructure/storage/patch/PhabricatorBuiltinPatchList.php` and
|
2014-09-08 15:08:56 +02:00
|
|
|
add your patch to
|
|
|
|
@{method@phabricator:PhabricatorBuiltinPatchList::getPatches}.
|
2013-01-27 02:11:39 +01:00
|
|
|
# Run `bin/storage upgrade`.
|
2012-10-05 00:26:16 +02:00
|
|
|
|
2012-11-16 16:22:31 +01:00
|
|
|
It is also possible to create more complex patches in PHP for data migration
|
2013-01-24 01:59:14 +01:00
|
|
|
(due to schema changes or otherwise.) However, the schema changes themselves
|
2012-11-16 16:22:31 +01:00
|
|
|
should be done in separate `.sql` files. Order can be guaranteed by editing
|
|
|
|
`src/infrastructure/storage/patch/PhabricatorBuiltinPatchList.php`
|
|
|
|
appropriately.
|
2012-11-09 05:44:45 +01:00
|
|
|
|
|
|
|
See the
|
|
|
|
[[https://secure.phabricator.com/rPb39175342dc5bee0c2246b05fa277e76a7e96ed3
|
2012-10-05 00:26:16 +02:00
|
|
|
| commit adding policy storage for Paste ]] for a reasonable example of the code
|
|
|
|
changes.
|
|
|
|
|
2012-04-18 00:06:40 +02:00
|
|
|
= See Also =
|
|
|
|
|
|
|
|
* @{class:LiskDAO}
|
2012-10-05 00:26:16 +02:00
|
|
|
* @{class:PhabricatorPHID}
|