mirror of https://we.phorge.it/source/phorge.git synced 2024-11-15 03:12:41 +01:00
phorge-phorge/src/docs/userguide/diffusion.diviner

@title Diffusion User Guide
@group userguide
Guide to Diffusion, the Phabricator repository browser.
= Overview =
Diffusion is a repository browser which allows you to explore source code in a
Git or SVN repository, similar to software like Trac and GitWeb.
Diffusion provides a very high-performance SVN browser, a moderately
high-performance Git browser, and a relatively slow Mercurial browser. It achieves
performance by denormalizing large amounts of data about repository history into
a database and using this information like a cache so it can avoid querying the
repository directly. This data is generated by daemons which track repositories,
discover new commits, and parse and import them.
Diffusion is integrated with the other tools in the Phabricator suite. For
instance:
- when you commit Differential revisions to a tracked repository, they are
automatically updated and linked to the corresponding commits;
- you can add Herald rules to notify you about commits that match certain
rules;
- the Owners tool uses Diffusion to map repositories; and
- in all the tools, commit names are automatically linked.
= Repository Callsigns and Commit Names =
Each repository is identified by a "callsign", which is a short uppercase string
like "P" (for Phabricator) or "ARC" (for Arcanist).
Callsigns must be unique within an install but do not need to be globally
unique, so you are free to use single-letter callsigns for brevity. For
example, Facebook uses "E" for the
Engineering repository, "O" for the Ops repository, "Y" for a Yum package
repository, and so on, while Phabricator uses "P", "ARC", "PHU" for libphutil,
and "J" for Javelin. Keeping callsigns brief will make them easier to use, and
the use of one-character callsigns is recommended if they are reasonably
evocative and you have no more than 26 tracked repositories.
The primary goal of callsigns is to namespace commits to SVN repositories: if
you use multiple SVN repositories, each repository has a revision 1, revision 2,
etc., so referring to them by number alone is ambiguous. However, even for Git
they impart additional information to human readers and allow parsers to detect
that something is a commit name with high probability.
Diffusion uses this callsign and information about the commit itself to generate
a commit name, like "rE12345" or "rP28146171ce1278f2375e3646a1e1ea3fd56fc5a3".
The "r" stands for "revision". It is followed by the repository callsign, and
then a VCS-specific commit identifier (for SVN, the commit number; for Git, the
commit hash). When writing the name of a Git commit you may abbreviate the hash,
but note that hash collisions are probable for short prefix lengths. See this
post on the LKML for a historical explanation of Git's occasional internal use
of 7-character hashes:
https://lkml.org/lkml/2010/10/28/287
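The naming scheme described above can be sketched in a few lines of Python. This is only an illustration, not Diffusion's actual parser: the helper names (##parse_commit_name##, ##format_commit_name##) are hypothetical, and the pattern assumes that callsigns contain only uppercase letters and that identifiers are either decimal SVN revision numbers or lowercase hexadecimal Git hashes.

```python
import re
from typing import NamedTuple, Optional

# Assumption: "r", then an uppercase callsign, then a VCS-specific identifier
# (decimal digits for SVN, lowercase hex digits for Git).
COMMIT_NAME = re.compile(r"r(?P<callsign>[A-Z]+)(?P<identifier>[0-9a-f]+)")


class CommitName(NamedTuple):
    callsign: str
    identifier: str


def parse_commit_name(name: str) -> Optional[CommitName]:
    """Split a commit name into callsign and identifier, or return None."""
    match = COMMIT_NAME.fullmatch(name)
    if match is None:
        return None
    return CommitName(match.group("callsign"), match.group("identifier"))


def format_commit_name(callsign: str, identifier: str) -> str:
    """Build a commit name from a callsign and a commit identifier."""
    return "r" + callsign + identifier


print(parse_commit_name("rE12345"))
print(parse_commit_name("rP28146171ce1278f2375e3646a1e1ea3fd56fc5a3"))
```

Because the callsign is uppercase and the identifier is not, the boundary between the two is unambiguous under these assumptions, which is part of what lets parsers recognize commit names with high probability.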
Because 7-character hashes are likely to collide for even moderately large
repositories, Diffusion generally uses either a 16-character prefix (which makes
collisions very unlikely) or the full 40-character hash (which makes collisions
astronomically unlikely).
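To get a rough feel for these claims, the standard birthday approximation can be applied to hash prefixes. The sketch below assumes hashes are uniformly distributed and counts only commits (a real Git repository also hashes trees and blobs, which makes short-prefix collisions more likely still). The function name is hypothetical, and the 350,000-commit figure is just an example repository size.

```python
import math


def collision_probability(num_hashes: int, prefix_length: int) -> float:
    """Approximate chance that at least two hashes share a hex prefix."""
    space = 16 ** prefix_length  # number of distinct hex prefixes
    pairs = num_hashes * (num_hashes - 1) / 2
    # Birthday approximation: p ~= 1 - exp(-pairs / space).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-pairs / space)


for length in (7, 16, 40):
    print(length, collision_probability(350_000, length))
```

Under these assumptions, a 7-character prefix is all but certain to collide somewhere in a repository of that size, a 16-character prefix collides with probability on the order of one in a billion, and the full 40-character hash is many orders of magnitude safer again.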
= Adding Repositories =
Repository administration is accomplished through the "Repository" tool, which
is primarily a set of administrative interfaces for Diffusion. To add a
repository to Diffusion, you need to:
- create a new repository in the Repository tool; and
- start the daemons that will track and import the repository.
To create a new repository (or edit or delete an existing repository),
**you must be an administrator** (see
@{article:Configuring Accounts and Registration} for instructions on making an
existing account an administrator account). As an administrator, go to the
Repository tool and you'll have the options to create or edit repositories.
When you create a new repository, you need to specify a human-readable name,
a permanent "Callsign" (see previous section), and the underlying VCS type. Once
you have created a repository, you can go to the "Tracking" tab and set up
tracking in Diffusion.
Most of the options in the **Tracking** tab are either self-explanatory or
safe to leave at their defaults. In broad strokes, Diffusion tracks SVN
repositories by issuing an "svn log" command periodically against the remote to
look for new commits. It tracks Git repositories by cloning a local copy and
issuing "git fetch" periodically.
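As a loose illustration of the two polling strategies, the sketch below builds the command lines a tracker might run on each pass. It is not Diffusion's implementation: the function names, the example URI and path, and the exact flags chosen (##--revision N:HEAD## for ##svn log##, ##-C## and ##--all## for ##git fetch##) are assumptions made for the example.

```python
from typing import List


def svn_poll_command(remote_uri: str, last_seen_revision: int) -> List[str]:
    """Ask the remote SVN server for revisions newer than the last one seen."""
    revision_range = f"{last_seen_revision + 1}:HEAD"
    return ["svn", "log", "--revision", revision_range, remote_uri]


def git_poll_command(local_clone_path: str) -> List[str]:
    """Update the local clone; new commits are then discovered locally."""
    return ["git", "-C", local_clone_path, "fetch", "--all"]


# Hypothetical repository locations, for illustration only.
print(svn_poll_command("https://svn.example.com/repo", 12345))
print(git_poll_command("/var/repo/P"))
```

The difference this highlights is that SVN polling talks to the remote on every pass, while Git polling only refreshes a local copy that can then be read without touching the remote.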
Once you've configured everything (and made sure **Tracking** is set to
"Enabled"), you can launch the daemons to begin actually tracking the
repository.
= Running Diffusion Daemons =
For an introduction to Phabricator daemons, see
@{article:Managing Daemons with phd}. To actually track repositories, you need
to:
- run ##phd repository-launch-master## on one machine;
- run at least one @{class:PhabricatorTaskmasterDaemon} with
##phd launch taskmaster##. Taskmasters are generic workers which run many
different kinds of background tasks, so if you already have some running
you don't need to launch more. However, if you are importing a very large
repository, the import rate is primarily a function of how many taskmasters
are running, so you may want to launch several; and
- if you have multiple web frontends and have tracked Git repositories, run
##phd repository-launch-readonly## on each web frontend.
You can use the Daemon Console to monitor the daemons and their progress
importing the repository. Small repositories should import quickly, while
larger repositories may take some time (it takes about 10 minutes to begin
discovering commits in Facebook's 350,000-commit primary repository, and about
18 hours to import it all with 64 taskmasters on modern hardware). Commits
should begin appearing in Diffusion within a few minutes for all but the
largest repositories.
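The figures above give a crude way to estimate import time for other repositories. The sketch below assumes that import time scales linearly with the number of commits and inversely with the number of taskmasters, which is a simplification: it ignores the initial discovery delay, hardware differences, and commit size. The function and constant names are hypothetical.

```python
# Reference data point from this guide: 350,000 commits imported in about
# 18 hours with 64 taskmasters.
REFERENCE_COMMITS = 350_000
REFERENCE_HOURS = 18
REFERENCE_TASKMASTERS = 64

# Roughly 304 commits per taskmaster per hour (about 5 per minute).
COMMITS_PER_TASKMASTER_HOUR = REFERENCE_COMMITS / (
    REFERENCE_HOURS * REFERENCE_TASKMASTERS
)


def estimate_import_hours(commits: int, taskmasters: int) -> float:
    """Estimate import time, assuming throughput scales with taskmasters."""
    if taskmasters < 1:
        raise ValueError("at least one taskmaster is required")
    return commits / (COMMITS_PER_TASKMASTER_HOUR * taskmasters)


print(estimate_import_hours(50_000, 8))
```

Treat the result as an order-of-magnitude guide rather than a prediction; the Daemon Console remains the place to check actual progress.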
In detail, Diffusion uses several daemons to track, parse and import
repositories:
- **PhabricatorRepositoryGitFetchDaemon**: periodically runs "git fetch" to
keep Git repositories up to date
- **PhabricatorRepositoryGitCommitDiscoveryDaemon**: periodically looks for
new commits and imports them
- **PhabricatorRepositorySvnCommitDiscoveryDaemon**: periodically runs
"svn log" to look for new commits and import them
- **PhabricatorRepositoryCommitTaskDaemon**: creates tasks to parse and
import newly discovered commits
The ##repository-launch-master## command just chooses the right daemons to
launch based on which repositories you've configured to be tracked. If you add
new repositories in the future, you should stop all the daemons and rerun
##repository-launch-master##.
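One plausible reading of "chooses the right daemons" is a simple mapping from the VCS types of your tracked repositories to the daemon classes listed above. The actual selection logic is not shown in this guide, so the sketch below is only inferred from the daemon descriptions: the mapping, the function name, and the assumption that the commit task daemon runs whenever anything is tracked are all hypothetical.

```python
from typing import Iterable, List

# Assumed mapping, inferred from the daemon descriptions in this guide.
DAEMONS_BY_VCS = {
    "git": [
        "PhabricatorRepositoryGitFetchDaemon",
        "PhabricatorRepositoryGitCommitDiscoveryDaemon",
    ],
    "svn": ["PhabricatorRepositorySvnCommitDiscoveryDaemon"],
}
COMMIT_TASK_DAEMON = "PhabricatorRepositoryCommitTaskDaemon"


def daemons_to_launch(tracked_vcs_types: Iterable[str]) -> List[str]:
    """Pick daemon classes for the VCS types of the tracked repositories."""
    tracked = set(tracked_vcs_types)
    daemons: List[str] = []
    for vcs in ("git", "svn"):
        if vcs in tracked:
            daemons.extend(DAEMONS_BY_VCS[vcs])
    # Assumption: discovered commits always need the task daemon.
    if daemons:
        daemons.append(COMMIT_TASK_DAEMON)
    return daemons


print(daemons_to_launch(["git", "svn", "git"]))
```

Seen this way, it is clear why adding a repository of a new VCS type requires a restart: the set of daemons is decided once, at launch time.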
If you run Phabricator with multiple web frontends, have your deployment script
run ##phd stop## and then ##phd repository-launch-readonly## on each frontend
when it deploys. This is very unlikely to affect you unless you run one of the
largest installs in the world.
= Building New Parsers =
You can add new classes which will extend or enhance Diffusion's ability to
parse commit messages.
TODO: This is an advanced feature which doesn't currently have documentation and
isn't terribly stable.