mirror of https://we.phorge.it/source/phorge.git synced 2024-11-15 03:12:41 +01:00
phorge-phorge/src/docs/userguide/diffusion.diviner


@title Diffusion User Guide
@group userguide
Guide to Diffusion, the Phabricator repository browser.
= Overview =
Diffusion is a repository browser that allows you to explore source code in a
Git, Mercurial, or SVN repository, similar to software like Trac and GitWeb.
Diffusion provides a very high-performance SVN browser, a moderately
high-performance Git browser, and a relatively slow Mercurial browser. It achieves
performance by denormalizing large amounts of data about repository history into
a database and using this information like a cache so it can avoid querying the
repository directly. This data is generated by daemons which track repositories,
discover new commits, and parse and import them.
Diffusion is integrated with the other tools in the Phabricator suite. For
instance:
- when you commit Differential revisions to a tracked repository, they are
automatically updated and linked to the corresponding commits;
- you can add Herald rules to notify you about commits that match certain
rules;
- the Owners tool uses Diffusion to map repositories; and
- in all the tools, commit names are automatically linked.
= Repository Callsigns and Commit Names =
Each repository is identified by a "callsign", which is a short uppercase string
like "P" (for Phabricator) or "ARC" (for Arcanist).
Callsigns must be unique within an install but do not need to be globally
unique, so you are free to use single-letter callsigns for brevity. For example,
Facebook uses "E" for the Engineering repository, "O" for the Ops repository,
"Y" for a Yum package repository, and so on, while Phabricator uses "P", "ARC",
"PHU" for libphutil, and "J" for Javelin. Keeping callsigns brief makes them
easier to use. One-character callsigns are recommended if they are reasonably
evocative and you have no more than 26 tracked repositories.
The primary goal of callsigns is to namespace commits in SVN repositories: if
you use multiple SVN repositories, each repository has a revision 1, revision 2,
etc., so referring to them by number alone is ambiguous. However, even for Git
they impart additional information to human readers, let parsers detect with
high probability that something is a commit name, and make it possible to
distinguish between multiple copies of a repository.
Diffusion uses this callsign and information about the commit itself to generate
a commit name, like "rE12345" or "rP28146171ce1278f2375e3646a1e1ea3fd56fc5a3".
The "r" stands for "revision". It is followed by the repository callsign, and
then a VCS-specific commit identifier (for SVN, the commit number; for Git and
Mercurial, the commit hash). When writing the name of a Git commit you may
abbreviate the hash, but note that hash collisions are probable for short prefix
lengths. See this post on the LKML for a historical explanation of Git's
occasional internal use of 7-character hashes:
https://lkml.org/lkml/2010/10/28/287
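The naming scheme described above can be sketched as a small parser. This is only an illustration, not Diffusion's own implementation: the regular expression, the `CommitName` type, and `parse_commit_name` are hypothetical names, and the grammar assumes uppercase callsigns followed by either an SVN revision number or a lowercase hexadecimal hash prefix.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar (not taken from Diffusion's source): "r", an uppercase
# callsign, then an SVN revision number or a lowercase hex hash/prefix.
COMMIT_NAME = re.compile(r"r(?P<callsign>[A-Z]+)(?P<identifier>[0-9a-f]+)")


class CommitName(NamedTuple):
    callsign: str
    identifier: str


def parse_commit_name(name: str) -> Optional[CommitName]:
    """Split a commit name into callsign and identifier, or return None."""
    match = COMMIT_NAME.fullmatch(name)
    if match is None:
        return None
    return CommitName(match.group("callsign"), match.group("identifier"))
```

Note that the name alone does not reveal the VCS type: an identifier such as `12345` could be an SVN revision or a short hash prefix, so a real parser would need to look up the repository by callsign.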
Because 7-character hashes are likely to collide for even moderately large
repositories, Diffusion generally uses either a 16-character prefix (which makes
collisions very unlikely) or the full 40-character hash (which makes collisions
astronomically unlikely).
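To get a feel for these prefix lengths, the chance of at least one collision can be estimated with the standard birthday approximation. This is a rough sketch that assumes hashes are uniformly distributed; `collision_probability` is a hypothetical helper, not part of Diffusion.

```python
import math


def collision_probability(num_commits: int, prefix_length: int) -> float:
    """Approximate chance that at least two commits share a hex prefix.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2N)), where N is
    the number of distinct prefixes, and assumes uniformly distributed hashes.
    """
    space = 16 ** prefix_length               # distinct hex prefixes
    pairs = num_commits * (num_commits - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / space)
```

Under these assumptions, `collision_probability(20_000, 7)` comes out at roughly one-half, while a 16-character prefix keeps the estimate below one in a hundred million even for a 350,000-commit repository.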
= Adding Repositories =
Repository administration is accomplished through the "Repository" tool, which
is primarily a set of administrative interfaces for Diffusion. To add a
repository to Diffusion, you need to:
- create a new repository in the Repository tool; and
- start the daemons that will track and import the repository.
To create a new repository (or edit or delete an existing repository),
**you must be an administrator** (see
@{article:Configuring Accounts and Registration} for instructions on making an
existing account an administrator account). As an administrator, go to the
Repository tool and you'll have the options to create or edit repositories.
When you create a new repository, you need to specify a human-readable name,
a permanent "Callsign" (see previous section), and the underlying VCS type. Once
you have created a repository, you can go to the "Tracking" tab and set up
tracking in Diffusion.
Most of the options in the **Tracking** tab are either self-explanatory or
safe to leave at their defaults. In broad strokes, Diffusion tracks SVN
repositories by periodically issuing `svn log` against the remote to look for
new commits. It tracks Git and Mercurial repositories by cloning a local copy
and periodically issuing `git fetch` or `hg pull`.
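The per-VCS behavior just described can be summarized as a lookup table. This is a hypothetical sketch for illustration only: `TrackingStrategy` and `TRACKING` are invented names, and the commands are shown without the arguments a real daemon would pass.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TrackingStrategy:
    command: Tuple[str, ...]   # command issued periodically to find commits
    needs_local_clone: bool    # whether a local working copy is kept


# Illustrative summary of the text above, not Diffusion's configuration.
TRACKING: Dict[str, TrackingStrategy] = {
    "svn": TrackingStrategy(("svn", "log"), needs_local_clone=False),
    "git": TrackingStrategy(("git", "fetch"), needs_local_clone=True),
    "hg": TrackingStrategy(("hg", "pull"), needs_local_clone=True),
}
```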
Once you've configured everything (and made sure **Tracking** is set to
"Enabled"), you can launch the daemons to begin actually tracking the
repository.
= Running Diffusion Daemons =
In most cases, it is sufficient to run:
  phabricator/bin/ $ ./phd start
...to start the daemons. For a more in-depth explanation of `phd` and daemons,
see @{article:Managing Daemons with phd}.
NOTE: If you have an unusually large install with multiple web frontends, see
notes in @{article:Managing Daemons with phd}.
You can use the Daemon Console to monitor the daemons and their progress
importing the repository. Small repositories should import quickly, while
larger repositories may take some time (it takes about 10 minutes to begin
discovering commits in Facebook's 350,000-commit primary repository, and about
18 hours to import it all with 64 taskmasters on modern hardware). Commits
should begin appearing in Diffusion within a few minutes for all but the
largest repositories.
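The figures quoted above imply an approximate import rate, which can help when estimating how long another repository might take. The sketch below only restates that arithmetic; it assumes work is spread evenly across taskmasters, which real imports will not match exactly, and `import_rate` is a hypothetical helper.

```python
from typing import Tuple


def import_rate(commits: int, hours: float, taskmasters: int) -> Tuple[float, float]:
    """Return (overall commits per second, commits per second per taskmaster)."""
    seconds = hours * 3600
    overall = commits / seconds
    return overall, overall / taskmasters


# Figures from the text: 350,000 commits, about 18 hours, 64 taskmasters.
overall, per_taskmaster = import_rate(350_000, 18, 64)
seconds_per_commit = 1 / per_taskmaster
```

With those inputs the overall rate is a little over five commits per second, or roughly twelve seconds of taskmaster time per commit.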
== Tuning Daemons ==
By default, Phabricator launches one daemon to pull and discover all of the
tracked repositories. This works well for a small number of repositories or
a large number of relatively inactive repositories, but some installs might
benefit from tuning. The daemon makes a rough effort to respect the pull
frequencies defined in repository configuration, but it may not be able to
import new commits very quickly if you have a large number of repositories (as
it is blocked waiting on I/O from other repositories). If you want to provide
lower commit import latency for some repositories, you can launch additional
dedicated daemons.
For example, if you want low latency on the repositories with callsigns
`A` and `B`, but don't care about latency for the other repositories, you could
launch two daemons like this:
  phabricator/bin/ $ ./phd launch RepositoryPullLocal -- A B
  phabricator/bin/ $ ./phd launch RepositoryPullLocal -- --not A --not B
The first one will work only on `A` and `B`, and should be able to import
commits with low latency more reliably. The second one will work on all other
repositories.
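The way the two daemons split the work can be modeled in a few lines. This is a hypothetical sketch of the selection logic implied by the arguments above, not the daemon's actual code; `repositories_for_daemon` and the example callsigns `C` and `D` are invented for illustration.

```python
from typing import Iterable, List


def repositories_for_daemon(
    all_callsigns: Iterable[str],
    only: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> List[str]:
    """Select the callsigns one pull daemon would work on.

    `only` models positional callsigns (e.g. "A B"); `exclude` models
    repeated "--not" arguments. With neither, every repository is selected.
    """
    only_set = set(only)
    exclude_set = set(exclude)
    return [
        callsign
        for callsign in all_callsigns
        if (not only_set or callsign in only_set)
        and callsign not in exclude_set
    ]


tracked = ["A", "B", "C", "D"]
dedicated = repositories_for_daemon(tracked, only=["A", "B"])
everything_else = repositories_for_daemon(tracked, exclude=["A", "B"])
```

Together the two selections cover every tracked repository exactly once, which is the property to preserve if you split repositories across more daemons.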
= Next Steps =
- Learn about creating a symbol index at
@{article:Diffusion User Guide: Symbol Indexes}; or
- understand daemons in detail with @{article:Managing Daemons with phd}; or
- give us feedback at @{article:Give Feedback! Get Support!}.