1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2025-02-03 10:28:23 +01:00
phorge-phorge/src/docs/contributor/internationalization.diviner
epriestley f46cf99274 Fix a typo in "Internationalization" documentation
Summary: Caught this while linking to it from D16405.

Test Plan: Consulted a dictionary.

Reviewers: chad, alexmv

Reviewed By: alexmv

Differential Revision: https://secure.phabricator.com/D16406
2016-08-16 17:48:00 -07:00

381 lines
13 KiB
Text

@title Internationalization
@group developer
Describes Phabricator translation and localization.
Overview
========
Phabricator partially supports internationalization, but many of the tools
are missing or in a prototype state.
This document describes what tools exist today, how to add new translations,
and how to use the translation tools to make a codebase translatable.
Adding a New Locale
===================
To add a new locale, subclass @{class:PhutilLocale}. This allows you to
introduce a new locale, like "German" or "Klingon".
Once you've created a locale, applications can add translations for that
locale.
For instructions on adding new classes, see
@{article@phabcontrib:Adding New Classes}.
Adding Translations to Locale
=============================
To translate strings, subclass @{class:PhutilTranslation}. Translations need
to belong to a locale: the locale defines an available language, and each
translation subclass provides strings for it.
Translations are separated from locales so that third-party applications can
provide translations into different locales without needing to define those
locales themselves.
For instructions on adding new classes, see
@{article@phabcontrib:Adding New Classes}.
Writing Translatable Code
=========================
Strings are marked for translation with @{function@libphutil:pht}.
The `pht()` function takes a string (and possibly some parameters) and returns
the translated version of that string in the current viewer's locale, if a
translation is available.
If text strings will ultimately be read by humans, they should essentially
always be wrapped in `pht()`. For example:
```lang=php
$dialog->appendParagraph(pht('This is an example.'));
```
This allows the code to return the correct Spanish or German or Russian
version of the text, if the viewer is using Phabricator in one of those
languages and a translation is available.
Using `pht()` properly so that strings are translatable can be tricky. Briefly,
the major rules are:
- Only pass static strings as the first parameter to `pht()`.
- Use parameters to create strings containing user names, object names, etc.
- Translate full sentences, not sentence fragments.
- Let the translation framework handle plural rules.
- Use @{class@libphutil:PhutilNumber} for numbers.
- Let the translation framework handle subject gender rules.
- Translate all human-readable text, even exceptions and error messages.
See the next few sections for details on these rules.
Use Static Strings
==================
The first parameter to `pht()` must always be a static string. Broadly, this
means it should not contain variables or function or method calls (it's OK to
split it across multiple lines and concatenate the parts together).
These are good:
```lang=php
pht('The night is dark.');
pht(
'Two roads diverged in a yellow wood, '.
'and sorry I could not travel both '.
'and be one traveler, long I stood.');
```
These won't work (they might appear to work, but are wrong):
```lang=php, counterexample
pht(some_function());
pht('The duck says, '.$quack);
pht($string);
```
The first argument must be a static string so it can be extracted by static
analysis tools and dumped in a big file for translators. If it contains
functions or variables, it can't be extracted, so translators won't be able to
translate it.
Lint will warn you about problems with use of static strings in calls to
`pht()`.
Parameters
==========
You can provide parameters to a translation string by using `sprintf()`-style
patterns in the input string. For example:
```lang=php
pht('%s earned an award.', $actor);
pht('%s closed %s.', $actor, $task);
```
This is primarily appropriate for usernames, object names, counts, and
untranslatable strings like URIs or instructions to run commands from the CLI.
Parameters normally should not be used to combine two pieces of translated
text: see the next section for guidance.
Sentence Fragments
==================
You should almost always pass the largest block of text to `pht()` that you
can. Particularly, it's important to pass complete sentences, not try to build
a translation by stringing together sentence fragments.
There are several reasons for this:
- It gives translators more context, so they can be more confident they are
producing a satisfying, natural-sounding translation which will make sense
and sound good to native speakers.
- In some languages, one fragment may need to translate differently depending
on what the other fragment says.
- In some languages, the most natural-sounding translation may change the
order of words in the sentence.
For example, suppose we want to translate these sentence to give the user some
instructions about how to use an interface:
> Turn the switch to the right.
> Turn the switch to the left.
> Turn the dial to the right.
> Turn the dial to the left.
Maybe we have a function like this:
```
function get_string($is_switch, $is_right) {
// ...
}
```
One way to write the function body would be like this:
```lang=php, counterexample
$what = $is_switch ? pht('switch') : pht('dial');
$dir = $is_right ? pht('right') : pht('left');
return pht('Turn the ').$what.pht(' to the ').$dir.pht('.');
```
This will work fine in English, but won't work well in other languages.
One problem with doing this is handling gendered nouns. Languages like Spanish
have gendered nouns, where some nouns are "masculine" and others are
"feminine". The gender of a noun affects which article (in English, the word
"the" is an article) should be used with it.
In English, we say "**the** knob" and "**the** switch", but a Spanish speaker
would say "**la** perilla" and "**el** interruptor", because the noun for
"knob" in Spanish is feminine (so it is used with the article "la") while the
noun for "switch" is masculine (so it is used with the article "el").
A Spanish speaker can not translate the string "Turn the" correctly without
knowing which gender the noun has. Spanish has //two// translations for this
string ("Gira el", "Gira la"), and the form depends on which noun is being
used.
Another problem is that this reduces flexibility. Translating fragments like
this locks translators into a specific word order, when rearranging the words
might make the sentence sound much more natural to a native speaker.
For example, if the string read "The knob, to the right, turn it.", it
would technically be English and most English readers would understand the
meaning, but no native English speaker would speak or write like this.
However, some languages have different subject-verb order rules or
colloquisalisms, and a word order which transliterates like this may sound more
natural to a native speaker. By translating fragments instead of complete
sentences, you lock translators into English word order.
Finally, the last fragment is just a period. If a translator is presented with
this string in an interface without much context, they have no hope of guessing
how it is used in the software (it could be an end-of-sentence marker, or a
decimal point, or a date separator, or a currency separator, all of which have
very different translations in many locales). It will also conflict with all
other translations of the same string in the codebase, so even if they are
given context they can't translate it without technical problems.
To avoid these issues, provide complete sentences for translation. This almost
always takes the form of writing out alternatives in full. This is a good way
to implement the example function:
```lang=php
if ($is_switch) {
if ($is_right) {
return pht('Turn the switch to the right.');
} else {
return pht('Turn the switch to the left.');
}
} else {
if ($is_right) {
return pht('Turn the dial to the right.');
} else {
return pht('Turn the dial to the left.');
}
}
```
Although this is more verbose, translators can now get genders correct,
rearrange word order, and have far more context when translating. This enables
better, natural-sounding translations which are more satisfying to native
speakers.
Singular and Plural
===================
Different languages have various rules for plural nouns.
In English there are usually two plural noun forms: for one thing, and any
other number of things. For example, we say that one chair is a "chair" and any
other number of chairs are "chairs": "0 chairs", "1 chair", "2 chairs", etc.
In other languages, there are different (and, in some cases, more) plural
forms. For example, in Czech, there are separate forms for "one", "several",
and "many".
Because plural noun rules depend on the language, you should not write code
which hard-codes English rules. For example, this won't translate well:
```lang=php, counterexample
if ($count == 1) {
return pht('This will take an hour.');
} else {
return pht('This will take hours.');
}
```
This code is hard-coding the English rule for plural nouns. In languages like
Czech, the correct word for "hours" may be different if the count is 2 or 15,
but a translator won't be able to provide the correct translation if the string
is written like this.
Instead, pass a generic string to the translation engine which //includes// the
number of objects, and let it handle plural nouns. This is the correct way to
write the translation:
```lang=php
return pht('This will take %s hour(s).', new PhutilNumber($count));
```
If you now load the web UI, you'll see "hour(s)" literally in the UI. To fix
this so the translation sounds better in English, provide translations for this
string in the @{class@phabricator:PhabricatorUSEnglishTranslation} file:
```lang=php
'This will take %s hour(s).' => array(
'This will take an hour.',
'This will take hours.',
),
```
The string will then sound natural in English, but non-English translators will
also be able to produce a natural translation.
Note that the translations don't actually include the number in this case. The
number is being passed from the code, but that just lets the translation engine
get the rules right: the number does not need to appear in the final
translations shown to the user.
Using PhutilNumber
==================
When translating numbers, you should almost always use `%s` and wrap the count
or number in `new PhutilNumber($count)`. For example:
```lang=php
pht('You have %s experience point(s).', new PhutilNumber($xp));
```
This will let the translation engine handle plural noun rules correctly, and
also format large numbers correctly in a locale-aware way with proper unit and
decimal separators (for example, `1000000` may be printed as "1,000,000",
with commas for readability).
The exception to this rule is IDs which should not be written with unit
separators. For example, this is correct for an object ID:
```lang=php
pht('This diff has ID %d.', $diff->getID());
```
Male and Female
===============
Different languages also use different words for talking about subjects who are
male, female or have an unknown gender. In English this is mostly just
pronouns (like "he" and "she") but there are more complex rules in other
languages, and languages like Czech also require verb agreement.
When a parameter refers to a gendered person, pass an object which implements
@{interface@libphutil:PhutilPerson} to `pht()` so translators can provide
gendered translation variants.
```lang=php
pht('%s wrote', $actor);
```
Translators will create these translations:
```lang=php
// English translation
'%s wrote';
// Czech translation
array('%s napsal', '%s napsala');
```
(You usually don't need to worry very much about this rule, it is difficult to
get wrong in standard code.)
Exceptions and Errors
=====================
You should translate all human-readable text, even exceptions and error
messages. This is primarily a rule of convenience which is straightforward
and easy to follow, not a technical rule.
Some exceptions and error messages don't //technically// need to be translated,
as they will never be shown to a user, but many exceptions and error messages
are (or will become) user-facing on some way. When writing a message, there is
often no clear and objective way to determine which type of message you are
writing. Rather than try to distinguish which are which, we simply translate
all human-readable text. This rule is unambiguous and easy to follow.
In cases where similar error or exception text is often repeated, it is
probably appropriate to define an exception for that category of error rather
than write the text out repeatedly, anyway. Two examples are
@{class@libphutil:PhutilInvalidStateException} and
@{class@libphutil:PhutilMethodNotImplementedException}, which mostly exist to
produce a consistent message about a common error state in a convenient way.
There are a handful of error strings in the codebase which may be used before
the translation framework is loaded, or may be used during handling other
errors, possibly raised from within the translation framework. This handful
of special cases are left untranslated to prevent fatals and cycles in the
error handler.
Next Steps
==========
Continue by:
- adding a new locale or translation file with
@{article@phabcontrib:Adding New Classes}.