mirror of https://we.phorge.it/source/arcanist.git synced 2024-11-25 00:02:40 +01:00

Force all mercurial commands to use UTF-8 encoding

Summary:
When non-ASCII characters appear in revision titles/summaries, the `patch` and `diff` (to update) commands fail on Windows systems. This often happens because “smart quotes” or "em—dash" characters are inserted into commit messages by editors on "user-friendly" operating systems like macOS.

This can be worked around by forcing all Mercurial commands to use the global option `--encoding utf-8`, which applies to any Mercurial command. This option was [[ https://www.mercurial-scm.org/repo/hg/rev/a88e02081a88 | added in ~2006 ]], so this should work across all supported versions of Mercurial.
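
As a rough sketch of the idea (not the Arcanist implementation, which is PHP), the prefixing could look like this in a shell wrapper. The function name `hg_utf8_command` is hypothetical; only the `--encoding utf-8` global option comes from this change.

```shell
#!/bin/sh
# Hypothetical helper (not part of Arcanist): builds the command line that
# would be run, with the global --encoding option placed before the subcommand.
hg_utf8_command() {
  # "$*" joins all arguments with single spaces; no shell quoting is attempted.
  printf 'hg --encoding utf-8 %s\n' "$*"
}

# Print the command line that corresponds to `hg log -r tip`.
hg_utf8_command log -r tip
```

The sketch only prints the command string instead of running it, so it does not need Mercurial installed.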

Refs T13649

Test Plan:
I created a diff on a Mercurial repository using smart quotes in the "Title" and "Summary" fields as well as in the content of a file being changed. Then on macOS, Windows (PowerShell), and Windows (cmd.exe) I was able to `patch` down the revision, make a modification, and `diff` the change back up to Phabricator, as well as `land` the change. I verified that the commit and content looked correct on macOS as well as on Windows by using `nvim`, which seems to properly detect and render the encoding, whereas Mercurial itself displays the smart quotes and em-dashes as odd characters.

I grepped through the Arcanist codebase for other places where `--encoding` might be specified for Mercurial commands and could not find any. In case this argument is somehow added elsewhere, I verified that specifying `--encoding` more than once does not cause any issues and that the later specification of `--encoding` appears to "win".

```lang=console
$ hg --encoding utf-8 --encoding utf-8 log -r tip
# prints out results in UTF-8 without issue

$ hg --encoding utf-8 log --encoding latin-1 -r tip
# prints out results in latin-1 without issue
```

Reviewers: epriestley, #blessed_reviewers

Reviewed By: epriestley, #blessed_reviewers

Subscribers: Korvin

Maniphest Tasks: T13649

Differential Revision: https://secure.phabricator.com/D21676
Christopher Speck 2021-06-27 20:28:45 -04:00
parent 246e604a07
commit c94c5bbf35

@@ -15,7 +15,10 @@ final class ArcanistMercurialAPI extends ArcanistRepositoryAPI {
   protected function buildLocalFuture(array $argv) {
     $env = $this->getMercurialEnvironmentVariables();
 
-    $argv[0] = 'hg '.$argv[0];
+    // Mercurial deceptively indicates that the default encoding is UTF-8
+    // however the actual default appears to be "something else", at least on
+    // Windows systems. Force all mercurial commands to use UTF-8 encoding.
+    $argv[0] = 'hg --encoding utf-8 '.$argv[0];
 
     $future = newv('ExecFuture', $argv)
       ->setEnv($env)