mirror of https://we.phorge.it/source/arcanist.git synced 2024-11-10 00:42:40 +01:00

[Wilds] Sanitize UTF8 output in tsprintf(...) under Windows

Summary:
Ref T13209. In PHP, when you `echo` or `print` certain invalid sequences to the `cmd.exe` terminal under Windows 10, the entire string just vanishes into the ether.

I ran into this because `arc unit` was reporting "1 failing test" but not actually printing a test failure. That's because the failing test was the surrogate filtering test, and the test failure contained a reserved UTF16 surrogate sequence ("Expected: <filtered result>; Actual: <unfiltered result>"). See D19724.

To limit the damage this can cause, explicitly `phutil_utf8ize(...)` the output under Windows. When we don't //need// to sanitize, I think it's slightly better not to, since the raw input is occasionally useful for debugging or understanding something. That's why I'm not doing it unconditionally.

Test Plan:
  - Wrote a script which did `echo tsprintf("%s", "<invalid surrogate sequence>");`.
  - On Windows 10 in `cmd.exe`, saw it print something instead of printing nothing.
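
For readers without a libphutil checkout, the test plan above can be approximated with a standalone sketch. It is not the script that was actually run. `sketch_utf8ize()` and `sketch_is_utf8()` are hypothetical stand-ins (the real `phutil_utf8ize()` may replace invalid bytes differently), the `DIRECTORY_SEPARATOR` check is only an assumed substitute for `phutil_is_windows()`, and the byte string is one example of an invalid surrogate sequence rather than the one from the failing test.

```php
<?php

// Hypothetical stand-in for phutil_utf8ize(): keeps well-formed UTF-8
// sequences and replaces every other byte with U+FFFD. The real libphutil
// function may behave differently.
function sketch_utf8ize($string) {
  // Well-formed UTF-8 byte sequences, matched in byte mode (no "u" flag).
  // The "\xED[\x80-\x9F]" branch is what excludes surrogates (U+D800-U+DFFF).
  $valid =
    '[\x00-\x7F]'.
    '|[\xC2-\xDF][\x80-\xBF]'.
    '|\xE0[\xA0-\xBF][\x80-\xBF]'.
    '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'.
    '|\xED[\x80-\x9F][\x80-\xBF]'.
    '|\xF0[\x90-\xBF][\x80-\xBF]{2}'.
    '|[\xF1-\xF3][\x80-\xBF]{3}'.
    '|\xF4[\x80-\x8F][\x80-\xBF]{2}';

  return preg_replace_callback(
    '/('.$valid.')|./s',
    function (array $m) {
      // Group 1 is only present when a well-formed sequence matched.
      return (isset($m[1]) && $m[1] !== '') ? $m[1] : "\xEF\xBF\xBD";
    },
    $string);
}

// Hypothetical helper: PCRE validates the subject as UTF-8 under the "u"
// flag, so preg_match() returns 1 only for well-formed input.
function sketch_is_utf8($string) {
  return preg_match('//u', $string) === 1;
}

// A UTF-8-style encoding of the lone surrogate U+D800. This is not valid
// UTF-8 and stands in for "<invalid surrogate sequence>".
$bad = "Actual: \xED\xA0\x80";

// Assumed stand-in for phutil_is_windows(): sanitize only on Windows, and
// leave the raw bytes alone elsewhere.
$is_windows = (DIRECTORY_SEPARATOR === '\\');
$out = $is_windows ? sketch_utf8ize($bad) : $bad;

echo $out, "\n";
```

On non-Windows systems this prints the raw bytes unchanged, which mirrors the decision not to sanitize unconditionally.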

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13209

Differential Revision: https://secure.phabricator.com/D19725
epriestley 2018-10-02 10:43:48 -07:00
parent b192185045
commit 9ac0b69798

@@ -70,6 +70,13 @@ final class PhutilTerminalString extends Phobject {
       $value = preg_replace('/\r(?!\n)/', '<CR>', $value);
     }

+    // See T13209. If we print certain invalid unicode byte sequences to the
+    // terminal under "cmd.exe", the entire string is silently dropped. Avoid
+    // printing invalid sequences.
+    if (phutil_is_windows()) {
+      $value = phutil_utf8ize($value);
+    }
+
     return $value;
   }
 }