From 9ac0b69798c6717666a4a251a1586f651b0eef13 Mon Sep 17 00:00:00 2001
From: epriestley
Date: Tue, 2 Oct 2018 10:43:48 -0700
Subject: [PATCH] [Wilds] Sanitize UTF8 output in `tsprintf(...)` under Windows

Summary:
Ref T13209. In PHP, when you `echo` or `print` certain invalid sequences to the `cmd.exe` terminal under Windows 10, the entire string just vanishes into the ether.

I ran into this because `arc unit` was reporting "1 failing test" but not actually printing a test failure. That's because the failing test was the surrogate filtering test, and the test failure contained a reserved UTF16 surrogate sequence ("Expected: ; Actual: "). See D19724.

To try to limit the damage this can cause, explicitly `phutil_utf8ize(...)` the output under Windows. When we don't //need// to do this I think it's slightly better not to (occasionally, the raw input might be useful in debugging or understanding something) which is why I'm not just doing it unconditionally.

Test Plan:
  - Wrote a script which did `echo tsprintf("%s", "");`.
  - On Windows 10 in `cmd.exe`, saw it print something instead of printing nothing.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13209

Differential Revision: https://secure.phabricator.com/D19725
---
 src/xsprintf/PhutilTerminalString.php | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/xsprintf/PhutilTerminalString.php b/src/xsprintf/PhutilTerminalString.php
index 8d99b093..1d42f288 100644
--- a/src/xsprintf/PhutilTerminalString.php
+++ b/src/xsprintf/PhutilTerminalString.php
@@ -70,6 +70,13 @@ final class PhutilTerminalString extends Phobject {
       $value = preg_replace('/\r(?!\n)/', '', $value);
     }
 
+    // See T13209. If we print certain invalid unicode byte sequences to the
+    // terminal under "cmd.exe", the entire string is silently dropped. Avoid
+    // printing invalid sequences.
+    if (phutil_is_windows()) {
+      $value = phutil_utf8ize($value);
+    }
+
     return $value;
   }
 }