mirror of https://we.phorge.it/source/arcanist.git synced 2024-11-22 06:42:41 +01:00

Replace function utf8_decode() - deprecated since PHP 8.2

Summary:
The function utf8_decode() was a shortcut to convert strings
from UTF-8 to ISO-8859-1 ("Latin 1").

This function has been deprecated since PHP 8.2 and will be removed
in PHP 9:

https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode

As mentioned in the RFC, if $string is a valid UTF-8 string,
this could be used to count the number of code points:

    strlen(utf8_decode($string))

It works because ISO-8859-1 uses one byte per character and any
unmappable code point is replaced with the single byte '?' in the
output, so the byte length equals the number of code points.
However, the correct native approach is this one:

    mb_strlen($string, 'UTF-8');

Another valid approach is this one:

    iconv_strlen($string, 'UTF-8')

Note that mb_strlen() was introduced in PHP 4, so there
are no compatibility issues in using it.

Note that the mbstring extension is already required by the installation
documentation, so this should not change anything for anyone.
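
For illustration, the replacement can be sketched as follows. This is a
minimal sketch, not the Arcanist code: the helper name
example_utf8_strlen() is hypothetical, and the preg_match_all() fallback
is an assumption that stands in for the project's own fallback.

```php
<?php

// Hypothetical helper for illustration only; not part of Arcanist.
function example_utf8_strlen($string) {
  if (function_exists('mb_strlen')) {
    // Counts code points, reading the input as UTF-8.
    return mb_strlen($string, 'UTF-8');
  }
  // Assumed fallback when mbstring is missing: count matches of "any
  // single code point" using PCRE's UTF-8 mode. This returns false if
  // the input is not valid UTF-8.
  return preg_match_all('/./us', $string);
}

// "héllo", written with explicit UTF-8 bytes for the "é".
$sample = "h\xC3\xA9llo";

echo strlen($sample), "\n";              // byte length
echo example_utf8_strlen($sample), "\n"; // code point count
```

With the sample above, strlen() reports 6 bytes while the helper reports
5 code points, which is the difference the patch needs to preserve.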

https://we.phorge.it/T15188

https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode

https://www.php.net/manual/en/function.utf8-decode

https://www.php.net/manual/en/function.mb-convert-encoding.php

https://github.com/rectorphp/rector/blob/main/docs/rector_rules_overview.md#utf8decodeencodetombconvertencodingrector

Closes T15188

Test Plan:
- I was able to execute "arc lint" from PHP 8.2
- I was able to execute this "arc diff" from PHP 8.2
- With this patch you can still run "arc lint" with your local version

Reviewers: O1 Blessed Committers, avivey

Reviewed By: O1 Blessed Committers, avivey

Subscribers: speck, tobiaswiese, Matthew, Cigaryno

Maniphest Tasks: T15188

Differential Revision: https://we.phorge.it/D25092
Valerio Bozzolan 2023-03-25 10:49:35 +01:00
parent 9e1bb955fa
commit 08dfffd5ca


@@ -288,8 +288,12 @@ function phutil_is_utf8_slowly($string, $only_bmp = false) {
  * @return int The character length of the string.
  */
 function phutil_utf8_strlen($string) {
-  if (function_exists('utf8_decode')) {
-    return strlen(utf8_decode($string));
+  if (function_exists('mb_strlen')) {
+    // Historically, this was just a call to strlen(utf8_decode($string))
+    // but, since PHP 8.2, that function is deprecated, so this is
+    // the current equivalent.
+    // https://we.phorge.it/T15188
+    return mb_strlen($string, 'UTF-8');
   }
   return count(phutil_utf8v($string));
 }