Joomla! Issue Tracker | Joomla! CMS #39583 - [4.2] PHP8.2 Replace deprecated utf8 encode/decode to mb string

? Composer Dependency Changed PHP 8.x ?

Fixed in Code Base
14 Jan 2023
Medium
Build: 4.2-dev
# 39583
Diff
Digital-Peak:php82/utf8

Pending

User tests: Successful: Unsuccessful:

laoneo
9 Jan 2023

Pull Request for Issue #38258 and partially for #38922.

Summary of Changes

This pr replaces the few calls to utf8_encode/utf8_decode to mbstring. It uses hardcoded the latin 1 character set (ISO-8859-1).

[Update]
It also makes the mbstring polyfill an explicit dependency, so we ensure that it still works on hosts, which do not have the mbstring module loaded.

~~If we should support all, then 'ISO-8859-1' can be replaced with mb_list_encodings().~~ See comment below why not!

Testing Instructions

Actual result BEFORE applying this Pull Request

A deprecated warning is shown.

Expected result AFTER applying this Pull Request

No deprecated warning is shown.

Link to documentations

Please select:

Documentation link for docs.joomla.org:
No documentation changes for docs.joomla.org needed
Pull Request link for manual.joomla.org:
No documentation changes for manual.joomla.org needed

fc75902 9 Jan 2023

Replace deprecated utf8 encode/decode to mb string

joomla-cms-bot - change - 9 Jan 2023

Category

⇒

Front End com_users Installation Libraries

laoneo - open - 9 Jan 2023

laoneo - change - 9 Jan 2023

Status

New

⇒

Pending

laoneo - change - 9 Jan 2023

The description was changed

laoneo - edited - 9 Jan 2023

nikosdion - comment - 9 Jan 2023

If we should support all, then 'ISO-8859-1' can be replaced with mb_list_encodings().

No, we absolutely shouldn't!

The reason utf8_decode was deprecated is that it only ever supported ISO-8859-1 as per https://wiki.php.net/rfc/remove_utf8_decode_and_utf8_encode in the Introduction at the very top of the page (emphasis mine):

The built-in functions utf8_encode and utf8_decode convert strings encoded in ISO-8859-1 (“Latin 1”) to and from UTF-8, respectively. While this is sometimes a useful feature, they are commonly misunderstood

I am unwilling to consider this PR a good solution because mbstring is actually not a hard requirement for Joomla 4 per the published technical requirements. I know it's nonsensical and probably a mistake on that page.

Unless that page is amended to list mbstring as a hard requirement it makes far more sense to follow the approach in #38922 with a function which uses whatever is actually available — including a fallback to a pure PHP implementation. Do note that the code in #38922 is in fact following the “Throwing the kitchen sink at it” section of the PHP RFC for utf8_decode deprecation, i.e. the “official” way PHP core developers want us to polyfill this deprecated feature. It also uses utf8_decode on older PHP versions to remove surprises, so there's that.

As a second stage we should look at every instance of utf8_decode and decide if it actually makes sense to be doing that (most of that code came from early PHP 5 versions which didn't support UTF8 natively) and whether we actually meant to use ISO-8859-1 (e.g. when handling URLs that's deliberate) instead of blindly trying to use whatever encoding. I believe that in most cases we'll just end up removing the decoding altogether.

laoneo - comment - 9 Jan 2023

I really like the idea to recheck if the code still makes sense and then remove what is not really needed. For now to improve compatibility I would prefer to replace the deprecated functions with alternatives.

From a maintenance perspective I would not add two helper functions which do basically replace 5 native function calls, even when it requires to change the compatibility requirements.

nikosdion - comment - 9 Jan 2023

@laoneo I was thinking of a one-two approach. First add a drop-in replacement marked as @internal and @deprecated 5.0, then go through the (few, to be fair) instances where we use that function, thus removing the need for the drop-in replacement and the drop-in replacement itself.

This would be perfectly viable with regards to semantic versioning as @internal clearly communicates “thou shalt not use” to 3PDs and @deprecated sets the expectation for dropping the darned thing on a very short horizon.

Why do that instead of just doing the second step? To make sure that real world use of the replacement doesn't cause any issues in use cases we have never thought of or could have expected just by looking at the code. I am a bit conservative here but I have seen… THINGS ?

8288372 11 Jan 2023

Make mb string polyfill dependency explicit

joomla-cms-bot - change - 11 Jan 2023

Category

Front End com_users Installation Libraries

⇒

Front End com_users External Library Composer Change Installation Libraries

laoneo - change - 11 Jan 2023

Labels

Added: ?

9315021 11 Jan 2023

Merge branch '4.2-dev' into php82/utf8

laoneo - change - 11 Jan 2023

Labels

Added: Composer Dependency Changed

laoneo - comment - 11 Jan 2023

We have been discussing this in maintainers and in #38922. Making the mbstring polyfill an explicit dependency and converting the core code to mbstring is the least disruptive change to improve PHP 8.2 compatibility in the 4.x series.

laoneo - change - 11 Jan 2023

The description was changed

laoneo - edited - 11 Jan 2023

laoneo - change - 11 Jan 2023

The description was changed

laoneo - edited - 11 Jan 2023

f32cedb 11 Jan 2023

Order

laoneo - change - 12 Jan 2023

Title

…

Replace deprecated utf8 encode/decode to mb string

[4.2] PHP8.2 Replace deprecated utf8 encode/decode to mb string

laoneo - edited - 12 Jan 2023

68ee206 12 Jan 2023

Merge branch '4.2-dev' into php82/utf8

laoneo - change - 12 Jan 2023

Labels

Added: PHP 8.x

carlitorweb - test_item - 12 Jan 2023 - Tested successfully

carlitorweb - comment - 12 Jan 2023

I have tested this item ✅ successfully on 68ee206

For test it. Login in the frontend, and go to the profile user page

_{This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39583.}

f3e4c20 13 Jan 2023

Merge branch '4.2-dev' into php82/utf8

laoneo - change - 13 Jan 2023

The description was changed

laoneo - edited - 13 Jan 2023

joomdonation - test_item - 13 Jan 2023 - Tested successfully

joomdonation - comment - 13 Jan 2023

I have tested this item ✅ successfully on f3e4c20

_{This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39583.}

joomdonation - alter_testresult - 13 Jan 2023 - carlitorweb: Tested successfully

joomdonation - change - 13 Jan 2023

Status

Pending

⇒

Ready to Commit

joomdonation - comment - 13 Jan 2023

I restored test result from @carlitorweb because it was lost for some reasons. RTC

_{This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39583.}

fbb4872 14 Jan 2023

Merge branch '4.2-dev' into php82/utf8

roland-d - change - 14 Jan 2023

Labels

Added: ?

roland-d - change - 14 Jan 2023

Status	Ready to Commit	⇒	Fixed in Code Base
Closed_Date	0000-00-00 00:00:00	⇒	2023-01-14 11:01:41
Closed_By		⇒	roland-d

roland-d - close - 14 Jan 2023

roland-d - merge - 14 Jan 2023

roland-d - comment - 14 Jan 2023

Thank you

Add a Comment

Older
Newer

Joomla! Issue Tracker - CMS

[#39583] - [4.2] PHP8.2 Replace deprecated utf8 encode/decode to mb string

Summary of Changes

Testing Instructions

Actual result BEFORE applying this Pull Request

Expected result AFTER applying this Pull Request

Link to documentations

Add a Comment