?
avatar mahagr
mahagr
11 May 2016

One of our users ran into some issues with his Joomla installation because of his server had default_charset=latin1 in php.ini.

Warning: htmlspecialchars(): charset `latin1' not supported, assuming utf-8 in ....

Documentation for htmlspecialchars() says that: "PHP 5.6: The default value for the encoding parameter was changed to be the value of the default_charset configuration option"

The issue can be fixed by making sure that all htmlspecialchars() calls have encoding set.

Additionally in my tests there is an issue if for some reason incoming encoding is wrong, which is easy to fix by using ENT_SUBSTITUTE (PHP 5.4+ only!):

    var_dump(htmlspecialchars('Täällä olen minä',  ENT_COMPAT, 'UTF-8'));
    var_dump(htmlspecialchars('T'.chr(132).chr(132).'ll'.chr(132).' olen min'.chr(132), ENT_COMPAT, 'UTF-8'));
    var_dump(htmlspecialchars('T'.chr(132).chr(132).'ll'.chr(132).' olen min'.chr(132), ENT_COMPAT|ENT_SUBSTITUTE, 'UTF-8'));

Results:

string(20) "Täällä olen minä"
string(0) ""
string(24) "T��ll� olen min�"
avatar mahagr mahagr - open - 11 May 2016
avatar Bakual
Bakual - comment - 11 May 2016

Afaik, we usually define the encoding when using htmlspecialchars().
Do you have a specific place where it wasn't done and produced that error?

avatar mahagr
mahagr - comment - 11 May 2016

A quick search indicates that there are probably hundreds of locations where the call only has a single parameter. It looks like the parameters are omitted more often than used.

avatar brianteeman
brianteeman - comment - 11 May 2016

And yet we have never had reports. Isn't this a server config issue?
On 11 May 2016 12:08 pm, "Matias Griese" notifications@github.com wrote:

A quick search indicates that there are probably hundreds of locations
where the call only has a single parameter. It looks like the parameters
are omited more often than used.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#10399 (comment)

avatar mahagr mahagr - change - 11 May 2016
The description was changed
Title
Always use htmlspecialchars($str, ENT_COMPAT|ENT_SUBSTITUTE, 'UTF-8')
Always use htmlspecialchars($str, ENT_COMPAT, 'UTF-8')
avatar mahagr mahagr - change - 11 May 2016
Title
Always use htmlspecialchars($str, ENT_COMPAT|ENT_SUBSTITUTE, 'UTF-8')
Always use htmlspecialchars($str, ENT_COMPAT, 'UTF-8')
avatar mahagr
mahagr - comment - 11 May 2016

Yes, in this case the issues were caused by a bad server configuration, but PHP documentation says that omitting the encoding parameter is not recommended.

But as you still support PHP 5.3, you may get into troubles as the function call in PHP 5.3 still uses ISO-8859-1 encoding by default, which may result on broken characters with UTF-8.

So not having those two extra parameters set is a bad idea in both PHP 5.3 and 5.6+.

avatar brianteeman
brianteeman - comment - 11 May 2016

If it was just one or two places to edit then of course I would say go for
it but if it really is so many places and with no other reports i am
reluctant

On 11 May 2016 at 12:25, Matias Griese notifications@github.com wrote:

Yes, in this case the issues were caused by a bad server configuration,
but PHP documentation says that omitting the encoding parameter is not
recommended.

But as you still support PHP 5.3, you may get into troubles as the
function call in PHP 5.3 still uses ISO-8859-1 encoding by default, which
may result on broken characters with UTF-8.

So not having those two extra parameters set is a bad idea in both PHP 5.3
and 5.6+.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#10399 (comment)

Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/

avatar Bakual
Bakual - comment - 11 May 2016

So latin1 isn't a supported value for default_charset and thus it produced the warning. At least I understand the documentation like this.

I'm a bit with Brian here. While it would be recommended to always specify the encoding in htmlspecialchars(), we didn't really experience any issues in the places where it wasn't done.
Technically, the issue is valid. In practice, it doesn't seem to matter.

If someone wants to track down all occurances and fix them, go ahead. It could be a pain to test though.

avatar mahagr
mahagr - comment - 11 May 2016

LOL, it looks that at least @zero-24 is working on this issue. :)

I reported this issue as it took fairly long time to figure out what was wrong in the user setup and as it affected our software as well. And the warning was indeed caused by bad configuration.

BTW: Would it be a good idea to introduce some common escape() method into Joomla? I hate using htmlspecialchars() for some obvious reasons.

avatar Bakual
Bakual - comment - 11 May 2016

BTW: Would it be a good idea to introduce some common escape() method into Joomla? I hate using htmlspecialchars() for some obvious reasons.

You mean something like https://github.com/joomla/joomla-cms/blob/staging/libraries/legacy/view/legacy.php#L347-L367

avatar mahagr
mahagr - comment - 11 May 2016

Yes, but something that can be used outside of views.

avatar Bakual
Bakual - comment - 11 May 2016

Not sure if it makes sense for modules and plugins as those usually don't have that many strings to escape.
It's only a single function call we're talking about. You may end up writing more code than what you try to replace :smile:

avatar mahagr
mahagr - comment - 11 May 2016

Yeah, I agree its likely much bigger change, but it would prevent people from omitting parameters. But as you can see, nobody uses $this->escape() either in template files... :)

avatar Bakual
Bakual - comment - 11 May 2016

But as you can see, nobody uses $this->escape() either in template files... :)

That was my thinking as well, but a search proved me wrong :smile:

avatar brianteeman brianteeman - close - 11 May 2016
avatar brianteeman brianteeman - change - 11 May 2016
Status New Closed
Closed_Date 0000-00-00 00:00:00 2016-05-11 13:24:55
Closed_By brianteeman
avatar brianteeman brianteeman - close - 11 May 2016
avatar brianteeman brianteeman - close - 11 May 2016
avatar brianteeman
brianteeman - comment - 11 May 2016

Closed as @zero-24 is being a hero


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/10399.

avatar zero-24
zero-24 - comment - 11 May 2016

I'm doing it step by step so it can be easily reviewed and tested :smile:

avatar izharaazmi
izharaazmi - comment - 11 May 2016

Can we have something to abstract this dirty looking function call into something like: JHtmlString::escape() or JHtml::escape() and use it consistently in the core at least?

About the 3rd party developers we can't help much. I have seen many of them still using $_GET, $_REQUEST, $_SESSION etc. So they are not a question here.

PS: If done so, we may deprecate the JViewLegacy::escape too and have a common method everywhere. This is however just a flying idea and needs discussion.

avatar Bakual
Bakual - comment - 11 May 2016

I'm not a huge fan of wrapping a native PHP function into an own static method.
JViewLegacy::escape is a bit special as it allows different encoding and escaping mechanism being used. I don't know if that functionality is actually used anywhere but it's there. If not for that additional feature, I would be in favor of just replacing it with the htmlspecialchars() call directly.

avatar mahagr
mahagr - comment - 11 May 2016

I can see one use for the method: being able to do better in PHP 5.4+:

htmlspecialchars($str, ENT_COMPAT|ENT_SUBSTITUTE, 'UTF-8');

That would not break if input is for some reason in non-UTF-8 encoding.

avatar brianteeman brianteeman - change - 13 May 2016
Labels Added: ?

Add a Comment

Login with GitHub to post a comment