?
avatar avila-devlogic
avila-devlogic
11 Mar 2017

At some occasions JSON data in the database (options data) might contain whitespace characters that will effectively bring the website down since v3.6.3 because stringToObject() throws an exception whenever Json is malformed.

Expected result

That stringToObject strips any whitespace as it's already partially done in https://github.com/joomla/joomla-cms/blob/staging/libraries/vendor/joomla/registry/src/Format/Json.php#L60

Actual result

Website brought down with a message "Error decoding JSON data: Syntax error" whenever whitespace is encountered in registry.

System information (as much as possible)

Joomla! 3.6.5 Stable [ Noether ] 1-December-2016 22:46 GMT

Additional comments

This bug report is about whitespace only.
However, my personal opinion is that malformed Json shouldn't bring the web page down, but such an issue should be logged and reported to the web site administrator. After all, Joomla itself is the one that saved such a malformed registry entry into the database.

Possible resolution @l60:

    $what   = "\\x00-\\x19"; // all whitespace characters except space itself
    $data = trim(preg_replace( "/[".$what."]+/" , '' , $data));

More resources:
https://forum.joomla.org/viewtopic.php?t=937036
https://github.com/joomla/joomla-cms/blob/staging/libraries/vendor/joomla/registry/src/Format/Json.php#L72

avatar avila-devlogic avila-devlogic - open - 11 Mar 2017
avatar joomla-cms-bot joomla-cms-bot - labeled - 11 Mar 2017
avatar mbabker
mbabker - comment - 11 Mar 2017

Joomla doesn't do any sort of parsing of raw JSON strings, other than to remove wrapping whitespace and validate the first and last characters are the expected brackets before trying to convert a string to an object (and if that bracket condition can't be met then it tries to treat it as an INI string). IMO, it shouldn't try to do any manipulation of a potential JSON string either, especially as whitespace is allowed in certain contexts which is why you can have "pretty" formatted JSON. Adding that type of processing seems like it is asking for trouble and could potentially alter user supplied data. Either way, correct use of a Registry object or PHP's json_* functions can't ever cause "Joomla itself is the one that saved such a malformed registry entry into the database" to be a true statement.

Most of the JSON related errors have come from broken data either from extensions, bad migrations, or server related configurations causing unexpected results (I think magic quotes was the result of this). All of the broken data has been blissfully ignored for years. The Registry library now correctly checks for bad data and warns implementors to the problem. The issue is there is no proper error handling of most error conditions so things that should never bring down a site do (i.e. reading from the cache store, or logging, or JSON decoding problems). The fix is not to remove the error throwing.

avatar avila-devlogic
avila-devlogic - comment - 12 Mar 2017

I understand your point and I'm not suggesting to remove error throwing - it's there for the fail-fast sake.
At the very least, we seem to agree that hard-fail "Error decoding JSON data: Syntax error" message isn't solution either as it urges administrator of the web site to look into the code and modify it in order to uncover JSON string and to find offending content.

I don't see any case where user might work with JSON in Joomla directly since it's a user-oriented software.
Either way, this should be addressed, probably in some higher layer of the registry handling.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/14502.

avatar mbabker
mbabker - comment - 12 Mar 2017

The only room for improvement is to basically wrap all uses of new Registry($string); or $registry->loadString($string); in a try/catch. Which means going through a massive ton of code to add error handling for what is essentially someone direct altering the database to create invalid JSON strings. The registry API is only running json_decode() to read a string and json_encode() to write a string, so there isn't anything within the API that can cause an incorrectly formatted string to be created by the API in the first place unless your PHP configuration is set up in such a way that PHP can break it, in which case you've already got bigger issues.

Everything in the libraries/vendor directory is "third party", it is either code that isn't Joomla's or code that is decoupled and usable outside the CMS distribution; the Registry package is one of those. So it can't be changed so that instead of throwing an Exception it calls JFactory::getApplication->enqueueMessage(); and the Registry package is in general unaware of logging tools (no support for the PSR-3 logging interface and it can't depend on being able to call JLog::add() for the same reason as before). So there isn't a "softer" option.

avatar avila-devlogic
avila-devlogic - comment - 12 Mar 2017

Hi Michael,

I'm sorry if I'm ignorant to the Joomla's policy on handling code decoupling, but standing point "there isn't a softer option" is too brittle for my taste.
Component's purpose is not to be purposeful to itself, but to be more manageable and finally to find it's purpose for the user. Hence suggestion to do something about a way to propagate errors in a softer way instead of slapping the user.
Either way, the fix I've proposed will suffice for me ATM.

Best wishes,
Ahmed

avatar mbabker
mbabker - comment - 12 Mar 2017

To be honest, it is not the fact that there is whitespace in the database that has been the problem. It has primarily been other formatting errors (bracket placement, quoting, etc.). So a whitespace fix is honestly a minimal at best solution and doesn't really address the most common problems.

Also, not knowing 100% the JSON specification, it seems to me like it could be a bad idea to do a massive alteration of an input string to remove certain characters. I don't think it's the JSON formatter object's responsibility to validate and alter input to make it usable, these are responsibilities of the downstream implementor (it's basically the same as sanitizing/validating your data before using it elsewhere).

It's an annoying issue, but I do not believe there is a valid fix to be applied in the Registry API. The formatting objects are classes of minimal responsibility (basically an abstraction layer in the Registry to be able to take its internal data object and convert it to a string of several format types and to be able to read that string back, in the case of JSON it is nothing more than a wrapper around json_encode() and json_decode()). Strings created through it are correctly handled. It cannot guarantee the validity of all data sources though and I don't feel it's this layer's responsibility to try and fix broken data.

So the options in the Registry API are to either retain the status quo (alert to broken data) or revert the validation fix and go back to 3.6.2 behavior where the API blissfully ignored broken data and gave no indication that it just gave back a null value and none of your data was added into the Registry.

avatar avila-devlogic
avila-devlogic - comment - 12 Mar 2017

I couldn't agree more on downstream responsibility and core API sets should definitely be strict and not let bad data pass by unnoticed, but that opens another set of questions like why exception couldn't been handled more gradually for downstream to adjust to it. Registry being API so close to the PHP could actually use warnings that would go to the php's error_log instead.
I think that would be the best option.

avatar Bhola-B2C
Bhola-B2C - comment - 14 Mar 2017

Actually, It is referring to which white space... I cant understand..Can anyone help me out?

avatar avila-devlogic
avila-devlogic - comment - 15 Mar 2017

Whitespace that might appear in JSON string coming from a database for whatever the reason. For me it turned out to be database export/import, different mysql versions. But it can happen for whatever reason.

avatar Bhola-B2C
Bhola-B2C - comment - 15 Mar 2017

Do we need to trim whitespace just from beginning and end or do we need to consider those present in middle also?

avatar Bhola-B2C
Bhola-B2C - comment - 15 Mar 2017

Do we need to trim whitespace just from beginning and end or do we need to consider those present in middle also?

avatar avila-devlogic
avila-devlogic - comment - 15 Mar 2017

yes we do as it can end up having whitespaces within

On Wed, Mar 15, 2017 at 7:25 AM, Bhola-B2C notifications@github.com wrote:

Do we need to trim whitespace just from beginning and end or do we need to
consider those present in middle also?


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#14502 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AHW0xhg3GrXSTN1t5t6JY-POGaAJ3i1Jks5rl4RagaJpZM4MaXtK
.

--

Ahmed Vila | Software engineer

[image: Symphony - email signature_UNI.png]

Mobile | +387 62 139 348

Web | www.symphony.is

San Francisco | Sarajevo | Belgrade
No one can whistle a symphony.

avatar franz-wohlkoenig franz-wohlkoenig - change - 4 Apr 2017
Category Libraries JavaScript Libraries
avatar franz-wohlkoenig franz-wohlkoenig - change - 4 Apr 2017
Priority Medium Very low
Status New Discussion
avatar brianteeman brianteeman - change - 18 Aug 2017
Status Discussion Closed
Closed_Date 0000-00-00 00:00:00 2017-08-18 21:05:54
Closed_By brianteeman
avatar brianteeman brianteeman - close - 18 Aug 2017
avatar brianteeman
brianteeman - comment - 18 Aug 2017

So the options in the Registry API are to either retain the status quo (alert to broken data) or revert the validation fix and go back to 3.6.2 behavior where the API blissfully ignored broken data and gave no indication that it just gave back a null value and none of your data was added into the Registry.

I am closing this based on this comment above

Add a Comment

Login with GitHub to post a comment