? No Code Attached Yet a11y bug ?
avatar crystalenka
crystalenka
15 Nov 2022

This is not the first time this is coming up and if it's not fixed, it won't be the last. :)

This is collection of the open and closed issues I found so far that noted similar issues: (I'm sure I missed some)

I am creating a new issue to more comprehensively define the problem, and why it absolutely needs to be fixed in either J4 or J5. I am hoping that by collecting the information and discussion here, we can come up with a solution together.

When introduced, custom fields were a way to add additional static data to articles. It wasn't really flexible enough to do anything beyond that so the issues with versioning/article preview were less pronounced (though still there).

Now with subforms and many 3rd party custom fields plugins, custom fields has become a way to create structured content within Joomla. This is fantastic! Except... there's no versioning. Changes don't get shown with the preview button. Etc etc.

Since it's firmly part of the 'content' mental model of most users now on sites that have custom fields, the lack of transparency into whether or not custom fields are versioned is severely lacking and creating the wrong impression. I know there are architectural challenges to this as noted in previous issues, but this is a must-have for us to be able to ethically promote the 'versioning' feature. Both are core supported features, and they need to work together.

So.... what do we do? How do we solve this?

avatar crystalenka crystalenka - open - 15 Nov 2022
avatar joomla-cms-bot joomla-cms-bot - labeled - 15 Nov 2022
avatar joomla-cms-bot joomla-cms-bot - change - 15 Nov 2022
Labels Added: ? No Code Attached Yet a11y ?
avatar joomla-cms-bot joomla-cms-bot - labeled - 15 Nov 2022
avatar nikosdion
nikosdion - comment - 15 Nov 2022

Versioning is implemented in plg_behaviour_versionable by hooking into the table's after store and before delete events.

Fields are not part of an article or any 3PD extension content. They are not saved in the same table. They are not part of the article's params column in the #__content table (nor should they ever be). They are stored in their own table, #__fields_values. If you version them, you need an interface which allows you to edit each individual field outside of the context of where it belongs so you can restore its old versions individually. This can very quickly lead to a clusterfsck of epic proportions.

If we changed fields to be saved as parameters to the table it would be a massive b/c break with no way to create a transitional release. It would also make any notion of searching or filtering by fields impossible on the 99.999% of live servers which do not have the MySQL JSON extensions installed.

Further to that, the field definitions themselves may differ over time which further complicates what is versionable and what is not.

The Preview button currently loads the front-end view of the specific article as it is already saved in the database. It doesn't magically push the changes you have made on the edit page. This is due to practical considerations. You can't have the different models and any number of arbitrary view template overrides magically replace data in their SQL queries —especially the ones JOINing data from different tables— to construct an accurate preview. An accurate preview would only work if the totality of the information required for article presentation was self-contained in the #__content table.

Essentially, what you are asking for is whether we can use a NoSQL approach to articles to which the answer is No for several reasons:

  • Abusing an RDBMS as a NoSQL store has never worked for anyone.
  • Using a real NoSQL database makes that CMS 100% incompatible with previous versions and all extensions. Moreover, it makes it unusable but on a handful of hosts which offer a compatible NoSQL server. In other words, it's a new product, with a different target audience that's already better served by PHP frameworks (and typically operates on microservices, not a monolithic architecture like Joomla).
  • Even if we half-ass it it would be a fundamental b/c break for all existing sites, it would make custom fields non-searchable, and break all 3PD extensions using custom fields, penalising once more the 3PDs who are stupid enough like me to trust Joomla won't screw them over.

It is a 100%, hand down, big ABSOLUTELY HELL NOT from me.

The correct approach to this is having a staging site. Do your changes to the staging site, move them over to the live site.

avatar brianteeman
brianteeman - comment - 15 Nov 2022

Also see #38658 (comment) for an alternative possibility which is exactl what another cms uses

avatar nikosdion
nikosdion - comment - 15 Nov 2022

@brianteeman As I said, by doing what you described

it would make custom fields non-searchable, and break all 3PD extensions using custom fields, penalising once more the 3PDs who are stupid enough like me to trust Joomla won't screw them over.

avatar brianteeman
brianteeman - comment - 15 Nov 2022

I agree that it woud not be b/c and I say as much.

How would it make then mon-searchable though?

avatar nikosdion
nikosdion - comment - 15 Nov 2022

Storing the fields data in the #__content table means that they are JSON-encoded, e.g. as part of the params field or as part of a new field, let's call it fields.

Problems:

  • You need a different architecture in the system plugin and the com_fields model to implement that.
  • Existing fields data can no longer be read.
  • Even if Joomla creates a migration for its own crap, third party extensions are completely broken and need to be reimplemented WITHOUT being able to offer a version which works on both Joomla versions (before and after the change), WITHOUT being able to offer a realistic migration. Therefore we go back to where 3PDs say “fsck Joomla” and go back to implementing their own thing because Joomla proves itself to be completely unreliable, once again.
  • Fields become unsearchable (remember that K2 was using this approach and you and me had talked about how it made custom fields unsearchable in great length a decade ago with its developers). You cannot have a database query which targets specific data in a JSON document inside a table column UNLESS your database server supports the MySQL JSON data type. This creates two sub-problems
    • The JSON support in MySQL is slow, optional, and turned off on every single hosting provider I have gotten my hands on (and they are quite a lot, thanks to what I do for a living). In other words, we create a CMS which can only be installed on dedicated servers, killing Joomla by excluding its core target market. That sounds stupid.
    • PostgreSQL has an entirely different and completely incompatible language for handling JSON data. The differences are so fundamental they cannot be addressed with our query abstraction architecture. We either need to have completely separate database queries for each DBMS which will inevitably lead to massive bugs and feature disparity between DBMS, or we need to throw away the entire Joomla DatabaseDriver and start from scratch, therefore completely killing ALL third party extensions — which means that we killed Joomla.

Does it make sense to save the patient's toe if in the process of doing so we kill the patient?

The other alternative is to have a hybrid method which saves new data to a JSON-encoded field, reads from both the existing table AND the JSON-encoded field and agree that fields data is completely non-searchable in a database query. Therefore we are removing features we already have, breaking b/c super duper extra hard (and losing the last shreds of developer trust and credibility as a reliable software development platform) and we have still not solved the preview problem. We have not solved it because previewing changes requires knowing:

  • Which menu item is this going to be rendered in, as it may use a different template than the default site template
  • Which view template override is going to be used, because this decision might be made contingent on runtime data by the template itself
    Further to that, we would need to somehow inject the currently modified data without saving it to the database, into a query that joins the #__content table with half a dozen other tables. Changing that query to not use the JOINs would decrease the site's performance by a factor of 8 which is suicidal for SEO and usability.

So, again, I am asking: has anyone actually thought about the implementation details and how they don't make sense? Or is both this RFC and the discussion wishful thinking without any attempt to think beforehand whether it can be realistically implemented and what it really means if it is implemented? Because what you are describing is no longer Joomla, it's Joomla X i.e. a different CMS for a target audience who doesn't actually wants to use a CMS and has no interest in using that product.

avatar brianteeman
brianteeman - comment - 15 Nov 2022

I did not say anything about storing each field as json in a single params field. I know that would be a poor option and everything you say is true.

However what I said was

When you create the field it also adds a field to the #__content table

That obviously was not clear enough so I will try again. In other words you alter the content table by adding a new db field everytime you add a new custom field

avatar nikosdion
nikosdion - comment - 15 Nov 2022

@brianteeman I will let you figure out on your own why this is absolutely catastrophic. Hint: talk a database administrator, ideally not in person (unless you have complete disregard for your personal safety).

avatar brianteeman
brianteeman - comment - 15 Nov 2022

as far as I have been told this is exactly what drupal does.

avatar nikosdion
nikosdion - comment - 15 Nov 2022

Please, at the very least, read https://dev.mysql.com/doc/refman/8.0/en/innodb-row-format.html if not the entire https://dev.mysql.com/doc/refman/8.0/en/innodb-storage-engine.html section. I won't even start telling you why table column proliferation leads to insane performance degradation when you start JOINing tables, something we need to do unless we regress to Joomla 1.5 levels of lack of permissions management; I think that we were all taught matrix multiplications in highschool and all of us participating in this kind of discussion understand that JOINing tables results in matrix multiplication between tables with everything that means about stack and temporary filesystem space usage.

There is a reason why a Joomla site which can be comfortably served by a single, moderately beefy server needs a cluster of web and primary-secondary database servers to serve the same load with Drupal.

Moreover, just because someone else is doing something wrong —presumably for their own legacy reasons— doesn't mean we should do adopt it. For example, WordPress uses a varchar, non-indexed field with the literal string values yes and no in its #__options table to select which options, stored as PHP serialised data, are going to be preloaded on each request. This table runs to the hundreds of thousands of rows on a typical site and I don't think it takes a particularly sharp DBA to figure out that using an unindexed VARCHAR column as the selector running on each page load is suicidal for performance, nor does it take a particularly sharp PHP developer to realise that PHP serialised data is the unique combination of slow, unsafe, and error-prone when JSON-serialised data could have been easily used in its place. WordPress developers are keenly aware of this problem but cannot fix it because their stance on b/c is stubbornly combative, even to the detriment of their own product (which is something afforded to them by the fact that they are immensely popular, giving them the false idea that users don't have a choice).

Finally, and I will stress this very strongly, THERE WERE CCKs towards the mid- to end of life of Joomla 1.5 which did exactly what you described — in fact they were altering both the users and the content tables. The end result was slow sites, broken updates, and these CCKs dying off by the time Joomla 3.2 was around.

As you keep saying, Brian, just because you can doesn't mean you should. This is one of these cases.

I fully understand the issues with storing the field values as separate entries on a big database table. You might remember that this is a discussion I've had with you several times. As I had told you, it's not ideal but the practical alternatives are far worse.

The ideal solution would be commercial hosts supporting the JSON MySQL data type and Joomla dropping PostgreSQL support (which, even according to Joomla's inaccurate stats, barely breaches 0.1% across all time, and barely 0.05% in recent data). However, we are not there yet. First of all, we have to contend with the silly ideological ankylosis which proclaims that we have to have 99.95% of our users using MySQL and MariaDB suffer the lack of performance optimisation for the benefit of the 0.05% who might use PostgreSQL (and how many of them in production is even more debatable). Second, there is no buy in from commercial hosts to support the JSON field type on their servers as it would compress their profits (it's slow, meaning their server loads would increase and they'd have to put fewer sites on each physical server).

In short, there is a path forward which is ideal and COULD be made with a modicum of b/c in mind (at the very least the API could be stable and a migration solution offered) but the conditions are not ripe just yet. I reckon that by Joomla 7 we might be able to re-evaluate the conditions.

avatar Septdir
Septdir - comment - 15 Nov 2022

Saving field values ​​in a JSON column is faster than using classic EAV. But it will be noticeable with many items.
To be honest, I was just amazed at how big the increase was in using JSON with 700,000 items in one category without caching.

However, as @nikosdion rightly noted, this requires the technical compliance of the server. Namely, you need MYSQL 8.0.21 or MariaDb 10.4.1 and shared hosting is not ready to change the database version, even considering that MYSQL5 support will end in a year.

Also, as @nikosdion again correctly noted, there is a compatibility problem with third-party extensions.

To summarize, this is applicable in separately created components, where you can set certain requirements for the server, but I'm not sure that this is justified in the case of standard fields

avatar Hackwar Hackwar - change - 22 Feb 2023
Labels Added: bug
avatar Hackwar Hackwar - labeled - 22 Feb 2023
avatar SemaphoreOxalis
SemaphoreOxalis - comment - 17 Nov 2023

Hi,
Thanks for all the explanation.
With the new J!5 database requirement and because J!5 is a move forward version maybe it's now possible ?

Add a Comment

Login with GitHub to post a comment