Have an article with an apostrophe in its title, such as Let's go.
The generated alias is let-s-go.
This is what happens in Joomla 3. In Joomla 4, the apostrophe is stripped and the alias is lets-go.
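To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is a simplified model written in Python for illustration, not Joomla's actual code: the function names are hypothetical, transliteration is left out, and the only assumed difference is that the Joomla 4 style removes apostrophes before non-alphanumeric runs are collapsed into dashes.

```python
import re

# Runs of whitespace or non-alphanumeric characters collapse to a single dash.
NON_ALNUM = re.compile(r"(\s|[^A-Za-z0-9\-])+")


def alias_j3_style(title: str) -> str:
    """Model of the Joomla 3 result: an apostrophe becomes a dash."""
    s = title.replace("-", " ").strip().lower()
    s = NON_ALNUM.sub("-", s)
    return s.strip("-")


def alias_j4_style(title: str) -> str:
    """Model of the Joomla 4 result: apostrophes are dropped first."""
    s = title.replace("-", " ").strip().lower()
    s = s.replace("'", "")  # assumed extra step in Joomla 4
    s = NON_ALNUM.sub("-", s)
    return s.strip("-")


if __name__ == "__main__":
    print(alias_j3_style("Let's go"))  # let-s-go
    print(alias_j4_style("Let's go"))  # lets-go
```

Under this model, characters such as $ or # give the same alias in both variants, since only the apostrophe is handled differently.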
I think this is a significant backward compatibility issue that will break a lot of URLs if people upgrade a Joomla 3 site to Joomla 4.
I have seen a number of reported issues with non-Latin characters such as #27875 and #35014. Those appeared to be related to transliteration happening in a different way, and the language pack being involved.
This is a bit different in that it happens with stock, en-GB Joomla content.
I have not taken the time to look into the code, as this was reported on a French group by others but I tested to see if it appears as well on stock en-GB Joomla. I think this should be modified and reverted to the Joomla 3 behavior, mostly for content B/C reasons and the expected ranking drops if a URL changes unknowingly.
Note that I tested with other usual suspects such as $ or #, and Joomla 3 and 4 behavior appears to be the same for these characters.
Has this been reported before?
Care to expand or link to something where this was discussed, so we could get an idea of the reasons behind it?
(I looked through 18 pages of "alias" related GitHub issues before posting the above, could not find anything)
In any case, the URL(s) that were created in J3 will not be modified.
Hi @infograf768 :)
You're right, existing aliases won't be modified, except in edge cases where articles or menu items are deleted and then re-created for instance.
So this likely is not a general B/C break I guess.
@brianteeman OK, I see. I can see it kind of works OK for the English language, not so much for French, where the apostrophe is often going to be at the start (d'évaluer and dévaluer really are not the same word, for instance).
Too late to change now anyway.
Not tested, but can't the custom transliterate function in a language pack address this? https://docs.joomla.org/J3.x:Making_a_Language_Pack_for_Joomla#Example_2_-_Custom_transliteration_implemented
It should, the call to transliterate() happens before the regular expression that drops anything but Latin alphanumeric characters.
That's likely where the change should have happened, in the en-* localise.php files.
I'll suggest that to the French localization team.
PS: I noticed that your change was done in src/j4/libraries/src/Filter/OutputFilter.php but not in the vendored folder, src/j4/libraries/vendor/joomla/filter/src/OutputFilter.php, so there's likely an inconsistency risk down the line, right?
OK, this kind of works, but likely won't with multilingual sites.
The thing is, the transliterator called is the one corresponding to the current admin language.
So if the backend is displayed in French and you create a French article (language set to FR in the article options), then the transliterate() method in fr-FR.localise.php is used and I can put a change there to preserve the apostrophe.
However, if the backend is set to English and I create/edit the same article, then the default localise.php file is used, and my custom transliteration for French is not applied.
So for most people, I guess adding ' ' => '\'', at the top of the $glyph_array in transliterate() will do the job.
For multilingual sites, they'll be better off adding a transliterate method in localise.php, so that it applies to all items.
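The reason the glyph entry helps can be sketched as follows. This is a rough Python model under stated assumptions, not the Joomla implementation: GLYPHS, transliterate() and make_alias() are hypothetical names, the map mimics the "replacement => comma-separated characters" layout of $glyph_array, and the pipeline assumes transliteration runs before apostrophes are stripped, as described above.

```python
import re

# Hypothetical glyph map: key is the replacement, value is a
# comma-separated list of characters to replace (as in $glyph_array).
GLYPHS = {" ": "'"}


def transliterate(text, glyphs):
    """Replace every listed character with its mapped replacement."""
    for replacement, chars in glyphs.items():
        for ch in chars.split(","):
            if ch:  # skip empty entries
                text = text.replace(ch, replacement)
    return text


def make_alias(title, glyphs=GLYPHS):
    """Model of alias generation with a custom transliteration step."""
    s = title.replace("-", " ")
    s = transliterate(s, glyphs)  # apostrophe becomes a space here
    s = s.strip().lower()
    s = s.replace("'", "")        # assumed Joomla 4 step, now a no-op
    s = re.sub(r"(\s|[^A-Za-z0-9\-])+", "-", s)
    return s.strip("-")


if __name__ == "__main__":
    print(make_alias("Let's go"))             # let-s-go
    print(make_alias("Let's go", glyphs={}))  # lets-go
```

Because the apostrophe has already been turned into a space by the time the stripping step runs, it ends up as a dash, which matches the Joomla 3 style alias.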
Again, I'd think this should be done in per language transliterate but it's too late now for 4.x.
Is there anything left to do here, or should it be closed?
It was a deliberate change