As title says.
Completes #20711
In #20711, when using a non-Latin language, the aliases for the category and article are just the language tag, for example el-gr for Greek.
After this patch they will be unicode, i.e. for a Greek article:
Title: Άρθρο (el-gr) Alias: άρθρο-el-gr Category: Κατηγορία (el-gr)
For a Greek category:
Title: Κατηγορία (el-gr) Alias: κατηγορία-el-gr
As the Greek pack has issues, install at least the Persian language (it is one of the languages installable in 4.0).
Patch and install the Multilingual sample data.
Status | New | ⇒ | Pending |
Category | ⇒ | Front End Plugins |
Setting unicodeslugs is something that is normally done in the global configuration. IF I read this code correctly, it will only set the slugs to unicode for the sample data. Won't that then lead to confusion for the user when they create content and don't get unicode slugs?
... look at blog.php before commenting. thanks.
I said IF I read it correctly
However, I have now read that file, which isn't used in this PR, and it has the same bug.
That just explains why it was done this way; it doesn't mean it is the correct way.
As it is the same code we already have in the other sample data plugin, it should be fine.
With J4 the transliteration comment pointed out before should no longer be a problem, so it can probably be done "right" now.
> Won't that then lead to confusion for the user when they create content and don't get unicode slugs?
How were the slugs generated in the sample data SQL files? The same potential confusion point exists, depending on the state of the database and global config that the data was exported from.
Serious question now, because I don't do multilanguage (other people do my dirty work on the sites I maintain that support it) and because I don't know the history or requirements of the feature.
With J4 and PHP 7, is there anything in transliterated versus unicode that really needs us to keep having this behavior toggle? Or what is the real gain to keeping this behavior toggle?
> How were the slugs generated in the sample data SQL files? The same potential confusion point exists, depending on the state of the database and global config that the data was exported from.
In 3.x, this is not solved and we get el-gr or fa-ir as alias when installing Joomla with the optional multilang with a non-Latin language. I guess this is because such a multilang setting was designed just to show how a basic multilingual site would work, and it did not really matter. Or because we just did not consider the problem, which I only found now.
After all, as we always got an alias that was readable, it looked ok.
Importing anything containing a unicode alias does not break a site at all.
When editing and saving such an item, the alias could indeed be changed to a date if the unicode alias option is not set. That has been the default behavior since 1.5 when the name/title can't be transliterated. If some parts of the alias are pure ASCII, those parts are kept, as explained.
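A minimal sketch of the two behaviors described above, written in Python purely for illustration. It is not Joomla's code: the function names are hypothetical, the per-language transliteration step is left out, and the timestamp format of the fallback is an assumption (the point is only that an empty ASCII alias is replaced by a date).

```python
import re
from datetime import datetime


def unicode_alias(title: str) -> str:
    """Keep letters and digits of any script; collapse everything else to '-'."""
    return re.sub(r"[\W_]+", "-", title.lower()).strip("-")


def ascii_alias(title: str, now: datetime) -> str:
    """Keep only a-z and 0-9; fall back to a timestamp if nothing is left."""
    alias = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Assumed fallback format; the discussion only says "a date".
    return alias or now.strftime("%Y-%m-%d-%H-%M-%S")


now = datetime(2018, 6, 16, 12, 0, 0)     # arbitrary fixed time for the example
print(unicode_alias("Άρθρο (el-gr)"))     # άρθρο-el-gr
print(ascii_alias("Άρθρο (el-gr)", now))  # el-gr
print(ascii_alias("Άρθρο", now))          # 2018-06-16-12-00-00
```

With the sample data titles, the ASCII variant keeps only the el-gr part, which is the alias seen before this patch.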
If this bothers some people so much, we can decide to use a fixed alias instead of unicode for this multilingual sample data plugin, for example:
for category
'alias' => 'cat-' . strtolower($itemLanguage->language),
for article
'alias' => 'art-' . strtolower($itemLanguage->language),
EDIT: imho there is no way to switch to non-unicode for the blog sampledata plugin.
> With J4 and PHP 7, is there anything in transliterated versus unicode that really needs us to keep having this behavior toggle? Or what is the real gain to keeping this behavior toggle?
Can you explain what you mean by behavior toggle? If you mean keeping the unicode alias feature switch, anyone using Wikipedia would immediately understand why we implemented the feature and should keep it. Transliteration via a PHP file just does not exist for Arabic, Chinese, etc., and it is now common to use unicode in URLs as well as IDN, which we also implemented.
Using PHP 7, as we have discussed recently, has nothing to do with the matter. We do know there is a specific PHP transliteration method, but it requires a module that is not enabled on all servers.
In any case, even if we were forcing that (which you said we can't), it would only solve our need for some ASCII transliteration in packs and in Joomla generally speaking. Users should still be able to switch to pure unicode.
Sorry, did not think that you would maybe mean the opposite, i.e. no switch but always unicode.
It was decided to leave the choice to the user, as copy/pasting unicode URLs (in a forum, in a mail, or in Glip for example) may not give the same results depending on the browser you copied them from.
For example, Firefox copies the URL with percent-encoding. In this case there is NO problem, the link is clickable.
Safari copies it as unicode and that may break the link.
[Screenshots: the pasted link in Glip, copied from Safari and from Firefox]
Clicking on a link in the html does not create any issue.
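The two copied forms can be compared with a short Python sketch (illustration only; the path reuses the sample data alias from this PR). The percent-encoded form is plain ASCII, which is presumably why it stays clickable wherever it is pasted, and it decodes back to the same unicode path.

```python
from urllib.parse import quote, unquote

path = "/άρθρο-el-gr"             # unicode alias, as Safari would copy it
encoded = quote(path)             # percent-encoded UTF-8, as Firefox would copy it

print(encoded.isascii())          # True
print(unquote(encoded) == path)   # True
```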
> Using PHP7, as we have discussed recently, has nothing to do with the matter. We do know there is a specific transliteration php method, but it requires a module that is not enabled on all Servers.
True, the Intl extension may not always be installed (it isn't by default). Another matter at the time was the PHP Transliterator wasn't available for PHP 5.3, that was the bigger hurdle to things with support in 3.x. Maybe we're in a state now where we can re-evaluate and decide if we can even conditionally support that or if we need to stick with the code we already have.
And thanks for the rest, it honestly clears up why the unicode config option is there for me.
Just a thought, but is there any reason that the unicode alias is a global setting as opposed to a language setting? Would someone in Greek want a non-unicode alias? I don't know. But perhaps if the unicode alias setting was moved to the language XML definition, and could then be overridden in the content language setting if needed, this problem would be resolved. As soon as the Greek content language was created, it would know that it should use a unicode alias, etc.
Using a unicode alias or not is a choice for every site and should never be enforced by the lang pack.
And yes, some Greek users do not want unicode, which is why a transliteration method was enforced in the Greek pack. As content languages can be deleted, they obviously can't be used to define such a parameter.
I did NOT say enforced
Not sure how to test this. I installed Persian, applied the patch, created the sample data module and clicked on Multilingual Sample Data. A Persian menu was then installed with a Home item and a hidden category list item, but I don't see any items in there.
> I don't see any items in there.
@coolcat-creations That is because this is waiting for the decision on #21553.
Can this be tested now?
Drone relaunched
Can be tested again :)
I have tested this item
I just followed the steps as written in the testing instructions: clean installation of Joomla 4 alpha 12, installed the Persian language and created the samples. Then I edited a menu entry in the menu manager for fa-IR and got the same result as already described in the comments: #20759 (comment)
//Edit: I need to add some lines to this.
As written in the comment I've linked, it will work if the unicode setting is set to true. The question is, shouldn't it work if I just follow the test instruction steps, the way a normal user would install only the multilanguage sample data without turning the setting on?
Looking at the changed code in multilang.php, one can see a way to reach the ideal result by turning the unicode setting on. The problem here is that it won't save the changed setting:
Factory::getConfig()->set('unicodeslugs', $unicode);
@infograf768 I am not sure what the correct way is here either. However if you install the sample data and you get your Unicode slugs and after that edit an item and save it, the Unicode slug is gone. @brianteeman mentioned that as well in his comment ( #20759 (comment) ).
My reasoning is, people installing sample data are most likely doing this in either a test site or a development site, not in a live site. So I don't see changing the Unicode setting to true as a problem. Even if it is set to true, if the user doesn't use a language requiring it, they won't notice it.
Why wouldn't we set the Unicode to Yes when importing Unicode slugs?
> Why wouldn't we set the Unicode to Yes when importing Unicode slugs?
Because this is only sample data and the user may not want to use unicode slugs for the final site, including for non-Latin languages. As I explained already, some users of the Greek language may prefer to use the Greek default transliteration.
> As I explained already, some users for Greek language may prefer to use the Greek default transliteration.
So we have 2 evils here :) Users whose language can be transliterated but don't want it and users who want to use it but their transliterated items are broken :/
I have tested this item
After installing the sampledata I can see the sample items with Unicode slugs
All clear, test is good as well.
I have tested this item
Installed a non-Latin language (Persian), applied the patch and installed the multilingual sample data. Everything went well; the aliases for the category and article are in unicode, as expected.
Status | Pending | ⇒ | Ready to Commit |
Status "Ready To Commit". 1 Build failing.
Status | Ready to Commit | ⇒ | Fixed in Code Base |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2019-09-09 10:50:11 |
Closed_By | ⇒ | wilsonge |
Thanks JM!
@Bakual @laoneo