Currently, if "Unicode Aliases" is disabled in the global configuration, aliases are transliterated based on the language currently used in the administration area. This pull request makes all core extension aliases be transliterated based on the language of the actual resource (article, category, banner, etc.). If "All" is selected as the resource language, the site's default language is used as a fallback.
Since all core extensions use JApplicationHelper::stringURLSafe, which calls JFilterOutput::stringURLSafe, both methods now accept a second argument holding the language. All core table classes that call JApplicationHelper::stringURLSafe to generate the alias have also been changed to pass the language of the resource.
The generated alias should now be transliterated correctly. Without this pull request the result would be a date-based alias.
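The "All" fallback described above can be sketched roughly as follows. This is only an illustration: resolveAliasLanguage() is a hypothetical helper that does not exist in Joomla, and it assumes that "All" is stored as '*' and that an empty value should be treated the same way.

```php
<?php
// Hypothetical helper (not part of Joomla): picks the language tag
// whose transliteration rules should be used for an item's alias.
function resolveAliasLanguage(string $itemLanguage, string $siteDefault): string
{
    // Assumption: '*' is the stored value for "All"; an empty value is treated alike.
    if ($itemLanguage === '' || $itemLanguage === '*') {
        return $siteDefault;
    }

    return $itemLanguage;
}

echo resolveAliasLanguage('el-GR', 'en-GB'), "\n"; // el-GR
echo resolveAliasLanguage('*', 'en-GB'), "\n";     // en-GB
```

The resolved tag is what would then be handed to stringURLSafe as its second argument.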
Status | New | ⇒ | Pending |
I have tested this item unsuccessfully on 99896b4
It will work if you replace ...
$lang = JFactory::getLanguage($language);
with
$lang = JLanguage::getInstance($language);
in https://github.com/kavadas/joomla-cms/blob/staging/libraries/joomla/filter/output.php#L75
Also, you need to make sure that when another language is used that cannot transliterate those characters, it falls back to the date alias.
Example: use that Greek title and choose en-GB as the language before saving. Result = empty alias.
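A minimal sketch of the fallback asked for here, under stated assumptions: aliasOrDate() is a hypothetical helper, and gmdate() stands in for whatever date class the table classes actually use.

```php
<?php
// Hypothetical helper: returns the alias unchanged, or a date-based alias
// when transliteration left nothing usable behind.
function aliasOrDate(string $alias, ?int $now = null): string
{
    // An alias consisting only of dashes (or nothing at all) counts as empty.
    if (trim(str_replace('-', '', $alias)) === '') {
        // Assumption: UTC via gmdate(); the real code may use the site timezone.
        return gmdate('Y-m-d-H-i-s', $now ?? time());
    }

    return $alias;
}

echo aliasOrDate('kalosorizoume'), "\n"; // kalosorizoume
echo aliasOrDate('', 0), "\n";           // 1970-01-01-00-00-00
```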
Let me detail things a bit more for testers.
A language pack may or may not include a custom transliterate() method in its xx-XX.localise.php (this file is included in both the admin and site parts of the language).
For example, fr-FR (French) includes:
public static function transliterate($string)
{
$str = JString::strtolower($string);
// Specific language transliteration.
// This one is for latin 1, latin supplement , extended A, Cyrillic, Greek
$glyph_array = array(
'a' => 'a,à,á,â,ã,ä,å,ā,ă,ą,ḁ,α,ά',
'ae' => 'æ',
'b' => 'β,б',
'c' => 'c,ç,ć,ĉ,ċ,č,ћ,ц',
'ch' => 'ч',
'd' => 'ď,đ,Ð,д,ђ,δ,ð',
'dz' => 'џ',
'e' => 'e,è,é,ê,ë,ē,ĕ,ė,ę,ě,э,ε,έ',
'f' => 'ƒ,ф',
'g' => 'ğ,ĝ,ğ,ġ,ģ,г,γ',
'h' => 'ĥ,ħ,Ħ,х',
'i' => 'i,ì,í,î,ï,ı,ĩ,ī,ĭ,į,и,й,ъ,ы,ь,η,ή',
'ij' => 'ij',
'j' => 'ĵ,j',
'ja' => 'я',
'ju' => 'яю',
'k' => 'ķ,ĸ,κ',
'l' => 'ĺ,ļ,ľ,ŀ,ł,л,λ',
'lj' => 'љ',
'm' => 'μ,м',
'n' => 'ñ,ņ,ň,ʼn,ŋ,н,ν',
'nj' => 'њ',
'o' => 'ò,ó,ô,õ,ø,ō,ŏ,ő,ο,ό,ω,ώ',
'oe' => 'œ,ö',
'p' => 'п,π',
'ph' => 'φ',
'ps' => 'ψ',
'r' => 'ŕ,ŗ,ř,р,ρ,σ,ς',
's' => 'ş,ś,ŝ,ş,š,с',
'ss' => 'ß,ſ',
'sh' => 'ш',
'shch' => 'щ',
't' => 'ţ,ť,ŧ,τ,т',
'th' => 'θ',
'u' => 'u,ù,ú,û,ü,ũ,ū,ŭ,ů,ű,ų,у',
'v' => 'в',
'w' => 'ŵ',
'x' => 'χ,ξ',
'y' => 'ý,þ,ÿ,ŷ',
'z' => 'ź,ż,ž,з,ж,ζ'
);
foreach($glyph_array as $letter => $glyphs)
{
$glyphs = explode(',', $glyphs);
$str = str_replace($glyphs, $letter, $str);
}
return $str;
}
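To see the glyph-map approach in action without a Joomla install, here is a reduced sketch. It copies only the two fr-FR entries needed for the title used later in the tests, and it assumes mb_strtolower() as a stand-in for JString::strtolower; transliterateSketch() is a made-up name.

```php
<?php
// Reduced copy of the fr-FR glyph map above: only the entries this example needs.
function transliterateSketch(string $string): string
{
    // Assumption: mbstring is available; mb_strtolower replaces JString::strtolower.
    $str = mb_strtolower($string, 'UTF-8');

    $glyphArray = array(
        'o'  => 'ò,ó,ô,õ,ø,ō,ŏ,ő,ο,ό,ω,ώ',
        'ph' => 'φ',
    );

    // Each map entry replaces every listed glyph with its Latin key.
    foreach ($glyphArray as $letter => $glyphs) {
        $str = str_replace(explode(',', $glyphs), $letter, $str);
    }

    return $str;
}

echo transliterateSketch('ωφmytitle'), "\n"; // ophmytitle
```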
Greek also provides transliteration:
public static function transliterate($string)
{
$str = JString::strtolower($string);
//Specific language transliteration.
//This one is for latin 1, latin supplement , extended A, Cyrillic, Greek
$glyph_array = array(
'afth' => 'αυθ',
'afk' => 'αυκ',
'afks' => 'αυξ',
'afp' => 'αυπ',
'afs' => 'αυσ',
'aft' => 'αυτ',
'aff' => 'αυφ',
'afx' => 'αυχ',
'afps' => 'αυψ',
'efth' => 'ευθ',
'efk' => 'ευκ',
'efks' => 'ευξ',
'efp' => 'ευπ',
'efs' => 'ευσ',
'eft' => 'ευτ',
'eff' => 'ευφ',
'efx' => 'ευχ',
'efps' => 'ευψ',
'ifth' => 'ηυθ',
'ifk' => 'ηυκ',
'ifks' => 'ηυξ',
'ifp' => 'ηυπ',
'ifs' => 'ηυσ',
'ift' => 'ηυτ',
'iff' => 'ηυφ',
'ifx' => 'ηυχ',
'ifps' => 'ηυψ',
'-b' => '-μπ',
'-d' => '-ντ',
'-g' => '-γκ',
' b' => ' μπ',
' d' => ' ντ',
' g' => ' γκ',
'av' => 'αυ',
'ev' => 'ευ',
'iv' => 'ηυ',
'ou' => 'ου',
'a' => 'a,à,á,â,ã,ä,å,ā,ă,ą,ḁ,α,ά',
'ae' => 'æ',
'b' => 'б,^μπ',
'c' => 'c,ç,ć,ĉ,ċ,č,ћ,ц',
'ch' => 'ч',
'd' => 'ď,đ,Ð,д,ђ,δ,ð,^ντ',
'dz' => 'џ',
'e' => 'e,è,é,ê,ë,ē,ĕ,ė,ę,ě,э,ε,έ',
'f' => 'ƒ,ф,φ',
'g' => 'ğ,ĝ,ğ,ġ,ģ,г,γ,^γκ',
'h' => 'ĥ,ħ,Ħ,х',
'i' => 'i,ì,í,î,ï,ı,ĩ,ī,ĭ,į,и,й,ъ,ы,ь,η,ή,ι,ί,ϊ,ΐ',
'ij' => 'ij',
'j' => 'ĵ,j',
'ja' => 'я',
'ju' => 'яю',
'k' => 'ķ,ĸ,κ',
'ks' => 'ξ',
'l' => 'ĺ,ļ,ľ,ŀ,ł,л,λ',
'lj' => 'љ',
'm' => 'μ,м',
'n' => 'ñ,ņ,ň,ʼn,ŋ,н,ν',
'nj' => 'њ',
'o' => 'ò,ó,ô,õ,ø,ō,ŏ,ő,ο,ό,ω,ώ',
'oe' => 'œ,ö',
'p' => 'п,π',
'ps' => 'ψ',
'r' => 'ŕ,ŗ,ř,р,ρ',
's' => 'ş,ś,ŝ,ş,š,с,σ,ς',
'ss' => 'ß,ſ',
'sh' => 'ш',
'shch' => 'щ',
't' => 'ţ,ť,ŧ,τ,т',
'th' => 'θ',
'u' => 'u,ù,ú,û,ü,ũ,ū,ŭ,ů,ű,ų,у',
'v' => 'в,β',
'w' => 'ŵ',
'x' => 'χ',
'y' => 'ý,þ,ÿ,ŷ,υ,ύ,ϋ,ΰ',
'z' => 'ź,ż,ž,з,ж,ζ'
);
foreach($glyph_array as $letter => $glyphs) {
preg_match_all('/(\^[^,]+(,|$))/', $glyphs, $matches);
if (count($matches[0])) {
foreach ($matches[0] as $m) {
if (strpos($m, ',')) {
$glyphs = str_replace($m, '', $glyphs);
}
elseif(strpos($glyphs, ',')) {
$glyphs = str_replace(','.$m, '', $glyphs);
}
else {
$glyphs = '';
}
$str = preg_replace('/'.$m.'/', $letter, $str);
}
}
$glyphs = explode(',', $glyphs);
$str = str_replace($glyphs, $letter, $str);
}
return $str;
}
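The entries prefixed with ^ in the Greek map are treated as regular expressions anchored at the start of the string, while the others are plain replacements. A simplified sketch of that idea, with a five-entry map and the hypothetical name greekSketch(), could look like this; it is not the el-GR code itself.

```php
<?php
// Simplified sketch of the el-GR loop above, using a small map.
function greekSketch(string $string): string
{
    $str = mb_strtolower($string, 'UTF-8');

    $glyphArray = array(
        'a' => 'α',
        'b' => 'б,^μπ',
        'l' => 'λ',
        'm' => 'μ',
        'p' => 'π',
    );

    foreach ($glyphArray as $letter => $glyphs) {
        $plain = array();

        foreach (explode(',', $glyphs) as $glyph) {
            if (strpos($glyph, '^') === 0) {
                // Anchored entry: only replaced at the very start of the string.
                $str = preg_replace('/' . $glyph . '/u', $letter, $str);
            } else {
                $plain[] = $glyph;
            }
        }

        // Plain entries are replaced wherever they occur.
        $str = str_replace($plain, $letter, $str);
    }

    return $str;
}

echo greekSketch('ΜΠΑΛΑ'), "\n"; // bala
echo greekSketch('ΛΑΜΠΑ'), "\n"; // lampa
```

In the second call the μπ pair is not at the start, so it falls through to the plain 'm' and 'p' entries.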
If the language pack does not provide a custom transliteration, the core transliterate method is used. See: https://github.com/joomla/joomla-cms/blob/staging/libraries/joomla/language/transliterate.php
Therefore, when unicode aliases are off, this PR tries to force the use of the transliterate() method (if any) of the language used by the Content Language instead of that of the current language.
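The choice between a language pack's own method and the core one can be pictured roughly like this. Everything here is a stand-in: El_GRLocalise is a two-letter toy class assumed to follow the xx_XXLocalise naming of the localise files, and coreTransliterate() and pickTransliterator() are hypothetical names.

```php
<?php
// Toy stand-in for a language pack class; real packs ship it in xx-XX.localise.php.
class El_GRLocalise
{
    public static function transliterate($string)
    {
        return str_replace(array('α', 'β'), array('a', 'v'), $string);
    }
}

// Toy stand-in for the core transliteration: leaves the string untouched here.
function coreTransliterate($string)
{
    return $string;
}

// Returns the pack's transliterate() when the pack defines one, else the core function.
function pickTransliterator(string $languageTag): callable
{
    $class = str_replace('-', '_', $languageTag) . 'Localise';

    if (class_exists($class) && method_exists($class, 'transliterate')) {
        return array($class, 'transliterate');
    }

    return 'coreTransliterate';
}

echo call_user_func(pickTransliterator('el-GR'), 'αβ'), "\n"; // av
echo call_user_func(pickTransliterator('en-GB'), 'αβ'), "\n"; // αβ (untouched)
```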
I confirm the comment from @andrepereiradasilva , i.e.
it will work if you replace ...
$lang = JFactory::getLanguage($language);
for
$lang = JLanguage::getInstance($language);
I modified the PR to include this and had no issues when the language is set to All or to a language without a transliterate() method, such as English (UK) {en-GB}. In both cases I got a date-based alias, and if the title contains only Latin characters, the core transliterate method was applied.
The only drawback is obviously on a monolingual site (with multiple installed languages) if the Content Languages have not been created. In that case the behavior is as usual.
But this is indeed (after the suggested modification) a really nice improvement.
Category | ⇒ | Multilanguage Router / SEF |
This PR has received new commits.
I have updated the code to use JLanguage::getInstance instead of JFactory::getLanguage. It should work fine now.
I have tested this item unsuccessfully on ce90695
Test scenario: Multilanguage site (no unicode alias). Used title "Καλωσορίζουμε"
Used the following languages:
1. English UK (en-GB). Doesn't have transliteration in "en-GB.localise.php" (default admin language).
2. Portuguese (pt-PT). Doesn't have transliteration in "pt-PT.localise.php".
3. Greek (el-GR). Does have transliteration in "el-GR.localise.php".
4. Russian (ru-RU). Does have transliteration in "ru-RU.localise.php".
Test 1.1: Content language = "All"
Resulting alias: "2016-05-05-XX-XX-XX"
Test 1.2: Content language = "English (en-GB)"
Resulting alias: "2016-05-05-XX-XX-XX"
Test 1.3: Content language = "Portuguese (pt-PT)"
Resulting alias: "2016-05-05-XX-XX-XX"
Test 1.4: Content language = "Russian (ru-RU)"
Resulting alias: "2016-05-05-XX-XX-XX"
Test 1.5: Content language = "Greek (el-GR)"
Resulting alias: "kalosorizoume"
Test 2.1: Content language = "All"
Resulting alias: empty
Test 2.2: Content language = "English (en-GB)"
Resulting alias: empty
Test 2.3: Content language = "Portuguese (pt-PT)"
Resulting alias: empty
Test 2.4: Content language = "Russian (ru-RU)"
Resulting alias: empty
Test 2.5: Content language = "Greek (el-GR)"
Resulting alias: "kalosorizoume"
Didn't test the other items (menus, categories, modules, etc.), but they may have the same problem too.
Also didn't test creating articles in the frontend.
This PR has received new commits.
It seems that banners is the only core extension with an alias that does not check for an empty value in order to fall back to a date-based alias. I updated the pull request. Hopefully, it will work now.
This PR has received new commits.
Banners work now, but not newsfeeds.
I am testing with French and this title for the various items: ωφmytitle
as I am sure French will transliterate this to ophmytitle
This PR has received new commits.
I have tested this item successfully on 9937915
Tested:
In languages with (Greek, Russian) or without (English, Portuguese) transliteration.
Tested creating articles in the frontend too.
All good!
Thanks! Nice PR.
The existing behaviour is to use the current language in all cases (for transliteration of non-unicode aliases).
- I guess this was done on the assumption that the user editing an element has the same UI language as the item being edited?
But the above assumption:
- is obviously just wrong if the element has a non-"All" language
- if an element has the "All"/empty language, then 99% of the time it means it is in the frontend (site) default language and not in the current language
(e.g. a user has the backend UI in English and the frontend (site) default language is Greek)
Recently we added a workaround in our extension for the existing bogus behaviour, implementing exactly the behaviour added by this PR.
It is nice to see a PR to fix this. Just a note for any 3rd party extensions:
- they will need to pass the element's language to JApplicationHelper::stringURLSafe for this to work for them too
- if they do not, there is a change in existing behaviour: because the element's language is not passed, they will get transliteration in the site's default language instead of the current language, but that is 99% correct and better than the existing behaviour (as said above)
I will test this too.
I think this is a bug fix and not a "new feature"; will this get a J3.5.x milestone?
On 10 May 2016 at 09:44, Georgios Papadakis notifications@github.com wrote:
i think this is a bug-fix and not a "new feature", will this get a J3.5.x milestone ?
The next release is 3.6 ;)
Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/
Does not work for (transliteration is always using site default language)
Works for:
The problem with newsfeed
that calls:
$table->alias = JApplicationHelper::stringURLSafe($table->name);
I have not seen any other core extension doing this there;
I think it should be removed from there.
This PR has received new commits.
I have updated the NewsfeedsModelNewsfeed model to pass the language argument to JApplicationHelper::stringURLSafe. This way we make sure we don't break anything in case this code is used somewhere else. If anyone (@mbabker) can verify that the alias handling in the model is not required, I would be more than happy to remove those lines.
@kavadas can you make a PR with the related changes @ https://github.com/joomla-extensions/weblinks ?
Folks, this looks broken.
No unicode alias.
Content Languages present: English and French.
Editing an item with title ωφmytest, language set to All or English.
Result: the French transliteration is used and I get ophmytest
Better to wait until everything is solved here before proposing a patch for weblinks, I guess.
I have tested this item unsuccessfully on bb7cadb
hmm, this happens on an updated site, not on all my test sites... What the h... ?
@infograf768
can you provide more details?
If it is the newsfeed form / component, then is your frontend site default language French?
This PR changes the default transliteration language to the site default language, so you would always get French transliteration if you call
JApplicationHelper::stringURLSafe without passing it the element's language.
It is not a bug in the current PR; it is caused by the change of default transliteration language, and an extra commit was already made to fix it:
bb7cadb
I will test to see if the issue is fixed.
Beside the newsfeed form
Every form, and yes, on that test site default languages are both set to English.
I have to find out why I get this here on this precise site
@infograf768
maybe the patch was not really applied ? function stringURLSafe() in file libraries/joomla/filter/output.php should have been modified
Can you make the following test ?
add:
ob_start();
debug_print_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS);
$function_trace = ob_get_clean();
$info = __FUNCTION__ . '()'
. "\n\n" . 'PARAMETER string to transliterate: "' . $string . '"'
. "\n" . 'PARAMETER language: "' . $language . '"'
. "\n" . 'language -used-: "' . $lang->getTag() . '"'
. "\n" . 'RETURN string: "' . $str . '"' . "\n\n";
JFactory::getApplication()->enqueueMessage('<pre>' . $info . $function_trace. '</pre>');
You should get only 1 call to it with the not yet transliterated alias ... , like this:
stringURLSafe()
PARAMETER string to transliterate: "ωφmytest"
PARAMETER language: "..."
language -used-: "..."
RETURN string: ....
#0 JFilterOutput::stringURLSafe() called at [...\libraries\cms\application\helper.php:97]
#1 JApplicationHelper::stringURLSafe() called at [...\libraries\legacy\table\content.php:188]
...
All files correctly updated on that site.
I had already tested $lang, for example $lang IS en-GB but still I get the transliteration
stringURLSafe()
PARAMETER string to transliterate: "ωφmytest"
PARAMETER language: "en-GB"
language -used-: "en-GB"
RETURN string: "ophmytest"
#0 JFilterOutput::stringURLSafe() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/application/helper.php:97]
#1 JApplicationHelper::stringURLSafe() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/legacy/table/content.php:188]
#2 JTableContent->check() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/legacy/model/admin.php:1172]
#3 JModelAdmin->save() called at [/Applications/MAMP/htdocs/trunkgitnew/administrator/components/com_content/models/article.php:577]
#4 ContentModelArticle->save() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/legacy/controller/form.php:737]
#5 JControllerForm->save() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/legacy/controller/legacy.php:728]
#6 JControllerLegacy->execute() called at [/Applications/MAMP/htdocs/trunkgitnew/administrator/components/com_content/content.php:21]
#7 require_once(/Applications/MAMP/htdocs/trunkgitnew/administrator/components/com_content/content.php) called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/component/helper.php:405]
#8 JComponentHelper::executeComponent() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/component/helper.php:380]
#9 JComponentHelper::renderComponent() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/application/administrator.php:98]
#10 JApplicationAdministrator->dispatch() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/application/administrator.php:152]
#11 JApplicationAdministrator->doExecute() called at [/Applications/MAMP/htdocs/trunkgitnew/libraries/cms/application/cms.php:257]
#12 JApplicationCms->execute() called at [/Applications/MAMP/htdocs/trunkgitnew/administrator/index.php:51]
there must be something else on that test site.
1 . Can you check if file:
administrator\language\en-GB\en-GB.localise.php
has been modified , and a transliterate() method was added to it ?
2 . And then can you add PHP reflection to the test message to get the ClassName of language object
ob_start();
debug_print_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS);
$function_trace = ob_get_clean();
$r = new ReflectionClass($lang);
$info = __FUNCTION__ . '()'
. "\n\n" . 'PARAMETER string to transliterate: "' . $string . '"'
. "\n" . 'PARAMETER language: "' . $language . '"'
. "\n" . 'language -used-: "' . $lang->getTag() . '"'
. "\n" . 'LANGUAGE object classname: ' . $r->getName()
. "\n" . 'RETURN string: "' . $str . '"' . "\n\n";
JFactory::getApplication()->enqueueMessage('<pre>' . $info . $function_trace. '</pre>');
I have tested this item successfully on bb7cadb
I found out. I had overridden en-GB.localise.php in the overrides folder a while ago to check whether this file was correctly used ...
Sorry. Test successful here.
I have tested this item successfully on bb7cadb
Status | Pending | ⇒ | Ready to Commit |
Status | Ready to Commit | ⇒ | Fixed in Code Base |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2016-05-16 16:09:19 |
Closed_By | ⇒ | rdeutz |
After applying the patch on a multilanguage site (en-GB; pt-PT; es-ES; el-GR), I created an article in Greek with
Καλωσορίζουμε τους επισκέπτεςε
as the title, and the resulting alias is still a date. Unicode aliases are set to No in the global config.
Admin language is set to en-GB.
So I think the concept of this PR is very good, but it doesn't work.