User tests: Successful: Unsuccessful:
This fixes the issue found in #5615 (comment)
Basically if a language string contains double quotes inside they are removed by parse_ini_string
and then not shown when rendering.
To test this fix modify administrator/language/en-GB/en-GB.plg_system_languagecode.ini
and set the plugin description to something like:
PLG_SYSTEM_LANGUAGECODE_XML_DESCRIPTION="Provides the ability to change the language code in the generated HTML document to improve SEO.<br />The fields will "appeara" when the plugin is enabled and saved.<br />More information <span class="label label-important">permet</span> at <a href="_QQ_"http://www.w3.org/TR/xhtml1/#docconf"_QQ_">W3.org</a> "
Then go to plugins management and search the System - Language Code
plugin. Click there to edit it and pay attention to the plugin description.
Before applying the patch the permet
label should be rendered in grey (only label
class applied):
After patching it should be shown in red (all the classes applied):
Labels |
Added:
?
|
I don't think it's human errors. It's about how we want language strings to work. If we want to allow using double quotes inside language strings (which would be great to get rid of "_QQ_"
) we need to keep them on rendering.
If parse_ini_strings
removes them then it makes no sense to allow its use.
My personal opinion is that removing "_QQ_"
makes the translation easier.
My opinion is that we should just get people to escape their strings. There's no need for us to do it for them (which is what the callback is doing as I understand it)
Either way escaping is required. We can require escaping to be performed by translators, developers, etc. (using \"
style escapes, not the QQ constant) or we can allow non-standard INIs and do it ourselves. Personally, I'd rather escape them myself than depend on the underlying API to do it.
I would also be in favor of advising translators to use escaped (\"
) double quotes OR - use Transifex and stop worrying about those nasty details - of course, once we get rid of that QQ...
Sorry but I'm lost now. Based on this coment and the global discussion there I thought we were allowing unescaped double quotes inside language strings.
Without escaping them things like this are wrong:
JERROR_COULD_NOT_FIND_TEMPLATE="Could not find template "%s"."
because it makes no sense to add quotes if they are going to be removed on rendering.
I think someone has to write our translating standards somewhere because for example this PR is affected by this decision. Also com_localise
needs a standard to prevent bad strings from being used.
I'm closing this. Thanks all for reviewing.
Status | Pending | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2015-01-28 14:03:04 |
Admittedly, I thought that there were zero issues whatsoever with using unescaped quotes. All my test cases were in rendered pages in a browser where I didn't see the missing quotes. It wasn't until this morning when JM mentioned something in Slack that I really dug into things and realized that you do need to escape your double quotes always for proper functioning.
Moving forward, I'd suggest the _QQ_
constant be deprecated (we did this on the Framework at joomla-framework/language#11 (again note my bad understanding in the PR description)) and that the standard behavior is that quotes are escaped via \"
. I'd also suggest the debug parser in JLanguage be updated to check for unescaped quotes and add an error to the listing in debug mode. Lastly, I'd suggest that the debug code in JLanguage::parse() be moved to a publicly accessible method (JLanguage::debugFile() perhaps?) so that third parties can use that to check error conditions without needing to enable debug mode. Someone could write a quick CLI script, for example, that developers could use to parse files for errors if the debug code were publicly accessible.
I don't like the idea of a JLanguage::debugFile()
. I think all the file handling should be removed/abstracted from JLanguage
using something like:
https://github.com/phproberto/com_localise/blob/langcheck/component/admin/lang/file.php
And used like:
// $path contains the path to the language file
$langFile = LocaliseLangFile::getInstance($path);
if (!$langFile->check())
{
// Errors contains detailed errors
$errors = $langFile->getLinesErrors();
}
To get the list of language strings:
$langFile = LocaliseLangFile::getInstance($path);
$strings = $langFile->getStrings();
// Errors encountered
if (false === $strings)
{
// All the errors
$errors = $langFile->getErrors();
// Last error
$error = $langFile->getLastError();
}
// Process $strings here
That way JLanguage::parse()
will only call it to get the strings instead of parsing it itself.
I discussed with @wilsonge about integrating that with the framework language so we can add support not only for ini
files but also for other kind of files like LanguageFileIni
, LanguageFilePo
. The management could be done transparently by the LanguageFile
class based on file extension.
Abstracting out the parsing would be a huge +1 from me. Same with anything that gets the debugging code into an open API. Plenty of room for improvement, it just needs to happen.
Cool I have the code so I can contribute it. What about creating a PR for core and then we can see the issues there and improve it for the framework? When we add the framework language package to core we just deprecate it and done?
Whatever you think is best to get the code tested merged into a Joomla repo.
Some thought:
We also have to think B/C (I know... it is my obsession...).
As of now the system proposing language packs updates in J does not deal with sub versions.
I.e. we propose to users using any Joomla 3.x version the last language pack for a specific language, whether it is a 3.2 or a 3.3.6 or, to come, a 3.4.0.1. Our Cron job does not differentiate between the sub-versions.
Also as some TTs are late or abandon for a while their translation (look at the versions of packs proposed in Extensions Manager => Install Languages displayed in the list), if we abandon B/C parsing of these files, our users will not be able to use them at all. Not speaking of 3pd strings.
Many TTs have also deleted their former sub versions packs when they proposed an updated pack, which therefore makes it difficult to change our cron to differentiate between the packs.
If these 3.4.0.1 packs can't be parsed correctly on a < 3.4.0 joomla install, we are in big trouble.
Last, quite a non-negligeable number of users are not using anyway the Install Languages from Extension Manager, but just pick the packs directly from our joomlacode repository.
Basically in my opinion, we just can't abandon parsing "_QQ_"
and must make sure we are totally B/C in the 3.x series.
According to our strategy, we will support _QQ_
until 4.0, given that we deprecate the useage in 3.x.
However we don't promise that any code or language file written for 3.4 will also work in 3.3. That would basically stop development and would be very stupid.
So users installing a 3.4 language pack into a 3.3 site are doing something we don't support. However an outdated 3.3 language pack will work just fine in 3.4 and obviously has to do.
Everything works like before. The only thing we are not introducing and what confused me is the use of double quotes without escaping them. So there is no B/C issue and old language files work perfectly.
And as I understand it 3.4 language files would "work" on 3.3 anyhow but would just show an error in the language debugger? Either way it's a non-issue. We cannot expect 3.4 code to support 3.3. It's like saying my tags code isn't supported in 3.1.4....
Are we happy then with \"
being used for the quote symbol and for displaying quotes themselves using the html entity "
? If so will get Brian to update his PR one last time and merge it.
If you decide to change and use \"
instead, this means that an error should be provided when debugging if no escape \
, and making sure it lets "_QQ_"
alone.
Of course it's very important that "_QQ_"
works. That is definitely needed to keep our b/c pledge!
I agree that we should try and provide a parsing error in that case but the debugging not 100% working properly isn't going to be a show stopper for a very very edge case of people installing newer language packs onto older Joomla versions.
Of course, the debugging would only work for the version where it is updated. :)
@phproberto ok lets go with it then please. If you are showing the quotes on the page you should use html entity "
and if you are using it for a class or a url then the escaped quote \"
I think that works for everyone :)
ok. I'll modify localise PR this weekend. I cannot lose 1 full week in this. But I think this is not blocking anything now.
And anyone working on displaying a non escaped " in core debug lang is welcome
I can do it when it's ready for localise. I think it would be easy if we abstract the parsing to an external system.
That's fine roberto :) Thankyou so much!
BTW: I see no reason why a \"
could not replace a & quot;
Therefore we could just decide to always use \" in the future
Personal opinion only. I don't think it's Joomla's jobs to fix human errors in the language files and would suggest the debug parser flag these errors instead of us making this fix.
Aside from that, just on reading this PR, it functions as advertised.