User tests: Successful: Unsuccessful:
Currently the extension installer accepts for installation and uninstallation sql scripts specified in extensions manifest xml files only the value 'utf8' for the "charset" attribute, and for update sql scripts the "charset" attribute is not used.
Beginning with 3.5.0 Joomla! will support utf8mb4. SQL create table and alter table statements containing "utf8mb4" can be downgraded to "utf8" by the new database driver, so Extension Installers can provide scripts for utf8mb4 only if they want to be installable only on 3.5.0 or later.
But if the same extenions shall be installable also on a pre-3.5.0, it either has to stay with utf8 (no mb4) or provide 2 package versions, one for pre-3.5.0 and one for 3.5.0 and later.
This PR here extends handling of the "charset" atribute for installation and uninstallation sql scripts and adds handling of this atribute for update sql scripts specified in the manifest xml file in following way:
If for mysql related drivers only one file is specified for one of the charsets "utf8mb4" and "utf8", this file will be used, if necessary with query downgrade.
Otherwise, if both files are specified, it will use the one suitable to actual utf8mb4 support by the server and client api combination.
To be on the safer side and also backwards compatible with extensions which already have changed their sql scripts to utf8mb4 but not have changed yet their charset attibute's value to 'utf8mb4' because this will only be supported after this PR here is merged, the qury will be downgraded also if necessary.
So with this PR, extenions developers have the choice to stay with utf8 and provide scripts for utf8 only, or change to utf8mb4 and provide only 1 set of script for this character set, which will not work on pre-3.5.0 with database not supporting utf8mb4, or provide scripts for both, so on a pre-3.5.0 where files with charset attributes different to utf8 were ignored the utf8 scripts will run.
From the schema update scripts, a pre-3.5.0 Joomla! uses the first schema path which can be found for the right servertype, without any charset info. So if the schema path for utf8 is specified as first in the xml manifest, it will also work for non-utf8 databases on a pre-3.5.0.
Conclusion: With this PR being implemented, extensions can use the regular mechanisms to be installable/uninstallable/updateable on any Joomla! version and apply only the scrips fot the particular charset for mysql server type, as long as for update schema paths the one for utf8 is specified as 1st, and can also use their schema updates to perform the utf8mb4 conversion.
Apply this patch to a latest staging and then download following packages and install them with Extension Manager "Upload & install package":
Result: All 3 packages can be installed without error.
Now check with phpMyAdmin that each extension has created a table with name = db prefix + extension name:
#__test_package_utf8-mb4_utf8
#__test_package_utf8-mb4
#__test_package_utf8
Verify that these tables all have 1 column col1
with "..._unicode_ci" and col2
with "..._bin".
Depending on utf8mb4 support of the database servers, the charset of the table and the columns is as follows:
#__test_package_utf8-mb4_utf8
: Either utf8mb4 or utf8#__test_package_utf8-mb4
: Either utf8mb4 or utf8#__test_package_utf8
: utf8The tables do all not contain any records.
Now install the update packages for these three packages:
Result: All 3 packages can be updated without error.
Now check with phpMyAdmin that in each extension's table, a record has been inserted with following values:
#__test_package_utf8-mb4_utf8
: col1
= '#2', col2
= '#ffd' if no utf8mb4 support, col1
= '#1', col2
= '#ffe' if utf8mb4 support.#__test_package_utf8-mb4
: col1
= '#1', col2
= '#ffe'#__test_package_utf8
: col1
= '#1', col2
= '#ffe'(values of col 1 have 2 zeros between the hashtag and the last digit but GitHub comments somehow eat them here).
Now uninstall the 3 packages.
Result: All 3 packages can be uninstalled without error.
Now check with phpMyAdmin that the 3 tables have been deleted.
Result: The 3 tables have been deleted.
Hint: Please do not use such an old version like 3.2.7 for this test, because they have a bug with handling extensions not having had any schema update in their previous versions and so no version number could be found in the #__schemas
table. This could be fixed by updating my test packages with some dummy sql script, but better use a 3.4.8 for testing to save me from this work now.
Install versions 1.0.0 of the 3 packages used with Test 1.
Result:
Now check with phpMyAdmin that for the 3 extensions the tables have been or not been created as follows:
Result: Package with 1 sql for charset "utf8mb4" only is not compatible with Joomla! pre-3.5.0.
Hint: If you would change the charset in the xml to "utf8", it would cause SQL error "Unknown character set utf8mb4" on a database without utf8mb4 support.
Check the character sets and collations of the 2 tables which have been created:
Result: The 2 tables have utf8 character sets and collations.
Now update the 2 packages for which the table has been created to version 1.0.1.
Result: The 2 packages can be updated without error.
Now check with phpMyAdmin that in each extension's table, a record has been inserted with following values:
#__test_package_utf8-mb4_utf8
: col1
= '#2', col2
= '#ffd' if no utf8mb4 support, col1
= '#1', col2
= '#ffe' if utf8mb4 support.#__test_package_utf8
: col1
= '#1', col2
= '#ffe'Now uninstall all 3 packages.
Result: Success, no errors, tables deleted.
Conclusion:
Status | New | ⇒ | Pending |
Labels |
Added:
?
|
@andrepereiradasilva @wilsonge PR is ready for test.
Fixed silly copy and paste error in testing instructions for Test 2.
@andrepereiradasilva @wilsonge PR is now really ready for test.
Title |
|
Title |
|
Very nice. This takes care of installation of most components out there and by reading the code I believe it also handles adding new tables on later updates to those components, as long as they use standard installation and update files which Joomla provides support for.
The only thing that remains open to me is how components (and Joomla as whole) are supposed to handle conversion from utf8 to mb4 on existing Joomla sites.
@mahagr Yes, it will not solve the "how to convert to utf8mb4" thing because on a pre-3.5 Joomla!, the charset attribute is not checked for schema update scripts, and also Joomla! version is not checked, so the conversion would run on those old Joomla!s, too, and if 2 scripts then even both would run.
So it would need some php script for this purpose to be contained in the extension package.
But I (and maybe not me alone) will check if I (we) can provide an example extensions dummy (like those used here for testing) where this will be solved.
If necessary I also will propose PR(s) for the core, .g. if it turns out that it could make sense to extend the utf8_conversion table by a column "extension_id" to save extensions conversion status there (and make core use its own extension id 600), like Joomla! core does it now (3.5.0-rcx) to cater for later migration of data from non-utf8mb4 to utf8mb4 database.
Anyway, if you find time, can you test this PR here?
Test on a utf8mb4 capable db server:
Used Joomla 3.2.7 for the tests
Updated none.
Question: how we do a utf8 conversion for an extension that needs to work in all 3.x versions after this patch. For what i understand, we do a dual charset package and put the utf8mb4 update sql only in utf8mb4 update part right?
@andrepereiradasilva About your "problem" 1: The charset attribute was mandatory before for install and uninstall, and if had to be utf8, and I have not changed this, see old code:
$fCharset = (strtolower($file->attributes()->charset) == 'utf8') ? 'utf8' : '';
...
if ($fCharset == 'utf8' && $fDriver == $dbDriver)
Only for update I made it new, and there it is optional for B/C.
Please check this, if it is mandatory or not on an unpatched 3.5.0-rc-something, and if mandatory, too, then mark your test result with success, otherwise with failed.
About conversion for extensions: This PR cannot change the fact that for pre-3.5.0, for updates there is not checked any charset attibute, so if you have 2 update scripts, 1 for utf8mb4 and 1 for utf8, it will be ok pn 3.5.0+, but on pre-3.5.0 it will run both scripts, whcih is not desired and so not usable for converting and even crash on a pre-3.5.0 without utf8mb4 support if one of these update scripts contains utf8mb4 stuff. On pre-3.5.0 we do not have query downgrade.
I may have to change test instructions for this case even, because on a pre-3.5.0 for the utf8-mb4_utf8 package there should be 2 records after updating (but I think I skipped this check for Test 2 so description should at least not be misleading).
On a 3.5.0 which knows about charset atrribute for update scripts and has query downgrade, this would not be a problem. There conversion could be handled with 1 update script, and if necessary separate script for separacte charsets also can be used.
Another thing I have to correct with another, new PR is that here I could not use "utf8mb4" within table names or mysql comments on tables or columns, or in default values for columns, because query downgrade simply replaces every occurance of "utf8mb4" by "utf8", regardless if in quoted text or name quoted stuff or not.
I also think about providing with another PR an example extension with a php script to handle conversion, if this is possible, and extend the core ba a column "extension_id" in the utf8_conversion table (and core will use 700), so this php script for the example extension could contain parts of the core's script.php (3.5.0+), where query downgrade and conversion handling is implemented locally. You could helo me with that by ideas and discussion and testing if you want.
Thanks for feedback and testing so far.
@andrepereiradasilva About your "problem" 1: The charset attribute was mandatory before for install and uninstall, and if had to be utf8, and I have not changed this, see old code:
you seem to be right! that's strange! there is no "default" value for charset, it just don't install/update properly without the charset... no error, nothing, install, but the tables are not created. in 3.4.8 https://github.com/joomla/joomla-cms/blob/3.4.x/libraries/cms/installer/installer.php#L902, and checked in 3.2.7 and it's the same thing.
So i will remove the problem from my comment and continue the tests since this is another thing that IMHO should be looked after in other PR.
Well in the code there is a comment telling that this seems to be an open point, if the charset attribute for install and uninstall should be optional and not mandatory. The necessary code change would be little, just to an "$blablavar = isset(atributes['charset']) ? atributes['charset'] : 'utf8' or so, like I did for the updates, also for installation and uninstallation. But I wanted to stay B/C.
@richard67 completed my tests in 3.2.7 check my comment above for results
Well in the code there is a comment telling that this seems to be an open point, if the charset attribute for install and uninstall should be optional and not mandatory. The necessary code change would be little, just to an "$blablavar = isset(atributes['charset']) ? atributes['charset'] : 'utf8' or so, like I did for the updates, also for installation and uninstallation. But I wanted to stay B/C.
That is another PR for sure.
Other PR should be the action messages ... IMHO installed ok when it doesn't create the tables is not good. But this is a new PR for improvement, not related to this.
@andrepereiradasilva The things you marked with red "!?" are partly expected behavior:
If on a pre-3.5.0 a script on installation is skipped because charset does not match "utf8", it is still reported as successful install. Is old behavior.
Furthermore it seems that error handling or output (messages) for updates is not done somehow, but am sure this is same without this patch.
So following of your points are expected behavior:
Install: #__test_package_utf8-mb4: Didn't create the tables: ok. But says install correctly.
Update: #__test_package_utf8-mb4: row not inserted (there are no tables). But says install correctly.
The following 2 I am not sure if I not have maybe some mistake in the update sql, because at least the 2nd one should work:
Update:
#__test_package_utf8-mb4_utf8
: rows NOT inserted#__test_package_utf8
: rows NOT insertedI have to check this, especially the 2nd one should work.
Thanks for testing so far.
@andrepereiradasilva
I cannot reproduce your points "#__test_package_utf8
: rows NOT inserted" and "#__test_package_utf8-mb4_utf8
: rows NOT inserted" for updating on a pre-3.5.0 not supporting utf8mb4.
I installed on a clean 3.4.0 versions 1.0.0 of the packages, then updated test_package_utf8 and test_package_utf8-mb4_utf8 to 1.0.1 (but not the one the table was missing after install, as I described), and then 1 record has been inserted for each of the 2 tables.
So maybe something else went wrong with your test?
i will test again, now with 3.4.8 and 3.2.7
Table 'joomla-348.wr08k_test_package_utf8-mb4' doesn't exist SQL=INSERT INTO
wr08k_test_package_utf8-mb4(
col1,
col2) VALUES ('#001', '#ffe');
Error: Error installing component
So clearly seems a problem in 3.2.7, maybe this Joomla version doesn't support something you added in the manifest.
@andrepereiradasilva I just compared installer.php from 3.4.8 and 3.2.7 (good to have a nice tool like Beyond Compare, my favourite diff tool, because GitHub stuff is still by far not as good as that.
One difference is that in 3.2.7 the pdo driver was not supported yet by the installer.php, but the rest of the core somehow did maybe. At least there is a pdo driver (but not a pdomysql) on 3.2.7.
Can this be a reason?
If you have pdo driver selected for the 3.2.7 test (what I do not really believe but who knows), test with mysqli or so please.
wait i will check further and tell you something later. let's leave it pending for now until i get some troubleshoot.
As you can see for my tests the rest (on a utf8mb4 capable) is working fine.
Tests are still needed on a non utf8mb4 capable.
@andrepereiradasilva Ha I found a code difference which is maybe the reason why nothing done when updating on 3.2.7.
Followig is there in installer.php of 3.4.8 but not in the one of 3.2.7:
// No version - use initial version.
if (!$version)
{
$version = '0.0.0';
}
The $version variable comes in both Joomla version from following database query:
$query = $db->getQuery(true)
->select('version_id')
->from('#__schemas')
->where('extension_id = ' . $eid);
$db->setQuery($query);
$version = $db->loadResult();
That means if there is no version information stored for this extension in #__schemas
table, it does the update in 3.4.8 but does nothing in 3.2.7.
And schema version info is obtained from update files.
And versions 1.0.0 of my package do not contain schema update scripts, so this info is not saved after the installation of these version.
I am pretty sure this is the reason!
So for now use 3.4.8 for the B/C test please and see what happened on 3.2.7 as expected behavior.
@andrepereiradasilva P.S. to my previous comment: Maybe I should have named my update sqls only with version number and not with version number + date? This could have impact on the schema version number determination too, which on a 3.2.7 cause nothing to be done if no version number from db and on 3.4.8 will be done with version 0.0.0, as you can see in previous post? You can maybe rename in one of my 1.0.1 packages the update sql to 1.0.1.sql and check if it works, and if not maybe try 1.0.0 bevause it wants to see version to be updated from and not version to be updated to?
i confirm that adding
// No version - use initial version.
if (!$version)
{
$version = '0.0.0';
}
on 3.2.7 works
so the update schemas on 3.2.7 seems to not work properly
But this is not an issue related to this PR, so i will mark as test sucessufully
I have tested this item successfully on d599085
Only tested on a utf8mb4 capable server (5.5.5).
There are several issues in the extension upgrade that should be looked after this PR.
But, this issues are there, with or without this PR (check my comments above).
@andrepereiradasilva Thanks for testing. Maybe just file names of my update sql are not ok for schema version stuff. I have not found any of my extensions using schema update sqls, so mayber it is an unused feature and they all use their php srcipt to do stuff?
Did you check my ideas regarding conversion for extensions here #9468 (comment) and here #9468 (comment)?
Could be interesting for your experiments for the localise component, too (if JM not stopped you already).
Updated test package test_package_utf8-mb4_utf8.zip to work also on pre-3.5.0 Joomla! and updated testing instructions and description accordingly.
@wilsonge I just checked with @andrepereiradasilva : This PR here is all what is needed to make extensions be able to handle utf8mb4 and utf8 and conversion.
Please if yoiu can, test this PR here, it really would be great to have it in 3.5.0.
@mahagr Meanwhile I have found out that this PR is sufficient for all needs of extensions regarding support of both utf8mb4 and utf8, including conversion with use of the extension's schema update.
Can you test this? Please read again changed description and test instructions.
It really would be a great help if you could test.
I have tested this item successfully on d599085
Downloaded and installed Staging
Applied patch with Patchtester
completed all the checks
Installed 3.4.8
completed all the checks
Full success
Yeah, 2 good tests! Thanks for testing, guys
@richard67 I'll find some time to test a real component in a real environment on Monday morning at latest. How should I perform update from J3.4 to J3.5 with utf8 conversion? I mean what is the "officially thought" way of doing it?
I have tested this item successfully on d599085
Test OK
Test instructions were spot on. Makes it easy to test
Category | ⇒ | Installation |
Status | Pending | ⇒ | Ready to Commit |
RTc - thanks
Labels |
Added:
?
|
@mahagr Nice if you wanna test it with a real extension. Use @andrepereiradasilva 's example for com_localise as source to look up things: joomla-projects/com_localise#286.
In case if it does not work, please not use it as test result here. The test for here are made with easy examples, a real extension may cause own problems not related to this PR or utf8mb4. But this here is good now anyway, no need to test it again.
If there is an official thought? I don't know, and please don't ask me to make some, I am only a volunteer contributor who made things work, with much help from Andre with testing and investigating, also from others.
What I know is that some larger extensions who already had own installation php scripts and own methods to migrate their data on updates have already changed to utf8mb4, like e.g. Akeeba backup or Community Builder.
This here now and Andre's use case for the localise component serve now as an example for how it can be done for any extension without them having to write a new php script.
The theory behind the utf8mb4 conversion and what has to be done in principle is very well explained here: https://mathiasbynens.be/notes/mysql-utf8mb4.
For what is worth now, i also tested this PR with success on a non utf8mb4 capable server (MySql 5.1.x) in Joomla 3.5.0 staging and in Joomla 3.4.8.
Hello and thank you for your huge contribution!
Regarding this:
So with this PR, extenions developers have the choice to stay with utf8 and provide scripts for utf8 only, or change to utf8mb4 and provide only 1 set of script for this character set, which will not work on pre-3.5.0 with database not supporting utf8mb4, or provide scripts for both, so on a pre-3.5.0 where files with charset attributes different to utf8 were ignored the utf8 scripts will run.
Will we have any docs for devs related to utf8mb4
stuff?
I want to introduce a new test case which is failing badly:
Database will still display utf8
instead of the new utf8mb4
variants.
In this case 1.0.1 could have easily had alter tables to fix the collation, but it would still fail if you installed 1.0.1 before patching and then either installed 1.0.1 again or 1.0.2... You would still end up having wrong collation.
This means that there is currently no way of fixing database collation without creating installer script in PHP that runs during every update of the component. I think its too much to be asked from average extension developer.
Simplest way to add support for this is to create simple SQL file with needed alter tables, utf8mb4.sql
which gets automatically run once if the file exists. In order to do it #9487 and some extra logic is also needed.
Exactly - I think we need to combine #9487 in here and then add an extra conversion SQL file which we can then run store in the utf8_conversion table. That should be doable for 3.5.1
We might even then be able to adapt database fixer to then run that (even longer term)
@webdongle yes for core, no for 3rd party extensions - there's no way core can update those because we can't check the index's etc
@Webdongle I am aware that this change will be part of Joomla 3.5 and not a patch or update to it. The reason why I created a test case by using non-patched version as the starting point is that its easier to test than upgrade from Joomla 3.4.
Basically the issue will exist in every site that has been upgraded from any older versions of Joomla.
@mahagr The packages provided here for the test are not made to do utf8mb4 conversion, they are just inserting records to show that they have been running.
So your scenario is not valid, that you expect a collation change after updating.
If the update sql in 1.0.1 would do utf8mb4 conversion, it would be like you expect.
A working show case can be found here: joomla-projects/com_localise#286
So please not mess up my PR here with scenarios which obsviously (read overview and testing instructions) are not subject of this PR!!!
Closing this even if it is RTC and has good tests because using different files for different charsets for the extensions is not really a good solution.
The better solution is to handle utf8mb4 conversion scripts scripts by the core and run conversion again if necessary, but not run it on every update if not necessary, as it is proposed for the core with my PR #9590 .
As soon as that is ready, I'll make a PR to make core handle conversion for extensions, too.
My plan is to have 2 new attibutes in the XML of extensions, specifying the full paths of the 2 conversion scripts, which will be similar as those for the core.
Status | Ready to Commit | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2016-03-26 00:37:02 |
Closed_By | ⇒ | richard67 |
Labels |
Removed:
?
|
P.S.: I will not delete the branch of this PR for a while in case if a maintainer wants to reopen it.
@wilsonge I know it's late, but if it is possible this one here should go into 3.5.0 too.
It will be backwards compatible with old extensions and such already having prepared for utf8mb4.
This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9468.