richard67
6 Apr 2020

Pull Request for remaining part of Issue #28493 .

Alternative to PR #28587 .

Summary of Changes

Changing collation of columns term, stem and soundex in table #__finder_terms and columns term and stem in tables #__finder_tokens and #__finder_tokens_aggregates to utf8mb4_bin.

This is the same as done in PR #28587 , except here it is done only for columns mentioned above and not for the complete tables.

Changing table #__finder_terms_common so that only column term has utf8mb4_bin collation, which matches how we handle the other tables that have utf8mb4_bin collated columns.
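
As a rough illustration of the per-column approach described above, the following Python sketch generates one MySQL `ALTER TABLE ... MODIFY` statement per affected column. The table and column names come from the summary; the column type (`VARCHAR(75) NOT NULL`) and the helper name `build_alter_statements` are assumptions made for this example and are not taken from the PR's actual SQL scripts.

```python
# Columns that get the utf8mb4_bin collation, per the summary above.
UTF8MB4_BIN_COLUMNS = {
    "#__finder_terms": ["term", "stem", "soundex"],
    "#__finder_tokens": ["term", "stem"],
    "#__finder_tokens_aggregates": ["term", "stem"],
    "#__finder_terms_common": ["term"],
}

# Assumption: the real column definitions may differ (length, default, nullability).
ASSUMED_DATA_TYPE = "VARCHAR(75)"
ASSUMED_NULLABILITY = "NOT NULL"


def build_alter_statements(columns_by_table,
                           data_type=ASSUMED_DATA_TYPE,
                           nullability=ASSUMED_NULLABILITY):
    """Return one ALTER TABLE statement per column (hypothetical helper)."""
    statements = []
    for table, columns in columns_by_table.items():
        for column in columns:
            # MODIFY restates the whole column definition, so a real script
            # must repeat every attribute of the existing column.
            statements.append(
                f"ALTER TABLE `{table}` MODIFY `{column}` {data_type} "
                f"CHARACTER SET utf8mb4 COLLATE utf8mb4_bin {nullability};"
            )
    return statements


for statement in build_alter_statements(UTF8MB4_BIN_COLUMNS):
    print(statement)
```

Changing only the listed columns, rather than the table default, leaves the collation of every other column in those tables untouched.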

Testing Instructions

Thanks to @infograf768 for the testing instructions.

Moving testing instructions to a comment below because Drone seems to have problems with certain unicode characters in the description of a PR.

Documentation Changes Required

None.

richard67 - open - 6 Apr 2020
richard67 - change - 6 Apr 2020
Status: New → Pending
joomla-cms-bot - change - 6 Apr 2020
Category: SQL, Administration, com_admin, com_finder, Installation
richard67 - change - 6 Apr 2020
Labels Added: ?
richard67 - change - 6 Apr 2020
The description was changed
richard67 - edited - 6 Apr 2020
infograf768 - comment - 7 Apr 2020

This works also.

richard67 - comment - 7 Apr 2020

@wilsonge Please choose which one you prefer, this one or #28587 .

richard67 - change - 7 Apr 2020
The description was changed
richard67 - edited - 7 Apr 2020
richard67 - change - 7 Apr 2020
The description was changed
richard67 - edited - 7 Apr 2020
richard67 - comment - 7 Apr 2020

Testing Instructions

Thanks to @infograf768 for the testing instructions. I've added the update test to make sure I haven't made a mistake in the update SQL script.

Test 1: New installation

Step 1: Patch and make a clean install.

Step 2: Create and publish an article which contains

Chinese: 不能创建文件

Greek: Εγκατάσταση Γλωσσών

German: Europäer

French: être noël

Simple Chinese: 不

Four-byte characters:
?
or
?
equivalent of U+20E9D  ?
grouped by 3: 不??创

Step 3: Create a Smart Search module in the frontend.

Step 4: In the frontend, search for any Chinese character or group of characters, but especially for the 4-byte characters
?
and
?

Result: OK and highlighting is correct.
[Screenshot: Screen Shot 2020-04-06 at 12 16 14]

Test 2: Update

Step 1: Update a 3.10 site to 4.0-dev with the patch of this PR applied, using either the update package built for this PR or the corresponding custom update URL. Packages and update URL can be found here: https://ci.joomla.org/artifacts/joomla/joomla-cms/4.0-dev/28592/downloads/31062/.

Step 2: Repeat steps 2 to 4 of the previous "Test 1: New installation".

Result: Same as for "Test 1: New installation".
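
For background on why the test data mixes these scripts, the short Python sketch below counts the UTF-8 bytes of a few sample characters. The samples are chosen to resemble the test strings above; the helper names are made up for this example. Code points above U+FFFF, such as U+20E9D, take 4 bytes in UTF-8, which is the case the 4-byte search test exercises.

```python
# Sample characters similar to those in the test article; labels are illustrative.
SAMPLES = {
    "a-umlaut (German)": "\u00e4",
    "capital epsilon (Greek)": "\u0395",
    "U+4E0D (Chinese)": "\u4e0d",
    "U+20E9D (CJK Extension B)": "\U00020e9d",
}


def utf8_byte_length(char: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def is_four_byte(char: str) -> bool:
    """True for code points above U+FFFF, which need 4 bytes in UTF-8."""
    return ord(char) > 0xFFFF


for label, char in SAMPLES.items():
    print(f"{label}: {utf8_byte_length(char)} byte(s), four-byte: {is_four_byte(char)}")
```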

richard67 - change - 7 Apr 2020
The description was changed
richard67 - edited - 7 Apr 2020
richard67 - change - 7 Apr 2020
Title
Old: [4.0] [WiP] Fixing smartsearch issue with some multibytes characters (alternative proposal)
New: [4.0] Fixing smartsearch issue with some multibytes characters (alternative proposal)
richard67 - edited - 7 Apr 2020
richard67 - change - 7 Apr 2020
The description was changed
richard67 - edited - 7 Apr 2020
Hackwar - comment - 7 Apr 2020

I would prefer this over #28587.

alikon - comment - 8 Apr 2020

I have tested this item successfully on 0b20612

with MySQL 8.0.19


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/28592.

alikon - test_item - 8 Apr 2020 - Tested successfully
infograf768 - comment - 8 Apr 2020

I have tested this item successfully on 0b20612

Will close the other one


infograf768 - test_item - 8 Apr 2020 - Tested successfully
infograf768 - change - 8 Apr 2020
Status: Pending → Ready to Commit
infograf768 - comment - 8 Apr 2020

RTC


richard67 - comment - 8 Apr 2020

@alikon @infograf768 I hope you have also done "Test 2: Update".

alikon - comment - 8 Apr 2020

My bad, and laziness: no. I'll do it, sorry.

richard67 - comment - 8 Apr 2020

The update test is important to check that I haven't made a mistake in the update SQL and that the order of processing is correct, i.e. that at the end the collations are as they should be for the tables and columns handled by this PR.

richard67 - change - 8 Apr 2020
Labels Added: ?
richard67 - comment - 8 Apr 2020

@alikon @infograf768 Please hold off on the update test. I've just updated to the latest 4.0-dev, so a new update package will be built by Drone.

richard67 - comment - 8 Apr 2020

A new update package and custom URL have been built by Drone. I've updated the link in the testing instructions.

@alikon Ready for the update test now.

@infograf768 If you have already done both tests, installation and update, please just mark your test result again. Otherwise, if you haven't done the update test, could you do that test too? It is important to check that there is no mistake in the update SQL.

alikon - comment - 8 Apr 2020

[Screenshot from 2020-04-08 12-19-43]
Test done; I can confirm that updating from 3.10 to 4 works.

alikon - comment - 8 Apr 2020

[Screenshot from 2020-04-08 12-32-27]

alikon - comment - 8 Apr 2020

I have tested this item successfully on 938edd1

This time I tested the update too.


alikon - test_item - 8 Apr 2020 - Tested successfully
humblehumanbeing - comment - 9 Apr 2020

Not sure if related but Joomla 3x table fields such as images in #__content, params in #__modules and #__menu are converting unicode characters to 6-bytes. They appear unicode in the backend but you see multi-byte in db.

infograf768 - comment - 9 Apr 2020

@humblehumanbeing

"Not sure if related but Joomla 3x table fields such as images in #__content, params in #__modules and #__menu are converting unicode characters to 6-bytes. They appear unicode in the backend but you see multi-byte in db."

It looks like you are confusing bytes and bits. UTF-8 uses at most 4 bytes per character.
Please give some examples of what you mean.
For params, getting the format \u....\u... is totally normal and is unrelated to the issue here.

humblehumanbeing - comment - 9 Apr 2020

Sorry if that is the case.
In the backend, create an article and enter 'ş İ Ğ Ö Ç' for the full text image alt tag.
Check the images field via phpMyAdmin.
It is encoded as '\u015f \u0130 \u011e \u00d6 \u00c7'.
The same happens in module and menu params.
This makes database manipulation via database software, find and replace for example, nearly impossible.

infograf768 - comment - 9 Apr 2020

These are JSON encoded and it is not related to the issue here with finder.
See https://www.php.net/manual/fr/function.json-encode.php
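
To illustrate the JSON point, here is a small Python sketch; Joomla itself uses PHP's json_encode, so this is only an analogy. Python's `json.dumps` escapes non-ASCII characters as `\uXXXX` by default, much like PHP's json_encode does when JSON_UNESCAPED_UNICODE is not set. Each escape is 6 ASCII characters long, which may be what the "6-bytes" observation refers to. The key name `image_fulltext_alt` and the helper `replace_in_json_field` are assumptions made for this example.

```python
import json

# The alt text from the example above: 'ş İ Ğ Ö Ç', written with escapes for clarity.
ALT_TEXT = "\u015f \u0130 \u011e \u00d6 \u00c7"

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(ALT_TEXT)

# With ensure_ascii=False the characters are kept as-is in the JSON text.
readable = json.dumps(ALT_TEXT, ensure_ascii=False)

# Either form decodes back to the same string, so no data is lost.
assert json.loads(escaped) == json.loads(readable) == ALT_TEXT


def replace_in_json_field(stored: str, key: str, old: str, new: str) -> str:
    """Decode a stored JSON object, replace text in one value, re-encode it.

    Hypothetical helper: a find and replace on such a field is safer on the
    decoded value than on the raw escaped text.
    """
    data = json.loads(stored)
    data[key] = data[key].replace(old, new)
    return json.dumps(data)


stored_field = json.dumps({"image_fulltext_alt": ALT_TEXT})  # assumed key name
updated_field = replace_in_json_field(stored_field, "image_fulltext_alt", "\u00d6", "O")
print(escaped)
print(updated_field)
```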

humblehumanbeing - comment - 9 Apr 2020

Thanks for the clarification.

wilsonge - change - 9 Apr 2020
Status: Ready to Commit → Fixed in Code Base
Closed_Date: 0000-00-00 00:00:00 → 2020-04-09 22:07:08
Closed_By: wilsonge
wilsonge - close - 9 Apr 2020
wilsonge - merge - 9 Apr 2020
wilsonge - comment - 9 Apr 2020

Thanks!
