infograf768
6 Apr 2020

Pull Request for remaining part of Issue #28493

Summary of Changes

Changing 3 tables (finder_terms, finder_tokens and finder_aggregates) to the utf8mb4_bin collation, including in the J4-specific update.sql.
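
As an illustration only, a table-level collation change of this kind could be expressed with MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` statement. The sketch below merely builds such statements as strings. The helper name `table_collation_sql` is hypothetical, the table names are copied from the summary with the `#__` prefix assumed, and the output is not the actual content of this PR's update.sql.

```python
# Hypothetical helper: builds MySQL statements as plain strings.
# Table names follow the summary above; the "#__" prefix is an assumption.
FINDER_TABLES = ["#__finder_terms", "#__finder_tokens", "#__finder_aggregates"]


def table_collation_sql(table, charset="utf8mb4", collation="utf8mb4_bin"):
    """Return a statement that converts all text columns of `table`."""
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
    )


statements = [table_collation_sql(table) for table in FINDER_TABLES]
for statement in statements:
    print(statement)
```

Converting the whole table affects every character column in it, which is the trade-off raised in the comments that follow.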

Testing Instructions

See below, because of Drone sensitivity. GRRR!!

Result

Result OK and highlighting is correct.

infograf768 - open - 6 Apr 2020
infograf768 - change - 6 Apr 2020
Status: New → Pending
joomla-cms-bot - change - 6 Apr 2020
Category: SQL, Administration, com_admin, com_finder, Installation

wilsonge - comment - 6 Apr 2020

This will make all searches case sensitive. I'm kinda unconvinced that's what we need.

Hackwar - comment - 6 Apr 2020

This could be a solution, and the case sensitivity wouldn't be a problem, but I'm still looking at the side effects.

alikon - comment - 6 Apr 2020

Instead of changing the collation of the whole table, I'd prefer to change the collation of only the fields involved.

richard67 - comment - 6 Apr 2020

Quoting alikon: "Instead of changing the collation of the whole table, I'd prefer to change the collation of only the fields involved."

Yes, that would be better. @Hackwar Could you list the columns which need binary collation?

infograf768 - comment - 6 Apr 2020

Quoting richard67: "Could you list the columns which need binary collation?"

In that case, I welcome patches to my branch, or create a new PR and I will delete this one. No ego here 😄

richard67 - comment - 6 Apr 2020

@Hackwar Which columns need to be utf8mb4_bin if we decide to change the collation not for the complete tables but only for the columns where it is needed? Only the term columns? Or also the stem columns?

Update: And in the case of table #__finder_terms, also the soundex column?
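
To make the per-column alternative concrete, here is a minimal sketch that builds a MySQL `ALTER TABLE ... MODIFY` statement for a single column. The helper name `column_collation_sql` is hypothetical, and the column type `VARCHAR(75)` and the `NOT NULL` constraint are assumptions for illustration, since `MODIFY` requires the full column definition to be restated. Which columns actually need the change is the open question above.

```python
# Hypothetical helper: builds a per-column collation change as a string.
# The column type and NOT NULL constraint must match the real schema;
# the values used in the example call are assumptions.
def column_collation_sql(table, column, column_type, collation="utf8mb4_bin"):
    """Return a statement changing the collation of one column only."""
    charset = collation.split("_", 1)[0]  # "utf8mb4_bin" -> "utf8mb4"
    return (
        f"ALTER TABLE `{table}` MODIFY `{column}` {column_type} "
        f"CHARACTER SET {charset} COLLATE {collation} NOT NULL;"
    )


term_sql = column_collation_sql("#__finder_terms", "term", "VARCHAR(75)")
print(term_sql)
```

Unlike a table-level conversion, this leaves the other columns of the table on their existing collation.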

richard67 - comment - 6 Apr 2020

Quoting alikon: "Instead of changing the collation of the whole table, I'd prefer to change the collation of only the fields involved."

I just noticed that for #__finder_terms_common we also applied binary collation to the complete table when we implemented the utf8mb4 conversion in J3. Besides the term column, the only column affected in that case was the language column, and that isn't really a problem.

So it could be easier to do it for the complete tables, as in this PR.

richard67 - comment - 6 Apr 2020

@infograf768 Could you test if PR #28592 works, too?

infograf768 - comment - 7 Apr 2020

#28592 works OK too.

richard67 - comment - 7 Apr 2020

@wilsonge Please choose which one you like more, this one or #28592.

infograf768 - change - 7 Apr 2020
The description was changed
infograf768 - edited - 7 Apr 2020

infograf768 - comment - 7 Apr 2020

Testing instructions

Patch and make a clean install.
Create and publish an article containing the following:

Chinese: 不能创建文件

Greek: Εγκατάσταση Γλωσσών

German: Europäer

French: être noël

Simple Chinese: 不

Four-byte characters:
𠹷
or
𨈇
equivalent of U+20E9D: 𠺝
Grouped by 3: 不𠹷𨈇创

Create a Smart Search module in the frontend.

Search for any Chinese character or group of characters, but especially for the 4-byte characters
𠹷
and
𨈇
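
For context on why these particular characters matter: they lie outside the Basic Multilingual Plane and need 4 bytes in UTF-8, which MySQL's legacy 3-byte utf8 character set cannot store and utf8mb4 can. The short sketch below checks the encoded width of some of the test characters; the helper name `utf8_width` is hypothetical.

```python
# Hypothetical helper: number of bytes a single character needs in UTF-8.
def utf8_width(char):
    return len(char.encode("utf-8"))


# Characters taken from the test article above.
test_chars = ["ä", "不", "𠹷", "𨈇"]
widths = [utf8_width(char) for char in test_chars]
four_byte = [char for char in test_chars if utf8_width(char) == 4]

for char, width in zip(test_chars, widths):
    print(f"U+{ord(char):04X} needs {width} byte(s) in UTF-8")
```

Only the last two characters need 4 bytes, so they are the ones that exercise the utf8mb4 handling.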

Hackwar - comment - 7 Apr 2020

I would prefer #28592 over this one.

infograf768 - comment - 8 Apr 2020

Closing in favor of #28592

infograf768 - close - 8 Apr 2020
infograf768 - change - 8 Apr 2020
Status: Pending → Closed
Closed_Date: 0000-00-00 00:00:00 → 2020-04-08 09:45:27
Closed_By: infograf768
Labels Added: ?
