?
avatar 810
810
29 Feb 2016

Steps to reproduce the issue

Go to database fix page, with enabled redirects and have data in there

Expected result

Fix the new database collation

Actual result

Duplicate entry 'http://www.kunena.org/Array/169-K-16-and-K-17-Support-Archive/11' for key 'idx_link_old' SQL=ALTER TABLE #__redirect_links ADD UNIQUE idx_link_old ( old_url (100));

on the fix with the message:
•The Joomla! Core database tables have not been converted yet to UTF-8 Multibyte (utf8mb4).

System information (as much as possible)

J3.5.0-Beta3

Additional comments

richard67
Ah, no, sorry! Is not an old issue and not data. The index on this column was in fact introduced by the utf8mb4 conversion, I remember now. #9221

avatar 810 810 - open - 29 Feb 2016
avatar richard67
richard67 - comment - 29 Feb 2016

The problem is that since the change to UTF8MB4, the index of the "old_url" column is limited to the first 100 chacters of the URL, and so this duplicate key problem comes.

@wilsonge What shall we do? 100 characters is a bit short for URLs today. Sorry I saw it when working on the UTF8MB4 thing but not further thought about it.

The other columns where we did this were all "alias" columns, for those 100 characters are ok, so we have this problem only for this 1 column which deals with URLs.

avatar richard67
richard67 - comment - 29 Feb 2016

@wilsonge I know you discussed that once with the guys, and maybe you can ask Eli for advice, and so I ping you.

Old max. lenght for indexes with Innodb storage engine was 255 characters, with utf8mb4 it can be only 191.

With the changes for utf8mb4 we enlarged thes column to 400 but limited the lenght which goes into the index to 100.

We did so for the alias columns (where it makes sense) and for the old_url column of the redirect links, too.

What could we do to solve this?

We could increase this from 100 to 191 for the redirect links old URLs, but then this issue can still happen, not a good idea.

Or we change the lenght of the column itself to 191, too, so no duplicate key can occur anymore due to this "truncation" of what goes into the index, but max. lenght would be 191 then. Better but not nice because so short URLs then.

And if we limit the lenght to 191 this is a kind of BC break, and worth a note in the release notes.

Or maybe it's enough to change the storage engine for this table to MyIsam?

That would be the easiest thing if this helps, but is maybe not supported on every database?


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

@wilsonge As URLs can become arbitrary long (but recommended to stay below 2048 because this is the limit of the sitemap protocol), we were limited before with the 255 characters also too much.

Idea: A way out would be either to have no unique key (bad), or to use an md5 or sha checksum or digest as the unique key to make sure the URLs are unique.

Would this be a way?

Using MyISAM as storage engine would not help much, would be a few characters more only in utf8mb4.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar wilsonge
wilsonge - comment - 29 Feb 2016

I'll chat to Eli - but as a starter for 10 lets increase the index back to its new max (191)

avatar infograf768
infograf768 - comment - 29 Feb 2016

FYI: I tested again a 3.4.8 updated to staging and using FIX (had to do it twice as reported before) and did not have the issue.
One of my redirect urls in utf8 was:
http://localhost:8888/testwindows/trunkgitnew/ta/பன்-மொழி-படிப்படியாக/136-ஒரே-உள்ளடக்கத்தைப்-பல்வேறு-மொழிகளின்-இடைமுகங்களில்-காட்சிப்படுத்தவும்.html
148 characters, and all went fine.

If that url had not been converted in core, it would have been
http://localhost:8888/testwindows/trunkgitnew/ta/%E0%AE%AA%E0%AE%A9%E0%AF%8D-%E0%AE%AE%E0%AF%8A%E0%AE%B4%E0%AE%BF-%E0%AE%AA%E0%AE%9F%E0%AE%BF%E0%AE%AA%E0%AF%8D%E0%AE%AA%E0%AE%9F%E0%AE%BF%E0%AE%AF%E0%AE%BE%E0%AE%95/136-%E0%AE%92%E0%AE%B0%E0%AF%87-%E0%AE%89%E0%AE%B3%E0%AF%8D%E0%AE%B3%E0%AE%9F%E0%AE%95%E0%AF%8D%E0%AE%95%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AF%88%E0%AE%AA%E0%AF%8D-%E0%AE%AA%E0%AE%B2%E0%AF%8D%E0%AE%B5%E0%AF%87%E0%AE%B1%E0%AF%81-%E0%AE%AE%E0%AF%8A%E0%AE%B4%E0%AE%BF%E0%AE%95%E0%AE%B3%E0%AE%BF%E0%AE%A9%E0%AF%8D-%E0%AE%87%E0%AE%9F%E0%AF%88%E0%AE%AE%E0%AF%81%E0%AE%95%E0%AE%99%E0%AF%8D%E0%AE%95%E0%AE%B3%E0%AE%BF%E0%AE%B2%E0%AF%8D-%E0%AE%95%E0%AE%BE%E0%AE%9F%E0%AF%8D%E0%AE%9A%E0%AE%BF%E0%AE%AA%E0%AF%8D%E0%AE%AA%E0%AE%9F%E0%AF%81%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AE%B5%E0%AF%81%E0%AE%AE%E0%AF%8D.html

i.e. 804 characters

avatar infograf768
infograf768 - comment - 29 Feb 2016

oops, that was the new url column...

avatar wilsonge
wilsonge - comment - 29 Feb 2016

74f522c Increase in index length. Awaiting a response from Eli - so will keep this issue open for now

avatar Stevec4
Stevec4 - comment - 29 Feb 2016

As JM reported, I updated a 3.4.8 install to beta 3 had to apply "fix" twice.
This message appeared after the first fix.
Table 'sfc3_user_profiles' does not have column 'profile_value' with type 'TEXT'. (From file 3.3.4-2014)
No errors after second fix applied.
Steve

avatar richard67
richard67 - comment - 29 Feb 2016

@Stevec4 @infograf768 and on c.c @wilsonge The issue that the

Table 'sfc3_user_profiles' does not have column 'profile_value' with type 'TEXT'. (From file 3.3.4-2014)

message being shown only when something else was fixed before I had also before the changes for utf8mb4, e.g. when I tested 3.5.0. Beta 1 or even when I updated 3.4.7 to 3.4.8 and before, too.

It is related to how the database schema manager processes the update files and statements from them.

If there is for example an sql file adding a new column to a table or modifying a column's data type, and then later comes for example another sql where this column is dropped, then a first run of the database schema manager will not find any problems after having processed all the sql update files in the chronological order (given by the names which sort in the right way).

But on a second run, the 1st sql is checked again and now complains because it expects the column as defined in this sql, not knowing about the
later drop, and so this is shown as a database problem to be fixed and will be fixed then.

This is why I wrote in the testing instructions of my PR for the utf8mb4 conversion that this might happen, that you use the fix button and then see some new pronlems so you have to use the button twice or even more times.

Because now the utf8mb4 conversion forces something to be fixed, this problems becomes visible for all and not only for some of the users.

We (you if you want, or I will do later) can open a new issue for that, but please keep it separate from the utf8mb4 related issues.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar Stevec4
Stevec4 - comment - 29 Feb 2016

Richard was there to be a message for the user to know about applying fix to the database? I didn't notice one. Just curious
Steve

avatar richard67
richard67 - comment - 29 Feb 2016

Hmm, I just see my previous comment is not 100% correct, there is only 1 update file modifying this column, so it might be slightly different.

But I am 100% sure I saw this already in past, before Beta 2, maybe with Beta 1 or even when updating 3.4.7. to 3.4.8.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

Update 1: Meanwhile I have seen that when updasting 3.4.8 to Beta 3, the column "profile_value" of the "#__user_profiles" table has data type "TEXT", as it should be.

After the first use of the "Fix" button the data type is then "MEDIUMTEXT". I will check if this comes from the utf8mb4 comversion or what else causes this data type changem but it is not another, later update sql as I thought in my 2nd last comment.

This causes the db schema manager then to see thas it has to be changed to "TEXT" in the second run.

It seems to be an implicit conversion done by mysql, there is not any update sql doing this.

I'll check now if this is caused by statements from the utf8(mb4) conversion and provide an "Update 2" comment here.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

@Stevec4 No, there was not planned to have any special message or so. It was not me who planned the utf8mb4 changes, I only fixed them.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar Stevec4
Stevec4 - comment - 29 Feb 2016

I mean in the admin page to alert the user to apply fix

Steve

On Mon, Feb 29, 2016 at 9:53 AM, Richard Fath notifications@github.com
wrote:

@Stevec4 https://github.com/Stevec4 No, there was not planned to have
any special message or so. It was not me who planned the utf8mb4 changes, I

only fixed them.

This comment was created with the J!Tracker Application
https://github.com/joomla/jissues at issues.joomla.org/joomla-cms/9247
https://issues.joomla.org/tracker/joomla-cms/9247.


Reply to this email directly or view it on GitHub
#9247 (comment).

avatar richard67
richard67 - comment - 29 Feb 2016

Update 2: @wilsonge I have found the reason.

The statement

ALTER TABLE #__user_profiles CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

performed on utf8(mb4) conversion changes the data type of the "profile_value" column from "TEXT" to "MEDIUMTEXT".

I have to check if this happens also for other columns of other tables where nothing is reported because nothing in any update sql and so nothing checked, and what is the best way to handle that, and make a PR then.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

Update 3: @wilsonge Seems to be a MySQL bug: https://bugs.mysql.com/bug.php?id=31291, but we can workaround that by converting them back at the end of the utf8(mb4) conversion.

Am just investigating details and will make a PR later today.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

@Stevec4 I understood what you mean.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

Update 4: @wilsonge Ah, not a bug but a feature of MySQL when reading the linked page, but still I think we can workaround that.

avatar richard67
richard67 - comment - 29 Feb 2016

@wilsonge Where can we discuss possible solutions which would require some decision form a PLT member (e.g. you) or discussion with Eli? I do not wanna hijack this issue for long discussions. You can email me to 'richard.fath' . 'at character' . 't-online.de'.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar richard67
richard67 - comment - 29 Feb 2016

@810 @infograf768 @Stevec4 I opened a new issue #9258 with a detailled analysis.

@wilsonge George, can you discuss that issue #9258 with everybody necessary, e.g. Eli @aschkenasy ?


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

avatar wilsonge
wilsonge - comment - 29 Feb 2016

Yes we can :) I have a skype text conversation I can add you into if you ping it to my email?

avatar richard67
richard67 - comment - 29 Feb 2016

If I had your email ... :smile:

avatar richard67
richard67 - comment - 29 Feb 2016

@wilsonge my skype is richard.fath.67

avatar wilsonge
wilsonge - comment - 29 Feb 2016

Yah that's where I was about to email you and then got distracted by work stuff and lost my track. Apologies! Adding you now!

avatar brianteeman brianteeman - change - 29 Feb 2016
Labels Added: ?
avatar wilsonge wilsonge - change - 1 Mar 2016
Status New Closed
Closed_Date 0000-00-00 00:00:00 2016-03-01 00:35:03
Closed_By wilsonge
avatar wilsonge wilsonge - close - 1 Mar 2016
avatar wilsonge wilsonge - close - 1 Mar 2016
avatar wilsonge
wilsonge - comment - 1 Mar 2016

OK talked this over with Eli and we are going to remove the unique constraint here - it's not the index length that is the problem - but that the data truncation from 255 to 191 characters causes two url's over 191 characters to be the same.

Separately to this @richard67 is going to increase the length of the old url field I believe so it can take a larger size of urls (resurrecting #4781)

avatar richard67
richard67 - comment - 1 Mar 2016

@wilsonge Please note the comment I make on a code line in the sql on your commit. Syntax error danger!

avatar wilsonge
wilsonge - comment - 1 Mar 2016

Fixed it - thankyou!

avatar richard67
richard67 - comment - 1 Mar 2016

@wilsonge The enlargement of the links (all old url, new url and referrer) I will do tomorrow.

I will do it for 2048 characters, not for 2083 chars as it was in #4781.

2083 chars was the limit of IE8 or so, other browsers have much higher limits, we cannot follow them all.

Such long URLs are normally generated on the fly and are dynamic.

The sitemap protocol limits to 2048.

We should keep this limit because we can assume people wanna have their URL to be able in a sitemap and so not will exceed this limit.

Let me know if you disagree.

avatar andrepereiradasilva
andrepereiradasilva - comment - 1 Mar 2016

i think there is no limit for a uri lenght in the http rfc. it all comes down to what the app supports.

But there is a limit in sitemap protocol of 2048, so i think it's a good number.

avatar 810
810 - comment - 1 Mar 2016

fix works, but when click on fix, the page keeps spinning, and will not stop

avatar richard67
richard67 - comment - 1 Mar 2016

@andrepereiradasilva Is exactly how you say

avatar richard67
richard67 - comment - 1 Mar 2016

@810 New PR #9269 - please test.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9247.

Add a Comment

Login with GitHub to post a comment