Inspect the code: components/com_banners/src/Models/BannersModel.php Line 196
$regexp = '[[:<:]]' . $keyword . '[[:>:]]';
For MySQL 8.x and MariDB 10.x it needs to be '\\b' . $keyword . '\\b'
Edited: neglected to quote code.
Labels |
Removed:
?
|
Labels |
Added:
No Code Attached Yet
|
Labels |
Added:
Information Required
|
According to this document it should still work with MariaDB: https://mariadb.com/kb/en/regular-expressions-overview/
Word boundaries
The
[:<:]
and[:>:]
patterns match the beginning and the end of a word respectively.
Prior to MySQL 8.0.4, MySQL used the Henry Spencer regular expression library to support regular expression operations, rather than International Components for Unicode (ICU).”
The [[:<:]] and [[:>:]] word boundary markers are only supported by the Spencer library, for ICU it would be \b instead.
What I found for MariaDB on the page linked in my previous comment is:
MariaDB starting with 10.0.5
Until MariaDB 10.0.5, MariaDB used the POSIX 1003.2 compliant regular expression library. The new PCRE library is mostly backwards compatible with what is described below - see the PCRE Regular Expressions article for the enhancements made in 10.0.5.
So it seems they now support \b
, too.
The thing is that in J4 we still have to support MySQL versions older than 8. And PostgreSQL seems to use \b
for the backspace character and not for word boundaries.
Labels |
Removed:
Information Required
|
A simple way to test whether a sequence of numbers contains a specific number (corrected typo):
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '11', '[[:>:]]'),
'1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '1111', '[[:>:]]')
MySQL said:
3685 - Illegal argument to a regular expression.
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('\\b', '11', '\\b'),
'1,2,3,11,22,33,111,222,333' REGEXP CONCAT('\\b', '1111', '\\b')
Returns 1 and 0
Documentation: https://dev.mysql.com/doc/refman/8.0/en/regexp.html#regexp-compatibility
With MariaDB 10.3.38 the first test returns 0 and 0. The second test returns 1 and 0. Joomla 5 requires MariaDB 10.4 which I have yet to install.
If we want to avoid database type and version specific code in the CMS, we could extend the database query classes in the framework by new methods to return the right regexp pattern like e.g. for word boundaries (and whatever else might be used in the CMS with the regexp
method of the query), so we could use e.g. $query->regexpPatternWordStart()
and $query->regexpPatternWordEnd()
(any ideas for shorter names are welcome), or we could use something like $query->regexpPatternWord($keyword)
to prepend the word start and append the word end pattern, and similar for other use cases.
What do you all think?
Update: As far as I can see, only the banners model uses the regexp
method of the query object.
A simple way to test whether a sequence of numbers contains a specific number:
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:>:]]', '11', '[[:>:]]'), '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:>:]]', '1111', '[[:>:]]')
With MariaDB 10.3.38 the first test returns 0 and 0. The second test returns 1 and 0. Joomla 5 requires MariaDB 10.4 which I have yet to install.
@ceford Shouldn't it use '[[:<:]]'
for the word start, so it is as follows?
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '11', '[[:>:]]'),
'1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '1111', '[[:>:]]')
Please test again with the right query.
A possible fix in the CMS (4.3-dev branch, but should be merged all the way up) could look as follows:
See 4.3-dev...richard67:4.3-dev-fix-banners-model-regexp-query .
Code for that can be found here: https://github.com/richard67/joomla-cms/blob/321c5f2846136cc3147e46b0671bcd5640a48f82/components/com_banners/src/Model/BannersModel.php
As far as I can see we don't have to do such stuff anywhere else in the CMS up to now.
But a solution in the database framework would maybe be more elegant, so we could use something like this:
$regexp = $query->regexpWord($keyword);
Please test again with the right query.
Yes: that was a typo - I had updated my localhost db. Test on live site using MySQL 5.x:
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '11', '[[:>:]]'),
'1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '1111', '[[:>:]]');
Returns 1 and 0.
I don't have a test for the Banners instance. I don't use banners. I came across the problem in a custom component. I have about 8 instances there so your proposed solution would involve a lot of code repetition. A utility function would be better.
On MariaDB 10.6 both ways '[[:<:]]'...'[[:>:]]'
and '\\b'...'\\b'
seem to work and give the expected result (1 and 0).
On MySQL 8.0.33 I get the same result as you reported, SQL error "Illegal argument to a regular expression" with '[[:<:]]'...'[[:>:]]'
, and expected result with '\\b'...'\\b'
.
Apologies for my clumsiness - seems I tested MariaDB with my typo. New test:
Server version: 10.3.38-MariaDB-0ubuntu0.20.04.1 - Ubuntu 20.04
SELECT '1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<:]]', '11', '[[:>:]]'),
'1,2,3,11,22,33,111,222,333' REGEXP CONCAT('[[:<]]', '1111', '[[:>:]]');
MariaDB returns 1 and 0
The //b
word boundary marker also returns 1 and 0.
Labels |
Added:
bug
|
@ceford How do you know that? Is there some documentation about that? If so, could you link to it? Or did you have some issue which is solved by the suggested change? If so: Could you provide some instructions on how to reproduce it? Thanks in advance.