Smart Search can currently fail when two terms that the database collation treats as identical are, because of a bug, handled as two separate terms. This violates the unique index on (term, language) in the terms table. This PR fixes the GROUP BY statement.
Unfortunately, this is rather difficult to reproduce. I had content containing the words messsystem and meßsystem, which triggered the problem on one server, but I can't reproduce it locally. So... code review?
Documentation link for docs.joomla.org:
No documentation changes for docs.joomla.org needed
Pull Request link for manual.joomla.org:
No documentation changes for manual.joomla.org needed
Status | New | ⇒ | Pending |
Category | ⇒ | Administration com_finder |
Some more explanation: This problem results in an exception which at least aborts mass indexing in Smart Search and potentially also causes a fatal error while saving the content in question. Although the content has already been saved at that point, it is still rather bad to get a fatal error during that process. I've encountered this on several occasions in the past, especially in combination with Falang (probably mainly because sites with Falang have more non-English content). Unfortunately, it isn't as simple as putting those similar words into an article and saving it. I couldn't find out how to reproduce the problem or how to create a minimal example.
The main problem seems to be that the GROUP BY differentiates the terms based on (among other things) the weight and thus treats two terms as different, even though they are identical by the collation's standards.
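To illustrate the failure mode described above, here is a minimal sketch. It is not the actual com_finder query or schema. It assumes a simplified, hypothetical pair of tables (`tokens` and `terms`) and uses SQLite's built-in `NOCASE` collation as a stand-in for a MySQL collation that treats `messsystem` and `meßsystem` as equal, so the two colliding terms here differ only in letter case.

```python
import sqlite3


def build_db():
    """Create hypothetical, simplified tables (not the real com_finder schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tokens (term TEXT COLLATE NOCASE, language TEXT, weight REAL);
        CREATE TABLE terms (
            term TEXT COLLATE NOCASE,
            language TEXT,
            weight REAL,
            UNIQUE (term, language)
        );
        -- Two spellings that the collation considers equal, with different weights.
        INSERT INTO tokens VALUES ('Messsystem', 'de', 1.0), ('messsystem', 'de', 2.0);
        """
    )
    return conn


def fill_terms(conn, group_by):
    """Aggregate tokens with the given GROUP BY and insert them into terms."""
    rows = conn.execute(
        f"SELECT term, language, MAX(weight) FROM tokens GROUP BY {group_by}"
    ).fetchall()
    conn.executemany("INSERT INTO terms VALUES (?, ?, ?)", rows)
    return len(rows)


# Grouping by weight as well yields two groups for what the index sees as one term.
try:
    fill_terms(build_db(), "term, language, weight")
    buggy_result = "ok"
except sqlite3.IntegrityError:
    buggy_result = "unique index violated"

# Grouping only by the columns of the unique index yields a single row.
fixed_rows = fill_terms(build_db(), "term, language")

print(buggy_result, fixed_rows)
```

In this sketch, the GROUP BY columns match the columns of the unique index, so the grouping and the index agree on what counts as the same term. Whether the real query needs further changes depends on details not shown here.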