From a com_search search bar, search for a word containing an apostrophe (for example, "don't").
The expected result is for the word to be found and results returned.
Joomla removes the apostrophe, creating a new word ("don't" becomes "dont"), and does not display any results (or at least not the correct ones).
This issue can be reproduced going back to at least Joomla 2.5, and is still present in 3.4.5.
I've been able to correct this issue with the following change:
In com_search/controller.php, around line 47:
$badchars = array('#', '>', '<', '\');
$searchword = trim(str_replace($badchars, '', $this->input->getString('searchword', null, 'post')));
replace it with this:
$badchars = array('#', '>', '<', '\', '\'');
$goodchars = array('','','','',' ');
$searchword = trim(str_replace($badchars, $goodchars, $this->input->getString('searchword', null, 'post')));
My apologies about the syntax errors, I hadn't seen how to post code, and it appears the editor removed some slashes without me noticing. You are correct about the original code, and here is the correctly-displayed update that I've made to my site:
$badchars = array('#', '>', '<', '\\', '\'');
$goodchars = array('','','','',' ');
$searchword = trim(str_replace($badchars, $goodchars, $this->input->getString('searchword', null, 'post')));
As for the differing results, I wonder if it's a question of language. I used "don't" as an example above. Let me be more precise. My Joomla front end is set to French (fr_fr), and one search term example is "acadie", which can also be commonly written as "l'acadie" (depending on context). If I search using the original search controller, "l'acadie" becomes "lacadie", and the expected results are not shown (as in, no results at all, since "lacadie" isn't a word). With the modification I propose above, however (replacing the apostrophe with a space), "l'acadie" becomes "l acadie" and all of the expected results are displayed for "acadie" and "l'acadie". I can reproduce this behavior with other similar words.
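To make the comparison concrete, here is a minimal standalone sketch of what the two versions of the controller line do to a search word. The function names `stripBadChars` and `replaceApostropheWithSpace` are hypothetical, and a plain string stands in for the value returned by `$this->input->getString('searchword', null, 'post')`. It only covers this one line of the controller, not anything Joomla does to the search word before or after it.

```php
<?php
// Sketch only: these helpers are hypothetical and are not part of com_search.

// Original controller line: removes '#', '>', '<' and the backslash.
// The apostrophe is not in the list, so this line leaves it untouched.
function stripBadChars(string $searchword): string
{
    $badchars = array('#', '>', '<', '\\');
    return trim(str_replace($badchars, '', $searchword));
}

// Proposed change: same removals, plus the apostrophe is replaced by a space.
function replaceApostropheWithSpace(string $searchword): string
{
    $badchars  = array('#', '>', '<', '\\', '\'');
    $goodchars = array('', '', '', '', ' ');
    return trim(str_replace($badchars, $goodchars, $searchword));
}

echo stripBadChars("l'acadie"), "\n";              // l'acadie
echo replaceApostropheWithSpace("l'acadie"), "\n"; // l acadie
echo replaceApostropheWithSpace("don't"), "\n";    // don t
```

With the replacement in place, "l'acadie" is searched as the two terms "l" and "acadie", which is why results for "acadie" show up, and also why "don't" turns into "don t".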
Why is replacing a single quote with a space wrong? It doesn't appear to be any more "wrong" than replacing a single quote with nothing (which grammatically, at least, is certainly just as "wrong"). I would suggest that the "correct" thing would be to actually find a way to handle single quotes (apostrophes) in search queries. That seemingly not being possible, I feel that replacing the current nothing with a space at least allows for the expected functionality in these cases.
Cheers
Before your patch, still searching for don't, I get the same result as above.
I also get l'acadie (link is search-com-search.html?searchword=l'acadie&ordering=newest&searchphrase=any&limit=0).
After your patch, I totally miss the don t (but still get l acadie, as above).
Link is search-com-search.html?searchword=don t&ordering=newest&searchphrase=all&limit=20
I'm somewhat at a loss to explain our differing results. If I search for "l'acadie" on my site, I don't get any results (without my patch). With the patch, I get all the expected output. Would the front-end language setting affect this in some way (my site is set to fr_fr on the front-end)?
As for "don't", I see what you mean about replacing the apostrophe with a space, but I suppose a solution to my problem won't be as easy to work on if we're not able to recreate it on another site.
I had a realization about this last night. The issue on my install is that the apostrophe is getting copied/pasted from Word, and is the "curly" apostrophe, not a straight apostrophe. Updating the site content to use straight apostrophes solves the issue.
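For anyone hitting the same thing, this is a rough sketch of how curly apostrophes could be mapped to straight ones in text. It is not what I did on my site (I edited the content by hand), and `normalizeApostrophes` is a hypothetical helper, not a Joomla function. It assumes the text is UTF-8.

```php
<?php
// Hypothetical helper, not part of Joomla: maps typographic single quotes
// (as typically pasted from Word) to the straight ASCII apostrophe.
function normalizeApostrophes(string $text): string
{
    // UTF-8 byte sequences for U+2019 (right single quotation mark)
    // and U+2018 (left single quotation mark).
    $curly = array("\xE2\x80\x99", "\xE2\x80\x98");
    return str_replace($curly, "'", $text);
}

// A curly apostrophe becomes a straight one; straight ones are unchanged.
echo normalizeApostrophes("l\xE2\x80\x99acadie"), "\n"; // l'acadie
echo normalizeApostrophes("l'acadie"), "\n";            // l'acadie
```

Whether this would be applied to the stored content, to the search word, or to both is a separate decision; the sketch only shows the character mapping.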
Thanks for your feedback,
-Liam
Status | New | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2015-12-21 13:21:49 |
Closed_By | ⇒ | liamhanks |
Or, search for "acadie".
That code is wrong and will cause an unexpected error.
Also, replacing a single quote with a space is wrong in any case.
My tests here show that search works. Although it does not highlight the 't, it does return the desired result.
Also note that the original code is not what you display above.
It is
$badchars = array('#', '>', '<', '\\');
and not
$badchars = array('#', '>', '<', '\');