Expected: terms listed in the ###_finder_terms_common table are ignored.
Actual: terms listed in the ###_finder_terms_common table are not ignored and continue to appear in the finder tables.
joomla-3.9.16
fedora31
php-7.3.14
Am I misunderstanding the intention of the finder_terms_common table? Is there an interface for updating it, or must it be done manually?
Words like 'of' and 'as' continue to appear in the terms table. We have a very large site with a multi-gigabyte finder_terms table that we'd like to minimize.
I'm using mariadb-10.3.21
Thank you. Sorry I forgot to include that.
Any updates on this? Can I ask someone to test ###_finder_terms_common on their system and see if it's working properly to exclude terms from the finder tables on joomla-3.9.15/16?
@Hackwar @chrisdavenport Could someone answer this question? I am unfortunately not familiar with smart search (com_finder).
@dwreski I can confirm that on both Joomla 3.x and 4.0-dev, all common terms defined in the database table #__finder_terms_common also appear in the table #__finder_terms.
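For anyone who wants to check this on their own site, the overlap can be detected with a join between the two tables. Below is a minimal Python sketch using an in-memory SQLite database. The unprefixed table names and the reduced column set (`term`, `language`) are simplifying assumptions rather than the full Joomla schema, and the sample rows are made up.

```python
import sqlite3

# In-memory stand-ins for #__finder_terms_common and #__finder_terms.
# Table names and columns are simplified assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE finder_terms_common (term TEXT, language TEXT);
    CREATE TABLE finder_terms (
        term_id INTEGER PRIMARY KEY,
        term TEXT,
        language TEXT
    );
    INSERT INTO finder_terms_common VALUES
        ('of', 'en'), ('as', 'en'), ('and', 'en');
    INSERT INTO finder_terms (term, language) VALUES
        ('of', 'en'), ('as', 'en'), ('joomla', 'en'), ('search', 'en');
""")


def common_terms_still_indexed(connection):
    """Return the common terms that also appear in the terms table."""
    rows = connection.execute("""
        SELECT t.term
        FROM finder_terms AS t
        JOIN finder_terms_common AS c
          ON c.term = t.term AND c.language = t.language
        ORDER BY t.term
    """).fetchall()
    return [row[0] for row in rows]


overlap = common_terms_still_indexed(conn)
print(overlap)
```

On a real installation the same `SELECT ... JOIN` would be run in the database client against the tables with the site's own prefix.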
The good news: In J4 I could find in the Smart Search component's options in the "Index" tab an option "Filter Common Words". If you switch that on and do a reindex, the common terms should disappear from the terms table.
The bad news: In J3 I didn't find this option.
What I have also found out in the meantime by asking around is that the #__finder_terms_common table is intended to be maintained "manually", i.e. with SQL in a database client (e.g. phpMyAdmin).
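As a rough illustration of what such manual maintenance might look like, here is a small Python sketch against an in-memory SQLite stand-in for the table. The columns `term` and `language`, the unique constraint, and the helper `add_common_terms` are assumptions made for the example. On a real site the same idea would be an `INSERT` statement run in the database client against the prefixed table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for #__finder_terms_common. The real table carries the
# site's prefix; the columns and the unique constraint here are assumed.
conn.execute("""
    CREATE TABLE finder_terms_common (
        term TEXT NOT NULL,
        language TEXT NOT NULL,
        UNIQUE (term, language)
    )
""")


def count_common_terms(connection):
    """Return the number of rows currently in the common terms table."""
    return connection.execute(
        "SELECT COUNT(*) FROM finder_terms_common"
    ).fetchone()[0]


def add_common_terms(connection, terms, language):
    """Insert terms for one language, skipping ones that already exist.

    Returns the number of rows that were actually added.
    """
    before = count_common_terms(connection)
    connection.executemany(
        "INSERT OR IGNORE INTO finder_terms_common (term, language) "
        "VALUES (?, ?)",
        [(term.lower(), language) for term in terms],
    )
    connection.commit()
    return count_common_terms(connection) - before


added_first = add_common_terms(conn, ["of", "as"], "en")
added_second = add_common_terms(conn, ["of", "the"], "en")
print(added_first, added_second)
```

The second call adds only "the", because "of" is already present for that language.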
@richard67 thanks so much for your help.
So have we confirmed that there is a bug because the terms exist in both #__finder_terms_common and #__finder_terms?
@dwreski Not sure if it is a bug or just missing functionality.
In J4 the finder aka smart search was restructured and widely rewritten, and missing functionality has been added.
@HLeithner Would you say it's a bug that common terms like "and" are included in finder terms and there is no way to filter them out in J3 like we have it in J4? Without that filter, the #__finder_terms_common table is useless in J3.
It depends. If it really should work, and it is documented that it should work by adding terms with phpMyAdmin (wth), then it should work. If the table merely exists without any reference saying that it should work, then it would be a new feature. If it's only a missing join or similar, it should be fixed; if you need a new GUI for this, I think it has to wait for J4.
Why is there an option for this?!
Good question. And I haven't tested if it really works in J4.
It's not even ignoring the default terms added in J3. See attached.
It is not really clear how the common words feature should work. In J3 the feature is only supposed to mark terms as "common" words by setting the flag in the table; later on, this is supposed to somehow improve the search results by omitting common words from the search query. However, this is sketchy at best and not really reliable from my POV. There is no GUI or API to change that table; the only way is to modify it directly in phpMyAdmin. I don't think we will modify this in J3 anymore.
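To make the query-side idea concrete, here is a minimal Python sketch of dropping common words from a search query before it is matched. This is not com_finder's actual code: the function `strip_common_words`, the sample word set, and the fallback for all-common queries are assumptions made purely for illustration.

```python
# Hypothetical sample of common words; a real list would come from the
# common terms table for the query's language.
COMMON_WORDS = {"of", "as", "and", "the"}


def strip_common_words(query, common_words=COMMON_WORDS):
    """Split a query on whitespace and drop tokens flagged as common.

    If every token is common, the original tokens are returned so the
    query does not become empty (a design choice of this sketch).
    """
    tokens = query.lower().split()
    kept = [token for token in tokens if token not in common_words]
    return kept or tokens


filtered = strip_common_words("History of the Joomla project")
all_common = strip_common_words("of the")
print(filtered, all_common)
```

Note that in this model the common words are still stored in the index; they are only skipped when the query is built.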
J4 has the option to work like J3, but also to not index common words at all. Additional (language-specific) words can be added to the table via the language packs by adding a txt file; otherwise, editing via phpMyAdmin is a possibility. One idea in J4 was also to allow filtering out problematic words from the indexed search that you may not want to be easily found, for example for political dissidents.
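The difference between the two modes can be sketched in a few lines of Python. This is only a model of the behaviour described above, not the J4 indexer: `build_term_rows`, its `filter_common` switch, and the row layout are hypothetical names chosen for the example.

```python
from collections import Counter


def build_term_rows(text, common_words, filter_common):
    """Build one row per distinct term in the text.

    With filter_common=False, common words are kept and flagged
    (the J3-style behaviour described above). With filter_common=True,
    they are skipped and never reach the index.
    """
    counts = Counter(text.lower().split())
    rows = []
    for term, count in sorted(counts.items()):
        is_common = term in common_words
        if filter_common and is_common:
            continue
        rows.append({"term": term, "common": int(is_common), "count": count})
    return rows


sample_text = "state of the art of search"
sample_common = {"of", "the"}

flagged = build_term_rows(sample_text, sample_common, filter_common=False)
filtered = build_term_rows(sample_text, sample_common, filter_common=True)
print(len(flagged), [row["term"] for row in filtered])
```

Only the second mode reduces the number of stored rows; the first just records which terms are common.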
Unfortunately what you are seeing is the "correct" behavior. Fortunately for you, the effects of this aren't as severe as they seem. The common words will only appear once in the terms table and there will be only one mapping entry per content item per common word. The additional storage space for this wouldn't really make THAT big a difference...
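For anyone who wants to put numbers on that for their own site, the description above translates into simple arithmetic: one terms row per common word, and at most one mapping row per content item per common word. The Python sketch below computes that upper bound; the function name and the input figures are hypothetical, and the bound assumes every item contains every common word.

```python
def extra_rows_upper_bound(content_items, common_words):
    """Upper bound on rows attributable to common words.

    Returns (term_rows, mapping_rows): one terms row per common word, and
    at most one mapping row per content item per common word.
    """
    term_rows = common_words
    mapping_rows = content_items * common_words
    return term_rows, mapping_rows


# Hypothetical site: 10,000 indexed items and 100 common words.
term_rows, mapping_rows = extra_rows_upper_bound(10_000, 100)
print(term_rows, mapping_rows)
```

Real figures will be lower wherever an item does not contain a given common word.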
yes
Status | New | ⇒ | Expected Behaviour |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2020-07-07 19:26:44 |
Closed_By | ⇒ | richard67 |
Closing as expected behavior. Thanks @Hackwar for the explanations.
@dwreski Which kind and version of database?
@Hackwar Could you have a look?