Joomla! Issue Tracker | Joomla! CMS #23495 - [4.0] Smart Search searching performance improvement

Fixed in Code Base
20 Jan 2019
Medium
Build: 4.0-dev
# 23495
Diff
Hackwar:j4finder_performance


Hackwar
9 Jan 2019

Smart Search currently adds a join for every taxonomy group and for every required search term. This is the naive way of making sure that each required taxonomy and each required search term is included. Unfortunately, for somewhat large indexes (in my specific case ~6000 indexed documents with half a million different terms and 2.2 million mappings between terms and documents) the load on the database system gets very high. If someone copies a sentence out of a text and searches for it, with maybe 15 words, that makes 15 joins over those 2.2 million mappings, so more than 30 million rows to process (again, counting naively; it works slightly differently internally). Long story short, you can DoS a large website by simply making a few dozen calls to the search with a long search query: the MySQL server comes to a total gridlock and dies. This normally requires an admin to go in and restart MySQL to clear the table lock.
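To make the join-per-term pattern concrete, here is a minimal sketch in Python with SQLite. It is not the actual com_finder query: the tables `links` and `link_terms`, their columns, and the helper `naive_query` are hypothetical stand-ins for the index and its term-to-document mapping table.

```python
import sqlite3


def naive_query(term_ids):
    """Build a query with one self-join on the mapping table per required term.

    Table and column names are hypothetical, not the real Smart Search schema.
    """
    joins = []
    params = []
    for i, term_id in enumerate(term_ids):
        joins.append(
            f"JOIN link_terms AS m{i} "
            f"ON m{i}.link_id = l.link_id AND m{i}.term_id = ?"
        )
        params.append(term_id)
    sql = (
        "SELECT l.link_id FROM links AS l "
        + " ".join(joins)
        + " ORDER BY l.link_id"
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE links (link_id INTEGER PRIMARY KEY);
    CREATE TABLE link_terms (link_id INTEGER, term_id INTEGER);
    INSERT INTO links VALUES (1), (2), (3);
    INSERT INTO link_terms VALUES
        (1, 10), (1, 20), (1, 30),
        (2, 10), (2, 20),
        (3, 30);
    """
)

# Three required terms produce three joins; only document 1 contains all of them.
sql, params = naive_query([10, 20, 30])
naive_rows = [row[0] for row in conn.execute(sql, params)]
join_count = sql.count("JOIN link_terms")
print(join_count, naive_rows)
```

The number of joins grows linearly with the number of search terms, which is what makes a pasted 15-word sentence so expensive on a large mapping table.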

This change should fix that issue and improve performance greatly. Instead of doing 15 joins for a 15-word search query, it does one join that contains all search terms, and the HAVING clause checks that every search term has at least one match in the result. So now we do a single index lookup on the mapping table, and the HAVING clause only works on the (hopefully) small result set.
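The single-lookup idea can be sketched as follows, again in Python with SQLite. This is an illustration under assumptions rather than the SQL from the patch: the `link_terms` table, its columns, and the helper `grouped_query` are hypothetical, and each search term is assumed to map to exactly one term ID.

```python
import sqlite3


def grouped_query(term_ids):
    """Build one lookup on the mapping table plus a HAVING check per term.

    Table and column names are hypothetical, not the real Smart Search schema.
    """
    placeholders = ", ".join("?" for _ in term_ids)
    # In SQLite a comparison evaluates to 0 or 1, so SUM(...) counts the matches.
    having = " AND ".join("SUM(m.term_id = ?) > 0" for _ in term_ids)
    sql = (
        "SELECT m.link_id FROM link_terms AS m "
        f"WHERE m.term_id IN ({placeholders}) "
        "GROUP BY m.link_id "
        f"HAVING {having} "
        "ORDER BY m.link_id"
    )
    # Parameters are bound in order: first the IN list, then the HAVING checks.
    return sql, list(term_ids) + list(term_ids)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE link_terms (link_id INTEGER, term_id INTEGER);
    CREATE INDEX idx_term ON link_terms (term_id, link_id);
    INSERT INTO link_terms VALUES
        (1, 10), (1, 20), (1, 30),
        (2, 10), (2, 20),
        (3, 30);
    """
)

sql, params = grouped_query([10, 20, 30])
all_three = [row[0] for row in conn.execute(sql, params)]

sql, params = grouped_query([10, 20])
first_two = [row[0] for row in conn.execute(sql, params)]

print(all_three, first_two)
```

The mapping table is read once regardless of how many terms are searched; the per-term checks run only on the grouped candidate rows.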

How to test?

Index a large site with at least 1,000 articles.
Go to a random article and copy a longer sentence from it.
Search for this sentence in Smart Search in the frontend.
See that it takes a very long time and that your MySQL server may even crash. Hooray!
Apply the patch and run the search again.
See that you get a result in a reasonable timeframe.

16e4e7b 9 Jan 2019

Smart Search searching performance improvement

Hackwar - open - 9 Jan 2019

Hackwar - change - 9 Jan 2019

Status

New

⇒

Pending

joomla-cms-bot - change - 9 Jan 2019

Category

⇒

Front End com_finder

accc75e 16 Jan 2019

Merge branch '4.0-dev' into j4finder_performance

wilsonge - change - 16 Jan 2019

Labels

Added: ?

03bfe77 20 Jan 2019

Merge branch '4.0-dev' into j4finder_performance

wilsonge - change - 20 Jan 2019

Status	Pending	⇒	Fixed in Code Base
Closed_Date	0000-00-00 00:00:00	⇒	2019-01-20 21:06:48
Closed_By		⇒	wilsonge

wilsonge - close - 20 Jan 2019

wilsonge - merge - 20 Jan 2019

wilsonge - comment - 20 Jan 2019

I'm not seeing any tests here - and I think the best way to get tests is to merge it in plenty of time for stable and see what happens here. More than happy to revert if any issues get found testing it in the main trunk
