Pending


Hackwar
9 Jan 2019

Smart Search currently adds a join for every taxonomy group and for every required search term. This is the naive way of making sure that each required taxonomy and each required search term is included. Unfortunately, that means that for somewhat large indexes (in my specific case ~6,000 indexed documents with half a million different terms and 2.2 million mappings between terms and documents) the load on the database gets very high. If someone just copies a sentence out of a text and searches for it, with maybe 15 words, that makes 15 joins over those 2.2 million mappings, so more than 30 million rows to process. (Again, that is the naive view; it works slightly differently internally.) Long story short, you can DoS a large website by simply making a few dozen calls to the search with a long search query: the MySQL server comes to a total gridlock and dies. This normally requires an admin to go in and restart MySQL to clear the table lock.

This change should fix that issue and greatly improve performance. Instead of doing 15 joins for a 15-word search query, it does one join that covers all search terms, and the HAVING clause checks that every search term has at least one match in the result. So now we do a single index lookup on the mapping table, and the HAVING clause only works on the (hopefully) small result set.
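To illustrate the idea, here is a minimal sketch in Python with SQLite. It is not the com_finder code: the tables `terms` and `links_terms`, their columns, and the helper names are hypothetical, and the real query builder works differently internally. It only contrasts the join-per-term query with a single join plus a HAVING check per term.

```python
import sqlite3


def build_index(conn, docs):
    """Create a toy index (hypothetical schema): one mapping row per (document, term)."""
    conn.executescript(
        """
        CREATE TABLE terms (term_id INTEGER PRIMARY KEY, term TEXT UNIQUE);
        CREATE TABLE links_terms (link_id INTEGER, term_id INTEGER);
        CREATE INDEX idx_links_terms_term ON links_terms (term_id);
        """
    )
    for link_id, text in docs.items():
        for word in sorted(set(text.lower().split())):
            conn.execute("INSERT OR IGNORE INTO terms (term) VALUES (?)", (word,))
            (term_id,) = conn.execute(
                "SELECT term_id FROM terms WHERE term = ?", (word,)
            ).fetchone()
            conn.execute(
                "INSERT INTO links_terms (link_id, term_id) VALUES (?, ?)",
                (link_id, term_id),
            )


def search_join_per_term(conn, words):
    """Old approach: one extra join on the mapping table per required term."""
    words = list(dict.fromkeys(words))  # drop duplicate terms, keep order
    if not words:
        return []
    sql = "SELECT DISTINCT m0.link_id FROM links_terms AS m0"
    for i in range(len(words)):
        if i > 0:
            sql += f" JOIN links_terms AS m{i} ON m{i}.link_id = m0.link_id"
        sql += f" JOIN terms AS t{i} ON t{i}.term_id = m{i}.term_id AND t{i}.term = ?"
    sql += " ORDER BY m0.link_id"
    return [row[0] for row in conn.execute(sql, words)]


def search_single_join(conn, words):
    """New approach: one join for all terms, HAVING checks each term matched."""
    words = list(dict.fromkeys(words))
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    # One condition per term: at least one row in the group must match it.
    having = " AND ".join(
        "SUM(CASE WHEN t.term = ? THEN 1 ELSE 0 END) > 0" for _ in words
    )
    sql = f"""
        SELECT m.link_id
        FROM links_terms AS m
        JOIN terms AS t ON t.term_id = m.term_id
        WHERE t.term IN ({placeholders})
        GROUP BY m.link_id
        HAVING {having}
        ORDER BY m.link_id
    """
    return [row[0] for row in conn.execute(sql, words + words)]


conn = sqlite3.connect(":memory:")
build_index(conn, {1: "the quick brown fox", 2: "the lazy dog", 3: "quick brown dog"})
print(search_join_per_term(conn, ["quick", "brown"]))  # [1, 3]
print(search_single_join(conn, ["quick", "brown"]))    # [1, 3]
```

Both functions return the same documents; the difference is that the second query touches the mapping table once, no matter how many words the search contains.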

How to test?

  1. Index a large site, at least 1,000 articles.
  2. Go to a random article and copy a longer sentence from it
  3. Search for this sentence in Smart Search in the frontend.
  4. See that it takes a very long time and that your MySQL server may even crash. Hooray!
  5. Apply the patch, run the search again.
  6. See that you get a result in a reasonable timeframe.
Hackwar - open - 9 Jan 2019
Hackwar - change - 9 Jan 2019
Status: New → Pending
joomla-cms-bot - change - 9 Jan 2019
Category: Front End, com_finder
wilsonge - change - 16 Jan 2019
Labels Added: ?
wilsonge - change - 20 Jan 2019
Status: Pending → Fixed in Code Base
Closed_Date: 0000-00-00 00:00:00 → 2019-01-20 21:06:48
Closed_By: wilsonge
wilsonge - close - 20 Jan 2019
wilsonge - merge - 20 Jan 2019
wilsonge - comment - 20 Jan 2019

I'm not seeing any tests here, and I think the best way to get tests is to merge it in plenty of time for stable and see what happens. More than happy to revert if any issues are found while testing it in the main trunk.
