User tests: Successful: Unsuccessful:
In real life tests with rather large data sets and complex search queries, it turned out that there are some performance improvements possible in Smart Search. This PR improves this by changes to the database search query.
The database query results of Smart Search consist of 2 parts: The row ID belonging to the result to display and a serialized object of what to display as a result. This object can contain a lot of data, like the whole text of an article, prices, publishing dates, tags, categories and potentially lots of other data.
Right now, when retrieving the results, the binary blob column object
is always "dragged along" during the calculations, which can result in long running queries. Basically, the database will first load all rows including that column and then filter all of it down to what we actually want. This change does not load the object
column initially, since we don't need it to calculate the result set, but attaches it later on with a merge. The initial calculation of the result set is done in the $query
variable, which is then used as a subquery in the $wrappingQuery
. In this wrapping query, the links table is joined to the results again and instead of the content of the object
column of all rows, we are now only loading it for those rows which we are actually going to return. In order to really only load those rows which we currently need, the subquery is using a LIMIT
in the query, which requires additional code to properly calculate the total number of results.
The access check is changed to not use bound parameters, since the DatabaseQuery code here fails otherwise. For the total count of results, the query with the wrapping query is left in place, even though this could be simplified to simply returning the $query
variable. However I feel it is easier to debug when in the frontend you still have mostly the same query. The retrieve()
and store()
methods were removed because we actually aren't using them anymore. These are from a time long gone.
Retrieving the search results takes some time. In my specific tests it took more than 7 seconds to just run the query on the database.
Retrieving the results is a lot faster. In my specific tests it took slightly more than 2 seconds to run the query on the database.
Please be aware that the above mentioned times are from my personal development setup and don't represent the performance of Smart Search in real life. My setup is slow and intentionally not optimized. The search query I used is intentionally complex, specifically to measure the effect of changes like this one. On a live system the times are much more reasonable. I'm noting them here to give a feeling of the performance improvements we can expect.
Status | New | ⇒ | Pending |
Category | ⇒ | Front End com_finder |
Title |
|
Status | Pending | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2022-01-19 09:41:55 |
Closed_By | ⇒ | Hackwar | |
Labels |
Added:
?
|
If you are interested, you might want to look at these performance and security issues too :) https://github.com/joomla/joomla-cms/issues?q=is%3Aopen+is%3Aissue+author%3APhilETaylor+finder+