User tests: Successful: Unsuccessful:
Using preg_match
here can be very very slow and sometimes even crash the page. Anyway, we don't need a regular expression for this simple task.
Labels |
Added:
?
|
This change affects the results page of the smart search. It includes one major 'under the hood' kind of improvement and one minor change visible to users.
1) Configure a smart search results page. Make sure it's configured to display description
.
2) Run a search. Preferably one that returns many large results.
1) The page will probably load very slowly. It might even crash.
2) If the page does load, you will probably notice that some of the result descriptions start and/or end mid-word.
1) The page should load much faster. It should not crash.
2) Except in very rare cases, descriptions will contain full words.
The original code is using preg_match
on the description of each and every search result. This is a notoriously slow function and seems to be especially slow when run against a long string. The power of regex is not needed here at all. We only need to find a simple string, and a certain number of characters around it. Using the regular string functions is much much faster.
Also, although JHtmlString::truncate()
has an option to avoid splitting text in the middle of words, it was not used here. It should be used. That only affects the end of the description though, we also need to do a little extra to avoid splitting words at the beginning of the string.
One more thing. The regex that was being used was being created by simply dropping the search term into the middle of a regex string. It was not escaped with preg_quote
first. I don't have a concrete example of how to exploit this do something weird but it's clearly not proper. Anyway, this change remove regular expressions so that's irrelevant anyway.
@test almost ok
I can reproduce your findings and the improvements when applying your patch. When I search for only one word or select a phrase from autocompletion then I get the following notice:
Notice: String offset cast occurred in C:\xampp\htdocs\joomla33_test\components\com_finder\views\search\tmpl\default_result.php on line 31
When I search for two or more words the result page is "clean".
Sorry I can´t reproduce it anymore. I played around with different combinations of the settings and now the notice is gone and I haven´t noted my previous settings.
You're gonna want to get this tested and merged. It fixes a serious problem in the 'Smart Search' listing. I'm surprised more people haven't noticed their pages crashing.
Maybe @dgt41, @smanzi or @anibalsanchez can have a look at this patch. I could easly reproduce the issue with the data of one of my clients. In this database I have approximately 400 very long articles. I wasn´t able to reproduce the problem with the "test data sample" offered by joomla install.
@okonomiyaki3000 Hi! I'm the old 'smanzi' and I changed GitHub account since a couple of days: thanks for calling me in!
Performance wise this PR is a major improvement!
In my test environment (with a very small "corpus" of articles and a slow PC!) time to visualize results has decreased from 10 seconds to 2 seconds, consistently.
IMHO this a PR we should not miss.
Only thing I don't totally agree with is some "extra" optimization steps you took like e.g.: moving some logic down to the HTML part of things: while it is absolutely true that you can spare some extra CPU cycle, on the other this will make the code less easy to port this view to JLayouts if and when this decision will be taken.
In any case this is an @test success!
@smz I guess you're talking about $base
. Well, actually, the best thing would be to get it out of this file altogether as this template is being called in a loop and $base
is going to be the same for every iteration. The best thing would be to generate it once and reuse it although I realize that, as optimizations go, that's an extremely minor one
Agreed!
Why don't you do that? Is there any reason why $base
should remain here and be worthlessly computed for each result? Ghosts of B/C?
Because then the number of files modified by this PR would double. Lame excuse?
I'm inclined to agree with that too...
... see, what happens is that once I start, I can't stop.
eh, I know how it goes... (while honing last bits of CSS on one of my sites...)
eh, I know how it goes... (while honing last bits of CSS on one of my sites...)
ready for 2nd test?
I expect it's fine now.
As a side note, I use a linter that shows me phpcs rule violations in real time so it's quite obvious to me that our rules are not very suitable for template files. For stuff like:
<?php if ($x) : ?>
<?php doSomething(); ?>
<?php endif; ?>
We will always get errors on the first and last lines about "control structures" and "blank lines". Is there any PHPCS rule wizard working on this project who can fix that?
sorry about this. Just couldn't leave it alone. replace a tab with a space...
hmmm... you know I think there is a problem? I don't see highlights...
Don't panic: let me check...
OK, this is not related to your PR, but I see this code in the generated HTML:
jQuery(function ($) {
var start = document.getElementById('highlighter-start');
var end = document.getElementById('highlighter-end');
if (!start || !end || !Joomla.Highlighter) {
return true;
}
highlighter = new Joomla.Highlighter({
startElement: start,
endElement: end,
className: 'highlight',
onlyWords: false,
tag: 'span'
}).highlight(["dicet"]);
$(start).remove();
$(end).remove();
});
And I can't find where this is generated... (I think it miss a (document).ready
)
got it: behavior.php
Nope, it's using the shorthand ready function:
jQuery(function ($) {});
yeah... but then there isn't actually any jQuery inside of it (if I'm not wrong: total JS noob here...)
I think we should open a separate issue for this...
I'm afraid it's the damn MT missing...
Nope... not MT: giving up and opening an issue, @ò##$!!
is MT out now? This does require it but it can be rewritten so that it doesn't.
I guess it doesn't really require MT but it uses Array.map which is not available before IE9. Of course all the good browsers are fine.
Anyway, it seems to work OK for me. I see a few minor problems in the script (missing semicolons, etc.) but it still works. For me anyway.
@okonomiyaki3000 You probably don't have the latest staging, where some mods were introduced: that's why it worked for you...
Great job. About the jQuery ready function, this:
jQuery(function ($) {});
is equivalent to:
jQuery(document).ready(function ($) {});
and, in each case, the argument ($
) passed to the callback is jQuery. So, inside that function you can $ instead of jQuery and it will work even in no conflict mode.
@(re)test works!
Status | Pending | ⇒ | Ready to Commit |
@test: Works as expected, a huge improvement. Well done !!
Moving to RTC as there are multiple good tests.
Status | Ready to Commit | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2014-12-17 06:00:22 |
Can you please add testing instructions? From the file you touch it is related to the item description in the Smart Search result page.