? Success

User tests: Successful: Unsuccessful:

avatar okonomiyaki3000
okonomiyaki3000
21 Nov 2014

Using preg_match here can be very very slow and sometimes even crash the page. Anyway, we don't need a regular expression for this simple task.

Votes

# of Users Experiencing Issue
0/1
Average Importance Score
3.00

avatar okonomiyaki3000 okonomiyaki3000 - open - 21 Nov 2014
avatar jissues-bot jissues-bot - change - 21 Nov 2014
Labels Added: ?
avatar Bakual
Bakual - comment - 21 Nov 2014

Can you please add testing instructions? From the file you touch it is related to the item description in the Smart Search result page.

avatar okonomiyaki3000
okonomiyaki3000 - comment - 21 Nov 2014

This change affects the results page of the smart search. It includes one major 'under the hood' kind of improvement and one minor change visible to users.

Testing

1) Configure a smart search results page. Make sure it's configured to display description.
2) Run a search. Preferably one that returns many large results.

Before

1) The page will probably load very slowly. It might even crash.
2) If the page does load, you will probably notice that some of the result descriptions start and/or end mid-word.

After

1) The page should load much faster. It should not crash.
2) Except in very rare cases, descriptions will contain full words.

What?

The original code is using preg_match on the description of each and every search result. This is a notoriously slow function and seems to be especially slow when run against a long string. The power of regex is not needed here at all. We only need to find a simple string, and a certain number of characters around it. Using the regular string functions is much much faster.

Also, although JHtmlString::truncate() has an option to avoid splitting text in the middle of words, it was not used here. It should be used. That only affects the end of the description though, we also need to do a little extra to avoid splitting words at the beginning of the string.

One more thing. The regex that was being used was being created by simply dropping the search term into the middle of a regex string. It was not escaped with preg_quote first. I don't have a concrete example of how to exploit this do something weird but it's clearly not proper. Anyway, this change remove regular expressions so that's irrelevant anyway.

avatar waader
waader - comment - 23 Nov 2014

@test almost ok

I can reproduce your findings and the improvements when applying your patch. When I search for only one word or select a phrase from autocompletion then I get the following notice:

Notice: String offset cast occurred in C:\xampp\htdocs\joomla33_test\components\com_finder\views\search\tmpl\default_result.php on line 31

When I search for two or more words the result page is "clean".

avatar okonomiyaki3000
okonomiyaki3000 - comment - 25 Nov 2014

@waader I couldn't duplicate the issue but I could rework the code a little to remove any use of string offsets. If you still see this notice, please let me know what search terms you're using and also the 'description' text of the items where this occurs.

avatar waader
waader - comment - 26 Nov 2014

Sorry I can´t reproduce it anymore. I played around with different combinations of the settings and now the notice is gone and I haven´t noted my previous settings.

avatar okonomiyaki3000
okonomiyaki3000 - comment - 8 Dec 2014

:pineapple:

avatar okonomiyaki3000
okonomiyaki3000 - comment - 15 Dec 2014

You're gonna want to get this tested and merged. It fixes a serious problem in the 'Smart Search' listing. I'm surprised more people haven't noticed their pages crashing.

avatar waader
waader - comment - 15 Dec 2014

Maybe @dgt41, @smanzi or @anibalsanchez can have a look at this patch. I could easly reproduce the issue with the data of one of my clients. In this database I have approximately 400 very long articles. I wasn´t able to reproduce the problem with the "test data sample" offered by joomla install.

avatar smz
smz - comment - 16 Dec 2014

@okonomiyaki3000 Hi! I'm the old 'smanzi' and I changed GitHub account since a couple of days: thanks for calling me in!

Performance wise this PR is a major improvement!

In my test environment (with a very small "corpus" of articles and a slow PC!) time to visualize results has decreased from 10 seconds to 2 seconds, consistently.

IMHO this a PR we should not miss.

Only thing I don't totally agree with is some "extra" optimization steps you took like e.g.: moving some logic down to the HTML part of things: while it is absolutely true that you can spare some extra CPU cycle, on the other this will make the code less easy to port this view to JLayouts if and when this decision will be taken.

In any case this is an @test success!

avatar smz smz - test_item - 16 Dec 2014 - Tested successfully
avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

@smz I guess you're talking about $base. Well, actually, the best thing would be to get it out of this file altogether as this template is being called in a loop and $base is going to be the same for every iteration. The best thing would be to generate it once and reuse it although I realize that, as optimizations go, that's an extremely minor one

avatar smz
smz - comment - 16 Dec 2014

Agreed! :smile:

avatar smz
smz - comment - 16 Dec 2014

Why don't you do that? Is there any reason why $base should remain here and be worthlessly computed for each result? Ghosts of B/C?

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

Because then the number of files modified by this PR would double. Lame excuse?

avatar smz
smz - comment - 16 Dec 2014

I'm inclined to agree with that too... :stuck_out_tongue_winking_eye:

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

... see, what happens is that once I start, I can't stop.

avatar smz
smz - comment - 16 Dec 2014

eh, I know how it goes... (while honing last bits of CSS on one of my sites...)

avatar smz
smz - comment - 16 Dec 2014

eh, I know how it goes... (while honing last bits of CSS on one of my sites...)

avatar smz
smz - comment - 16 Dec 2014

ready for 2nd test?

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

I expect it's fine now.

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

As a side note, I use a linter that shows me phpcs rule violations in real time so it's quite obvious to me that our rules are not very suitable for template files. For stuff like:

<?php if ($x) : ?>
    <?php doSomething(); ?>
<?php endif; ?>

We will always get errors on the first and last lines about "control structures" and "blank lines". Is there any PHPCS rule wizard working on this project who can fix that?

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

sorry about this. Just couldn't leave it alone. replace a tab with a space...

avatar smz
smz - comment - 16 Dec 2014

hmmm... you know I think there is a problem? I don't see highlights...
Don't panic: let me check...

avatar smz
smz - comment - 16 Dec 2014

No, I don't see highlights in the descriptions even without this PR: weird... I could had sworn they were there some time ago...

And my old Russian (CCCP) chronometer says that maybe you have cut a couple of tenth of a seconds more! :+1:

@test success (confirmed)

avatar smz
smz - comment - 16 Dec 2014

... actually there is an highlighting JS in the result page, but it doesn't seems to work: I think we need @dgt41 here: Dimiiiiiiiitriiiiiis!!!! :smile:

avatar smz
smz - comment - 16 Dec 2014

OK, this is not related to your PR, but I see this code in the generated HTML:

jQuery(function ($) {
    var start = document.getElementById('highlighter-start');
    var end = document.getElementById('highlighter-end');
    if (!start || !end || !Joomla.Highlighter) {
        return true;
    }
    highlighter = new Joomla.Highlighter({
        startElement: start,
        endElement: end,
        className: 'highlight',
        onlyWords: false,
        tag: 'span'
    }).highlight(["dicet"]);
    $(start).remove();
    $(end).remove();
});

And I can't find where this is generated... (I think it miss a (document).ready)

avatar smz
smz - comment - 16 Dec 2014

got it: behavior.php

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

Nope, it's using the shorthand ready function:

jQuery(function ($) {});
avatar smz
smz - comment - 16 Dec 2014

yeah... but then there isn't actually any jQuery inside of it (if I'm not wrong: total JS noob here...)

avatar smz
smz - comment - 16 Dec 2014

I think we should open a separate issue for this...

avatar smz
smz - comment - 16 Dec 2014

I'm afraid it's the damn MT missing...

avatar smz
smz - comment - 16 Dec 2014

Nope... not MT: giving up and opening an issue, @ò##$!!

avatar smz
smz - comment - 16 Dec 2014

Opened #5438

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

is MT out now? This does require it but it can be rewritten so that it doesn't.

avatar smz
smz - comment - 16 Dec 2014

It's being kicked out... @dgt41 did wonders in that sense and removed it from wherever possible (in back-end), but in this case I don't think it's a MT issue: I tried to force load it and got the same results...

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

I guess it doesn't really require MT but it uses Array.map which is not available before IE9. Of course all the good browsers are fine.

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

Anyway, it seems to work OK for me. I see a few minor problems in the script (missing semicolons, etc.) but it still works. For me anyway.

avatar smz
smz - comment - 16 Dec 2014

FF 34 here! Strange thing, if you click on article from the results page, there (in the article) the search term is highlighted... Now I'll compare the two... but lets' move discussion about this in #5438, to not litter your PR with unrelated stuff (my guilt, I know...)

avatar smz
smz - comment - 16 Dec 2014

Fixed: PR is #5439

avatar smz
smz - comment - 16 Dec 2014

@okonomiyaki3000 You probably don't have the latest staging, where some mods were introduced: that's why it worked for you...

avatar okonomiyaki3000
okonomiyaki3000 - comment - 16 Dec 2014

Great job. About the jQuery ready function, this:

jQuery(function ($) {});

is equivalent to:

jQuery(document).ready(function ($) {});

and, in each case, the argument ($) passed to the callback is jQuery. So, inside that function you can $ instead of jQuery and it will work even in no conflict mode.

avatar waader
waader - comment - 16 Dec 2014

@(re)test works!

avatar waader waader - test_item - 16 Dec 2014 - Tested successfully
avatar dgt41
dgt41 - comment - 16 Dec 2014

@test success

avatar anibalsanchez
anibalsanchez - comment - 16 Dec 2014

@test success, smart search results are Ok

avatar anibalsanchez anibalsanchez - test_item - 16 Dec 2014 - Tested successfully
avatar roland-d roland-d - change - 17 Dec 2014
Status Pending Ready to Commit
avatar roland-d
roland-d - comment - 17 Dec 2014

@test: Works as expected, a huge improvement. Well done !!

Moving to RTC as there are multiple good tests.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/5154.
avatar roland-d roland-d - change - 17 Dec 2014
Status Ready to Commit Closed
Closed_Date 0000-00-00 00:00:00 2014-12-17 06:00:22
avatar roland-d roland-d - close - 17 Dec 2014

Add a Comment

Login with GitHub to post a comment