chrisdavenport
13 Oct 2016

Pull Request for Issue #12408.

Summary of Changes

Strip head first because head may contain script or style blocks.
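
For illustration only, here is a minimal sketch of why the order matters. It is written in Python rather than the PHP used by com_finder, and `strip_tags` and `strip_head` are hypothetical helpers, not the actual parser code. A naive tag stripper removes the `<style>` tag itself but leaves its contents behind as text, so the head has to go first.

```python
import re


def strip_tags(html: str) -> str:
    """Naive tag stripper: replaces anything that looks like <...> with a space."""
    return re.sub(r"<[^>]*>", " ", html)


def strip_head(html: str) -> str:
    """Remove the whole <head>...</head> block, contents included."""
    return re.sub(
        r"<head\b[^>]*>.*?</head>", " ", html, flags=re.IGNORECASE | re.DOTALL
    )


page = (
    "<html><head><style>.col2 { float: left; }</style></head>"
    "<body><p>Joomla released</p></body></html>"
)

# Stripping tags alone keeps the CSS text that sat inside <style>.
tags_only = strip_tags(page).split()

# Removing the head block first leaves only the body text.
head_first = strip_tags(strip_head(page)).split()
```

In this sketch `tags_only` still contains the CSS selector `.col2`, while `head_first` contains only the words from the body.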

Testing Instructions

Copy-paste an entire web page with a head block that contains inline CSS and/or JavaScript into a new article. You might see CSS or JS terms in the autosuggester. If you don't, try another web page until you do. You can try this page: https://www.joomla.org/announcements/release-news/5418-joomla-254-released.html then search for "col2".

Apply this PR.

Copy-paste the same article again so it gets re-indexed. The CSS/JS terms should no longer appear in the autosuggester.

Documentation Changes Required

None.

chrisdavenport - open - 13 Oct 2016
chrisdavenport - change - 13 Oct 2016
Status: New → Pending
joomla-cms-bot - change - 13 Oct 2016
Labels Added: ?
joomla-cms-bot - change - 13 Oct 2016
Category: Administration Components
alikon - comment - 23 Oct 2016

I followed the test instructions, but "col2" is still present after applying the patch.

I purged the index before copying the same page in again.

chrisdavenport - comment - 23 Oct 2016

@alikon Can you send me a copy of the article that is causing this?

chrisdavenport - comment - 23 Oct 2016

The problem went deeper than I originally thought.

In order to keep memory use down, the indexer::tokenizeToDb method was chunking the input string before handing over to the parser. The parser was also chunking, which although redundant, was, I thought, at least harmless. However, it turns out that the indexer chunking process was, on certain input strings, placing chunk boundaries inside HTML tags. The HTML parser was then not recognising them as being tags and was generating tokens for HTML attributes which should have been removed.

The fix is to understand that the format-dependent (in this case, HTML) parser must always pre-process the input before chunking can take place.
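
To make the failure mode concrete, here is a rough sketch in Python (not the PHP from com_finder). The helper names and the chunk size are assumptions chosen for the demonstration and do not reflect how indexer::tokenizeToDb actually splits its input. It compares chunking before parsing with parsing before chunking.

```python
import re


def strip_tags(text: str) -> str:
    """Naive stand-in for the HTML parser: only complete <...> tags are removed."""
    return re.sub(r"<[^>]*>", " ", text)


def chunk_on_whitespace(text: str, size: int) -> list[str]:
    """Split text into pieces of at most `size` characters where possible,
    breaking only at whitespace (a hypothetical chunking strategy)."""
    pieces, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate) > size:
            pieces.append(current)
            current = word
        else:
            current = candidate
    if current:
        pieces.append(current)
    return pieces


html = '<p class="col2 wide">Joomla released</p>'

# Chunk first, then parse each chunk: a boundary lands inside the <p ...> tag,
# so the parser never sees a complete tag and the attribute text survives.
chunk_first = []
for piece in chunk_on_whitespace(html, 12):
    chunk_first.extend(strip_tags(piece).split())

# Parse first, then chunk: the tag is removed whole before any splitting.
parse_first = []
for piece in chunk_on_whitespace(strip_tags(html), 12):
    parse_first.extend(piece.split())
```

With these inputs, `chunk_first` ends up with fragments of the tag such as `class="col2`, whereas `parse_first` holds only `Joomla` and `released`.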

What I've now done is remove chunking from the indexer and also simplify the code a little, removing some duplication and reducing indentation by dropping some else clauses.

@alikon Please re-test and also make sure that everything that should be indexed is still being indexed. Thank you.

alikon - comment - 25 Oct 2016

I tested this before the 3.7.x branch was merged into staging.
Well done, Chris.

alikon - test_item - 25 Oct 2016 - Tested successfully
alikon - comment - 25 Oct 2016

I have tested this item successfully on ef83bfb


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/12411.

brianteeman - comment - 6 Dec 2016

I have tested this item successfully on ef83bfb

Thanks

brianteeman - test_item - 6 Dec 2016 - Tested successfully
jeckodevelopment - change - 6 Dec 2016
The description was changed
Status: Pending → Ready to Commit
jeckodevelopment - comment - 6 Dec 2016

RTC

jeckodevelopment - change - 6 Dec 2016
Labels Removed: ?
jeckodevelopment - edited - 6 Dec 2016
joomla-cms-bot - change - 6 Dec 2016
Category: Administration Components → Administration com_finder Components
brianteeman - change - 6 Dec 2016
Milestone Added:
rdeutz - comment - 9 Dec 2016

@chrisdavenport Can you have a look at the conflicts? Thanks.

chrisdavenport - comment - 9 Dec 2016

You should test again, just in case I screwed something up.

rdeutz - reference | d78ff06 - 13 Dec 16
rdeutz - merge - 13 Dec 2016
rdeutz - close - 13 Dec 2016
rdeutz - change - 13 Dec 2016
Status: Ready to Commit → Fixed in Code Base
Closed_Date: 0000-00-00 00:00:00 → 2016-12-13 12:19:19
Closed_By: rdeutz
chrisdavenport - head_ref_deleted - 19 Dec 2016
cpfeifer - reference | fb3e6ba - 22 Dec 16