Sorry, I hit return by mistake before the initial comment was entered. As Jissues doesn't pick up modifications made on GitHub, I'm forced to repeat it here:
This fixes the second of the issues I reported in #5204.
The problem was the use of PHP's strip_tags().
With it, a string like <h1>Title</h1><p>This is the paragraph</p> is converted to:
TitleThis is the paragraph
This probably fixes your immediate issue, but there are other issues with the parser that also need resolving. A while back I wrote some untested code that I think should fix the parser properly; I'll try to dig it out.
@chrisdavenport I know and I agree! This thing indexes even things like "itemid"!!! If you have better code ready, please feel free to submit it and I'll promptly retract this PR. In the meantime, I think this is a small step forward, as it indexes something that makes more sense...
Adding a space after each > before calling strip_tags() does the trick: Title and This are no longer run together.
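For anyone wanting to reproduce this, here is a minimal sketch of the behaviour and of the workaround. It is not the code from the PR: the helper name stripTagsSpaced() is hypothetical, and collapsing the leftover whitespace is an extra step assumed here for tidier output.

```php
<?php
// Hypothetical helper illustrating the workaround; not the actual PR code.
function stripTagsSpaced(string $html): string
{
    // Put a space after every '>' so text from adjacent elements stays separated.
    $spaced = str_replace('>', '> ', $html);

    // Remove the tags themselves.
    $text = strip_tags($spaced);

    // Assumption: collapse the extra whitespace introduced above and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$html = '<h1>Title</h1><p>This is the paragraph</p>';

echo strip_tags($html), "\n";      // TitleThis is the paragraph
echo stripTagsSpaced($html), "\n"; // Title This is the paragraph
```

A literal > inside the text content also gets a space added after it, which the whitespace collapse mostly hides, but this remains a stopgap rather than a proper parser fix.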