Smart Search splits the text into single words and indexes those. In Chinese, each character is treated as a word of its own, so we can't just split on whitespace. For that, we have additional tokenisation code for Chinese content, which unfortunately has been broken up to now.
The strategy was to collect a list of all Chinese characters in the term, remove them from the term, and append each Chinese character to the end of the term list as a new term. If a term is empty after all Chinese characters have been removed (because it contained only those characters and no digits or Latin characters), that term has to be removed from the list.
The problem was that a character can occur more than once in the input, but the first replacement removed all of its occurrences at once. The list of to-be-replaced characters, however, contained one entry per occurrence in the input term, so when the input term ran empty before the whole list was processed, this threw a notice.
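The failure mode described above can be modelled in a few lines. This is a simplified Python sketch, not the actual PHP code from Indexer.php: the function name `split_term_old`, the dict keyed like a PHP array, and the use of the CJK Unified Ideographs block as the definition of "Chinese character" are all assumptions made for illustration.

```python
import re

# Assumption: "Chinese character" means the CJK Unified Ideographs block.
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")


def split_term_old(term):
    """Simplified model of the old strategy (hypothetical, not the Joomla code)."""
    terms = {0: term}  # keyed like a PHP array; key 0 holds the input term
    next_key = 1
    # findall returns one entry per occurrence, so duplicates stay in the list.
    for ch in CJK_CHAR.findall(term):
        # replace() strips every occurrence of ch at once. If key 0 was
        # already deleted, this lookup fails (KeyError here, standing in
        # for the "Undefined offset" notice).
        terms[0] = terms[0].replace(ch, "")
        terms[next_key] = ch
        next_key += 1
        if terms[0] == "":
            del terms[0]  # the term ran empty, so drop it from the list
    return list(terms.values())
```

With distinct characters, such as `split_term_old("标签")`, the model works. With a repeated character, such as `"标签标"`, the term is already empty and removed after the second character, and the third entry in the list triggers the lookup failure.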
This PR tries to fix that. In a first attempt, I replaced all characters at once and then added all matches as new terms at the end. I wrote lots of comments and it took quite some work, as you can see from the first commit in this PR. Then I noticed that this could be done a lot more simply. Now I just modify the input before handing it to our default tokenisation routine, adding whitespace around each Chinese character. That is a lot easier, shorter and easier to understand than the previous attempt.
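The idea of padding each Chinese character with whitespace before the normal tokenisation can be sketched as follows. Again this is an illustrative Python sketch rather than the PHP in this PR: `pad_cjk` and `tokenise` are hypothetical names, the whitespace split stands in for the default tokenisation routine, and the character range is an assumption.

```python
import re

# Assumption: "Chinese character" means the CJK Unified Ideographs block.
CJK_CHAR = re.compile(r"([\u4e00-\u9fff])")


def pad_cjk(text):
    """Surround every matched character with spaces."""
    return CJK_CHAR.sub(r" \1 ", text)


def tokenise(text):
    """Stand-in for the default tokeniser: pad, then split on whitespace."""
    # str.split() without arguments collapses runs of whitespace, so the
    # doubled spaces between neighbouring characters produce no empty terms.
    return pad_cjk(text).split()
```

Because every occurrence gets its own padding, repeated characters need no special handling: `tokenise("abc标签标 12")` yields one term per character plus the Latin and numeric terms.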
Add die; on line 633 of Indexer.php (before the return $linkId;) to abort the redirect when saving an article.
Save an article containing the text 标签印刷机在更换印刷工艺时的调试时间灵活适用于所有应用 and mark the article's language as Chinese.
Before this PR: you get a white page with a notice Undefined Offset at X.
With this PR: the page is white without any notices at all.
When you remove the die; from Indexer.php, saving works normally again.
Documentation link for docs.joomla.org:
No documentation changes for docs.joomla.org needed
Pull Request link for manual.joomla.org:
No documentation changes for manual.joomla.org needed
Thanks to @coolcat-creations for reporting this to me and helping with the debugging.
Category | ⇒ | Administration com_finder |
Status | New | ⇒ | Pending |
Labels | Added: PR-4.3-dev |
I have tested this item
It seems this also fixes the failing blog sample data installation when the backend is Chinese.
Status | Pending | ⇒ | Ready to Commit |
RTC
Labels | Added: ? |
Labels | Added: bug |
Status | Ready to Commit | ⇒ | Fixed in Code Base |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2023-08-07 19:58:33 |
Closed_By | ⇒ | obuisard |
I have tested this item ✅ successfully on ba36ce2
Thank you for the patch. I tested it: the search did not break and the article with the string was indexed.
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/41275.