What happened: Indexer encountered a duplicate entry and terminated
Version: 5.4
Expected result: Ignore the duplicate and proceed
Actual result: Index creation terminated
I've been hitting this bug on Joomla 5.4.5 and traced it to the root cause, which differs from what the previous PRs addressed.
Root cause
The GROUP BY in the INSERT into #__finder_terms is over-specified — it includes ta.term_weight, ta.phrase, ta.stem, and ta.common. Since the aggregation loop runs once per indexing context (title, body, metadata, etc.) with a different term_weight each time, a word appearing in both title and body generates two rows in tokens_aggregate with the same (term, language) but different term_weight. The over-specified GROUP BY treats these as distinct rows, and the INSERT attempts to write both — violating the idx_term_language unique key.
This is intra-article, so it happens even with a single article. That's why PR #47472 (LEFT JOIN + IS NULL) didn't fully fix it: neither row exists yet in finder_terms at INSERT time, so both pass the IS NULL filter.
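The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Joomla code: it uses Python's sqlite3 with a simplified, hypothetical two-table schema standing in for #__finder_terms and #__finder_tokens_aggregate, omits SOUNDEX (not available in a default SQLite build), and uses illustrative weights (1.7 for the title context, 0.7 for the body). The LEFT JOIN ... IS NULL filter is a simplified stand-in for the approach described for PR #47472.

```python
import sqlite3

def build_db():
    """Create a simplified stand-in schema (assumption, not Joomla's real DDL)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE finder_terms (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            weight REAL, language TEXT,
            UNIQUE (term, language)   -- stands in for idx_term_language
        );
        CREATE TABLE finder_tokens_aggregate (
            term_id INTEGER, term TEXT, stem TEXT, common INTEGER,
            phrase INTEGER, term_weight REAL, language TEXT
        );
    """)
    # One article: the same word aggregated once for the title context
    # and once for the body context, with different (illustrative) weights.
    db.executemany(
        "INSERT INTO finder_tokens_aggregate VALUES (0, ?, ?, 0, 0, ?, ?)",
        [("canción", "cancion", 1.7, "es"), ("canción", "cancion", 0.7, "es")],
    )
    return db

# Over-specified GROUP BY: term_weight is part of the grouping key, so the
# two contexts produce two groups for the same (term, language) pair.
OVER_SPECIFIED = """
    INSERT INTO finder_terms (term, stem, common, phrase, weight, language)
    SELECT ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, ta.language
    FROM finder_tokens_aggregate AS ta
    LEFT JOIN finder_terms AS t
           ON t.term = ta.term AND t.language = ta.language
    WHERE ta.term_id = 0 AND t.term IS NULL
    GROUP BY ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, ta.language
"""

def reproduce():
    """Return the constraint error message, or None if the INSERT succeeds."""
    db = build_db()
    try:
        db.execute(OVER_SPECIFIED)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None

if __name__ == "__main__":
    print(reproduce())
```

Both groups pass the IS NULL filter because finder_terms is still empty when the SELECT is evaluated, so the second row collides with the first on the unique key. A single article is enough to trigger it.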
The fix
Reduce GROUP BY to (term, language) only — the exact columns of the unique index — and use MIN() for all other columns:
- ' SELECT ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, SOUNDEX(ta.term), ta.language' .
+ ' SELECT ta.term, MIN(ta.stem), MIN(ta.common), MIN(ta.phrase), MIN(ta.term_weight), SOUNDEX(ta.term), ta.language' .
' FROM ' . $db->quoteName('#__finder_tokens_aggregate') . ' AS ta' .
' WHERE ta.term_id = 0' .
- ' GROUP BY ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, SOUNDEX(ta.term), ta.language'
+ ' GROUP BY ta.term, ta.language'
This is standard SQL: no INSERT IGNORE, fully compatible with PostgreSQL.
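To show the effect of the reduced GROUP BY on its own, here is a small sketch using Python's sqlite3. The schema is a simplified, hypothetical stand-in for the Finder tables, the weights are illustrative, and SOUNDEX is left out because a default SQLite build does not provide it. It is meant only to illustrate the grouping behaviour, not to mirror the Joomla indexer.

```python
import sqlite3

# Grouping on exactly the unique-key columns yields at most one row per
# (term, language); MIN() picks a single value for every other column.
REDUCED = """
    INSERT INTO finder_terms (term, stem, common, phrase, weight, language)
    SELECT ta.term, MIN(ta.stem), MIN(ta.common), MIN(ta.phrase),
           MIN(ta.term_weight), ta.language
    FROM finder_tokens_aggregate AS ta
    WHERE ta.term_id = 0
    GROUP BY ta.term, ta.language
"""

def insert_terms():
    """Run the reduced-GROUP BY insert on sample data and return the result rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE finder_terms (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            weight REAL, language TEXT,
            UNIQUE (term, language)   -- stands in for idx_term_language
        );
        CREATE TABLE finder_tokens_aggregate (
            term_id INTEGER, term TEXT, stem TEXT, common INTEGER,
            phrase INTEGER, term_weight REAL, language TEXT
        );
    """)
    # "canción" appears in two contexts (title and body); "playa" in one.
    db.executemany(
        "INSERT INTO finder_tokens_aggregate VALUES (0, ?, ?, 0, 0, ?, ?)",
        [
            ("canción", "cancion", 1.7, "es"),
            ("canción", "cancion", 0.7, "es"),
            ("playa", "playa", 0.7, "es"),
        ],
    )
    db.execute(REDUCED)
    return db.execute(
        "SELECT term, language, weight FROM finder_terms ORDER BY term"
    ).fetchall()

if __name__ == "__main__":
    for row in insert_terms():
        print(row)
```

In this sketch the duplicated word collapses to one row, and MIN() keeps the lower of its two weights (0.7 here). Whether the minimum is the desired weight for a term that appears in several contexts is a design choice of the proposed patch, not something this example settles.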
Verified in production on Joomla 5.4.5 / MySQL at colon.com.uy. Full reindex completes without errors on a site with hundreds of articles containing accented Spanish words.
Full analysis and patched file: https://github.com/dariofin/joomla-finder-duplicate-fix
The suggested fix would not work with PostgreSQL.
"INSERT IGNORE" is MySQL and MariaDB syntax; on PostgreSQL it causes an SQL syntax error. Since Joomla also supports PostgreSQL, this PR cannot be accepted as it is.
See https://www.postgresql.org/docs/current/sql-insert.html
Besides that, I don’t think it would be the right fix. It would circumvent the issue but not fix the root cause.