No Code Attached Yet bug
mtense
22 Mar 2026

What happened?

Indexer encountered a duplicate entry and terminated

  • Steps to reproduce: index any content where two items share a normalised term (e.g. accented name like "Sueño Stereo" that strips to a term already
  • The exact error: Duplicate entry 'X-*' for key 'idx_term_language'
  • File: administrator/components/com_finder/src/Indexer/Indexer.php ~line 524
  • Fix: change INSERT INTO to INSERT IGNORE INTO

Version

5.4

Expected result

Ignore the duplicate and proceed

Actual result

Index creation terminated

System Information

No response

Additional Comments

richard67 - comment - 23 Mar 2026

The suggested fix would not work with PostgreSQL.

"INSERT IGNORE" is MySQL and MariaDB syntax. On PostgreSQL it causes an SQL syntax error. As Joomla also supports PostgreSQL, this fix cannot be accepted as it is.

See https://www.postgresql.org/docs/current/sql-insert.html

Besides that, I don’t think it would be the right fix. It would circumvent the issue but not fix the root cause.

muhme - comment - 27 Mar 2026

@mtense I was not able to reproduce the issue on a multilingual site with multiple articles in one language containing Cafè and Cafe. Components > Smart Search > Index runs without error on the 5.4-dev branch. Could you give more details? Am I missing something?

exlemor - comment - 27 Mar 2026

Hi @mtense, sadly I cannot reproduce your error. Could you please record a screen capture video so that we can see exactly what you are seeing? Thank you.

dariofin - comment - 25 Apr 2026

I've been hitting this bug on Joomla 5.4.5 and traced it to the root cause, which differs from what the previous PRs addressed.

Root cause

The GROUP BY in the INSERT into #__finder_terms is over-specified — it includes ta.term_weight, ta.phrase, ta.stem, and ta.common. Since the aggregation loop runs once per indexing context (title, body, metadata, etc.) with a different term_weight each time, a word appearing in both title and body generates two rows in tokens_aggregate with the same (term, language) but different term_weight. The over-specified GROUP BY treats these as distinct rows, and the INSERT attempts to write both — violating the idx_term_language unique key.

This is intra-article, so it happens even with a single article. That's why PR #47472 (LEFT JOIN + IS NULL) didn't fully fix it: neither row exists yet in finder_terms at INSERT time, so both pass the IS NULL filter.
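
The mechanism described above can be reproduced outside Joomla. What follows is a minimal sketch in Python with an in-memory SQLite database, not Joomla's code. The tables `tokens_aggregate` and `terms` are simplified stand-ins for #__finder_tokens_aggregate and #__finder_terms, the weights 1.7 and 0.7 are made-up values for a title and a body context, and the LEFT JOIN + IS NULL filter is only an assumed approximation of what PR #47472 does.

```python
import sqlite3


def reproduce_duplicate():
    """Return (rows produced by the over-specified GROUP BY, error text or None)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE tokens_aggregate (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            term_weight REAL, language TEXT, term_id INTEGER DEFAULT 0
        );
        CREATE TABLE terms (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            weight REAL, language TEXT
        );
        CREATE UNIQUE INDEX idx_term_language ON terms (term, language);
        """
    )
    # One article: the same term seen in the title and in the body,
    # each context contributing a different (hypothetical) weight.
    db.executemany(
        "INSERT INTO tokens_aggregate"
        " (term, stem, common, phrase, term_weight, language)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        [("sueno", "sueno", 0, 0, 1.7, "*"), ("sueno", "sueno", 0, 0, 0.7, "*")],
    )
    select = """
        SELECT ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, ta.language
        FROM tokens_aggregate AS ta
        LEFT JOIN terms AS t ON t.term = ta.term AND t.language = ta.language
        WHERE ta.term_id = 0 AND t.term IS NULL
        GROUP BY ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, ta.language
    """
    # term_weight is part of the GROUP BY, so the two contexts stay separate rows,
    # and neither is filtered out because `terms` is still empty.
    grouped = db.execute(select).fetchall()
    try:
        db.execute(
            "INSERT INTO terms (term, stem, common, phrase, weight, language)" + select
        )
        error = None
    except sqlite3.IntegrityError as exc:
        # The second row collides with the first on (term, language).
        error = str(exc)
    return len(grouped), error


if __name__ == "__main__":
    print(reproduce_duplicate())
```

The point of the sketch is that a single article is enough: both rows come from the same aggregation pass, so a check against rows that already exist in the terms table cannot separate them.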

The fix

Reduce GROUP BY to (term, language) only — the exact columns of the unique index — and use MIN() for all other columns:

- ' SELECT ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, SOUNDEX(ta.term), ta.language' .
+ ' SELECT ta.term, MIN(ta.stem), MIN(ta.common), MIN(ta.phrase), MIN(ta.term_weight), SOUNDEX(ta.term), ta.language' .
  ' FROM ' . $db->quoteName('#__finder_tokens_aggregate') . ' AS ta' .
  ' WHERE ta.term_id = 0' .
- ' GROUP BY ta.term, ta.stem, ta.common, ta.phrase, ta.term_weight, SOUNDEX(ta.term), ta.language'
+ ' GROUP BY ta.term, ta.language'

This is standard SQL — no INSERT IGNORE, fully compatible with PostgreSQL.
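
The effect of the reduced GROUP BY can be checked with a similar stand-alone sketch, again in Python with in-memory SQLite and the same simplified stand-in tables and made-up weights. SOUNDEX() is left out because stock SQLite builds do not include it; this is a limitation of the sketch, not of the patch.

```python
import sqlite3


def insert_with_reduced_group_by():
    """Insert aggregated tokens grouped only by the unique-key columns."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE tokens_aggregate (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            term_weight REAL, language TEXT, term_id INTEGER DEFAULT 0
        );
        CREATE TABLE terms (
            term TEXT, stem TEXT, common INTEGER, phrase INTEGER,
            weight REAL, language TEXT
        );
        CREATE UNIQUE INDEX idx_term_language ON terms (term, language);
        """
    )
    # Same term from two contexts of one article, with different weights.
    db.executemany(
        "INSERT INTO tokens_aggregate"
        " (term, stem, common, phrase, term_weight, language)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        [("sueno", "sueno", 0, 0, 1.7, "*"), ("sueno", "sueno", 0, 0, 0.7, "*")],
    )
    # GROUP BY matches the unique index, so each (term, language) pair
    # yields exactly one row; MIN() collapses the remaining columns.
    db.execute(
        """
        INSERT INTO terms (term, stem, common, phrase, weight, language)
        SELECT ta.term, MIN(ta.stem), MIN(ta.common), MIN(ta.phrase),
               MIN(ta.term_weight), ta.language
        FROM tokens_aggregate AS ta
        WHERE ta.term_id = 0
        GROUP BY ta.term, ta.language
        """
    )
    return db.execute("SELECT term, language, weight FROM terms").fetchall()


if __name__ == "__main__":
    print(insert_with_reduced_group_by())
```

With MIN() the stored weight is the smaller of the two context weights (0.7 in this sketch). Whether the minimum is the right value for the stored weight is a separate question from the duplicate-key error.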

Verified in production on Joomla 5.4.5 / MySQL at colon.com.uy. Full reindex completes without errors on a site with hundreds of articles containing accented Spanish words.

Full analysis and patched file: https://github.com/dariofin/joomla-finder-duplicate-fix
