No Code Attached Yet bug
stAn47
26 Sep 2019

Joomla 3.9.4, 3.9.6 and 3.9.12 tested

short:

  • alternate and canonical URL handling is a mess: one can overwrite the other because document->_links is keyed by URL without checking the link type (i.e. canonical vs alternate), which in the worst case ends up with a wrong canonical URL set by system-sef

hello,
we discovered that array_unique in HtmlDocument.php (formerly JDocumentHtml) overwrites the canonical URL with the alternate URL from the language filter plugin, so no canonical URL is shown on the page.

my problem is that the canonical and alternate URLs are not correct when the same page is accessed through different URLs, even when the canonical is correctly set by the underlying system.

example of URLs (these are virtuemart category pages which have language variants/alternate urls):

this URL should be canonical for all the other URLs listed below and also canonical to itself:

https://domain.com/nl/exterieur/skidplates.html
-> does not show CANONICAL but shows correct ALTERNATE

https://domain.com/nl/exterieur/skidplates.html?test
-> shows correct CANONICAL, but NO ALTERNATE (i.e. language variants)

https://domain.com/nl/exterieur/bumperbeschermers////////////////////////////////.html
-> does not show CANONICAL but shows correct ALTERNATE

if any core dev here has a site which is both multilanguage and uses .html suffix, please try to see/test if those 3 urls above return proper results, which i believe should be consistent:

  • canonical URL set on all example URLs
  • alternate URLs set on all example URLs

i now:

  • removed array_unique from within DocumentHtml at line 335 which unsets canonical if alternate URLs are used
  • configured system SEF to use domain name (which further alters canonical)
  • adjusted template file which sets canonical with:

testing code:

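// Joomla 3.x, inside the VirtueMart category view template.
// $document is fetched here in case the template has not defined it yet.
$document = JFactory::getDocument();
$link = 'index.php?option=com_virtuemart&view=category';

if ($this->category->virtuemart_category_id > 0)
{
	// Site root without the trailing slash (only used by the absolute-URL variant).
	$root = rtrim(JUri::root(), '/');
	$link .= '&virtuemart_category_id=' . $this->category->virtuemart_category_id;

	// Relative SEF URL, no domain prepended.
	$url = JRoute::_($link, false);
	$document->addHeadLink($url, 'canonical');
}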

if i leave the domain to be added:

$url = $root.JRoute::_($link, false); 

system-sef sets canonical to current URL which is not correct.

also, the language filter plugin overwrites the canonical URL with the alternate URL

adjusted code in language filter plugin:

				foreach ($languages as $i => &$language)
				{
					// Skip URLs that already have a head link (e.g. a canonical), so it is not overwritten.
					if (!isset($doc->_links[$server . $language->link]))
					{
						$doc->addHeadLink($server . $language->link, 'alternate', 'rel', array('hreflang' => $i));
					}
				}

since DocumentHtml keys _links by URL, the overwrite happens even when array_unique is removed, so i think the language filter plugin should check whether the key already exists.
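
The effect of that check can be sketched outside Joomla. The snippet below is a simplified stand-in, not the plugin's code: `$links` imitates `$doc->_links` (keyed by URL), `addHeadLinkTo()` and `addAlternateIfFree()` are hypothetical helper names, and the `/de/` URL is made up.

```php
<?php
// Simplified stand-in for $doc->_links: one entry per URL (assumption, not Joomla code).
function addHeadLinkTo(array &$links, string $href, string $relation, array $attribs = []): void
{
    // A later call with the same $href replaces the earlier entry.
    $links[$href] = ['relation' => $relation, 'relType' => 'rel', 'attribs' => $attribs];
}

// Hypothetical guard: only register the alternate when the URL has no entry yet.
function addAlternateIfFree(array &$links, string $href, string $hreflang): bool
{
    if (isset($links[$href])) {
        return false;
    }

    addHeadLinkTo($links, $href, 'alternate', ['hreflang' => $hreflang]);

    return true;
}

$links = [];
$nl = 'https://domain.com/nl/exterieur/skidplates.html';
$de = 'https://domain.com/de/exterieur/skidplates.html'; // made-up language variant

addHeadLinkTo($links, $nl, 'canonical');              // set by the template
$addedDe = addAlternateIfFree($links, $de, 'de-DE');  // new URL, gets registered
$addedNl = addAlternateIfFree($links, $nl, 'nl-NL');  // same URL as canonical, skipped
```

Note that the guard only protects the canonical entry: the self-referencing alternate for the same URL is still not output.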

best regards, stan

stAn47 - open - 26 Sep 2019
joomla-cms-bot - labeled - 26 Sep 2019
stAn47 - comment - 26 Sep 2019

to get this resolved with minimal changes for a URL like /nl/test///.html which has canonical /nl/test.html

i now:

  • modified the language filter plugin not to overwrite the canonical with an alternate if one already exists
  • removed the domain from system - sef (as this inserts the current URL into the canonical, which is not correct)
  • left the application to use $root.JRoute... as this was the original code

now i get proper:

  • canonical
  • alternate URLs are now ok with /de/test.html /fr/test.html etc...
  • base URL is still current -> i am not sure if this is 100% correct

best regards, stan


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/26416.

infograf768 - comment - 26 Sep 2019

Confused. Joomla core does not provide a real canonical. The sef plugin just replaces a domain by another with the same structure.

stAn47 - comment - 27 Sep 2019

ok, the bugs are:

  1. when adding both an alternate and a canonical link with the SAME URL, the DocumentHtml class overwrites one with the other (since the key $doc->_links[$url] is the same)
  2. the joomla core SEF plugin replaces the canonical with an incorrect URL (i.e. the current one) when the canonical was set earlier with a full SEF URL including the domain
  3. due to the bug in no. 1, the language filter plugin sets alternate URLs which overwrite the canonical URLs

in short:

  • joomla core does not allow to use both alternate and canonical URLs if they are same

which is quite a limitation for a CMS, since <link rel="..."> elements are becoming quite popular in various html5 related systems.

should i explain a bit more ?

stan

brianteeman - comment - 27 Sep 2019

It is not a canonical url
It is a canonical domain

stAn47 - comment - 27 Sep 2019

hmm, i am not sure if we understand each other - we use canonical URLs to tell bots which is the preferred URL among multiple URLs which can get generated or accessed.

for example joomla's URLs can be accessed by SEF URLs or by the non-SEF URLs; on top of that you can add further parameters such as utm_source to the URLs, which must not alter the canonical or alternate URLs.

so each page should have a unique canonical URL regardless of the current URL (which can contain the utm_source for example) so that it's clear for all bots which URL is the preferred one.

alternate URLs should include all language variants including self, so if we visit a URL it should contain both the canonical URL and all variants of alternate URLs (for all languages that are linked to a menu, article, etc... )

the problem is that joomla core by design does not allow an alternate and a canonical link with the same URL within one page.

<link href="https://www.domain.com/de/exterieur/ladekantenschutz.html" rel="alternate" hreflang="de-DE" />
<link href="https://www.domain.com/fr/exterieur/protections-de-seuil-de-coffre.html" rel="alternate" hreflang="fr-FR" />
<link href="https://www.domain.com/exterior/bumper-protector.html" rel="alternate" hreflang="en-GB" />
<link href="https://www.domain.com/nl/exterieur/bumperbeschermers.html" rel="canonical" />

but should also include:

<link href="https://www.domain.com/nl/exterieur/bumperbeschermers.html" rel="alternate" hreflang="nl-NL" />

the last one is not possible by "joomla design"

stan

stAn47 - comment - 27 Sep 2019

by "joomla design" i mean using document functions:

$doc->addHeadLink($url, 'alternate', 'rel', array('hreflang' => $i));
$doc->addHeadLink($url, 'canonical', 'rel');

since the type is ignored and $url is used as the unique key in DocumentHtml, a later call overrides the previous entry for the same $url
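
A minimal sketch of that behaviour, assuming the storage works as described in this thread (an array keyed by `$url`). `MiniDocument` is a hypothetical stand-in written for illustration, not Joomla's class.

```php
<?php
// Hypothetical, stripped-down model of a document that keys head links by URL.
class MiniDocument
{
    public $_links = [];

    public function addHeadLink($href, $relation, $relType = 'rel', $attribs = [])
    {
        // The URL is the only key, so the relation type plays no part in uniqueness.
        $this->_links[$href] = ['relation' => $relation, 'relType' => $relType, 'attribs' => $attribs];

        return $this;
    }
}

$doc = new MiniDocument();
$url = 'https://www.domain.com/nl/exterieur/bumperbeschermers.html';

$doc->addHeadLink($url, 'alternate', 'rel', ['hreflang' => 'nl-NL']);
$doc->addHeadLink($url, 'canonical', 'rel');

// Only one entry is left for $url: the one from the last call.
$remaining = $doc->_links[$url]['relation'];
```

Swapping the two calls leaves only the alternate entry instead, which matches the "no canonical shown" case reported at the top.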

stan

infograf768 - comment - 27 Sep 2019

"alternate URLs should include all language variants including self,"

The languagefilter system plugin already does that by default, for core components.

The sef plugin does not create a real canonical URL, as explained to you above.

stAn47 - comment - 27 Sep 2019

yes, the problem is that language filter plugin overwrites canonical URL set by a 3rd party extension (or from within template during component dispatch) and system SEF then sets it to incorrect URL (current) overwriting the language filter's generated alternate URL.

so we cannot say "does that" since that is what i am trying to report here. whatever it does, it is not consistent with the desired outcome.

right now joomla does not support multiple <link> elements with the same URL - is this desired?

stan

infograf768 - comment - 27 Sep 2019

So, you are running a 3rd party extension. Without testing it, if it is free, hard to help you.

Now, please look again at the code we use in the language filter plugin when we want to add a x-default

// Use a custom tag because addHeadLink is limited to one URI per tag
$doc->addCustomTag('<link href="' . $server . $languages[$xdefault_language]->link . '" rel="alternate" hreflang="x-default" />');

This lets you overcome the limitation of addHeadLink.

EDIT: and once more, using the canonical produced by the sef plugin, whose only aim is to deal with another domain, is NOT a solution for canonical URLs.

ReLater - comment - 16 Aug 2021

I can confirm this issue with current 3.9.28.
The reason is, like mentioned above, that languagefilter.php and sef.php both use
$doc->addHeadLink($link...

  • Multilang page DE and EN.
  • If you enter a Site Domain in SEF that matches exactly the domain in current URL. E.g. https://example.com.
  • and open a page https://example.com/de/myproducts.
  • then
    • languagefilter.php adds correctly 2 alternate links via addHeadLink().
    • Afterwards sef.php does the same for the same URL and overrides the alternate tag.

The reason is that addHeadLink() stores its entries in $doc->_links, an array where the URLs (links) are unique array keys.
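
One conceivable way around this, sketched purely as an illustration (it is not core code and not a change proposed in this thread), would be to key the entries by relation plus URL instead of by URL alone. `MultiLinkStore` and its methods are hypothetical names.

```php
<?php
// Hypothetical store that allows several <link> elements per URL (illustration only).
class MultiLinkStore
{
    private $links = [];

    public function add(string $href, string $relation, array $attribs = []): void
    {
        // Key on relation, hreflang (if any) and URL, so canonical and alternate no longer collide.
        $key = $relation . '|' . ($attribs['hreflang'] ?? '') . '|' . $href;
        $this->links[$key] = ['href' => $href, 'relation' => $relation, 'attribs' => $attribs];
    }

    public function render(): array
    {
        $tags = [];

        foreach ($this->links as $link) {
            $tag = '<link href="' . htmlspecialchars($link['href'], ENT_QUOTES, 'UTF-8')
                . '" rel="' . $link['relation'] . '"';

            foreach ($link['attribs'] as $name => $value) {
                $tag .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
            }

            $tags[] = $tag . ' />';
        }

        return $tags;
    }
}

$store = new MultiLinkStore();
$url = 'https://example.com/de/myproducts';

$store->add($url, 'alternate', ['hreflang' => 'de-DE']); // the role languagefilter.php plays
$store->add($url, 'canonical');                          // the role sef.php plays
$tags = $store->render();
```

With this keying both tags survive for the same URL, which is the outcome asked for in this issue.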

I'm talking here about the "normal" alternate links, not the x-default one. And only about core.

Compare

jokorntheuer - comment - 12 Sep 2022

I am here to confirm that this issue still exists, in current Joomla 3.x versions as well as in Joomla 4!
As @ReLater already posted, this is a serious issue for Joomla when it comes to SEO and multi-language sites.

The solution is quite easy:

  • the self-referencing alternate lang tag should not be overwritten by the canonical link!
  • simply fix it so that every language page has a canonical link AND a self-referencing alternate lang link.

WordPress does it correctly. This is the method recommended by Google!

Read here:
https://developers.google.com/search/docs/advanced/crawling/localized-versions

I really hope that this issue will be taken seriously and gets fixed soon. It has been a burden for Joomla users long enough.

I have discussed this issue with Stefan Wendhausen on Facebook and he told me to push this issue.

greetings,
Joe


weeblr - comment - 14 Dec 2022

Hi

Chiming in, the issue is still the same in current J3 and J4, and is not related to any 3rd-party extension.

Can be reproduced by:

  • using a multilingual site
  • setting a canonical domain to the current domain (not a different domain), something people do to ensure a canonical is always present. A similar problem exists anyway if using a different domain, see below.

When you do that, the hreflang for the current page language disappears, replaced by the canonical. As such the hreflang set becomes invalid.

Note: if instead of adding a self-referencing domain in the SEF plugin option, you indeed point it at another domain, you do get both the desired canonical AND the hreflang tag set.
However this hreflang tag set is invalid because the hreflang set requires that the pages are not canonicalized to another page.
(this was only documented by John Mueller in a Google+ post in 2015 but it's pretty obvious why it's required. That post is mentioned at hreflang.org on this page).

One way to handle this may be to always use addCustomTag() in the language filter plugin. Currently, the language filter plugin only uses addCustomTag() for the x-default link but still uses addHeadLink() for all the "regular" hreflang links.
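
A rough sketch of what building those tags could look like. `buildAlternateTag()` is a hypothetical helper and the language list is made up from URLs in this thread; the commented call mirrors the `addCustomTag()` line quoted earlier. This is not the plugin's actual code.

```php
<?php
// Hypothetical helper: builds the markup that would be passed to $doc->addCustomTag().
function buildAlternateTag(string $href, string $hreflang): string
{
    return '<link href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8')
        . '" rel="alternate" hreflang="' . htmlspecialchars($hreflang, ENT_QUOTES, 'UTF-8') . '" />';
}

// Illustrative language list: language code => URL.
$alternates = [
    'de-DE' => 'https://www.domain.com/de/exterieur/ladekantenschutz.html',
    'nl-NL' => 'https://www.domain.com/nl/exterieur/bumperbeschermers.html',
];

$tags = [];

foreach ($alternates as $code => $href) {
    $tag    = buildAlternateTag($href, $code);
    $tags[] = $tag;
    // Inside the plugin this would be something like: $doc->addCustomTag($tag);
}
```

Since such tags would not go through addHeadLink(), a canonical added later for the same URL should not replace them.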

Cheers

/cc @brianteeman @infograf768

weeblr - comment - 14 Dec 2022

I realize the above mentioned solution (using addCustomTag() in language filter) is a fix to prevent a canonical set to the same page to override the hreflang, but I did not expand on what to do when a different domain is set in the SEF plugin.

It becomes a bit harder but I think still doable: the language filter is supposed to also use that different domain to output the hreflang tag sets.

In other words, the hreflang sets should always use the canonical version of pages in the set, not the "local" version.
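
As an illustration only, re-basing the hreflang URLs onto the canonical domain could look roughly like this. `withCanonicalDomain()` is a hypothetical helper and both example.com domains are made up.

```php
<?php
// Hypothetical helper: re-bases a URL onto the canonical domain, keeping path and query.
function withCanonicalDomain(string $url, string $canonicalDomain): string
{
    $parts = parse_url($url);

    if ($parts === false) {
        return $url; // leave malformed URLs untouched
    }

    $path  = $parts['path'] ?? '/';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    return rtrim($canonicalDomain, '/') . $path . $query;
}

// Made-up domains: the site is served locally but canonicalized to www.
$canonicalDomain = 'https://www.example.com';
$local = [
    'de-DE' => 'https://staging.example.com/de/test.html',
    'fr-FR' => 'https://staging.example.com/fr/test.html',
];

$hreflang = [];

foreach ($local as $code => $url) {
    $hreflang[$code] = withCanonicalDomain($url, $canonicalDomain);
}
```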

That would solve it for Joomla, not for 3rd-party extensions adding a canonical of course, but i'm not sure how that's solvable.

jokorntheuer - comment - 13 Feb 2023

Is this fix planned to be implemented in a future Joomla 4 version now, or not?
From an SEO perspective, the most crucial parts for multi-language websites are:

  • being able to have a self-referencing alternate-language link in the head for every language.
  • the canonical link NOT replacing the self-referencing alternate language link; instead, both should be displayed
  • having the opportunity to also output alternate hreflang="en" or hreflang="de" instead of the ISO format forcing me to use de-DE or en-GB.
    both should/could be implemented as options in the "system language plugin"
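
A small sketch of how such an option might behave. `hreflangValue()` and its `$short` flag are hypothetical; no such plugin option is known to exist.

```php
<?php
// Hypothetical option handler: full tag ("de-DE") or language-only tag ("de").
function hreflangValue(string $languageTag, bool $short): string
{
    if (!$short) {
        return $languageTag;
    }

    // Keep only the part before the first hyphen.
    return strtolower(explode('-', $languageTag)[0]);
}

$full  = hreflangValue('en-GB', false);
$short = hreflangValue('de-DE', true);
```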

thanks
joe

Hackwar - change - 19 Feb 2023
Labels Added: No Code Attached Yet bug
Removed: ?
Hackwar - labeled - 19 Feb 2023
Slava287 - comment - 12 Jun 2023

Hello. The problem persists in j4 in the most recent version. Who knows how to solve this problem?

