User tests: Successful: Unsuccessful:
This Pull Request is related to issues #9333 and #9556.
Also to PR #9559
Add a better check for adding the canonical html tag.
Domain in the SEF plugin field | Canonical added by a component | Result |
---|---|---|
(empty) | No | The canonical html tag will NOT be added |
(empty) | Yes | No change (the component added canonical html tag stays the same) |
Yes | No | The canonical html tag will be added with the new domain typed in the SEF plugin domain field (ex: https://www.example.com/menu-item/). |
Yes | Yes | The canonical html tag will be the same as added by the component but now with the new domain typed in the SEF plugin domain field. |
Status | New | ⇒ | Pending |
Labels |
Added:
?
|
If canonical is added only when domain is set, there is no need to revert other changes since in this case SEF plugin should override components anyway. Otherwise components would not use the canonical domain.
In such a case, canonical should simply be the specified domain + current path.
By my logic, it would be best to check for existing canonicals before doing anything.
ok so let me try to code something for that.
don't test yet.
Category | ⇒ | Plugins Router / SEF |
ok i changed the code.
How it works now:
Back to using onAfterDispatch.
If a canonical EXISTS in the document when the onAfterDispatch
is run, the plugin doesn't add a canonical.
This should solve issues with components and duplicate canonicals.
If a canonical DOESN'T EXIST in the document when the onAfterDispatch
is run and a domain is ADDED in the SEF plugin domain field, it adds the canonical with that domain.
If a canonical DOESN'T EXIST in the document when the onAfterDispatch
is run and a domain is NOT ADDED in the SEF plugin domain field, it only adds the canonical if the canonical URI is different than the current uri.
Please check code and test.
Open to any suggestions/improvements regarding this.
Nice and simple to allow components to add rel canonical without reverse engineering anything
It doesn't override component canonicals when canonical domain is set.
Can add this: foreach ($doc->_links as $canonical => $link)
to get existing canonical link as $canonical
. Then use it with JUri::getInstance()
to get only the path. Then prepend the canonical $domain
to it.
EDIT: in this case the original canonical also needs to be unset unset($doc->_links[$canonical])
since we're adding a modified.
For me this is good, I would never override canonical from a custom component. The rest works as described. So, test ok from me.
It still remains an issue though that canonicals generated this way are not real canonicals, so I am not sure if this is useful at all (or that it doesn't have a negative impact).
I just realized you would like the plugin to replace the domain part of the custom component - that sounds like a good idea. At present, this is indeed not the case.
It doesn't override component canonicals when canonical domain is set.
Can add this: foreach ($doc->_links as $canonical => $link) to get existing canonical link as $canonical. Then use it with JUri::getInstance() to get only the path. Then prepend the canonical $domain to it.
EDIT: in this case the original canonical also needs to be unset unset($doc->_links[$canonical]) since we're adding a modified.I just realized you would like the plugin to replace the domain part of the custom component - that sounds like a good idea. At present, this is indeed not the case.
I will add this them.
please check now.
Test successful, works as described.
Test instructions updated.
@chivitli if it's ok for you. please mark as "Tested successfully" in "Joomla! Issue Tracker" https://issues.joomla.org/tracker/joomla-cms/9565 (after login the "Test this" button appears).
For more information see https://docs.joomla.org/Testing_Joomla!_patches#Recording_test_results
I have tested this item successfully on d00b99c
I have tested this item successfully on d00b99c
Works as expected.
Sorry folks, but I am totally lost...
I am using basic staging.
After this patch, eventhough I have left the Domain field empty in the SEF plugin, I now get a canonical.
<base href="http://localhost:8888/trunkgitnew/contact-component.html" />
<meta name="description" content="UNE DESCRIPTION" />
<meta name="generator" content="Joomla! - Open Source Content Management" />
<title>Contact Component - My very long test site name</title>
<link href="http://localhost:8888/trunkgitnew/contact-component.html" rel="canonical" />
<link href="/trunkgitnew/templates/protostar/favicon.ico" rel="shortcut icon" type="image/vnd.microsoft.icon" />
[...]
As far as I know, a canonical link is only used to show Search Engine the preferred url when multiple URLs reach the same page
A canonical link element is an HTML element that helps webmasters prevent duplicate content issues by specifying the "canonical" or "preferred" version of a web page[1][2] as part of search engine optimization. It is described in RFC 6596, which went live in April 2012.
Why then create a canonical when only one URL exists?
I do now understand though that we get also a canonical before this patch when loading another page than the home as I have indeed overseen that in my former tests...
I just don't understand the use of it in that case.
There are always multiple urls in Joomla. Not only multiple, but an infinite number of them... As mentioned, Google says: Mark up the canonical page and any other variants with a rel="canonical" link element.
There is no need for this optimization to not show canonical on the canonical page.
Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.
Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.
Ok the plugin, previously had code:
Now it WILL happen that it will add 2 (different) rel canonical for 2 pages that are same
But
I don't know google having special code to try to detect if 2 pages are the same if rel canonical is missing, furthermore which 2 pages would it compare and why it try to compare them ?
I do think that in scenario 4 canonical URL should not be added at all since it's going be incorrect most of the time anyways. Better no canonical than wrong canonical.
If canonical is not found and canonical domain is not set, leave it as is or add a possibly incorrect canonical.
As ggppdk said before, correct canonicals can only be provided by components. Unless route helpers can be used in the plugin dynamically using URL vars, I don't see how this can be fixed at the moment.
wait i'm the one confused now!
will respond to each one of your comments.
After this patch, eventhough I have left the Domain field empty in the SEF plugin, I now get a canonica
It shouldn't. Are you at http://localhost:8888/trunkgitnew/contact-component.html
URI?
I do now understand though that we get also a canonical before this patch when loading another page than the home as I have indeed overseen that in my former tests...
I just don't understand the use of it in that case.Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.
No, we will only get canonical on the pages in which the canoncial is different from the current url. Example:
https://domain.tlg/menu-item/?aaaaa
will get canonicalhttps://domain.tlg/menu-item/
will not get canonicalSee https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82
There are always multiple urls in Joomla. Not only multiple, but an infinite number of them... As mentioned, Google says: Mark up the canonical page and any other variants with a rel="canonical" link element.
There is no need for this optimization to not show canonical on the canonical page.
Actually that's true.
I can remove the check in https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82 if you guys think it's better.
Ok the plugin, previously had code: not to add rel canonical if canonical URL is same as current URL
Now it WILL happen that it will add 2 (different) rel canonical for 2 pages that are same
Yes, before this PR. This is fixed by this PR.
But the effect will be the same as before, goggle will consider the 2 pages different because if you do not add rel canonical for 2 urls then the rel canonical is supposed to be the URL itself !
Don't understand.
I don't know google having special code to try to detect if 2 pages are the same if rel canonical is missing, furthermore which 2 pages would it compare and why it try to compare them ?
anyway the code to check if current URL is same as calculated canonical is not directly relevant to this PR, so it can be re-added
That code is already in this PR. See https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82.
I do think that in scenario 4 canonical URL should not be added at all since it's going be incorrect most of the time anyways. Better no canonical than wrong canonical.
What scenario 4? Don't understand. Do you mean If canonical is not found and canonical domain is not set, leave it as is or add a possibly incorrect canonical.
If so, that code is already in this PR. See https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82.
As ggppdk said before, correct canonicals can only be provided by components. Unless route helpers can be used in the plugin dynamically using URL vars, I don't see how this can be fixed at the moment.
This PR does not overwrite the component added canonical (only if you set the domain in the domain field, and even then only changes the domain).
Yes, currently it adds incorrect canonical. Perhaps should just leave it as is (i.e. don't add canonical at all). Wrong canonical URL beats the purpose of canonical tag, to be honest.
To clarify, this is because JRoute::_('index.php?' . http_build_query($this->app->getRouter()->getVars()), false);
builds non-canonical URL. So the "canonical" URL isn't really canonical, which makes it pointless and maybe even harmful (in sense of search engines).
To clarify, this is because JRoute::_('index.php?' . http_build_query($this->app->getRouter()->getVars()), false); builds non-canonical URL. So the "canonical" URL isn't really canonical, which makes it pointless and maybe even harmful (in sense of search engines).
To understand better, can you give examples where the canonical is wrongly built with that code?
Is this case, at least is good
Example:
BTW an easy way to check if the canonical is added in thsi PR is add in this line https://github.com/joomla/joomla-cms/pull/9565/files#diff-78cd3849e61af8fc9737ca40ef91ba05R84
echo 'Current: ' . rawurldecode($uri->toString()) . ' | Canonical: ' . htmlspecialchars($canonical);
Then you can navigate trough the site, when that message appears, a canonical exists.
But the canonical added on https://domain.tlg/menu-item/?aaaaa is https://domain.tlg/menu-item/?aaaaa=, which is clearly incorrect.That is the problem. It should be https://domain.tlg/menu-item, but we cannot simply set this in the system-wide SEF plugin. It could be possible by checking for component and view vars and loading respective route helpers to build correct URLs but I don't think this change would be approved by the team.
But the canonical added on https://domain.tlg/menu-item/?aaaaa is https://domain.tlg/menu-item/?aaaaa, which is clearly incorrect.That is the problem.
Ok, right i see what you mean now.
Actually the canonical will be https://domain.tlg/menu-item/?aaaaa=
I notice a lot of cases where that happens.
So i agree, it should be disabled them.
It should be https://domain.tlg/menu-item, but we cannot simply set this in the system-wide SEF plugin. It could be possible by checking for component and view vars and loading respective route helpers to build correct URLs but I don't think this change would be approved by the team.
That seems to be a router issue, not the canonical generation. So this will not come in this PR.
This PR has received new commits.
This PR has received new commits.
Yeah, I understand that. This is why canonical should not be added in the plugin like this. But this needs to be discussed with the team, I think.
ok. updated the code, now it ONLY adds (or changes in the case a component already added) the canonical if the SEF plugin domain field has some value.
Also added a placeholder (hint) to the domain field.
Test instructions updated in the first post, please test now.
This PR has received new commits.
Since canonical URL is now used only to canonize domains, it can be left as current path with canonical domain prepended. Routed URL is incorrect anyways, so no point in leaving it there.
right, you're right will change it
This PR has received new commits.
done please retest and mark the test result again.
I have tested this item successfully on 86f63a6
I have tested this item successfully on 86f63a6
@infograf768 ok for you now too?
This PR has received new commits.
I have tested this item successfully on 99b82a7
Status | Pending | ⇒ | Ready to Commit |
It is now behaving as should. Thanks!
RTC
You can merge into 3.5.1
Labels |
Added:
?
|
Status | Ready to Commit | ⇒ | Fixed in Code Base |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2016-03-26 15:11:58 |
Closed_By | ⇒ | wilsonge |
Labels |
Removed:
?
|
Merged - thanks :)
Milestone |
Added: |
@scence @SharkyKZ @infograf768 @ggppdk @chivitli
Can you test this and check if all issues are resolved?