? Pending

User tests: Successful: Unsuccessful:

avatar andrepereiradasilva
andrepereiradasilva
24 Mar 2016

This Pull Request is related to issues #9333 and #9556.

Also to PR #9559

Summary of Changes

Add a better check for adding the canonical html tag.

Testing Instructions

  1. Apply this patch. With SEF system plugin active
  2. Do the following test scenarios:
Domain in the SEF plugin field Canonical added by a component Result
(empty) No The canonical html tag will NOT be added
(empty) Yes No change (the component added canonical html tag stays the same)
Yes No The canonical html tag will be added with the new domain typed in the SEF plugin domain field (ex: https://www.example.com/menu-item/).
Yes Yes The canonical html tag will be the same as added by the component but now with the new domain typed in the SEF plugin domain field.
avatar andrepereiradasilva andrepereiradasilva - open - 24 Mar 2016
avatar andrepereiradasilva andrepereiradasilva - change - 24 Mar 2016
Status New Pending
avatar joomla-cms-bot joomla-cms-bot - change - 24 Mar 2016
Labels Added: ?
avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

@scence @SharkyKZ @infograf768 @ggppdk @chivitli

Can you test this and check if all issues are resolved?

avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

@wilsonge if tested ok, IMHO should go to 3.5.1

avatar SharkyKZ
SharkyKZ - comment - 24 Mar 2016

If canonical is added only when domain is set, there is no need to revert other changes since in this case SEF plugin should override components anyway. Otherwise components would not use the canonical domain.

In such a case, canonical should simply be the specified domain + current path.

avatar SharkyKZ
SharkyKZ - comment - 24 Mar 2016

By my logic, it would be best to check for existing canonicals before doing anything.

  1. If canonical is found and canonical domain is set, replace current domain with canonical domain.
  2. If canonical is found and canonical domain is not set, leave it as is.
  3. If canonical is not found and canonical domain is set, add a canonical.
  4. If canonical is not found and canonical domain is not set, leave it as is or add a possibly incorrect canonical.
avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

ok so let me try to code something for that.
don't test yet.

avatar brianteeman brianteeman - change - 24 Mar 2016
Category Plugins Router / SEF
avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

ok i changed the code.

How it works now:

  • Back to using onAfterDispatch.

  • If a canonical EXISTS in the document when the onAfterDispatch is run, the plugin doesn't add a canonical.
    This should solve issues with components and duplicate canonicals.

  • If a canonical DOESN'T EXIST in the document when the onAfterDispatch is run and a domain is ADDED in the SEF plugin domain field, it adds the canonical with that domain.

  • If a canonical DOESN'T EXIST in the document when the onAfterDispatch is run and a domain is NOT ADDED in the SEF plugin domain field, it only adds the canonical if the canonical URI is different than the current uri.

Please check code and test.

Open to any suggestions/improvements regarding this.

avatar ggppdk
ggppdk - comment - 24 Mar 2016

Nice and simple to allow components to add rel canonical without reverse engineering anything

  • will test this
avatar SharkyKZ
SharkyKZ - comment - 24 Mar 2016

It doesn't override component canonicals when canonical domain is set.
Can add this: foreach ($doc->_links as $canonical => $link) to get existing canonical link as $canonical. Then use it with JUri::getInstance() to get only the path. Then prepend the canonical $domain to it.
EDIT: in this case the original canonical also needs to be unset unset($doc->_links[$canonical]) since we're adding a modified.

avatar chivitli
chivitli - comment - 24 Mar 2016

For me this is good, I would never override canonical from a custom component. The rest works as described. So, test ok from me.

It still remains an issue though that canonicals generated this way are not real canonicals, so I am not sure if this is useful at all (or that it doesn't have a negative impact).

avatar chivitli
chivitli - comment - 24 Mar 2016

I just realized you would like the plugin to replace the domain part of the custom component - that sounds like a good idea. At present, this is indeed not the case.

avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

It doesn't override component canonicals when canonical domain is set.
Can add this: foreach ($doc->_links as $canonical => $link) to get existing canonical link as $canonical. Then use it with JUri::getInstance() to get only the path. Then prepend the canonical $domain to it.
EDIT: in this case the original canonical also needs to be unset unset($doc->_links[$canonical]) since we're adding a modified.

I just realized you would like the plugin to replace the domain part of the custom component - that sounds like a good idea. At present, this is indeed not the case.

I will add this them.

avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

please check now.

avatar chivitli
chivitli - comment - 24 Mar 2016

Test successful, works as described.

avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Mar 2016

Test instructions updated.

@chivitli if it's ok for you. please mark as "Tested successfully" in "Joomla! Issue Tracker" https://issues.joomla.org/tracker/joomla-cms/9565 (after login the "Test this" button appears).
For more information see https://docs.joomla.org/Testing_Joomla!_patches#Recording_test_results

avatar chivitli chivitli - test_item - 24 Mar 2016 - Tested successfully
avatar chivitli
chivitli - comment - 24 Mar 2016

I have tested this item :white_check_mark: successfully on d00b99c


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar SharkyKZ SharkyKZ - test_item - 24 Mar 2016 - Tested successfully
avatar SharkyKZ
SharkyKZ - comment - 24 Mar 2016

I have tested this item :white_check_mark: successfully on d00b99c

Works as expected.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar infograf768
infograf768 - comment - 25 Mar 2016

Sorry folks, but I am totally lost...
I am using basic staging.
After this patch, eventhough I have left the Domain field empty in the SEF plugin, I now get a canonical.

 <base href="http://localhost:8888/trunkgitnew/contact-component.html" />
  <meta name="description" content="UNE DESCRIPTION" />
  <meta name="generator" content="Joomla! - Open Source Content Management" />
  <title>Contact Component - My very long test site name</title>
  <link href="http://localhost:8888/trunkgitnew/contact-component.html" rel="canonical" />
  <link href="/trunkgitnew/templates/protostar/favicon.ico" rel="shortcut icon" type="image/vnd.microsoft.icon" />
[...]

As far as I know, a canonical link is only used to show Search Engine the preferred url when multiple URLs reach the same page
A canonical link element is an HTML element that helps webmasters prevent duplicate content issues by specifying the "canonical" or "preferred" version of a web page[1][2] as part of search engine optimization. It is described in RFC 6596, which went live in April 2012.

Why then create a canonical when only one URL exists?

avatar infograf768
infograf768 - comment - 25 Mar 2016

I do now understand though that we get also a canonical before this patch when loading another page than the home as I have indeed overseen that in my former tests...
I just don't understand the use of it in that case.

avatar chivitli
chivitli - comment - 25 Mar 2016

There are always multiple urls in Joomla. Not only multiple, but an infinite number of them... As mentioned, Google says: Mark up the canonical page and any other variants with a rel="canonical" link element.

There is no need for this optimization to not show canonical on the canonical page.

avatar infograf768
infograf768 - comment - 25 Mar 2016

Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.

avatar ggppdk
ggppdk - comment - 25 Mar 2016

Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.

Ok the plugin, previously had code:

  • not to add rel canonical if canonical URL is same as current URL

Now it WILL happen that it will add 2 (different) rel canonical for 2 pages that are same

But

  • the effect will be the same as before, goggle will consider the 2 pages different because if you do not add rel canonical for 2 urls then the rel canonical is supposed to be the URL itself !
avatar ggppdk
ggppdk - comment - 25 Mar 2016

I don't know google having special code to try to detect if 2 pages are the same if rel canonical is missing, furthermore which 2 pages would it compare and why it try to compare them ?

  • anyway the code to check if current URL is same as calculated canonical is not directly relevant to this PR, so it can be re-added
avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

I do think that in scenario 4 canonical URL should not be added at all since it's going be incorrect most of the time anyways. Better no canonical than wrong canonical.

If canonical is not found and canonical domain is not set, leave it as is or add a possibly incorrect canonical.

As ggppdk said before, correct canonicals can only be provided by components. Unless route helpers can be used in the plugin dynamically using URL vars, I don't see how this can be fixed at the moment.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

wait i'm the one confused now!
will respond to each one of your comments.

@infograf768

After this patch, eventhough I have left the Domain field empty in the SEF plugin, I now get a canonica

It shouldn't. Are you at http://localhost:8888/trunkgitnew/contact-component.html URI?

I do now understand though that we get also a canonical before this patch when loading another page than the home as I have indeed overseen that in my former tests...
I just don't understand the use of it in that case.

Are'nt we going to get a canonical for ANY page loaded this way. I mean when we have another url to get to the same page, we would have a different canonical.

No, we will only get canonical on the pages in which the canoncial is different from the current url. Example:

  • https://domain.tlg/menu-item/?aaaaawill get canonical
  • https://domain.tlg/menu-item/will not get canonical

See https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82

@chivitli

There are always multiple urls in Joomla. Not only multiple, but an infinite number of them... As mentioned, Google says: Mark up the canonical page and any other variants with a rel="canonical" link element.
There is no need for this optimization to not show canonical on the canonical page.

Actually that's true.
I can remove the check in https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82 if you guys think it's better.

@ggppdk

Ok the plugin, previously had code: not to add rel canonical if canonical URL is same as current URL
Now it WILL happen that it will add 2 (different) rel canonical for 2 pages that are same

Yes, before this PR. This is fixed by this PR.

But the effect will be the same as before, goggle will consider the 2 pages different because if you do not add rel canonical for 2 urls then the rel canonical is supposed to be the URL itself !

Don't understand.

I don't know google having special code to try to detect if 2 pages are the same if rel canonical is missing, furthermore which 2 pages would it compare and why it try to compare them ?
anyway the code to check if current URL is same as calculated canonical is not directly relevant to this PR, so it can be re-added

That code is already in this PR. See https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82.

@SharkyKZ

I do think that in scenario 4 canonical URL should not be added at all since it's going be incorrect most of the time anyways. Better no canonical than wrong canonical.

What scenario 4? Don't understand. Do you mean If canonical is not found and canonical domain is not set, leave it as is or add a possibly incorrect canonical.
If so, that code is already in this PR. See https://github.com/andrepereiradasilva/joomla-cms/blob/patch-6/plugins/system/sef/sef.php#L79-L82.

As ggppdk said before, correct canonicals can only be provided by components. Unless route helpers can be used in the plugin dynamically using URL vars, I don't see how this can be fixed at the moment.

This PR does not overwrite the component added canonical (only if you set the domain in the domain field, and even then only changes the domain).

avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

Yes, currently it adds incorrect canonical. Perhaps should just leave it as is (i.e. don't add canonical at all). Wrong canonical URL beats the purpose of canonical tag, to be honest.

To clarify, this is because JRoute::_('index.php?' . http_build_query($this->app->getRouter()->getVars()), false); builds non-canonical URL. So the "canonical" URL isn't really canonical, which makes it pointless and maybe even harmful (in sense of search engines).

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

To clarify, this is because JRoute::_('index.php?' . http_build_query($this->app->getRouter()->getVars()), false); builds non-canonical URL. So the "canonical" URL isn't really canonical, which makes it pointless and maybe even harmful (in sense of search engines).

To understand better, can you give examples where the canonical is wrongly built with that code?

Is this case, at least is good
Example:

BTW an easy way to check if the canonical is added in thsi PR is add in this line https://github.com/joomla/joomla-cms/pull/9565/files#diff-78cd3849e61af8fc9737ca40ef91ba05R84

echo 'Current: ' . rawurldecode($uri->toString()) . ' | Canonical: ' . htmlspecialchars($canonical);

Then you can navigate trough the site, when that message appears, a canonical exists.

avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

But the canonical added on https://domain.tlg/menu-item/?aaaaa is https://domain.tlg/menu-item/?aaaaa=, which is clearly incorrect.That is the problem. It should be https://domain.tlg/menu-item, but we cannot simply set this in the system-wide SEF plugin. It could be possible by checking for component and view vars and loading respective route helpers to build correct URLs but I don't think this change would be approved by the team.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

But the canonical added on https://domain.tlg/menu-item/?aaaaa is https://domain.tlg/menu-item/?aaaaa, which is clearly incorrect.That is the problem.

Ok, right i see what you mean now.
Actually the canonical will be https://domain.tlg/menu-item/?aaaaa=
I notice a lot of cases where that happens.
So i agree, it should be disabled them.

It should be https://domain.tlg/menu-item, but we cannot simply set this in the system-wide SEF plugin. It could be possible by checking for component and view vars and loading respective route helpers to build correct URLs but I don't think this change would be approved by the team.

That seems to be a router issue, not the canonical generation. So this will not come in this PR.

avatar joomla-cms-bot
joomla-cms-bot - comment - 25 Mar 2016

This PR has received new commits.

CC: @chivitli, @SharkyKZ


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar joomla-cms-bot
joomla-cms-bot - comment - 25 Mar 2016

This PR has received new commits.

CC: @chivitli, @SharkyKZ


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

Yeah, I understand that. This is why canonical should not be added in the plugin like this. But this needs to be discussed with the team, I think.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

ok. updated the code, now it ONLY adds (or changes in the case a component already added) the canonical if the SEF plugin domain field has some value.
Also added a placeholder (hint) to the domain field.

Test instructions updated in the first post, please test now.

avatar andrepereiradasilva andrepereiradasilva - change - 25 Mar 2016
The description was changed
avatar joomla-cms-bot
joomla-cms-bot - comment - 25 Mar 2016

This PR has received new commits.

CC: @chivitli, @SharkyKZ


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

Since canonical URL is now used only to canonize domains, it can be left as current path with canonical domain prepended. Routed URL is incorrect anyways, so no point in leaving it there.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

right, you're right will change it

avatar joomla-cms-bot
joomla-cms-bot - comment - 25 Mar 2016

This PR has received new commits.

CC: @chivitli, @SharkyKZ


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

done please retest and mark the test result again.

avatar SharkyKZ SharkyKZ - test_item - 25 Mar 2016 - Tested successfully
avatar SharkyKZ
SharkyKZ - comment - 25 Mar 2016

I have tested this item :white_check_mark: successfully on 86f63a6


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar chivitli chivitli - test_item - 25 Mar 2016 - Tested successfully
avatar chivitli
chivitli - comment - 25 Mar 2016

I have tested this item :white_check_mark: successfully on 86f63a6


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

@infograf768 ok for you now too?

avatar joomla-cms-bot
joomla-cms-bot - comment - 25 Mar 2016

This PR has received new commits.

CC: @chivitli, @SharkyKZ


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Mar 2016

Just a little change at @wilsonge request, no need to retest.

avatar infograf768 infograf768 - test_item - 26 Mar 2016 - Tested successfully
avatar infograf768
infograf768 - comment - 26 Mar 2016

I have tested this item :white_check_mark: successfully on 99b82a7


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar infograf768 infograf768 - change - 26 Mar 2016
Status Pending Ready to Commit
avatar infograf768
infograf768 - comment - 26 Mar 2016

It is now behaving as should. Thanks!

RTC

@wilsonge

You can merge into 3.5.1


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9565.

avatar joomla-cms-bot joomla-cms-bot - change - 26 Mar 2016
Labels Added: ?
avatar wilsonge wilsonge - change - 26 Mar 2016
Status Ready to Commit Fixed in Code Base
Closed_Date 0000-00-00 00:00:00 2016-03-26 15:11:58
Closed_By wilsonge
avatar wilsonge wilsonge - close - 26 Mar 2016
avatar wilsonge wilsonge - merge - 26 Mar 2016
avatar joomla-cms-bot joomla-cms-bot - close - 26 Mar 2016
avatar wilsonge wilsonge - reference | d8a127f - 26 Mar 16
avatar wilsonge wilsonge - merge - 26 Mar 2016
avatar wilsonge wilsonge - close - 26 Mar 2016
avatar joomla-cms-bot joomla-cms-bot - change - 26 Mar 2016
Labels Removed: ?
avatar wilsonge
wilsonge - comment - 26 Mar 2016

Merged - thanks :)

avatar andrepereiradasilva andrepereiradasilva - head_ref_deleted - 26 Mar 2016
avatar wilsonge wilsonge - change - 26 Mar 2016
Milestone Added:
avatar andrepereiradasilva
andrepereiradasilva - comment - 26 Mar 2016

@SharkyKZ @chivitli @infograf768 @ggppdk @wilsonge thanks for tests and/or contributions.

Add a Comment

Login with GitHub to post a comment