
infograf768
5 Sep 2014

This PR solves the issue described in #4215 when the language is the default site language and Remove URL Language Code is on.
After the patch, the alternate link for the default site language no longer contains the language code:
instead of
mysite.com/en/whatever
one will get
mysite.com/whatever

Example of the result (en-GB is the default site language and Remove URL Language Code is on):
Home page before patch:

  <link href="http://localhost:8888/testwindows/trunkgitnew/en" rel="alternate" hreflang="en-GB" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/de/" rel="alternate" hreflang="de-DE" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/it/" rel="alternate" hreflang="it-IT" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/es/" rel="alternate" hreflang="es-ES" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/mk/" rel="alternate" hreflang="mk-MK" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/ta/" rel="alternate" hreflang="ta-IN" />

Home page after:

  <link href="http://localhost:8888/testwindows/trunkgitnew/" rel="alternate" hreflang="en-GB" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/de/" rel="alternate" hreflang="de-DE" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/it/" rel="alternate" hreflang="it-IT" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/es/" rel="alternate" hreflang="es-ES" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/mk/" rel="alternate" hreflang="mk-MK" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/ta/" rel="alternate" hreflang="ta-IN" />

Another associated item after patch:

 <link href="/testwindows/trunkgitnew/fr/instructions-multilangue.html" rel="canonical" />
  <link href="/testwindows/trunkgitnew/fr/instructions-multilangue.feed?type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
  <link href="/testwindows/trunkgitnew/fr/instructions-multilangue.feed?type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/multi-lingual-steps-by-steps.html" rel="alternate" hreflang="en-GB" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/de/schritt-für-schritt-zur-mehrsprachigkeit.html" rel="alternate" hreflang="de-DE" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/it/multilingua-passo-dopo-passo.html" rel="alternate" hreflang="it-IT" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/es/multiidioma-paso-a-paso.html" rel="alternate" hreflang="es-ES" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/mk/повеќејазичност-чекор-по-чекор.html" rel="alternate" hreflang="mk-MK" />
  <link href="http://localhost:8888/testwindows/trunkgitnew/ta/பன்-மொழி-படிப்படியாக.html" rel="alternate" hreflang="ta-IN" />
infograf768 - open - 5 Sep 2014
jissues-bot - change - 5 Sep 2014
Labels Added: ?
brianteeman - change - 5 Sep 2014
The description was changed
Title
#4215 Multilingual: Remove language code in alternate link
infograf768 - comment - 5 Sep 2014

Just thinking that maybe we should not do it for the home page, as it could lead to confusion:
With the language code taken out, the URL would be redirected according to Browser Settings or to the default site language, depending on the language filter settings. The default site language would be fine; Browser Settings would not really be, although it would not do much harm. At the same time, the alternate link is only there for SEO...

What do you think?

Hackwar - comment - 5 Sep 2014

How is this supposed to work?
I would want to have the following 2 scenarios:
I have English as the default language for my site. Now I want to be able to configure it so that going to domain.tld/ either
a) gets me to domain.tld/en/ with a redirect, and all subsequent English pages have the /en/ prefix, or
b) domain.tld/ is already the English version, and all subsequent English pages don't have any prefix.
All other languages, which are not the default, should always have the prefix. Is that possible with this PR?

infograf768 - comment - 5 Sep 2014

No change in behavior. You can do both as usual. This patch just removes the language prefix from the alternate link for the default site language when Remove URL Language Code is on.

brianteeman - change - 5 Sep 2014
Category Multilanguage
fontanil - comment - 5 Sep 2014

@test
The patch works fine for me (tested on 3.3.3 multilingual).
Thanks

infograf768 - comment - 5 Sep 2014

Hmm, thinking further... (Thanks Bembelimen)
What would happen if a user changes the default site language...?

bembelimen - comment - 5 Sep 2014

I don't like the idea for the following reason:

At the moment, when I have an alternate link, it gets a prefix. So search engines take all links with the prefix into their search results (and after clicking, Joomla! redirects, which is perfect).

Now if we change this, search engines will (probably) save the links for the default language without the language prefix and the others with it (which is OK, because both work). But when I then change the default language (after search engines have indexed my pages), all the links for the old default language will break (because they were saved without a prefix).

infograf768 - comment - 5 Sep 2014

I agree. Closing this PR now.

infograf768 - change - 5 Sep 2014
Status: Pending -> Closed
Closed_Date: 0000-00-00 00:00:00 -> 2014-09-05 10:21:23
infograf768 - close - 5 Sep 2014
smanzi - comment - 6 Sep 2014

As asked by @infograf768, I'm commenting here instead of on the original issue (#4215), so please forgive me if there is a bit of redundancy.

@bembelimen: I do not agree with your objections for the following reasons:

  • Changing the default language of a live site is a bad-bad-thing (or at least it is something you should do accepting the consequences)
  • Search engines will eventually re-crawl the affected pages and index the new content
  • If you think you'll ever need to do that, just do not set the "Remove URL Language Code"
  • In the Google document cited by @infograf768 (https://support.google.com/webmasters/answer/189077?hl=en) it is clearly stated (under "Common Mistakes") that "If page A links to page B, page B must link back to page A. If this is not the case for all pages that use hreflang annotations, those annotations may be ignored or not interpreted correctly." This is the exact scenario we have now, without this PR: http://example.com/ -> http://example.com/it/ -> http://example.com/en/. There is no rel="alternate" link back to http://example.com/.
  • Without this PR, all (or most) URLs for the default language site are indexed by Googlebot with the language code, but then you present those pages in the browser without the language code. This does not make any sense: if you want the URLs to be indexed with the language code, just do not set the "Remove URL Language Code" option and be happy with that.
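To make the "page B must link back to page A" rule concrete, here is a minimal sketch of how missing return links could be detected. It assumes a hypothetical map from each page URL to the hreflang alternates that page declares; the function name, the map and the URLs are illustrative only and are not Joomla code.

```php
<?php
/**
 * Hypothetical helper: list [from, to] pairs where "from" declares "to"
 * as an alternate but "to" does not declare "from" in return.
 *
 * @param array $alternates page URL => list of alternate URLs it declares
 */
function findMissingReturnLinks(array $alternates): array
{
    $missing = [];

    foreach ($alternates as $page => $targets) {
        foreach ($targets as $target) {
            if ($target === $page) {
                continue; // a self-reference needs no return link
            }

            // Pages absent from the map (e.g. URLs that only redirect)
            // are treated as declaring no alternates at all.
            if (!in_array($page, $alternates[$target] ?? [], true)) {
                $missing[] = [$page, $target];
            }
        }
    }

    return $missing;
}

// Assumed situation without the PR: the default-language home page is
// served at "/", but every page advertises "/en/" for English.
$alternates = [
    'http://example.com/'    => ['http://example.com/en/', 'http://example.com/it/'],
    'http://example.com/it/' => ['http://example.com/en/', 'http://example.com/it/'],
];

foreach (findMissingReturnLinks($alternates) as [$from, $to]) {
    echo $from, ' has no return link from ', $to, PHP_EOL;
}
```

With this sample data no page links back to http://example.com/, which is the inconsistency described in the list above.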

I would also add that I think this PR will not cause any problem for already indexed "default language pages with the language code": crawlers will eventually re-crawl those pages, they will be redirected (303) to the canonical URL (just as now) and re-index it. This time, anyway, everything will be consistent and will obey the rules stated by Google.

I agree the opinion of an SEO expert would be useful: does anybody have Matt Cutts' phone number? :-)

Anyway I'm asking to re-open this PR (at least for visibility reasons): it can be kept in a suspended state until there is consensus on its fate...

Regards,

Sergio



infograf768 - change - 6 Sep 2014
Status: Closed -> New
infograf768 - reopen - 6 Sep 2014
infograf768 - comment - 6 Sep 2014

Reopened for discussion.

phproberto - comment - 6 Sep 2014

After reading both PRs I think @smanzi is right. And this is something that should work like this by default.

So the remaining concern seems to be existing multilanguage sites that have the Remove URL Language Code option enabled. Of course, the current behavior does not make sense to me, because the language code was not removed after all. And if they have issues, they can always change that option.

So :+1: here to merge.

infograf768 - comment - 8 Sep 2014

As I said above, I am concerned about B/C and also about the Home alternate.
If the languagefilter is set to Browser Settings for new visitors and to remove the language code, then
mysite.com will redirect to the browser language and NOT to the default site language, which gives crawlers false information.

smanzi - comment - 8 Sep 2014

Really, I don't see a problem, but maybe I'm missing something:

In languagefilter we have two options for Language Selection for new Visitors

  • Site Language
  • Browser Settings

If we choose Browser Settings, the visitor is using a browser set to the same language as the default language, and we have also set the Remove URL Language Code option, then the visitor will see the site's URLs without the language code. Isn't this expected?

Crawlers do not identify themselves with a particular language, so they will fall back to the default language (as defined in Language Manager). They too will see URLs without the language code, and they will index the default site and all the correctly defined (thanks to your PR) rel="alternate" pages.

If instead a visitor comes with a browser set to a supported non-default language, they will automatically be redirected to the correct /languagecode/ URLs. This too is expected.

If we set Language Selection for new Visitors to Site Language, everybody (visitors and crawlers) will fall back to the default language, with or without the language code as per the Remove URL Language Code option (which BTW should actually be called "Remove default Language Code"). Crawlers too will be quite happy, as above.

I see absolutely no problem with the Home page: what problems do you see?

As far as B/C is concerned, I don't think leaving a bug uncorrected can be justified as B/C...

Is there something I'm missing?
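The cases walked through above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 'site' and 'browser' mode strings and the language tables are hypothetical and do not reflect the actual languagefilter plugin code.

```php
<?php
/**
 * Hypothetical sketch: pick the language for a new visitor.
 * $mode is 'site' (Site Language) or 'browser' (Browser Settings).
 */
function selectLanguageForNewVisitor(string $mode, array $browserLanguages, array $supported, string $default): string
{
    if ($mode === 'browser') {
        foreach ($browserLanguages as $tag) {
            if (in_array($tag, $supported, true)) {
                return $tag; // first supported browser language wins
            }
        }
    }

    // Site Language mode, or no supported browser language (e.g. a crawler).
    return $default;
}

/**
 * Hypothetical sketch: URL prefix for a language, honoring the
 * "Remove URL Language Code" option for the default language only.
 */
function urlPrefix(string $tag, string $default, array $sefCodes, bool $removeDefaultCode): string
{
    if ($removeDefaultCode && $tag === $default) {
        return '/';
    }

    return '/' . $sefCodes[$tag] . '/';
}

$sefCodes  = ['en-GB' => 'en', 'it-IT' => 'it', 'de-DE' => 'de'];
$supported = array_keys($sefCodes);

// A crawler that sends no language preference falls back to the default.
$crawler = selectLanguageForNewVisitor('browser', [], $supported, 'en-GB');
echo urlPrefix($crawler, 'en-GB', $sefCodes, true), PHP_EOL; // "/"

// A visitor whose browser prefers Italian gets the prefixed URLs.
$visitor = selectLanguageForNewVisitor('browser', ['it-IT', 'en-GB'], $supported, 'en-GB');
echo urlPrefix($visitor, 'en-GB', $sefCodes, true), PHP_EOL; // "/it/"
```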


infograf768 - comment - 10 Sep 2014

I have asked some SEO specialists to have a look.

smanzi - comment - 10 Sep 2014

@infograf768
Great, thanks!

I hope this PR can make it for J3.4...

smanzi - comment - 13 Sep 2014

@infograf768
Any news from the SEO specialists?
I see we are on the eve of the release of 3.3.4 and I'm wondering if there is a possibility to see this PR included...

infograf768 - comment - 14 Sep 2014

Still waiting...

brianteeman - change - 21 Sep 2014
Status: New -> Pending
jessicadunbar - comment - 29 Sep 2014

Hi
I know I'm a bit late to comment on this PR.
Examples of proper hreflang implementation are http://www.ea.com/ and https://www.mozilla.org/en-US/

ea homepage

<link href="http://www.ea.com/" rel="canonical"/>
<link rel="alternate" hreflang="en-US" href="http://www.ea.com/"/>
<link rel="alternate" hreflang="en-AU" href="http://www.ea.com/au"/>
<link rel="alternate" hreflang="en-CA" href="http://www.ea.com/ca"/>
<link rel="alternate" hreflang="en-GB" href="http://www.ea.com/uk"/>

ea en-GB page

<link href="http://www.ea.com/uk" rel="canonical"/>
<link rel="alternate" hreflang="en-US" href="http://www.ea.com/"/>
<link rel="alternate" hreflang="en-AU" href="http://www.ea.com/au"/>
<link rel="alternate" hreflang="en-CA" href="http://www.ea.com/ca"/>
<link rel="alternate" hreflang="en-GB" href="http://www.ea.com/uk"/>

On the multilingual demo
multilingual.demojoomla.com/fr/ this is what it currently displays

<link href="/fr/?format=feed&amp;type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
  <link href="/fr/?format=feed&amp;type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
  <link href="http://multilingual.demojoomla.com/en/" rel="alternate" hreflang="en-GB" />
  <link href="http://multilingual.demojoomla.com/de/" rel="alternate" hreflang="de-DE" />
  <link href="http://multilingual.demojoomla.com/it/" rel="alternate" hreflang="it-IT" />
  <link href="http://multilingual.demojoomla.com/es/" rel="alternate" hreflang="es-ES" />
  <link href="http://multilingual.demojoomla.com/mk/" rel="alternate" hreflang="mk-MK" />
  <link href="http://multilingual.demojoomla.com/ta/" rel="alternate" hreflang="ta-IN" />

multilingual.demojoomla.com/fr/ should display this (the only change is to en-GB)

<link href="/fr/?format=feed&amp;type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
  <link href="/fr/?format=feed&amp;type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
  <link href="http://multilingual.demojoomla.com/" rel="alternate" hreflang="en-GB" />
  <link href="http://multilingual.demojoomla.com/de/" rel="alternate" hreflang="de-DE" />
  <link href="http://multilingual.demojoomla.com/it/" rel="alternate" hreflang="it-IT" />
  <link href="http://multilingual.demojoomla.com/es/" rel="alternate" hreflang="es-ES" />
  <link href="http://multilingual.demojoomla.com/mk/" rel="alternate" hreflang="mk-MK" />
  <link href="http://multilingual.demojoomla.com/ta/" rel="alternate" hreflang="ta-IN" />

Hope this helps
Jess

smanzi - comment - 29 Sep 2014

Thank you for your contribution, Jessica!

So basically, if I'm not mistaken, in your opinion it is OK to apply this PR, which removes the language code from the default language's rel="alternate" links when it is also removed from the site's URLs. Good: I hope this can make it into 3.3.5 (which is soon to be released, I think).

What I found interesting, anyway, is what mozilla.org is doing: they do not remove the language codes (neither from the URLs nor from the rel="alternate" links), and this is possible with Joomla too, but what they do (and we don't) is add a rel="alternate" link for an apparently non-existent default language.
As an example, in en-US they have:

    <link rel="canonical" hreflang="en-US" href="https://www.mozilla.org/en-US/">
    <link rel="alternate" hreflang="x-default" href="https://www.mozilla.org/">
    <link rel="alternate" hreflang="en-US" href="https://www.mozilla.org/en-US/" title="English (US)">
    ...

I'm wondering if we should do the same when we do not remove the language code from our URLs...
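A minimal sketch of that idea, assuming a hypothetical helper that receives the per-language URLs and an optional language-neutral URL to advertise as x-default. The function name and the sample URLs are illustrative only, not an existing Joomla API.

```php
<?php
/**
 * Hypothetical helper: build rel="alternate" link tags, optionally
 * preceded by an hreflang="x-default" entry.
 *
 * @param array       $hrefsByTag   language tag => absolute URL
 * @param string|null $xDefaultHref URL to advertise as x-default, if any
 */
function buildAlternateLinks(array $hrefsByTag, ?string $xDefaultHref = null): array
{
    $escape = static function (string $value): string {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    };

    $links = [];

    if ($xDefaultHref !== null) {
        $links[] = '<link href="' . $escape($xDefaultHref) . '" rel="alternate" hreflang="x-default" />';
    }

    foreach ($hrefsByTag as $tag => $href) {
        $links[] = '<link href="' . $escape($href) . '" rel="alternate" hreflang="' . $escape($tag) . '" />';
    }

    return $links;
}

// Sample data: language codes are kept in the URLs, and the bare
// domain is advertised as the language-neutral entry point.
$links = buildAlternateLinks(
    [
        'en-GB' => 'http://example.com/en/',
        'de-DE' => 'http://example.com/de/',
    ],
    'http://example.com/'
);

echo implode(PHP_EOL, $links), PHP_EOL;
```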

Thanks, again.
Sergio

infograf768 - comment - 30 Sep 2014

Too late for 3.3.5. This would go in 3.4.0

smanzi - comment - 30 Sep 2014

@infograf768

I see PRs are still being committed to 3.3.5, e.g. this one: #4398
Why can't this one be committed?

brianteeman - comment - 30 Sep 2014

3.3.5 was released an hour ago.

#4398 has not been committed


Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/

smanzi - comment - 30 Sep 2014

Ah, OK, sorry, I stand corrected.
Somebody will not be that happy... more for #4398 than this one... :-)

smanzi - comment - 30 Sep 2014

Really, don't take this as an ironic comment, but maybe we have a window of opportunity with 3.3.6...

infograf768 - close - 2 Oct 2014
infograf768 - change - 2 Oct 2014
Status: Pending -> Closed
Closed_Date: 2014-09-05 10:21:23 -> 2014-10-02 08:12:57
infograf768 - comment - 2 Oct 2014

3.3.5 and 3.3.6 were specific releases aimed at fixing regressions.
Now we can merge this one.

infograf768 - comment - 2 Oct 2014

Please test #4425 for 2.5.x

smanzi - comment - 2 Oct 2014

Glad it finally made it through!
I don't have a 2.5.x environment to test #4425 at this time: I'll set up one tomorrow...

infograf768 - comment - 8 Oct 2014

Still need tests on 2.5.x

smanzi - comment - 8 Oct 2014

Negative test for #4425 (see my comment there...)
This PR is also suspect (given the nature of the failed test on #4425)

smanzi - comment - 8 Oct 2014

Confirmed: the issue also exists for this PR.

The problem is due to the use of the str_replace() function to remove the language code: it replaces every occurrence of the search string anywhere in the link, not just the language segment.

We should probably use a properly constrained preg_replace() or make the site domain name part of the replace subject.

smanzi - comment - 8 Oct 2014

... sorry, I meant to say "part of the replace search!"
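To illustrate what "acts globally" means here, a minimal sketch with hypothetical links and a hypothetical wrapper function. It is not the code under test in either PR, only the same kind of str_replace() call applied to sample URLs.

```php
<?php
/**
 * Hypothetical wrapper around the kind of str_replace() call used in the
 * patch: removes "/<sef>" wherever it occurs in the link.
 */
function stripLanguageGlobally(string $link, string $sef): string
{
    return str_replace('/' . $sef, '', $link);
}

// A later path segment that merely starts with "en" is mangled as well.
echo stripLanguageGlobally('http://example.com/en/entertainment/news.html', 'en'), PHP_EOL;
// http://example.comtertainment/news.html

// Even the second slash of "http://" plus a host name starting with "en" matches.
echo stripLanguageGlobally('http://engineering.example.com/en/', 'en'), PHP_EOL;
// http:/gineering.example.com/
```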

smanzi - comment - 8 Oct 2014

To fix this issue, both in this PR and in #4425, you can replace the two occurrences of:

$relLink = str_replace('/' . $language->sef, '', $link);

with:

$relLink = preg_replace('|/' . $language->sef . '/|', '/', $link, 1);
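For comparison, a minimal sketch of the same replacement wrapped in a hypothetical helper, with preg_quote() added as a precaution in case the SEF code ever contains a regex metacharacter. The helper name and sample URLs are assumptions for illustration, not part of the patch.

```php
<?php
/**
 * Hypothetical helper: remove only the first "/<sef>/" path segment.
 */
function removeLanguageSegment(string $link, string $sef): string
{
    // preg_quote() escapes regex metacharacters and the "|" delimiter.
    $pattern = '|/' . preg_quote($sef, '|') . '/|';

    // A limit of 1 stops after the first match; fall back to the
    // original link if preg_replace() reports an error (null).
    return preg_replace($pattern, '/', $link, 1) ?? $link;
}

// Only the language segment is removed; the host name and the
// "entertainment" segment are left alone.
echo removeLanguageSegment('http://engineering.example.com/en/entertainment/news.html', 'en'), PHP_EOL;
// http://engineering.example.com/entertainment/news.html

// A link ending in "/en" with no trailing slash does not match this pattern.
echo removeLanguageSegment('http://example.com/en', 'en'), PHP_EOL;
// http://example.com/en
```

Because the pattern requires a slash on both sides of the code, a home page link that ends in the bare language code would be left unchanged; whether that case can occur depends on how the links are generated.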

infograf768 - comment - 10 Oct 2014

Confirming the issue and your solution.
I corrected the patch for 2.5.

Can you make a PR for 3.x?

smanzi - comment - 10 Oct 2014

Hi @infograf768!

Well, yes, I can try to make a PR: I'm not much into that, but this is definitely something I want to learn, so this can be an occasion to cut my teeth... Can you provide a little bit of guidance?

I'm a bit ashamed to confess, but it is still not clear to me which branch I should fork and to which I should send my PR (the 3 instances you're talking about, I think): master, staging, 3.4-dev? I think I've read somewhere that staging is being abandoned: is this correct? What about 3.2.x?

Also, in my PR should I reference the original issue (#4215), this PR (#4221), or open a new issue and then reference that new one?

Keep in mind that I can use GitHub for Windows or edit directly in the GitHub site: no Linux here...

Thanks in advance for any guidance you can provide

mbabker - comment - 10 Oct 2014

Regarding the branches, it's explained at https://github.com/joomla/joomla-cms/blob/staging/CONTRIBUTING.md#branches

http://docs.joomla.org/Using_the_Github_UI_to_Make_Pull_Requests will give you a real quick run down on creating a pull request using the GitHub website. I don't think we have docs right now on working with the desktop clients.

smanzi - comment - 10 Oct 2014

Thanks @mbabker

I've read the document, and what I understand is that normally I should clone master and send PR to staging for the current version, 3.3.x.

Anyway, @infograf768 has previously stated that this fix is going to be incorporated into 3.4.0 (see above, 10 days ago), so I think I should clone 3.4-dev and send PR to that, correct?

But what @infograf768 has stated 4 hours ago (we have 3 instances in 3.x) is puzzling me...

Bakual - comment - 10 Oct 2014

> I've read the document, and what I understand is that normally I should clone master and send PR to staging for the current version, 3.3.x.

master and staging should be identical most of the time. So it doesn't really matter which of those two you use.

> Anyway, infograf768 has previously stated that this fix is going to be incorporated into 3.4.0 (see above, 10 days ago), so I think I should clone 3.4-dev and send PR to that, correct?

Doing it against staging is the best thing to do most of the time. If we decide it goes into 3.4, we can always commit it there.

smanzi - comment - 10 Oct 2014

Thanks Bakual,

I've also noticed, now, that @infograf768 committed this PR to staging, so it is clear enough that my yet-to-be PR correcting this PR should be committed to staging too.

Is it OK if in my PR I reference this PR or should I reference the original Issue?

smanzi - comment - 10 Oct 2014

hmm... I'm really afraid that my fix is not safe either:

If we have a host called "engineering.example.com" and we are going to remove the /en language code from a link, we will screw up the host name:

http://engineering.example.com/en/

will become

http://gineering.example.com/en

Have more work to do on this...

smanzi - comment - 10 Oct 2014

Nope, bullshit! I'm searching for /en/, not just /en

Sorry!

smanzi - comment - 10 Oct 2014

OK, I created my PR... I hope everything is alright...
