This PR solves the issue described in #4215, which occurs when the language is the default site language and Remove URL Language Code is on.
After the patch, the alternate link for the default site language no longer contains the language code:
instead of
mysite.com/en/whatever
one will get
mysite.com/whatever
Example of result (en-GB is the default site language and Remove URL Language Code is on):
Home page before patch:
<link href="http://localhost:8888/testwindows/trunkgitnew/en" rel="alternate" hreflang="en-GB" />
<link href="http://localhost:8888/testwindows/trunkgitnew/de/" rel="alternate" hreflang="de-DE" />
<link href="http://localhost:8888/testwindows/trunkgitnew/it/" rel="alternate" hreflang="it-IT" />
<link href="http://localhost:8888/testwindows/trunkgitnew/es/" rel="alternate" hreflang="es-ES" />
<link href="http://localhost:8888/testwindows/trunkgitnew/mk/" rel="alternate" hreflang="mk-MK" />
<link href="http://localhost:8888/testwindows/trunkgitnew/ta/" rel="alternate" hreflang="ta-IN" />
Home page after:
<link href="http://localhost:8888/testwindows/trunkgitnew/" rel="alternate" hreflang="en-GB" />
<link href="http://localhost:8888/testwindows/trunkgitnew/de/" rel="alternate" hreflang="de-DE" />
<link href="http://localhost:8888/testwindows/trunkgitnew/it/" rel="alternate" hreflang="it-IT" />
<link href="http://localhost:8888/testwindows/trunkgitnew/es/" rel="alternate" hreflang="es-ES" />
<link href="http://localhost:8888/testwindows/trunkgitnew/mk/" rel="alternate" hreflang="mk-MK" />
<link href="http://localhost:8888/testwindows/trunkgitnew/ta/" rel="alternate" hreflang="ta-IN" />
Another associated item after patch:
<link href="/testwindows/trunkgitnew/fr/instructions-multilangue.html" rel="canonical" />
<link href="/testwindows/trunkgitnew/fr/instructions-multilangue.feed?type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
<link href="/testwindows/trunkgitnew/fr/instructions-multilangue.feed?type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
<link href="http://localhost:8888/testwindows/trunkgitnew/multi-lingual-steps-by-steps.html" rel="alternate" hreflang="en-GB" />
<link href="http://localhost:8888/testwindows/trunkgitnew/de/schritt-für-schritt-zur-mehrsprachigkeit.html" rel="alternate" hreflang="de-DE" />
<link href="http://localhost:8888/testwindows/trunkgitnew/it/multilingua-passo-dopo-passo.html" rel="alternate" hreflang="it-IT" />
<link href="http://localhost:8888/testwindows/trunkgitnew/es/multiidioma-paso-a-paso.html" rel="alternate" hreflang="es-ES" />
<link href="http://localhost:8888/testwindows/trunkgitnew/mk/повеќејазичност-чекор-по-чекор.html" rel="alternate" hreflang="mk-MK" />
<link href="http://localhost:8888/testwindows/trunkgitnew/ta/பன்-மொழி-படிப்படியாக.html" rel="alternate" hreflang="ta-IN" />
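The before/after listings above can be summarised as a small rule. What follows is only a minimal Python sketch of that rule, not the Joomla PHP code from the patch; the function name, its parameters and the `pages` structure are hypothetical and chosen purely for illustration.

```python
from html import escape


def alternate_links(base, pages, default_lang, remove_default_code):
    """Build hreflang <link> tags for one set of associated pages.

    pages maps a language tag (e.g. "en-GB") to a (url_code, path) pair.
    All names here are hypothetical; this is not Joomla's implementation.
    """
    tags = []
    for lang_tag, (url_code, path) in pages.items():
        if remove_default_code and lang_tag == default_lang:
            # Default language with the option on: no language code in the URL.
            href = f"{base}/{path}"
        else:
            # Every other case keeps the language code prefix.
            href = f"{base}/{url_code}/{path}"
        tags.append(
            f'<link href="{escape(href)}" rel="alternate" hreflang="{lang_tag}" />'
        )
    return tags


home = {"en-GB": ("en", ""), "de-DE": ("de", ""), "it-IT": ("it", "")}
base = "http://localhost:8888/testwindows/trunkgitnew"
for tag in alternate_links(base, home, "en-GB", True):
    print(tag)
```

With `remove_default_code` set to `False`, the same call would keep the `/en/` prefix on the en-GB link as well.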
How is this supposed to work?
I would want to have the following 2 scenarios:
I have english as my default language for the site. Now I want to be able to set it that going to domain.tld/ gets me to
a) domain.tld/en/ with a redirect and all subsequent english pages have the /en/ prefix or
b) domain.tld/ is already the english version and all subsequent english pages don't have any prefix
All other languages, which are not default, should always have the prefix. Is that possible with this PR?
No change in behavior. You can do both as usual. This patch just removes the language prefix from the alternate link for the default site language when Remove URL Language Code is on.
Category | ⇒ | Multilanguage |
Hmm, thinking further... (Thanks Bembelimen)
What would happen if a user changes the default site language?
I don't like the idea for the following reason:
At the moment, when I have an alternate link, I get a prefix. So search engines take all links with the prefix into their search results (and after clicking, Joomla! redirects, which is perfect).
Now if we change this, search engines will (probably) save the links for the default language without the language prefix and the others with it (which is OK, because both work). But when I then change the default language (after search engines have indexed my pages), all the links for the old default language will break (because they were saved without the prefix).
I agree. Closing this PR now.
Status | Pending | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2014-09-05 10:21:23 |
As asked by @infograf768 I'm commenting here instead of at the original #4215 issue tracker, so you will forgive me if there will be a bit of redundancy.
@bembelimen: I do not agree with your objections for the following reasons:
I would also add that I think this PR will not cause any problem for already indexed "default language pages with the language code": crawlers will eventually re-crawl those pages, be redirected (303) to the canonical URL (just as now) and re-index it. This time, though, everything will be consistent and will obey the rules stated by Google.
I agree that the opinion of an SEO expert would be useful: does anybody have Matt Cutts' phone number? :-)
Anyway I'm asking to re-open this PR (at least for visibility reasons): it can be kept in a suspended state until there is consensus on its fate...
Regards,
Sergio
This comment was created with the J!Tracker Application at http://issues.joomla.org/.
Status | Closed | ⇒ | New |
Reopened for discussion.
After reading both PRs I think @smanzi is right. And this is something that should work like this by default.
So the problem now seems to be whether there are multilanguage sites out there with Remove URL Language Code enabled. Of course that does not make sense to me, because the language code was not removed after all. And if they have issues they can always change the Remove URL Language Code option.
So here to merge.
As I said above, I am concerned by B/C and also by the Home alternate.
If the languagefilter is set to Browser Settings for new visitors and remove language code, then
mysite.com will redirect to the browser language and NOT to the default site language, therefore false information for crawlers.
Really, I don't see a problem, but maybe I'm missing something:
In languagefilter we have two options for Language Selection for new Visitors
If we choose Browser Settings and the visitor is using a browser set with the same language as the default language and we have also set the Remove URL Language Code option, then the visitor will see site's URLs, without the language code. Isn't this expected?
Crawlers do not identify themselves with a particular language, so they will fall back to the default language (as defined in Language Manager). They too will see URLs without the language code, and they will index the default site and all the correctly (thanks to your PR) defined rel="alternate" pages.
If instead a visitor comes with a browser set to a supported non-default language, he will automatically be redirected to the correct /languagecode/ URLs. This too is expected.
If we set Language Selection for new Visitors to Site Language, everybody (visitors and crawlers) will land on the default language, with or without the language code as per the Remove URL Language Code option (which BTW should actually be "Remove default Language Code"). Crawlers too will be quite happy, as above.
I see absolutely no problem in the Home page: what problems do you see?
As far as B/C is concerned, I don't think not fixing a bug can be considered B/C...
Is there something I'm missing?
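The behaviour described in this comment can be written down as a small decision sketch. It is only an illustration in Python of what the comment says; it has not been checked against the languagefilter plugin, and the function names, the "browser"/"site" values and the `url_codes` mapping are all hypothetical.

```python
def landing_language(selection, browser_lang, default_lang, supported):
    """Pick the content language for a first-time visitor.

    selection is "browser" or "site", standing in for the two
    Language Selection for new Visitors choices (hypothetical values).
    """
    if selection == "browser" and browser_lang in supported:
        return browser_lang
    # Crawlers (no browser language) and unsupported languages get the default.
    return default_lang


def landing_path(lang, default_lang, url_codes, remove_default_code):
    """Return the URL prefix the visitor ends up on for the chosen language."""
    if lang == default_lang and remove_default_code:
        return "/"
    return f"/{url_codes[lang]}/"


codes = {"en-GB": "en", "de-DE": "de"}
crawler_lang = landing_language("browser", None, "en-GB", codes)
print(crawler_lang, landing_path(crawler_lang, "en-GB", codes, True))
```

Under these assumptions a crawler that sends no language lands on the default language at `/`, while a visitor whose browser is set to de-DE lands on `/de/`.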
I have asked some SEO specialists to have a look.
@infograf768
Great, thanks!
I hope this PR can make it for J3.4...
@infograf768
Any news from the SEO specialists?
I see we are on the eve of the release of 3.3.4 and I'm wondering if there is a possibility to see this PR included...
Still waiting...
Status | New | ⇒ | Pending |
Hi
I know I'm a bit late to comment on this PR.
Examples of proper hreflang implementation are http://www.ea.com/ and https://www.mozilla.org/en-US/
ea homepage
<link href="http://www.ea.com/" rel="canonical"/>
<link rel="alternate" hreflang="en-US" href="http://www.ea.com/"/>
<link rel="alternate" hreflang="en-AU" href="http://www.ea.com/au"/>
<link rel="alternate" hreflang="en-CA" href="http://www.ea.com/ca"/>
<link rel="alternate" hreflang="en-GB" href="http://www.ea.com/uk"/>
ea en-GB page
<link href="http://www.ea.com/uk" rel="canonical"/>
<link rel="alternate" hreflang="en-US" href="http://www.ea.com/"/>
<link rel="alternate" hreflang="en-AU" href="http://www.ea.com/au"/>
<link rel="alternate" hreflang="en-CA" href="http://www.ea.com/ca"/>
<link rel="alternate" hreflang="en-GB" href="http://www.ea.com/uk"/>
On the multilingual demo
multilingual.demojoomla.com/fr/ this is what it currently displays
<link href="/fr/?format=feed&type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
<link href="/fr/?format=feed&type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
<link href="http://multilingual.demojoomla.com/en/" rel="alternate" hreflang="en-GB" />
<link href="http://multilingual.demojoomla.com/de/" rel="alternate" hreflang="de-DE" />
<link href="http://multilingual.demojoomla.com/it/" rel="alternate" hreflang="it-IT" />
<link href="http://multilingual.demojoomla.com/es/" rel="alternate" hreflang="es-ES" />
<link href="http://multilingual.demojoomla.com/mk/" rel="alternate" hreflang="mk-MK" />
<link href="http://multilingual.demojoomla.com/ta/" rel="alternate" hreflang="ta-IN" />
multilingual.demojoomla.com/fr/ should display this (the only change is to en-GB)
<link href="/fr/?format=feed&type=rss" rel="alternate" type="application/rss+xml" title="RSS 2.0" />
<link href="/fr/?format=feed&type=atom" rel="alternate" type="application/atom+xml" title="Atom 1.0" />
<link href="http://multilingual.demojoomla.com/" rel="alternate" hreflang="en-GB" />
<link href="http://multilingual.demojoomla.com/de/" rel="alternate" hreflang="de-DE" />
<link href="http://multilingual.demojoomla.com/it/" rel="alternate" hreflang="it-IT" />
<link href="http://multilingual.demojoomla.com/es/" rel="alternate" hreflang="es-ES" />
<link href="http://multilingual.demojoomla.com/mk/" rel="alternate" hreflang="mk-MK" />
<link href="http://multilingual.demojoomla.com/ta/" rel="alternate" hreflang="ta-IN" />
Hope this helps
Jess
Thank-you for your contribution, Jessica!
So basically, if I'm not mistaken, in your opinion it is OK to apply this PR, which removes the language code from the default language's rel="alternate" links when it is also removed from the site's URLs. Good: I hope this can make it for 3.3.5 (which is soon to be released, I think).
What I found interesting, anyway, is what mozilla.org is doing: they do not remove the language codes (neither from the URLs nor from the rel="alternate" links), and this is possible with Joomla too, but what they do (and we don't) is add a rel="alternate" link for an apparently non-existent default language.
As an example, in en-US they have:
<link rel="canonical" hreflang="en-US" href="https://www.mozilla.org/en-US/">
<link rel="alternate" hreflang="x-default" href="https://www.mozilla.org/">
<link rel="alternate" hreflang="en-US" href="https://www.mozilla.org/en-US/" title="English (US)">
...
I'm wondering if we should do the same when we do not remove the language code from our URLs...
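To make the idea concrete, here is a minimal Python sketch of what "doing the same" could look like: every language keeps its code, and one extra x-default entry points at the bare site root. The function name and the `url_codes` mapping are hypothetical, and nothing here is taken from Joomla or from mozilla.org's code.

```python
def with_x_default(base, url_codes):
    """Return (hreflang, href) pairs, led by an x-default entry for the root.

    url_codes maps a language tag to its URL code (hypothetical structure).
    """
    links = [("x-default", f"{base}/")]
    for lang_tag, url_code in url_codes.items():
        # No code is removed here: each language keeps its prefix.
        links.append((lang_tag, f"{base}/{url_code}/"))
    return links


for hreflang, href in with_x_default("https://www.mozilla.org", {"en-US": "en-US"}):
    print(f'<link rel="alternate" hreflang="{hreflang}" href="{href}">')
```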
Thanks, again.
Sergio
Too late for 3.3.5. This would go in 3.4.0
3.3.5 was released an hour ago.
#4398 has not been committed
On 30 September 2014 15:54, Sergio Manzi (smz) wrote:
@infograf768
I see PRs are still being committed to 3.3.5, e.g. #4398.
Why can't this one be committed?
Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/
Really, don't take this as an ironic comment, but maybe we have a window of opportunity with 3.3.6...
Status | Pending | ⇒ | Closed |
Closed_Date | 2014-09-05 10:21:23 | ⇒ | 2014-10-02 08:12:57 |
3.3.5 and 3.3.6 were specific releases aimed at solving Regressions.
Now we can merge this one.
Still need tests on 2.5.x
Confirmed: the issue also exists for this PR.
The problem is due to the usage of the str_replace() function to remove the language code, as it acts globally on the link.
We should probably use a correctly set preg_replace() or make the site domain name part of the replace subject.
... sorry, I meant to say "part of the replace search!"
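A short sketch of the difference between a global replace and one that makes the site base part of the search. It is written in Python rather than PHP, the URL is made up (the failing link from the thread is not shown), and both helper names are hypothetical.

```python
def strip_code_globally(link, code):
    # Mirrors a plain global replace: every "/<code>/" in the link is hit.
    return link.replace(f"/{code}/", "/")


def strip_code_after_base(link, base, code):
    # Only touches the code when it directly follows the site base.
    prefix = f"{base}/{code}/"
    if link.startswith(prefix):
        return f"{base}/" + link[len(prefix):]
    return link


link = "http://example.test/site/en/blog/en/post.html"
print(strip_code_globally(link, "en"))
print(strip_code_after_base(link, "http://example.test/site", "en"))
```

The global version also removes the second `/en/`, which belongs to the page path; the anchored version leaves it alone. An anchored regular expression would be another way to get the same effect.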
Confirming the issue and your solution
I corrected the patch for 2.5.
Can you make a PR for 3.x ?
Hi @infograf768!
Well, yes, I can try to make a PR: I'm not much into that, but this is definitely something I want to learn, so this can be an occasion to cut my teeth... Can you provide a little bit of guidance?
I'm a bit ashamed to confess, but it is still not clear to me which branch I should fork and to which I should send my PR (the 3 instances you're talking about, I think): master, staging, 3.4-dev? I think I've read somewhere that staging is being abandoned: is this correct? What about 3.2.x?
Also, in my PR should I reference the original issue (#4215), this PR (#4221), or open a new issue and then reference that new one?
Keep in mind that I can use GitHub for Windows or edit directly in the GitHub site: no Linux here...
Thanks in advance for any guidance you can provide
Regarding the branches, it's explained at https://github.com/joomla/joomla-cms/blob/staging/CONTRIBUTING.md#branches
http://docs.joomla.org/Using_the_Github_UI_to_Make_Pull_Requests will give you a real quick run down on creating a pull request using the GitHub website. I don't think we have docs right now on working with the desktop clients.
Thanks @mbabker
I've read the document, and what I understand is that normally I should clone master and send PR to staging for the current version, 3.3.x.
Anyway, @infograf768 has previously stated that this fix is going to be incorporated into 3.4.0 (see above, 10 days ago), so I think I should clone 3.4-dev and send PR to that, correct?
But what @infograf768 has stated 4 hours ago (we have 3 instances in 3.x) is puzzling me...
I've read the document, and what I understand is that normally I should clone master and send PR to staging for the current version, 3.3.x.
master and staging should be identical most of the time. So it doesn't really matter which of those two you use.
Anyway, infograf768 has previously stated that this fix is going to be incorporated into 3.4.0 (see above, 10 days ago), so I think I should clone 3.4-dev and send PR to that, correct?
Doing it against staging is the best thing to do most of the time. If we decide it goes into 3.4, we can always commit it there.
Thanks Bakual,
I've also noticed, now, that @infograf768 committed this PR to staging, so it is clear enough that my yet-to-be PR correcting this PR should be committed to staging too.
Is it OK if in my PR I reference this PR or should I reference the original Issue?
hmm... I'm really afraid that my fix is not safe either:
If we have a host called "engineering.example.com" and we are going to remove the /en language code from a link we will screw-up the host name:
http://engineering.example.com/en/
will become
http://gineering.example.com/en
Have more work to do on this...
Nope, bullshit! I'm searching for /en/, not just /en
Sorry!
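The worry and its retraction can be checked with a quick Python sketch (an illustration only, not the PHP from the PR). The exact broken string differs slightly from the hand-written one above, but the point is the same: searching for `/en` also matches inside `//engineering`, while searching for `/en/` does not.

```python
link = "http://engineering.example.com/en/"

# Searching for "/en" (no trailing slash) also matches "//en..." in the host.
without_slash = link.replace("/en", "")

# Searching for "/en/" only matches the language segment of the path.
with_slash = link.replace("/en/", "/")

print(without_slash)
print(with_slash)
```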
OK, I created my PR... I hope everything is alright...
Just thinking that maybe we should not do it for the home page, as it could lead to confusion:
With the language code taken out, the URL would be redirected according to Browser Settings or Default site, depending on the language filter settings. Default site would be fine, but Browser Settings would not really be, although it would do little harm. At the same time, it is only for SEO in the alternate link...
What do you think?