lunalars
12 Nov 2014

On a multilingual site urls called without the language code are redirected by a "303 see other" instead of "301 moved permanently".
The code used here is the "first half" of https://github.com/joomla/joomla-cms/pull/2845/files provided by infograf768 (thanks!).
Edit:
Please note: this PR only applies if "Use URL rewriting" is set to "Yes" and "Remove URL Language Code" is set to "No"!

Steps to reproduce the issue:

  1. create a multilingual site with "Use URL rewriting" set to "Yes" in global configuration
  2. in plugin "System - Language Filter" set
    a. "Language Selection for new Visitors" to "Site Language"
    b. "Automatic Language Change" to "Yes"
    c. "Item associations" to "Yes"
    d. "Remove URL Language Code" to "No"
  3. go to homepage in firefox
  4. start firebug and switch to the "network" tab
  5. in the addressbar remove the language code and hit F5

Expected result:

The first redirect shown in the network tab should be a 301 redirect as the url without the language code is no longer available, and so it should be redirected "permanently".

Actual result:

The redirect is a "303 see other".
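
For anyone who prefers to check this without Firebug, here is a minimal sketch of reading the first response without following the redirect. It only uses Python's standard `http.client`; the function names `request_target` and `first_response` are made up for this illustration and are not part of Joomla or of the patch.

```python
import http.client
from urllib.parse import urlsplit


def request_target(url):
    """Split a URL into (scheme, host[:port], path-with-query) for a raw request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def first_response(url, timeout=10.0):
    """Return (status code, Location header) of the FIRST response only.

    http.client never follows redirects on its own, so a 301 or 303
    is reported as-is instead of being hidden behind the final page.
    """
    scheme, netloc, path = request_target(url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(netloc, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example call (needs network access; the URL is a placeholder):
# status, location = first_response("http://www.example.com/")
```

Calling `first_response` on a URL without the language code should show the status code discussed here, together with the target of the redirect.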

System information:

PHP 5.5.9-1ubuntu4.5
MySQLi 5.5.40-0ubuntu0.14.04.1
Caching Disabled
GZip Disabled

Additional comments:

After applying the patch, the redirect is a "301 moved permanently".

This was addressed a while ago on the old tracker:
http://joomlacode.org/gf/project/joomla/tracker/?action=TrackerItemEdit&tracker_item_id=33194&start=8400
and there were some problems with cache enabled mentioned by Jean-Marie Simonet.
So please test with and without cache enabled.

And sorry for adding the changed gitignore file here - i don't know how to remove it from the PR - still new to git.

Votes

# of Users Experiencing Issue
1/1
Average Importance Score
5.00

lunalars - open - 12 Nov 2014
jissues-bot - change - 12 Nov 2014
Labels Added: ?

smanzi - comment - 12 Nov 2014

I'm not sure at all a 301 is correct in this scenario:

  • Some use "300 Multiple Choices"
  • Google uses a "302 Found"
  • Others use a "303 See Other" (like it was)

But there seems to be a consensus that 301 is not a good choice...

See: http://stackoverflow.com/questions/8325784/http-status-code-for-language-redirect

Bakual - comment - 12 Nov 2014

I also would say 301 is wrong.
It's meant for pages where the old one no longer exists and was moved to a new place. Software which is a bit smarter (for example RSS readers and search engines) usually updates its stored links and never tries the old link again.
I don't think that is what we want.
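
To make the difference between the codes mentioned so far concrete, here is a small sketch using Python's standard `http.HTTPStatus`. The helper names `is_permanent` and `describe` are made up for this illustration; the only claim encoded is that 301 and 308 are the redirect codes defined as permanent, while 300, 302, 303 and 307 are not.

```python
from http import HTTPStatus

# The redirect codes that are defined as permanent.
PERMANENT = {HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.PERMANENT_REDIRECT}


def is_permanent(status):
    """True if clients may replace their stored link with the new one."""
    return HTTPStatus(status) in PERMANENT


def describe(status):
    """Human-readable summary, e.g. for comparing 300, 301, 302 and 303."""
    code = HTTPStatus(status)
    kind = "permanent" if code in PERMANENT else "not permanent"
    return f"{code.value} {code.phrase} ({kind})"


for candidate in (300, 301, 302, 303):
    print(describe(candidate))
```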

lunalars - comment - 12 Nov 2014

Let me explain a bit more:
The site where i ran into this had only one language (german) before. Now we decided to add english (more to come) and use the language code in the url for all languages (including default).
Google has indexed all "old" urls without the language code, but these are no longer valid now and so we have to tell google that they are moved permanently (this is what the seo expert i'm working with says).

Example:
http://www.example.com/ no longer exists and should be redirected to http://www.example.com/de
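
A rough sketch of the mapping meant here, from an old URL without language code to the new one. The function name `add_language_prefix` and its parameters are hypothetical, this is not how the Language Filter plugin does it, and "de" and "en" are just the two languages of this example.

```python
from urllib.parse import urlsplit, urlunsplit


def add_language_prefix(url, default_lang="de", known_langs=("de", "en")):
    """Return the URL with the default language code as first path segment.

    URLs that already start with a known language code are returned unchanged.
    Note: a trailing slash in the old path is not preserved by this sketch.
    """
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if segments and segments[0] in known_langs:
        return url
    new_path = "/" + "/".join([default_lang] + segments)
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


print(add_language_prefix("http://www.example.com/"))
print(add_language_prefix("http://www.example.com/en"))
```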

This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/5092.

brianteeman - comment - 12 Nov 2014

In that example you should really be using the option in the plugin NOT to show the language code on the default language. So www.example.com still exists for the german and the new url is only for the english: www.example.com/en

lunalars - comment - 12 Nov 2014

First: thanks for looking at this!

SEO is pure fun: 10 people, 20 opinions ;-)

@smanzi
i followed the link and the others posted on stackexchange and also gave them to my seo expert. Will look deeper tomorrow. I might be wrong, but i think i read on one of Google's pages that using 301 would be the correct way - if i find this again, i'll post a link.

@Bakual
so this case seems to be the special one where it would be correct?! ;-)

@brianteeman
will talk with my seo expert. I cannot remember why we chose this way, as it was decided a while ago (the "new" site is not online yet, so we could change it).

Ok, as i said: will talk to my seo expert and report back tomorrow (or maybe on friday).

smanzi - comment - 12 Nov 2014

If you are not online yet I'd surely follow @brianteeman's advice and use the option of hiding the language code for the default language. Google will be really happy...

brianteeman - change - 13 Nov 2014
Category Multilanguage SEF

ManuxGR - comment - 13 Nov 2014

@lunalars
Probably your seo expert noticed duplicated content in Google Webmaster Tools and that's why he advised a 301 redirect. I agree.

I was using "Remove URL Language Code" set to "yes" for a year or more. So, for the default language "el", the links were redirected and "el" was removed from the urls. I don't know why, but 2 weeks ago i got duplicated content from this option, so i decided to use the language code for both languages and disabled this option. It makes more sense. (Check the image: "Duplicated content by Remove URL language code".)

From a seo perspective this is wrong. The language code should not be removed from the url. It's not logical for one language to use the code and for another not. Google likes urls to make sense even if something is obvious.

Of course this is only my opinion and probably we are here because google is making changes.. ;)

smanzi - comment - 13 Nov 2014

@lunalars There is a bug (see #4215 and #4221) in the rel="alternate" links that is causing this duplication. Just apply #4221 and you will not see duplicated any more...

ManuxGR - comment - 14 Nov 2014

@smanzi I am not lunalars :) Thanks for your help

I am not using "remove language url code from url" any more. But the duplicated content has been indexed, as in the image above, because i was using "remove language code" together with associations for menus and articles. I think this is my issue in #5092. How is the duplicated content going to be fixed with 303 redirects? I think i need 301. Sorry, but i am not an expert. :)

smanzi - comment - 14 Nov 2014

@ManuxGR Ooops! Sorry for mixing up usernames!
Sorry, but I'm now really busy: I will follow up with a complete post later.
Anyway:

  • Your site was double-indexed because of the aforementioned bug.
  • I wouldn't have gone the way of adding the language code for the default language...
  • The only solution I see is to use .htaccess to redirect the unwanted URLs to the good ones with 301 and...
  • Submit a correct sitemap, and...
  • Optional: Use Google Webmaster Tools to remove the unwanted URLs, and...
  • Be patient!

ManuxGR - comment - 14 Nov 2014

@smanzi Thanks once again for your reply.
I can see there is something holding you back from letting me apply this fix #5092; please explain it to me when you find some time. I assume you already know that if "remove language code url" is set to "no", the same redirects are applied by core. For example, Joomla.com/smart/screen will be redirected to Joomla.com/en/smart/screen by default.
@infograf768 suggested in a comment not to use "remove language code url" if the site is using associations, and i am using a lot of them.
I have already overloaded htaccess with 301 path redirects; that's why i am looking into this automated fix.

lunalars - comment - 14 Nov 2014

@smanzi "Be patient!" - agree! The most important part of our work on seo :-)

Didn't test, but agree with @smanzi : #4221 should fix (at least a part) of the duplicated content problem. But if i remember correctly, there is (or was?) a similar problem with languageswitcher, which also adds the languagecode.

Talked to my seo expert today and he thinks that it's more logical to use the languagecode for all languages, as this results in a more unified url structure (and he did it successfully like this in 2 other non-joomla projects). So, in this case we really need the 301 as there is no url without languagecode anymore.
For me this sounds logical and as i'm not a seo expert (just basic knowledge) i will have to follow him on this project and for now go with the "core-hack" to get 301 :-) (yes, i know, core-hack is bad, i've been there :-) the client is warned and gave his ok)

@manuxGR I'm not sure what to do in your situation, but would again follow my seo expert and say it's for now better to keep the languagecode. But then - as you said - you would need the 301. If you are willing to pay the price, you could go with the core-hack for now, too.
By the way: we are also using associations and some htaccess redirects :-)

To me it seems there are different ways to do it right. Also i remember reading on a google help page (sorry, again no link) that you can do it in different ways, as long as you avoid really bad things. Maybe, as there are these different opinions and needs, we can implement an option for this, to handle them? Yes, not too many options, please :-)

ManuxGR - comment - 14 Nov 2014

@lunalars Thanks for your opinion and for asking your seo expert. I agree. :)

But @smanzi is a coder and we need his detailed opinion so we can make a final decision that will work permanently.

I don't get it with #4221, this fix is only working when "remove url language code" is set to "yes" and because of language switcher pointing to wrong link and then a 303 redirect will apply.
BUT when the option is set to "no" the language switcher is working properly with associations and is pointing to the correct link no 303 redirect.

smanzi - comment - 14 Nov 2014

@ManuxGR me? a coder? You are greatly overestimating me! Well, I've been coding more or less since 1980, and two hands are not enough for enumerating the languages I had to learn, but as a Joomla coder I'm a real noob!

Now you will forgive me, but I have to make some history to wrap up what the problem is (and yes, IMHO we still have a problem).

I opened the #4215 issue because I discovered what I think is a bug in the Language Switcher module (remember, we have a Language Switcher module and a Language Filter plugin): Language Switcher generates links for the default language where the language code is not suppressed even if we set the Remove URL Language Code option.

Soon the attention shifted to a similar bug in the language filter and the rel="alternate" links it generates. This latter aspect was quickly fixed by @infograf768 with PR #4221

But... @infograf768 had some afterthoughts and closed the PR. A long discussion erupted that was finally cleared up by a SEO expert, @jessicadunbar with this post: #4221 (comment)

At this point #4221 saw the light, a similar PR #4425 for Joomla 2.5 was created and... everybody (including myself) forgot about the initial issue with the language switcher module!

So, here we are: Joomla is still responsible for the creation of double-indexed content (even when #4221 is applied) if we set the Remove URL Language Code option (I have evidence of this with a live site).

But... going to the wrong URL generated by the language switcher yields a redirect to the correct one with a 303 - See Other HTTP status code, somebody could say. Well, Google doesn't give a damn and indexes the wrong URL as well. This, I didn't expect, really.

The w3.org document about status codes (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html) is not very clear about the 303. What I've read around, anyway, is that a 303 (contrary to a 301) does not pass page rank to the page to which one is redirected and is accepted as a valid URI, although the cached page is the one to which one is redirected. See http://www.marketingchip.com/seo-experiments/how-does-a-303-redirect-affect-seo/ very interesting

At this point I would say that:

  • Language switcher must be fixed, and...
  • I've changed my mind: 301 is what we need to redirect to non canonical URL!

smanzi - comment - 14 Nov 2014

My bad! The last line should read:

  • I've changed my mind: 301 is what we need to redirect from non canonical URL!

smanzi - comment - 14 Nov 2014

Untested yet, but: if you accept not having the language code for the default language it should be quite easy to de-duplicate all URLs by putting this in .htaccess:

RewriteRule ^gr/(.*)$ $1 [R=301,L]

If gr is your default language code.
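
What the pattern in that rule is meant to match can be sketched in Python. This only mirrors the regular expression; the function name `apply_rule` is made up, and it says nothing about how Apache or Joomla behave around the rule (a follow-up comment reports that the rule did not work as written). One thing assumed here is the usual .htaccess behaviour that the leading slash is removed from the path before the pattern is tested.

```python
import re

# Same pattern as in the RewriteRule above, with "gr" as default language code.
RULE = re.compile(r"^gr/(.*)$")


def apply_rule(request_path):
    """Return the intended redirect target, or None if the rule does not match."""
    # In per-directory (.htaccess) context the pattern is tested
    # against the path WITHOUT its leading slash.
    match = RULE.match(request_path.lstrip("/"))
    if match is None:
        return None
    return "/" + match.group(1)


print(apply_rule("/gr/smart/screen"))
print(apply_rule("/en/smart/screen"))
```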

I see it more difficult to redirect naked pages to the ones with the default language code... anybody has an idea?

smanzi - comment - 14 Nov 2014

... no, the above doesn't work: I have to think it over... :-/

ManuxGR - comment - 14 Nov 2014

@smanzi don't bother about htaccess; in the end i will use the language code for both languages and 301 redirects instead of 303. And i wish myself good luck... once again.....

Many thanks for your support and for explaining it to me in such detail, you are so helpful. I am glad there are people like you and lunalars around. :) I am reading the links you attached.

smanzi - comment - 15 Nov 2014

I've created two new PRs that are "language related":

  • #5109 eliminates the language code for the default language from the Language Switcher
  • #5110 is an extension of this PR (thanks @lunalars !). It does take into account some more cases, but it is a very preliminary PR.

There will be more to discuss as (after many, many tests) I've come to the conclusion that the possibility to eliminate the language code for the default language, while absolutely legit and without negative SEO impact if correctly implemented, is, at this time, full of issues and just a bit more of a kludge.

infograf768 - comment - 15 Nov 2014

The point here imho is to decide which is better: a 303 or a 301.

Code is another matter.

ManuxGR - comment - 15 Nov 2014

301; we have evidence that 303 created duplicated content. And the language switcher should be fixed for the next version so new websites don't have this problem.

infograf768 - comment - 15 Nov 2014

As I commented on #5109, this will not work because of the set cookie.

infograf768 - comment - 17 Nov 2014

@lunalars and all
After much thought, I think the 301 is fine when remove url language code is set to Yes.
But this PR is not correct, as it sets the 301 in the wrong place; we need to fulfill all of these conditions:
1. Remove URL Language Code is set to YES in the language filter
2. Search Engine Friendly URLs is set to Yes in Global configuration
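
As a rough sketch of the rule just described (not the actual code of the plugin or of any PR; the function and parameter names are made up): a 301 is only sent when both conditions hold, otherwise the 303 stays.

```python
def redirect_status(remove_default_lang_code, sef_enabled):
    """Pick the status code for the language redirect.

    remove_default_lang_code: "Remove URL Language Code" in the language filter
    sef_enabled: "Search Engine Friendly URLs" in Global configuration
    """
    if remove_default_lang_code and sef_enabled:
        return 301  # both conditions fulfilled: permanent redirect
    return 303      # otherwise keep the previous behaviour


print(redirect_status(True, True))
print(redirect_status(False, True))
```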

I created a new PR for you to test:
#5129

lunalars - change - 17 Nov 2014
The description was changed

lunalars - comment - 17 Nov 2014

@infograf768 and all
Thanks to @infograf768 i just noticed an error in my "Steps to reproduce":
it should be "Remove URL Language Code" is set to "No" (it's edited)
as i want to keep the language code for all languages.
Sorry for the confusion!
So i think this PR is still valid for the situation i described above, if there is a consensus, that 301 would fit for it, too.

infograf768 - comment - 17 Nov 2014

Please just test #5129

lunalars - comment - 17 Nov 2014

@infograf768
Tested #5129 successfully, but it only applies, if "Remove URL Language Code" is set to "Yes" - in my example above it's set to "No".
Or am i missing something?

infograf768 - comment - 17 Nov 2014

@lunalars
If url language code is set to NO, we do provide the prefix /en/ and, for these links, we never get a 303

lunalars - comment - 17 Nov 2014

@infograf768
In this case, if you call the URL without the language code (eg. if you find an old link on google), you will get a 303. This still happens after applying #5129.

infograf768 - comment - 17 Nov 2014

no.
Look at my test site in #5129 and just enter a url without the prefix directly in the browser for en-GB.

ManuxGR - comment - 17 Nov 2014

It's working properly for me.

But what about negative SEO from bad competitors?
404

The url above is from the secondary language and probably cannot be found by spiders or humans, so maybe it's ok. I am just saying this could be a security hole, or not?

infograf768 - comment - 17 Nov 2014

Folks, when commenting on a PR, do it in the correct PR.
@ManuxGR Do you mean this PR or the one I posted, #5129?

smanzi - comment - 17 Nov 2014

@infograf768 I've yet to test, but I think the condition @lunalars wants to handle is the opposite:

  • Old URL: without the language code
  • New URL: with the language code

ManuxGR - comment - 17 Nov 2014

Yes sorry this was for #5129

smanzi - comment - 17 Nov 2014

oops!

infograf768 - comment - 17 Nov 2014

Ref: #5092 (comment)

"But what about negative SEO from bad competitors?"
I have no idea what you are talking about.
And, if I look at your site : language code is present for both languages

ManuxGR - comment - 17 Nov 2014

Ok i forget about this

lunalars - comment - 17 Nov 2014

Maybe i'm blind :-)
As said #5129 works well for me, if "Remove URL Language Code" is set to "Yes" .
That's on your testsite and on my local testsite if i apply the settings you provided in your test instructions.

But this PR here (#5092) only applies, if "Remove URL Language Code" is set to "No" .

infograf768 - comment - 17 Nov 2014

Again, when set to NO, there is NO 303.

smanzi - comment - 17 Nov 2014

@infograf768 sorry if I post here again, but as you replied here I think it is better to keep all together.

You are right that there are no programmatically generated redirects, but there can be if one changes his site configuration from yes to no.

lunalars - comment - 17 Nov 2014

Sorry, but i get a 303 :-)
Will set up a test site in the evening and post the link and "instructions" here.

lunalars - comment - 17 Nov 2014

Now i've set up a test site at http://www.lunadev.de with pure joomla 3.3.6 (if i should update to latest staging, please tell me!).
The only "hack" is a 301 redirect from non-www to www in htaccess.

  1. "Remove URL Language Code" is set to "No"
  2. in global config "SEF" is set to "Yes"
  3. in global config "SEF rewrite" is set to "Yes"

Imagine the site had only 1 language (german) before and google has indexed all URLs without language code.

Now you just added english as the second language and applied the settings above.
If you now call the "old" homepage at http://www.lunadev.de (or any other "old" URL without language code) you will be redirected by 303 to the new URL with the language code.

You can check this at http://redirectdetective.com :
enter http://www.lunadev.de
or http://www.lunadev.de/startseite-de/8-deutsch/1-german-article-1
and you will get a 303.

Everything else works fine! No other redirects when using menu or language switcher on the site.

I hope this makes it more clear. If i mix up something, please tell me.

infograf768 - comment - 17 Nov 2014

The real url of this site is:
http://www.lunadev.de/de/startseite-de/8-deutsch/1-german-article-1
NOT
http://www.lunadev.de/startseite-de/8-deutsch/1-german-article-1 // without the "de" prefix as you have set :
"Remove URL Language Code" to "No"

If you had an old homepage
http://www.lunadev.de/startseite-de/8-deutsch/1-german-article-1
Then try to set Remove URL Language Code to YES and make sure German is the default language.

infograf768 - close - 17 Nov 2014

infograf768 - comment - 17 Nov 2014

Also, please post now in
#5129
as this PR is wrong and I am closing it.

infograf768 - change - 17 Nov 2014
Status changed from Pending to Closed
Closed_Date changed from 0000-00-00 00:00:00 to 2014-11-17 15:55:54

lunalars - comment - 17 Nov 2014

Just one more question, as i think this does not belong in #5129 :
I absolutely agree with you on the "real URL", but what is (in your opinion) the correct way to handle the call of the "wrong" URL ?

infograf768 - comment - 17 Nov 2014

If you had an old homepage
http://www.lunadev.de/startseite-de/8-deutsch/1-german-article-1
Then set Remove URL Language Code to YES and make sure German is the default language.

If the path is the same, it will work.

lunalars - comment - 17 Nov 2014

This does not answer my last question.
And i can't change to yes as i can't decide this on my own.
But i'll be quiet now :-)

smanzi - comment - 17 Nov 2014

Lars, I would probably try to handle your situation within .htaccess
