sebastienlapoux
24 Jul 2020

Steps to reproduce the issue

Write a Joomla URL with the language segment in uppercase, for example HTTPS://WWW.VARIO-EVENT.DE/DE/EVENTLOCATIONS/EVENTLOCATIONS-BERLIN.
Copy and paste this URL into the browser's address bar.
The URL is changed to HTTPS://WWW.VARIO-EVENT.DE/de/DE/EVENTLOCATIONS/EVENTLOCATIONS-BERLIN

Expected result

The browser loads HTTPS://WWW.VARIO-EVENT.DE/DE/EVENTLOCATIONS/EVENTLOCATIONS-BERLIN

Actual result

The link is redirected to HTTPS://WWW.VARIO-EVENT.DE/de/DE/EVENTLOCATIONS/EVENTLOCATIONS-BERLIN. de/DE -> 404

System information (as much as possible)

Joomla 3.

Additional comments

Joomla does not recognize the language segment when it is in uppercase. It therefore prepends the lowercase language segment and redirects to de/DE.

sebastienlapoux - open - 24 Jul 2020
joomla-cms-bot - labeled - 24 Jul 2020

brianteeman - comment - 24 Jul 2020

confirmed for j4

infograf768 - comment - 25 Jul 2020

Tested on J3, as the poster did. I partly confirm the issue.

  1. The URL language code should never be uppercase. It is automatically saved as lowercase when saving a Content Language.
  2. It is a known issue that after the domain (including subfolder paths, if any) one should not use uppercase segments. AFAIK, URLs are case-sensitive, except for the domain.

Firefox Macintosh:
This works
http://localhost:8888/INSTALLMULTI/TRUNKGITNEW/fr/blog/bienvenue-sur-votre-blog-fr-fr

These do not work => 404
http://localhost:8888/INSTALLMULTI/TRUNKGITNEW/FR/blog/bienvenue-sur-votre-blog-fr-fr
as well as
http://localhost:8888/INSTALLMULTI/TRUNKGITNEW/FR/BLOG/BIENVENUE-SUR-VOTRE-BLOG-FR-FR
and
http://localhost:8888/INSTALLMULTI/TRUNKGITNEW/fr/BLOG/BIENVENUE-SUR-VOTRE-BLOG-FR-FR

I cannot confirm getting de/DE/ here (in my case fr/FR/).

Is that Joomla related?

richard67 - change - 25 Jul 2020
Status: New -> Closed
Closed_Date: 0000-00-00 00:00:00 -> 2020-07-25 08:54:40
Closed_By: richard67

richard67 - comment - 25 Jul 2020

@sebastienlapoux Your expectation is wrong, see e.g. the discussion here: https://stackoverflow.com/questions/7996919/should-url-be-case-sensitive.

According to W3's "HTML and URLs" they should:

There may be URLs, or parts of URLs, where case doesn't matter, but identifying these may not be easy. Users should always consider that URLs are case-sensitive.

and

Domain names are case insensitive according to RFC 4343. The rest of URL is sent to the server via the GET method. This may be case sensitive or not.

So in a URL, the path, i.e. what comes after the domain part (which may be upper, lower or mixed case), might or might not be case-sensitive depending on the server OS and webserver configuration, but it is recommended to treat it as case-sensitive.

Later in the linked discussion you can see that some popular websites handle URLs case-insensitively (Stack Overflow) and some case-sensitively (Wikipedia).

Google seems to work case-sensitively, too.

So you can use your webserver's URL rewriting to work around this if you really insist on case-insensitive URLs, but I would not recommend that.

Closing as not a Joomla core issue.

richard67 - close - 25 Jul 2020

richard67 - comment - 21 Aug 2020

Re-opened as requested in issue #30439. @sebastienlapoux Please provide new information on why it should be re-opened.

richard67 - change - 21 Aug 2020
Status: Closed -> New
Closed_Date 2020-07-25 08:54:40
Closed_By richard67
richard67 - reopen - 21 Aug 2020

sebastienlapoux - comment - 21 Aug 2020

Hi @infograf768 and @richard67 ,

Thanks a lot for your answers.

Here are 3 websites on 3 entirely different hosts managed by 3 different companies. As you can see below, vario-event.de is not case-sensitive and the 2 others are case-sensitive (speaking only about the path of the URL).
But all 3 websites have the same issue when only the language segment is in uppercase (the last group in the list below).
Do you have any idea what explains this behavior?

https://www.vario-event.de/de/EVENTLOCATIONS/eventlocations-berlin -> https://www.vario-event.de/de/eventlocations/eventlocations-berlin
or
https://www.innio.com/fr/SOLUTIONS/production-d-energie -> https://www.innio.com/fr/SOLUTIONS/production-d-energie
or
https://www.swimmingpool.eu/en/PRODUCTS/19321-hayward-easy-temp-heater -> https://www.swimmingpool.eu/en/PRODUCTS/19321-hayward-easy-temp-heater

https://www.vario-event.de/de/eventlocations/EVENTLOCATIONS-BERLIN -> https://www.vario-event.de/de/eventlocations/eventlocations-berlin
or
https://www.innio.com/fr/solutions/PRODUCTION-D-ENERGIE -> 404
or
https://www.swimmingpool.eu/en/products/19321-HAYWARD-EASY-TEMP-HEATER -> 404

https://www.vario-event.de/de/EVENTLOCATIONS/EVENTLOCATIONS-BERLIN -> https://www.vario-event.de/de/eventlocations/eventlocations-berlin
or
https://www.innio.com/fr/SOLUTIONS/PRODUCTION-D-ENERGIE -> 404
or
https://www.swimmingpool.eu/en/PRODUCTS/19321-HAYWARD-EASY-TEMP-HEATER -> 404

https://www.vario-event.de/de/eventlocations/eventlocations-BERLIN -> https://www.vario-event.de/de/eventlocations/eventlocations-berlin
or
https://www.innio.com/fr/solutions/production-d-ENERGIE -> 404
or
https://www.swimmingpool.eu/en/products/19321-hayward-easy-temp-HEATER -> 404

LANGUAGE SEGMENT in uppercase
https://www.vario-event.de/DE/eventlocations/eventlocations-berlin -> redirected to https://www.vario-event.de/de/DE/eventlocations/eventlocations-berlin and 404
or
https://www.innio.com/FR/solutions/production-d-energie -> redirected to https://www.innio.com/fr/FR/solutions/production-d-energie and 404
or
https://www.swimmingpool.eu/EN/products/19321-hayward-easy-temp-heater -> redirected to https://www.swimmingpool.eu/en/

Thanks in advance.

infograf768 - comment - 21 Aug 2020

I think I have already explained this above. I don't know what more to say.
Windows and Linux/MacOS do not behave the same way with uppercase urls.
Only the domain can be uppercase to be safe with all environments.

richard67 - comment - 21 Aug 2020

@sebastienlapoux If you have an .htaccess file you can do a URL rewrite.

infograf768 - comment - 21 Aug 2020

see also https://www.computerhope.com/issues/ch000709.htm
and multiple other references on the Net

rdeutz - comment - 21 Aug 2020

Hi Sebastien,

I think we have different issues here. If the language tag is in uppercase, the system doesn't recognise it as a language tag and adds the default language tag.

I don't think we can do much here, because server configuration also plays a role. So I think the solution is to put something into .htaccess and convert the URL beforehand. I don't see how we could check the server configuration, guess what the right URL could be, and then redirect. What do you think? Any idea what another good strategy could be?
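
The mechanism described above (an uppercase language tag is not recognised, so the default one gets added) can be illustrated with a minimal sketch. It is not the language filter plugin's real code: the function name redirectTarget, the $knownSefs array and its values are assumptions made up for this example.

```php
<?php
// Hypothetical stand-in for the language lookup; NOT Joomla's actual code.
// $knownSefs maps URL language codes (saved in lowercase) to language tags.
function redirectTarget(string $path, array $knownSefs, string $defaultSef): ?string
{
    $parts = explode('/', trim($path, '/'));
    $sef   = $parts[0];

    // Case-sensitive lookup: "DE" is not a key, so it is treated as an ordinary segment.
    if (isset($knownSefs[$sef])) {
        return null; // language recognised, no redirect needed
    }

    // Unknown first segment: prepend the default language code and redirect.
    return '/' . $defaultSef . '/' . implode('/', $parts);
}

$knownSefs = ['de' => 'de-DE', 'en' => 'en-GB'];

var_dump(redirectTarget('/de/eventlocations/eventlocations-berlin', $knownSefs, 'de'));
var_dump(redirectTarget('/DE/eventlocations/eventlocations-berlin', $knownSefs, 'de'));
```

Under these assumptions the first call needs no redirect, while the second returns /de/DE/eventlocations/eventlocations-berlin, the doubled segment reported in this issue.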

infograf768 - comment - 22 Aug 2020

To test further, I modified the url language code in the db from fr to FR.
It works on MacOS/Linux, but we get to the same problem if the link is entered with a lowercase code.
http://localhost:8888/installmulti/trunkgitnew/FR/blog/bienvenue-sur-votre-blog-fr-fr is fine
but if one enters
http://localhost:8888/installmulti/trunkgitnew/fr/blog/bienvenue-sur-votre-blog-fr-fr
I get a 404.

So, it is possible to use an uppercase URL language code, but it must then be entered as uppercase everywhere it is used.
This is not common and best practice is to avoid it by all means.

Hope this helps.

sebastienlapoux - comment - 24 Aug 2020

Thanks @infograf768!

As we expected, your test shows that Joomla's code is sensitive to the case of the language variable. It must not be.

Romain will provide below a very easy fix in /plugins/system/languagefilter/languagefilter.php. Hopefully it will work without other consequences. We await your feedback on it.

Thanks.

manusfreedom - comment - 24 Aug 2020

Please see the following commit: eab1664

manusfreedom - comment - 24 Aug 2020

We can also apply the same solution to the browser's HTTP_ACCEPT_LANGUAGE in /libraries/src/Language/LanguageHelper.php (line 72 and line 87), although maybe all browsers already use lowercase:

...
        /**
         * Tries to detect the language.
         *
         * @return  string  locale or null if not found
         *
         * @since   1.5
         */
        public static function detectLanguage()
        {
                if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE']))
                {
                        $browserLangs = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']);
                        $systemLangs = self::getLanguages();

                        foreach ($browserLangs as $browserLang)
                        {
                                // Slice out the part before ; on first step, the part before - on second, place into array
                                // Fix uppercase language in HTTP_ACCEPT_LANGUAGE
                                $browserLang = strtolower(substr($browserLang, 0, strcspn($browserLang, ';')));
                                $primary_browserLang = substr($browserLang, 0, 2);

                                foreach ($systemLangs as $systemLang)
                                {
                                        // Take off 3 letters iso code languages as they can't match browsers' languages and default them to en
                                        $Jinstall_lang = $systemLang->lang_code;

                                        if (strlen($Jinstall_lang) < 6)
                                        {
                                                if (strtolower($browserLang) == strtolower(substr($systemLang->lang_code, 0, strlen($browserLang))))
                                                {
                                                        return $systemLang->lang_code;
                                                }
                                                // Fix uppercase language in HTTP_ACCEPT_LANGUAGE
                                                //elseif ($primary_browserLang == substr($systemLang->lang_code, 0, 2))
                                                elseif ($primary_browserLang == strtolower(substr($systemLang->lang_code, 0, 2)))
                                                {
                                                        $primaryDetectedLang = $systemLang->lang_code;
                                                }
                                        }
                                }

                                if (isset($primaryDetectedLang))
                                {
                                        return $primaryDetectedLang;
                                }
                        }
                }

                return;
        }
...

infograf768 - comment - 24 Aug 2020

your test is showing that Joomla code is sensitive to the language variable case. It must not.

Nope.
Linux/MacOS are sensitive to the url language code case AS WELL as the other parts of the url except the domain... This is not specific to J.

infograf768 - comment - 24 Aug 2020

Now, tbh, why the heck are you insisting on using an uppercase sef when joomla is by default saving the sef as lowercase?

sebastienlapoux - comment - 26 Aug 2020

Hi @infograf768 ,

@brianteeman , @rdeutz , @richard67 thanks for your feedback.

It's a Joomla code bug! Joomla converts to lowercase in a lot of cases, but not (perhaps only) for the language segment.

Why can we change any character to uppercase in any Joomla URL, except in the language segment?

Thanks.

rdeutz - comment - 26 Aug 2020

Why we can change any characters in uppercase into any Joomla URL but just not for the language segment?

Why are you changing it to uppercase at all?

sebastienlapoux - comment - 26 Aug 2020

Why you are changing it to uppercase at all?

Sometimes URLs are written in uppercase for marketing and communication purposes. When users type such a URL manually, seeing it in uppercase, they may enter it in uppercase in their browser too.

infograf768 - comment - 26 Aug 2020

Can do the

-			$sef = $parts[0];
+			$sef = strtolower($parts[0]);

as it does not break anything.
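
To show the effect of that one-line change in isolation, here is a minimal sketch. It is not the plugin's real code: the helper name languageFromPath and the $knownSefs map are hypothetical, and the sketch assumes the URL language codes are stored in lowercase, which is what Joomla saves by default.

```php
<?php
// Hypothetical helper illustrating the change; NOT the languagefilter plugin itself.
// Assumption: the keys of $knownSefs are lowercase URL language codes.
function languageFromPath(string $path, array $knownSefs): ?string
{
    $parts = explode('/', trim($path, '/'));

    // Lowercase only the first segment before the lookup, as in the diff above.
    $sef = strtolower($parts[0]);

    // Return the matching language tag, or null when the segment is not a language code.
    return $knownSefs[$sef] ?? null;
}

$knownSefs = ['de' => 'de-DE', 'fr' => 'fr-FR'];

echo languageFromPath('/DE/eventlocations/eventlocations-berlin', $knownSefs), PHP_EOL;
echo languageFromPath('/fr/blog/bienvenue-sur-votre-blog-fr-fr', $knownSefs), PHP_EOL;
```

With this lookup, /DE/... and /de/... resolve to the same language. Only the language segment is normalised; the rest of the path is left untouched, so its case handling still depends on the router and the server. A site that deliberately stored an uppercase code in the database would also need its stored codes lowercased for the comparison.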

richard67 - comment - 26 Aug 2020

And we can also take same solution with browser HTTP_ACCEPT_LANGUAGE in /libraries/src/Language/LanguageHelper.php (line 72 and line 87) but maybe all browsers respect lowercase:

@manusfreedom As far as I know and have seen, browsers always use lowercase, so no need for a fix there.

infograf768 - change - 26 Aug 2020
Status: New -> Closed
Closed_Date: 0000-00-00 00:00:00 -> 2020-08-26 10:17:07
Closed_By: infograf768

infograf768 - comment - 26 Aug 2020

Please test #30485.
Closing.

infograf768 - close - 26 Aug 2020
