User tests: Successful: Unsuccessful:
In July 2015 a lot of people got "Googlebot cannot access CSS and JS files" notifications from Google.
This PR implements the fix described here:
http://upcity.com/blog/how-to-fix-googlebot-cannot-access-css-and-js-files-error-in-google-search-console/
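For context, a minimal sketch of the kind of rule the linked post describes, assuming the two Allow lines quoted later in this thread by the robots.txt checker and a Googlebot-specific User-agent block as discussed below:

    # Sketch only: the Allow values match the lines the checker quotes
    # further down (robots.txt lines 31/32); the Googlebot User-agent
    # block is assumed from the Googlebot-specific discussion.
    User-agent: Googlebot
    Allow: .js
    Allow: .css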
Status | New | ⇒ | Pending |
Labels | Added: ? |
Easy | No | ⇒ | Yes |
Category | ⇒ | Front End |
We might just remove the Disallow rule for plugins, components and modules. All of these might contain .js and .css files.
I would also remove cache, since some plugins put cached images there (e.g. ImageSizer).
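To make that alternative concrete, a minimal sketch of what it would mean in robots.txt.dist; only the four folders mentioned above are shown, and the surrounding file contents are assumed:

    User-agent: *
    # The suggestion above is to drop these Disallow lines so crawlers
    # can reach .js/.css (and cached image) files placed in them
    Disallow: /plugins/
    Disallow: /components/
    Disallow: /modules/
    Disallow: /cache/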
We might just remove the Disallow rule for plugins, components and modules. All of these might contain .js and .css files.
No. Those folders should not contain any js or css files to begin with (if the extension is properly developed), and Google should not index any of the other files in there.
Same for cache, there are files in there which definitely should not be indexed by Google.
If the proposed code here works, then that would be an acceptable solution imho.
No. Those folders should not contain any js or css files to begin with (if the extension is properly developed), and Google should not index any of the other files in there.
Then why do we need this patch at all?
The key words in that post are "properly developed". Extensions which don't follow best practices by placing web assets in the media folder prevent you from doing things like overriding the media with template level overrides, and they leave the files blocked from being indexed by bots unless you give explicit permissions.
So, the patch is really only needed if you are using extensions which don't use the images and media folders for assets that should be publicly accessible.
Of course, the other option is to just stop shipping a robots.txt file. Based on feedback in the forums, it seems that file is a major source of confusion and misunderstanding.
Also keep in mind that any change we do is only made to the robots.txt.dist file. The real robots.txt doesn't get changed, so users who face this issue need to change it manually anyway.
We can of course add yet another postinstall message saying that we updated the robots.txt.dist file, to test how many users actually read those.
Joomla 3.4.4 test was as expected.
I have tested this item successfully on 04aba94
works for me
Status | Pending | ⇒ | Ready to Commit |
Labels | Added: ? |
Still wondering: why only allow this for Googlebot? If this is a valid solution, why not allow all "bots" to index .js and .css files?
Milestone | Added: |
I experienced a problem with the Google bot and therefore I implemented it only for Google bot.
Do other search engines check the js & css as well?
I don't know if other search engines do or not. But I'd say if they don't do it yet today, they probably will sooner or later. And I don't want a PR every time a search engine adds that feature.
Also is there a reason why they shouldn't index it when Google is allowed? Probably not.
From my understanding I would just remove the Google bot limitation and allow it for anyone. But then, I would first throw out those stupidly built extensions anyway
Ok, good point!
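For reference, the generic variant being agreed on here would presumably drop the Googlebot restriction; this is a sketch, not the exact committed change:

    # Assumed generic form: same Allow lines, applied to every crawler.
    # Note (see the checker output below) that Allow is a non-standard
    # extension, so strict Robots Exclusion Standard parsers reject it.
    User-agent: *
    Allow: .js
    Allow: .css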
This PR has received new commits.
CC: @coolman01, @joomlamarco
Can you guys please test so we can merge it into 3.5? Thanks.
The checker that we link to in the robots.txt file itself says that this is invalid:

Line 31: Allow: .js
Unknown command. Acceptable commands are "User-agent" and "Disallow". A robots.txt file doesn't say what files/directories you can allow, only what you can disallow. Please refer to the Robots Exclusion Standard page for more information.

Line 32: Allow: .css
Unknown command. Acceptable commands are "User-agent" and "Disallow". A robots.txt file doesn't say what files/directories you can allow, only what you can disallow. Please refer to the Robots Exclusion Standard page for more information.
Google's robots.txt specification expands on the actual robots.txt standard to add support for non-standard commands. So *IF* we want to add the Allow statements, it has to be limited to bots from crawlers that use an extended standard. I've said it before and I'll say it again: I still think it's a bad idea to keep adding Google and/or crap-extension-specific workarounds to core.
I do agree with @mbabker here: if this is a Google-specific change, we shouldn't allow it. The next person will come wanting a Baidu exception; where do we stop then?
I wasn't aware at first that it was a robot specific setting. If people want it, they can add it to their own robots file. If it were generic, we could add it.
Status | Ready to Commit | ⇒ | Pending |
Labels | Removed: ? |
Milestone | Removed: |
Closing this issue as the solution is basically in place: we should use the media folder for media. Thanks everybody for your contributions.
Status | Pending | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2015-11-02 12:44:26 |
Closed_By | ⇒ | roland-d |
I'll just leave this here: #6361 #6702 #6839 #6098