? Success

User tests: Successful: Unsuccessful:

avatar chivitli
chivitli
20 Jul 2016

Pull Request for Issue # .

Summary of Changes

This is in addition to #5306 which was merged some time ago. While that pull request fixed issue with invalid article pages when article still had some pages, there was still a problem if you would completely remove page breaks from an article, because then the exception would never be thrown, it was too low in the code. Now if plugin detects that there are no pagebreaks in the article body, it will also additionaly check if there is a request for a subpage, and if yes throw 404.

Testing Instructions

To test, try opening some article that does not have page breaks and add to the end of the url ?start=xx where xx is any number. You should see the article normally. Apply the patch and make sure the same link gives 404 page.

avatar chivitli chivitli - open - 20 Jul 2016
avatar chivitli chivitli - change - 20 Jul 2016
Status New Pending
avatar joomla-cms-bot joomla-cms-bot - change - 20 Jul 2016
Labels Added: ?
avatar chivitli chivitli - change - 20 Jul 2016
The description was changed
avatar brianteeman
brianteeman - comment - 20 Jul 2016

I am not sure I want to see a 404 in that circumstance


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11217.

avatar chivitli
chivitli - comment - 20 Jul 2016

That would be consistent with behaviour for an article with page breaks. Imagine you have article with pages 1 and 2 and then you decided to remove them, to display all content on single page. Without this change, requesting page 2 (or requesting any page number) would show article normally. Compare this with current behavior: imagine article has pages 1, 2 and 3, and you remove page 3. Then page 3 and all above would throw 404. I don't see why behaviour should be different whether there is 1 or more pages.

avatar ggppdk
ggppdk - comment - 20 Jul 2016

hello @chivitli

"consistent" but very SEO unfriendly
Some sites are overly aggressive and do not through 404 at all, NEVER,

  • and you suggest throughing a 404 for an article that it exists but had its multi-page content converted to single page ?

    a 404 must be a very last choice

  • and in this case we do have a much better choice which is to display the article

if you want to see this improved you can suggest a 303 redirect to the non paged URL

  • but arguably current behaviour is ok too

and if you would want to improve it further:

  • if you have a 4 page article and you try to access page 7 then make a 303 redirect to the all pages of the article, i don't like much current behaviour of pagebreak to throw a 404, but for great majority of sites it is not important anyway, as they do not change number of pages in articles
avatar chivitli
chivitli - comment - 20 Jul 2016

@ggppdk

The current behavour was done based on the suggestion of @Hackwar at #5306 (comment) and @roland-d agreed at #5306 (comment)

avatar ggppdk
ggppdk - comment - 20 Jul 2016

Ok,
I fail to see that reducing the number of pages is good reason to throw a 404,

you have other choices to avoid "duplicate content" reasoning for throwing 404

  • either force page to add a rel canonical (by default this is avoided if current URL matches the canonical), but it is not our case
  • or do a 303 redirect to the all-pages page of the article
avatar csthomas
csthomas - comment - 21 Jul 2016

What joomla should do if visitor go to third page in an article that in past had 5 pages but now the article has only 1.

  • If joomla shows 404 then visitor may think that page was deleted. Only a few people may delete suffix from URL and go to correct page.
  • If joomla redirect to correct page then visitor do not see any problem, but joomla will have two requests to perform.
  • if joomla show correct content with canonical then visitor will be happy, google understand that (support canonical) and joomla has only one request.
avatar brianteeman brianteeman - change - 21 Jul 2016
Category Front End Router / SEF
avatar chivitli
chivitli - comment - 21 Jul 2016

It should not be responsability of pagebreak plugin to deal with canonicals, nor it can do this efficiently.
1) It cannot reliably determine which link is canonical, it can only use url though which it is accessed. Outputing that as "canonical" even though it may not be is probably not best for SEO.
2) Even if it could find correct canonical, they are automatically added in sef system plugin in the onAfterDispatch method. Page break plugin would have to parse all links already set in the document and unset canonical link if it exists. This again seems like mixing responabilities. Or it can try to change the canonical already set in the document, but this also seems like a strange functionality for page break plugin...

303 redirect to page 1 sounds ok to me, but I am not sure if it should be done here. Whatever option is chosen, behavour should be the same for articles that hasve page breaks, and invalid page is accessed, and for articles without page breaks, when invalid page is accessed. So either accepting this pull request and throwing always 404, or changing this and the change from #5306 to deal with it differently.

avatar ggppdk
ggppdk - comment - 21 Jul 2016

It should not be responsability of pagebreak plugin to deal with canonicals

agreed, i mentioned it as an option, i do not think that is correct to do it

303 redirect to page 1 sounds ok to me, but I am not sure if it should be done here. Whatever option is chosen, behavour should be the same for articles that hasve page breaks, and invalid page is accessed, and for articles without page breaks, when invalid page is accessed

yes i think make a 303 (or 302 ?) redirect instead of 404, for all of the invalid pages
it is the most proper and easy thing to do

avatar mbabker
mbabker - comment - 21 Jul 2016

Why is a redirect when you've tried paginating to a non-existing resource more correct than a 404? You've requested a resource that doesn't exist, by default that screams 404 to me. I'd say the fact that https://www.joomla.org/announcements/release-news.html?start=1500 renders a page saying "no articles" is the wrong behavior; different view but same concept, you've paginated beyond the number of available resources.

avatar andrepereiradasilva
andrepereiradasilva - comment - 21 Jul 2016

agree if @mbabker if a page doesn't exist is a 404

6.5.4. 404 Not Found

The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists. A 404 status code does not
indicate whether this lack of representation is temporary or
permanent; the 410 (Gone) status code is preferred over 404 if the
origin server knows, presumably through some configurable means, that
the condition is likely to be permanent.

A 404 response is cacheable by default; i.e., unless otherwise
indicated by the method definition or explicit cache controls (see
Section 4.2.2 of [RFC7234]).

https://tools.ietf.org/html/rfc7231#section-6.5.4

avatar ggppdk
ggppdk - comment - 21 Jul 2016

The article has not be deleted it may have less pages, or just 1,

For the sake of being "consistent" we tell visitors and search engines,

"get out of here" what you seek does not exist,

which is not true, an all-pages view of the article is not only right but it will most probably contain the desired content, you can even make 301 redirect with an expired cache time, in case someone in the future decides to re-add pages to it

anyway i don't care much about this
because it is rare that an article reduces the number of pages

  • it will be only a problem, for a website owner that will try to do in all of site content, then site will get a ton of 404
avatar mbabker
mbabker - comment - 21 Jul 2016

Right, but it's the fact that a "sub-resource" of the article (for lack of better terms) no longer exists. The parent item (article) is there, but the resource (page whatever) has been removed for whatever reason. Since that resource no longer exists, that to me signifies that a 404 is the proper behavior, unless someone configures a redirect to handle that situation then what Joomla does by default doesn't really matter.

avatar csthomas
csthomas - comment - 27 Sep 2016

After all I think joomla may return error 404 with message which display correct link to article,
like "Page not found. Did you mean [correct link to article]?"

avatar franz-wohlkoenig
franz-wohlkoenig - comment - 8 Jan 2017

I have tested this item successfully on 9cf5a04


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/11217.

avatar franz-wohlkoenig franz-wohlkoenig - test_item - 8 Jan 2017 - Tested successfully
avatar brianteeman
brianteeman - comment - 19 May 2017

I have tested this item successfully on 9cf5a04

tested successfully
if ?showall=&start=4 will give a 404 on a multipage article with only 3 pages then it is correct that this pr will give a 404 with the same param on an article with only one page


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/11217.

avatar brianteeman brianteeman - test_item - 19 May 2017 - Tested successfully
avatar franz-wohlkoenig franz-wohlkoenig - change - 20 May 2017
The description was changed
Status Pending Ready to Commit
avatar joomla-cms-bot joomla-cms-bot - edited - 20 May 2017
avatar joomla-cms-bot joomla-cms-bot - change - 20 May 2017
Category Front End Router / SEF Front End Plugins Router / SEF
avatar franz-wohlkoenig
franz-wohlkoenig - comment - 20 May 2017

RTC after two successful tests.

avatar rdeutz rdeutz - change - 22 May 2017
Status Ready to Commit Fixed in Code Base
Closed_Date 0000-00-00 00:00:00 2017-05-22 18:07:53
Closed_By rdeutz
Labels Added: ?
avatar rdeutz rdeutz - close - 22 May 2017
avatar rdeutz rdeutz - merge - 22 May 2017

Add a Comment

Login with GitHub to post a comment