Feature No Code Attached Yet
universewrld
18 Oct 2024

Is your feature request related to a problem? Please describe.

Even though Joomla 5.2 has some new settings to eliminate duplicate pages, there are still duplicate pages in the CMS (see #44263).
I suggest reducing the number of duplicate pages in @joomla.

Describe the solution you'd like

New options for the System - SEF plugin:

  1. Redirect pages with ID (articles, categories) to the version of pages without ID
  2. Redirect from the version of a page with www to the version of a page without www (or vice versa)
  3. Canonical links for pagination pages (for pages like ?start=10)

What Joomla has already done to reduce duplicate pages:

  • Search Engine Friendly URLs
  • Use URL Rewriting
  • Force HTTPS
  • Strict handling of index.php
  • Trailing slash for URLs
  • Strict Routing
  • WWW redirect
  • IDs redirect #44455 #44477
  • Canonical links
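
Several of the settings listed above work the same way: many URL variants are mapped onto one preferred URL. A minimal sketch of that idea in Python, assuming a hypothetical `normalize_url` helper (this is not Joomla's implementation, which is PHP) and covering only Force HTTPS, index.php handling and the trailing-slash policy:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str, trailing_slash: bool = False) -> str:
    """Map duplicate variants of a URL onto one preferred form (illustrative only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()  # host names are case-insensitive
    path = parts.path or "/"

    # Strict handling of index.php: /index.php/blog and /blog are the same page.
    if path == "/index.php":
        path = "/"
    elif path.startswith("/index.php/"):
        path = path[len("/index.php"):]

    # Trailing slash: pick one policy and apply it everywhere except the root.
    if path != "/":
        path = path.rstrip("/") + ("/" if trailing_slash else "")

    # Force HTTPS: always emit the https scheme. The fragment is dropped
    # because it is never sent to the server.
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "http://example.com/index.php/blog",
    "https://example.com/blog/",
    "https://EXAMPLE.com/blog",
]
print({normalize_url(v) for v in variants})  # {'https://example.com/blog'}
```

In a real site the server would answer every non-preferred variant with a 301 redirect to the normalized URL rather than just computing it.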

Additional context

This will help eliminate almost all duplicate pages for search engines.
The fewer duplicate pages that search robots from @google, @google-gemini, @microsoft, @openai, etc. have to process, the less energy data centers consume and the smaller the effect on the environment and climate change.

What is canonicalization - https://developers.google.com/search/docs/crawling-indexing/canonicalization
Redirects and Google Search - https://developers.google.com/search/docs/crawling-indexing/301-redirects
How to specify a canonical URL with rel="canonical" and other methods - https://developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls

universewrld - open - 18 Oct 2024
joomla-cms-bot - change - 18 Oct 2024
Labels Added: No Code Attached Yet
joomla-cms-bot - labeled - 18 Oct 2024
fgsw - comment - 19 Oct 2024

Duplicate of #44263

universewrld - comment - 19 Oct 2024

> Duplicate of #44263

This is not a duplicate; this is a feature request.
The previous issue was a bug report. Do you see the difference?

You can close the previous issue #44263, but not this one.

alikon - change - 19 Oct 2024
Labels Added: Feature
alikon - labeled - 19 Oct 2024
richard67 - comment - 20 Oct 2024

>   1. Redirect from the version of a page with www to the version of a page without www

What if I want it vice versa, redirect non www to www?

universewrld - comment - 20 Oct 2024

>   1. Redirect from the version of a page with www to the version of a page without www
>
> What if I want it vice versa, redirect non www to www?

Yes, that's what I meant.
There should be an option for WWW:

  • don't use redirect
  • redirect to WWW
  • redirect to no WWW
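
The three-way option described above could be sketched roughly as follows. This is a Python illustration under stated assumptions: `WwwMode` and `www_redirect_target` are hypothetical names, not part of Joomla or the System - SEF plugin.

```python
from enum import Enum
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


class WwwMode(Enum):
    NONE = "none"            # don't use redirect
    TO_WWW = "www"           # redirect to WWW
    TO_NON_WWW = "non-www"   # redirect to no WWW


def www_redirect_target(url: str, mode: WwwMode) -> Optional[str]:
    """Return the URL to 301-redirect to, or None when no redirect is needed."""
    if mode is WwwMode.NONE:
        return None

    parts = urlsplit(url)
    host = parts.netloc
    has_www = host.lower().startswith("www.")

    if mode is WwwMode.TO_WWW and not has_www:
        host = "www." + host
    elif mode is WwwMode.TO_NON_WWW and has_www:
        host = host[len("www."):]
    else:
        return None  # the request already uses the preferred host

    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(www_redirect_target("https://www.example.com/blog?start=10", WwwMode.TO_NON_WWW))
# https://example.com/blog?start=10
```

Returning None for the "already correct" case keeps the caller from issuing a redirect to the same URL, which would loop.
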
universewrld - change - 20 Oct 2024
The description was changed
universewrld - edited - 20 Oct 2024
Mich-es - comment - 30 Oct 2024

Hi - Huge problem with J5.2

Google has a problem with J5.2. Joomla now adds rel="canonical" to every page, even junk ones, and Google no longer knows which page is the original. This is a serious bug and should be fixed as soon as possible. On a site with around 12,000 URLs, I got 4,800 duplicate-content pages with rel="canonical" within 3 days. This must be solved in J5.2.1, not in J5.3!

Greetings Mitches

simbus82 - comment - 30 Oct 2024

> Canonical links for pagination pages (for pages like ?start=10)

Brrr, this is a really wrong approach.
Screenshot_20241030-221113.jpg

You should canonicalise paginated pages only to a "view all" page, not to a page that shows only a limited number of child pages.

universewrld - comment - 4 Nov 2024

> Canonical links for pagination pages (for pages like ?start=10)
>
> Brrr, this is a really wrong approach. Screenshot_20241030-221113.jpg
>
> You should canonicalise paginated pages only to a "view all" page, not to a page that shows only a limited number of child pages.

The image shows that all paginated pages such as /blog/page2, /blog/page3, etc. should specify /blog, the first page, as their canonical.

In @joomla, this would mean that pages like /blog?start=10 and /blog?start=25 point to /blog as the canonical blog page.

simbus82 - comment - 4 Nov 2024

It's the same thing. Forgive my bluntness, but these are basic SEO concepts.

The "/blog" page on a Joomla site is typically already limited in the number of items (e.g., it shows intros to the latest 10 blog posts), so it's NOT suitable to become the canonical for a "/blog?start=10".

If you set a canonical tag that always points to /blog, you're telling search engines that all paginated pages (/blog?start=10, /blog?start=20, etc.) are identical to /blog. This is absolutely incorrect and causes a host of issues, including:

Loss of Content Indexing

Search engines will ignore pages after the first one (/blog, which shows ONLY 10 links to the underlying posts), because the canonical tag says that all paginated content (whether you like it or not, ?start=10 is pagination) is a duplicate of the initial page. In practice, search engines would only see the first 10 articles in the blog section, overlooking content on subsequent pages.

Redistribution of Link Juice to a Limited URL, Resulting in Loss of Link Juice to Posts Beyond the Tenth

When different pages all point to a single URL via an incorrect canonical (e.g., all paginated blog pages point to /blog), search engines concentrate the link juice on the declared canonical URL (/blog), ignoring the other pages.
As a result, the paginated pages (like /blog?start=10, /blog?start=20, etc.) lose their individual authority and fail to pass link juice to the posts or content within or beneath those pages.

Sitemap?

And even if you submit all blog post links in a sitemap to Google or other search engines, this doesn’t automatically guarantee that those links will be indexed correctly or receive the right amount of link juice (authority) if the canonical tag isn’t set up properly.

The sitemap is merely a list that helps search engines discover your URLs, but it doesn’t determine which version of a URL is considered the primary one, nor how internal links pass authority (link juice) within the site. If you have an incorrect canonical setup (for example, all paginated pages pointing to /blog as the canonical), you’re telling search engines that only the canonical page ("/blog") is the main version of all content.
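
The two canonical strategies being argued over in this thread can be made concrete with a small Python sketch. The function name `canonical_for` and the strategy labels are hypothetical, and `start` is assumed to be the only pagination parameter; this illustrates what each strategy would emit and is not Joomla's code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_for(url: str, strategy: str = "self") -> str:
    """Compute the canonical URL for a (possibly paginated) list page."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)

    if strategy == "first_page":
        # Proposal from the issue: every paginated page points at page one.
        kept = []
    elif strategy == "self":
        # Alternative: each page is its own canonical. Keep only the
        # pagination offset and drop everything else (tracking parameters etc.).
        kept = [(k, v) for k, v in query if k == "start" and v not in ("", "0")]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


page = "https://example.com/blog?start=10&utm_source=newsletter"
print(canonical_for(page, "first_page"))  # https://example.com/blog
print(canonical_for(page, "self"))        # https://example.com/blog?start=10
```

With "first_page", /blog?start=10 and /blog?start=20 both declare /blog as their canonical, which is the behaviour objected to here; with "self", each paginated page remains a distinct canonical URL.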

universewrld - change - 5 Nov 2024
The description was changed
universewrld - edited - 5 Nov 2024
universewrld - comment - 5 Nov 2024

What is canonicalization

Canonicalization is the process of selecting the representative –canonical– URL of a piece of content. Consequently, a canonical URL is the URL of a page that Google chose as the most representative from a set of duplicate pages. Often called deduplication, this process helps Google show only one version of the otherwise duplicate content in its search results.

There are many reasons why a site may have duplicate content:

  • Region variants: for example, a piece of content for the USA and the UK, accessible from different URLs, but essentially the same content in the same language
  • Device variants: for example, a page with both a mobile and a desktop version
  • Protocol variants: for example, the HTTP and HTTPS versions of a site
  • Site functions: for example, the results of sorting and filtering functions of a category page
  • Accidental variants: for example, the demo version of the site is accidentally left accessible to crawlers

SOURCE: https://developers.google.com/search/docs/crawling-indexing/canonicalization

@simbus82 you are completely wrong!
Pages with pagination are duplicate content and @google says so!
This is a duplicate of content that is related to the site's function and is indicated in the official Google Help!

Image

simbus82 - comment - 5 Nov 2024

You really have no understanding of what pagination is and the context in which it should be managed or not with canonical.

https://developers.google.com/search/blog/2013/04/5-common-mistakes-with-relcanonical
Image
Image

https://developers.google.com/search/docs/specialty/ecommerce/pagination-and-incremental-page-loading
Image

https://yoast.com/rel-canonical/
Image

https://cognitiveseo.com/blog/19204/canonical-urls-seo/
Image

https://searchengineland.com/pagination-strategies-in-the-real-world-81204
Image

But think what you want; I gain nothing from spending time convincing you of something so basic for a junior SEO.
I hope the team is more competent than you and does not approve this nonsense.

I'm always happy to help, drawing on 20 years of web development and digital marketing and over 10 years as an SEO specialist on 6-figure projects.

You need to start studying; you're really a newbie in SEO.
I recommend The Art of SEO book.

Hackwar - comment - 19 Nov 2024

The redirect from ID to no-ID is waiting to be tested with #44455 and #44477.

universewrld - comment - 19 Nov 2024

> You really have no understanding of what pagination is and the context in which it should be managed or not with canonical.

You are confusing the single-article view, where one article is split across multiple pages, with the category blog view.

Page 1, page 2, page 3, etc. of the category blog view should not be canonical pages.

@simbus82 I've been doing SEO for almost 20 years and I know a lot more about it than you do.
I'm not here as a developer, I'm here as a website owner. I do not sell extensions, I promote websites.

universewrld - change - 19 Nov 2024
The description was changed
universewrld - edited - 19 Nov 2024
universewrld - comment - 19 Nov 2024

> The redirect from ID to no-ID is waiting to be tested with #44455 and #44477.

Thanks! I added your links to my first post.

simbus82 - comment - 19 Nov 2024

> > You really have no understanding of what pagination is and the context in which it should be managed or not with canonical.
>
> You are confusing the single-article view, where one article is split across multiple pages, with the category blog view.
>
> Page 1, page 2, page 3, etc. of the category blog view should not be canonical pages.
>
> @simbus82 I've been doing SEO for almost 20 years and I know a lot more about it than you do.
> I'm not here as a developer, I'm here as a website owner. I do not sell extensions, I promote websites.

"A lot more about it than you do", ok.
Contact me, we can talk about it.

universewrld - comment - 19 Nov 2024

> we can talk about it.

Image

If you think that these links to page 1, page 2, etc. should be marked as canonical, then there is something wrong with you and your SEO skills.

universewrld - comment - 19 Nov 2024

@simbus82 here's an example based on Joomla! Blogs:

canonical links:
https://community.joomla.org/blogs.html - home page blog view
https://community.joomla.org/blogs/community/were-back.html - article pages of this blog

NON-CANONICAL LINKS:
https://community.joomla.org/blogs.html?start=5 - page 2 view blog category
https://community.joomla.org/blogs.html?start=10 - page 3 view blog category
https://community.joomla.org/blogs.html?start=15 - page 4 view blog category
etc

The non-canonical links in this blog category should declare the main page of this blog as their canonical: https://community.joomla.org/blogs.html

@simbus82 should I give you a series of lessons in SEO or are you able to figure it all out yourself and read the information about SEO links more carefully?

brianteeman - comment - 19 Nov 2024

@universewrld I am in no way an expert on any of this. But all the content posted by @simbus82 in #44310 (comment), from sources I absolutely trust, says that you are wrong.

simbus82 - comment - 19 Nov 2024

Last

> @simbus82 here's an example based on Joomla! Blogs:
>
> canonical links:
> https://community.joomla.org/blogs.html - home page blog view
> https://community.joomla.org/blogs/community/were-back.html - article pages of this blog
>
> NON-CANONICAL LINKS:
> https://community.joomla.org/blogs.html?start=5 - page 2 view blog category
> https://community.joomla.org/blogs.html?start=10 - page 3 view blog category
> https://community.joomla.org/blogs.html?start=15 - page 4 view blog category
> etc
>
> The non-canonical links in this blog category should declare the main page of this blog as their canonical: https://community.joomla.org/blogs.html
>
> @simbus82 should I give you a series of lessons in SEO or are you able to figure it all out yourself and read the information about SEO links more carefully?

Think what you want; unfortunately, you simply don’t understand what a canonical is and what it’s for.
If, for you, the content and (especially) the internal links of this page https://community.joomla.org/blogs.html?start=10 are IDENTICAL to those of this one https://community.joomla.org/blogs.html, and therefore you want to tell the search engine to NOT CONSIDER what’s on page https://community.joomla.org/blogs.html?start=10 but ONLY consider what’s on a page like this https://community.joomla.org/blogs.html, which will thus be the CANONICAL (Definition: A canonical URL is the URL of the page deemed most representative among a set of pages detectable as duplicates within a site), go ahead and good luck!

Unbelievable, you want to lecture me, hiding behind an anonymous nickname, spouting unmatched nonsense even after I provided (not that it was even necessary) the most authoritative global external sources on the matter.

If you want to talk to me, you can contact me on LinkedIn, to talk as equals, between professionals, without hiding behind any anonymous alias. I won't answer you here anymore; I've already wasted too much time arguing about obvious things recognized worldwide.

Just for fun, let's see what ChatGPT thinks of this thread.

Image

Mich-es - comment - 19 Nov 2024

Don't be put off - test it!
I used ‘Aimy Canonical’ after the paginated disaster out of necessity and set the paginated pages as canonical.

The explanations by @simbus82 are correct and are also recommended by Google.

Image

universewrld - comment - 19 Nov 2024

> Just for fun, let's see what ChatGPT thinks of this thread.

All your knowledge comes down to referring to the ChatGPT chatbot; that is your real level, plus the articles from 2013 that you cited, while you cited no @google articles from 2024.

You have no knowledge; the chatbot has replaced your brain, and you just confirmed it yourself.

I have been doing SEO for almost 20 years and brought websites to the Top 1 in Google for high-frequency queries.
You have no level of analytics, you should not deceive people that you understand at least something in SEO.
SEO is not a theory, but a practice, practice on live websites, and not outdated documentation from 2013, which was wrong.

You don't even understand how search engines and bots like Googlebot or ChatGPT work. You are trying to prove to me that you know these things better than I do, but you are a DEVELOPER and I'm an SEO specialist: I promote websites and I write articles, not you.

Any SEO expert will tell you that Google does not need pages like "Blog Category - Page 1", "Blog Category - Page 2", "Blog Category - Page 3" in their search.

You really don't understand how Google collects information from all these websites and how their search engine works.
This thread is literally full of criminally erroneous opinions from you about how SEO works.
On any SEO forum, they will very quickly explain to you that you are wrong in everything you know about SEO.

universewrld - comment - 19 Nov 2024

> @universewrld I am in no way an expert on any of this. But all the content posted by @simbus82 in #44310 (comment), from sources I absolutely trust, says that you are wrong.

You are confusing the noindex rule with rel="canonical" links.

Block Search indexing with noindex - https://developers.google.com/search/docs/crawling-indexing/block-indexing

If pages like Blog - Page 1, Blog - Page 2 and Blog - Page 3 carried a noindex robots meta tag, the content on those pages would not be indexed. They carry no such tag, so all the content there will be indexed.

Instead of Blog - Page 1, Blog - Page 2 and Blog - Page 3, Google search will show the main page of the Blog category view.

universewrld - comment - 19 Nov 2024

If you want to prohibit indexing of website pages, you can do it via noindex, nofollow and the robots.txt file, but not via rel="canonical".

rel="canonical" only points to the main, canonical page; it does not prohibit indexing of other pages of your website.
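
To make the distinction concrete: rel="canonical" and noindex are two different elements in the page head. A minimal Python sketch, using a hypothetical `head_tags` helper (not a Joomla API), that emits each one:

```python
from html import escape
from typing import List, Optional


def head_tags(canonical: Optional[str] = None, noindex: bool = False) -> List[str]:
    """Build the <head> elements for canonicalization and for blocking indexing."""
    tags = []
    if canonical is not None:
        # A hint about which URL represents a set of duplicate pages.
        tags.append(f'<link rel="canonical" href="{escape(canonical, quote=True)}">')
    if noindex:
        # A rule asking search engines to keep this page out of the index.
        tags.append('<meta name="robots" content="noindex">')
    return tags


print(head_tags(canonical="https://example.com/blog"))
# ['<link rel="canonical" href="https://example.com/blog">']
print(head_tags(noindex=True))
# ['<meta name="robots" content="noindex">']
```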

some people in this thread have confused all SEO terms and definitions and are trying to pass themselves off as SEO experts.

fgsw - comment - 19 Nov 2024

can this be locked for now?

Hackwar - comment - 19 Nov 2024

Indeed. If you want to argue about SEO, feel free to meet in a forum or chat of your choice, but the issue tracker of Joomla is not the place to do this. I'm locking this topic for now.
