One result for the default category 'uncategorised'
No results
DB mysql
DB version 5.5.59-cll
PHP version 5.6.37
Apache
Joomla 3.8.11
As soon as an author filter is applied, categories no longer appear in search results
Labels | Added: ? |
Category | ⇒ | com_finder |
Labels | Added: J3 Issue |
Status | New | ⇒ | Discussion |
Can't confirm.
I tried to reproduce the issue and got the result "One result for the default category 'uncategorised'" (the article "Getting started").
Maybe I haven't understood your instructions correctly, so can you please check whether this issue exists?
Hi Franz,
Your result is displaying what I am describing. The 1 result you are seeing is the 'getting started' article within the Uncategorised category.
There is no result that takes you direct to the uncategorised category. And if you remove the article, you will get 0 results.
I hope this makes sense.
Thanks
Hi Franz,
I didn't install sample data with my installation so didn't have the article 'Getting Started'. I should have specified, sorry.
With sample data installed, there should be 2 results. 1 for the getting started article, and 1 for the Uncategorised category.
Without sample data installed there should be 1 result, for the Uncategorised category.
You are seeing 1 result only when it should be showing 2.
I hope this is clearer.
You mean the Category itself should be displayed?
Yes.
The category is displayed when no search filter is applied. Applying a search filter that filters on Author prevents categories from being listed in search results.
Got it, thanks for your patience.
So the issue is confirmed.
That is actually not a bug, but more or less intended behavior. When the categories are indexed, no author is stored, so when only results of a specific author should be shown, there is no match here. To fix this, the author of a category would have to be indexed in the /plugins/finder/categories/ plugin. So if we really decide that the authors of categories should be added as part of the taxonomy of a category, then this issue would have to be rephrased as "Finder - Categories plugin does not add the author as a taxonomy when indexing."
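The matching behaviour described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not com_finder code: the function names `indexItem` and `search`, the array layout, and the author name 'User X' are all hypothetical. It shows that an entry indexed without an Author node can never match an author filter, and that it matches once the author is attached.

```php
<?php
// Toy model of a taxonomy-filtered search index. NOT com_finder code:
// indexItem(), search() and the array layout are hypothetical.

/**
 * Build an index entry: a title plus a map of taxonomy branch => node.
 */
function indexItem(string $title, string $type, ?string $author): array
{
    $taxonomy = ['Type' => $type];

    // The Author node is attached only when an author is supplied,
    // modelling an indexer plugin that may or may not store it.
    if ($author !== null)
    {
        $taxonomy['Author'] = $author;
    }

    return ['title' => $title, 'taxonomy' => $taxonomy];
}

/**
 * Return the titles of entries whose taxonomy matches every filter.
 * An entry that lacks a filtered branch can never match.
 */
function search(array $index, array $filters): array
{
    $titles = [];

    foreach ($index as $entry)
    {
        foreach ($filters as $branch => $node)
        {
            if (($entry['taxonomy'][$branch] ?? null) !== $node)
            {
                // A missing or different node rejects the entry.
                continue 2;
            }
        }

        $titles[] = $entry['title'];
    }

    return $titles;
}

// 'User X' is an assumed author name for illustration.
$index = [
    indexItem('Getting Started', 'Article', 'User X'),
    indexItem('Uncategorised', 'Category', null), // author not indexed
];

$unfiltered = search($index, []);
$byAuthor   = search($index, ['Author' => 'User X']);

// Re-index the category, this time with its author attached.
$index[1]      = indexItem('Uncategorised', 'Category', 'User X');
$byAuthorAfter = search($index, ['Author' => 'User X']);

print_r($unfiltered);
print_r($byAuthor);
print_r($byAuthorAfter);
```

In this model the unfiltered search lists both entries, the author-filtered search drops the category, and re-indexing the category with an author brings it back.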
Status | Discussion | ⇒ | Information Required |
@aaron-harding any Comment?
Is there a reason the author is not indexed on categories? I would say this is an oversight; it doesn't make sense that an author filter does not work on categories when they have an author field.
I understand you, @aaron-harding, and your view on this is a valid possible expectation. On the other hand, I would say that the categories are kind of like tags, more a taxonomy than a content item that has the prosaic complexity to be copyrightable and thus attributable to an author. That would be supported by the different name in the database for that field. I guess that field in com_categories is more for internal organisational purposes...
You could solve your problem by creating a copy of the category finder plugin and adding
    // Add the author taxonomy data.
    if ($item->created_user_id)
    {
        $author = JFactory::getUser($item->created_user_id);
        $item->addTaxonomy('Author', $author->name);
    }
in the index method.
I think this is a good situation where the project can't cater for every need that might come up in the community, but where the issue can be easily solved by the user himself with a little code change.
I appreciate the response, and the code to get the plugin to index categories in the same way. You could argue it's pointless me continuing this comment but I think your perception of categories is fundamentally wrong.
They are nothing like tags; a category is capable of holding similar information to an article. A title, a description and even an image. It is a piece of content that should not be excluded from search results, especially not when a specific filter is applied.
It not being copyrightable or attributable to an author is not a reason to have it excluded from search results when an author filter is applied. Any Joomla website using categories in the intended way, with an author filter, is going to have legitimate content missing from their search results.
I think the code you have provided should be implemented, unless there is a good reason for this to be the only scenario in which a category is not displayed in search results.
Thanks again.
> Any Joomla website using categories in the intended way, with an author filter
Not every Joomla website uses categories the way it appears you do, with authorship metadata on categories that is intended to be filterable. Many of the joomla.org subdomains use Smart Search, and on those sites filtering categories by author has no publishing value, as the recorded author is merely whoever was logged into the site at the time the category was created. Yes, this filter ability makes sense in some contexts (a collaborative or news site, or a site where a user's blog posts or news stories are published to a unique category that they "own", being examples). But I would personally suggest that this level of filtering and authorship detail does not apply to the vast majority of Joomla sites, so it shouldn't be a behavior that is included in core and pushed to all websites. I'd also be hesitant about adding it with an option to configure it: you'd need a hierarchical option to do it right, because you're probably going to find that some categories should have this authorship data and others should not, and then there's the argument of whether it should inherit from or override the parent category(ies).
But you are making this decision for people. The bottom line is that it does not make sense to show categories in search results in every other instance, until you apply this 1 specific filter - a filter on a field that both categories and articles have.
Saying that "it has no publishing value" is you, deciding for a user, whether their categories have no publishing value, which is an incorrect assumption to make.
If I am searching for all content by a specific user, and not all of those results are being returned, then there is a problem.
If you don't want to show categories in an author-filtered search result list, then you can configure the filter to behave that way by picking 'articles' under 'Types' - or even allow the removal of an author from a category full stop.
I should be the one who gets to decide whether my categories are important enough to be displayed in filtered search results.
All I'm saying is an argument can be made both ways, and changing it to behave the way you prefer can and probably will break expectations for those who aren't using authorship metadata on categories the same way they do on articles (or other frontend content types). I think arbitrarily excluding it does create a problem for cases like yours, but arbitrarily including it is also problematic for a lot of use cases. And making it a user-configurable option creates a lot of headaches too, because I don't think it should be an all-or-nothing thing. Even using your examples, I guarantee that sites still have some kind of catch-all or generic category for static content where authorship isn't an attribute you are really concerned with having published, or where you would rather have that data hidden. Then again, that issue also exists within articles too, so it's not a new issue, but it kind of supports my point here that having an all-or-nothing setting on author metadata creates unwelcome side effects.
If there is an argument both ways, then you agree that this is a problem and needs to be looked at further, even if the solution of simply including them in results is not sufficient.
I have content that is not showing up in a search result because someone has decided that category metadata isn't as important. I would argue that the people using categories as 'catch alls' are using it improperly.
I'll reiterate in its simplest form - I am asking the CMS to show me all content by user X. Not all content by user X is displayed. Therefore, aside from opinions on what constitutes 'publishing value', there is a problem - either with the search plugin or the fact that categories have this field in the first place.
If there was no author field on categories, there would be a different conversation, or none at all, but that is not the case.
A solution could be for the current category author to be wiped or the internal field renamed so that it does not affect current search configurations - then correct author filtering to be implemented so that moving forward it can be used if the user decides they want to.
Saying that I'm wrong and using it wrong because others use it a different way is, in itself, wrong.
> I would argue that the people using categories as 'catch alls' are using it improperly.
So how do you do static content pages which really don't fit into a category? Even if you put it into a "static content" category (or use the "Uncategorised" category that we seed as part of new installs), the odds are that you don't intend for the category to be part of the public navigation, probably shouldn't be part of the search index, and if it is you probably aren't aiming for the user who created that category to be listed as its author.
> Saying that I'm wrong and using it wrong because others use it a different way is, in itself, wrong.
I'm not trying to say you're wrong. I'm moreso saying that the way it is done in core now is wrong. With articles everything has the author metadata included in the search index, and I think this is a wrong behavior because there are articles on most sites that I have built or worked on where this is irrelevant (static content page where the "author" just happens to be whomever added it to the site). With categories, nothing has the author metadata; I personally don't have a use case where I would need this, but I get that you and others do, but I don't think the solution is to just arbitrarily add it to categories and have it be there for all of them.
Long and short, the current behavior isn't great at all, adding Hannes' solution to core is equally not great, and implementing what I feel would be the "right" solution (where you can selectively add/remove metadata on a per-item basis) honestly would make managing the search index pretty darn complex.
Just for the sake of completeness, a practical example of why I feel like the all or nothing behaviors suck. We'll use https://developer.joomla.org/nightly-builds.html for this.
It is a static content page, I don't feel like in the site's search index that I should show up as the author because it's not really "my" content, I just happened to be the user who input it into the site.
It is in a "catch all" category (arguments about whether they're wrong to use or not aside), I would personally suggest that the category taxonomy for this item added here is irrelevant and shouldn't be present.
I'll even go so far as to say I don't think this type of "static" page really needs to be aware of date attributes (created or publish up/down), so if we exposed the date filters on https://developer.joomla.org/search.html I would suggest it be wrong that I can find this article by filtering on the create date.
As it relates to category data now...
https://developer.joomla.org/news.html contains content written by multiple people, and in some cases the author isn't actually who wrote the content but whoever had an account on the site (and yes, we've done our best to use the appropriate fields in the backend to "fix" that). Many of the category's posts also don't really need to have the author associated with them (does it really matter that I'm the "author" of the 3.9 alpha/beta release posts?).
https://developer.joomla.org/security-centre.html on the other hand is a category where every item posted is "authored" by one user (yes we actually do have a stub user account for JSST on the site), and in that case it would be appropriate to set the author to that account and have it included in the search index (so I'll retract my earlier statement about not having a use case for this, as I stumbled upon it just now).
I think that post does a good job of highlighting the bigger issue around this instance. To reply to your earlier one:
> So how do you do static content pages which really don't fit into a category? Even if you put it into a "static content" category (or use the "Uncategorised" category that we seed as part of new installs), the odds are that you don't intend for the category to be part of the public navigation, probably shouldn't be part of the search index, and if it is you probably aren't aiming for the user who created that category to be listed as its author.
I would usually create that page as a category itself, so that it can be 'top level'. But I understand this itself is improper too. Perhaps there actually is a need for categories to be more like tags, with the ability to have an article that is not in a specific category. I think the last part of your paragraph is more about being able to exclude pages from search in general rather than in this specific case.
> I'm not trying to say you're wrong. I'm moreso saying that the way it is done in core now is wrong. With articles everything has the author metadata included in the search index, and I think this is a wrong behavior because there are articles on most sites that I have built or worked on where this is irrelevant (static content page where the "author" just happens to be whomever added it to the site). With categories, nothing has the author metadata; I personally don't have a use case where I would need this, but I get that you and others do, but I don't think the solution is to just arbitrarily add it to categories and have it be there for all of them.
I agree
> Long and short, the current behavior isn't great at all, adding Hannes' solution to core is equally not great, and implementing what I feel would be the "right" solution (where you can selectively add/remove metadata on a per-item basis) honestly would make managing the search index pretty darn complex.
That's fair enough - I do not have a suggestion off the top of my head on how I think it would be best to tackle this.
I really appreciate the time you've taken to discuss this with me Michael. Do you know if content hierarchy and searching has been improved or changed in J4? Or do you think we will see similar behaviours?
No one is trying to force any behavior onto you, which is why I gave you some code showing how to get what you want. On the other hand, we need to decide what to do here, and in the face of a tie in the arguments (at least for me) on this issue, I'm going to lean towards what means less work for me. And that right now is not creating a PR for this/not adding this feature.
A possibility would be to add parameters for each and every taxonomy that we are adding in each and every plugin, so that you can control very precisely what gets indexed and what does not. However, that means a lot of parameters and, quite frankly, Joomla already has too many parameters.
Someone else has to decide on this in the end, since I don't have the power to close an issue. I've just given my opinion on this, since I've worked on com_finder quite a lot in the last few months. Also notice that you can switch off whole taxonomy branches in the backend when all my PRs for this are merged.
I appreciate that, Hannes; it's just that if I were not a developer I would likely not be comfortable making code changes. I am grateful you've provided a solution for me, however.
Not sure there's a quick fix for this in light of everything discussed. It sounds like fundamentally categories and search taxonomy need a bit of thought. I think an option to disable the inclusion of category author taxonomy would suffice in this instance, but after the above, it feels like a band-aid solution.
I still think that categories should be displayed when author searches are done, but understand shipping an update that enables this feature could break existing instances.
Do we close this simply because there is not a graceful enough solution right now? Do we wait for J4? Are you aware of any plans to improve taxonomy and indexing in J4?
See these PRs that I did for com_finder in the last few months: https://github.com/joomla/joomla-cms/pulls?utf8=%E2%9C%93&q=is%3Apr+sort%3Aupdated-desc+author%3AHackwar+%5B4.0%5D+
Yes, please.
needs Label J4.
Labels | Added: J4 Issue |
Labels | Removed: J3 Issue |
Status | Information Required | ⇒ | Discussion |
Status | Discussion | ⇒ | Confirmed |
Build | master | ⇒ | 4.0-dev |
Status | Confirmed | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2022-01-27 15:38:16 |
Closed_By | ⇒ | Quy |
Labels | Added: No Code Attached Yet | Removed: ? |
@aaron-harding will try to reproduce Issue.