User tests: Successful: Unsuccessful:
Because it currently does not work at all.
http://joomlacode.org/gf/project/joomla/tracker/?action=TrackerItemEdit&tracker_item_id=33207&start=0
So I tested both patches and the current staging
branch by running the Smart Search indexer and doing a search on the front end since that generates a UNION query. Here's my results:
PR #2735 Query:
SELECT t.term, t.term_id
FROM #__finder_terms AS t
WHERE t.term = 'joomla' AND t.phrase = 0
UNION (
SELECT t.term, t.term_id
FROM #__finder_terms AS t
WHERE t.stem = 'joomla' AND t.phrase = 0)
PR #2986 Query:
(
SELECT t.term, t.term_id
FROM #__finder_terms AS t
WHERE t.stem = 'joomla' AND t.phrase = 0)
Current Staging:
SELECT t.term, t.term_id
FROM #__finder_terms AS t
WHERE t.term = 'joomla' AND t.phrase = 0
Judging by this, @chrisdavenport patch looks to be the correct fix.
It depends on how you would want union()
(and unionAll()
) to work.
JDatabaseQuery
has the concept of query 'type'. So, SELECT
is a type, INSERT
is a type, etc. Calling one of those functions will set the type which affects how the query will be rendered.
There are two possible approaches to UNION
. One is to treat it as a type which is what my patch does. The other is to treat it as a clause of the SELECT
type which is what #2735 does. The upatched code fails because the union
functions treat UNION
as a clause but __toString
treats it as a type.
There may be pros and cons to both approaches. I suggest considering carefully how you would like union()
and unionAll()
to work. In any case, unionDistinct()
should be deprecated because UNION DISTINCT
makes no sense.
Personally, I'm impartial to how it should work. With that said, I don't know enough about UNION queries to know how it is supposed to work, but the query I got with Chris' patch applied looks the most correct to me.
The reason is that the way to use the function is different in my patch. His is like:
$query->select('*')->from('foo')->union('select * from bar');
But mine is like:
$query1->select('*')->from('foo');
$query2->select('*')->from('bar');
$query3->union($query1)->union($query2);
or the last line could be:
$query3->union(array($query1, $query2));
So the difference is that, in his, you start with a select and append one or more selects to it with union()
. In mine, you just pass any number of selects to union()
and it wraps them up as a separate query.
So I'd change wherever $query->union()
is implemented to use the code as expected too that way there's a good way to test the change. From my test, it's obvious the current code is broken, but without fixing at least that query, your patch doesn't fair much better at the moment (not saying it's wrong, just at a glance, that's the appearance).
But how is the code expected to work? The unpatched code is written halfway to function like my patch and halfway to function like his so it doesn't work at all. So we have both fixed it but with different approaches to how union()
should be used.
Since it is currently useless, there is no reason to think that either way should cause a bc issue. We should consider how is the most useful and sensible way for the function to be used in practice and implement it that way. Maybe this is a topic that should be discussed on the google groups forum and get some additional opinions. Actually, I'm not sure which approach is better myself. I'd like to think more about what are the benefits and drawbacks of each.
Imho$query->select('*')->from('foo')->union('select * from bar');
is much more intuitive than $query3->union(array($query1, $query2));
Actually i think that having both instances as JDatabaseQuery elements makes more sense than having one being a written SQL clause
Imho
$query->select('*')->from('foo')->union('select * from bar');
is much more intuitive than
+1
Actually i think that having both instances as JDatabaseQuery elements makes more sense than having one being a written SQL clause
You can of course pass the second query as JDatabaseQuery object as well. The example was a bit misleading
$query1->select('*')
->from('foo');
$query2->select('*')
->from('bar');
$query1->union($query2);
Oh ok then to that
It's not clear to me, so, this is a question.
Would both generate SQL like the following, with the following results (given sample data)?
UNION EXAMPLE:
SELECT id, `title`
FROM `jos_content`
UNION
SELECT id, `username` as `title`
FROM `jos_users`
ORDER BY `title`
LIMIT 0, 3
RESULTS:
859, 'admin'
1, 'Administrator Components'
2, 'Archive Module'
Don't know about Elijah's code, but with my proposed fix:
$q1->select('id, title')->from('#__content');
$q2->select('id, username as title')->from('#__users');
$q1->union($q2)->order('title');
produces:
SELECT id, title
FROM #__content
UNION (
SELECT id, username as title
FROM #__users)
ORDER BY title
If you were to put an order on the second SELECT, like this:
$q1->select('id, title')->from('#__content');
$q2->select('id, username as title')->from('#__users')->order('username');
$q1->union($q2)->order('title');
then you'd get this:
SELECT id, title
FROM #__content
UNION (
SELECT id, username as title
FROM #__users
ORDER BY username)
ORDER BY title
which I think is syntactically correct although if memory serves the "ORDER BY username" will be ignored (by MySQL at least) without a LIMIT clause in the same SELECT clause, which is not possible with the current implementation of JDatabaseQuery (but could be added quite easily).
Thanks @chrisdavenport -- that's what is needed. +1
As I understand it...
SELECT id, title
FROM #__content
UNION (
SELECT id, username as title
FROM #__users
ORDER BY username)
ORDER BY title
means that you get a the items from #__content
in the default order, get the items from #__users
ordered by username
(which will be aliased as title
), then take the full set of both results and order it by title
. In this example ORDER BY username
is redundant but in the end everything works out the way you want. I think you might have problems in more complex examples.
How about this:
SELECT a, b, c
FROM foo
UNION
SELECT a, b, c
FROM bar
If I want the results from foo
to be ordered by b
and the results from bar
to be ordered by c
, is this possible?
Obviously this is kind of an edge case.
In MySQL the "ORDER BY username" without an associated LIMIT clause is optimised out (ie. ignored), so I could have written "ORDER BY id" or "ORDER BY somethingelse" and it would make no difference. See: http://dev.mysql.com/doc/refman/5.0/en/union.html. Currently there is no mechanism in JDatabaseQuery to get a LIMIT on that SELECT because the LIMIT is applied when you do $db->setQuery($query, $offset, $limit).
For your "edge case", I think this should work:
$q1->select('a, b, c, b AS sort_col')->from('foo');
$q2->select('a, b, c, c AS sort_col')->from('bar');
$q1->union($q2)->order('sort_col');
which produces:
SELECT a, b, c, b AS sort_col
FROM foo
UNION (
SELECT a, b, c, c AS sort_col
FROM bar)
ORDER BY sort_col
@chrisdavenport Ah, you are quite right. It doesn't work quite like I expected (http://dev.mysql.com/doc/refman/5.0/en/union.html) but you bring up a good point about the use of ORDER BY with LIMIT inside of a UNIONed query. Since the query passed to union()
will not have a LIMIT, ORDER BY is pointless. But what if we could give it a LIMIT? What if we had something like:
public function union($query, $offset = 0, $limit = 0)
Since $distinct
and $glue
serve no purpose, get rid of them, replace them with something useful.
Since $distinct and $glue serve no purpose, get rid of them, replace them with something useful.
Just a note: It's not easy to just replace an argument. Even if it currently doesn't work and serves no purpoase, it may still be used. If you're going to replace it, you will have to make sure that you can distinguish between the old use (as distinct
and glue
-> could be anything, including int) and the new one (as offset
and limit
-> both integers). This could be hard in this case.
It's not just the arguments that serve no purpose though, it's the function itself. It currently does not work at all. Who could be using it?
Who could be using it?
Apparently Smart Finder does: https://github.com/joomla/joomla-cms/blob/staging/administrator/components/com_finder/helpers/indexer/query.php#L1295
Then Smart Finder is also broken. How long has this gone unnoticed, I wonder?
The offset and limit are not conceptually associated with the union; they're associated with the select, so I would argue that adding them as parameters to the union() method would be wrong in my opinion.
However, it looks like I was wrong about limits not being available in JDatabaseQuery as it looks like the capability has already been added! There is a JDatabaseQueryLimitable interface and those database drivers that support it can do $query->setLimit($limit, $offset).
So, this
$q1->select('a, b, c, b AS sort_col')->from('foo')->setLimit(200, 25);
$q2->select('a, b, c, c AS sort_col')->from('bar')->setLimit(100, 50);
$q1->union($q2)->order('sort_col');
will produce this
SELECT a, b, c, b AS sort_col
FROM foo
UNION (
SELECT a, b, c, c AS sort_col
FROM bar LIMIT 50, 100)
ORDER BY sort_col LIMIT 25, 200
And this would produce the same result
$q1->select('a, b, c, b AS sort_col')->from('foo');
$q2->select('a, b, c, c AS sort_col')->from('bar')->setLimit(100, 50);
$q1->union($q2)->order('sort_col');
$db->setQuery($q1, 200, 25);
It doesn't seem to be testable with the current unit tests because the mock database driver does not have the JDatabaseQueryLimitable interface. As a test newbie I have no idea how to fix that.
A better example would be:
$q1->select('a, b, c')->from('foo');
$q2->select('a, b, c')->from('bar')->order('c')->setLimit(100, 50);
$q1->union($q2)->order('b');
which produces:
SELECT a, b, c
FROM foo
UNION (
SELECT a, b, c
FROM bar
ORDER BY c LIMIT 50, 100)
ORDER BY b
Oh good, then it's already doable. As for the current unit tests, I think they should all (the ones related to union) be rewritten anyway since they obviously do nothing of value.
Another edge case comes to mind. How would you put a limit (and possibly order) on the first query as opposed to the result of the UNION?
PR #2735 already includes revised and several new unit tests. Could you suggest any additional ones?
For your edge case, I think you'd have to use a dummy query, like this:
$q0->select('a, b, c')->from('foo')->where('1=0');
$q1->select('a, b, c')->from('foo')->order('b')->setLimit(200, 25);
$q2->select('a, b, c')->from('bar')->order('c')->setLimit(100, 50);
$q0->union(array($q1, $q2))->setLimit(10, 5);
which produces:
SELECT a, b, c
FROM foo
WHERE 1=0
UNION (
SELECT a, b, c
FROM foo
ORDER BY b LIMIT 25, 200)
UNION (
SELECT a, b, c
FROM bar
ORDER BY c LIMIT 50, 100) LIMIT 5, 10
I suppose that works. It's a bit funny but this is a rare case anyway.
I'll have to have a look at the new tests. The thing I didn't like about the old ones was that they only tested some internal representation and not the output that would actually be used. Which is why code that didn't actually work was able to pass them. I make no claim of any expertise in TDD at all but I think there must be a lot of tests that need closer attention. I wonder what can be done to maintain a high standard of quality for tests? Anyway, that's another topic.
Well, I'm basically satisfied with your patch. Philosophically, I prefer my approach but it seems I'm in the minority.
Status | New | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2014-03-11 22:56:10 |
Labels |
There is already a PR for this issue: #2735. Please review, test and improve. Thanks.