By avoiding to annotate aggregations meant to be possibly pushed to an
outer query until their references are resolved it is possible to
aggregate over a query with the same alias.
Even if #34176 is a convoluted case to support, this refactor seems
worth it given the reduction in complexity it brings with regards to
annotation removal when performing a subquery pushdown.
Just like when using .annotate(), the .alias() method will generate the
necessary JOINs to resolve the alias even if not selected.
Since these JOINs could be multi-valued non-selected aggregates must be
considered to require subquery wrapping as a GROUP BY is required to
combine duplicated tuples from the base table.
Regression in 59bea9efd2.
This ensures implicit grouping from aggregate function annotations
groups by uncollapsed selected aliases if supported.
The feature is disabled on Oracle because it doesn't support it.
This required moving the combined queries slicing logic to the compiler
in order to allow Query.exists() to be called at expression resolving
time.
It allowed for Query.exists() to be called at Exists() initialization
time and thus ensured that get_group_by_cols() was operating on the
terminal representation of the query that only has a single column
selected.
A more in-depth solution is likely to make sure that we always GROUP BY
selected annotations or revisit how we use Query.exists() in the Exists
expression but that requires extra work that isn't suitable for a
backport.
Regression in e5a92d400a.
Thanks Fernando Flores Villaça for the report.
Thanks Splunk team: Preston Elder, Jacob Davis, Jacob Moore,
Matt Hanson, David Briggs, and a security researcher: Danylo Dmytriiev
(DDV_UA) for the report.
This includes refactoring of CombinedExpression._resolve_output_field()
so it no longer uses the behavior inherited from Expression of guessing
same output type if argument types match, and instead we explicitly
define the output type of all supported operations.
This also makes nonsensical operations involving dates
(e.g. date + date) raise a FieldError, and adds support for
automatically inferring output_field for cases such as:
* date - date
* date + duration
* date - duration
* time + duration
* time - time
As a QuerySet resolves to Query the outer column references grouping logic
should be defined on the latter and proxied from Subquery for the cases where
get_group_by_cols is called on unresolved expressions.
Thanks Antonio Terceiro for the report and initial patch.
The introduction of the Expression.empty_aggregate_value interface
allows the compilation stage to enable the EmptyResultSet optimization
if all the aggregates expressions implement it.
This also removes unnecessary RegrCount/Count.convert_value() methods.
Disabling the empty result set aggregation optimization when it wasn't
appropriate prevented None returned for a Count aggregation value.
Thanks Nick Pope for the review.
This also replaces assertQuerysetEqual() to
assertSequenceEqual()/assertCountEqual() where appropriate.
Co-authored-by: Peter Inglesby <peter.inglesby@gmail.com>
Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
This required implementing a limited form of dynamic dispatch to combine
expressions with numerical output. Refs #26355 should eventually provide
a better interface for that.
691def10a0 made all Subquery() instances
equal to each other which broke aggregation subquery pushdown which
relied on object equality to determine which alias it should select.
Subquery.__eq__() will be fixed in an another commit but
Query.rewrite_cols() should haved used object identity from the start.
Refs #30727, #30188.
Thanks Makina Corpus for the report.
Clearing the SELECT clause in Query.has_results was orphaning GROUP BY
references to it.
Thanks Thierry Bastian for the report and Baptiste Mispelon for the
bisect.
Regression in fb3f034f1c.