1
0
mirror of https://github.com/django/django.git synced 2025-10-24 06:06:09 +00:00
Files
django/docs/ref/contrib/postgres/aggregates.txt
Chris Muthig 4b977a5d72 Fixed #35444 -- Added generic support for Aggregate.order_by.
This moves the behaviors of `order_by` used in Postgres aggregates into
the `Aggregate` class. This allows for creating aggregate functions that
support this behavior across all database engines. This is shown by
moving the `StringAgg` class into the shared `aggregates` module and
adding support for all databases. The Postgres `StringAgg` class is now
a thin wrapper on the new shared `StringAgg` class.

Thank you Simon Charette for the review.
2025-03-03 11:37:00 +01:00

402 lines
12 KiB
Plaintext

=========================================
PostgreSQL specific aggregation functions
=========================================
.. module:: django.contrib.postgres.aggregates
:synopsis: PostgreSQL specific aggregation functions
These functions are available from the ``django.contrib.postgres.aggregates``
module. They are described in more detail in the `PostgreSQL docs
<https://www.postgresql.org/docs/current/functions-aggregate.html>`_.
.. note::
All functions come without default aliases, so you must explicitly provide
one. For example:
.. code-block:: pycon
>>> SomeModel.objects.aggregate(arr=ArrayAgg("somefield"))
{'arr': [0, 1, 2]}
.. admonition:: Common aggregate options
All aggregates have the :ref:`filter <aggregate-filter>` keyword argument
and most also have the :ref:`default <aggregate-default>` keyword argument.
General-purpose aggregation functions
=====================================
``ArrayAgg``
------------
.. class:: ArrayAgg(expression, distinct=False, filter=None, default=None, order_by=(), **extra)
Returns a list of values, including nulls, concatenated into an array, or
``default`` if there are no values.
.. attribute:: distinct
An optional boolean argument that determines if array values
will be distinct. Defaults to ``False``.
.. attribute:: order_by
.. versionadded:: 5.2
An optional string of a field name (with an optional ``"-"`` prefix
which indicates descending order) or an expression (or a tuple or list
of strings and/or expressions) that specifies the ordering of the
elements in the result list.
Examples::
from django.db.models import F
ArrayAgg("a_field", order_by="-some_field")
ArrayAgg("a_field", order_by=F("some_field").desc())
.. deprecated:: 5.2
The ``ordering`` keyword argument is deprecated. Use
:attr:`ArrayAgg.order_by` instead.
``BitAnd``
----------
.. class:: BitAnd(expression, filter=None, default=None, **extra)
Returns an ``int`` of the bitwise ``AND`` of all non-null input values, or
``default`` if all values are null.
``BitOr``
---------
.. class:: BitOr(expression, filter=None, default=None, **extra)
Returns an ``int`` of the bitwise ``OR`` of all non-null input values, or
``default`` if all values are null.
``BitXor``
----------
.. class:: BitXor(expression, filter=None, default=None, **extra)
Returns an ``int`` of the bitwise ``XOR`` of all non-null input values, or
``default`` if all values are null. It requires PostgreSQL 14+.
``BoolAnd``
-----------
.. class:: BoolAnd(expression, filter=None, default=None, **extra)
Returns ``True``, if all input values are true, ``default`` if all values
are null or if there are no values, otherwise ``False``.
Usage example::
class Comment(models.Model):
body = models.TextField()
published = models.BooleanField()
rank = models.IntegerField()
.. code-block:: pycon
>>> from django.db.models import Q
>>> from django.contrib.postgres.aggregates import BoolAnd
>>> Comment.objects.aggregate(booland=BoolAnd("published"))
{'booland': False}
>>> Comment.objects.aggregate(booland=BoolAnd(Q(rank__lt=100)))
{'booland': True}
``BoolOr``
----------
.. class:: BoolOr(expression, filter=None, default=None, **extra)
Returns ``True`` if at least one input value is true, ``default`` if all
values are null or if there are no values, otherwise ``False``.
Usage example::
class Comment(models.Model):
body = models.TextField()
published = models.BooleanField()
rank = models.IntegerField()
.. code-block:: pycon
>>> from django.db.models import Q
>>> from django.contrib.postgres.aggregates import BoolOr
>>> Comment.objects.aggregate(boolor=BoolOr("published"))
{'boolor': True}
>>> Comment.objects.aggregate(boolor=BoolOr(Q(rank__gt=2)))
{'boolor': False}
``JSONBAgg``
------------
.. class:: JSONBAgg(expressions, distinct=False, filter=None, default=None, order_by=(), **extra)
Returns the input values as a ``JSON`` array, or ``default`` if there are
no values. You can query the result using :lookup:`key and index lookups
<jsonfield.key>`.
.. attribute:: distinct
An optional boolean argument that determines if array values will be
distinct. Defaults to ``False``.
.. attribute:: order_by
.. versionadded:: 5.2
An optional string of a field name (with an optional ``"-"`` prefix
which indicates descending order) or an expression (or a tuple or list
of strings and/or expressions) that specifies the ordering of the
elements in the result list.
Examples are the same as for :attr:`ArrayAgg.order_by`.
Usage example::
class Room(models.Model):
number = models.IntegerField(unique=True)
class HotelReservation(models.Model):
room = models.ForeignKey("Room", on_delete=models.CASCADE)
start = models.DateTimeField()
end = models.DateTimeField()
requirements = models.JSONField(blank=True, null=True)
.. code-block:: pycon
>>> from django.contrib.postgres.aggregates import JSONBAgg
>>> Room.objects.annotate(
... requirements=JSONBAgg(
... "hotelreservation__requirements",
... order_by="-hotelreservation__start",
... )
... ).filter(requirements__0__sea_view=True).values("number", "requirements")
<QuerySet [{'number': 102, 'requirements': [
{'parking': False, 'sea_view': True, 'double_bed': False},
{'parking': True, 'double_bed': True}
]}]>
.. deprecated:: 5.2
The ``ordering`` keyword argument is deprecated. Use
:attr:`JSONBAgg.order_by` instead.
``StringAgg``
-------------
.. class:: StringAgg(expression, delimiter, distinct=False, filter=None, default=None, order_by=())
.. deprecated:: 6.0
Returns the input values concatenated into a string, separated by
the ``delimiter`` string, or ``default`` if there are no values.
.. attribute:: delimiter
Required argument. Needs to be a string.
.. attribute:: distinct
An optional boolean argument that determines if concatenated values
will be distinct. Defaults to ``False``.
.. attribute:: order_by
.. versionadded:: 5.2
An optional string of a field name (with an optional ``"-"`` prefix
which indicates descending order) or an expression (or a tuple or list
of strings and/or expressions) that specifies the ordering of the
elements in the result string.
Examples are the same as for :attr:`ArrayAgg.order_by`.
Usage example::
class Publication(models.Model):
title = models.CharField(max_length=30)
class Article(models.Model):
headline = models.CharField(max_length=100)
publications = models.ManyToManyField(Publication)
.. code-block:: pycon
>>> article = Article.objects.create(headline="NASA uses Python")
>>> article.publications.create(title="The Python Journal")
<Publication: Publication object (1)>
>>> article.publications.create(title="Science News")
<Publication: Publication object (2)>
>>> from django.contrib.postgres.aggregates import StringAgg
>>> Article.objects.annotate(
... publication_names=StringAgg(
... "publications__title",
... delimiter=", ",
... order_by="publications__title",
... )
... ).values("headline", "publication_names")
<QuerySet [{
'headline': 'NASA uses Python', 'publication_names': 'Science News, The Python Journal'
}]>
.. deprecated:: 5.2
The ``ordering`` keyword argument is deprecated. Use
:attr:`StringAgg.order_by` instead.
Aggregate functions for statistics
==================================
``y`` and ``x``
---------------
The arguments ``y`` and ``x`` for all these functions can be the name of a
field or an expression returning a numeric data. Both are required.
``Corr``
--------
.. class:: Corr(y, x, filter=None, default=None)
Returns the correlation coefficient as a ``float``, or ``default`` if there
aren't any matching rows.
``CovarPop``
------------
.. class:: CovarPop(y, x, sample=False, filter=None, default=None)
Returns the population covariance as a ``float``, or ``default`` if there
aren't any matching rows.
.. attribute:: sample
Optional. By default ``CovarPop`` returns the general population
covariance. However, if ``sample=True``, the return value will be the
sample population covariance.
``RegrAvgX``
------------
.. class:: RegrAvgX(y, x, filter=None, default=None)
Returns the average of the independent variable (``sum(x)/N``) as a
``float``, or ``default`` if there aren't any matching rows.
``RegrAvgY``
------------
.. class:: RegrAvgY(y, x, filter=None, default=None)
Returns the average of the dependent variable (``sum(y)/N``) as a
``float``, or ``default`` if there aren't any matching rows.
``RegrCount``
-------------
.. class:: RegrCount(y, x, filter=None)
Returns an ``int`` of the number of input rows in which both expressions
are not null.
.. note::
The ``default`` argument is not supported.
``RegrIntercept``
-----------------
.. class:: RegrIntercept(y, x, filter=None, default=None)
Returns the y-intercept of the least-squares-fit linear equation determined
by the ``(x, y)`` pairs as a ``float``, or ``default`` if there aren't any
matching rows.
``RegrR2``
----------
.. class:: RegrR2(y, x, filter=None, default=None)
Returns the square of the correlation coefficient as a ``float``, or
``default`` if there aren't any matching rows.
``RegrSlope``
-------------
.. class:: RegrSlope(y, x, filter=None, default=None)
Returns the slope of the least-squares-fit linear equation determined
by the ``(x, y)`` pairs as a ``float``, or ``default`` if there aren't any
matching rows.
``RegrSXX``
-----------
.. class:: RegrSXX(y, x, filter=None, default=None)
Returns ``sum(x^2) - sum(x)^2/N`` ("sum of squares" of the independent
variable) as a ``float``, or ``default`` if there aren't any matching rows.
``RegrSXY``
-----------
.. class:: RegrSXY(y, x, filter=None, default=None)
Returns ``sum(x*y) - sum(x) * sum(y)/N`` ("sum of products" of independent
times dependent variable) as a ``float``, or ``default`` if there aren't
any matching rows.
``RegrSYY``
-----------
.. class:: RegrSYY(y, x, filter=None, default=None)
Returns ``sum(y^2) - sum(y)^2/N`` ("sum of squares" of the dependent
variable) as a ``float``, or ``default`` if there aren't any matching rows.
Usage examples
==============
We will use this example table:
.. code-block:: text
| FIELD1 | FIELD2 | FIELD3 |
|--------|--------|--------|
| foo | 1 | 13 |
| bar | 2 | (null) |
| test | 3 | 13 |
Here's some examples of some of the general-purpose aggregation functions:
.. code-block:: pycon
>>> TestModel.objects.aggregate(result=StringAgg("field1", delimiter=";"))
{'result': 'foo;bar;test'}
>>> TestModel.objects.aggregate(result=ArrayAgg("field2"))
{'result': [1, 2, 3]}
>>> TestModel.objects.aggregate(result=ArrayAgg("field1"))
{'result': ['foo', 'bar', 'test']}
The next example shows the usage of statistical aggregate functions. The
underlying math will be not described (you can read about this, for example, at
`wikipedia <https://en.wikipedia.org/wiki/Regression_analysis>`_):
.. code-block:: pycon
>>> TestModel.objects.aggregate(count=RegrCount(y="field3", x="field2"))
{'count': 2}
>>> TestModel.objects.aggregate(
... avgx=RegrAvgX(y="field3", x="field2"), avgy=RegrAvgY(y="field3", x="field2")
... )
{'avgx': 2, 'avgy': 13}