mirror of
https://github.com/django/django.git
synced 2024-12-27 03:25:58 +00:00
a0cebe82b5
Backport of 88d798d71a
from master
576 lines
21 KiB
Plaintext
576 lines
21 KiB
Plaintext
=================
|
|
Query Expressions
|
|
=================
|
|
|
|
.. currentmodule:: django.db.models
|
|
|
|
Query expressions describe a value or a computation that can be used as part of
|
|
a filter, order by, annotation, or aggregate. There are a number of built-in
|
|
expressions (documented below) that can be used to help you write queries.
|
|
Expressions can be combined, or in some cases nested, to form more complex
|
|
computations.
|
|
|
|
Supported arithmetic
|
|
====================
|
|
|
|
Django supports addition, subtraction, multiplication, division, modulo
|
|
arithmetic, and the power operator on query expressions, using Python constants,
|
|
variables, and even other expressions.
|
|
|
|
.. versionadded:: 1.7
|
|
|
|
Support for the power operator ``**`` was added.
|
|
|
|
Some examples
|
|
=============
|
|
|
|
.. versionchanged:: 1.8
|
|
|
|
Some of the examples rely on functionality that is new in Django 1.8.
|
|
|
|
.. code-block:: python
|
|
|
|
# Find companies that have more employees than chairs.
|
|
Company.objects.filter(num_employees__gt=F('num_chairs'))
|
|
|
|
# Find companies that have at least twice as many employees
|
|
# as chairs. Both the querysets below are equivalent.
|
|
Company.objects.filter(num_employees__gt=F('num_chairs') * 2)
|
|
Company.objects.filter(
|
|
num_employees__gt=F('num_chairs') + F('num_chairs'))
|
|
|
|
# How many chairs are needed for each company to seat all employees?
|
|
>>> company = Company.objects.filter(
|
|
... num_employees__gt=F('num_chairs')).annotate(
|
|
... chairs_needed=F('num_employees') - F('num_chairs')).first()
|
|
>>> company.num_employees
|
|
120
|
|
>>> company.num_chairs
|
|
50
|
|
>>> company.chairs_needed
|
|
70
|
|
|
|
# Annotate models with an aggregated value. Both forms
|
|
# below are equivalent.
|
|
Company.objects.annotate(num_products=Count('products'))
|
|
Company.objects.annotate(num_products=Count(F('products')))
|
|
|
|
# Aggregates can contain complex computations also
|
|
Company.objects.annotate(num_offerings=Count(F('products') + F('services')))
|
|
|
|
# Expressions can also be used in order_by()
|
|
Company.objects.order_by(Length('name').asc())
|
|
Company.objects.order_by(Length('name').desc())
|
|
|
|
|
|
Built-in Expressions
|
|
====================
|
|
|
|
``F()`` expressions
|
|
-------------------
|
|
|
|
.. class:: F
|
|
|
|
An ``F()`` object represents the value of a model field or annotated column. It
|
|
makes it possible to refer to model field values and perform database
|
|
operations using them without actually having to pull them out of the database
|
|
into Python memory.
|
|
|
|
Instead, Django uses the ``F()`` object to generate a SQL expression that
|
|
describes the required operation at the database level.
|
|
|
|
This is easiest to understand through an example. Normally, one might do
|
|
something like this::
|
|
|
|
# Tintin filed a news story!
|
|
reporter = Reporters.objects.get(name='Tintin')
|
|
reporter.stories_filed += 1
|
|
reporter.save()
|
|
|
|
Here, we have pulled the value of ``reporter.stories_filed`` from the database
|
|
into memory and manipulated it using familiar Python operators, and then saved
|
|
the object back to the database. But instead we could also have done::
|
|
|
|
from django.db.models import F
|
|
reporter = Reporters.objects.get(name='Tintin')
|
|
reporter.stories_filed = F('stories_filed') + 1
|
|
reporter.save()
|
|
|
|
Although ``reporter.stories_filed = F('stories_filed') + 1`` looks like a
|
|
normal Python assignment of value to an instance attribute, in fact it's an SQL
|
|
construct describing an operation on the database.
|
|
|
|
When Django encounters an instance of ``F()``, it overrides the standard Python
|
|
operators to create an encapsulated SQL expression; in this case, one which
|
|
instructs the database to increment the database field represented by
|
|
``reporter.stories_filed``.
|
|
|
|
Whatever value is or was on ``reporter.stories_filed``, Python never gets to
|
|
know about it - it is dealt with entirely by the database. All Python does,
|
|
through Django's ``F()`` class, is create the SQL syntax to refer to the field
|
|
and describe the operation.
|
|
|
|
.. note::
|
|
|
|
In order to access the new value that has been saved in this way, the object
|
|
will need to be reloaded::
|
|
|
|
reporter = Reporters.objects.get(pk=reporter.pk)
|
|
|
|
As well as being used in operations on single instances as above, ``F()`` can
|
|
be used on ``QuerySets`` of object instances, with ``update()``. This reduces
|
|
the two queries we were using above - the ``get()`` and the
|
|
:meth:`~Model.save()` - to just one::
|
|
|
|
reporter = Reporters.objects.filter(name='Tintin')
|
|
reporter.update(stories_filed=F('stories_filed') + 1)
|
|
|
|
We can also use :meth:`~django.db.models.query.QuerySet.update()` to increment
|
|
the field value on multiple objects - which could be very much faster than
|
|
pulling them all into Python from the database, looping over them, incrementing
|
|
the field value of each one, and saving each one back to the database::
|
|
|
|
Reporter.objects.all().update(stories_filed=F('stories_filed) + 1)
|
|
|
|
``F()`` therefore can offer performance advantages by:
|
|
|
|
* getting the database, rather than Python, to do work
|
|
* reducing the number of queries some operations require
|
|
|
|
.. _avoiding-race-conditions-using-f:
|
|
|
|
Avoiding race conditions using ``F()``
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Another useful benefit of ``F()`` is that having the database - rather than
|
|
Python - update a field's value avoids a *race condition*.
|
|
|
|
If two Python threads execute the code in the first example above, one thread
|
|
could retrieve, increment, and save a field's value after the other has
|
|
retrieved it from the database. The value that the second thread saves will be
|
|
based on the original value; the work of the first thread will simply be lost.
|
|
|
|
If the database is responsible for updating the field, the process is more
|
|
robust: it will only ever update the field based on the value of the field in
|
|
the database when the :meth:`~Model.save()` or ``update()`` is executed, rather
|
|
than based on its value when the instance was retrieved.
|
|
|
|
Using ``F()`` in filters
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
``F()`` is also very useful in ``QuerySet`` filters, where they make it
|
|
possible to filter a set of objects against criteria based on their field
|
|
values, rather than on Python values.
|
|
|
|
This is documented in :ref:`using F() expressions in queries
|
|
<using-f-expressions-in-filters>`.
|
|
|
|
|
|
.. _func-expressions:
|
|
|
|
``Func()`` expressions
|
|
----------------------
|
|
|
|
.. versionadded:: 1.8
|
|
|
|
``Func()`` expressions are the base type of all expressions that involve
|
|
database functions like ``COALESCE`` and ``LOWER``, or aggregates like ``SUM``.
|
|
They can be used directly::
|
|
|
|
queryset.annotate(field_lower=Func(F('field'), function='LOWER'))
|
|
|
|
or they can be used to build a library of database functions::
|
|
|
|
class Lower(Func):
|
|
function = 'LOWER'
|
|
|
|
queryset.annotate(field_lower=Lower(F('field')))
|
|
|
|
But both cases will result in a queryset where each model is annotated with an
|
|
extra attribute ``field_lower`` produced, roughly, from the following SQL::
|
|
|
|
SELECT
|
|
...
|
|
LOWER("app_label"."field") as "field_lower"
|
|
|
|
See :doc:`database-functions` for a list of built-in database functions.
|
|
|
|
The ``Func`` API is as follows:
|
|
|
|
.. class:: Func(*expressions, **extra)
|
|
|
|
.. attribute:: function
|
|
|
|
A class attribute describing the function that will be generated.
|
|
Specifically, the ``function`` will be interpolated as the ``function``
|
|
placeholder within :attr:`template`. Defaults to ``None``.
|
|
|
|
.. attribute:: template
|
|
|
|
A class attribute, as a format string, that describes the SQL that is
|
|
generated for this function. Defaults to
|
|
``'%(function)s(%(expressions)s)'``.
|
|
|
|
.. attribute:: arg_joiner
|
|
|
|
A class attribute that denotes the character used to join the list of
|
|
``expressions`` together. Defaults to ``', '``.
|
|
|
|
The ``*expressions`` argument is a list of positional expressions that the
|
|
function will be applied to. The expressions will be converted to strings,
|
|
joined together with ``arg_joiner``, and then interpolated into the ``template``
|
|
as the ``expressions`` placeholder.
|
|
|
|
Positional arguments can be expressions or Python values. Strings are
|
|
assumed to be column references and will be wrapped in ``F()`` expressions
|
|
while other values will be wrapped in ``Value()`` expressions.
|
|
|
|
The ``**extra`` kwargs are ``key=value`` pairs that can be interpolated
|
|
into the ``template`` attribute. Note that the keywords ``function`` and
|
|
``template`` can be used to replace the ``function`` and ``template``
|
|
attributes respectively, without having to define your own class.
|
|
``output_field`` can be used to define the expected return type.
|
|
|
|
``Aggregate()`` expressions
|
|
---------------------------
|
|
|
|
An aggregate expression is a special case of a :ref:`Func() expression
|
|
<func-expressions>` that informs the query that a ``GROUP BY`` clause
|
|
is required. All of the :ref:`aggregate functions <aggregation-functions>`,
|
|
like ``Sum()`` and ``Count()``, inherit from ``Aggregate()``.
|
|
|
|
Since ``Aggregate``\s are expressions and wrap expressions, you can represent
|
|
some complex computations::
|
|
|
|
Company.objects.annotate(
|
|
managers_required=(Count('num_employees') / 4) + Count('num_managers'))
|
|
|
|
The ``Aggregate`` API is as follows:
|
|
|
|
.. class:: Aggregate(expression, output_field=None, **extra)
|
|
|
|
.. attribute:: template
|
|
|
|
A class attribute, as a format string, that describes the SQL that is
|
|
generated for this aggregate. Defaults to
|
|
``'%(function)s( %(expressions)s )'``.
|
|
|
|
.. attribute:: function
|
|
|
|
A class attribute describing the aggregate function that will be
|
|
generated. Specifically, the ``function`` will be interpolated as the
|
|
``function`` placeholder within :attr:`template`. Defaults to ``None``.
|
|
|
|
The ``expression`` argument can be the name of a field on the model, or another
|
|
expression. It will be converted to a string and used as the ``expressions``
|
|
placeholder within the ``template``.
|
|
|
|
The ``output_field`` argument requires a model field instance, like
|
|
``IntegerField()`` or ``BooleanField()``, into which Django will load the value
|
|
after it's retrieved from the database. Usually no arguments are needed when
|
|
instantiating the model field as any arguments relating to data validation
|
|
(``max_length``, ``max_digits``, etc.) will not be enforced on the expression's
|
|
output value.
|
|
|
|
Note that ``output_field`` is only required when Django is unable to determine
|
|
what field type the result should be. Complex expressions that mix field types
|
|
should define the desired ``output_field``. For example, adding an
|
|
``IntegerField()`` and a ``FloatField()`` together should probably have
|
|
``output_field=FloatField()`` defined.
|
|
|
|
.. note::
|
|
|
|
When you need to define the ``output_field`` for ``F`` expression
|
|
arithmetic between different types, it's necessary to surround the
|
|
expression in another expression::
|
|
|
|
from django.db.models import DateTimeField, Expression, F
|
|
|
|
Race.objects.annotate(finish=Expression(
|
|
F('start') + F('duration'), output_field=DateTimeField()))
|
|
|
|
.. versionchanged:: 1.8
|
|
|
|
``output_field`` is a new parameter.
|
|
|
|
The ``**extra`` kwargs are ``key=value`` pairs that can be interpolated
|
|
into the ``template`` attribute.
|
|
|
|
.. versionadded:: 1.8
|
|
|
|
Aggregate functions can now use arithmetic and reference multiple
|
|
model fields in a single function.
|
|
|
|
Creating your own Aggregate Functions
|
|
-------------------------------------
|
|
|
|
Creating your own aggregate is extremely easy. At a minimum, you need
|
|
to define ``function``, but you can also completely customize the
|
|
SQL that is generated. Here's a brief example::
|
|
|
|
class Count(Aggregate):
|
|
# supports COUNT(distinct field)
|
|
function = 'COUNT'
|
|
template = '%(function)s(%(distinct)s%(expressions)s)'
|
|
|
|
def __init__(self, expression, distinct=False, **extra):
|
|
super(Count, self).__init__(
|
|
expression,
|
|
distinct='DISTINCT ' if distinct else '',
|
|
output_field=IntegerField(),
|
|
**extra)
|
|
|
|
|
|
``Value()`` expressions
|
|
-----------------------
|
|
|
|
.. class:: Value(value, output_field=None)
|
|
|
|
|
|
A ``Value()`` object represents the smallest possible component of an
|
|
expression: a simple value. When you need to represent the value of an integer,
|
|
boolean, or string within an expression, you can wrap that value within a
|
|
``Value()``.
|
|
|
|
You will rarely need to use ``Value()`` directly. When you write the expression
|
|
``F('field') + 1``, Django implicitly wraps the ``1`` in a ``Value()``,
|
|
allowing simple values to be used in more complex expressions.
|
|
|
|
The ``value`` argument describes the value to be included in the expression,
|
|
such as ``1``, ``True``, or ``None``. Django knows how to convert these Python
|
|
values into their corresponding database type.
|
|
|
|
The ``output_field`` argument should be a model field instance, like
|
|
``IntegerField()`` or ``BooleanField()``, into which Django will load the value
|
|
after it's retrieved from the database. Usually no arguments are needed when
|
|
instantiating the model field as any arguments relating to data validation
|
|
(``max_length``, ``max_digits``, etc.) will not be enforced on the expression's
|
|
output value.
|
|
|
|
Conditional expressions
|
|
-----------------------
|
|
|
|
.. versionadded:: 1.8
|
|
|
|
Conditional expressions allow you to use :keyword:`if` ... :keyword:`elif` ...
|
|
:keyword:`else` logic in queries. Django natively supports SQL ``CASE``
|
|
expressions. For more details see :doc:`conditional-expressions`.
|
|
|
|
Technical Information
|
|
=====================
|
|
|
|
Below you'll find technical implementation details that may be useful to
|
|
library authors. The technical API and examples below will help with
|
|
creating generic query expressions that can extend the built-in functionality
|
|
that Django provides.
|
|
|
|
Expression API
|
|
--------------
|
|
|
|
Query expressions implement the :ref:`query expression API <query-expression>`,
|
|
but also expose a number of extra methods and attributes listed below. All
|
|
query expressions must inherit from ``Expression()`` or a relevant
|
|
subclass.
|
|
|
|
When a query expression wraps another expression, it is responsible for
|
|
calling the appropriate methods on the wrapped expression.
|
|
|
|
.. class:: Expression
|
|
|
|
.. attribute:: contains_aggregate
|
|
|
|
Tells Django that this expression contains an aggregate and that a
|
|
``GROUP BY`` clause needs to be added to the query.
|
|
|
|
.. method:: resolve_expression(query=None, allow_joins=True, reuse=None, summarize=False)
|
|
|
|
Provides the chance to do any pre-processing or validation of
|
|
the expression before it's added to the query. ``resolve_expression()``
|
|
must also be called on any nested expressions. A ``copy()`` of ``self``
|
|
should be returned with any necessary transformations.
|
|
|
|
``query`` is the backend query implementation.
|
|
|
|
``allow_joins`` is a boolean that allows or denies the use of
|
|
joins in the query.
|
|
|
|
``reuse`` is a set of reusable joins for multi-join scenarios.
|
|
|
|
``summarize`` is a boolean that, when ``True``, signals that the
|
|
query being computed is a terminal aggregate query.
|
|
|
|
.. method:: get_source_expressions()
|
|
|
|
Returns an ordered list of inner expressions. For example::
|
|
|
|
>>> Sum(F('foo')).get_source_expressions()
|
|
[F('foo')]
|
|
|
|
.. method:: set_source_expressions(expressions)
|
|
|
|
Takes a list of expressions and stores them such that
|
|
``get_source_expressions()`` can return them.
|
|
|
|
.. method:: relabeled_clone(change_map)
|
|
|
|
Returns a clone (copy) of ``self``, with any column aliases relabeled.
|
|
Column aliases are renamed when subqueries are created.
|
|
``relabeled_clone()`` should also be called on any nested expressions
|
|
and assigned to the clone.
|
|
|
|
``change_map`` is a dictionary mapping old aliases to new aliases.
|
|
|
|
Example::
|
|
|
|
def relabeled_clone(self, change_map):
|
|
clone = copy.copy(self)
|
|
clone.expression = self.expression.relabeled_clone(change_map)
|
|
return clone
|
|
|
|
.. method:: convert_value(self, value, expression, connection, context)
|
|
|
|
A hook allowing the expression to coerce ``value`` into a more
|
|
appropriate type.
|
|
|
|
.. method:: refs_aggregate(existing_aggregates)
|
|
|
|
Returns a tuple containing the ``(aggregate, lookup_path)`` of the
|
|
first aggregate that this expression (or any nested expression)
|
|
references, or ``(False, ())`` if no aggregate is referenced.
|
|
For example::
|
|
|
|
queryset.filter(num_chairs__gt=F('sum__employees'))
|
|
|
|
The ``F()`` expression here references a previous ``Sum()``
|
|
computation which means that this filter expression should be
|
|
added to the ``HAVING`` clause rather than the ``WHERE`` clause.
|
|
|
|
In the majority of cases, returning the result of ``refs_aggregate``
|
|
on any nested expression should be appropriate, as the necessary
|
|
built-in expressions will return the correct values.
|
|
|
|
.. method:: get_group_by_cols()
|
|
|
|
Responsible for returning the list of columns references by
|
|
this expression. ``get_group_by_cols()`` should be called on any
|
|
nested expressions. ``F()`` objects, in particular, hold a reference
|
|
to a column.
|
|
|
|
.. method:: asc()
|
|
|
|
Returns the expression ready to be sorted in ascending order.
|
|
|
|
.. method:: desc()
|
|
|
|
Returns the expression ready to be sorted in descending order.
|
|
|
|
.. method:: reverse_ordering()
|
|
|
|
Returns ``self`` with any modifications required to reverse the sort
|
|
order within an ``order_by`` call. As an example, an expression
|
|
implementing ``NULLS LAST`` would change its value to be
|
|
``NULLS FIRST``. Modifications are only required for expressions that
|
|
implement sort order like ``OrderBy``. This method is called when
|
|
:meth:`~django.db.models.query.QuerySet.reverse()` is called on a
|
|
queryset.
|
|
|
|
Writing your own Query Expressions
|
|
----------------------------------
|
|
|
|
You can write your own query expression classes that use, and can integrate
|
|
with, other query expressions. Let's step through an example by writing an
|
|
implementation of the ``COALESCE`` SQL function, without using the built-in
|
|
:ref:`Func() expressions <func-expressions>`.
|
|
|
|
The ``COALESCE`` SQL function is defined as taking a list of columns or
|
|
values. It will return the first column or value that isn't ``NULL``.
|
|
|
|
We'll start by defining the template to be used for SQL generation and
|
|
an ``__init__()`` method to set some attributes::
|
|
|
|
import copy
|
|
from django.db.models import Expression
|
|
|
|
class Coalesce(Expression):
|
|
template = 'COALESCE( %(expressions)s )'
|
|
|
|
def __init__(self, expressions, output_field, **extra):
|
|
super(Coalesce, self).__init__(output_field=output_field)
|
|
if len(expressions) < 2:
|
|
raise ValueError('expressions must have at least 2 elements')
|
|
for expression in expressions:
|
|
if not hasattr(expression, 'resolve_expression'):
|
|
raise TypeError('%r is not an Expression' % expression)
|
|
self.expressions = expressions
|
|
self.extra = extra
|
|
|
|
We do some basic validation on the parameters, including requiring at least
|
|
2 columns or values, and ensuring they are expressions. We are requiring
|
|
``output_field`` here so that Django knows what kind of model field to assign
|
|
the eventual result to.
|
|
|
|
Now we implement the pre-processing and validation. Since we do not have
|
|
any of our own validation at this point, we just delegate to the nested
|
|
expressions::
|
|
|
|
def resolve_expression(self, query=None, allow_joins=True, reuse=None, summarize=False):
|
|
c = self.copy()
|
|
c.is_summary = summarize
|
|
for pos, expression in enumerate(self.expressions):
|
|
c.expressions[pos] = expression.resolve_expression(query, allow_joins, reuse, summarize)
|
|
return c
|
|
|
|
Next, we write the method responsible for generating the SQL::
|
|
|
|
def as_sql(self, compiler, connection):
|
|
sql_expressions, sql_params = [], []
|
|
for expression in self.expressions:
|
|
sql, params = compiler.compile(expression)
|
|
sql_expressions.append(sql)
|
|
sql_params.extend(params)
|
|
self.extra['expressions'] = ','.join(sql_expressions)
|
|
return self.template % self.extra, sql_params
|
|
|
|
def as_oracle(self, compiler, connection):
|
|
"""
|
|
Example of vendor specific handling (Oracle in this case).
|
|
Let's make the function name lowercase.
|
|
"""
|
|
self.template = 'coalesce( %(expressions)s )'
|
|
return self.as_sql(compiler, connection)
|
|
|
|
We generate the SQL for each of the ``expressions`` by using the
|
|
``compiler.compile()`` method, and join the result together with commas.
|
|
Then the template is filled out with our data and the SQL and parameters
|
|
are returned.
|
|
|
|
We've also defined a custom implementation that is specific to the Oracle
|
|
backend. The ``as_oracle()`` function will be called instead of ``as_sql()``
|
|
if the Oracle backend is in use.
|
|
|
|
Finally, we implement the rest of the methods that allow our query expression
|
|
to play nice with other query expressions::
|
|
|
|
def get_source_expressions(self):
|
|
return self.expressions
|
|
|
|
def set_source_expressions(expressions):
|
|
self.expressions = expressions
|
|
|
|
Let's see how it works::
|
|
|
|
>>> qs = Company.objects.annotate(
|
|
... tagline=Coalesce([
|
|
... F('motto'),
|
|
... F('ticker_name'),
|
|
... F('description'),
|
|
... Value('No Tagline')
|
|
... ], output_field=CharField()))
|
|
>>> for c in qs:
|
|
... print("%s: %s" % (c.name, c.tagline))
|
|
...
|
|
Google: Do No Evil
|
|
Apple: AAPL
|
|
Yahoo: Internet Company
|
|
Django Software Foundation: No Tagline
|