==============
Custom Lookups
==============

.. versionadded:: 1.7

.. currentmodule:: django.db.models

Django offers a wide variety of :ref:`built-in lookups <field-lookups>` for
filtering (for example, ``exact`` and ``icontains``). This documentation
explains how to write custom lookups and how to alter the working of existing
lookups. For the API references of lookups, see the :doc:`/ref/models/lookups`.

A simple lookup example
~~~~~~~~~~~~~~~~~~~~~~~

Let's start with a simple custom lookup. We will write a custom lookup ``ne``
which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
will translate to the SQL::

    "author"."name" <> 'Jack'

This SQL is backend independent, so we don't need to worry about different
databases.

There are two steps to making this work. First we need to implement the
lookup, then we need to tell Django about it. The implementation is quite
straightforward::

    from django.db.models import Lookup

    class NotEqual(Lookup):
        lookup_name = 'ne'

        def as_sql(self, qn, connection):
            # Each call returns a (sql, params) pair.
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            # Parameters must follow the order of their placeholders.
            params = lhs_params + rhs_params
            return '%s <> %s' % (lhs, rhs), params

To register the ``NotEqual`` lookup we just need to call ``register_lookup``
on the field class for which the lookup should be available. In this case, the
lookup makes sense on all ``Field`` subclasses, so we register it with
``Field`` directly::

    from django.db.models.fields import Field
    Field.register_lookup(NotEqual)

We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
this registration happens before you try to create any querysets using it. You
could place the implementation in a ``models.py`` file, or register the lookup
in the ``ready()`` method of an ``AppConfig``.

Taking a closer look at the implementation, the first required attribute is
``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
and use ``NotEqual`` to generate the SQL. By convention, these names are always
lowercase strings containing only letters, but the only hard requirement is
that it must not contain the string ``__``.

We then need to define the ``as_sql`` method. This takes a ``SQLCompiler``
object, called ``qn``, and the active database connection. ``SQLCompiler``
objects are not documented, but the only thing we need to know about them is
that they have a ``compile()`` method which returns a tuple containing a SQL
string, and the parameters to be interpolated into that string. In most cases,
you don't need to use it directly and can pass it on to ``process_lhs()`` and
``process_rhs()``.

A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
left-hand side and right-hand side. The left-hand side is usually a field
reference, but it can be anything implementing the :ref:`query expression API
<query-expression>`. The right-hand side is the value given by the user. In the
example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
right-hand side.

We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
need for SQL using the ``qn`` object described before. These methods return
tuples containing some SQL and the parameters to be interpolated into that SQL,
just as we need to return from our ``as_sql`` method. In the above example,
``process_lhs`` returns ``('"author"."name"', [])`` and ``process_rhs`` returns
``('%s', ['Jack'])``. In this example there were no parameters for the
left-hand side, but this would depend on the object we have, so we still need
to include them in the parameters we return.

Finally we combine the parts into a SQL expression with ``<>``, and supply all
the parameters for the query. We then return a tuple containing the generated
SQL string and the parameters.

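As a rough illustration of how the two ``(sql, params)`` pairs fit together,
the following sketch repeats the combining step in plain Python, without
Django. The function name ``combine_not_equal`` is hypothetical, and the input
tuples are written by hand to mimic the values described above rather than
produced by ``process_lhs`` and ``process_rhs``:

```python
def combine_not_equal(lhs, rhs):
    """Join two (sql, params) pairs with <>, keeping the parameters in order."""
    lhs_sql, lhs_params = lhs
    rhs_sql, rhs_params = rhs
    # Left-hand parameters come first because their placeholders appear first.
    return '%s <> %s' % (lhs_sql, rhs_sql), list(lhs_params) + list(rhs_params)


# Hand-written stand-ins (assumptions) for the values from the example above.
sql, params = combine_not_equal(('"author"."name"', []), ('%s', ['Jack']))
```

The value ``'Jack'`` never appears in the SQL string itself. It travels
separately in ``params`` and is interpolated by the database driver.
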
A simple transformer example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The custom lookup above is great, but in some cases you may want to be able to
chain lookups together. For example, let's suppose we are building an
application where we want to make use of the ``abs()`` operator.
We have an ``Experiment`` model which records a start value, end value, and the
change (start - end). We would like to find all experiments where the change
was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
or where it did not exceed a certain amount
(``Experiment.objects.filter(change__abs__lt=27)``).

.. note::
    This example is somewhat contrived, but it nicely demonstrates the range of
    functionality which is possible in a database backend independent manner,
    and without duplicating functionality already in Django.

We will start by writing an ``AbsoluteValue`` transformer. This will use the SQL
function ``ABS()`` to transform the value before comparison::

    from django.db.models import Transform

    class AbsoluteValue(Transform):
        lookup_name = 'abs'

        def as_sql(self, qn, connection):
            # Compile the wrapped expression, then apply ABS() to its SQL.
            lhs, params = qn.compile(self.lhs)
            return "ABS(%s)" % lhs, params

Next, let's register it for ``IntegerField``::

    from django.db.models import IntegerField
    IntegerField.register_lookup(AbsoluteValue)

We can now run the queries we had before.
``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::

    SELECT ... WHERE ABS("experiments"."change") = 27

Using ``Transform`` instead of ``Lookup`` means we are able to chain
further lookups afterwards. So
``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
SQL::

    SELECT ... WHERE ABS("experiments"."change") < 27

Subclasses of ``Transform`` usually only operate on the left-hand side of the
expression. Further lookups will work on the transformed value. Note that in
this case where there is no other lookup specified, Django interprets
``change__abs=27`` as ``change__abs__exact=27``.

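To make the chaining concrete, here is a small sketch in plain Python, without
Django, of how a transform rewrites the left-hand side before a lookup compares
it. The helpers ``abs_transform`` and ``compare`` are hypothetical names, and
the ``(sql, params)`` pairs are written by hand as assumptions:

```python
def abs_transform(lhs):
    """Wrap the left-hand side SQL in ABS(); parameters pass through unchanged."""
    sql, params = lhs
    return 'ABS(%s)' % sql, params


def compare(lhs, operator, rhs):
    """Build the final comparison from two (sql, params) pairs."""
    lhs_sql, lhs_params = lhs
    rhs_sql, rhs_params = rhs
    sql = '%s %s %s' % (lhs_sql, operator, rhs_sql)
    return sql, list(lhs_params) + list(rhs_params)


# Hand-written stand-ins (assumptions) for the compiled column and value.
column = ('"experiments"."change"', [])
value = ('%s', [27])

# change__abs__lt=27: the transform runs first, then the trailing lookup.
lt_sql, lt_params = compare(abs_transform(column), '<', value)
# change__abs=27 is treated like change__abs__exact=27.
exact_sql, exact_params = compare(abs_transform(column), '=', value)
```

Because the transform only changes the left-hand side, the same transformed
expression can be handed to any lookup that follows it.
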
When looking for which lookups are allowable after the ``Transform`` has been
applied, Django uses the ``output_field`` attribute. We didn't need to specify
this here as it didn't change, but supposing we were applying ``AbsoluteValue``
to some field which represents a more complex type (for example a point
relative to an origin, or a complex number) then we may have wanted to specify
``output_field = FloatField()``, which will ensure that further lookups like
``abs__lte`` behave as they would for a ``FloatField``.

Writing an efficient abs__lt lookup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When using the ``abs`` lookup written above, the SQL produced will not use
indexes efficiently in some cases. In particular, when we use
``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).

So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
the following SQL::

    SELECT ... WHERE "experiments"."change" < 27 AND "experiments"."change" > -27

The implementation is::

    from django.db.models import Lookup

    class AbsoluteValueLessThan(Lookup):
        lookup_name = 'lt'

        def as_sql(self, qn, connection):
            # Compile the untransformed column, bypassing AbsoluteValue.
            lhs, lhs_params = qn.compile(self.lhs.lhs)
            rhs, rhs_params = self.process_rhs(qn, connection)
            # Each side appears twice in the SQL, so its params do too.
            params = lhs_params + rhs_params + lhs_params + rhs_params
            return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params

    AbsoluteValue.register_lookup(AbsoluteValueLessThan)

There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
isn't calling ``process_lhs()``. Instead it skips the transformation of the
``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
want to get ``"experiments"."change"`` not ``ABS("experiments"."change")``.
Referring directly to ``self.lhs.lhs`` is safe as ``AbsoluteValueLessThan`` can
be accessed only from the ``AbsoluteValue`` lookup, that is, the ``lhs`` is
always an instance of ``AbsoluteValue``.

Notice also that as both sides are used multiple times in the query the params
need to contain ``lhs_params`` and ``rhs_params`` multiple times.

The final query does the inversion (``27`` to ``-27``) directly in the
database. The reason for doing this is that if ``self.rhs`` is something other
than a plain integer value (for example an ``F()`` reference) we can't do the
transformations in Python.

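The parameter bookkeeping can be checked in isolation. The sketch below is
plain Python, without Django. The function name ``abs_less_than`` is
hypothetical and the input pairs are hand-written assumptions standing in for
the compiled left-hand and right-hand sides:

```python
def abs_less_than(lhs, rhs):
    """Build the range condition and repeat params once per placeholder use."""
    lhs_sql, lhs_params = lhs
    rhs_sql, rhs_params = rhs
    sql = '%s < %s AND %s > -%s' % (lhs_sql, rhs_sql, lhs_sql, rhs_sql)
    # Same order as the fragments above: lhs, rhs, lhs, rhs.
    params = (list(lhs_params) + list(rhs_params)
              + list(lhs_params) + list(rhs_params))
    return sql, params


# Hand-written stand-ins (assumptions) for the compiled column and value.
sql, params = abs_less_than(('"experiments"."change"', []), ('%s', [27]))
```

The right-hand placeholder occurs twice in the SQL, so ``27`` has to be
supplied twice. The negation is left to the database through the leading
``-`` in the SQL.
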
.. note::
    In fact, most lookups with ``__abs`` could be implemented as range queries
    like this, and on most database backends it is likely to be more sensible to
    do so as you can make use of the indexes. However with PostgreSQL you may
    want to add an index on ``abs(change)`` which would allow these queries to
    be very efficient.

Writing alternative implementations for existing lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sometimes different database vendors require different SQL for the same
operation. For this example we will write a custom implementation of the
``NotEqual`` operator for MySQL. Instead of ``<>`` we will be using the ``!=``
operator. (Note that in reality almost all databases support both, including
all the official databases supported by Django).

We can change the behavior on a specific backend by creating a subclass of
``NotEqual`` with an ``as_mysql`` method::

    class MySQLNotEqual(NotEqual):
        def as_mysql(self, qn, connection):
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params
            return '%s != %s' % (lhs, rhs), params

    Field.register_lookup(MySQLNotEqual)

We can then register it with ``Field``. It takes the place of the original
``NotEqual`` class as it has the same ``lookup_name``.

When compiling a query, Django first looks for ``'as_%s' % connection.vendor``
methods, and then falls back to ``as_sql``. The vendor names for the built-in
backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.

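The dispatch rule described above can be sketched in plain Python, without
Django. The names ``FakeConnection``, ``NotEqualSketch`` and ``compile_node``
are hypothetical and only mimic the idea. This is not Django's actual compiler
code, and the methods return just the operator to keep the sketch short:

```python
class FakeConnection(object):
    """Stand-in for a database connection; only the vendor name matters here."""
    def __init__(self, vendor):
        self.vendor = vendor


class NotEqualSketch(object):
    def as_sql(self, qn, connection):
        return '<>'

    def as_mysql(self, qn, connection):
        return '!='


def compile_node(node, qn, connection):
    """Prefer as_<vendor> when the node defines it, else fall back to as_sql."""
    vendor_impl = getattr(node, 'as_%s' % connection.vendor, None)
    if vendor_impl is not None:
        return vendor_impl(qn, connection)
    return node.as_sql(qn, connection)


mysql_operator = compile_node(NotEqualSketch(), None, FakeConnection('mysql'))
default_operator = compile_node(NotEqualSketch(), None, FakeConnection('postgresql'))
```

Only the ``mysql`` vendor picks up the override. Every other vendor falls
through to ``as_sql``.
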
How Django determines the lookups and transforms which are used
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In some cases you may wish to dynamically change which ``Transform`` or
``Lookup`` is returned based on the name passed in, rather than fixing it. As
an example, you could have a field which stores coordinates of an arbitrary
dimension, and wish to allow a syntax like ``.filter(coords__x7=4)`` to return
the objects where the 7th coordinate has value 4. In order to do this, you
would override ``get_lookup`` with something like::

    class CoordinatesField(Field):
        def get_lookup(self, lookup_name):
            if lookup_name.startswith('x'):
                try:
                    dimension = int(lookup_name[1:])
                except ValueError:
                    # Not of the form x<number>; use the default handling.
                    pass
                else:
                    return get_coordinate_lookup(dimension)
            return super(CoordinatesField, self).get_lookup(lookup_name)

You would then define ``get_coordinate_lookup`` appropriately to return a
``Lookup`` subclass which handles the relevant value of ``dimension``.

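The name parsing can be tried on its own, without Django. The helper
``parse_dimension`` below is a hypothetical name used only for illustration. It
mirrors the ``try``/``except`` logic of the override and returns ``None``
where the override would defer to the default ``get_lookup``:

```python
def parse_dimension(lookup_name):
    """Return the dimension encoded in names like 'x7', or None otherwise."""
    if not lookup_name.startswith('x'):
        return None
    try:
        return int(lookup_name[1:])
    except ValueError:
        # Names such as 'xyz' start with 'x' but carry no number.
        return None


seventh = parse_dimension('x7')
not_a_dimension = parse_dimension('xyz')
ordinary_lookup = parse_dimension('exact')
```

Returning the coordinate lookup only when the conversion succeeds keeps an
unparsable name from reaching ``get_coordinate_lookup`` with an undefined
``dimension``.
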
There is a similarly named method called ``get_transform()``. ``get_lookup()``
should always return a ``Lookup`` subclass, and ``get_transform()`` a
``Transform`` subclass. It is important to remember that ``Transform``
objects can be further filtered on, and ``Lookup`` objects cannot.

When filtering, if there is only one lookup name remaining to be resolved,
Django will look for a ``Lookup``. If there are multiple names, it will look
for a ``Transform``. In the situation where there is only one name and a
``Lookup`` is not found, it looks for a ``Transform`` and then the ``exact``
lookup on that ``Transform``. All call sequences always end with a ``Lookup``.
To clarify:

- ``.filter(myfield__mylookup)`` will call ``myfield.get_lookup('mylookup')``.
- ``.filter(myfield__mytransform__mylookup)`` will call
  ``myfield.get_transform('mytransform')``, and then
  ``mytransform.get_lookup('mylookup')``.
- ``.filter(myfield__mytransform)`` will first call
  ``myfield.get_lookup('mytransform')``, which will fail, so it will fall back
  to calling ``myfield.get_transform('mytransform')`` and then
  ``mytransform.get_lookup('exact')``.

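The three cases above can be modelled with a small plain-Python sketch that
does not use Django. The function ``resolve`` and its arguments are
hypothetical. As a simplifying assumption, a single set of lookup names and a
single set of transform names stand in for what each field or transform would
report through ``get_lookup()`` and ``get_transform()``:

```python
def resolve(lookups, transforms, names):
    """Return the resolution steps for the names that follow the field name."""
    if not names:
        # A bare field name is treated as an exact match.
        names = ['exact']
    steps = []
    # Every name except the last must be a transform.
    for name in names[:-1]:
        if name not in transforms:
            raise LookupError('Unsupported transform: %s' % name)
        steps.append(('transform', name))
    last = names[-1]
    if last in lookups:
        steps.append(('lookup', last))
    elif last in transforms:
        # Fall back to the transform followed by an implicit exact lookup.
        steps.append(('transform', last))
        steps.append(('lookup', 'exact'))
    else:
        raise LookupError('Unsupported lookup: %s' % last)
    return steps


known_lookups = {'exact', 'mylookup'}
known_transforms = {'mytransform'}

single = resolve(known_lookups, known_transforms, ['mylookup'])
chained = resolve(known_lookups, known_transforms, ['mytransform', 'mylookup'])
fallback = resolve(known_lookups, known_transforms, ['mytransform'])
```

In every case the last step is a lookup, which matches the rule that all call
sequences end with a ``Lookup``.
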