mirror of
				https://github.com/django/django.git
				synced 2025-10-24 22:26:08 +00:00 
			
		
		
		
	Also fixed #22124 -- Expanded explanation of exactly what is going on in as_sql() methods.
		
			
				
	
	
		
			408 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			408 lines
		
	
	
		
			17 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| ==============
 | |
| Custom lookups
 | |
| ==============
 | |
| 
 | |
| .. versionadded:: 1.7
 | |
| 
 | |
| .. module:: django.db.models.lookups
 | |
|    :synopsis: Custom lookups
 | |
| 
 | |
| .. currentmodule:: django.db.models
 | |
| 
 | |
| By default Django offers a wide variety of :ref:`built-in lookups
 | |
| <field-lookups>` for filtering (for example, ``exact`` and ``icontains``). This
 | |
| documentation explains how to write custom lookups and how to alter the working
 | |
| of existing lookups.
 | |
| 
 | |
| A simple Lookup example
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| Let's start with a simple custom lookup. We will write a custom lookup ``ne``
 | |
| which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
 | |
| will translate to the SQL::
 | |
| 
 | |
|   "author"."name" <> 'Jack'
 | |
| 
 | |
| This SQL is backend independent, so we don't need to worry about different
 | |
| databases.
 | |
| 
 | |
| There are two steps to making this work. Firstly we need to implement the
 | |
| lookup, then we need to tell Django about it. The implementation is quite
 | |
| straightforward::
 | |
| 
 | |
|   from django.db.models import Lookup
 | |
| 
 | |
|   class NotEqual(Lookup):
 | |
|       lookup_name = 'ne'
 | |
| 
 | |
|       def as_sql(self, qn, connection):
 | |
|           lhs, lhs_params = self.process_lhs(qn, connection)
 | |
|           rhs, rhs_params = self.process_rhs(qn, connection)
 | |
|           params = lhs_params + rhs_params
 | |
|           return '%s <> %s' % (lhs, rhs), params
 | |
| 
 | |
| To register the ``NotEqual`` lookup we will just need to call
 | |
| ``register_lookup`` on the field class we want the lookup to be available. In
 | |
| this case, the lookup makes sense on all ``Field`` subclasses, so we register
 | |
| it with ``Field`` directly::
 | |
| 
 | |
|   from django.db.models.fields import Field
 | |
|   Field.register_lookup(NotEqual)
 | |
| 
 | |
| We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
 | |
| this registration happens before you try to create any querysets using it. You
 | |
| could place the implementation in a ``models.py`` file, or register the lookup
 | |
| in the ``ready()`` method of an ``AppConfig``.
 | |
| 
 | |
| Taking a closer look at the implementation, the first required attribute is
 | |
| ``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
 | |
| and use ``NotEqual`` to generate the SQL. By convention, these names are always
 | |
| lowercase strings containing only letters, but the only hard requirement is
 | |
| that it must not contain the string ``__``.
 | |
| 
 | |
| We then need to define the ``as_sql`` method. This takes a ``SQLCompiler``
 | |
| object, called ``qn``,  and the active database connection. ``SQLCompiler``
 | |
| objects are not documented, but the only thing we need to know about them is
 | |
| that they have a ``compile()`` method which returns a tuple containing a SQL
 | |
| string, and the parameters to be interpolated into that string. In most cases,
 | |
| you don't need to use it directly and can pass it on to ``process_lhs()`` and
 | |
| ``process_rhs()``.
 | |
| 
 | |
| A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
 | |
| left-hand side and right-hand side. The left-hand side is usually a field
 | |
| reference, but it can be anything implementing the :ref:`query expression API
 | |
| <query-expression>`. The right-hand is the value given by the user. In the
 | |
| example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
 | |
| reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
 | |
| right-hand side.
 | |
| 
 | |
| We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
 | |
| need for SQL using the ``qn`` object described before. These methods return
 | |
| tuples containing some SQL and the parameters to be interpolated into that SQL,
 | |
| just as we need to return from our ``as_sql`` method. In the above example,
 | |
| ``process_lhs`` returns ``('"author"."name"', [])`` and ``process_rhs`` returns
 | |
| ``('"%s"', ['Jack'])``. In this example there were no parameters for the left
 | |
| hand side, but this would depend on the object we have, so we still need to
 | |
| include them in the parameters we return.
 | |
| 
 | |
| Finally we combine the parts into a SQL expression with ``<>``, and supply all
 | |
| the parameters for the query. We then return a tuple containing the generated
 | |
| SQL string and the parameters.
 | |
| 
 | |
| A simple transformer example
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| The custom lookup above is great, but in some cases you may want to be able to
 | |
| chain lookups together. For example, let's suppose we are building an
 | |
| application where we want to make use of the ``abs()`` operator.
 | |
| We have an ``Experiment`` model which records a start value, end value and the
 | |
| change (start - end). We would like to find all experiments where the change
 | |
| was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
 | |
| or where it did not exceed a certain amount
 | |
| (``Experiment.objects.filter(change__abs__lt=27)``).
 | |
| 
 | |
| .. note::
 | |
|     This example is somewhat contrived, but it demonstrates nicely the range of
 | |
|     functionality which is possible in a database backend independent manner,
 | |
|     and without duplicating functionality already in Django.
 | |
| 
 | |
| We will start by writing a ``AbsoluteValue`` transformer. This will use the SQL
 | |
| function ``ABS()`` to transform the value before comparison::
 | |
| 
 | |
|   from django.db.models import Transform
 | |
| 
 | |
|   class AbsoluteValue(Transform):
 | |
|       lookup_name = 'abs'
 | |
| 
 | |
|       def as_sql(self, qn, connection):
 | |
|           lhs, params = qn.compile(self.lhs)
 | |
|           return "ABS(%s)" % lhs, params
 | |
| 
 | |
| Next, lets register it for ``IntegerField``::
 | |
| 
 | |
|   from django.db.models import IntegerField
 | |
|   IntegerField.register_lookup(AbsoluteValue)
 | |
| 
 | |
| We can now run the queries we had before.
 | |
| ``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::
 | |
| 
 | |
|     SELECT ... WHERE ABS("experiments"."change") = 27
 | |
| 
 | |
| By using ``Transform`` instead of ``Lookup`` it means we are able to chain
 | |
| further lookups afterwards. So
 | |
| ``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
 | |
| SQL::
 | |
| 
 | |
|     SELECT ... WHERE ABS("experiments"."change") < 27
 | |
| 
 | |
| Subclasses of ``Transform`` usually only operate on the left-hand side of the
 | |
| expression. Further lookups will work on the transformed value. Note that in
 | |
| this case where there is no other lookup specified, Django interprets
 | |
| ``change__abs=27`` as ``change__abs__exact=27``.
 | |
| 
 | |
| When looking for which lookups are allowable after the ``Transform`` has been
 | |
| applied, Django uses the ``output_type`` attribute. We didn't need to specify
 | |
| this here as it didn't change, but supposing we were applying ``AbsoluteValue``
 | |
| to some field which represents a more complex type (for example a point
 | |
| relative to an origin, or a complex number) then we may have wanted to specify
 | |
| ``output_type = FloatField``, which will ensure that further lookups like
 | |
| ``abs__lte`` behave as they would for a ``FloatField``.
 | |
| 
 | |
| Writing an efficient abs__lt lookup
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| When using the above written ``abs`` lookup, the SQL produced will not use
 | |
| indexes efficiently in some cases. In particular, when we use
 | |
| ``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
 | |
| ``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
 | |
| 
 | |
| So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
 | |
| the following SQL::
 | |
| 
 | |
|     SELECT .. WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
 | |
| 
 | |
| The implementation is::
 | |
| 
 | |
|   from django.db.models import Lookup
 | |
| 
 | |
|   class AbsoluteValueLessThan(Lookup):
 | |
|       lookup_name = 'lt'
 | |
| 
 | |
|       def as_sql(self, qn, connection):
 | |
|           lhs, lhs_params = qn.compile(self.lhs.lhs)
 | |
|           rhs, rhs_params = self.process_rhs(qn, connection)
 | |
|           params = lhs_params + rhs_params + lhs_params + rhs_params
 | |
|           return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params
 | |
| 
 | |
|   AbsoluteValue.register_lookup(AbsoluteValueLessThan)
 | |
| 
 | |
| There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
 | |
| isn't calling ``process_lhs()``. Instead it skips the transformation of the
 | |
| ``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
 | |
| want to get ``27`` not ``ABS(27)``. Referring directly to ``self.lhs.lhs`` is
 | |
| safe as ``AbsoluteValueLessThan`` can be accessed only from the
 | |
| ``AbsoluteValue`` lookup, that is the ``lhs`` is always an instance of
 | |
| ``AbsoluteValue``.
 | |
| 
 | |
| Notice also that  as both sides are used multiple times in the query the params
 | |
| need to contain ``lhs_params`` and ``rhs_params`` multiple times.
 | |
| 
 | |
| The final query does the inversion (``27`` to ``-27``) directly in the
 | |
| database. The reason for doing this is that if the self.rhs is something else
 | |
| than a plain integer value (for example an ``F()`` reference) we can't do the
 | |
| transformations in Python.
 | |
| 
 | |
| .. note::
 | |
|     In fact, most lookups with ``__abs`` could be implemented as range queries
 | |
|     like this, and on most database backends it is likely to be more sensible to
 | |
|     do so as you can make use of the indexes. However with PostgreSQL you may
 | |
|     want to add an index on ``abs(change)`` which would allow these queries to
 | |
|     be very efficient.
 | |
| 
 | |
| Writing alternative implementations for existing lookups
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| Sometimes different database vendors require different SQL for the same
 | |
| operation. For this example we will rewrite a custom implementation for
 | |
| MySQL for the NotEqual operator. Instead of ``<>`` we will be using ``!=``
 | |
| operator. (Note that in reality almost all databases support both, including
 | |
| all the official databases supported by Django).
 | |
| 
 | |
| We can change the behavior on a specific backend by creating a subclass of
 | |
| ``NotEqual`` with a ``as_mysql`` method::
 | |
| 
 | |
|   class MySQLNotEqual(NotEqual):
 | |
|       def as_mysql(self, qn, connection):
 | |
|           lhs, lhs_params = self.process_lhs(qn, connection)
 | |
|           rhs, rhs_params = self.process_rhs(qn, connection)
 | |
|           params = lhs_params + rhs_params
 | |
|           return '%s != %s' % (lhs, rhs), params
 | |
|   Field.register_lookup(MySQLNotExact)
 | |
| 
 | |
| We can then register it with ``Field``. It takes the place of the original
 | |
| ``NotEqual`` class as it has the same ``lookup_name``.
 | |
| 
 | |
| When compiling a query, Django first looks for ``as_%s % connection.vendor``
 | |
| methods, and then falls back to ``as_sql``. The vendor names for the in-built
 | |
| backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.
 | |
| 
 | |
| How Django determines the lookups and transforms which are used
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| In some cases you may which to dynamically change which ``Transform`` or
 | |
| ``Lookup`` is returned based on the name passed in, rather than fixing it. As
 | |
| an example, you could have a field which stores coordinates or an arbitrary
 | |
| dimension, and wish to allow a syntax like ``.filter(coords__x7=4)`` to return
 | |
| the objects where the 7th coordinate has value 4. In order to do this, you
 | |
| would override ``get_lookup`` with something like::
 | |
| 
 | |
|     class CoordinatesField(Field):
 | |
|         def get_lookup(self, lookup_name):
 | |
|             if lookup_name.startswith('x'):
 | |
|                 try:
 | |
|                     dimension = int(lookup_name[1:])
 | |
|                 except ValueError:
 | |
|                     pass
 | |
|                 finally:
 | |
|                     return get_coordinate_lookup(dimension)
 | |
|             return super(CoordinatesField, self).get_lookup(lookup_name)
 | |
| 
 | |
| You would then define ``get_coordinate_lookup`` appropriately to return a
 | |
| ``Lookup`` subclass which handles the relevant value of ``dimension``.
 | |
| 
 | |
| There is a similarly named method called ``get_transform()``. ``get_lookup()``
 | |
| should always return a ``Lookup`` subclass, and ``get_transform()`` a
 | |
| ``Transform`` subclass. It is important to remember that ``Transform``
 | |
| objects can be further filtered on, and ``Lookup`` objects cannot.
 | |
| 
 | |
| When filtering, if there is only one lookup name remaining to be resolved, we
 | |
| will look for a ``Lookup``. If there are multiple names, it will look for a
 | |
| ``Transform``. In the situation where there is only one name and a ``Lookup``
 | |
| is not found, we look for a ``Transform`` and then the ``exact`` lookup on that
 | |
| ``Transform``. All call sequences always end with a ``Lookup``. To clarify:
 | |
| 
 | |
| - ``.filter(myfield__mylookup)`` will call ``myfield.get_lookup('mylookup')``.
 | |
| - ``.filter(myfield__mytransform__mylookup)`` will call
 | |
|   ``myfield.get_transform('mytransform')``, and then
 | |
|   ``mytransform.get_lookup('mylookup')``.
 | |
| - ``.filter(myfield__mytransform)`` will first call
 | |
|   ``myfield.get_lookup('mytransform')``, which will fail, so it will fall back
 | |
|   to calling ``myfield.get_transform('mytransform')`` and then
 | |
|   ``mytransform.get_lookup('exact')``.
 | |
| 
 | |
| Lookups and transforms are registered using the same API - ``register_lookup``.
 | |
| 
 | |
| .. _query-expression:
 | |
| 
 | |
| The Query Expression API
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| A lookup can assume that the lhs responds to the query expression API.
 | |
| Currently direct field references, aggregates and ``Transform`` instances respond
 | |
| to this API.
 | |
| 
 | |
| .. method:: as_sql(qn, connection)
 | |
| 
 | |
|     Responsible for producing the query string and parameters for the
 | |
|     expression. The ``qn`` is a ``SQLCompiler`` object, which has a
 | |
|     ``compile()`` method that can be used to compile other expressions. The
 | |
|     ``connection`` is the connection used to execute the query.
 | |
| 
 | |
|     Calling expression.as_sql() directly is usually incorrect - instead
 | |
|     ``qn.compile(expression)`` should be used. The ``qn.compile()`` method will
 | |
|     take care of calling vendor-specific methods of the expression.
 | |
| 
 | |
| .. method:: as_vendorname(qn, connection)
 | |
| 
 | |
|     Works like ``as_sql()`` method. When an expression is compiled by
 | |
|     ``qn.compile()``, Django will first try to call ``as_vendorname()``, where
 | |
|     vendorname is the vendor name of the backend used for executing the query.
 | |
|     The vendorname is one of ``postgresql``, ``oracle``, ``sqlite`` or
 | |
|     ``mysql`` for Django's built-in backends.
 | |
| 
 | |
| .. method:: get_lookup(lookup_name)
 | |
| 
 | |
|     The ``get_lookup()`` method is used to fetch lookups. By default the
 | |
|     lookup is fetched from the expression's output type in the same way
 | |
|     described in registering and fetching lookup documentation below.
 | |
|     It is possible to override this method to alter that behavior.
 | |
| 
 | |
| .. method:: get_transform(lookup_name)
 | |
| 
 | |
|     The ``get_transform()`` method is used when a transform is needed rather
 | |
|     than a lookup, or if a lookup is not found. This is a more complex
 | |
|     situation which is useful when there arbitrary possible lookups for a
 | |
|     field. Generally speaking, you will not need to override ``get_lookup()``
 | |
|     or ``get_transform()``, and can use ``register_lookup()`` instead.
 | |
| 
 | |
| .. attribute:: output_type
 | |
| 
 | |
|     The ``output_type`` attribute is used by the ``get_lookup()`` method to check for
 | |
|     lookups. The output_type should be a field.
 | |
| 
 | |
| Note that this documentation lists only the public methods of the API.
 | |
| 
 | |
| Lookup reference
 | |
| ~~~~~~~~~~~~~~~~
 | |
| 
 | |
| .. class:: Lookup
 | |
| 
 | |
|     In addition to the attributes and methods below, lookups also support
 | |
|     ``as_sql`` and ``as_vendorname`` from the query expression API.
 | |
| 
 | |
| .. attribute:: lhs
 | |
| 
 | |
|     The ``lhs`` (left-hand side) of a lookup tells us what we are comparing the
 | |
|     rhs to. It is an object which implements the query expression API. This is
 | |
|     likely to be a field, an aggregate or a subclass of ``Transform``.
 | |
| 
 | |
| .. attribute:: rhs
 | |
| 
 | |
|     The ``rhs`` (right-hand side) of a lookup is the value we are comparing the
 | |
|     left hand side to. It may be a plain value, or something which compiles
 | |
|     into SQL, for example an ``F()`` object or a ``Queryset``.
 | |
| 
 | |
| .. attribute:: lookup_name
 | |
| 
 | |
|     This class level attribute is used when registering lookups. It determines
 | |
|     the name used in queries to trigger this lookup. For example, ``contains``
 | |
|     or ``exact``. This should not contain the string ``__``.
 | |
| 
 | |
| .. method:: process_lhs(qn, connection)
 | |
| 
 | |
|     This returns a tuple of ``(lhs_string, lhs_params)``. In some cases you may
 | |
|     wish to compile ``lhs`` directly in your ``as_sql`` methods using
 | |
|     ``qn.compile(self.lhs)``.
 | |
| 
 | |
| .. method:: process_rhs(qn, connection)
 | |
| 
 | |
|     Behaves the same as ``process_lhs`` but acts on the right-hand side.
 | |
| 
 | |
| Transform reference
 | |
| ~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| .. class:: Transform
 | |
| 
 | |
|     In addition to implementing the query expression API Transforms have the
 | |
|     following methods and attributes.
 | |
| 
 | |
| .. attribute:: lhs
 | |
| 
 | |
|     The ``lhs`` (left-hand-side) of a transform contains the value to be
 | |
|     transformed. The ``lhs`` implements the query expression API.
 | |
| 
 | |
| .. attribute:: lookup_name
 | |
| 
 | |
|     This class level attribute is used when registering lookups. It determines
 | |
|     the name used in queries to trigger this lookup. For example, ``year``
 | |
|     or ``dayofweek``. This should not contain the string ``__``.
 | |
| 
 | |
| .. _lookup-registration-api:
 | |
| 
 | |
| Registering and fetching lookups
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| The lookup registration API is explained below.
 | |
| 
 | |
| .. classmethod:: register_lookup(lookup)
 | |
| 
 | |
|     Registers the Lookup or Transform for the class. For example
 | |
|     ``DateField.register_lookup(YearExact)`` will register ``YearExact`` for
 | |
|     all ``DateFields`` in the project, but also for fields that are instances
 | |
|     of a subclass of ``DateField`` (for example ``DateTimeField``). You can
 | |
|     register a Lookup or a Transform using the same class method.
 | |
| 
 | |
| .. method:: get_lookup(lookup_name)
 | |
| 
 | |
|     Django uses ``get_lookup(lookup_name)`` to fetch lookups. The
 | |
|     implementation of ``get_lookup()`` looks for a subclass which is registered
 | |
|     for the current class with the correct ``lookup_name``.
 | |
| 
 | |
| .. method:: get_transform(lookup_name)
 | |
| 
 | |
|     Django uses ``get_transform(lookup_name)`` to fetch lookups. The
 | |
|     implementation of ``get_transform()`` looks for a subclass which is registered
 | |
|     for the current class with the correct ``transform_name``.
 | |
| 
 | |
| The lookup registration API is available for ``Transform`` and ``Field`` classes.
 |