django/docs/ref/models/database-functions.txt

677 lines
24 KiB
Plaintext

==================
Database Functions
==================
.. module:: django.db.models.functions
:synopsis: Database Functions
The classes documented below provide a way for users to use functions provided
by the underlying database as annotations, aggregations, or filters in Django.
Functions are also :doc:`expressions <expressions>`, so they can be used and
combined with other expressions like :ref:`aggregate functions
<aggregation-functions>`.
We'll be using the following model in examples of each function::
class Author(models.Model):
name = models.CharField(max_length=50)
age = models.PositiveIntegerField(null=True, blank=True)
alias = models.CharField(max_length=50, null=True, blank=True)
goes_by = models.CharField(max_length=50, null=True, blank=True)
We don't usually recommend allowing ``null=True`` for ``CharField`` since this
allows the field to have two "empty values", but it's important for the
``Coalesce`` example below.
``Cast``
========
.. class:: Cast(expression, output_field)
.. versionadded:: 1.10
Forces the result type of ``expression`` to be the one from ``output_field``.
Usage example::
>>> from django.db.models import FloatField
>>> from django.db.models.functions import Cast
>>> Value.objects.create(integer=4)
>>> value = Value.objects.annotate(as_float=Cast('integer', FloatField)).get()
>>> print(value.as_float)
4.0
``Coalesce``
============
.. class:: Coalesce(*expressions, **extra)
Accepts a list of at least two field names or expressions and returns the
first non-null value (note that an empty string is not considered a null
value). Each argument must be of a similar type, so mixing text and numbers
will result in a database error.
Usage examples::
>>> # Get a screen name from least to most public
>>> from django.db.models import Sum, Value as V
>>> from django.db.models.functions import Coalesce
>>> Author.objects.create(name='Margaret Smith', goes_by='Maggie')
>>> author = Author.objects.annotate(
... screen_name=Coalesce('alias', 'goes_by', 'name')).get()
>>> print(author.screen_name)
Maggie
>>> # Prevent an aggregate Sum() from returning None
>>> aggregated = Author.objects.aggregate(
... combined_age=Coalesce(Sum('age'), V(0)),
... combined_age_default=Sum('age'))
>>> print(aggregated['combined_age'])
0
>>> print(aggregated['combined_age_default'])
None
.. warning::
A Python value passed to ``Coalesce`` on MySQL may be converted to an
incorrect type unless explicitly cast to the correct database type:
>>> from django.db.models.expressions import RawSQL
>>> from django.utils import timezone
>>> now = timezone.now()
>>> now_sql = RawSQL("cast(%s as datetime)", (now,))
>>> Coalesce('updated', now_sql)
``Concat``
==========
.. class:: Concat(*expressions, **extra)
Accepts a list of at least two text fields or expressions and returns the
concatenated text. Each argument must be of a text or char type. If you want
to concatenate a ``TextField()`` with a ``CharField()``, then be sure to tell
Django that the ``output_field`` should be a ``TextField()``. This is also
required when concatenating a ``Value`` as in the example below.
This function will never have a null result. On backends where a null argument
results in the entire expression being null, Django will ensure that each null
part is converted to an empty string first.
Usage example::
>>> # Get the display name as "name (goes_by)"
>>> from django.db.models import CharField, Value as V
>>> from django.db.models.functions import Concat
>>> Author.objects.create(name='Margaret Smith', goes_by='Maggie')
>>> author = Author.objects.annotate(
... screen_name=Concat('name', V(' ('), 'goes_by', V(')'),
... output_field=CharField())).get()
>>> print(author.screen_name)
Margaret Smith (Maggie)
``Greatest``
============
.. class:: Greatest(*expressions, **extra)
Accepts a list of at least two field names or expressions and returns the
greatest value. Each argument must be of a similar type, so mixing text and
numbers will result in a database error.
Usage example::
class Blog(models.Model):
body = models.TextField()
modified = models.DateTimeField(auto_now=True)
class Comment(models.Model):
body = models.TextField()
modified = models.DateTimeField(auto_now=True)
blog = models.ForeignKey(Blog, on_delete=models.CASCADE)
>>> from django.db.models.functions import Greatest
>>> blog = Blog.objects.create(body='Greatest is the best.')
>>> comment = Comment.objects.create(body='No, Least is better.', blog=blog)
>>> comments = Comment.objects.annotate(last_updated=Greatest('modified', 'blog__modified'))
>>> annotated_comment = comments.get()
``annotated_comment.last_updated`` will be the most recent of ``blog.modified``
and ``comment.modified``.
.. warning::
The behavior of ``Greatest`` when one or more expression may be ``null``
varies between databases:
- PostgreSQL: ``Greatest`` will return the largest non-null expression,
or ``null`` if all expressions are ``null``.
- SQLite, Oracle, and MySQL: If any expression is ``null``, ``Greatest``
will return ``null``.
The PostgreSQL behavior can be emulated using ``Coalesce`` if you know
a sensible minimum value to provide as a default.
``Least``
=========
.. class:: Least(*expressions, **extra)
Accepts a list of at least two field names or expressions and returns the
least value. Each argument must be of a similar type, so mixing text and numbers
will result in a database error.
.. warning::
The behavior of ``Least`` when one or more expression may be ``null``
varies between databases:
- PostgreSQL: ``Least`` will return the smallest non-null expression,
or ``null`` if all expressions are ``null``.
- SQLite, Oracle, and MySQL: If any expression is ``null``, ``Least``
will return ``null``.
The PostgreSQL behavior can be emulated using ``Coalesce`` if you know
a sensible maximum value to provide as a default.
``Length``
==========
.. class:: Length(expression, **extra)
Accepts a single text field or expression and returns the number of characters
the value has. If the expression is null, then the length will also be null.
Usage example::
>>> # Get the length of the name and goes_by fields
>>> from django.db.models.functions import Length
>>> Author.objects.create(name='Margaret Smith')
>>> author = Author.objects.annotate(
... name_length=Length('name'),
... goes_by_length=Length('goes_by')).get()
>>> print(author.name_length, author.goes_by_length)
(14, None)
It can also be registered as a transform. For example::
>>> from django.db.models import CharField
>>> from django.db.models.functions import Length
>>> CharField.register_lookup(Length, 'length')
>>> # Get authors whose name is longer than 7 characters
>>> authors = Author.objects.filter(name__length__gt=7)
``Lower``
=========
.. class:: Lower(expression, **extra)
Accepts a single text field or expression and returns the lowercase
representation.
It can also be registered as a transform as described in :class:`Length`.
Usage example::
>>> from django.db.models.functions import Lower
>>> Author.objects.create(name='Margaret Smith')
>>> author = Author.objects.annotate(name_lower=Lower('name')).get()
>>> print(author.name_lower)
margaret smith
``Now``
=======
.. class:: Now()
Returns the database server's current date and time when the query is executed,
typically using the SQL ``CURRENT_TIMESTAMP``.
Usage example::
>>> from django.db.models.functions import Now
>>> Article.objects.filter(published__lte=Now())
<QuerySet [<Article: How to Django>]>
.. admonition:: PostgreSQL considerations
On PostgreSQL, the SQL ``CURRENT_TIMESTAMP`` returns the time that the
current transaction started. Therefore for cross-database compatibility,
``Now()`` uses ``STATEMENT_TIMESTAMP`` instead. If you need the transaction
timestamp, use :class:`django.contrib.postgres.functions.TransactionNow`.
``Substr``
==========
.. class:: Substr(expression, pos, length=None, **extra)
Returns a substring of length ``length`` from the field or expression starting
at position ``pos``. The position is 1-indexed, so the position must be greater
than 0. If ``length`` is ``None``, then the rest of the string will be returned.
Usage example::
>>> # Set the alias to the first 5 characters of the name as lowercase
>>> from django.db.models.functions import Substr, Lower
>>> Author.objects.create(name='Margaret Smith')
>>> Author.objects.update(alias=Lower(Substr('name', 1, 5)))
1
>>> print(Author.objects.get(name='Margaret Smith').alias)
marga
``Upper``
=========
.. class:: Upper(expression, **extra)
Accepts a single text field or expression and returns the uppercase
representation.
It can also be registered as a transform as described in :class:`Length`.
Usage example::
>>> from django.db.models.functions import Upper
>>> Author.objects.create(name='Margaret Smith')
>>> author = Author.objects.annotate(name_upper=Upper('name')).get()
>>> print(author.name_upper)
MARGARET SMITH
Date Functions
==============
.. module:: django.db.models.functions.datetime
.. versionadded:: 1.10
We'll be using the following model in examples of each function::
class Experiment(models.Model):
start_time = models.DateTimeField()
start_date = models.DateField(null=True, blank=True)
end_time = models.DateTimeField(null=True, blank=True)
end_date = models.DateField(null=True, blank=True)
``Extract``
-----------
.. class:: Extract(expression, lookup_name=None, tzinfo=None, **extra)
Extracts a component of a date as a number.
Takes an ``expression`` representing a ``DateField`` or ``DateTimeField`` and a
``lookup_name``, and returns the part of the date referenced by ``lookup_name``
as an ``IntegerField``. Django usually uses the databases' extract function, so
you may use any ``lookup_name`` that your database supports. A ``tzinfo``
subclass, usually provided by ``pytz``, can be passed to extract a value in a
specific timezone.
Given the datetime ``2015-06-15 23:30:01.000321+00:00``, the built-in
``lookup_name``\s return:
* "year": 2015
* "month": 6
* "day": 15
* "week_day": 2
* "hour": 23
* "minute": 30
* "second": 1
If a different timezone like ``Australia/Melbourne`` is active in Django, then
the datetime is converted to the timezone before the value is extracted. The
timezone offset for Melbourne in the example date above is +10:00. The values
returned when this timezone is active will be the same as above except for:
* "day": 16
* "week_day": 3
* "hour": 9
.. admonition:: ``week_day`` values
The ``week_day`` ``lookup_type`` is calculated differently from most
databases and from Python's standard functions. This function will return
``1`` for Sunday, ``2`` for Monday, through ``7`` for Saturday.
The equivalent calculation in Python is::
>>> from datetime import datetime
>>> dt = datetime(2015, 6, 15)
>>> (dt.isoweekday() % 7) + 1
2
Each ``lookup_name`` above has a corresponding ``Extract`` subclass (listed
below) that should typically be used instead of the more verbose equivalent,
e.g. use ``ExtractYear(...)`` rather than ``Extract(..., lookup_name='year')``.
Usage example::
>>> from datetime import datetime
>>> from django.db.models.functions import Extract
>>> start = datetime(2015, 6, 15)
>>> end = datetime(2015, 7, 2)
>>> Experiment.objects.create(
... start_time=start, start_date=start.date(),
... end_time=end, end_date=end.date())
>>> # Add the experiment start year as a field in the QuerySet.
>>> experiment = Experiment.objects.annotate(
... start_year=Extract('start_time', 'year')).get()
>>> experiment.start_year
2015
>>> # How many experiments completed in the same year in which they started?
>>> Experiment.objects.filter(
... start_time__year=Extract('end_time', 'year')).count()
1
``DateField`` extracts
~~~~~~~~~~~~~~~~~~~~~~
.. class:: ExtractYear(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'year'
.. class:: ExtractMonth(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'month'
.. class:: ExtractDay(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'day'
.. class:: ExtractWeekDay(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'week_day'
These are logically equivalent to ``Extract('date_field', lookup_name)``. Each
class is also a ``Transform`` registered on ``DateField`` and ``DateTimeField``
as ``__(lookup_name)``, e.g. ``__year``.
Since ``DateField``\s don't have a time component, only ``Extract`` subclasses
that deal with date-parts can be used with ``DateField``::
>>> from datetime import datetime
>>> from django.utils import timezone
>>> from django.db.models.functions import (
... ExtractYear, ExtractMonth, ExtractDay, ExtractWeekDay
... )
>>> start_2015 = datetime(2015, 6, 15, 23, 30, 1, tzinfo=timezone.utc)
>>> end_2015 = datetime(2015, 6, 16, 13, 11, 27, tzinfo=timezone.utc)
>>> Experiment.objects.create(
... start_time=start_2015, start_date=start_2015.date(),
... end_time=end_2015, end_date=end_2015.date())
>>> Experiment.objects.annotate(
... year=ExtractYear('start_date'),
... month=ExtractMonth('start_date'),
... day=ExtractDay('start_date'),
... weekday=ExtractWeekDay('start_date'),
... ).values('year', 'month', 'day', 'weekday').get(
... end_date__year=ExtractYear('start_date'),
... )
{'year': 2015, 'month': 6, 'day': 15, 'weekday': 2}
``DateTimeField`` extracts
~~~~~~~~~~~~~~~~~~~~~~~~~~
In addition to the following, all extracts for ``DateField`` listed above may
also be used on ``DateTimeField``\s .
.. class:: ExtractHour(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'hour'
.. class:: ExtractMinute(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'minute'
.. class:: ExtractSecond(expression, tzinfo=None, **extra)
.. attribute:: lookup_name = 'second'
These are logically equivalent to ``Extract('datetime_field', lookup_name)``.
Each class is also a ``Transform`` registered on ``DateTimeField`` as
``__(lookup_name)``, e.g. ``__minute``.
``DateTimeField`` examples::
>>> from datetime import datetime
>>> from django.utils import timezone
>>> from django.db.models.functions import (
... ExtractYear, ExtractMonth, ExtractDay, ExtractWeekDay,
... ExtractHour, ExtractMinute, ExtractSecond,
... )
>>> start_2015 = datetime(2015, 6, 15, 23, 30, 1, tzinfo=timezone.utc)
>>> end_2015 = datetime(2015, 6, 16, 13, 11, 27, tzinfo=timezone.utc)
>>> Experiment.objects.create(
... start_time=start_2015, start_date=start_2015.date(),
... end_time=end_2015, end_date=end_2015.date())
>>> Experiment.objects.annotate(
... year=ExtractYear('start_time'),
... month=ExtractMonth('start_time'),
... day=ExtractDay('start_time'),
... weekday=ExtractWeekDay('start_time'),
... hour=ExtractHour('start_time'),
... minute=ExtractMinute('start_time'),
... second=ExtractSecond('start_time'),
... ).values(
... 'year', 'month', 'day', 'weekday', 'hour', 'minute', 'second',
... ).get(end_time__year=ExtractYear('start_time'))
{'year': 2015, 'month': 6, 'day': 15, 'weekday': 2, 'hour': 23, 'minute': 30, 'second': 1}
When :setting:`USE_TZ` is ``True`` then datetimes are stored in the database
in UTC. If a different timezone is active in Django, the datetime is converted
to that timezone before the value is extracted. The example below converts to
the Melbourne timezone (UTC +10:00), which changes the day, weekday, and hour
values that are returned::
>>> import pytz
>>> tzinfo = pytz.timezone('Australia/Melbourne') # UTC+10:00
>>> with timezone.override(tzinfo):
... Experiment.objects.annotate(
... day=ExtractDay('start_time'),
... weekday=ExtractWeekDay('start_time'),
... hour=ExtractHour('start_time'),
... ).values('day', 'weekday', 'hour').get(
... end_time__year=ExtractYear('start_time'),
... )
{'day': 16, 'weekday': 3, 'hour': 9}
Explicitly passing the timezone to the ``Extract`` function behaves in the same
way, and takes priority over an active timezone::
>>> import pytz
>>> tzinfo = pytz.timezone('Australia/Melbourne')
>>> Experiment.objects.annotate(
... day=ExtractDay('start_time', tzinfo=melb),
... weekday=ExtractWeekDay('start_time', tzinfo=melb),
... hour=ExtractHour('start_time', tzinfo=melb),
... ).values('day', 'weekday', 'hour').get(
... end_time__year=ExtractYear('start_time'),
... )
{'day': 16, 'weekday': 3, 'hour': 9}
``Trunc``
---------
.. class:: Trunc(expression, kind, output_field=None, tzinfo=None, **extra)
Truncates a date up to a significant component.
When you only care if something happened in a particular year, hour, or day,
but not the exact second, then ``Trunc`` (and its subclasses) can be useful to
filter or aggregate your data. For example, you can use ``Trunc`` to calculate
the number of sales per day.
``Trunc`` takes a single ``expression``, representing a ``DateField`` or
``DateTimeField``, a ``kind`` representing a date part, and an ``output_field``
that's either ``DateTimeField()`` or ``DateField()``. It returns a datetime or
date, depending on ``output_field``, with fields up to ``kind`` set to their
minimum value. If ``output_field`` is omitted, it will default to the
``output_field`` of ``expression``. A ``tzinfo`` subclass, usually provided by
``pytz``, can be passed to truncate a value in a specific timezone.
Given the datetime ``2015-06-15 14:30:50.000321+00:00``, the built-in ``kind``\s
return:
* "year": 2015-01-01 00:00:00+00:00
* "month": 2015-06-01 00:00:00+00:00
* "day": 2015-06-15 00:00:00+00:00
* "hour": 2015-06-15 14:00:00+00:00
* "minute": 2015-06-15 14:30:00+00:00
* "second": 2015-06-15 14:30:50+00:00
If a different timezone like ``Australia/Melbourne`` is active in Django, then
the datetime is converted to the new timezone before the value is truncated.
The timezone offset for Melbourne in the example date above is +10:00. The
values returned when this timezone is active will be:
* "year": 2015-01-01 00:00:00+11:00
* "month": 2015-06-01 00:00:00+10:00
* "day": 2015-06-16 00:00:00+10:00
* "hour": 2015-06-16 00:00:00+10:00
* "minute": 2015-06-16 00:30:00+10:00
* "second": 2015-06-16 00:30:50+10:00
The year has an offset of +11:00 because the result transitioned into daylight
saving time.
Each ``kind`` above has a corresponding ``Trunc`` subclass (listed below) that
should typically be used instead of the more verbose equivalent,
e.g. use ``TruncYear(...)`` rather than ``Trunc(..., kind='year')``.
The subclasses are all defined as transforms, but they aren't registered with
any fields, because the obvious lookup names are already reserved by the
``Extract`` subclasses.
Usage example::
>>> from datetime import datetime
>>> from django.db.models import Count, DateTimeField
>>> from django.db.models.functions import Trunc
>>> Experiment.objects.create(start_time=datetime(2015, 6, 15, 14, 30, 50, 321))
>>> Experiment.objects.create(start_time=datetime(2015, 6, 15, 14, 40, 2, 123))
>>> Experiment.objects.create(start_time=datetime(2015, 12, 25, 10, 5, 27, 999))
>>> experiments_per_day = Experiment.objects.annotate(
... start_day=Trunc('start_time', 'day', output_field=DateTimeField())
... ).values('start_day').annotate(experiments=Count('id'))
>>> for exp in experiments_per_day:
... print(exp['start_day'], exp['experiments'])
...
2015-06-15 00:00:00 2
2015-12-25 00:00:00 1
>>> experiments = Experiment.objects.annotate(
... start_day=Trunc('start_time', 'day', output_field=DateTimeField())
... ).filter(start_day=datetime(2015, 6, 15))
>>> for exp in experiments:
... print(exp.start_time)
...
2015-06-15 14:30:50.000321
2015-06-15 14:40:02.000123
``DateField`` truncation
~~~~~~~~~~~~~~~~~~~~~~~~
.. class:: TruncYear(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'year'
.. class:: TruncMonth(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'month'
These are logically equivalent to ``Trunc('date_field', kind)``. They truncate
all parts of the date up to ``kind`` which allows grouping or filtering dates
with less precision. ``expression`` can have an ``output_field`` of either
``DateField`` or ``DateTimeField``.
Since ``DateField``\s don't have a time component, only ``Trunc`` subclasses
that deal with date-parts can be used with ``DateField``::
>>> from datetime import datetime
>>> from django.db.models import Count
>>> from django.db.models.functions import TruncMonth, TruncYear
>>> from django.utils import timezone
>>> start1 = datetime(2014, 6, 15, 14, 30, 50, 321, tzinfo=timezone.utc)
>>> start2 = datetime(2015, 6, 15, 14, 40, 2, 123, tzinfo=timezone.utc)
>>> start3 = datetime(2015, 12, 31, 17, 5, 27, 999, tzinfo=timezone.utc)
>>> Experiment.objects.create(start_time=start1, start_date=start1.date())
>>> Experiment.objects.create(start_time=start2, start_date=start2.date())
>>> Experiment.objects.create(start_time=start3, start_date=start3.date())
>>> experiments_per_year = Experiment.objects.annotate(
... year=TruncYear('start_date')).values('year').annotate(
... experiments=Count('id'))
>>> for exp in experiments_per_year:
... print(exp['year'], exp['experiments'])
...
2014-01-01 1
2015-01-01 2
>>> import pytz
>>> melb = pytz.timezone('Australia/Melbourne')
>>> experiments_per_month = Experiment.objects.annotate(
... month=TruncMonth('start_time', tzinfo=melb)).values('month').annotate(
... experiments=Count('id'))
>>> for exp in experiments_per_month:
... print(exp['month'], exp['experiments'])
...
2015-06-01 00:00:00+10:00 1
2016-01-01 00:00:00+11:00 1
2014-06-01 00:00:00+10:00 1
``DateTimeField`` truncation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. class:: TruncDate(expression, **extra)
.. attribute:: lookup_name = 'date'
.. attribute:: output_field = DateField()
``TruncDate`` casts ``expression`` to a date rather than using the built-in SQL
truncate function. It's also registered as a transform on ``DateTimeField`` as
``__date``.
.. class:: TruncDay(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'day'
.. class:: TruncHour(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'hour'
.. class:: TruncMinute(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'minute'
.. class:: TruncSecond(expression, output_field=None, tzinfo=None, **extra)
.. attribute:: kind = 'second'
These are logically equivalent to ``Trunc('datetime_field', kind)``. They
truncate all parts of the date up to ``kind`` and allow grouping or filtering
datetimes with less precision. ``expression`` must have an ``output_field`` of
``DateTimeField``.
Usage example::
>>> from datetime import date, datetime
>>> from django.db.models import Count
>>> from django.db.models.functions import (
... TruncDate, TruncDay, TruncHour, TruncMinute, TruncSecond,
... )
>>> from django.utils import timezone
>>> import pytz
>>> start1 = datetime(2014, 6, 15, 14, 30, 50, 321, tzinfo=timezone.utc)
>>> Experiment.objects.create(start_time=start1, start_date=start1.date())
>>> melb = pytz.timezone('Australia/Melbourne')
>>> Experiment.objects.annotate(
... date=TruncDate('start_time'),
... day=TruncDay('start_time', tzinfo=melb),
... hour=TruncHour('start_time', tzinfo=melb),
... minute=TruncMinute('start_time'),
... second=TruncSecond('start_time'),
... ).values('date', 'day', 'hour', 'minute', 'second').get()
{'date': datetime.date(2014, 6, 15),
'day': datetime.datetime(2014, 6, 16, 0, 0, tzinfo=<DstTzInfo 'Australia/Melbourne' AEST+10:00:00 STD>),
'hour': datetime.datetime(2014, 6, 16, 0, 0, tzinfo=<DstTzInfo 'Australia/Melbourne' AEST+10:00:00 STD>),
'minute': 'minute': datetime.datetime(2014, 6, 15, 14, 30, tzinfo=<UTC>),
'second': datetime.datetime(2014, 6, 15, 14, 30, 50, tzinfo=<UTC>)
}