django/docs/db-api.txt

======================
Database API reference
======================

Once you've created your `data models`_, you'll need to retrieve data from the
database. This document explains the database abstraction API derived from the
models, and how to create, retrieve and update objects.

.. _`data models`: http://www.djangoproject.com/documentation/model_api/

Throughout this reference, we'll refer to the following Poll application::

    class Poll(meta.Model):
        slug = meta.SlugField(unique_for_month='pub_date')
        question = meta.CharField(maxlength=255)
        pub_date = meta.DateTimeField()
        expire_date = meta.DateTimeField()

        def __repr__(self):
            return self.question

    class Choice(meta.Model):
        poll = meta.ForeignKey(Poll, edit_inline=meta.TABULAR,
            num_in_admin=10, min_num_in_admin=5)
        choice = meta.CharField(maxlength=255, core=True)
        votes = meta.IntegerField(editable=False, default=0)

        def __repr__(self):
            return self.choice

Basic lookup functions
======================

Each model exposes these module-level functions for lookups:

get_object(\**kwargs)
---------------------

Returns the object matching the given lookup parameters, which should be in
the format described in "Field lookups" below. Raises a module-level
``*DoesNotExist`` exception if an object wasn't found for the given parameters.
Raises ``AssertionError`` if more than one object was found.

get_list(\**kwargs)
-------------------

Returns a list of objects matching the given lookup parameters, which should be
in the format described in "Field lookups" below. If no objects match the given
parameters, it returns an empty list. ``get_list()`` will always return a list.

get_iterator(\**kwargs)
-----------------------

Just like ``get_list()``, except it returns an iterator instead of a list. This
is more efficient for large result sets. This example shows the difference::

    # get_list() loads all objects into memory.
    for obj in foos.get_list():
        print repr(obj)

    # get_iterator() only loads a number of objects into memory at a time.
    for obj in foos.get_iterator():
        print repr(obj)

get_count(\**kwargs)
--------------------

Returns an integer representing the number of objects in the database matching
the given lookup parameters, which should be in the format described in
"Field lookups" below. ``get_count()`` never raises exceptions

Depending on which database you're using (e.g. PostgreSQL vs. MySQL), this may
return a long integer instead of a normal Python integer.

get_values(\**kwargs)
---------------------

Just like ``get_list()``, except it returns a list of dictionaries instead of
model-instance objects.

It accepts an optional parameter, ``fields``, which should be a list or tuple
of field names. If you don't specify ``fields``, each dictionary in the list
returned by ``get_values()`` will have a key and value for each field in the
database table. If you specify ``fields``, each dictionary will have only the
field keys/values for the fields you specify. Here's an example, using the
``Poll`` model defined above::

    >>> from datetime import datetime
    >>> p1 = polls.Poll(slug='whatsup', question="What's up?",
    ...     pub_date=datetime(2005, 2, 20), expire_date=datetime(2005, 3, 20))
    >>> p1.save()
    >>> p2 = polls.Poll(slug='name', question="What's your name?",
    ...     pub_date=datetime(2005, 3, 20), expire_date=datetime(2005, 4, 20))
    >>> p2.save()
    >>> polls.get_list()
    [What's up?, What's your name?]
    >>> polls.get_values()
    [{'id': 1, 'slug': 'whatsup', 'question': "What's up?", 'pub_date': datetime.datetime(2005, 2, 20), 'expire_date': datetime.datetime(2005, 3, 20)},
     {'id': 2, 'slug': 'name', 'question': "What's your name?", 'pub_date': datetime.datetime(2005, 3, 20), 'expire_date': datetime.datetime(2005, 4, 20)}]
    >>> polls.get_values(fields=['id', 'slug'])
    [{'id': 1, 'slug': 'whatsup'}, {'id': 2, 'slug': 'name'}]

Use ``get_values()`` when you know you're only going to need a couple of field
values and you won't need the functionality of a model instance object. It's
more efficient to select only the fields you need to use.

get_values_iterator(\**kwargs)
------------------------------

Just like ``get_values()``, except it returns an iterator instead of a list.
See the section on ``get_iterator()`` above.

get_in_bulk(id_list, \**kwargs)
-------------------------------

Takes a list of IDs and returns a dictionary mapping each ID to an instance of
the object with the given ID. Also takes optional keyword lookup arguments,
which should be in the format described in "Field lookups" below. Here's an
example, using the ``Poll`` model defined above::

    >>> from datetime import datetime
    >>> p1 = polls.Poll(slug='whatsup', question="What's up?",
    ...     pub_date=datetime(2005, 2, 20), expire_date=datetime(2005, 3, 20))
    >>> p1.save()
    >>> p2 = polls.Poll(slug='name', question="What's your name?",
    ...     pub_date=datetime(2005, 3, 20), expire_date=datetime(2005, 4, 20))
    >>> p2.save()
    >>> polls.get_list()
    [What's up?, What's your name?]
    >>> polls.get_in_bulk([1])
    {1: What's up?}
    >>> polls.get_in_bulk([1, 2])
    {1: What's up?, 2: What's your name?}

Field lookups
=============

Basic field lookups take the form ``field__lookuptype`` (that's a
double-underscore). For example::

    polls.get_list(pub_date__lte=datetime.datetime.now())

translates (roughly) into the following SQL::

    SELECT * FROM polls_polls WHERE pub_date < NOW();

.. admonition:: How this is possible

   Python has the ability to define functions that accept arbitrary name-value
   arguments whose names and values are evaluated at run time. For more
   information, see `Keyword Arguments`_ in the official Python tutorial.

The DB API supports the following lookup types:

    ===========  ==============================================================
    Type         Description
    ===========  ==============================================================
    exact        Exact match: ``polls.get_object(id__exact=14)``.
    iexact       Case-insensitive exact match:
                 ``polls.get_list(slug__iexact="foo")`` matches a slug of
                 ``foo``, ``FOO``, ``fOo``, etc.
    contains     Case-sensitive containment test:
                 ``polls.get_list(question__contains="spam")`` returns all polls
                 that contain "spam" in the question. (PostgreSQL only. MySQL
                 doesn't support case-sensitive LIKE statements; ``contains``
                 will act like ``icontains`` for MySQL.)
    icontains    Case-insensitive containment test.
    gt           Greater than: ``polls.get_list(id__gt=4)``.
    gte          Greater than or equal to.
    lt           Less than.
    lte          Less than or equal to.
    ne           Not equal to.
    in           In a given list: ``polls.get_list(id__in=[1, 3, 4])`` returns
                 a list of polls whose IDs are either 1, 3 or 4.
    startswith   Case-sensitive starts-with:
                 ``polls.get_list(question_startswith="Would")``. (PostgreSQL
                 only. MySQL doesn't support case-sensitive LIKE statements;
                 ``startswith`` will act like ``istartswith`` for MySQL.)
    endswith     Case-sensitive ends-with. (PostgreSQL only. MySQL doesn't
                 support case-sensitive LIKE statements; ``endswith`` will act
                 like ``iendswith`` for MySQL.)
    istartswith  Case-insensitive starts-with.
    iendswith    Case-insensitive ends-with.
    range        Range test:
                 ``polls.get_list(pub_date__range=(start_date, end_date))``
                 returns all polls with a pub_date between ``start_date``
                 and ``end_date`` (inclusive).
    year         For date/datetime fields, exact year match:
                 ``polls.get_count(pub_date__year=2005)``.
    month        For date/datetime fields, exact month match.
    day          For date/datetime fields, exact day match.
    isnull       True/False; does is IF NULL/IF NOT NULL lookup:
                 ``polls.get_list(expire_date__isnull=True)``.
    ===========  ==============================================================

Multiple lookups are allowed, of course, and are translated as "AND"s::

    polls.get_list(
        pub_date__year=2005,
        pub_date__month=1,
        question__startswith="Would",
    )

...retrieves all polls published in January 2005 that have a question starting with "Would."

For convenience, there's a ``pk`` lookup type, which translates into
``(primary_key)__exact``. In the polls example, these two statements are
equivalent::

    polls.get_object(id__exact=3)
    polls.get_object(pk=3)

``pk`` lookups also work across joins. In the polls example, these two
statements are equivalent::

    choices.get_list(poll__id__exact=3)
    choices.get_list(poll__pk=3)

If you pass an invalid keyword argument, the function will raise ``TypeError``.

.. _`Keyword Arguments`: http://docs.python.org/tut/node6.html#SECTION006720000000000000000

Ordering
========

The results are automatically ordered by the ordering tuple given by the
``ordering`` key in the model, but the ordering may be explicitly
provided by the ``order_by`` argument to a lookup::

    polls.get_list(
        pub_date__year=2005,
        pub_date__month=1,
        order_by=('-pub_date', 'question'),
    )

The result set above will be ordered by ``pub_date`` descending, then
by ``question`` ascending. The negative sign in front of "-pub_date" indicates
descending order. Ascending order is implied. To order randomly, use "?", like
so::

    polls.get_list(order_by=['?'])

Relationships (joins)
=====================

Joins may implicitly be performed by following relationships:
``choices.get_list(poll__slug__exact="eggs")`` fetches a list of ``Choice``
objects where the associated ``Poll`` has a slug of ``eggs``.  Multiple levels
of joins are allowed.

Given an instance of an object, related objects can be looked-up directly using
convenience functions. For example, if ``p`` is a ``Poll`` instance,
``p.get_choice_list()`` will return a list of all associated choices. Astute
readers will note that this is the same as
``choices.get_list(poll_id__exact=p.id)``, except clearer.

Each type of relationship creates a set of methods on each object in the
relationship. These methods are created in both directions, so objects that are
"related-to" need not explicitly define reverse relationships; that happens
automatically.

One-to-one relations
--------------------

Each object in a one-to-one relationship will have a ``get_relatedobjectname()``
method. For example::

    class Place(meta.Model):
        # ...

    class Restaurant(meta.Model):
        # ...
        the_place = meta.OneToOneField(places.Place)

In the above example, each ``Place`` will have a ``get_restaurant()`` method,
and each ``Restaurant`` will have a ``get_theplace()`` method.

Many-to-one relations
---------------------

In each many-to-one relationship, the related object will have a
``get_relatedobject()`` method, and the related-to object will have
``get_relatedobject()``, ``get_relatedobject_list()``, and
``get_relatedobject_count()`` methods (the same as the module-level
``get_object()``, ``get_list()``, and ``get_count()`` methods).

In the poll example above, here are the available choice methods on a ``Poll`` object ``p``::

    p.get_choice()
    p.get_choice_list()
    p.get_choice_count()

And a ``Choice`` object ``c`` has the following method::

    c.get_poll()

Many-to-many relations
----------------------

Many-to-many relations result in the same set of methods as `Many-to-one relations`_,
except that the ``get_relatedobject_list()`` function on the related object will
return a list of instances instead of a single instance.  So, if the relationship
between ``Poll`` and ``Choice`` was many-to-many, ``choice.get_poll_list()`` would
return a list.

Relationships across applications
---------------------------------

If a relation spans applications -- if ``Place`` was had a ManyToOne relation to
a ``geo.City`` object, for example -- the name of the other application will be
added to the method, i.e. ``place.get_geo_city()`` and
``city.get_places_place_list()``.

Selecting related objects
-------------------------

Relations are the bread and butter of databases, so there's an option to "follow"
all relationships and pre-fill them in a simple cache so that later calls to
objects with a one-to-many relationship don't have to hit the database. Do this by
passing ``select_related=True`` to a lookup. This results in (sometimes much) larger
queries, but it means that later use of relationships is much faster.

For example, using the Poll and Choice models from above, if you do the following::

    c = choices.get_object(id__exact=5, select_related=True)

Then subsequent calls to ``c.get_poll()`` won't hit the database.

Note that ``select_related`` follows foreign keys as far as possible. If you have the
following models::

    class Poll(meta.Model):
        # ...

    class Choice(meta.Model):
        # ...
        poll = meta.ForeignKey(Poll)

    class SingleVote(meta.Model):
        # ...
        choice = meta.ForeignKey(Choice)

then a call to ``singlevotes.get_object(id__exact=4, select_related=True)`` will
cache the related choice *and* the related poll::

    >>> sv = singlevotes.get_object(id__exact=4, select_related=True)
    >>> c = sv.get_choice()        # Doesn't hit the database.
    >>> p = c.get_poll()           # Doesn't hit the database.

    >>> sv = singlevotes.get_object(id__exact=4) # Note no "select_related".
    >>> c = sv.get_choice()        # Hits the database.
    >>> p = c.get_poll()           # Hits the database.

Limiting selected rows
======================

The ``limit``, ``offset``, and ``distinct`` keywords can be used to control
which rows are returned.  Both ``limit`` and ``offset`` should be integers which
will be directly passed to the SQL ``LIMIT``/``OFFSET`` commands.

If ``distinct`` is True, only distinct rows will be returned. This is equivalent
to a ``SELECT DISTINCT`` SQL clause.

Other lookup options
====================

There are a few other ways of more directly controlling the generated SQL
for the lookup.  Note that by definition these extra lookups may not be
portable to different database engines (because you're explicitly writing
SQL code) and should be avoided if possible.:

``params``
----------

All the extra-SQL params described below may use standard Python string
formatting codes to indicate parameters that the database engine will
automatically quote.  The ``params`` argument can contain any extra
parameters to be substituted.

``select``
----------

The ``select`` keyword allows you to select extra fields.  This should be a
dictionary mapping attribute names to a SQL clause to use to calculate that
attribute. For example::

    polls.get_list(
        select={
            'choice_count': 'SELECT COUNT(*) FROM choices WHERE poll_id = polls.id'
        }
    )

Each of the resulting ``Poll`` objects will have an extra attribute, ``choice_count``,
an integer count of associated ``Choice`` objects. Note that the parenthesis required by
most database engines around sub-selects are not required in Django's ``select``
clauses.

``where`` / ``tables``
----------------------

If you need to explicitly pass extra ``WHERE`` clauses -- perhaps to perform
non-explicit joins -- use the ``where`` keyword. If you need to
join other tables into your query, you can pass their names to ``tables``.

``where`` and ``tables`` both take a list of strings. All ``where`` parameters
are "AND"ed to any other search criteria.

For example::

    polls.get_list(question__startswith='Who', where=['id IN (3, 4, 5, 20)'])

...translates (roughly) into the following SQL:

    SELECT * FROM polls_polls WHERE question LIKE 'Who%' AND id IN (3, 4, 5, 20);

Changing objects
================

Once you've retrieved an object from the database using any of the above
options, changing it is extremely easy.  Make changes directly to the
objects fields, then call the object's ``save()`` method::

    >>> p = polls.get_object(id__exact=15)
    >>> p.slug = "new_slug"
    >>> p.pub_date = datetime.datetime.now()
    >>> p.save()

Creating new objects
====================

Creating new objects (i.e. ``INSERT``) is done by creating new instances
of objects then calling save() on them::

    >>> p = polls.Poll(slug="eggs",
    ...                question="How do you like your eggs?",
    ...                pub_date=datetime.datetime.now(),
    ...                expire_date=some_future_date)
    >>> p.save()

Calling ``save()`` on an object with a primary key whose value is ``None``
signifies to Django that the object is new and should be inserted.

Related objects (e.g. ``Choices``) are created using convenience functions::

    >>> p.add_choice(choice="Over easy", votes=0)
    >>> p.add_choice(choice="Scrambled", votes=0)
    >>> p.add_choice(choice="Fertilized", votes=0)
    >>> p.add_choice(choice="Poached", votes=0)
    >>> p.get_choice_count()
    4

Each of those ``add_choice`` methods is equivalent to (but much simpler than)::

    >>> c = polls.Choice(poll_id=p.id, choice="Over easy", votes=0)
    >>> c.save()

Note that when using the `add_foo()`` methods, you do not give any value
for the ``id`` field, nor do you give a value for the field that stores
the relation (``poll_id`` in this case).

The ``add_FOO()`` method always returns the newly created object.

Deleting objects
================

The delete method, conveniently, is named ``delete()``. This method immediately
deletes the object and has no return value. Example::

    >>> c.delete()

Extra instance methods
======================

In addition to ``save()``, ``delete()`` and all of the ``add_*`` and ``get_*``
related-object methods, a model object might get any or all of the following
methods:

get_FOO_display()
-----------------

For every field that has ``choices`` set, the object will have a
``get_FOO_display()`` method, where ``FOO`` is the name of the field. This
method returns the "human-readable" value of the field. For example, in the
following model::

    GENDER_CHOICES = (
        ('M', 'Male'),
        ('F', 'Female'),
    )
    class Person
        name = meta.CharField(maxlength=20)
        gender = meta.CharField(maxlength=1, choices=GENDER_CHOICES)

...each ``Person`` instance will have a ``get_gender_display()`` method. Example::

    >>> p = Person(name='John', gender='M')
    >>> p.save()
    >>> p.gender
    'M'
    >>> p.get_gender_display()
    'Male'

get_next_by_FOO(\**kwargs) and get_previous_by_FOO(\**kwargs)
-------------------------------------------------------------

For every ``DateField`` and ``DateTimeField`` that does not have ``null=True``,
the object will have ``get_next_by_FOO()`` and ``get_previous_by_FOO()``
methods, where ``FOO`` is the name of the field. This returns the next and
previous object with respect to the date field, raising the appropriate
``*DoesNotExist`` exception when appropriate.

Both methods accept optional keyword arguments, which should be in the format
described in "Field lookups" above.

get_FOO_filename()
------------------

For every ``FileField``, the object will have a ``get_FOO_filename()`` method,
where ``FOO`` is the name of the field. This returns the full filesystem path
to the file, according to your ``MEDIA_ROOT`` setting.

Note that ``ImageField`` is technically a subclass of ``FileField``, so every
model with an ``ImageField`` will also get this method.

get_FOO_url()
-------------

For every ``FileField``, the object will have a ``get_FOO_url()`` method,
where ``FOO`` is the name of the field. This returns the full URL to the file,
according to your ``MEDIA_URL`` setting. If the value is blank, this method
returns an empty string.

get_FOO_size()
--------------

For every ``FileField``, the object will have a ``get_FOO_filename()`` method,
where ``FOO`` is the name of the field. This returns the size of the file, in
bytes. (Behind the scenes, it uses ``os.path.getsize``.)

save_FOO_file(filename, raw_contents)
-------------------------------------

For every ``FileField``, the object will have a ``get_FOO_filename()`` method,
where ``FOO`` is the name of the field. This saves the given file to the
filesystem, using the given filename. If a file with the given filename already
exists, Django adds an underscore to the end of the filename (but before the
extension) until the filename is available.

get_FOO_height() and get_FOO_width()
------------------------------------

For every ``ImageField``, the object will have ``get_FOO_height()`` and
``get_FOO_width()`` methods, where ``FOO`` is the name of the field. This
returns the height (or width) of the image, as an integer, in pixels.

Extra module functions
======================

In addition to every function described in "Basic lookup functions" above, a
model module might get any or all of the following methods:

get_FOO_list(kind, \**kwargs)
-----------------------------

For every ``DateField`` and ``DateTimeField``, the model module will have a
``get_FOO_list()`` function, where ``FOO`` is the name of the field. This
returns a list of ``datetime.datetime`` objects representing all available
dates of the given scope, as defined by the ``kind`` argument. ``kind`` should
be either ``"year"``, ``"month"`` or ``"day"``. Each ``datetime.datetime``
object in the result list is "truncated" to the given ``type``.

    * ``"year"`` returns a list of all distinct year values for the field.
    * ``"month"`` returns a list of all distinct year/month values for the field.
    * ``"day"`` returns a list of all distinct year/month/day values for the field.

Additional, optional keyword arguments, in the format described in
"Field lookups" above, are also accepted.

Here's an example, using the ``Poll`` model defined above::

    >>> from datetime import datetime
    >>> p1 = polls.Poll(slug='whatsup', question="What's up?",
    ...     pub_date=datetime(2005, 2, 20), expire_date=datetime(2005, 3, 20))
    >>> p1.save()
    >>> p2 = polls.Poll(slug='name', question="What's your name?",
    ...     pub_date=datetime(2005, 3, 20), expire_date=datetime(2005, 4, 20))
    >>> p2.save()
    >>> polls.get_pub_date_list('year')
    [datetime.datetime(2005, 1, 1)]
    >>> polls.get_pub_date_list('month')
    [datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]
    >>> polls.get_pub_date_list('day')
    [datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]
    >>> polls.get_pub_date_list('day', question__contains='name')
    [datetime.datetime(2005, 3, 20)]

``get_FOO_list()`` also accepts an optional keyword argument ``order``, which
should be either ``"ASC"`` or ``"DESC"``. This specifies how to order the
results. Default is ``"ASC"``.