mirror of https://github.com/django/django.git synced 2025-07-05 18:29:11 +00:00

magic-removal: Beginning of substantial rewrite to docs/db-api.txt. Not yet finished.

git-svn-id: http://code.djangoproject.com/svn/django/branches/magic-removal@2748 bcc190cf-cafb-0310-a4f2-bffc1f526a37
Adrian Holovaty 2006-04-26 15:38:30 +00:00
parent 2fb5a9b7fe
commit 1529aef3ff


@ -2,137 +2,605 @@
Database API reference Database API reference
====================== ======================
Once you've created your `data models`_, you'll need to retrieve data from the Once you've created your `data models`_, Django automatically gives you a
database. This document explains the database abstraction API derived from the database-abstraction API that lets you create, retrieve, update and delete
models, and how to create, retrieve and update objects. objects. This document explains that API.
.. _`data models`: http://www.djangoproject.com/documentation/model_api/ .. _`data models`: http://www.djangoproject.com/documentation/model_api/
Throughout this reference, we'll refer to the following Poll application:: Throughout this reference, we'll refer to the following Blog application::
class Poll(models.Model): class Blog(models.Model):
slug = models.SlugField(unique_for_month='pub_date') name = models.CharField(maxlength=100)
question = models.CharField(maxlength=255) tagline = models.TextField()
def __repr__(self):
return self.name
class Author(models.Model):
name = models.CharField(maxlength=50)
email = models.EmailField()
def __repr__(self):
return self.name
class Entry(models.Model):
blog = models.ForeignKey(Blog)
headline = models.CharField(maxlength=255)
body_text = models.TextField()
pub_date = models.DateTimeField() pub_date = models.DateTimeField()
expire_date = models.DateTimeField() authors = models.ManyToManyField(Author)
def __repr__(self): def __repr__(self):
return self.question return self.headline
class Meta: Creating objects
get_latest_by = 'pub_date'
class Choice(models.Model):
poll = models.ForeignKey(Poll, edit_inline=meta.TABULAR,
num_in_admin=10, min_num_in_admin=5)
choice = models.CharField(maxlength=255, core=True)
votes = models.IntegerField(editable=False, default=0)
def __repr__(self):
return self.choice
and the following Django sample session::
>>> from datetime import datetime
>>> p1 = Poll(slug='whatsup', question="What's up?",
... pub_date=datetime(2005, 2, 20), expire_date=datetime(2005, 4, 20))
>>> p1.save()
>>> p2 = Poll(slug='name', question="What's your name?",
... pub_date=datetime(2005, 3, 20), expire_date=datetime(2005, 3, 25))
>>> p2.save()
>>> Poll.objects.all()
[What's up?, What's your name?]
How queries work
================ ================
Querying in Django is based upon the construction and evaluation of Query To create an object, instantiate it using keyword arguments to the model class,
Sets. then call ``save()`` to save it to the database.
A Query Set is a database-independent representation of a group of objects Example::
that all meet a given set of criteria. However, the determination of which
objects are actually members of the Query Set is not made until you formally
evaluate the Query Set.
To construct a Query Set that meets your requirements, you start by obtaining b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')
an initial Query Set that describes all objects of a given type. This initial b.save()
Query Set can then be refined using a range of operations. Once you have
refined your Query Set to the point where it describes the group of objects
you require, it can be evaluated (using iterators, slicing, or one of a range
of other techniques), yielding an object or list of objects that meet the
specifications of the Query Set.
Obtaining an initial QuerySet This performs an ``INSERT`` SQL statement behind the scenes. Django doesn't hit
============================= the database until you explicitly call ``save()``.
Every model has at least one Manager; by default, the Manager is called The ``save()`` method has no return value.
``objects``. One of the most important roles of the Manager is as a source
of initial Query Sets. The Manager acts as a Query Set that describes all
objects of the type being managed; ``Poll.objects`` is the initial Query Set
that contains all Polls in the database.
The initial Query Set on the Manager behaves in the same way as every other Auto-incrementing primary keys
Query Set in every respect except one - it cannot be evaluated. To overcome ------------------------------
this limitation, the Manager Query Set has an ``all()`` method. The ``all()``
method produces a copy of the initial Query Set - a copy that *can* be
evaluated::
all_polls = Poll.objects.all() If a model has an ``AutoField`` -- an auto-incrementing primary key -- then
that auto-incremented value will be calculated and saved as an attribute on
your object the first time you call ``save()``.
See the `Managers`_ section of the Model API for more details on the role Example::
and construction of Managers.
.. _Managers: http://www.djangoproject.com/documentation/model_api/#managers b2 = Blog(name='Cheddar Talk', tagline='Thoughts on cheese.')
b2.id # Returns None, because b2 doesn't have an ID yet.
b2.save()
b2.id # Returns the ID of your new object.
QuerySet refinement There's no way to tell what the value of an ID will be before you call
=================== ``save()``, because that value is calculated by your database, not by Django.
The initial Query Set provided by the Manager describes all objects of a (For convenience, each model has an ``AutoField`` named ``id`` by default
given type. However, you will usually need to describe a subset of the unless you explicitly specify ``primary_key=True`` on a field. See the
`AutoField documentation`_.)
.. _AutoField documentation: TODO: Link
Explicitly specifying auto-primary-key values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If a model has an ``AutoField`` but you want to define a new object's ID
explicitly when saving, just define it explicitly before saving, rather than
relying on the auto-assignment of the ID.
Example::
b3 = Blog(id=3, name='Cheddar Talk', tagline='Thoughts on cheese.')
b3.id # Returns 3.
b3.save()
b3.id # Returns 3.
If you assign auto-primary-key values manually, make sure not to use an
already-existing primary-key value! If you create a new object with an explicit
primary-key value that already exists in the database, Django will assume
you're changing the existing record rather than creating a new one.
Given the above ``'Cheddar Talk'`` blog example, this example would overwrite
the previous record in the database::
b4 = Blog(id=3, name='Not Cheddar', tagline='Anything but cheese.')
b4.save() # Overwrites the previous blog with ID=3!
See also "How Django knows to UPDATE vs. INSERT", below.
Explicitly specifying auto-primary-key values is mostly useful for bulk-saving
objects, when you're confident you won't have primary-key collision.
Saving changes to objects
=========================
To save changes to an object that's already in the database, use ``save()``.
Given a ``Blog`` instance ``b5`` that has already been saved to the database,
this example changes its name and updates its record in the database::
b5.name = 'New name'
b5.save()
This performs an ``UPDATE`` SQL statement behind the scenes. Django doesn't hit
the database until you explicitly call ``save()``.
The ``save()`` method has no return value.
How Django knows to UPDATE vs. INSERT
-------------------------------------
You may have noticed Django database objects use the same ``save()`` method
for creating and changing objects. Django abstracts the need to use ``INSERT``
or ``UPDATE`` SQL statements. Specifically, when you call ``save()``, Django
follows this algorithm:
* If the object's primary key attribute is set, Django executes a
``SELECT`` query to determine whether a record with the given primary key
already exists.
* If the record with the given primary key does already exist, Django
executes an ``UPDATE`` query.
* If the object's primary key attribute is *not* set, or if it's set but a
record doesn't exist, Django executes an ``INSERT``.
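The algorithm above can be sketched with Python's standard ``sqlite3`` module. This is a minimal illustration under stated assumptions, not Django's actual implementation: the ``blog`` table, the ``save()`` helper and its string return value are hypothetical (Django's ``save()`` returns nothing).

```python
import sqlite3

def save(conn, row):
    """Insert or update a blog row (a dict), following the steps listed above."""
    pk = row.get("id")
    exists = False
    if pk is not None:
        # Primary key is set: SELECT to see whether the record already exists.
        cur = conn.execute("SELECT 1 FROM blog WHERE id = ?", (pk,))
        exists = cur.fetchone() is not None
    if exists:
        conn.execute("UPDATE blog SET name = ?, tagline = ? WHERE id = ?",
                     (row["name"], row["tagline"], pk))
        return "UPDATE"  # returned only so the example can be inspected
    cur = conn.execute("INSERT INTO blog (id, name, tagline) VALUES (?, ?, ?)",
                       (pk, row["name"], row["tagline"]))
    # The database, not the caller, computes the auto-incremented value.
    row["id"] = cur.lastrowid
    return "INSERT"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT, tagline TEXT)")

b = {"id": None, "name": "Cheddar Talk", "tagline": "Thoughts on cheese."}
first = save(conn, b)    # no primary key yet, so this inserts
b["name"] = "New name"
second = save(conn, b)   # primary key is now set and present, so this updates
```

Saving a second dict that reuses ``b["id"]`` would take the ``UPDATE`` branch and overwrite the stored row.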
The one gotcha here is that you should be careful not to specify a primary-key
value explicitly when saving new objects, if you cannot guarantee the
primary-key value is unused. For more on this nuance, see
"Explicitly specifying auto-primary-key values" above.
Retrieving objects
==================
To retrieve objects from your database, you construct a ``QuerySet`` via a
``Manager``.
A ``QuerySet`` represents a collection of objects from your database. It can
have zero, one or many *filters* -- criteria that narrow down the collection
based on given parameters.
In SQL terms, a ``QuerySet`` equates to a ``SELECT`` statement, and a filter is
a limiting clause such as ``WHERE`` or ``LIMIT``.
You get a ``QuerySet`` by using your model's ``Manager``. Each model has at
least one ``Manager``, and it's called ``objects`` by default. Access it
directly via the model class, like so::
Blog.objects # <django.db.models.manager.Manager object at ...>
b = Blog(name='Foo', tagline='Bar')
b.objects # AttributeError: "Manager isn't accessible via Blog instances."
(``Managers`` are accessible only via model classes, rather than from model
instances, to enforce a separation between "table-level" operations and
"record-level" operations.)
The ``Manager`` is the main source of ``QuerySets`` for a model. It acts as a
"root" ``QuerySet`` that describes all objects in the model's database table.
For example, ``Blog.objects`` is the initial ``QuerySet`` that contains all
``Blog`` objects in the database.
Retrieving all objects
----------------------
The simplest way to retrieve objects from a table is to get all of them.
To do this, use the ``all()`` method on a ``Manager``.
Example::
all_entries = Entry.objects.all()
The ``all()`` method returns a ``QuerySet`` of all the objects in the database.
(If ``Entry.objects`` is a ``QuerySet``, why can't we just do ``Entry.objects``?
That's because ``Entry.objects``, the root ``QuerySet``, is a special case
that cannot be evaluated. The ``all()`` method returns a ``QuerySet`` that
*can* be evaluated.)
Filtering objects
-----------------
The root ``QuerySet`` provided by the ``Manager`` describes all objects in the
database table. Usually, though, you'll need to select only a subset of the
complete set of objects. complete set of objects.
To create such a subset, you refine the initial Query Set, adding conditions To create such a subset, you refine the initial ``QuerySet``, adding filter
until you have described a set that meets your needs. The two most common conditions. The two most common ways to refine a ``QuerySet`` are:
mechanisms for refining a Query Set are:
``filter(**kwargs)`` ``filter(**kwargs)``
Returns a new Query Set containing objects that match the given lookup parameters. Returns a new ``QuerySet`` containing objects that match the given lookup
parameters.
``exclude(**kwargs)`` ``exclude(**kwargs)``
Return a new Query Set containing objects that do not match the given lookup parameters. Returns a new ``QuerySet`` containing objects that do *not* match the given
lookup parameters.
Lookup parameters should be in the format described in "Field lookups" below. The lookup parameters (``**kwargs`` in the above function definitions) should
be in the format described in "Field lookups" below.
The result of refining a Query Set is itself a Query Set; so it is possible to For example, to get a ``QuerySet`` of blog entries from the year 2006, use
chain refinements together. For example:: ``filter()`` like so::
Poll.objects.filter( Entry.objects.filter(pub_date__year=2006)
question__startswith="What").exclude(
(Note we don't have to add an ``all()`` -- ``Entry.objects.all().filter(...)``.
That would still work, but you only need ``all()`` when you want all objects
from the root ``QuerySet``.)
Chaining filters
~~~~~~~~~~~~~~~~
The result of refining a ``QuerySet`` is itself a ``QuerySet``, so it's
possible to chain refinements together. For example::
Entry.objects.filter(
headline__startswith='What').exclude(
pub_date__gte=datetime.now()).filter( pub_date__gte=datetime.now()).filter(
pub_date__gte=datetime(2005,1,1)) pub_date__gte=datetime(2005, 1, 1))
...takes the initial Query Set, and adds a filter, then an exclusion, then ...takes the initial ``QuerySet`` of all entries in the database, adds a
another filter to remove elements present in the initial Query Set. The filter, then an exclusion, then another filter. The final result is a
final result is a Query Set containing all Polls with a question that ``QuerySet`` containing all entries with a headline that starts with "What",
starts with "What", that were published between 1 Jan 2005 and today. that were published between January 1, 2005, and the current day.
Each Query Set is a unique object. The process of refinement is not one Filtered QuerySets are unique
of adding a condition to the initial Query Set. Rather, each refinement -----------------------------
creates a separate and distinct Query Set that can be stored, used, and -----------------------------
reused. For example::
q1 = Poll.objects.filter(question__startswith="What") Each time you refine a ``QuerySet``, you get a brand-new ``QuerySet`` that is
in no way bound to the previous ``QuerySet``. Each refinement creates a
separate and distinct ``QuerySet`` that can be stored, used and reused.
Example::
q1 = Entry.objects.filter(headline__startswith="What")
q2 = q1.exclude(pub_date__gte=datetime.now()) q2 = q1.exclude(pub_date__gte=datetime.now())
q3 = q1.filter(pub_date__gte=datetime.now()) q3 = q1.filter(pub_date__gte=datetime.now())
will construct 3 Query Sets; a base query set containing all Polls with a These three ``QuerySets`` are separate. The first is a base ``QuerySet``
question that starts with "What", and two subsets of the base Query Set (one containing all entries that contain a headline starting with "What". The second
with an exclusion, one with a filter). The initial Query Set is unaffected by is a subset of the first, with an additional criterion that excludes records
the refinement process. whose ``pub_date`` is greater than now. The third is a subset of the first,
with an additional criterion that selects only the records whose ``pub_date`` is
greater than now. The initial ``QuerySet`` (``q1``) is unaffected by the
refinement process.
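The independence of refined query sets can be illustrated with a toy class. This is a sketch in plain Python 3, not Django code: ``ToyQuerySet`` and its ``run()`` method are hypothetical names, and predicates are ordinary functions rather than keyword lookups.

```python
class ToyQuerySet:
    """Hypothetical stand-in for a QuerySet: an immutable tuple of predicates."""

    def __init__(self, predicates=()):
        self.predicates = tuple(predicates)

    def filter(self, predicate):
        # Return a new object; self is left untouched.
        return ToyQuerySet(self.predicates + (predicate,))

    def exclude(self, predicate):
        # Same idea, with the predicate negated.
        return ToyQuerySet(self.predicates + (lambda row: not predicate(row),))

    def run(self, rows):
        return [r for r in rows if all(p(r) for p in self.predicates)]

rows = [
    {"headline": "What now?", "year": 2005},
    {"headline": "What next?", "year": 2007},
    {"headline": "Hello", "year": 2005},
]

q1 = ToyQuerySet().filter(lambda r: r["headline"].startswith("What"))
q2 = q1.exclude(lambda r: r["year"] >= 2006)  # subset of q1
q3 = q1.filter(lambda r: r["year"] >= 2006)   # another subset of q1
```

After building ``q2`` and ``q3``, ``q1`` still holds a single predicate and still matches both "What" headlines.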
QuerySets are lazy
------------------
``QuerySets`` are lazy -- the act of creating a ``QuerySet`` doesn't involve
any database activity. You can stack filters together all day long, and Django
won't actually run the query until the ``QuerySet`` is *evaluated*.
When QuerySets are evaluated
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can evaluate a ``QuerySet`` in the following ways:
* **Iteration.** A ``QuerySet`` is iterable, and it executes its database
query the first time you iterate over it. For example, this will print
the headline of all entries in the database::
for e in Entry.objects.all():
print e.headline
* **Slicing.** A ``QuerySet`` can be sliced, using Python's array-slicing
syntax, and it executes its database query the first time you slice it.
Examples::
fifth_entry = Entry.objects.all()[4]
all_entries_but_the_first_two = Entry.objects.all()[2:]
every_second_entry = Entry.objects.all()[::2]
* **repr().** A ``QuerySet`` is evaluated when you call ``repr()`` on it.
This is for convenience in the Python interactive interpreter, so you can
immediately see your results.
* **len().** A ``QuerySet`` is evaluated when you call ``len()`` on it.
This, as you might expect, returns the length of the result list.
Note: *Don't* use ``len()`` on a ``QuerySet`` if all you want to do is
determine the number of records in the set. It's much more efficient to
handle a count at the database level, using SQL's ``SELECT COUNT(*)``,
and Django provides a ``count()`` method for precisely this reason. See
``count()`` below.
* **list().** Force evaluation of a ``QuerySet`` by calling ``list()`` on
it. For example::
entry_list = list(Entry.objects.all())
Be warned, though, that this could have a large memory overhead, because
Django will load each element of the list into memory. In contrast,
iterating over a ``QuerySet`` will take advantage of your database to
load data and instantiate objects only as you need them.
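To make the lazy behavior concrete, here is a small sketch built on the standard ``sqlite3`` module. It assumes a hypothetical ``LazyRows`` class and ``entry`` table; it only mimics the idea that refining costs nothing and that iteration or ``count()`` is what reaches the database.

```python
import sqlite3

class LazyRows:
    """Hypothetical lazy query: builds SQL on filter(), runs it on iteration."""

    def __init__(self, conn, where=(), params=()):
        self.conn = conn
        self.where = tuple(where)
        self.params = tuple(params)
        self.queries_run = 0

    def filter(self, clause, value):
        # No database access here; just return a refined copy.
        return LazyRows(self.conn, self.where + (clause,), self.params + (value,))

    def _sql(self, columns):
        sql = "SELECT %s FROM entry" % columns
        if self.where:
            sql += " WHERE " + " AND ".join(self.where)
        return sql

    def __iter__(self):
        self.queries_run += 1
        return iter(self.conn.execute(self._sql("headline"), self.params).fetchall())

    def count(self):
        # Let the database count instead of loading every row.
        self.queries_run += 1
        return self.conn.execute(self._sql("COUNT(*)"), self.params).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT, year INTEGER)")
conn.executemany("INSERT INTO entry VALUES (?, ?)",
                 [("What now?", 2005), ("What next?", 2006), ("Hello", 2006)])

q = LazyRows(conn).filter("year = ?", 2006).filter("headline LIKE ?", "What%")
ran_before = q.queries_run          # nothing has been evaluated yet
headlines = [row[0] for row in q]   # iteration triggers the query
total = q.count()                   # a separate COUNT(*) query
```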
Full list of QuerySet methods
-----------------------------
Django provides a range of ``QuerySet`` refinement methods that modify either
the types of results returned by the ``QuerySet`` or the way its SQL query is
executed.
filter(**kwargs)
~~~~~~~~~~~~~~~~
Returns a new ``QuerySet`` containing objects that match the given lookup
parameters.
The lookup parameters (``**kwargs``) should be in the format described in
"Field lookups" below. Multiple parameters are joined via ``AND`` in the
underlying SQL statement.
exclude(**kwargs)
~~~~~~~~~~~~~~~~~
Returns a new ``QuerySet`` containing objects that do *not* match the given
lookup parameters.
The lookup parameters (``**kwargs``) should be in the format described in
"Field lookups" below. Multiple parameters are joined via ``AND`` in the
underlying SQL statement, and the whole thing is enclosed in a ``NOT()``.
This example excludes all entries whose ``pub_date`` is later than the current
date/time AND whose ``headline`` is "Hello"::
Entry.objects.exclude(pub_date__gt=datetime.now(), headline='Hello')
This example excludes all entries whose ``pub_date`` is later than the current
date/time OR whose ``headline`` is "Hello"::
Entry.objects.exclude(pub_date__gt=datetime.now()).exclude(headline='Hello')
Note the second example is more restrictive.
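The difference between the two forms is ordinary boolean logic, which the following plain Python 3 sketch spells out. The sample entries and the fixed ``now`` value are made up for illustration; no Django code is involved.

```python
from datetime import datetime

now = datetime(2006, 4, 26)
entries = [
    {"headline": "Hello", "pub_date": datetime(2006, 5, 1)},  # future and "Hello"
    {"headline": "Hello", "pub_date": datetime(2006, 1, 1)},  # only "Hello"
    {"headline": "Other", "pub_date": datetime(2006, 5, 1)},  # only future
    {"headline": "Other", "pub_date": datetime(2006, 1, 1)},  # neither
]

def is_future(entry):
    return entry["pub_date"] > now

def is_hello(entry):
    return entry["headline"] == "Hello"

# One exclude() with two parameters: NOT (future AND hello).
single = [e for e in entries if not (is_future(e) and is_hello(e))]

# Two chained exclude() calls: (NOT future) AND (NOT hello).
chained = [e for e in entries if not is_future(e) and not is_hello(e)]
```

The single call drops only the entry that matches both conditions, while the chained calls drop every entry that matches either one.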
order_by(*fields)
~~~~~~~~~~~~~~~~~
By default, results returned by a ``QuerySet`` are ordered by the ordering
tuple given by the ``ordering`` option in the model's ``Meta``. You can
override this on a per-``QuerySet`` basis by using the ``order_by`` method.
Example::
Entry.objects.filter(pub_date__year=2005).order_by('-pub_date', 'headline')
The result above will be ordered by ``pub_date`` descending, then by
``headline`` ascending. The negative sign in front of ``"-pub_date"`` indicates
*descending* order. Ascending order is implied. To order randomly, use ``"?"``,
like so::
Entry.objects.order_by('?')
To order by a field in a different table, add the other table's name and a dot,
like so::
Entry.objects.order_by('blogs_blog.name', 'headline')
There's no way to specify whether ordering should be case sensitive. With
respect to case-sensitivity, Django will order results however your database
backend normally orders them.
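As a rough guide to what ``order_by('-pub_date', 'headline')`` asks of the database, the sketch below runs a comparable ``ORDER BY`` clause through the standard ``sqlite3`` module. The table layout and the exact SQL are assumptions for illustration; the statement Django generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT, pub_date TEXT)")
conn.executemany("INSERT INTO entry VALUES (?, ?)", [
    ("Beta", "2005-03-01"),
    ("Alpha", "2005-03-01"),
    ("Gamma", "2005-01-15"),
])

# '-pub_date' maps to DESC; a bare field name maps to ASC.
rows = conn.execute(
    "SELECT headline FROM entry ORDER BY pub_date DESC, headline ASC"
).fetchall()
ordered = [row[0] for row in rows]
```

The two entries that share a ``pub_date`` come first, and the tie between them is broken alphabetically by headline.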
values(*fields)
~~~~~~~~~~~~~~~
Returns a ``ValuesQuerySet`` -- a ``QuerySet`` that evaluates to a list of
dictionaries instead of model-instance objects.
Each of those dictionaries represents an object, with the keys corresponding to
the attribute names of model objects.
This example compares the dictionaries of ``values()`` with the normal model
objects::
# This list contains a Blog object.
>>> Blog.objects.filter(name__startswith='Beatles')
[Beatles Blog]
# This list contains a dictionary.
>>> Blog.objects.filter(name__startswith='Beatles').values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]
``values()`` takes optional positional arguments, ``*fields``, which specify
field names to which the ``SELECT`` should be limited. If you specify the
fields, each dictionary will contain only the field keys/values for the fields
you specify. If you don't specify the fields, each dictionary will contain a
key and value for every field in the database table.
Example::
>>> Blog.objects.values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]
>>> Blog.objects.values('id', 'name')
[{'id': 1, 'name': 'Beatles Blog'}]
A ``ValuesQuerySet`` is useful when you know you're only going to need values
from a small number of the available fields and you won't need the
functionality of a model instance object. It's more efficient to select only
the fields you need to use.
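The shape of the result can be imitated with the standard ``sqlite3`` module, as in this sketch. The ``values()`` helper here is a hypothetical function over a hand-made ``blog`` table, not Django's method; it only shows how limiting the ``SELECT`` limits the dictionary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dictionaries
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT, tagline TEXT)")
conn.execute("INSERT INTO blog (name, tagline) VALUES (?, ?)",
             ("Beatles Blog", "All the latest Beatles news."))

def values(conn, *fields):
    """Return one dict per row, restricted to the given column names."""
    # Column names are interpolated directly, so pass trusted names only.
    columns = ", ".join(fields) if fields else "*"
    return [dict(row) for row in conn.execute("SELECT %s FROM blog" % columns)]

all_fields = values(conn)
some_fields = values(conn, "id", "name")
```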
Finally, note a ``ValuesQuerySet`` is a subclass of ``QuerySet``, so it has all
methods of ``QuerySet``. You can call ``filter()`` on it, or ``order_by()``, or
whatever. Yes, that means these two calls are identical::
Blog.objects.values().order_by('id')
Blog.objects.order_by('id').values()
The people who made Django prefer to put all the SQL-affecting methods first,
followed (optionally) by any output-affecting methods (such as ``values()``),
but it doesn't really matter. This is your chance to really flaunt your
individualism.
distinct()
~~~~~~~~~~
The ``distinct()`` method returns a new ``QuerySet`` that uses
``SELECT DISTINCT`` in its SQL query. This eliminates duplicate rows from the
query results.
By default, a ``QuerySet`` will not eliminate duplicate rows. In practice, this
is rarely a problem, because simple queries such as ``Blog.objects.all()``
don't introduce the possibility of duplicate result rows.
However, if your query spans multiple tables, or you're using a
``ValuesQuerySet`` with a ``fields`` clause, it's possible to get duplicate
results when a ``QuerySet`` is evaluated. That's when you'd use ``distinct()``.
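A query that spans two tables shows how duplicates arise. The sketch below uses the standard ``sqlite3`` module with hypothetical ``blog`` and ``entry`` tables; the SQL is written by hand and is only an approximation of what a multi-table lookup would produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT)")
conn.execute("INSERT INTO blog VALUES (1, 'Beatles Blog')")
conn.executemany("INSERT INTO entry (blog_id, headline) VALUES (?, ?)",
                 [(1, "What now?"), (1, "What next?")])

# Blogs that have at least one entry whose headline starts with "What".
join = ("FROM blog JOIN entry ON entry.blog_id = blog.id "
        "WHERE entry.headline LIKE 'What%'")

plain = conn.execute("SELECT blog.name " + join).fetchall()
unique = conn.execute("SELECT DISTINCT blog.name " + join).fetchall()
```

Without ``DISTINCT`` the blog appears once per matching entry; with it, the blog appears once.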
TODO: Left off here
``dates(field, kind, order='ASC')``
-----------------------------------
Returns a Date Query Set - a Query Set that evaluates to a list of
``datetime.datetime`` objects representing all available dates of a
particular kind within the contents of the Query Set.
``field`` should be the name of a ``DateField`` or ``DateTimeField`` of your
model.
``kind`` should be either ``"year"``, ``"month"`` or ``"day"``. Each
``datetime.datetime`` object in the result list is "truncated" to the given
``kind``.
* ``"year"`` returns a list of all distinct year values for the field.
* ``"month"`` returns a list of all distinct year/month values for the field.
* ``"day"`` returns a list of all distinct year/month/day values for the field.
``order``, which defaults to ``'ASC'``, should be either ``"ASC"`` or ``"DESC"``.
This specifies how to order the results.
For example::
>>> Poll.objects.dates('pub_date', 'year')
[datetime.datetime(2005, 1, 1)]
>>> Poll.objects.dates('pub_date', 'month')
[datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]
>>> Poll.objects.dates('pub_date', 'day')
[datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]
>>> Poll.objects.dates('pub_date', 'day', order='DESC')
[datetime.datetime(2005, 3, 20), datetime.datetime(2005, 2, 20)]
>>> Poll.objects.filter(question__contains='name').dates('pub_date', 'day')
[datetime.datetime(2005, 3, 20)]
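The truncation described above can be reproduced in plain Python 3. This is a sketch of the idea only: ``truncate()`` and ``dates()`` are hypothetical helpers operating on an in-memory list, whereas the real method does its work in SQL.

```python
from datetime import datetime

def truncate(value, kind):
    """Drop every component of a datetime that is finer than `kind`."""
    if kind == "year":
        return datetime(value.year, 1, 1)
    if kind == "month":
        return datetime(value.year, value.month, 1)
    if kind == "day":
        return datetime(value.year, value.month, value.day)
    raise ValueError("kind must be 'year', 'month' or 'day'")

def dates(values, kind, order="ASC"):
    """Return the distinct truncated datetimes, sorted in the given order."""
    distinct = sorted({truncate(v, kind) for v in values})
    return distinct if order == "ASC" else distinct[::-1]

pub_dates = [datetime(2005, 2, 20), datetime(2005, 3, 20)]
years = dates(pub_dates, "year")
months = dates(pub_dates, "month")
days_desc = dates(pub_dates, "day", order="DESC")
```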
``select_related()``
--------------------
Relations are the bread and butter of databases, so there's an option to "follow"
all relationships and pre-fill them in a simple cache so that later access to
related objects doesn't have to hit the database. Do this by calling
``select_related()`` on a ``QuerySet``. This results in (sometimes much) larger
queries, but it means that later use of relationships is much faster.
For example, using the Poll and Choice models from above, if you do the following::
c = Choice.objects.select_related().get(id=5)
Then subsequent calls to ``c.poll`` won't hit the database.
Note that ``select_related`` follows foreign keys as far as possible. If you have the
following models::
class Poll(models.Model):
# ...
class Choice(models.Model):
# ...
poll = models.ForeignKey(Poll)
class SingleVote(models.Model):
# ...
choice = models.ForeignKey(Choice)
then a call to ``SingleVote.objects.select_related().get(id=4)`` will
cache the related choice *and* the related poll::
>>> sv = SingleVote.objects.select_related().get(id=4)
>>> c = sv.choice # Doesn't hit the database.
>>> p = c.poll # Doesn't hit the database.
>>> sv = SingleVote.objects.get(id=4)
>>> c = sv.choice # Hits the database.
>>> p = c.poll # Hits the database.
``extra(params, select, where, tables)``
----------------------------------------
Sometimes, the Django query syntax by itself isn't quite enough. To cater for these
edge cases, Django provides the ``extra()`` Query Set modifier - a mechanism
for injecting specific clauses into the SQL generated by a Query Set.
Note that by definition these extra lookups may not be portable to different
database engines (because you're explicitly writing SQL code) and should be
avoided if possible.
``params``
All the extra-SQL params described below may use standard Python string
formatting codes to indicate parameters that the database engine will
automatically quote. The ``params`` argument can contain any extra
parameters to be substituted.
``select``
The ``select`` keyword allows you to select extra fields. This should be a
dictionary mapping attribute names to a SQL clause to use to calculate that
attribute. For example::
Poll.objects.extra(
select={
'choice_count': 'SELECT COUNT(*) FROM choices WHERE poll_id = polls.id'
}
)
Each of the resulting ``Poll`` objects will have an extra attribute, ``choice_count``,
an integer count of associated ``Choice`` objects. Note that the parentheses required by
most database engines around sub-selects are not required in Django's ``select``
clauses.
``where`` / ``tables``
If you need to explicitly pass extra ``WHERE`` clauses -- perhaps to perform
non-explicit joins -- use the ``where`` keyword. If you need to
join other tables into your query, you can pass their names to ``tables``.
``where`` and ``tables`` both take a list of strings. All ``where`` parameters
are "AND"ed to any other search criteria.
For example::
Poll.objects.filter(
question__startswith='Who').extra(where=['id IN (3, 4, 5, 20)'])
...translates (roughly) into the following SQL::
SELECT * FROM polls_polls WHERE question LIKE 'Who%' AND id IN (3, 4, 5, 20);
Caching and QuerySets
---------------------
Each ``QuerySet`` contains a cache, to minimize database access.
In a newly created ``QuerySet``, this cache is empty. The first time a
``QuerySet`` is evaluated -- and, hence, a database query happens -- Django
saves the query results in the ``QuerySet``'s cache and returns the results
that have been explicitly requested (e.g., the next element, if the
``QuerySet`` is being iterated over). Subsequent evaluations of the
``QuerySet`` reuse the cached results.
Keep this caching behavior in mind, because it may bite you if you don't use
your ``QuerySet`` objects correctly. For example, the following will create two
``QuerySet`` objects, evaluate them, and throw them away::
print [e.headline for e in Entry.objects.all()]
print [e.pub_date for e in Entry.objects.all()]
That means the same database query will be executed twice, effectively doubling
your database load. Also, there's a possibility the two lists may not include
the same database records, because an ``Entry`` may have been added or deleted
in the split second between the two requests.
To avoid this problem, simply save the ``QuerySet`` and reuse it::
queryset = Entry.objects.all()
print [e.headline for e in queryset] # Evaluate the query set.
print [e.pub_date for e in queryset] # Re-use the cache from the evaluation.
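The effect of the cache can be seen in a toy version written in plain Python 3. ``CachedQuery`` and ``fetch()`` are hypothetical names; ``fetch()`` stands in for the database round trip and simply records how often it is called.

```python
class CachedQuery:
    """Hypothetical query object that fetches once and then reuses the result."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = None

    def __iter__(self):
        if self._cache is None:
            # First evaluation: hit the "database" and remember the rows.
            self._cache = list(self._fetch())
        return iter(self._cache)

calls = []

def fetch():
    calls.append(1)  # one entry per simulated database query
    return ["First entry", "Second entry"]

# Reusing one object: the second loop is served from the cache.
queryset = CachedQuery(fetch)
upper = [h.upper() for h in queryset]
lengths = [len(h) for h in queryset]
calls_when_reused = len(calls)

# Building a fresh object each time: every loop fetches again.
for _ in range(2):
    list(CachedQuery(fetch))
calls_total = len(calls)
```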
Deleting objects
================
It should be noted that the construction of a Query Set does not involve any
activity on the database. The database is not consulted until a Query Set is
evaluated.
Field lookups Field lookups
============= =============
@ -303,67 +771,6 @@ See the `OR lookups examples page`_ for more examples.
.. _OR lookups examples page: http://www.djangoproject.com/documentation/models/or_lookups/ .. _OR lookups examples page: http://www.djangoproject.com/documentation/models/or_lookups/
QuerySet evaluation
===================
A Query Set must be evaluated to return the objects that are contained in the
set. This can be achieved by iteration, slicing, or by specialist function.
A Query Set is an iterable object. Therefore, it can be used in loop
constructs. For example::
for p in Poll.objects.all():
print p
will print all the Poll objects, using the ``__repr__()`` method of Poll.
A Query Set can also be sliced, using array notation::
fifth_poll = Poll.objects.all()[4]
all_polls_but_the_first_two = Poll.objects.all()[2:]
every_second_poll = Poll.objects.all()[::2]
Query Sets are lazy objects - that is, they are not *actually* sets (or
lists) that contain all the objects that they represent. Python protocol
magic is used to make the Query Set *look* like an iterable, sliceable
object, but behind the scenes, Django is using caching to only instantiate
objects as they are required.
If you really need to have a list, you can force the evaluation of the
lazy object::
querylist = list(Poll.objects.all())
However - be warned; this could have a large memory overhead, as Django will
create an in-memory representation of every element of the list.
Caching and QuerySets
=====================
Each Query Set contains a cache. In a newly created Query Set, this cache
is unpopulated. When a Query Set is evaluated for the first time, Django
makes a database query to populate the cache, and then returns the results
that have been explicitly requested (e.g., the next element if iteration
is in use). Subsequent evaluations of the Query Set reuse the cached results.
This caching behavior must be kept in mind when using Query Sets. For
example, the following will cause two temporary Query Sets to be created,
evaluated, and thrown away::
print [p for p in Poll.objects.all()] # Evaluate the Query Set
print [p for p in Poll.objects.all()] # Evaluate the Query Set again
On a small, low-traffic website, this may not pose a serious problem. However,
on a high traffic website, it effectively doubles your database load. In
addition, there is a possibility that the two lists may not be identical,
since a poll may be added or deleted by another user between making the two
requests.
To avoid this problem, simply save the Query Set and reuse it::
queryset = Poll.objects.all()
print [p for p in queryset] # Evaluate the query set
print [p for p in queryset] # Re-use the cache from the evaluation
Specialist QuerySet evaluation Specialist QuerySet evaluation
============================== ==============================
@ -563,217 +970,6 @@ example::
attribute that starts with "eggs". Django automatically composes the joins attribute that starts with "eggs". Django automatically composes the joins
and conditions required for the SQL query. and conditions required for the SQL query.
Specialist QuerySets refinement
===============================
In addition to ``filter()`` and ``exclude()``, Django provides a range of
Query Set refinement methods that modify the types of results returned by
the Query Set, or modify the way the SQL query is executed on the database.
``order_by(*fields)``
----------------------
The results returned by a Query Set are automatically ordered by the ordering
tuple given by the ``ordering`` meta key in the model. However, ordering may be
explicitly provided by using the ``order_by`` method::
Poll.objects.filter(pub_date__year=2005,
pub_date__month=1).order_by('-pub_date', 'question')
The result set above will be ordered by ``pub_date`` descending, then
by ``question`` ascending. The negative sign in front of "-pub_date" indicates
descending order. Ascending order is implied. To order randomly, use "?", like
so::
Poll.objects.order_by('?')
To order by a field in a different table, add the other table's name and a dot,
like so::
Choice.objects.order_by('Poll.pub_date', 'choice')
There's no way to specify whether ordering should be case sensitive. With
respect to case-sensitivity, Django will order results however your database
backend normally orders them.
``distinct()``
--------------
By default, a Query Set will not eliminate duplicate rows. This will not
happen during simple queries; however, if your query spans relations,
or you are using a Values Query Set with a ``fields`` clause, it is possible
to get duplicated results when a Query Set is evaluated.
``distinct()`` returns a new Query Set that eliminates duplicate rows from the
results returned by the Query Set. This is equivalent to a ``SELECT DISTINCT``
SQL clause.
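The following sketch shows how a query that spans a relation can return the same row twice, and how ``SELECT DISTINCT`` removes the duplicate. It runs plain SQL through Python's ``sqlite3`` module; the table names, columns and rows are assumptions made up for the example, not Django's real schema:

```python
import sqlite3

# Hypothetical schema: one poll with two choices that both match the filter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE polls (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE choices (id INTEGER PRIMARY KEY, poll_id INTEGER, choice TEXT);
    INSERT INTO polls VALUES (1, 'Favourite breakfast?');
    INSERT INTO choices VALUES (1, 1, 'eggs and ham');
    INSERT INTO choices VALUES (2, 1, 'eggs and spam');
""")

join_sql = (
    "SELECT {distinct} polls.id, polls.question FROM polls "
    "JOIN choices ON choices.poll_id = polls.id "
    "WHERE choices.choice LIKE 'eggs%'"
)

# Without DISTINCT, the poll appears once per matching choice.
plain = conn.execute(join_sql.format(distinct="")).fetchall()

# With DISTINCT, identical rows are collapsed into one.
unique = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
```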
``values(*fields)``
--------------------
Returns a Values Query Set - a Query Set that evaluates to a list of
dictionaries instead of model-instance objects. Each dictionary in the
list will represent an object matching the query, with the keys matching
the attribute names of the object.
It accepts optional positional arguments, ``*fields``, which should be the
names of the fields to include. If you don't specify any fields, each
dictionary in the list returned by ``values()`` will have a key and value for
each field in the database table. If you specify fields, each dictionary will
have only the keys/values for the fields you specify. For example::
>>> Poll.objects.values()
[{'id': 1, 'slug': 'whatsup', 'question': "What's up?",
'pub_date': datetime.datetime(2005, 2, 20),
'expire_date': datetime.datetime(2005, 3, 20)},
{'id': 2, 'slug': 'name', 'question': "What's your name?",
'pub_date': datetime.datetime(2005, 3, 20),
'expire_date': datetime.datetime(2005, 4, 20)}]
>>> Poll.objects.values('id', 'slug')
[{'id': 1, 'slug': 'whatsup'}, {'id': 2, 'slug': 'name'}]
A Values Query Set is useful when you know you're only going to need values
from a small number of the available fields and you won't need the
functionality of a model instance object. It's more efficient to select only
the fields you need to use.
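As a rough illustration of the idea, the sketch below selects only two columns and turns each row into a dictionary. It uses Python's ``sqlite3`` module with a made-up ``polls`` table; it mirrors the shape of the result of ``values('id', 'slug')`` but is not how Django implements it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sqlite3.Row lets each result row be converted to a dict keyed by column name.
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE polls (id INTEGER PRIMARY KEY, slug TEXT, question TEXT);
    INSERT INTO polls VALUES (1, 'whatsup', 'What''s up?');
    INSERT INTO polls VALUES (2, 'name', 'What''s your name?');
""")

# Only the requested columns are fetched; 'question' never leaves the database.
rows = conn.execute("SELECT id, slug FROM polls ORDER BY id").fetchall()
as_dicts = [dict(row) for row in rows]
```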
``dates(field, kind, order='ASC')``
-----------------------------------
Returns a Date Query Set - a Query Set that evaluates to a list of
``datetime.datetime`` objects representing all available dates of a
particular kind within the contents of the Query Set.
``field`` should be the name of a ``DateField`` or ``DateTimeField`` of your
model.
``kind`` should be either ``"year"``, ``"month"`` or ``"day"``. Each
``datetime.datetime`` object in the result list is "truncated" to the given
``kind``.
* ``"year"`` returns a list of all distinct year values for the field.
* ``"month"`` returns a list of all distinct year/month values for the field.
* ``"day"`` returns a list of all distinct year/month/day values for the field.
``order``, which defaults to ``'ASC'``, should be either ``"ASC"`` or ``"DESC"``.
This specifies how to order the results.
For example::
>>> Poll.objects.dates('pub_date', 'year')
[datetime.datetime(2005, 1, 1)]
>>> Poll.objects.dates('pub_date', 'month')
[datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]
>>> Poll.objects.dates('pub_date', 'day')
[datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]
>>> Poll.objects.dates('pub_date', 'day', order='DESC')
[datetime.datetime(2005, 3, 20), datetime.datetime(2005, 2, 20)]
>>> Poll.objects.filter(question__contains='name').dates('pub_date', 'day')
[datetime.datetime(2005, 3, 20)]
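The truncation described above can be sketched in plain Python. The helper names ``truncate`` and ``distinct_dates`` are hypothetical and exist only for this illustration; Django performs the equivalent work in the database:

```python
import datetime

def truncate(value, kind):
    """Drop everything finer than the given kind from a datetime."""
    if kind == "year":
        return datetime.datetime(value.year, 1, 1)
    if kind == "month":
        return datetime.datetime(value.year, value.month, 1)
    if kind == "day":
        return datetime.datetime(value.year, value.month, value.day)
    raise ValueError("kind must be 'year', 'month' or 'day'")

def distinct_dates(values, kind, order="ASC"):
    """Return the distinct truncated datetimes, sorted in the given order."""
    truncated = {truncate(value, kind) for value in values}
    return sorted(truncated, reverse=(order == "DESC"))

# Made-up publication times; two of them fall on the same day.
pub_dates = [
    datetime.datetime(2005, 2, 20, 9, 30),
    datetime.datetime(2005, 3, 20, 8, 0),
    datetime.datetime(2005, 3, 20, 18, 0),
]
by_year = distinct_dates(pub_dates, "year")
by_month = distinct_dates(pub_dates, "month")
by_day_desc = distinct_dates(pub_dates, "day", order="DESC")
```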
``select_related()``
--------------------
Relations are the bread and butter of databases, so there's an option to "follow"
all relationships and pre-fill them in a simple cache so that later access to
related objects doesn't have to hit the database. Do this by calling the
``select_related()`` method on a Query Set. This results in (sometimes much) larger
queries, but it means that later use of relationships is much faster.
For example, using the Poll and Choice models from above, if you do the following::
c = Choice.objects.select_related().get(id=5)
Then subsequent calls to ``c.poll`` won't hit the database.
Note that ``select_related`` follows foreign keys as far as possible. If you have the
following models::
class Poll(models.Model):
# ...
class Choice(models.Model):
# ...
poll = models.ForeignKey(Poll)
class SingleVote(models.Model):
# ...
choice = models.ForeignKey(Choice)
then a call to ``SingleVote.objects.select_related().get(id=4)`` will
cache the related choice *and* the related poll::
>>> sv = SingleVote.objects.select_related().get(id=4)
>>> c = sv.choice # Doesn't hit the database.
>>> p = c.poll # Doesn't hit the database.
>>> sv = SingleVote.objects.get(id=4)
>>> c = sv.choice # Hits the database.
>>> p = c.poll # Hits the database.
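The trade-off can be sketched outside Django by counting queries. The example below uses Python's ``sqlite3`` module, a made-up schema, and a hypothetical ``run`` helper that records every query it sends; it shows one larger JOIN replacing two smaller lookups, and is not a description of Django's internal caching:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE polls (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE choices (id INTEGER PRIMARY KEY, poll_id INTEGER, choice TEXT);
    INSERT INTO polls VALUES (1, 'What''s up?');
    INSERT INTO choices VALUES (5, 1, 'Not much');
""")

queries = []

def run(sql, params=()):
    """Run one query and record it, so round trips can be compared."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()

# Without following relations: one query for the choice, and a second
# query later, when the related poll is needed.
choice = run("SELECT id, poll_id, choice FROM choices WHERE id = ?", (5,))[0]
poll = run("SELECT id, question FROM polls WHERE id = ?", (choice[1],))[0]
lazy_queries = len(queries)

# Following the relation up front: a single, larger JOIN query returns
# the choice and its poll together.
joined = run(
    "SELECT choices.id, choices.choice, polls.id, polls.question "
    "FROM choices JOIN polls ON polls.id = choices.poll_id "
    "WHERE choices.id = ?",
    (5,),
)[0]
eager_queries = len(queries) - lazy_queries
```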
``extra(params, select, where, tables)``
----------------------------------------
Sometimes, the Django query syntax by itself isn't quite enough. To cater for these
edge cases, Django provides the ``extra()`` Query Set modifier - a mechanism
for injecting specific clauses into the SQL generated by a Query Set.
Note that by definition these extra lookups may not be portable to different
database engines (because you're explicitly writing SQL code) and should be
avoided if possible.
``params``
All the extra-SQL params described below may use standard Python string
formatting codes to indicate parameters that the database engine will
automatically quote. The ``params`` argument can contain any extra
parameters to be substituted.
``select``
The ``select`` keyword allows you to select extra fields. This should be a
dictionary mapping attribute names to a SQL clause to use to calculate that
attribute. For example::
Poll.objects.extra(
select={
'choice_count': 'SELECT COUNT(*) FROM choices WHERE poll_id = polls.id'
}
)
Each of the resulting ``Poll`` objects will have an extra attribute, ``choice_count``,
an integer count of associated ``Choice`` objects. Note that the parentheses required by
most database engines around sub-selects are not required in Django's ``select``
clauses.
``where`` / ``tables``
If you need to explicitly pass extra ``WHERE`` clauses -- perhaps to perform
non-explicit joins -- use the ``where`` keyword. If you need to
join other tables into your query, you can pass their names to ``tables``.
``where`` and ``tables`` both take a list of strings. All ``where`` parameters
are "AND"ed to any other search criteria.
For example::
Poll.objects.filter(
question__startswith='Who').extra(where=['id IN (3, 4, 5, 20)'])
...translates (roughly) into the following SQL::
SELECT * FROM polls_polls WHERE question LIKE 'Who%' AND id IN (3, 4, 5, 20);
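To see how an extra ``select`` sub-select and an extra ``where`` clause combine with an ordinary filter, here is a hand-written approximation run through Python's ``sqlite3`` module. The ``polls`` and ``choices`` tables and their rows are invented for the example. Note also that ``sqlite3`` uses ``?`` as its parameter placeholder, whereas the ``params`` argument described above uses Python string formatting codes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE polls (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE choices (id INTEGER PRIMARY KEY, poll_id INTEGER, choice TEXT);
    INSERT INTO polls VALUES (3, 'Who goes there?');
    INSERT INTO polls VALUES (4, 'What time is it?');
    INSERT INTO polls VALUES (7, 'Who is next?');
    INSERT INTO choices VALUES (1, 3, 'Me');
    INSERT INTO choices VALUES (2, 3, 'You');
""")

rows = conn.execute(
    # The extra "select" entry becomes a parenthesised sub-select column.
    "SELECT polls.id, polls.question, "
    "(SELECT COUNT(*) FROM choices WHERE poll_id = polls.id) AS choice_count "
    "FROM polls "
    # The extra "where" clause is ANDed with the ordinary filter condition.
    "WHERE polls.question LIKE ? AND polls.id IN (3, 4, 5, 20)",
    ("Who%",),
).fetchall()
```

Only poll 3 satisfies both conditions: poll 7 starts with "Who" but is not in the ID list, and poll 4 is in the list but does not start with "Who".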
Changing objects
================
Once you've retrieved an object from the database using any of the above
options, changing it is extremely easy. Make changes directly to the
object's fields, then call the object's ``save()`` method::
>>> p = Poll.objects.get(id__exact=15)
>>> p.slug = "new_slug"
>>> p.pub_date = datetime.datetime.now()
>>> p.save()
Creating new objects Creating new objects
==================== ====================