From cf55b5bbaf3e99c48cea68bd4a896a691c647ddc Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Mon, 17 Apr 2006 06:43:41 +0000 Subject: [PATCH] magic-removal: Completed review of db-api documentation. git-svn-id: http://code.djangoproject.com/svn/django/branches/magic-removal@2706 bcc190cf-cafb-0310-a4f2-bffc1f526a37 --- docs/db-api.txt | 520 +++++++++++++++++++++++++++++++----------------- 1 file changed, 335 insertions(+), 185 deletions(-) diff --git a/docs/db-api.txt b/docs/db-api.txt index db3fce9358..46ed5c01d0 100644 --- a/docs/db-api.txt +++ b/docs/db-api.txt @@ -46,69 +46,93 @@ and the following Django sample session:: How Queries Work ================ -Querying in Django is based upon the construction and evaluation of Query Sets. +Querying in Django is based upon the construction and evaluation of Query +Sets. -A Query Set is a database-independent representation of a query. It can be -thought of as a representation of a group of objects that meet a given set -of criteria. However, the members of the set are not determined until the -Query Set is formally evaluated. +A Query Set is a database-independent representation of a group of objects +that all meet a given set of criteria. However, the determination of which +objects are actually members of the Query Set is not made until you formally +evaluate the Query Set. -To compose a Query using Django, you obtain an initial a Query Set. This -Query Set can then be refined using a range of operations. When you have -a Query Set that meets your needs, it can be evaluated (using iterators, slicing, -or one of a range of other techniques), yielding an object or list of objects -that meet the specifications of the Query Set. +To construct a Query Set that meets your requirements, you start by obtaining +an initial Query Set that describes all objects of a given type. This initial +Query Set can then be refined using a range of operations. 
Once you have +refined your Query Set to the point where it describes the group of objects +you require, it can be evaluated (using iterators, slicing, or one of a range +of other techniques), yielding an object or list of objects that meet the +specifications of the Query Set. -Obtaining a Query Set -===================== +Obtaining an Initial Query Set +============================== -Query Sets are obtained using the Manager object on a model. Every model -has at least one Manager; by default, the Manager is called ``objects``. +Every model has at least one Manager; by default, the Manager is called +``objects``. One of the most important roles of the Manager is as a source +of initial Query Sets. The Manager acts as a Query Set that describes all +objects of the type being managed; ``Poll.objects`` is the initial Query Set +that contains all Polls in the database. + +The initial Query Set on the Manager behaves in the same way as every other +Query Set in every respect except one - it cannot be evaluated. To overcome +this limitation, the Manager Query Set has an ``all()`` method. The ``all()`` +method produces a copy of the initial Query Set - a copy that *can* be +evaluated:: + + all_polls = Poll.objects.all() See the `Managers`_ section of the Model API for more details on the role and construction of Managers. .. _Managers: http://www.djangoproject.com/documentation/model_api/#managers -The manager has a special factory method for creating Query Sets:: - - queryset = Poll.objects.all() - -This creates a new Query Set that matches all the objects of the given class. - -As a convenient shortcut, all of these Query Set construction methods -can be accessed from the Manager object itself.
-The following two queries are identical:: - - Poll.objects.all().filter(question__startswith="What") - Poll.objects.filter(question__startswith="What") - - Query Set Refinement ==================== -The default Query Set returned by the Manager contains all objects of the -Model type. In order to be useful, +The initial Query Set provided by the Manager describes all objects of a +given type. However, you will usually need to describe a subset of the +complete set of objects. -Any Query Set can be refined by calling one of the following methods: +To create such a subset, you refine the initial Query Set, adding conditions +until you have described a set that meets your needs. The two most common +mechanisms for refining a Query Set are: -filter(\**kwargs) +``filter(**kwargs)`` Returns a new Query Set containing objects that match the given lookup parameters. -exclude(\**kwargs) +``exclude(**kwargs)`` Return a new Query Set containing objects that do not match the given lookup parameters. Lookup parameters should be in the format described in "Field lookups" below. -Query Set refinements can be chained together:: +The result of refining a Query Set is itself a Query Set; so it is possible to +chain refinements together. For example:: - Poll.objects.filter(question__startswith="What").exclude().filter(...) + Poll.objects.filter( + question__startswith="What").exclude( + pub_date__gte=datetime.now()).filter( + pub_date__gte=datetime(2005,1,1)) -Query Sets can also be stored and reused:: +...takes the initial Query Set, and adds a filter, then an exclusion, then +another filter to remove elements present in the initial Query Set. The +final result is a Query Set containing all Polls with a question that +starts with "What", that were published between 1 Jan 2005 and today. - q1 = Poll.objects.filter() - q2 = q1.exclude() - q3 = q1.filter() +Each Query Set is a unique object. The process of refinement is not one +of adding a condition to the initial Query Set. 
Rather, each refinement +creates a separate and distinct Query Set that can be stored, used, and +reused. For example:: + + q1 = Poll.objects.filter(question__startswith="What") + q2 = q1.exclude(pub_date__gte=datetime.now()) + q3 = q1.filter(pub_date__gte=datetime.now()) + +will construct three Query Sets: a base Query Set containing all Polls with a +question that starts with "What", and two subsets of the base Query Set (one +with an exclusion, one with a filter). The initial Query Set is unaffected by +the refinement process. + +It should be noted that the construction of a Query Set does not involve any +activity on the database. The database is not consulted until a Query Set is +evaluated. Field lookups ============= @@ -116,7 +140,7 @@ Field lookups Basic field lookups take the form ``field__lookuptype`` (that's a double-underscore). For example:: - Poll.objects.filter(pub_date__lte=datetime.datetime.now()) + Poll.objects.filter(pub_date__lte=datetime.now()) translates (roughly) into the following SQL:: @@ -176,8 +200,8 @@ two statements are equivalent:: Poll.objects.get(id=14) Poll.objects.get(id__exact=14) -Multiple lookups are also allowed. When separated by commans, the list of lookups will be -"AND"ed together:: +Multiple lookup parameters are allowed. When separated by commas, the list of +lookup parameters will be "AND"ed together:: Poll.objects.filter( pub_date__year=2005, @@ -205,82 +229,10 @@ If you pass an invalid keyword argument, the function will raise ``TypeError``. .. _`Keyword Arguments`: http://docs.python.org/tut/node6.html#SECTION006720000000000000000 -Query Set evaluation -==================== - -Once you have constructed a Query Set to meet your needs, it must be evaluated -to return the objects that are contained in the set.
This can be achieved in - -A Query Set is an iterable object:: - - queryset = Poll.objects.all() - for p in queryset: - print p - -Query Sets can also be sliced:: - - fifth_poll = queryset[4] - all_polls_but_the_first_two = queryset[2:] - - -If you really need to have a . :: - querylist = list(Poll.objects.all()) - -However - be warned; if you use these approaches, - -Regardless of whether you iterate or slice the Query Set, - -upon first evaluation, the query will be executed on the database, and the results cached. -Subsequent evaluations of the Query Set reuse the cached results. - -As an alternative to iteration and slicing, you can use one of the -following functions. These functions do not populate or effect the cache: - -get(\**kwargs) --------------- - -Returns the object matching the given lookup parameters, which should be in -the format described in _`Field lookups`. Raises a module-level -``DoesNotExist`` exception if an object wasn't found for the given parameters. -Raises ``AssertionError`` if more than one object was found. - -count() -------- - -Returns an integer representing the number of objects in the database matching -the Query Set. ``count()`` never raises exceptions. - -Depending on which database you're using (e.g. PostgreSQL vs. MySQL), this may -return a long integer instead of a normal Python integer. - -in_bulk(id_list) ----------------- - -Takes a list of IDs and returns a dictionary mapping each ID to an instance of -the object with the given ID. For example:: - - >>> Poll.objects.in_bulk([1]) - {1: What's up?} - >>> Poll.objects.in_bulk([1, 2]) - {1: What's up?, 2: What's your name?} - >>> Poll.objects.in_bulk([]) - {} - -latest(field_name=None) ------------------------ - -Returns the latest object, according to the model's 'get_latest_by' -Meta option, or using the field_name provided. For example:: - - >>> Poll.objects.latest() - What's up? - >>> Poll.objects.latest('expire_date') - What's your name? 
- OR lookups ========== -By default, keyword argument queries are "AND"ed together. If you have more +Keyword argument queries are "AND"ed together. If you have more complex query requirements (for example, you need to include an ``OR`` statement in your query), you need to use ``Q`` objects. @@ -297,15 +249,17 @@ combined using the ``&`` and ``|`` operators. When an operator is used on two Q(question__startswith='Who') | Q(question__startswith='What') -... yields a single ``Q`` object that represents the "OR" of two "question__startswith" queries, equivalent to the SQL WHERE clause:: +... yields a single ``Q`` object that represents the "OR" of two +"question__startswith" queries, equivalent to the SQL WHERE clause:: ... WHERE question LIKE 'Who%' OR question LIKE 'What%' -You can compose statements of arbitrary complexity by combining ``Q`` objects with the ``&`` and ``|`` operators. Parenthetical grouping can also be used. +You can compose statements of arbitrary complexity by combining ``Q`` objects +with the ``&`` and ``|`` operators. Parenthetical grouping can also be used. -One or more ``Q`` objects can then provided as arguments to the lookup functions. If multiple -``Q`` object arguments are provided to a lookup function, they will be "AND"ed together. -For example:: +One or more ``Q`` objects can then be provided as arguments to the lookup +functions. If multiple ``Q`` object arguments are provided to a lookup +function, they will be "AND"ed together. For example:: Poll.objects.get( Q(question__startswith='Who'), @@ -317,10 +271,11 @@ For example:: SELECT * from polls WHERE question LIKE 'Who%' AND (pub_date = '2005-05-02' OR pub_date = '2005-05-06') -If necessary, lookup functions can mix the use of ``Q`` objects and keyword arguments. All arguments -provided to a lookup function (be they keyword argument or ``Q`` object) are "AND"ed together. -However, if a ``Q`` object is provided, it must precede the definition of any keyword arguments.
-For example:: +If necessary, lookup functions can mix the use of ``Q`` objects and keyword +arguments. All arguments provided to a lookup function (be they keyword +argument or ``Q`` object) are "AND"ed together. However, if a ``Q`` object is +provided, it must precede the definition of any keyword arguments. For +example:: Poll.objects.get( Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)), @@ -348,79 +303,275 @@ See the `OR lookups examples page`_ for more examples. .. _OR lookups examples page: http://www.djangoproject.com/documentation/models/or_lookups/ +Query Set evaluation +==================== + +A Query Set must be evaluated to return the objects that are contained in the +set. This can be achieved by iteration, slicing, or by specialist function. + +A Query Set is an iterable object. Therefore, it can be used in loop +constructs. For example:: + + for p in Poll.objects.all(): + print p + +will print all the Poll objects, using the ``__repr__()`` method of Poll. + +A Query Set can also be sliced, using array notation:: + + fifth_poll = Poll.objects.all()[4] + all_polls_but_the_first_two = Poll.objects.all()[2:] + every_second_poll = Poll.objects.all()[::2] + +Query Sets are lazy objects - that is, they are not *actually* sets (or +lists) that contain all the objects that they represent. Python protocol +magic is used to make the Query Set *look* like an iterable, sliceable +object, but behind the scenes, Django is using caching to only instantiate +objects as they are required. + +If you really need to have a list, you can force the evaluation of the +lazy object:: + + querylist = list(Poll.objects.all()) + +However - be warned; this could have a large memory overhead, as Django will +create an in-memory representation of every element of the list. + +Caching and Query Sets +====================== + +Each Query Set contains a cache. In a newly created Query Set, this cache +is unpopulated. 
When a Query Set is evaluated for the first time, Django +makes a database query to populate the cache, and then returns the results +that have been explicitly requested (e.g., the next element if iteration +is in use). Subsequent evaluations of the Query Set reuse the cached results. + +This caching behavior must be kept in mind when using Query Sets. For +example, the following will cause two temporary Query Sets to be created, +evaluated, and thrown away:: + + print [p for p in Poll.objects.all()] # Evaluate the Query Set + print [p for p in Poll.objects.all()] # Evaluate the Query Set again + +On a small, low-traffic website, this may not pose a serious problem. However, +on a high traffic website, it effectively doubles your database load. In +addition, there is a possibility that the two lists may not be identical, +since a poll may be added or deleted by another user between making the two +requests. + +To avoid this problem, simply save the Query Set and reuse it:: + + queryset = Poll.objects.all() + print [p for p in queryset] # Evaluate the query set + print [p for p in queryset] # Re-use the cache from the evaluation + +Specialist Query Set Evaluation +=============================== + +The following specialist functions can also be used to evaluate a Query Set. +Unlike iteration or slicing, these methods do not populate the cache; each +time one of these evaluation functions is used, the database will be queried. + +``get(**kwargs)`` +----------------- + +Returns the object matching the given lookup parameters, which should be in +the format described in _`Field lookups`. Raises a module-level +``DoesNotExist`` exception if an object wasn't found for the given parameters. +Raises ``AssertionError`` if more than one object was found. + +``count()`` +----------- + +Returns an integer representing the number of objects in the database matching +the Query Set. ``count()`` never raises exceptions. + +Depending on which database you're using (e.g. 
PostgreSQL vs. MySQL), this may +return a long integer instead of a normal Python integer. + +``in_bulk(id_list)`` +-------------------- + +Takes a list of IDs and returns a dictionary mapping each ID to an instance of +the object with the given ID. For example:: + + >>> Poll.objects.in_bulk([1]) + {1: What's up?} + >>> Poll.objects.in_bulk([1, 2]) + {1: What's up?, 2: What's your name?} + >>> Poll.objects.in_bulk([]) + {} + +``latest(field_name=None)`` +--------------------------- + +Returns the latest object, according to the model's 'get_latest_by' +Meta option, or using the field_name provided. For example:: + + >>> Poll.objects.latest() + What's up? + >>> Poll.objects.latest('expire_date') + What's your name? Relationships (joins) ===================== -Joins may implicitly be performed by following relationships: -``Choice.objects.filter(poll__slug="eggs")`` fetches a list of ``Choice`` -objects where the associated ``Poll`` has a slug of ``eggs``. Multiple levels -of joins are allowed. +When you define a relationship in a model (i.e., a ForeignKey, +OneToOneField, or ManyToManyField), Django uses the name of the +relationship to add a descriptor_ on every instance of the model. +This descriptor behaves just like a normal attribute, providing +access to the related object or objects. For example, +``mychoice.poll`` will return the poll object associated with a specific +instance of ``Choice``. -Given an instance of an object, related objects can be looked-up directly using -convenience functions. For example, if ``p`` is a ``Poll`` instance, -``p.choice_set.all()`` will return a list of all associated choices. Astute -readers will note that this is the same as -``Choice.objects.filter(poll__id=p.id)``, except clearer. +.. _descriptor: http://users.rcn.com/python/download/Descriptor.htm -Each type of relationship creates a set of methods on each object in the -relationship. 
These methods are created in both directions, so objects that are -"related-to" need not explicitly define reverse relationships; that happens -automatically. +Django also adds a descriptor for the 'other' side of the relationship - +the link from the related model to the model that defines the relationship. +Since the related model has no explicit reference to the source model, +Django will automatically derive a name for this descriptor. The name that +Django chooses depends on the type of relation that is represented. However, +if the definition of the relation has a `related_name` parameter, Django +will use this name in preference to deriving a name. -One-to-one relations --------------------- +There are two types of descriptor that can be employed: Single Object +Descriptors and Object Set Descriptors. The following table describes +when each descriptor type is employed. The local model is the model on +which the relation is defined; the related model is the model referred +to by the relation. -Each object in a one-to-one relationship will have a ``get_relatedobjectname()`` -method. For example:: + =============== ============= ============= + Relation Type Local Model Related Model + =============== ============= ============= + OneToOneField Single Object Single Object + + ForeignKey Single Object Object Set + + ManyToManyField Object Set Object Set + =============== ============= ============= - class Place(models.Model): - # ... +Single Object Descriptor +------------------------ - class Restaurant(models.Model): - # ... - the_place = models.OneToOneField(Place) +If the related object is a single object, the descriptor acts +just as if the related object were an attribute:: -In the above example, each ``Place`` will have a ``get_restaurant()`` method, -and each ``Restaurant`` will have a ``get_the_place()`` method. 
+ # Obtain the existing poll + old_poll = mychoice.poll + # Set a new poll + mychoice.poll = new_poll + # Save the change + mychoice.save() -Many-to-one relations +Whenever a change is made to a Single Object Descriptor, ``save()`` +must be called to commit the change to the database. + +If no ``related_name`` parameter is defined, Django will use the +lower case version of the source model name as the name for the +related descriptor. For example, if the ``Choice`` model had +a field:: + + coordinator = models.OneToOneField(User) + +... instances of the model ``User`` would be able to call:: + + old_choice = myuser.choice + myuser.choice = new_choice + +By default, relations do not allow values of None; if you attempt +to assign None to a Single Object Descriptor, an AttributeError +will be thrown. However, if the relation has 'null=True' set +(i.e., the database will allow NULLs for the relation), None can +be assigned and returned by the descriptor to represent empty +relations. + +Access to Single Object Descriptors is cached. The first time +a descriptor on an instance is accessed, the database will be +queried, and the result stored. Subsequent attempts to access +the descriptor on the same instance will use the cached value. + +Object Set Descriptor --------------------- -In each many-to-one relationship, the related object will have a -``get_relatedobject()`` method, and the related-to object will have -``get_relatedobject()``, ``get_relatedobject_list()``, and -``get_relatedobject_count()`` methods (the same as the module-level -``get_object()``, ``filter()``, and ``get_count()`` methods). +An Object Set Descriptor acts just like the Manager - as an initial Query +Set describing the set of objects related to an instance. As such, any +query refining technique (filter, exclude, etc) can be used on the Object +Set descriptor.
This also means that an Object Set Descriptor cannot be evaluated +directly - the ``all()`` method must be used to produce a Query Set that +can be evaluated. -In the poll example above, here are the available choice methods on a ``Poll`` object ``p``:: +If no ``related_name`` parameter is defined, Django will use the lower case +version of the source model name appended with ``_set`` as the name for the +related descriptor. For example, every ``Poll`` object has a ``choice_set`` +descriptor. - p.get_choice() - p.get_choice_list() - p.get_choice_count() +The Object Set Descriptor has utility methods to add objects to the +related object set: -And a ``Choice`` object ``c`` has the following method:: +``add(obj1, obj2, ...)`` + Add the specified objects to the related object set. + +``create(**kwargs)`` + Create a new object, and put it in the related object set. See + _`Creating new objects` - c.get_poll() +The Object Set Descriptor may also have utility methods to remove objects +from the related object set: -Many-to-many relations ----------------------- +``remove(obj1, obj2, ...)`` + Remove the specified objects from the related object set. + +``clear()`` + Remove all objects from the related object set. + +These two removal methods will not exist on ForeignKeys where ``null=False`` +(such as in the Poll example). This is to prevent database inconsistency - if +the related field cannot be set to None, then an object cannot be removed +from one relation without adding it to another. -Many-to-many relations result in the same set of methods as `Many-to-one relations`_, -except that the ``get_relatedobject_list()`` function on the related object will -return a list of instances instead of a single instance. So, if the relationship -between ``Poll`` and ``Choice`` was many-to-many, ``choice.get_poll_list()`` would -return a list. +The members of a related object set can be assigned from any iterable object.
+For example:: -Specialist Query Sets -===================== + mypoll.choice_set = [choice1, choice2] + +If the ``clear()`` method is available, any pre-existing objects will be removed +from the Object Set before all objects in the iterable (in this case, a list) +are added to the choice set. If the ``clear()`` method is not available, all +objects in the iterable will be added without removing any existing elements. + +Each of these operations on the Object Set Descriptor has immediate effect +on the database - every add, create and remove is immediately and +automatically saved to the database. + +Relationships and Queries +========================= + +When composing a ``filter`` or ``exclude`` refinement, it may be necessary to +include conditions that span relationships. Relations can be followed as deep +as required - just add descriptor names, separated by double underscores, to +describe the full path to the query attribute. The query:: + + Foo.objects.filter(name1__name2__name3__attribute__lookup=value) + +... is interpreted as 'get every Foo that has a name1 that has a name2 that +has a name3 that has an attribute with lookup matching value'. In the Poll +example:: + + Choice.objects.filter(poll__slug__startswith="eggs") + +... describes the set of choices for which the related poll has a slug +attribute that starts with "eggs". Django automatically composes the joins +and conditions required for the SQL query. + +Specialist Query Sets Refinement +================================ In addition to ``filter`` and ``exclude()``, Django provides a range of Query Set refinement methods that modify the types of results returned by the Query Set, or modify the way the SQL query is executed on the database. -order_by(\*fields) ------------------- +``order_by(*fields)`` +---------------------- The results returned by a Query Set are automatically ordered by the ordering tuple given by the ``ordering`` meta key in the model. 
However, ordering may be @@ -445,8 +596,8 @@ There's no way to specify whether ordering should be case sensitive. With respect to case-sensitivity, Django will order results however your database backend normally orders them. -distinct() ----------- +``distinct()`` +-------------- By default, a Query Set will not eliminate duplicate rows. This will not happen during simple queries; however, if your query spans relations, @@ -457,8 +608,8 @@ to get duplicated results when a Query Set is evaluated. results returned by the Query Set. This is equivalent to a ``SELECT DISTINCT`` SQL clause. -values(\*fields) ----------------- +``values(*fields)`` +-------------------- Returns a Values Query Set - a Query Set that evaluates to a list of dictionaries instead of model-instance objects. Each dictionary in the @@ -486,8 +637,8 @@ from a small number of the available fields and you won't need the functionality of a model instance object. It's more efficient to select only the fields you need to use. -dates(field, kind, order='ASC') -------------------------------- +``dates(field, kind, order='ASC')`` +----------------------------------- Returns a Date Query Set - a Query Set that evaluates to a list of ``datetime.datetime`` objects representing all available dates of a @@ -520,8 +671,8 @@ For example:: >>> Poll.objects.filter(question__contains='name').dates('pub_date', 'day') [datetime.datetime(2005, 3, 20)] -select_related() ----------------- +``select_related()`` +-------------------- Relations are the bread and butter of databases, so there's an option to "follow" all relationships and pre-fill them in a simple cache so that later calls to @@ -561,8 +712,8 @@ cache the related choice *and* the related poll:: >>> p = c.poll # Hits the database. -extra(params, select, where, tables) ------------------------------------- +``extra(params, select, where, tables)`` +---------------------------------------- Sometimes, the Django query syntax by itself isn't quite enough. 
To cater for these edge cases, Django provides the ``extra()`` Query Set modifier - a mechanism @@ -705,9 +856,8 @@ key field is called ``name``, these two statements are equivalent:: Extra instance methods ====================== -In addition to ``save()``, ``delete()`` and all of the ``add_*`` and ``get_*`` -related-object methods, a model object might get any or all of the following -methods: +In addition to ``save()`` and ``delete()``, a model object might get any or all +of the following methods: get_FOO_display() ----------------- @@ -741,7 +891,7 @@ For every ``DateField`` and ``DateTimeField`` that does not have ``null=True``, the object will have ``get_next_by_FOO()`` and ``get_previous_by_FOO()`` methods, where ``FOO`` is the name of the field. This returns the next and previous object with respect to the date field, raising the appropriate -``*DoesNotExist`` exception when appropriate. +``DoesNotExist`` exception when appropriate. Both methods accept optional keyword arguments, which should be in the format described in "Field lookups" above.