mirror of
				https://github.com/django/django.git
				synced 2025-10-25 14:46:09 +00:00 
			
		
		
		
	git-svn-id: http://code.djangoproject.com/svn/django/trunk@17872 bcc190cf-cafb-0310-a4f2-bffc1f526a37
		
			
				
	
	
		
			295 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			295 lines
		
	
	
		
			11 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| ==========================
 | |
| Performing raw SQL queries
 | |
| ==========================
 | |
| 
 | |
| .. currentmodule:: django.db.models
 | |
| 
 | |
| When the :doc:`model query APIs </topics/db/queries>` don't go far enough, you
 | |
| can fall back to writing raw SQL. Django gives you two ways of performing raw
 | |
| SQL queries: you can use :meth:`Manager.raw()` to `perform raw queries and
 | |
| return model instances`__, or you can avoid the model layer entirely and
 | |
| `execute custom SQL directly`__.
 | |
| 
 | |
| __ `performing raw queries`_
 | |
| __ `executing custom SQL directly`_
 | |
| 
 | |
| .. _executing-raw-queries:
 | |
| 
 | |
| Performing raw queries
 | |
| ======================
 | |
| 
 | |
| .. versionadded:: 1.2
 | |
| 
 | |
| The ``raw()`` manager method can be used to perform raw SQL queries that
 | |
| return model instances:
 | |
| 
 | |
| .. method:: Manager.raw(raw_query, params=None, translations=None)
 | |
| 
 | |
| This method method takes a raw SQL query, executes it, and returns a
 | |
| :class:`~django.db.models.query.RawQuerySet` instance. This
 | |
| :class:`~django.db.models.query.RawQuerySet` instance can be iterated
 | |
| over just like an normal QuerySet to provide object instances.
 | |
| 
 | |
| This is best illustrated with an example. Suppose you've got the following model::
 | |
| 
 | |
|     class Person(models.Model):
 | |
|         first_name = models.CharField(...)
 | |
|         last_name = models.CharField(...)
 | |
|         birth_date = models.DateField(...)
 | |
| 
 | |
| You could then execute custom SQL like so::
 | |
| 
 | |
|     >>> for p in Person.objects.raw('SELECT * FROM myapp_person'):
 | |
|     ...     print p
 | |
|     John Smith
 | |
|     Jane Jones
 | |
| 
 | |
| Of course, this example isn't very exciting -- it's exactly the same as
 | |
| running ``Person.objects.all()``. However, ``raw()`` has a bunch of other
 | |
| options that make it very powerful.
 | |
| 
 | |
| .. admonition:: Model table names
 | |
| 
 | |
|     Where'd the name of the ``Person`` table come from in that example?
 | |
| 
 | |
|     By default, Django figures out a database table name by joining the
 | |
|     model's "app label" -- the name you used in ``manage.py startapp`` -- to
 | |
|     the model's class name, with an underscore between them. In the example
 | |
|     we've assumed that the ``Person`` model lives in an app named ``myapp``,
 | |
|     so its table would be ``myapp_person``.
 | |
| 
 | |
|     For more details check out the documentation for the
 | |
|     :attr:`~Options.db_table` option, which also lets you manually set the
 | |
|     database table name.
 | |
| 
 | |
| .. warning::
 | |
| 
 | |
|     No checking is done on the SQL statement that is passed in to ``.raw()``.
 | |
|     Django expects that the statement will return a set of rows from the
 | |
|     database, but does nothing to enforce that. If the query does not
 | |
|     return rows, a (possibly cryptic) error will result.
 | |
| 
 | |
| Mapping query fields to model fields
 | |
| ------------------------------------
 | |
| 
 | |
| ``raw()`` automatically maps fields in the query to fields on the model.
 | |
| 
 | |
| The order of fields in your query doesn't matter. In other words, both
 | |
| of the following queries work identically::
 | |
| 
 | |
|     >>> Person.objects.raw('SELECT id, first_name, last_name, birth_date FROM myapp_person')
 | |
|     ...
 | |
|     >>> Person.objects.raw('SELECT last_name, birth_date, first_name, id FROM myapp_person')
 | |
|     ...
 | |
| 
 | |
| Matching is done by name. This means that you can use SQL's ``AS`` clauses to
 | |
| map fields in the query to model fields. So if you had some other table that
 | |
| had ``Person`` data in it, you could easily map it into ``Person`` instances::
 | |
| 
 | |
|     >>> Person.objects.raw('''SELECT first AS first_name,
 | |
|     ...                              last AS last_name,
 | |
|     ...                              bd AS birth_date,
 | |
|     ...                              pk as id,
 | |
|     ...                       FROM some_other_table''')
 | |
| 
 | |
| As long as the names match, the model instances will be created correctly.
 | |
| 
 | |
| Alternatively, you can map fields in the query to model fields using the
 | |
| ``translations`` argument to ``raw()``. This is a dictionary mapping names of
 | |
| fields in the query to names of fields on the model. For example, the above
 | |
| query could also be written::
 | |
| 
 | |
|     >>> name_map = {'first': 'first_name', 'last': 'last_name', 'bd': 'birth_date', 'pk': 'id'}
 | |
|     >>> Person.objects.raw('SELECT * FROM some_other_table', translations=name_map)
 | |
| 
 | |
| Index lookups
 | |
| -------------
 | |
| 
 | |
| ``raw()`` supports indexing, so if you need only the first result you can
 | |
| write::
 | |
| 
 | |
|     >>> first_person = Person.objects.raw('SELECT * from myapp_person')[0]
 | |
| 
 | |
| However, the indexing and slicing are not performed at the database level. If
 | |
| you have a big amount of ``Person`` objects in your database, it is more
 | |
| efficient to limit the query at the SQL level::
 | |
| 
 | |
|     >>> first_person = Person.objects.raw('SELECT * from myapp_person LIMIT 1')[0]
 | |
| 
 | |
| Deferring model fields
 | |
| ----------------------
 | |
| 
 | |
| Fields may also be left out::
 | |
| 
 | |
|     >>> people = Person.objects.raw('SELECT id, first_name FROM myapp_person')
 | |
| 
 | |
| The ``Person`` objects returned by this query will be deferred model instances
 | |
| (see :meth:`~django.db.models.query.QuerySet.defer()`). This means that the
 | |
| fields that are omitted from the query will be loaded on demand. For example::
 | |
| 
 | |
|     >>> for p in Person.objects.raw('SELECT id, first_name FROM myapp_person'):
 | |
|     ...     print p.first_name, # This will be retrieved by the original query
 | |
|     ...     print p.last_name # This will be retrieved on demand
 | |
|     ...
 | |
|     John Smith
 | |
|     Jane Jones
 | |
| 
 | |
| From outward appearances, this looks like the query has retrieved both
 | |
| the first name and last name. However, this example actually issued 3
 | |
| queries. Only the first names were retrieved by the raw() query -- the
 | |
| last names were both retrieved on demand when they were printed.
 | |
| 
 | |
| There is only one field that you can't leave out - the primary key
 | |
| field. Django uses the primary key to identify model instances, so it
 | |
| must always be included in a raw query. An ``InvalidQuery`` exception
 | |
| will be raised if you forget to include the primary key.
 | |
| 
 | |
| Adding annotations
 | |
| ------------------
 | |
| 
 | |
| You can also execute queries containing fields that aren't defined on the
 | |
| model. For example, we could use `PostgreSQL's age() function`__ to get a list
 | |
| of people with their ages calculated by the database::
 | |
| 
 | |
|     >>> people = Person.objects.raw('SELECT *, age(birth_date) AS age FROM myapp_person')
 | |
|     >>> for p in people:
 | |
|     ...     print "%s is %s." % (p.first_name, p.age)
 | |
|     John is 37.
 | |
|     Jane is 42.
 | |
|     ...
 | |
| 
 | |
| __ http://www.postgresql.org/docs/8.4/static/functions-datetime.html
 | |
| 
 | |
| Passing parameters into ``raw()``
 | |
| ---------------------------------
 | |
| 
 | |
| If you need to perform parameterized queries, you can use the ``params``
 | |
| argument to ``raw()``::
 | |
| 
 | |
|     >>> lname = 'Doe'
 | |
|     >>> Person.objects.raw('SELECT * FROM myapp_person WHERE last_name = %s', [lname])
 | |
| 
 | |
| ``params`` is a list of parameters. You'll use ``%s`` placeholders in the
 | |
| query string (regardless of your database engine); they'll be replaced with
 | |
| parameters from the ``params`` list.
 | |
| 
 | |
| .. warning::
 | |
| 
 | |
|     **Do not use string formatting on raw queries!**
 | |
| 
 | |
|     It's tempting to write the above query as::
 | |
| 
 | |
|         >>> query = 'SELECT * FROM myapp_person WHERE last_name = %s' % lname
 | |
|         >>> Person.objects.raw(query)
 | |
| 
 | |
|     **Don't.**
 | |
| 
 | |
|     Using the ``params`` list completely protects you from `SQL injection
 | |
|     attacks`__, a common exploit where attackers inject arbitrary SQL into
 | |
|     your database. If you use string interpolation, sooner or later you'll
 | |
|     fall victim to SQL injection. As long as you remember to always use the
 | |
|     ``params`` list you'll be protected.
 | |
| 
 | |
| __ http://en.wikipedia.org/wiki/SQL_injection
 | |
| 
 | |
| .. _executing-custom-sql:
 | |
| 
 | |
| Executing custom SQL directly
 | |
| =============================
 | |
| 
 | |
| Sometimes even :meth:`Manager.raw` isn't quite enough: you might need to
 | |
| perform queries that don't map cleanly to models, or directly execute
 | |
| ``UPDATE``, ``INSERT``, or ``DELETE`` queries.
 | |
| 
 | |
| In these cases, you can always access the database directly, routing around
 | |
| the model layer entirely.
 | |
| 
 | |
| The object ``django.db.connection`` represents the
 | |
| default database connection, and ``django.db.transaction`` represents the
 | |
| default database transaction. To use the database connection, call
 | |
| ``connection.cursor()`` to get a cursor object. Then, call
 | |
| ``cursor.execute(sql, [params])`` to execute the SQL and ``cursor.fetchone()``
 | |
| or ``cursor.fetchall()`` to return the resulting rows. After performing a data
 | |
| changing operation, you should then call
 | |
| ``transaction.commit_unless_managed()`` to ensure your changes are committed
 | |
| to the database. If your query is purely a data retrieval operation, no commit
 | |
| is required. For example::
 | |
| 
 | |
|     def my_custom_sql():
 | |
|         from django.db import connection, transaction
 | |
|         cursor = connection.cursor()
 | |
| 
 | |
|         # Data modifying operation - commit required
 | |
|         cursor.execute("UPDATE bar SET foo = 1 WHERE baz = %s", [self.baz])
 | |
|         transaction.commit_unless_managed()
 | |
| 
 | |
|         # Data retrieval operation - no commit required
 | |
|         cursor.execute("SELECT foo FROM bar WHERE baz = %s", [self.baz])
 | |
|         row = cursor.fetchone()
 | |
| 
 | |
|         return row
 | |
| 
 | |
| If you are using :doc:`more than one database </topics/db/multi-db>`, you can
 | |
| use ``django.db.connections`` to obtain the connection (and cursor) for a
 | |
| specific database. ``django.db.connections`` is a dictionary-like
 | |
| object that allows you to retrieve a specific connection using its
 | |
| alias::
 | |
| 
 | |
|     from django.db import connections
 | |
|     cursor = connections['my_db_alias'].cursor()
 | |
|     # Your code here...
 | |
|     transaction.commit_unless_managed(using='my_db_alias')
 | |
| 
 | |
| By default, the Python DB API will return results without their field
 | |
| names, which means you end up with a ``list`` of values, rather than a
 | |
| ``dict``. At a small performance cost, you can return results as a
 | |
| ``dict`` by using something like this::
 | |
|     
 | |
|     def dictfetchall(cursor):
 | |
|         "Returns all rows from a cursor as a dict"
 | |
|         desc = cursor.description
 | |
|         return [
 | |
|             dict(zip([col[0] for col in desc], row))
 | |
|             for row in cursor.fetchall()
 | |
|         ]
 | |
| 
 | |
| Here is an example of the difference between the two::
 | |
| 
 | |
|     >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
 | |
|     >>> cursor.fetchall()
 | |
|     ((54360982L, None), (54360880L, None))
 | |
|     
 | |
|     >>> cursor.execute("SELECT id, parent_id from test LIMIT 2");
 | |
|     >>> dictfetchall(cursor)
 | |
|     [{'parent_id': None, 'id': 54360982L}, {'parent_id': None, 'id': 54360880L}]
 | |
| 
 | |
| 
 | |
| .. _transactions-and-raw-sql:
 | |
| 
 | |
| Transactions and raw SQL
 | |
| ------------------------
 | |
| 
 | |
| When you make a raw SQL call, Django will automatically mark the
 | |
| current transaction as dirty. You must then ensure that the
 | |
| transaction containing those calls is closed correctly. See :ref:`the
 | |
| notes on the requirements of Django's transaction handling
 | |
| <topics-db-transactions-requirements>` for more details.
 | |
| 
 | |
| .. versionchanged:: 1.3
 | |
| 
 | |
| Prior to Django 1.3, it was necessary to manually mark a transaction
 | |
| as dirty using ``transaction.set_dirty()`` when using raw SQL calls.
 | |
| 
 | |
| Connections and cursors
 | |
| -----------------------
 | |
| 
 | |
| ``connection`` and ``cursor`` mostly implement the standard Python DB-API
 | |
| described in :pep:`249` (except when it comes to :doc:`transaction handling
 | |
| </topics/db/transactions>`). If you're not familiar with the Python DB-API, note
 | |
| that the SQL statement in ``cursor.execute()`` uses placeholders, ``"%s"``,
 | |
| rather than adding parameters directly within the SQL. If you use this
 | |
| technique, the underlying database library will automatically add quotes and
 | |
| escaping to your parameter(s) as necessary. (Also note that Django expects the
 | |
| ``"%s"`` placeholder, *not* the ``"?"`` placeholder, which is used by the SQLite
 | |
| Python bindings. This is for the sake of consistency and sanity.)
 |