mirror of https://github.com/django/django.git (synced 2025-07-04 17:59:13 +00:00)
magic-removal: Greatly enhanced docs/cache.txt -- now using caching segment from Django book
git-svn-id: http://code.djangoproject.com/svn/django/branches/magic-removal@2758 bcc190cf-cafb-0310-a4f2-bffc1f526a37
parent 97ef69178f
commit c32bd868c6
354 docs/cache.txt
@ -2,63 +2,180 @@
|
||||
Django's cache framework
|
||||
========================
|
||||
|
||||
So, you got slashdotted_. Now what?
|
||||
A fundamental tradeoff in dynamic Web sites is, well, they're dynamic. Each
|
||||
time a user requests a page, the Web server makes all sorts of calculations --
|
||||
from database queries to template rendering to business logic -- to create the
|
||||
page that your site's visitor sees. This is a lot more expensive, from a
|
||||
processing-overhead perspective, than your standard read-a-file-off-the-filesystem
|
||||
server arrangement.
|
||||
|
||||
Django's cache framework gives you three methods of caching dynamic pages in
|
||||
memory or in a database. You can cache the output of specific views, you can
|
||||
cache only the pieces that are difficult to produce, or you can cache your
|
||||
entire site.
|
||||
For most Web applications, this overhead isn't a big deal. Most Web
|
||||
applications aren't washingtonpost.com or slashdot.org; they're simply small-
|
||||
to medium-sized sites with so-so traffic. But for medium- to high-traffic
|
||||
sites, it's essential to cut as much overhead as possible.
|
||||
|
||||
.. _slashdotted: http://en.wikipedia.org/wiki/Slashdot_effect
|
||||
That's where caching comes in.
|
||||
|
||||
To cache something is to save the result of an expensive calculation so that
|
||||
you don't have to perform the calculation next time. Here's some pseudocode
|
||||
explaining how this would work for a dynamically generated Web page::

    given a URL, try finding that page in the cache
    if the page is in the cache:
        return the cached page
    else:
        generate the page
        save the generated page in the cache (for next time)
        return the generated page
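The pseudocode above can be sketched in plain Python. This is only an illustration of the lookup pattern, not Django's code: ``page_cache``, ``generate_page`` and ``get_page`` are hypothetical names, and a module-level dictionary stands in for a real cache backend.

```python
import time

# Hypothetical in-process cache: maps URL -> rendered page.
page_cache = {}


def generate_page(url):
    """Stand-in for the expensive work (queries, template rendering)."""
    time.sleep(0.01)  # simulate the cost of building the page
    return "<html>content for %s</html>" % url


def get_page(url):
    """Return the cached page if present; otherwise build, store and return it."""
    page = page_cache.get(url)
    if page is None:
        page = generate_page(url)
        page_cache[url] = page  # save the generated page for next time
    return page
```

Only the first call to ``get_page()`` for a given URL pays the generation cost; later calls return the stored string.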
|
||||
|
||||
Django comes with a robust cache system that lets you save dynamic pages so
|
||||
they don't have to be calculated for each request. For convenience, Django
|
||||
offers different levels of cache granularity: You can cache the output of
|
||||
specific views, you can cache only the pieces that are difficult to produce, or
|
||||
you can cache your entire site.
|
||||
|
||||
Django also works well with "upstream" caches, such as Squid
|
||||
(http://www.squid-cache.org/) and browser-based caches. These are the types of
|
||||
caches that you don't directly control but to which you can provide hints (via
|
||||
HTTP headers) about which parts of your site should be cached, and how.
|
||||
|
||||
Setting up the cache
|
||||
====================
|
||||
|
||||
The cache framework allows for different "backends" -- different methods of
|
||||
caching data. There's a simple single-process memory cache (mostly useful as a
|
||||
fallback) and a memcached_ backend (the fastest option, by far, if you've got
|
||||
the RAM).
|
||||
The cache system requires a small amount of setup. Namely, you have to tell it
|
||||
where your cached data should live -- whether in a database, on the filesystem
|
||||
or directly in memory. This is an important decision that affects your cache's
|
||||
performance; yes, some cache types are faster than others.
|
||||
|
||||
Before using the cache, you'll need to tell Django which cache backend you'd
|
||||
like to use. Do this by setting the ``CACHE_BACKEND`` in your settings file.
|
||||
Your cache preference goes in the ``CACHE_BACKEND`` setting in your settings
|
||||
file. Here's an explanation of all available values for ``CACHE_BACKEND``.
|
||||
|
||||
The ``CACHE_BACKEND`` setting is a "fake" URI (really an unregistered scheme).
Examples:

    ==============================  ===========================================
    CACHE_BACKEND                   Explanation
    ==============================  ===========================================
    memcached://127.0.0.1:11211/    A memcached backend; the server is running
                                    on localhost port 11211.  You can use
                                    multiple memcached servers by separating
                                    them with semicolons.

                                    This backend requires the
                                    `Python memcached bindings`_.

    db://tablename/                 A database backend in a table named
                                    "tablename". This table should be created
                                    with "django-admin createcachetable".

    file:///var/tmp/django_cache/   A file-based cache stored in the directory
                                    /var/tmp/django_cache/.

    simple:///                      A simple single-process memory cache; you
                                    probably don't want to use this except for
                                    testing. Note that this cache backend is
                                    NOT thread-safe!

    locmem:///                      A more sophisticated local memory cache;
                                    this is multi-process- and thread-safe.

    dummy:///                       Doesn't actually cache; just implements the
                                    cache backend interface and doesn't do
                                    anything. This is an easy way to turn off
                                    caching for a test environment.
    ==============================  ===========================================

All caches may take arguments -- they're given in query-string style.  Valid
arguments are:

Memcached
---------

By far the fastest, most efficient type of cache available to Django, Memcached
is an entirely memory-based cache framework originally developed to handle high
loads at LiveJournal.com and subsequently open-sourced by Danga Interactive.
It's used by sites such as Slashdot and Wikipedia to reduce database access and
dramatically increase site performance.

Memcached is available for free at http://danga.com/memcached/ . It runs as a
daemon and is allotted a specified amount of RAM. All it does is provide an
interface -- a *super-lightning-fast* interface -- for adding, retrieving and
deleting arbitrary data in the cache. All data is stored directly in memory,
so there's no overhead of database or filesystem usage.

After installing Memcached itself, you'll need to install the Memcached Python
bindings. They're in a single Python module, memcache.py, available at
ftp://ftp.tummy.com/pub/python-memcached/ . If that URL is no longer valid,
just go to the Memcached Web site (http://www.danga.com/memcached/) and get the
Python bindings from the "Client APIs" section.

To use Memcached with Django, set ``CACHE_BACKEND`` to
``memcached://ip:port/``, where ``ip`` is the IP address of the Memcached
daemon and ``port`` is the port on which Memcached is running.

In this example, Memcached is running on localhost (127.0.0.1) port 11211::

    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

One excellent feature of Memcached is its ability to share cache over multiple
servers. To take advantage of this feature, include all server addresses in
``CACHE_BACKEND``, separated by semicolons. In this example, the cache is
shared over Memcached instances running on IP address 172.19.26.240 and
172.19.26.242, both on port 11211::

    CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'
|
||||
|
||||
Memory-based caching has one disadvantage: Because the cached data is stored in
|
||||
memory, the data will be lost if your server crashes. Clearly, memory isn't
|
||||
intended for permanent data storage, so don't rely on memory-based caching as
|
||||
your only data storage. Actually, none of the Django caching backends should be
|
||||
used for permanent storage -- they're all intended to be solutions for caching,
|
||||
not storage -- but we point this out here because memory-based caching is
|
||||
particularly temporary.
|
||||
|
||||
Database caching
|
||||
----------------
|
||||
|
||||
To use a database table as your cache backend, first create a cache table in
|
||||
your database by running this command::
|
||||
|
||||
python manage.py createcachetable [cache_table_name]
|
||||
|
||||
...where ``[cache_table_name]`` is the name of the database table to create.
|
||||
(This name can be whatever you want, as long as it's a valid table name that's
|
||||
not already being used in your database.) This command creates a single table
|
||||
in your database that is in the proper format that Django's database-cache
|
||||
system expects.
|
||||
|
||||
Once you've created that database table, set your ``CACHE_BACKEND`` setting to
|
||||
``"db://tablename/"``, where ``tablename`` is the name of the database table.
|
||||
In this example, the cache table's name is ``my_cache_table``::

    CACHE_BACKEND = 'db://my_cache_table'
|
||||
|
||||
Database caching works best if you've got a fast, well-indexed database server.
|
||||
|
||||
Filesystem caching
|
||||
------------------
|
||||
|
||||
To store cached items on a filesystem, use the ``"file://"`` cache type for
|
||||
``CACHE_BACKEND``. For example, to store cached data in ``/var/tmp/django_cache``,
|
||||
use this setting::
|
||||
|
||||
CACHE_BACKEND = 'file:///var/tmp/django_cache'
|
||||
|
||||
Note that there are three forward slashes toward the beginning of that example.
|
||||
The first two are for ``file://``, and the third is the first character of the
|
||||
directory path, ``/var/tmp/django_cache``.
|
||||
|
||||
The directory path should be absolute -- that is, it should start at the root
|
||||
of your filesystem. It doesn't matter whether you put a slash at the end of the
|
||||
setting.
|
||||
|
||||
Make sure the directory pointed to by this setting exists and is readable and
|
||||
writable by the system user under which your Web server runs. Continuing the
|
||||
above example, if your server runs as the user ``apache``, make sure the
|
||||
directory ``/var/tmp/django_cache`` exists and is readable and writable by the
|
||||
user ``apache``.
|
||||
|
||||
Local-memory caching
|
||||
--------------------
|
||||
|
||||
If you want the speed advantages of in-memory caching but don't have the
|
||||
capability of running Memcached, consider the local-memory cache backend. This
|
||||
cache is multi-process and thread-safe. To use it, set ``CACHE_BACKEND`` to
|
||||
``"locmem:///"``. For example::
|
||||
|
||||
CACHE_BACKEND = 'locmem:///'
|
||||
|
||||
Simple caching (for development)
|
||||
--------------------------------
|
||||
|
||||
A simple, single-process memory cache is available as ``"simple:///"``. This
|
||||
merely saves cached data in-process, which means it should only be used in
|
||||
development or testing environments. For example::
|
||||
|
||||
CACHE_BACKEND = 'simple:///'
|
||||
|
||||
Dummy caching (for development)
|
||||
-------------------------------
|
||||
|
||||
Finally, Django comes with a "dummy" cache that doesn't actually cache -- it
|
||||
just implements the cache interface without doing anything.
|
||||
|
||||
This is useful if you have a production site that uses heavy-duty caching in
|
||||
various places but a development/test environment on which you don't want to
|
||||
cache. In that case, set ``CACHE_BACKEND`` to ``"dummy:///"`` in the settings
|
||||
file for your development environment. As a result, your development
|
||||
environment won't use caching and your production environment still will.
|
||||
|
||||
CACHE_BACKEND arguments
|
||||
-----------------------
|
||||
|
||||
All caches may take arguments. They're given in query-string style on the
|
||||
``CACHE_BACKEND`` setting. Valid arguments are:
|
||||
|
||||
timeout
|
||||
Default timeout, in seconds, to use for the cache. Defaults to 5
|
||||
@ -66,7 +183,7 @@ arguments are:
|
||||
|
||||
max_entries
|
||||
For the simple and database backends, the maximum number of entries
|
||||
allowed in the cache before it is cleaned. Defaults to 300.
|
||||
|
||||
cull_percentage
|
||||
The percentage of entries that are culled when max_entries is reached.
|
||||
@ -77,20 +194,21 @@ arguments are:
|
||||
dumped when max_entries is reached. This makes culling *much* faster
|
||||
at the expense of more cache misses.
|
||||
|
||||
For example::
|
||||
In this example, ``timeout`` is set to ``60``::
|
||||
|
||||
CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=60"
|
||||
|
||||
In this example, ``timeout`` is ``30`` and ``max_entries`` is ``400``::
|
||||
|
||||
CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=30&max_entries=400"
|
||||
|
||||
Invalid arguments are silently ignored, as are invalid values of known
|
||||
arguments.
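To make the query-string convention concrete, here is a rough sketch of how such a setting string could be taken apart. It is not Django's parser: ``parse_cache_backend`` and ``KNOWN_INT_ARGS`` are hypothetical names, and treating every argument as an integer is an assumption made for the illustration.

```python
from urllib.parse import parse_qsl

# Argument names taken from the list above; treating them all as integers
# is an assumption of this sketch.
KNOWN_INT_ARGS = ("timeout", "max_entries", "cull_percentage")


def parse_cache_backend(uri):
    """Split a CACHE_BACKEND-style string into (scheme, location, params)."""
    scheme, sep, rest = uri.partition("://")
    if not sep:
        raise ValueError("expected scheme://..., got %r" % uri)
    location, _, query = rest.partition("?")
    location = location.rstrip("/")
    params = {}
    for name, value in parse_qsl(query):
        if name not in KNOWN_INT_ARGS:
            continue  # unknown argument: silently ignored
        try:
            params[name] = int(value)
        except ValueError:
            continue  # invalid value for a known argument: silently ignored
    return scheme, location, params
```

For a multi-server Memcached setting, the returned ``location`` still contains the semicolon-separated addresses, which a caller could split with ``location.split(';')``.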
|
||||
|
||||
.. _memcached: http://www.danga.com/memcached/
|
||||
.. _Python memcached bindings: ftp://ftp.tummy.com/pub/python-memcached/
|
||||
|
||||
The per-site cache
|
||||
==================
|
||||
|
||||
Once the cache is set up, the simplest way to use the cache is to cache your
|
||||
Once the cache is set up, the simplest way to use caching is to cache your
|
||||
entire site. Just add ``django.middleware.cache.CacheMiddleware`` to your
|
||||
``MIDDLEWARE_CLASSES`` setting, as in this example::
|
||||
|
||||
@ -159,52 +277,100 @@ For example, you may find it's only necessary to cache the result of an
|
||||
intensive database query. In cases like this, you can use the low-level cache
|
||||
API to store objects in the cache with any level of granularity you like.
|
||||
|
||||
The cache API is simple::
|
||||
The cache API is simple. The cache module, ``django.core.cache``, exports a
|
||||
``cache`` object that's automatically created from the ``CACHE_BACKEND``
|
||||
setting::
|
||||
|
||||
# The cache module exports a cache object that's automatically
|
||||
# created from the CACHE_BACKEND setting.
|
||||
>>> from django.core.cache import cache
|
||||
|
||||
# The basic interface is set(key, value, timeout_seconds) and get(key).
|
||||
The basic interface is ``set(key, value, timeout_seconds)`` and ``get(key)``::
|
||||
|
||||
>>> cache.set('my_key', 'hello, world!', 30)
|
||||
>>> cache.get('my_key')
|
||||
'hello, world!'
|
||||
|
||||
# (Wait 30 seconds...)
|
||||
The ``timeout_seconds`` argument is optional and defaults to the ``timeout``
|
||||
argument in the ``CACHE_BACKEND`` setting (explained above).
|
||||
|
||||
If the object doesn't exist in the cache, ``cache.get()`` returns ``None``::
|
||||
|
||||
>>> cache.get('some_other_key')
|
||||
None
|
||||
|
||||
# Wait 30 seconds for 'my_key' to expire...
|
||||
|
||||
>>> cache.get('my_key')
|
||||
None
|
||||
|
||||
# get() can take a default argument.
|
||||
>>> cache.get('my_key', 'has_expired')
|
||||
'has_expired'
|
||||
``get()`` can take a ``default`` argument::
|
||||
|
||||
>>> cache.get('my_key', 'has expired')
|
||||
'has expired'
|
||||
|
||||
There's also a ``get_many()`` interface that only hits the cache once. ``get_many()``
|
||||
returns a dictionary with all the keys you asked for that actually exist in the
|
||||
cache (and haven't expired)::
|
||||
|
||||
# There's also a get_many() interface that only hits the cache once.
|
||||
# Also, note that the timeout argument is optional and defaults to what
|
||||
# you've given in the settings file.
|
||||
>>> cache.set('a', 1)
|
||||
>>> cache.set('b', 2)
|
||||
>>> cache.set('c', 3)
|
||||
|
||||
# get_many() returns a dictionary with all the keys you asked for that
|
||||
# actually exist in the cache (and haven't expired).
|
||||
>>> cache.get_many(['a', 'b', 'c'])
|
||||
{'a': 1, 'b': 2, 'c': 3}
|
||||
|
||||
# There's also a way to delete keys explicitly.
|
||||
Finally, you can delete keys explicitly with ``delete()``. This is an easy way
|
||||
of clearing the cache for a particular object::
|
||||
|
||||
>>> cache.delete('a')
|
||||
|
||||
That's it. The cache has very few restrictions: You can cache any object that
|
||||
can be pickled safely, although keys must be strings.
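For readers who want to experiment with these semantics without a configured project, the following toy class mimics the interface described above. It is a hypothetical stand-in, not Django's implementation: ``SketchCache`` is an invented name, the default timeout of 60 seconds is arbitrary, and nothing is pickled.

```python
import time

_MISSING = object()  # sentinel so that a stored None is still "found"


class SketchCache:
    """Toy in-memory cache with set/get/get_many/delete (illustration only)."""

    def __init__(self, default_timeout=60):
        self.default_timeout = default_timeout  # arbitrary value for the sketch
        self._store = {}  # key -> (expiry timestamp, value)

    def set(self, key, value, timeout_seconds=None):
        if timeout_seconds is None:
            timeout_seconds = self.default_timeout
        self._store[key] = (time.monotonic() + timeout_seconds, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries behave as missing
            return default
        return value

    def get_many(self, keys):
        """Return a dict holding only the keys that exist and have not expired."""
        found = {}
        for key in keys:
            value = self.get(key, _MISSING)
            if value is not _MISSING:
                found[key] = value
        return found

    def delete(self, key):
        self._store.pop(key, None)
```

Calling ``SketchCache().get('some_other_key')`` returns ``None``, and passing a second argument returns that default instead, matching the behaviour described for ``cache.get()``.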
|
||||
|
||||
Controlling cache: Using Vary headers
|
||||
=====================================
|
||||
Upstream caches
|
||||
===============
|
||||
|
||||
The Django cache framework works with `HTTP Vary headers`_ to allow developers
|
||||
to instruct caching mechanisms to vary their cache contents depending on
|
||||
request HTTP headers.
|
||||
So far, this document has focused on caching your *own* data. But another type
|
||||
of caching is relevant to Web development, too: caching performed by "upstream"
|
||||
caches. These are systems that cache pages for users even before the request
|
||||
reaches your Web site.
|
||||
|
||||
Essentially, the ``Vary`` response HTTP header defines which request headers a
|
||||
cache mechanism should take into account when building its cache key.
|
||||
Here are a few examples of upstream caches:
|
||||
|
||||
* Your ISP may cache certain pages, so if you requested a page from
|
||||
somedomain.com, your ISP would send you the page without having to access
|
||||
somedomain.com directly.
|
||||
|
||||
* Your Django Web site may sit behind a Squid Web proxy
|
||||
(http://www.squid-cache.org/) that caches pages for performance. In this
|
||||
case, each request first would be handled by Squid, and it'd only be
|
||||
passed to your application if needed.
|
||||
|
||||
* Your Web browser caches pages, too. If a Web page sends out the right
|
||||
headers, your browser will use the local (cached) copy for subsequent
|
||||
requests to that page.
|
||||
|
||||
Upstream caching is a nice efficiency boost, but there's a danger to it:
|
||||
Many Web pages' contents differ based on authentication and a host of other
|
||||
variables, and cache systems that blindly save pages based purely on URLs could
|
||||
expose incorrect or sensitive data to subsequent visitors to those pages.
|
||||
|
||||
For example, say you operate a Web e-mail system, and the contents of the
|
||||
"inbox" page obviously depend on which user is logged in. If an ISP blindly
|
||||
cached your site, then the first user who logged in through that ISP would have
|
||||
his user-specific inbox page cached for subsequent visitors to the site. That's
|
||||
not cool.
|
||||
|
||||
Fortunately, HTTP provides a solution to this problem: A set of HTTP headers
exists to instruct caching mechanisms to vary their cache contents depending
|
||||
on designated variables, and to tell caching mechanisms not to cache particular
|
||||
pages.
|
||||
|
||||
Using Vary headers
|
||||
==================
|
||||
|
||||
One of these headers is ``Vary``. It defines which request headers a cache
|
||||
mechanism should take into account when building its cache key. For example, if
|
||||
the contents of a Web page depend on a user's language preference, the page is
|
||||
said to "vary on language."
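One way to picture "taking request headers into account when building a cache key" is the sketch below. It is an assumption-laden illustration, not the key scheme Django or any particular proxy uses: ``make_cache_key`` is a hypothetical function and the SHA-256 digest is just a convenient way to fold header values into a string.

```python
import hashlib


def make_cache_key(path, request_headers, vary_on=()):
    """Build a key from the path plus the request headers named in vary_on."""
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    digest = hashlib.sha256()
    for name in sorted(header.lower() for header in vary_on):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(lowered.get(name, "").encode("utf-8"))
        digest.update(b"\0")
    return "%s.%s" % (path, digest.hexdigest())
```

Two requests for the same path with different ``Accept-Language`` values get different keys when ``vary_on`` names that header, and the same key when it does not.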
|
||||
|
||||
By default, Django's cache system creates its cache keys using the requested
|
||||
path -- e.g., ``"/stories/2005/jun/23/bank_robbed/"``. This means every request
|
||||
@ -241,7 +407,7 @@ setting the ``Vary`` header (using something like
|
||||
``response['Vary'] = 'user-agent'``) is that the decorator adds to the ``Vary``
|
||||
header (which may already exist) rather than setting it from scratch.
|
||||
|
||||
Note that you can pass multiple headers to ``vary_on_headers()``::
|
||||
You can pass multiple headers to ``vary_on_headers()``::
|
||||
|
||||
@vary_on_headers('User-Agent', 'Cookie')
|
||||
def my_view(request):
|
||||
@ -261,7 +427,8 @@ decorator. These two views are equivalent::
|
||||
Also note that the headers you pass to ``vary_on_headers`` are not case
|
||||
sensitive. ``"User-Agent"`` is the same thing as ``"user-agent"``.
|
||||
|
||||
You can also use a helper function, ``patch_vary_headers()``, directly::
|
||||
You can also use a helper function, ``django.utils.cache.patch_vary_headers``,
|
||||
directly::
|
||||
|
||||
from django.utils.cache import patch_vary_headers
|
||||
def my_view(request):
|
||||
@ -273,7 +440,9 @@ You can also use a helper function, ``patch_vary_headers()``, directly::
|
||||
``patch_vary_headers`` takes an ``HttpResponse`` instance as its first argument
|
||||
and a list/tuple of header names as its second argument.
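The "add to the header rather than replace it" behaviour can be illustrated on a plain dictionary. This is a sketch of the idea only: ``merge_vary`` is a hypothetical helper that works on a ``dict``, whereas the real ``patch_vary_headers`` operates on an ``HttpResponse``.

```python
def merge_vary(headers, new_names):
    """Add header names to headers['Vary'] without dropping existing ones."""
    existing = [name.strip() for name in headers.get("Vary", "").split(",") if name.strip()]
    seen = {name.lower() for name in existing}
    for name in new_names:
        # Header names are not case sensitive, so skip case-insensitive repeats.
        if name.lower() not in seen:
            existing.append(name)
            seen.add(name.lower())
    headers["Vary"] = ", ".join(existing)
    return headers
```

Starting from ``{'Vary': 'Cookie'}`` and adding ``['User-Agent', 'cookie']`` leaves ``Cookie`` in place and appends only ``User-Agent``.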
|
||||
|
||||
.. _`HTTP Vary headers`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
|
||||
For more on Vary headers, see the `official Vary spec`_.
|
||||
|
||||
.. _`official Vary spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
|
||||
|
||||
Controlling cache: Using other headers
|
||||
======================================
|
||||
@ -317,44 +486,25 @@ cache on every access and to store cached versions for, at most, 3600 seconds::
|
||||
def my_view(request):
|
||||
...
|
||||
|
||||
Any valid ``Cache-Control`` directive is valid in ``cache_control()``. For a
|
||||
full list, see the `Cache-Control spec`_. Just pass the directives as keyword
|
||||
arguments to ``cache_control()``, substituting underscores for hyphens. For
|
||||
directives that don't take an argument, set the argument to ``True``.
|
||||
Any valid ``Cache-Control`` HTTP directive is valid in ``cache_control()``.
|
||||
Here's a full list:
|
||||
|
||||
Examples:
|
||||
* ``public=True``
|
||||
* ``private=True``
|
||||
* ``no_cache=True``
|
||||
* ``no_transform=True``
|
||||
* ``must_revalidate=True``
|
||||
* ``proxy_revalidate=True``
|
||||
* ``max_age=num_seconds``
|
||||
* ``s_maxage=num_seconds``
|
||||
|
||||
* ``@cache_control(max_age=3600)`` turns into ``max-age=3600``.
|
||||
* ``@cache_control(public=True)`` turns into ``public``.
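The keyword-to-directive mapping described above (underscores become hyphens, ``True`` means a bare directive) can be sketched as a small function. ``cache_control_header`` is a hypothetical name used for illustration; it only builds the header string and is not the decorator itself.

```python
def cache_control_header(**directives):
    """Render keyword arguments as a Cache-Control header value."""
    parts = []
    for name, value in directives.items():
        token = name.replace("_", "-")  # max_age -> max-age
        if value is True:
            parts.append(token)  # directive that takes no argument
        else:
            parts.append("%s=%s" % (token, value))
    return ", ".join(parts)
```

For instance, ``cache_control_header(must_revalidate=True, max_age=3600)`` produces ``must-revalidate, max-age=3600``.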
|
||||
For an explanation of ``Cache-Control`` HTTP directives, see the `Cache-Control spec`_.
|
||||
|
||||
(Note that the caching middleware already sets the cache header's max-age with
|
||||
the value of the ``CACHE_MIDDLEWARE_SECONDS`` setting. If you use a custom
|
||||
``max_age`` in a ``cache_control`` decorator, the decorator will take
|
||||
precedence, and the header values will be merged correctly.)
|
||||
|
||||
Disabling HTTP caching for a particular view
|
||||
============================================
|
||||
|
||||
If you want to use headers to disable HTTP caching altogether for a particular
|
||||
view, use one of the two utility functions that come with Django:
|
||||
|
||||
* ``django.utils.cache.add_never_cache_headers`` takes a single
|
||||
``HttpResponse`` object as its argument and alters the response to add
|
||||
headers that ensure the response won't be cached by browsers or other
|
||||
caches.
|
||||
* ``django.views.decorators.cache.never_cache`` is a view decorator that does the
|
||||
same thing but can be applied to a view function for convenience.
|
||||
Example::
|
||||
|
||||
    from django.views.decorators.cache import never_cache
    @never_cache
    def myview(request):
        ...
|
||||
|
||||
Note that these functions disable HTTP caching (as described in the 'Controlling
|
||||
Cache' sections of this document) -- they do *not* disable performance caching
|
||||
(as described in the first few sections of this document).
|
||||
|
||||
.. _`Cache-Control spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
|
||||
|
||||
Other optimizations
|
||||
|