mirror of
				https://github.com/django/django.git
				synced 2025-10-24 22:26:08 +00:00 
			
		
		
		
	git-svn-id: http://code.djangoproject.com/svn/django/trunk@15055 bcc190cf-cafb-0310-a4f2-bffc1f526a37
		
			
				
	
	
		
			393 lines
		
	
	
		
			16 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			393 lines
		
	
	
		
			16 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| ============
 | |
| File Uploads
 | |
| ============
 | |
| 
 | |
| .. currentmodule:: django.core.files
 | |
| 
 | |
| When Django handles a file upload, the file data ends up placed in
 | |
| :attr:`request.FILES <django.http.HttpRequest.FILES>` (for more on the
 | |
| ``request`` object see the documentation for :doc:`request and response objects
 | |
| </ref/request-response>`). This document explains how files are stored on disk
 | |
| and in memory, and how to customize the default behavior.
 | |
| 
 | |
| Basic file uploads
 | |
| ==================
 | |
| 
 | |
| Consider a simple form containing a :class:`~django.forms.FileField`::
 | |
| 
 | |
|     from django import forms
 | |
| 
 | |
|     class UploadFileForm(forms.Form):
 | |
|         title = forms.CharField(max_length=50)
 | |
|         file  = forms.FileField()
 | |
| 
 | |
| A view handling this form will receive the file data in
 | |
| :attr:`request.FILES <django.http.HttpRequest.FILES>`, which is a dictionary
 | |
| containing a key for each :class:`~django.forms.FileField` (or
 | |
| :class:`~django.forms.ImageField`, or other :class:`~django.forms.FileField`
 | |
| subclass) in the form. So the data from the above form would
 | |
| be accessible as ``request.FILES['file']``.
 | |
| 
 | |
| Note that :attr:`request.FILES <django.http.HttpRequest.FILES>` will only
 | |
| contain data if the request method was ``POST`` and the ``<form>`` that posted
 | |
| the request has the attribute ``enctype="multipart/form-data"``. Otherwise,
 | |
| ``request.FILES`` will be empty.
 | |
| 
 | |
| Most of the time, you'll simply pass the file data from ``request`` into the
 | |
| form as described in :ref:`binding-uploaded-files`. This would look
 | |
| something like::
 | |
| 
 | |
|     from django.http import HttpResponseRedirect
 | |
|     from django.shortcuts import render_to_response
 | |
| 
 | |
|     # Imaginary function to handle an uploaded file.
 | |
|     from somewhere import handle_uploaded_file
 | |
| 
 | |
|     def upload_file(request):
 | |
|         if request.method == 'POST':
 | |
|             form = UploadFileForm(request.POST, request.FILES)
 | |
|             if form.is_valid():
 | |
|                 handle_uploaded_file(request.FILES['file'])
 | |
|                 return HttpResponseRedirect('/success/url/')
 | |
|         else:
 | |
|             form = UploadFileForm()
 | |
|         return render_to_response('upload.html', {'form': form})
 | |
| 
 | |
| Notice that we have to pass :attr:`request.FILES <django.http.HttpRequest.FILES>`
 | |
| into the form's constructor; this is how file data gets bound into a form.
 | |
| 
 | |
| Handling uploaded files
 | |
| -----------------------
 | |
| 
 | |
| The final piece of the puzzle is handling the actual file data from
 | |
| :attr:`request.FILES <django.http.HttpRequest.FILES>`. Each entry in this
 | |
| dictionary is an ``UploadedFile`` object -- a simple wrapper around an uploaded
 | |
| file. You'll usually use one of these methods to access the uploaded content:
 | |
| 
 | |
|     ``UploadedFile.read()``
 | |
|         Read the entire uploaded data from the file. Be careful with this
 | |
|         method: if the uploaded file is huge it can overwhelm your system if you
 | |
|         try to read it into memory. You'll probably want to use ``chunks()``
 | |
|         instead; see below.
 | |
| 
 | |
|     ``UploadedFile.multiple_chunks()``
 | |
|         Returns ``True`` if the uploaded file is big enough to require
 | |
|         reading in multiple chunks. By default this will be any file
 | |
|         larger than 2.5 megabytes, but that's configurable; see below.
 | |
| 
 | |
|     ``UploadedFile.chunks()``
 | |
|         A generator returning chunks of the file. If ``multiple_chunks()`` is
 | |
|         ``True``, you should use this method in a loop instead of ``read()``.
 | |
| 
 | |
|         In practice, it's often easiest simply to use ``chunks()`` all the time;
 | |
|         see the example below.
 | |
| 
 | |
|     ``UploadedFile.name``
 | |
|         The name of the uploaded file (e.g. ``my_file.txt``).
 | |
| 
 | |
|     ``UploadedFile.size``
 | |
|         The size, in bytes, of the uploaded file.
 | |
| 
 | |
| There are a few other methods and attributes available on ``UploadedFile``
 | |
| objects; see `UploadedFile objects`_ for a complete reference.
 | |
| 
 | |
| Putting it all together, here's a common way you might handle an uploaded file::
 | |
| 
 | |
|     def handle_uploaded_file(f):
 | |
|         destination = open('some/file/name.txt', 'wb+')
 | |
|         for chunk in f.chunks():
 | |
|             destination.write(chunk)
 | |
|         destination.close()
 | |
| 
 | |
| Looping over ``UploadedFile.chunks()`` instead of using ``read()`` ensures that
 | |
| large files don't overwhelm your system's memory.
 | |
| 
 | |
| Where uploaded data is stored
 | |
| -----------------------------
 | |
| 
 | |
| Before you save uploaded files, the data needs to be stored somewhere.
 | |
| 
 | |
| By default, if an uploaded file is smaller than 2.5 megabytes, Django will hold
 | |
| the entire contents of the upload in memory. This means that saving the file
 | |
| involves only a read from memory and a write to disk and thus is very fast.
 | |
| 
 | |
| However, if an uploaded file is too large, Django will write the uploaded file
 | |
| to a temporary file stored in your system's temporary directory. On a Unix-like
 | |
| platform this means you can expect Django to generate a file called something
 | |
| like ``/tmp/tmpzfp6I6.upload``. If an upload is large enough, you can watch this
 | |
| file grow in size as Django streams the data onto disk.
 | |
| 
 | |
| These specifics -- 2.5 megabytes; ``/tmp``; etc. -- are simply "reasonable
 | |
| defaults". Read on for details on how you can customize or completely replace
 | |
| upload behavior.
 | |
| 
 | |
| Changing upload handler behavior
 | |
| --------------------------------
 | |
| 
 | |
| Three settings control Django's file upload behavior:
 | |
| 
 | |
|     :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE`
 | |
|         The maximum size, in bytes, for files that will be uploaded into memory.
 | |
|         Files larger than :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be
 | |
|         streamed to disk.
 | |
| 
 | |
|         Defaults to 2.5 megabytes.
 | |
| 
 | |
|     :setting:`FILE_UPLOAD_TEMP_DIR`
 | |
|         The directory where uploaded files larger than
 | |
|         :setting:`FILE_UPLOAD_MAX_MEMORY_SIZE` will be stored.
 | |
| 
 | |
|         Defaults to your system's standard temporary directory (i.e. ``/tmp`` on
 | |
|         most Unix-like systems).
 | |
| 
 | |
|     :setting:`FILE_UPLOAD_PERMISSIONS`
 | |
|         The numeric mode (i.e. ``0644``) to set newly uploaded files to. For
 | |
|         more information about what these modes mean, see the `documentation for
 | |
|         os.chmod`_
 | |
| 
 | |
|         If this isn't given or is ``None``, you'll get operating-system
 | |
|         dependent behavior. On most platforms, temporary files will have a mode
 | |
|         of ``0600``, and files saved from memory will be saved using the
 | |
|         system's standard umask.
 | |
| 
 | |
|         .. warning::
 | |
| 
 | |
|             If you're not familiar with file modes, please note that the leading
 | |
|             ``0`` is very important: it indicates an octal number, which is the
 | |
|             way that modes must be specified. If you try to use ``644``, you'll
 | |
|             get totally incorrect behavior.
 | |
| 
 | |
|             **Always prefix the mode with a 0.**
 | |
| 
 | |
|     :setting:`FILE_UPLOAD_HANDLERS`
 | |
|         The actual handlers for uploaded files. Changing this setting allows
 | |
|         complete customization -- even replacement -- of Django's upload
 | |
|         process. See `upload handlers`_, below, for details.
 | |
| 
 | |
|         Defaults to::
 | |
| 
 | |
|             ("django.core.files.uploadhandler.MemoryFileUploadHandler",
 | |
|              "django.core.files.uploadhandler.TemporaryFileUploadHandler",)
 | |
| 
 | |
|         Which means "try to upload to memory first, then fall back to temporary
 | |
|         files."
 | |
| 
 | |
| .. _documentation for os.chmod: http://docs.python.org/library/os.html#os.chmod
 | |
| 
 | |
| ``UploadedFile`` objects
 | |
| ========================
 | |
| 
 | |
| .. class:: UploadedFile
 | |
| 
 | |
| In addition to those inherited from :class:`File`, all ``UploadedFile`` objects
 | |
| define the following methods/attributes:
 | |
| 
 | |
|     ``UploadedFile.content_type``
 | |
|         The content-type header uploaded with the file (e.g. ``text/plain`` or
 | |
|         ``application/pdf``). Like any data supplied by the user, you shouldn't
 | |
|         trust that the uploaded file is actually this type. You'll still need to
 | |
|         validate that the file contains the content that the content-type header
 | |
|         claims -- "trust but verify."
 | |
| 
 | |
|     ``UploadedFile.charset``
 | |
|         For ``text/*`` content-types, the character set (i.e. ``utf8``) supplied
 | |
|         by the browser. Again, "trust but verify" is the best policy here.
 | |
| 
 | |
|     ``UploadedFile.temporary_file_path()``
 | |
|         Only files uploaded onto disk will have this method; it returns the full
 | |
|         path to the temporary uploaded file.
 | |
| 
 | |
| .. note::
 | |
| 
 | |
|     Like regular Python files, you can read the file line-by-line simply by
 | |
|     iterating over the uploaded file:
 | |
| 
 | |
|     .. code-block:: python
 | |
| 
 | |
|         for line in uploadedfile:
 | |
|             do_something_with(line)
 | |
| 
 | |
|     However, *unlike* standard Python files, :class:`UploadedFile` only
 | |
|     understands ``\n`` (also known as "Unix-style") line endings. If you know
 | |
|     that you need to handle uploaded files with different line endings, you'll
 | |
|     need to do so in your view.
 | |
| 
 | |
| Upload Handlers
 | |
| ===============
 | |
| 
 | |
| When a user uploads a file, Django passes off the file data to an *upload
 | |
| handler* -- a small class that handles file data as it gets uploaded. Upload
 | |
| handlers are initially defined in the ``FILE_UPLOAD_HANDLERS`` setting, which
 | |
| defaults to::
 | |
| 
 | |
|     ("django.core.files.uploadhandler.MemoryFileUploadHandler",
 | |
|      "django.core.files.uploadhandler.TemporaryFileUploadHandler",)
 | |
| 
 | |
| Together the ``MemoryFileUploadHandler`` and ``TemporaryFileUploadHandler``
 | |
| provide Django's default file upload behavior of reading small files into memory
 | |
| and large ones onto disk.
 | |
| 
 | |
| You can write custom handlers that customize how Django handles files. You
 | |
| could, for example, use custom handlers to enforce user-level quotas, compress
 | |
| data on the fly, render progress bars, and even send data to another storage
 | |
| location directly without storing it locally.
 | |
| 
 | |
| Modifying upload handlers on the fly
 | |
| ------------------------------------
 | |
| 
 | |
| Sometimes particular views require different upload behavior. In these cases,
 | |
| you can override upload handlers on a per-request basis by modifying
 | |
| ``request.upload_handlers``. By default, this list will contain the upload
 | |
| handlers given by ``FILE_UPLOAD_HANDLERS``, but you can modify the list as you
 | |
| would any other list.
 | |
| 
 | |
| For instance, suppose you've written a ``ProgressBarUploadHandler`` that
 | |
| provides feedback on upload progress to some sort of AJAX widget. You'd add this
 | |
| handler to your upload handlers like this::
 | |
| 
 | |
|     request.upload_handlers.insert(0, ProgressBarUploadHandler())
 | |
| 
 | |
| You'd probably want to use ``list.insert()`` in this case (instead of
 | |
| ``append()``) because a progress bar handler would need to run *before* any
 | |
| other handlers. Remember, the upload handlers are processed in order.
 | |
| 
 | |
| If you want to replace the upload handlers completely, you can just assign a new
 | |
| list::
 | |
| 
 | |
|    request.upload_handlers = [ProgressBarUploadHandler()]
 | |
| 
 | |
| .. note::
 | |
| 
 | |
|     You can only modify upload handlers *before* accessing
 | |
|     ``request.POST`` or ``request.FILES`` -- it doesn't make sense to
 | |
|     change upload handlers after upload handling has already
 | |
|     started. If you try to modify ``request.upload_handlers`` after
 | |
|     reading from ``request.POST`` or ``request.FILES`` Django will
 | |
|     throw an error.
 | |
| 
 | |
|     Thus, you should always modify uploading handlers as early in your view as
 | |
|     possible.
 | |
| 
 | |
|     Also, ``request.POST`` is accessed by
 | |
|     :class:`~django.middleware.csrf.CsrfViewMiddleware` which is enabled by
 | |
|     default. This means you will probably need to use
 | |
|     :func:`~django.views.decorators.csrf.csrf_exempt` on your view to allow you
 | |
|     to change the upload handlers. Assuming you do need CSRF protection, you
 | |
|     will then need to use :func:`~django.views.decorators.csrf.csrf_protect` on
 | |
|     the function that actually processes the request.  Note that this means that
 | |
|     the handlers may start receiving the file upload before the CSRF checks have
 | |
|     been done. Example code:
 | |
| 
 | |
|     .. code-block:: python
 | |
| 
 | |
|         from django.views.decorators.csrf import csrf_exempt, csrf_protect
 | |
| 
 | |
|         @csrf_exempt
 | |
|         def upload_file_view(request):
 | |
|             request.upload_handlers.insert(0, ProgressBarUploadHandler())
 | |
|             return _upload_file_view(request)
 | |
| 
 | |
|         @csrf_protect
 | |
|         def _upload_file_view(request):
 | |
|             ... # Process request
 | |
| 
 | |
| 
 | |
| Writing custom upload handlers
 | |
| ------------------------------
 | |
| 
 | |
| All file upload handlers should be subclasses of
 | |
| ``django.core.files.uploadhandler.FileUploadHandler``. You can define upload
 | |
| handlers wherever you wish.
 | |
| 
 | |
| Required methods
 | |
| ~~~~~~~~~~~~~~~~
 | |
| 
 | |
| Custom file upload handlers **must** define the following methods:
 | |
| 
 | |
|     ``FileUploadHandler.receive_data_chunk(self, raw_data, start)``
 | |
|         Receives a "chunk" of data from the file upload.
 | |
| 
 | |
|         ``raw_data`` is a byte string containing the uploaded data.
 | |
| 
 | |
|         ``start`` is the position in the file where this ``raw_data`` chunk
 | |
|         begins.
 | |
| 
 | |
|         The data you return will get fed into the subsequent upload handlers'
 | |
|         ``receive_data_chunk`` methods. In this way, one handler can be a
 | |
|         "filter" for other handlers.
 | |
| 
 | |
|         Return ``None`` from ``receive_data_chunk`` to sort-circuit remaining
 | |
|         upload handlers from getting this chunk.. This is useful if you're
 | |
|         storing the uploaded data yourself and don't want future handlers to
 | |
|         store a copy of the data.
 | |
| 
 | |
|         If you raise a ``StopUpload`` or a ``SkipFile`` exception, the upload
 | |
|         will abort or the file will be completely skipped.
 | |
| 
 | |
|     ``FileUploadHandler.file_complete(self, file_size)``
 | |
|         Called when a file has finished uploading.
 | |
| 
 | |
|         The handler should return an ``UploadedFile`` object that will be stored
 | |
|         in ``request.FILES``. Handlers may also return ``None`` to indicate that
 | |
|         the ``UploadedFile`` object should come from subsequent upload handlers.
 | |
| 
 | |
| Optional methods
 | |
| ~~~~~~~~~~~~~~~~
 | |
| 
 | |
| Custom upload handlers may also define any of the following optional methods or
 | |
| attributes:
 | |
| 
 | |
|     ``FileUploadHandler.chunk_size``
 | |
|         Size, in bytes, of the "chunks" Django should store into memory and feed
 | |
|         into the handler. That is, this attribute controls the size of chunks
 | |
|         fed into ``FileUploadHandler.receive_data_chunk``.
 | |
| 
 | |
|         For maximum performance the chunk sizes should be divisible by ``4`` and
 | |
|         should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are
 | |
|         multiple chunk sizes provided by multiple handlers, Django will use the
 | |
|         smallest chunk size defined by any handler.
 | |
| 
 | |
|         The default is 64*2\ :sup:`10` bytes, or 64 KB.
 | |
| 
 | |
|     ``FileUploadHandler.new_file(self, field_name, file_name, content_type, content_length, charset)``
 | |
|         Callback signaling that a new file upload is starting. This is called
 | |
|         before any data has been fed to any upload handlers.
 | |
| 
 | |
|         ``field_name`` is a string name of the file ``<input>`` field.
 | |
| 
 | |
|         ``file_name`` is the unicode filename that was provided by the browser.
 | |
| 
 | |
|         ``content_type`` is the MIME type provided by the browser -- E.g.
 | |
|         ``'image/jpeg'``.
 | |
| 
 | |
|         ``content_length`` is the length of the image given by the browser.
 | |
|         Sometimes this won't be provided and will be ``None``.
 | |
| 
 | |
|         ``charset`` is the character set (i.e. ``utf8``) given by the browser.
 | |
|         Like ``content_length``, this sometimes won't be provided.
 | |
| 
 | |
|         This method may raise a ``StopFutureHandlers`` exception to prevent
 | |
|         future handlers from handling this file.
 | |
| 
 | |
|     ``FileUploadHandler.upload_complete(self)``
 | |
|         Callback signaling that the entire upload (all files) has completed.
 | |
| 
 | |
|     ``FileUploadHandler.handle_raw_input(self, input_data, META, content_length, boundary, encoding)``
 | |
|         Allows the handler to completely override the parsing of the raw
 | |
|         HTTP input.
 | |
| 
 | |
|         ``input_data`` is a file-like object that supports ``read()``-ing.
 | |
| 
 | |
|         ``META`` is the same object as ``request.META``.
 | |
| 
 | |
|         ``content_length`` is the length of the data in ``input_data``. Don't
 | |
|         read more than ``content_length`` bytes from ``input_data``.
 | |
| 
 | |
|         ``boundary`` is the MIME boundary for this request.
 | |
| 
 | |
|         ``encoding`` is the encoding of the request.
 | |
| 
 | |
|         Return ``None`` if you want upload handling to continue, or a tuple of
 | |
|         ``(POST, FILES)`` if you want to return the new data structures suitable
 | |
|         for the request directly.
 |