How Django processes a request
In a comment he left yesterday, Jonathan Snook posed an excellent challenge: document the chain of how Django processes a request, from start to finish, with plenty of detail on the various things being called internally and links to the appropriate documentation.
Simon Willison once wrote such a document, but it was a fairly high-level view and a fair number of things have changed since then, so I’m going to take a stab at it myself, and hopefully the result will be comprehensible.
Note: this is a first draft. Not a finished product, not a complete listing. Expect it to change frequently as I work on it. Ideally I’ll be able to get some help generating a visualization at some point, but for now I’ll stick to plain text.
Where official documentation for an item is available I’ll link to it. For items which don’t yet have documentation I’ll provide links to the current locations of the relevant code in the Django repository — these locations may change over time, particularly because I’m often going to be linking to line numbers in individual files, but I’ll do my best to keep them up to date. If you see something here that’s incorrect, or spot something I’ve left out or could have explained better, please leave a comment to let me know.
Updated October 2, 2006: I’ve just gone through and done quite a bit of updating of this article; several references to places in the code needed to be updated, and I rewrote a few things to reflect some shuffling around of internal bits that’s taken place recently.
Updated November 28, 2006: Added a note about when the database connection is closed.
Updated December 20, 2006: linked to notes on the Django wiki about the dispatcher.
Let’s get started.
Incoming!
The very first thing that happens is that something else invokes Django. That happens in one of two ways:
- Apache/mod_python is the server setup, in which case the request is handed to Django by mod_python creating an instance of django.core.handlers.modpython.ModPythonHandler.
- Something else is the server, in which case it needs to be WSGI-compliant. In this situation, an instance of django.core.handlers.wsgi.WSGIHandler is created by the server.
Both of those classes inherit from django.core.handlers.base.BaseHandler, which contains common code needed for any type of request.
I’ve got a handle on it
When one of the above handlers is instantiated, a couple things happen immediately:
- The handler imports your Django settings file.
- The handler imports Django’s custom exception classes.
- The handler calls its own load_middleware method, which loads all the middleware classes it finds listed in the MIDDLEWARE_CLASSES setting and introspects them.
That last bit is somewhat complicated, so let’s look at it in detail.
A middleware class can hook into any of four phases of processing: request, view, response and exception. It does so by defining methods named, appropriately enough, process_request, process_view, process_response and process_exception. The middleware can define any or all of these depending on what functionality it wants to provide.
When the handler introspects the middleware, it looks for methods with those names, and builds up four lists which are stored as instance variables of the handler:
- _request_middleware is a list of the process_request methods (in each case these will be the actual methods, so they’re directly callable) from any middleware classes which defined them.
- _view_middleware is a list of the process_view methods from any middleware classes which defined them.
- _response_middleware is a list of the process_response methods from any middleware classes which defined them.
- _exception_middleware is a list of the process_exception methods from any middleware classes which defined them.
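To make the introspection step concrete, here is a rough sketch in plain Python. It is a simplified model, not Django’s actual code: SketchHandler, the HOOKS mapping and the two middleware classes are hypothetical names invented for illustration, and the sketch does not try to model the order in which Django stores each list.

```python
# Simplified model of middleware introspection. This is a sketch, not Django's code.

class RequestOnlyMiddleware:
    """Hypothetical middleware that hooks only the request phase."""

    def process_request(self, request):
        return None


class RequestAndResponseMiddleware:
    """Hypothetical middleware that hooks the request and response phases."""

    def process_request(self, request):
        return None

    def process_response(self, request, response):
        return response


# Maps each hook method name to the handler attribute that collects it.
HOOKS = {
    "process_request": "_request_middleware",
    "process_view": "_view_middleware",
    "process_response": "_response_middleware",
    "process_exception": "_exception_middleware",
}


class SketchHandler:
    def load_middleware(self, middleware_classes):
        # Start with four empty lists, one per phase.
        for list_name in HOOKS.values():
            setattr(self, list_name, [])
        for middleware_class in middleware_classes:
            instance = middleware_class()
            for method_name, list_name in HOOKS.items():
                if hasattr(instance, method_name):
                    # Store the bound method itself so it is directly callable.
                    getattr(self, list_name).append(getattr(instance, method_name))


handler = SketchHandler()
handler.load_middleware([RequestOnlyMiddleware, RequestAndResponseMiddleware])
```

After this runs, the sketch handler holds two request methods, one response method, and empty view and exception lists.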
Green light: now begin
Now the handler is ready to really begin processing, so it fires the dispatcher signal request_started (the internal dispatcher in Django allows various components to advertise what they’re doing, and allows other bits of code to listen for certain events; it’s currently not documented officially, but there are some notes on the wiki). Then it instantiates a subclass of django.http.HttpRequest. Depending on the handler, this may be an instance of django.core.handlers.modpython.ModPythonRequest or it may be an instance of django.core.handlers.wsgi.WSGIRequest. The two different classes are needed because the mod_python and WSGI APIs pass in the request information in different formats, and that information needs to be parsed into a single standard format Django can work with.
Once an HttpRequest of some sort exists, the handler calls its own get_response method, passing the HttpRequest as the only argument. This is where nearly all of the actual activity happens.
Middleware, round 1
The first thing get_response does is loop through the handler’s _request_middleware instance variable and call each method in that list, passing in the HttpRequest instance as an argument. These methods have the option of short-circuiting the rest of the process and immediately causing get_response to return, by returning a value themselves (if they do so, the return value needs to be an instance of django.http.HttpResponse, which we’ll talk about in a bit). If one of them does so, that’s it and we’re back into the main handler code; get_response won’t even wait to see what the other middleware classes wanted to do, it will just return and the handler will go into its response phase.
More commonly, though, the middleware methods applied here simply do some processing and decide whether to add, remove or supplement attributes of the request.
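The short-circuit behaviour can be sketched like this. It is a simplified model rather than Django’s code: FakeResponse stands in for django.http.HttpResponse so the example runs without Django, a plain dictionary stands in for the request, and the three middleware functions are hypothetical.

```python
class FakeResponse:
    """Stand-in for django.http.HttpResponse (assumption, for illustration only)."""

    def __init__(self, content=""):
        self.content = content


def apply_request_middleware(request, request_middleware):
    """Call each method in order; the first one to return a response wins."""
    for method in request_middleware:
        response = method(request)
        if response is not None:
            # Short-circuit: later middleware methods are never called.
            return response
    return None


calls = []


def tag_request(request):
    # The common case: adjust the request and return nothing.
    calls.append("tag")
    request["tagged"] = True


def block_request(request):
    # The short-circuit case: return a response immediately.
    calls.append("block")
    return FakeResponse("blocked")


def never_reached(request):
    calls.append("never")


request = {}
result = apply_request_middleware(request, [tag_request, block_request, never_reached])
```

Here the third function is never called, because the second one returned a response.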
Resolution time
Assuming that none of the middlewares which acted on the request short-circuited straight to a response, the handler next tries to resolve the requested URL. It looks in the settings file for a setting called ROOT_URLCONF, and hands that, along with a base URL of /, as arguments to create an instance of django.core.urlresolvers.RegexURLResolver, then calls the RegexURLResolver’s resolve method with the requested URL path.
The URL resolver follows a fairly simple pattern. For each item in the urlpatterns list generated by the URL configuration file specified by the ROOT_URLCONF setting, it checks whether the requested URL path matches that item’s regular expression; if so, there are two options:
- If the item has a call to include, the resolver chops off the bit of the URL that matched, moves to the URL configuration file specified by the include and begins iterating over the items in its urlpatterns list. Depending on the depth and modularity of your URL hierarchy, this may be repeated several times.
- Otherwise, the resolver returns three items: the view function specified by the matched item, a list of non-named matched groups from the URL (to be used as positional arguments for the view) and a dictionary of keyword arguments, built from a combination of any named matched groups in the URL and any extra keyword arguments specified in that line in the URLConf.
Note that this stops at the first match which specifies a view, so it’s best to have your URL configuration proceed from more specific regexes to less specific ones, in order to ensure that the resolver doesn’t match one of the less specific ones first and end up returning the wrong view function.
If no matches are found, the resolver raises the exception django.core.urlresolvers.Resolver404, a subclass of the exception django.http.Http404. We’ll get to how that’s handled a little later on.
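A toy resolver shows the shape of this loop. It is only a sketch under several assumptions: patterns are plain tuples rather than Django’s pattern objects, Include and the view functions are hypothetical, Resolver404 here is a local stand-in for the real exception, paths are written without the leading slash, and positional groups are used only when a pattern has no named groups.

```python
import re


class Resolver404(Exception):
    """Local stand-in for django.core.urlresolvers.Resolver404."""


class Include:
    """Hypothetical marker for a nested list of patterns."""

    def __init__(self, patterns):
        self.patterns = patterns


def resolve(path, patterns):
    for regex, target, *rest in patterns:
        extra = rest[0] if rest else {}
        match = re.match(regex, path)
        if match is None:
            continue
        if isinstance(target, Include):
            # Chop off the part that matched and descend into the included patterns.
            try:
                return resolve(path[match.end():], target.patterns)
            except Resolver404:
                continue
        kwargs = match.groupdict()
        # Assumption: use positional groups only when nothing was named.
        args = () if kwargs else match.groups()
        kwargs.update(extra)
        return target, args, kwargs
    raise Resolver404(path)


def article_detail(request, year, slug):
    return "detail"


def archive(request, year):
    return "archive"


def about(request, template):
    return "about"


blog_patterns = [
    (r"^(?P<year>\d{4})/(?P<slug>[\w-]+)/$", article_detail),
    (r"^(\d{4})/$", archive),
]

root_patterns = [
    (r"^blog/", Include(blog_patterns)),
    (r"^about/$", about, {"template": "about.html"}),
]
```

Resolving "blog/2006/some-post/" against root_patterns descends into blog_patterns and returns article_detail with keyword arguments for year and slug; a path that matches nothing raises the stand-in Resolver404.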
Middleware, round 2
Once it knows the view function it will be using and what arguments to pass to it, the handler looks at its _view_middleware list, and calls each method in that list, passing the HttpRequest, the view function, the list of positional arguments for the view and the dictionary of keyword arguments for the view.
Again, it’s possible for middleware to intervene at this stage and force the handler to return immediately.
Into the view
If processing is still going at this point, the handler calls the view function. Views in Django are somewhat nebulous because only a few requirements are placed on them:
- They must be callable.
- They must accept as their first positional argument an instance of django.http.HttpRequest.
- They must either raise an exception or return an instance of django.http.HttpResponse.
Beyond that, the sky’s the limit. Most commonly, though, views will use Django’s database API to create, retrieve, update or delete something in the database, and they’ll load and render a template to display something to the end user.
Templates
Django’s template system is two-faceted: there’s one part that’s HTML with a few extra things mixed in and is mostly used by designers, and one part that’s pure Python and is used by programmers.
From an HTML author’s point of view, Django’s template system is pretty simple. There are three constructs you need to know about:
- Variable references. In a template they look like this: {{ foo }}.
- Template filters, which act on the above. Using the filter bar on the example above would look like this: {{ foo|bar }}. Generally these are used for output formatting (e.g., running Textile over something, formatting a date, etc.).
- Template tags. They look like this: {% baz %}. This is where the “logic” of templates is implemented; you can do things like {% if foo %}, {% for bar in foo %}, and so on, and if and for are template tags.
Variable references work in a fairly simple way; if you’re just printing a variable directly, as in {{ foo }}, the template system just outputs it. The only complexity here is when you do something like {{ foo.bar }}; in that case, the template system tries a few things in order:
- First it tries a dictionary-style lookup, to see if foo['bar'] exists. If it does, then that value is output and that’s the end of the process.
- If dictionary lookup fails, next the template system tries an attribute lookup, to see if foo.bar exists. It also checks whether the attribute is callable, and tries calling it in that case.
- If attribute lookup fails, the template system tries looking it up as a list index.
If all of these fail, the template system outputs the value of the setting TEMPLATE_STRING_IF_INVALID, which defaults to an empty string.
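The lookup order can be modelled in a few lines of plain Python. This is a sketch of the behaviour described above, not the template system’s real implementation; resolve_lookup and the Person class are hypothetical names, and the string_if_invalid parameter stands in for the setting.

```python
def resolve_lookup(current, bit, string_if_invalid=""):
    """Resolve one dotted step (the 'bar' in 'foo.bar') against `current`."""
    # 1. Dictionary-style lookup.
    try:
        return current[bit]
    except (TypeError, KeyError, IndexError, AttributeError):
        pass
    # 2. Attribute lookup, calling the attribute if it is callable.
    try:
        value = getattr(current, bit)
    except (AttributeError, TypeError):
        pass
    else:
        return value() if callable(value) else value
    # 3. List-index lookup.
    try:
        return current[int(bit)]
    except (ValueError, TypeError, KeyError, IndexError):
        # Everything failed: fall back to the "invalid" string.
        return string_if_invalid


class Person:
    name = "Ada"

    def greeting(self):
        return "hello"


from_dict = resolve_lookup({"bar": 1}, "bar")
from_attribute = resolve_lookup(Person(), "name")
from_method = resolve_lookup(Person(), "greeting")
from_list = resolve_lookup(["a", "b"], "1")
missing = resolve_lookup({"bar": 1}, "missing")
```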
Template filters are simply Python functions which accept a value and an argument, and return a new value. For example, the date filter takes a Python datetime object as its value and a format string as its argument (the format characters follow PHP’s date() function rather than Python’s strftime), and returns the result of applying that format string to that datetime object.
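As a small illustration of the “value plus argument in, new value out” shape, here is a sketch with two made-up filters. Neither truncate_filter nor suffix_filter is a real Django filter, and apply_filters is a hypothetical helper that only mimics how a chain of filters is applied from left to right.

```python
def truncate_filter(value, arg):
    """Hypothetical filter: keep at most `arg` characters of `value`."""
    return str(value)[:int(arg)]


def suffix_filter(value, arg):
    """Hypothetical filter: append `arg` to `value`."""
    return f"{value}{arg}"


def apply_filters(value, filters):
    """Apply (function, argument) pairs in order, feeding each result to the next."""
    for func, arg in filters:
        value = func(value, arg)
    return value


result = apply_filters(
    "How Django processes a request",
    [(truncate_filter, "10"), (suffix_filter, "...")],
)
```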
Template tags are where things are a bit complicated, and where you get close to how the Django template system really works.
The structure of a Django template
Internally, a Django template is represented as a collection of ‘nodes’; these are Python classes which all inherit from the base node class django.template.Node. Nodes can do various sorts of processing, but they have one thing in common: every Node must have a method called render which accepts as its second argument (the first argument, of course, will be the Node instance) an instance of django.template.Context, which is a dictionary-like object containing all the variables which are accessible to the template. The render method of a Node must return a string, but if the Node is meant to carry out some task other than output (for example, if it’s meant to modify the template context by adding, removing or modifying variables in the Context instance passed to it) it can return an empty string.
Django includes a number of subclasses of Node which provide useful functions; each of the built-in template tags, for example, is handled by a subclass of Node (e.g., there’s an IfNode which implements the if tag, a ForNode which implements the for tag, etc.). All of the built-in tags are found in django.template.defaulttags. In reality, all of the template constructs described above are Nodes of some sort, and so is plain text: a variable lookup is handled by a VariableNode; filters, by their nature, get applied in a VariableNode; tags are Nodes of various types; and plain text is a TextNode.
Generally, a view renders a template by carrying out the following steps, in order:
- Loading the template to be rendered; this is handled by the function django.template.loader.get_template, which can use any of several methods to locate the desired template file. The get_template function returns an instance of django.template.Template, which is an object containing the parsed template and methods for using it.
- Instantiating a Context to be used in rendering the template; if the Context subclass django.template.RequestContext is used, additional context-processing functions will be applied which can automatically add variables which were not defined in the view. The constructor method for Context takes a dictionary of key/value pairs (which will become the variable names/values for the template) as its only argument; RequestContext takes an instance of HttpRequest and a dictionary.
- Calling the Template instance’s render method, with the Context object as the first positional argument.
The return value of the Template’s render method is a string, which is the concatenation of the return values of the render methods of all the Template’s constituent Nodes, called in the order in which they occur in the Template.
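The node structure is easier to see in miniature. The classes below borrow the names Node, TextNode, VariableNode and Template, but they are simplified stand-ins written for this sketch and not the real django.template classes; SetNode is a hypothetical tag node, and a plain dictionary plays the role of the Context.

```python
class Node:
    def render(self, context):
        raise NotImplementedError


class TextNode(Node):
    """Plain text: renders as itself."""

    def __init__(self, text):
        self.text = text

    def render(self, context):
        return self.text


class VariableNode(Node):
    """Variable reference: looks its name up in the context."""

    def __init__(self, name):
        self.name = name

    def render(self, context):
        return str(context.get(self.name, ""))


class SetNode(Node):
    """Hypothetical tag node: modifies the context and outputs nothing."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def render(self, context):
        context[self.name] = self.value
        return ""


class Template:
    def __init__(self, nodes):
        self.nodes = nodes

    def render(self, context):
        # Concatenate each node's output, in the order the nodes occur.
        return "".join(node.render(context) for node in self.nodes)


template = Template([
    TextNode("Hello, "),
    SetNode("name", "world"),
    VariableNode("name"),
    TextNode("!"),
])
output = template.render({})
```

Because the nodes are rendered in order, the SetNode has already added name to the context by the time the VariableNode looks it up.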
Response time, sort of
Once a template has been rendered, or some other sort of suitable output has been generated, the view is responsible for creating an instance of django.http.HttpResponse. The constructor for this class takes two optional arguments:
- A string (which should be the first positional argument, or the keyword argument content) which will be the body of the response. Most of the time, this will be the output of rendering a template, but it doesn’t have to be; you can pass any valid Python string in here.
- A value for the Content-Type header of the response (which should be the second positional argument, or the keyword argument mimetype). If you don’t provide this argument, Django will fill in the value of the setting DEFAULT_CONTENT_TYPE and the value of the setting DEFAULT_CHARSET which, if you haven’t changed them from the defaults in Django’s global settings file, will be “text/html” and “utf-8”, respectively.
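To show how the two defaults combine into one header value, here is a minimal sketch. SketchResponse and build_content_type are hypothetical stand-ins, not Django’s HttpResponse, and the parameter names are chosen for this example only.

```python
def build_content_type(media_type="text/html", charset="utf-8"):
    """Combine a media type and a charset into a Content-Type header value."""
    return f"{media_type}; charset={charset}"


class SketchResponse:
    """Stand-in response object with a body and a Content-Type header."""

    def __init__(self, content="", content_type=None):
        self.content = content
        # Fall back to the combined defaults when no value is given.
        self.headers = {"Content-Type": content_type or build_content_type()}


default_response = SketchResponse("<p>Hello</p>")
plain_response = SketchResponse("Hello", "text/plain")
```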
Middleware, round 3: exception edition
If the view function, or something which happens inside it, raises an exception, then get_response (I know we spent a while digging into views and templates, but once the view returns or raises we pick up again in the middle of the handler’s get_response method) will loop through its _exception_middleware instance variable and call each method there, passing the HttpRequest and the exception as arguments. Hopefully one of those methods will instantiate and return an HttpResponse.
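In sketch form, the view call and the exception round look roughly like this. It is a simplified model and not Django’s code: call_view and the two middleware functions are hypothetical, and strings stand in for HttpResponse objects.

```python
def call_view(request, view, args, kwargs, exception_middleware):
    """Call the view; on failure, give each exception middleware method a chance."""
    try:
        return view(request, *args, **kwargs)
    except Exception as exc:
        for method in exception_middleware:
            response = method(request, exc)
            if response is not None:
                return response
        # Nothing handled it, so the exception propagates to the caller.
        raise


def broken_view(request):
    raise ValueError("boom")


def ignore_exception(request, exc):
    # Declines to handle the exception.
    return None


def describe_exception(request, exc):
    # Handles the exception by producing a (stand-in) response.
    return f"handled: {exc}"


handled = call_view({}, broken_view, (), {}, [ignore_exception, describe_exception])
```

If no middleware method returns a response, the exception is still unhandled when call_view exits.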
Still not responding?
At this point it’s possible that there still isn’t an HttpResponse; this could be due to a number of factors:
- The view might not have returned a value.
- The view might have raised an exception that none of the middleware was able to deal with.
- A middleware method that was trying to deal with an exception might have raised a new exception itself.
When this happens, get_response falls back to its own exception handling mechanisms; these come in several layers:
- If the exception was Http404 and the DEBUG setting is True, get_response will execute the view django.views.debug.technical_404_response, passing the HttpRequest and the exception as arguments. This view displays information about the patterns the URL resolver tried to match against.
- If DEBUG is False, and the exception was Http404, get_response calls the URL resolver’s resolve_404 method; this method looks at the URL configuration to determine which view has been specified for handling 404 errors. This defaults to django.views.defaults.page_not_found, but can be overridden in the URL configuration by assigning a value to the variable handler404.
- For any other type of exception, if the DEBUG setting is True, get_response will execute the view django.views.debug.technical_500_response, passing the HttpRequest and exception information as arguments. This view provides detailed information about the exception, including the traceback, local variables at each level of the stack, a detailed representation of the HttpRequest object and a listing of all non-sensitive settings.
- For any other type of exception, if DEBUG is False, get_response calls the URL resolver’s resolve_500 method, which works in mostly the same way as resolve_404; the default view in this case is django.views.defaults.server_error, and can be overridden in the URL configuration by assigning a value to the variable handler500.
Additionally, for any exception other than django.http.Http404 or Python’s built-in SystemExit, the handler will fire the dispatcher signal got_request_exception, and construct a description of the exception which is mailed to each person listed in the Django settings file’s ADMINS setting before returning.
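The four fallback layers boil down to a two-way decision on the exception type and the DEBUG setting. The sketch below only returns a label for the view that would be used; pick_fallback is a hypothetical helper, Http404 is a local stand-in class, and the signal and the email to ADMINS are left out.

```python
class Http404(Exception):
    """Local stand-in for django.http.Http404."""


def pick_fallback(exc, debug):
    """Return a label for the fallback view chosen for this exception."""
    if isinstance(exc, Http404):
        # Debug page in development, the configured 404 view otherwise.
        return "technical_404_response" if debug else "handler404"
    # Any other exception follows the 500 path.
    return "technical_500_response" if debug else "handler500"


choices = [
    pick_fallback(Http404(), debug=True),
    pick_fallback(Http404(), debug=False),
    pick_fallback(ValueError("boom"), debug=True),
    pick_fallback(ValueError("boom"), debug=False),
]
```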
Middleware, final round
At this point, regardless of anything which went wrong at any level in get_response, it should have returned an HttpResponse instance, so we’re back up into the main part of the handler. The first thing it does once it gets that HttpResponse is loop through its _response_middleware instance variable and apply the methods it finds there, passing the HttpRequest and the HttpResponse as arguments.
Note that this is the last chance any middleware has to modify things.
The check is in the mail
And now it’s time to wrap up. Once the final round of middleware has been applied, the handler fires the dispatcher signal request_finished, which is the absolute last call for anything that wanted to execute during the current request. Handlers which listen for this signal should clean up and free any resources which were being used; for example, Django attaches a listener to request_finished which will close any open database connections.
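The listen-and-clean-up pattern can be sketched with a toy signal. This is not Django’s dispatcher API: the Signal class, its connect and send methods, and FakeConnection are all hypothetical and exist only to illustrate a listener releasing a resource when the signal fires.

```python
class Signal:
    """Toy signal: keeps a list of receivers and calls each one when sent."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self):
        return [receiver() for receiver in self._receivers]


class FakeConnection:
    """Stand-in for a database connection."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


request_finished = Signal()
connection = FakeConnection()

# Register the cleanup listener, then fire the signal at the end of the request.
request_finished.connect(connection.close)
request_finished.send()
```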
After that happens, the handler builds up an appropriate return value to send back to whatever instantiated it (currently, either a mod_python-appropriate response or a WSGI-compliant response depending on the handler), and returns it.
Whew
And we’re done. From beginning to end, that’s how Django processes a request.