My Django Project Conventions, pt. I

These few posts (there are more to come) are about some of my favourite Django project conventions. Python has this style of “there’s only one (obvious) way to do it”, and whilst this is nice, it is at best a bit idealistic and at worst a bit constraining. Of course, there will sometimes be ‘best practices’. That’s not the point of this blog post (although it will probably be the focus of the comments).

I have my own style which I like to adopt when working on Django projects. You might not want to use it, but maybe there will be some nuggets of gold in here for you to take.

The Layout

This is what the layout of a project of mine might look like. I’ll go into more depth later. I’ve laid out the files and folders in what, I hope, is an intuitive manner.

DIRECTORY   myproject/
FILE            __init__.py
FILE            manage.py
FILE            urls.py
DIRECTORY       myproject/settings/
FILE                __init__.py
FILE                common.py
FILE                debug.py
FILE                staging.py
FILE                production.py
FILE                modes.txt
DIRECTORY       myproject/apps/
FILE                __init__.py
FILE                external.py
DIRECTORY           myproject/apps/local/
FILE                    __init__.py
FILES                   ...
DIRECTORY       myproject/templates/
FILES               ...
DIRECTORY       myproject/media/
DIRECTORY       myproject/media/css/
DIRECTORY       myproject/media/js/
DIRECTORY       myproject/media/img/

You might have a few questions about this already. I’m going to try and answer a few here:

  1. Why no uploads directory? I like to use the /var/ directory for user-generated content. I believe that allowing users to upload their own data into the Python source tree is a little bit dangerous.
  2. Why has settings.py been replaced by a directory? There are a couple of reasons for this, which I shall explain shortly. Hang on tight!
  3. Aren’t templates supposed to be at the app level? Well, I like to stick some of the templates at the project level. This is especially important if you’re going to be making re-usable apps. A project is specific to a site, an app may be used on multiple sites. If you want to keep your apps re-usable, you probably need project-level templates.
  4. You keep the media stuff here? Yeah. I like to put the media in my project so that, when using a (D)VCS, the media gets included in the version control history. Images, JavaScript and CSS play an important part in the development process, too.

Now, on to the first major part of the project layout: the settings.

The Settings

As you can probably see, I have a settings directory instead of a simple Python module. When Django imports your settings, it does something like this:

from myproject import settings

Normally, the settings object which is created by this statement contains everything defined in the myproject/settings.py file. In the case of a directory, it takes all the values defined in myproject/settings/__init__.py. This file, therefore, is where the magic must happen. Look at the following piece of code:

UPDATE: Fixed some of this code so that it actually works.

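import os

from myproject.settings.common import *

SETTING_MODES = []
SETTING_MODE = None

# Read the available modes; the line ending in '*' is the selected one.
modes_fp = open(os.path.join(os.path.dirname(__file__), 'modes.txt'))
try:
    for line in map(str.strip, modes_fp):
        if not line:
            continue  # ignore blank lines
        if line.endswith('*'):
            SETTING_MODE = line.strip('*').strip()
        SETTING_MODES.append(line)
finally:
    modes_fp.close()

if SETTING_MODE is None:
    raise ValueError("modes.txt does not mark any mode with '*'")

# Import the selected mode's module and copy its public names over the
# common settings imported above.
SETTING_MODULE = __import__(__name__ + '.' + SETTING_MODE.lower(),
    fromlist=__name__.split('.'))
for attr in dir(SETTING_MODULE):
    if not (attr.startswith('__') and attr.endswith('__')):
        globals()[attr] = getattr(SETTING_MODULE, attr)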

I don’t know about you, but it makes me want to hurl. Thing is, you see, it’s actually quite a useful piece of sludge. In fact, it’s possibly the most useful piece of sludge you’ll ever come across in your life, unless you are a member of a local SUG (Sludge User Group). What this does is the following: let’s say you have a project with some specified common settings, such as timezone, default language, installed apps, middleware and other knick-knacks. However, you’re working on a super-critical application which involves debugging, staging and production servers, so you can develop your code, test it out and then release it to the public. If you’re running off a central (D)VCS branch, you would normally (with a simple settings.py file) have to change certain settings for each server, like database connections and caching. Every time you made a change and wanted to update the checked-out code on each server, you’d have to patch these settings in.

Well, no more!

Now, you get to have your common settings in one file, the settings for the debug server in another, the settings for the staging server in yet another, and the production settings in another file. This is all managed by the modes.txt file you see in the settings directory. This file looks something like this:

DEBUG *
STAGING
PRODUCTION

This is a list of all the modes you’re currently running. The one with the asterisk character (*) at the end of it is the currently selected mode. The code I showed you before does all the heavy lifting for you; change the line with the asterisk and it will switch from debugging mode to staging mode. In practice, what I normally do is have a .gitignore file in this directory with the single line ‘modes.txt’; this means each server’s settings mode persists through multiple commits and checkouts from and on different machines.
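To make the file format concrete, here is a minimal sketch of the parsing step on its own, separated from the settings module so it can be tried in isolation. The helper name parse_modes is hypothetical and not part of the layout above; unlike the settings code, it strips the asterisk from the names it returns.

```python
def parse_modes(lines):
    """Return (all_modes, selected_mode) from modes.txt-style lines.

    Hypothetical helper for illustration: the selected mode is the line
    that ends with an asterisk; blank lines are ignored.
    """
    modes, selected = [], None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        name = line.rstrip('*').strip()
        if line.endswith('*'):
            selected = name
        modes.append(name)
    return modes, selected


modes, selected = parse_modes(["DEBUG *\n", "STAGING\n", "PRODUCTION\n"])
```

With the example file shown earlier, selected comes out as the debug mode, so the settings package would go on to import its debug module.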

Call the Super NOW! This apartment is flooding quickly!

The title of this post is far too dramatic. There is no flooding situation right now. I’m so sorry to all of you who clicked on this hoping to read some intense story about one woman’s (or man’s or centaur’s) struggle with too much water.

In true Zackeriffic style, I’m writing this for a couple of reasons. Firstly, I’m trying out Snipt’s new embedding feature (that’s the one true snipt, not the other snipts).

I’ve written a decorator which does a lil’ bit of awesome. Wrap a method with it. Then, from that method, raise the ‘CallSuper’ exception, optionally with some args. The supermethod (if defined) will be called with those args (if provided), and the result of calling the supermethod will then be returned as the result of the submethod you’ve defined. You can see an example in the code snippet below.

I believe it looks rather nice, actually. Well done Snipt.
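The embedded snippet is not reproduced here, so the following is only a minimal sketch of how a decorator like the one described could be written, not the original code. The decorator name call_super and the attributes stored on CallSuper are assumptions made for illustration.

```python
import functools


class CallSuper(Exception):
    """Raised inside a decorated method to hand over to the supermethod."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args)
        self.call_args = args
        self.call_kwargs = kwargs


def call_super(method):
    """Let `method` delegate to its supermethod by raising CallSuper."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            return method(self, *args, **kwargs)
        except CallSuper as exc:
            mro = type(self).__mro__
            # Find the class whose namespace holds this wrapper ...
            for index, klass in enumerate(mro):
                if klass.__dict__.get(method.__name__) is wrapper:
                    break
            else:
                raise
            # ... then call the next definition further along the MRO.
            for klass in mro[index + 1:]:
                if method.__name__ in klass.__dict__:
                    supermethod = getattr(klass, method.__name__)
                    return supermethod(self, *exc.call_args, **exc.call_kwargs)
            return None  # no supermethod defined
    return wrapper


class Base:
    def greet(self, name="world"):
        return "hello " + name


class Child(Base):
    @call_super
    def greet(self, name="world"):
        if name == "nobody":
            return "..."
        # Delegate to Base.greet with a modified argument.
        raise CallSuper(name.upper())
```

The sketch assumes plain instance methods; classmethods and staticmethods would need extra handling.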

A Programmer's Apology

Today has been quite a pivotal day for me. I had an informal meeting with Matt Henderson, who funnily enough I met on twitter (he’s the first person I actually know on twitter), and he very kindly introduced me to his colleagues (the Makalu Media gang) and lent me a stack of books (which I’m desperately looking forward to reading).

I definitely took a lot away from the experience. I had a very long conversation with Arto Bendiken about all sorts of subjects, running the gamut from consciousness to web frameworks, and it was amazing to be in a development environment again—it’s something I haven’t seen since I did some summer work for Ravenpack a couple of years ago.

Makalu Media currently use Ruby on Rails and Drupal for most of their web projects, and in my discussions with them I did, of course, question this, thanks in no small part to my Djangonautical persuasion, and probably having way too many preconceived notions about RoR. I had an epiphany of sorts when we were discussing Lisp and Python (my two favourite languages). I’ve realised something that’s affected me, and a lot of people I know: I am far too dogmatic when it comes to programming.

I’ve realised that I’m doing everything in the wrong order. The way things should work is this:

  1. I come across a problem. I try and understand the problem, which may involve some deep grokking of a subject I know little about. But I will get there in the end.
  2. With a fresh understanding of the problem, I set about trying to solve it conceptually, and in doing so plan out a system.
  3. With my system plan, I carefully select the tools I think would best suit the task.
  4. I then set about implementing the plan, using those tools.

Here’s the way I’ve been doing things lately (the bit I’d like to change):

  1. I decide to use certain tools (most often Python + Django) before I even come across a problem.
  2. I encounter a problem, again with as much grokking as necessary.
  3. I try and fiddle with the problem until it becomes something that I can solve with those tools I chose in part 1. This usually involves writing a middle-layer between the nature of the problem and the selected tools.
  4. I then design a system that almost completely ignores the actual problem, and just concentrates on how I will use those tools to solve the perceived problem.
  5. I then implement the system using those tools, and it feels awkward (because 99 times out of 100 those tools aren’t the best for the job), but at least I’m comfortable with it.

OK, the second one sounds really pessimistic. But when I say ‘dogmatic’ about software, that’s what I mean. For far too long have I made pre-choices about what I’m going to use to solve a problem before I’ve even encountered the problem itself. Too many times have I vociferously defended my own choice of programming language or web framework without trying (or even considering) the alternatives. This sort of closed-mindedness (if that’s an actual word) is the kind of stuff you see when people engage in religious debates. Unfortunately, being human, I’m very susceptible to these kinds of emotional responses—I believe they’re known to psychologists as “defense mechanisms”. You see, the use of a particular toolset becomes a sort of paradigm to me, so my harsh attacks on other approaches are really just me defending my paradigm. It’s more of an inner insecurity than anything else.

In the future, I promise myself that I will choose the best tool for the problem after actually figuring out what the problem is. And if the best tool is Homespring [PDF] (which, incidentally, looks incredibly awesome), then so be it. I’ll grok Homespring and create a system with it. At the end of the day, we programmers have the job of coming up with solutions.

I’ve also listened way too much to what other people have said. Just because some ‘guru’ writes that Ruby on Rails is the devil’s web framework, doesn’t mean it is. I’ve been really quite mean about the project in the past, and I’ve now realized that, never having written a Rails web application, I don’t have the right to do that. Way too many people have poured hour upon hour and dollar upon dollar into Rails for me to decry and dismiss it without ever having used it before. I suppose what I’m also saying is that I’m going to stop listening to other people’s opinions on these things. When the local ‘guru’ claims that Rails is evil, I’m going to remember that everything is subjective. For example, I greatly enjoy programming in Common Lisp. But I acknowledge that it’s not for everyone; it just happens to suit my way of thinking quite well. I’d encourage people to try it, but if they don’t like it I’m not going to force it down their throats.

I think that the people who do write inflammatory blog posts and articles on subjective opinions need to be more responsible and think about what they’re saying. A spirit of adventure needs to be adopted if we ever want to keep innovating. And I’m not saying I’ve been innocent in the past either, but I’m making a change.

I’m going to begin by learning Ruby, and later Rails (although I might wait until Rails 3, I don’t know how different from this version it’s going to be). Why? Because I’ve been very harsh on the language in the past, and it’s sort of a token gesture. In the true spirit of irony, I’ll probably end up loving it. I mean, how can you resist “why’s (poignant) guide to ruby”? Have you actually read it? It’s crazy. In a good way.

I’m going to end up branching into some languages that I never would have thought of before. This includes Erlang, Haskell, Squeak, Factor, Clojure, and probably a few others. When I talk about ‘choosing tools’, I don’t mean just languages, but they’re a good starting point. And you know what? I might just come full circle, decide that these languages are just too far beyond the manic fringe for my liking, and end up sticking with the traditional (albeit brilliant) favourites Python and Django. But at least I can say I tried.

Oh, and I’d probably like to learn C and/or C++ too, mainly because they tend to be the sort of ‘lowest common denominator’ languages, and a lot of jobs ask for C coding skills. I’m not much of a commercial programmer, but income is a nice thing to have.

Thanks for reading. If you’ve made it this far, you’ve done well. Now, it’s 5:00 AM, so I think I’m going to sleep.

It’s funny ‘cos it’s true!

"Don't Ask, Don't Tell" - Time for Change

Note: Usually, this blog serves as a place for my more work-oriented stuff, but I thought it important to address an issue which means a lot to me.

US Code TITLE 10, Subtitle A, PART II, CHAPTER 37, § 654. Most people know it by its short name: “Don’t Ask, Don’t Tell.”

This is a policy of the United States military stating, amongst other things, that:

The presence in the armed forces of persons who demonstrate a propensity or intent to engage in homosexual acts would create an unacceptable risk to the high standards of morale, good order and discipline, and unit cohesion that are the essence of military capability.

To give the short but not-so-sweet version, members of the US military who are part of the LGBTQ community must hide their sexual identity from other soldiers. Should it be discovered that they are not heterosexual, they will be ‘separated’ from the armed forces.

The history of the bill is that, essentially, it was a compromise between those who wanted an outright ban on gays in the military, and those who wanted gays to be able to serve openly. That was in 1993. This is 2009. If you ask me, that’s one heck of a long compromise - if only an Israel-Palestine compromise could last that long!

I’m not going to go into too much depth about the policy itself, because the wikipedia article has a veritable treasure trove of information, but here’s an interesting perspective:

“The Axis of Evil” (and beyond) is a group of states that the US government believes are plotting to destroy the western world, by producing Weapons of Mass Destruction (WMDs) and supplying active terrorist organizations. This group is made up of the following countries:

  • Iraq
  • Iran
  • North Korea
  • Cuba
  • Libya
  • Syria.

You’ve probably heard of this axis before; it’s a phrase which has been used enough in the news for the past 7 years.

How does this relate to the point of this article? Well, here’s just a few of the countries around the world where it’s either illegal to be homosexual, or illegal for homosexuals to serve openly in the military:

  • Iran (homosexuality illegal)
  • Libya (homosexuality punishable by death)
  • North Korea
  • Syria (homosexuality illegal)

Note how this lines up quite closely with the “Axis of Evil”, with the notable omission of Cuba (for which I could find no information) and Iraq (where, although homosexuality is legal, homosexuals are routinely killed by other citizens in hate crimes).

Oh, and don’t forget the United States of America, where it’s illegal for an LGBT person to serve openly in the military (so I guess it gets on my list).

I’m not saying that all countries with poor LGBT rights belong on the Axis of Evil, but still, it’s a common thread that all evil countries have poor LGBT rights. Just think about it. Maybe I’ve convinced you enough to vote on this poll (I think it’s going to change soon, so this link may go stale). Or something a bit more…influential (like writing to your representative, or something).

The OAuth/Twitter kerfuffle.

As a lot of people now know, Twitter’s been hit by the result of a few dodgy design decisions from the past (we’ve all made them at some point) and this has prompted a large call from web developers for Twitter to support OAuth, an authentication protocol that supposedly provides a bit more security than people going around and giving out their username and password to any old service.

I’m going to summarize the pros and cons of using OAuth with Twitter, and then I’ll follow up with a couple of suggestions I have for fixing the issues.

So, what’s this all about? Well, basically there is a service called Twply which asks for your Twitter username and password, and e-mails you every time you get a reply from someone. On signup, it gives the option to ‘support’ Twply, basically saying something along the lines of: “can we post a tweet from your account that advertises our service?” Needless to say, many people respond with “No.” Of course, Twply seems not to give two shits about what you want it to do, so it goes ahead and tweets the advertisement anyway. Lovely. So now, a bunch of Twitter users want to stop Twply from accessing their account; trouble is, they can’t without changing their password. Ugh.

Obviously, users of Twitter are enraged at the service for a breach of trust. But it turns out that this was just an incident waiting to happen. You see, the Twitter API is structured so that any application that needs access to it, has to use your username and password. Which means you are giving your username and password to other people, and blindly trusting them not to do something stupid or evil. Clearly someone, at some point, was going to abuse this trust. Twply has done so.

Then, all eyes turned to OAuth. OAuth is a protocol, designed by a few smart people, which allows you to give a third-party app (a ‘consumer’) permission to access a service (a ‘service provider’, i.e. Twitter) on your behalf. Now, basically, you are giving access to the consumer, which isn’t that different from giving them your username and password; where OAuth differs is that you can control access, only allowing the service to do certain things, and you can also revoke access without having to change your password. So you can see that, if OAuth were implemented on Twitter, people wouldn’t have to change their password, they could just go into some management console and revoke access. In this respect, OAuth is good. Also, people could have given Twply read-only access, meaning it could access their replies (in order to e-mail them) but not be able to send tweets from their account.
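The two properties described above (limited access, and revocation without a password change) can be illustrated with a toy model. This is emphatically not the OAuth protocol itself, which involves signed requests and a redirect-based handshake; the class ToyProvider and the scope names are made up for the sketch.

```python
import secrets


class ToyProvider:
    """Toy service provider: hands out scoped tokens instead of passwords."""

    def __init__(self):
        self._grants = {}  # token -> (user, allowed scopes)

    def grant(self, user, scopes):
        # The consumer receives this token, never the user's password.
        token = secrets.token_hex(8)
        self._grants[token] = (user, frozenset(scopes))
        return token

    def revoke(self, token):
        # Revoking one consumer leaves the password and other tokens alone.
        self._grants.pop(token, None)

    def allowed(self, token, scope):
        grant = self._grants.get(token)
        return grant is not None and scope in grant[1]


provider = ToyProvider()
token = provider.grant("alice", {"read_replies"})  # read-only access
```

Here a consumer holding the token could read replies but not post tweets, and calling provider.revoke(token) would cut it off entirely.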

OK, so OAuth seems to be pretty cool - though we can’t forget the cost of adoption, and the fact that it will have to be phased in over a relatively large time scale. But some people have expressed concerns over a vulnerability in the protocol. OAuth requires that the user log in with the service provider before it will grant access to an application. It is the responsibility of the consumer to redirect the user to Twitter to log in; this means that a fake consumer could potentially (maliciously) redirect a user to some phishing site and steal their password. So we need some kind of identity verification to make sure that you are, indeed, entering your details into Twitter, rather than a phishing website. This can be done in a couple of ways; firstly, Twitter should by default use HTTPS, and thus Twitter could be verified as the service provider. Secondly, users would have to be educated to check the authenticity of websites before entering sensitive data into them. Firefox (and maybe some other browsers - I don’t know) already checks against a blacklist (maintained by Google) and warns users when they are entering a potential phishing site. The phishing problem is one which will bug Twitter whether they’re using OAuth or not - it’s relatively easy to set up a clone of Twitter’s homepage and start spoofing users into handing over their credentials.

So, to put it bluntly, OAuth isn’t the silver bullet that will kill phishing once and for all, but it’s certainly a step in the right direction. You shouldn’t make your users think it’s OK to just go around giving their usernames and passwords out to everyone who asks, but likewise, it’s stupid of users to do that as well. I’ll be the first one to admit that I’ve been stupid with my password; about a month ago, I changed my password to a very secure one, and removed myself from as many third-party services as possible. I won’t be signing up for any new services until OAuth is implemented and also until the first hole is found (because there probably will be quite a few holes to begin with).

Twactor, and some Python descriptor amazingness.

For the past few days, I’ve been working on a fledgling twitter client library for Python. The libraries which currently exist are, not to put too fine a point on it, cop-outs. One overrides __getattribute__ so that you are literally just making API calls directly from Python, and then it deserializes the JSON data it gets back and gives it straight to you. This might work for some, although I prefer a more refined approach.

Anyways, I’m writing this library to take advantage of Python’s descriptors - you should be able to access data via attributes, not just dictionary calls, and it should be presented to you in a nice format, with names converted to more Pythonic equivalents and packaged into neat little namespaces so that it makes sense. I’ve also created a basic framework for incremental caching of the data: the library only ever contacts twitter when it needs to. An example of where this is important is the following: when you download info on one tweet, you get a small amount of info on the user who posted it. This info contains stuff like username, full name, profile image URL and the like - basic stuff. Now, a more naïve library might, if you wanted to fetch that user info, download all the info for that user. This library, however, creates a new instance of a ‘User’ class (compare to Django’s models), which contains a cached version of all the info from the tweet. So, if you want to access this User’s username, you can do status.user.username, and you’ll get the username without any additional requests being made. If, however, you want the user’s profile’s background color, more info needs to be fetched, and so status.user.profile.bg_color will go and fetch more info from the server, storing it in the cache. Then, any other information it fetched from the server will be accessible via that cache.
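A rough sketch of that caching behaviour, using a descriptor, might look like the following. It is not the library's actual code: LazyField, the _cache attribute and the fetch_full callable are all assumptions, and the nested profile namespace is flattened to keep the example short.

```python
class LazyField:
    """Descriptor: serve a field from the instance cache, fetching on a miss."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.name not in instance._cache:
            # Cache miss: one full fetch stores everything the server returns.
            instance._cache.update(instance._fetch_full())
        return instance._cache[self.name]


class User:
    username = LazyField("username")
    bg_color = LazyField("bg_color")

    def __init__(self, partial, fetch_full):
        self._cache = dict(partial)    # partial info that came with a tweet
        self._fetch_full = fetch_full  # hypothetical callable hitting the API
```

Reading username from a User built out of tweet data would be answered from the cache, while the first read of bg_color would trigger a single fetch whose results serve every later lookup.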

All in all, I think this library will be pretty good when finished. Eventually, I’m hoping to make it a little more than just a twitter client, but more on that later…

So, to the second purpose of this post, I’m going to explain the ‘Python descriptor amazingness…’ to you. Basically, when defining Python descriptors on classes (via property), you often have to write three functions — one for getting the value, one for setting it, and one for deleting it. An example might look something like this:

class SomeClass(object):

    def __init__(self, some_value=None):
        self._value = some_value

    def _get_value(self):
        return self._value

    def _set_value(self, value):
        self._value = value

    def _del_value(self):
        self._value = None

    value = property(
        fget=_get_value,
        fset=_set_value,
        fdel=_del_value)

On instances of this class, you can access the value attribute to get the value. But the thing is, this quickly clutters up your class’s namespace with _get_*, _set_* and _del_* methods. Well, I’ve figured out a better way of doing this. All we need is a little helper lambda at the top called property_shortcut, which will handle the calling of property:

property_shortcut = lambda x: property(**x())

class SomeClass(object):

    def __init__(self, some_value=None):
        self._value = some_value

    @property_shortcut
    def value():
        def fget(self):
            return self._value
        def fset(self, value):
            self._value = value
        def fdel(self):
            self._value = None
        return locals()

And this will work exactly the same as the one before, without cluttering your class’s namespace with junk. There are several things to note here:

  1. The value method takes no explicit self argument. This allows the method to be called unbound (i.e. without being on an instance). If it did accept self, then this wouldn’t work.
  2. The names of the functions fget, fset and fdel are the keyword arguments that the property function accepts. This means that calling locals() will return a dictionary with these three names, and that is why property_shortcut works, as it uses Python’s double asterisk method for unpacking a dictionary into keyword arguments.

So that’s how we roll. Hope you like it ;-)

Managers vs. Model Classmethods in Django

Everyone who’s read the Django documentation on models and managers will know the following: methods on your models specify row-level functionality, whereas operations on the whole data set are better encapsulated in managers. I know this all too well; I used to be absolutely obsessed with writing managers - each model would have about five for all different possible uses.

However.

Recently, I’ve completely stopped writing my own managers, for a very simple reason. I’ll explain why after a quick briffet (a made-up word) of code:

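from django.db import models

class MyManager(models.Manager):

    # Named get_queryset() from Django 1.6 onwards.
    def get_query_set(self):
        return super(MyManager, self).get_query_set().filter(fieldn=1)

class MyModel(models.Model):
    field1 = models.CharField(max_length=100)
    ...
    fieldn = models.IntegerField()
    # Declaring any manager suppresses the automatic `objects`, so add it
    # explicitly (and first, so that it stays the default manager).
    objects = models.Manager()
    fieldn_1_manager = MyManager()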

This is a simple model definition in Django, which uses a manager to filter for a particular set of rows. MyModel.objects represents the primary manager, which will contain all objects in the database, and MyModel.fieldn_1_manager will contain all the rows where the value of fieldn is 1. This is, and has been, and may well be for a while, the conventional way to do things. I don’t like it though, so I’ve come up with my own way of doing things.

Most people don’t really use the classmethod and staticmethod decorators which are in the top-level built-in namespace; however, they are very useful as the first, for example, allows you to define a method which, instead of receiving the instance as the first argument, gets the class instead. This allows you to define class-level operations using the typical method syntax, like so:

class MyModel(models.Model):

    field1 = models.CharField(...)
    field2 = models.CharField(...)
    ...
    fieldn = models.IntegerField()

    @classmethod
    def fieldn_1(cls):
        return cls.objects.filter(fieldn=1)

You can probably see here why this approach is a lot prettier. No subclassing, models and operations are tied together to prevent confusion, and good prevails throughout the land.
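The difference between the two decorators can also be seen without Django. The sketch below uses a made-up Record class with an in-memory list standing in for a database table; it only illustrates what each decorator passes to the method.

```python
class Record(object):  # hypothetical stand-in for a model
    rows = []  # pretend table, one list per class

    def __init__(self, fieldn):
        self.fieldn = fieldn
        type(self).rows.append(self)

    @classmethod
    def fieldn_1(cls):
        # Receives the class, so a subclass queries its own rows.
        return [row for row in cls.rows if row.fieldn == 1]

    @staticmethod
    def is_one(value):
        # Receives neither instance nor class: just a namespaced function.
        return value == 1


class Archived(Record):
    rows = []


a, b, c = Record(1), Record(2), Archived(1)
```

Calling Record.fieldn_1() only sees the rows stored on Record, while Archived.fieldn_1() sees those on the subclass, because cls is whichever class the method was called on.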

I’d love to know people’s reasons for choosing to use managers over classmethods. Please twitter me @zvoase with ideas and explanations.