Eventual Consistency

Better Python APIs

Python comes with a large number of built-in functions, operators and keywords. They make working with data structures and built-in types very easy. When we define our own data types (classes), however, we tend to invent our own ways to manipulate and consume our data.

One of the nice things about Python is that we don't have to. We can use "underscore methods" (also known as special or "dunder" methods) to make our classes compatible with the built-in functions and operators.

This makes our code easier to use and does a better job of hiding our nasty, complex implementation from the user. More importantly, it makes our code more intuitive. This means that in many cases our API will just do what the user expected it to do. I call this Code UX, and we should be working on constantly improving it.

Here are a few practical examples for doing that:

Make your code REPL friendly

Python has an awesome REPL shell, which I use all the time to introspect code and to play around with implementations before finally adding them to my code base. I use BPython or IPython instead of the regular Python shell since they offer many niceties such as auto-completion, syntax highlighting, history, and in-line documentation. Many of the Python developers I know use these tools daily.

So how can we make our code easier to use in such an environment?

Take this class for example:

class Container(object):

    def __init__(self, *args):
        self.objects = args

If we create an instance of Container in our REPL environment, we see:

>>> from myprogram import Container
>>> Container(1, 2, 'Hello', False)
<myprogram.Container object at 0x107f74f90>
>>>

Not very readable, and definitely hard to mess around with. Let's add a __repr__ method:

class Container(object):

    def __init__(self, *args):
        self.objects = args

    def __repr__(self):
        return 'Container(%s)' % (', '.join(map(repr, self.objects)))

Now do the same thing from the shell:

>>> from myprogram import Container
>>> Container(1, 2, 'Hello', False)
Container(1, 2, 'Hello', False)
>>>

Much better. We now get a representation of the objects we can actually use. We don't have to keep track in our head of what we created and we don't have to guess what our objects contain.

This is also very useful for logging objects in a reproducible way.
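
As a rough illustration of the logging point, the sketch below reuses the Container class from above and passes an instance to the standard logging module with the %r placeholder. The logger name 'myprogram' and the variable names are assumptions made for this example only.

```python
import logging


class Container(object):

    def __init__(self, *args):
        self.objects = args

    def __repr__(self):
        return 'Container(%s)' % ', '.join(map(repr, self.objects))


logging.basicConfig(level=logging.INFO)
log = logging.getLogger('myprogram')  # hypothetical logger name

box = Container(1, 2, 'Hello', False)

# %r asks logging to format the argument with repr(), so the log line
# contains text that can be pasted back into a shell.
log.info('created %r', box)

# Because the repr is valid Python, evaluating it rebuilds an equivalent
# object. Only do this with strings you trust.
clone = eval(repr(box))
```

The log line reads "created Container(1, 2, 'Hello', False)", and clone.objects holds the same values as box.objects.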

Create iterators over complex lists of data

We can use the __iter__ method to make our object iterable. If we are working with lists or creating lists from some external resource, we can expose an iterator to users to hide away our underlying complexity, and let them simply use a for loop.

Here is an example: a YouTube search API wrapper that exposes its results through an iterator. This is a stripped-down yet functional example; I've removed things like error handling, caching and validation to make it easier to follow:

import requests

class YoutubeSearch(object):

    def __init__(self, term):
        self.query_url = 'https://gdata.youtube.com/feeds/api/videos'
        self.query_params = {
            'q': term,
            'alt': 'json',
            'orderby': 'relevance',
            'v': '2'
        }

    def _do_request(self):
        return requests.get(self.query_url, params=self.query_params).json()

    def __iter__(self):
        for video in self._do_request().get('feed').get('entry'):
            result = {}
            result['title'] = video.get('title').get('$t')
            result['url'] = video.get('link')[0].get('href')
            yield result

To run this example you need to install Requests.

Now, we can search YouTube simply by instantiating an object and iterating over it:

>>> from myprogram import YoutubeSearch
>>> yt_search = YoutubeSearch('DjangoCon 2012')
>>> for video in yt_search:
...     print(video)
...
{'url': u'https://www.youtube.com/watch?v=0FD510Oz2e4&feature=youtube_gdata', 'title': u'DjangoCon 2012 - Alex Gaynor "Take Two: If I got to do it all over again"'}
{'url': u'https://www.youtube.com/watch?v=IKQzXu43hzY&feature=youtube_gdata', 'title': u'DjangoCon 2012 - Daniel Lindsley "API Design Tips"'}
...

This makes for a very simple and readable API. We can also use this technique to lazily load resources and yield them when needed instead of fetching everything upfront and returning a big list. So we can also use it to improve efficiency and performance.
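
To make the lazy-loading idea a bit more concrete, here is a minimal sketch that fetches one page of results at a time and only asks for the next page when the consumer needs it. PagedResults, fetch_page and the fake data are all hypothetical stand-ins for a real remote API, not part of any library.

```python
class PagedResults(object):
    """Iterate over items from a paged source, one page at a time."""

    def __init__(self, fetch_page):
        # fetch_page(page_number) returns a list of items;
        # an empty list means there are no more pages.
        self.fetch_page = fetch_page

    def __iter__(self):
        page = 0
        while True:
            items = self.fetch_page(page)
            if not items:
                return
            for item in items:
                yield item
            page += 1


# Stand-in for a remote API: three pages of fake data.
DATA = [['a', 'b'], ['c', 'd'], ['e']]
calls = []  # records which pages were actually requested


def fake_fetch(page):
    calls.append(page)
    return DATA[page] if page < len(DATA) else []


results = iter(PagedResults(fake_fetch))
first = next(results)  # only page 0 has been fetched at this point
```

After the single next() call, calls contains just [0]: the second and third pages are never requested unless the caller keeps iterating.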

Overloading arithmetic operators

If you are dealing with your own special data types, you can override the way arithmetic operators are handled between instances.

For example, Python's datetime module lets you subtract date objects from one another, resulting in a timedelta object, simply by using the - (minus) sign, as you would for numbers.
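
A quick sketch of that behaviour, using two arbitrary dates chosen purely for illustration:

```python
from datetime import date, timedelta

# Arbitrary example dates
start = date(2012, 9, 4)
end = date(2012, 9, 10)

gap = end - start  # subtracting two dates yields a timedelta
print(gap.days)

# A timedelta can be added back to a date to reach the other one
print(start + gap == end)
```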

To do this, your class needs to define a __sub__ method:

class ExpressiveList(list):

    def __sub__(self, other):
        new_list = ExpressiveList(self)
        if isinstance(other, list):
            for item in other:
                new_list.remove(item)
        else:
            new_list.remove(other)
        return new_list

And now we can remove items from the list by using the - operator:

>>> l = ExpressiveList([1,2,3,4,5])
>>> l
[1, 2, 3, 4, 5]
>>> l - 2
[1, 3, 4, 5]
>>> l - [2,4,1]
[3, 5]
>>>

We can do the same with __add__, __mul__, __truediv__ (__div__ in Python 2) and basically all other arithmetic operations.
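
As a sketch of how several of these methods fit together, here is a small, hypothetical Vector class (not part of the examples above) that supports +, * and /. Returning NotImplemented for unsupported operands lets Python raise a normal TypeError instead of failing somewhere inside our method.

```python
class Vector(object):
    """Hypothetical 2D vector used only to illustrate operator methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Vector(%r, %r)' % (self.x, self.y)

    def __eq__(self, other):
        return (isinstance(other, Vector)
                and (self.x, self.y) == (other.x, other.y))

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented  # Python will raise TypeError for us
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        return Vector(self.x * factor, self.y * factor)

    def __truediv__(self, divisor):
        return Vector(self.x / divisor, self.y / divisor)

    __div__ = __truediv__  # Python 2 looks up __div__ for the / operator


total = Vector(1, 2) + Vector(3, 4)
scaled = Vector(1, 2) * 3
halved = Vector(4.0, 2.0) / 2
```

Here total is Vector(4, 6), scaled is Vector(3, 6) and halved is Vector(2.0, 1.0).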

Give your objects boolean meaning

Sometimes we want to give our object a boolean meaning. For example, a Python list is considered false when it is empty and true when it contains at least one item. We define such logic for our own classes with __nonzero__ (or __bool__ in Python 3):

class Container(object):

    def __init__(self, *args):
        self.objects = args

    def __nonzero__(self):
        return bool(self.objects)

    def __bool__(self):
        # Python 3 compatibility
        return self.__nonzero__()

Now we can test for boolean value simply by doing:

>>> if Container(1, 2, 'Hello', True):
...     print('true!')
... else:
...     print('false')
...
true!
>>> # would print "false" for an empty Container()

But that's not all

There are quite a few other methods we could use to make our code easier to use. Here are a few notable others:

  • __len__ - Determines what len(my_object) returns.
  • __contains__ - Determines whether x in my_object is true or false.
  • __getattr__ - Create dynamic attributes or return some special value for attributes not found on the object. Not very REPL friendly (since we can't auto-complete attributes not set on the object).
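  • __cmp__ - Used to compare objects of this class (Python 2 only; in Python 3, define __eq__ and __lt__ instead and let functools.total_ordering fill in the rest). Once defined, we can use the built-in sorted() function to sort lists containing objects of our class.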
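
A minimal sketch of three of the methods listed above. Container is the class used earlier, extended with __len__ and __contains__; Version is a hypothetical class added here to show ordering with functools.total_ordering, which works on both Python 2.7 and Python 3.

```python
from functools import total_ordering


class Container(object):

    def __init__(self, *args):
        self.objects = args

    def __len__(self):
        return len(self.objects)

    def __contains__(self, item):
        return item in self.objects


@total_ordering
class Version(object):
    """Hypothetical class; ordering is by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __repr__(self):
        return 'Version(%d, %d)' % (self.major, self.minor)

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)


box = Container(1, 2, 'Hello')
size = len(box)            # uses __len__
has_hello = 'Hello' in box  # uses __contains__
ordered = sorted([Version(1, 10), Version(1, 2), Version(0, 9)])
```

One side effect worth knowing: when a class defines __len__ but no __bool__ (or __nonzero__), Python falls back to the length for truth testing, so an empty Container() here is already false.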

And there are many more useful methods. Creating expressive and intuitive APIs in Python is not a lot of work, and your users (including future-you) will thank you for it.


I'd be happy to hear your tips for improving code UX in the comments below.

You can also follow me on twitter where I talk a lot about python and other geeky stuff.

Hi, I'm Oz Katz

I am a co-founder and CTO over at Swayy.

I usually write about software development using Python, JavaScript and other awesome, open source tools.

Feel free to reach out on Twitter, or contact me using the links at the bottom of the page.