Mocking properties in Python

It doesn’t happen all that often, but sometimes when writing unit tests you want to mock a property and specify a return value. The mock library provides a PropertyMock for that, but it probably doesn’t work the way you would initially expect.

I had this exact problem earlier this week and had to explain it to two different people in the two days after that. So why not write it down in a blog post?

Suppose you have this silly implementation of a class that takes a file path and reads the file’s content into a variable that can be accessed through a property:

class SillyFileReader(object):
    def __init__(self, file_path):
        self._file_path = file_path
        self._content = self._read_file()
        
    def _read_file(self):
        with open(self._file_path, 'r') as f:
            return f.read()
        
    @property
    def content(self):
        return self._content

You also have this equally silly function that takes an instance of the silly file reader and returns the reversed content:

def silly_reverse_content(silly_file_reader):
    return silly_file_reader.content[::-1]

Of course you want to test this silly function. Maybe you come up with something like this:

import mock
import unittest


class SillyTest(unittest.TestCase):
    def test_silly_reversing(self):
        mock_reader = mock.MagicMock()
        mock_reader.content = mock.PropertyMock(return_value='silly')

        assert silly_reverse_content(mock_reader) == 'yllis'

Unfortunately, that won’t work:

TypeError: 'PropertyMock' object has no attribute '__getitem__'

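A PropertyMock has to be set on the type of an object, not on the object itself. Like a regular property, it only takes effect when attribute lookup finds it on the class; assigned to an instance, it is just another attribute that holds a mock object. To get things working, you need to do it like this: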

class SillyTest(unittest.TestCase):
    def test_silly_reversing(self):
        mock_reader = mock.MagicMock()
        type(mock_reader).content = mock.PropertyMock(return_value='silly')

        assert silly_reverse_content(mock_reader) == 'yllis'
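To see the difference side by side, here is a minimal sketch. It assumes Python 3, where the library ships as unittest.mock in the standard library, and the function name compare_instance_and_type is made up for illustration:

```python
from unittest import mock


def compare_instance_and_type():
    # Assigned on the instance: the PropertyMock is stored as a plain
    # attribute, so reading it hands back the mock object itself.
    on_instance = mock.MagicMock()
    on_instance.content = mock.PropertyMock(return_value='silly')

    # Assigned on the type: attribute lookup finds the PropertyMock on the
    # class and invokes it like a property, giving its return value.
    on_type = mock.MagicMock()
    type(on_type).content = mock.PropertyMock(return_value='silly')

    # Every mock instance gets its own class, so the assignment above
    # does not leak into unrelated mocks.
    other = mock.MagicMock()

    return on_instance.content, on_type.content, other.content
```

The first value is the PropertyMock object, the second is the string 'silly', and the third is an ordinary auto-created MagicMock attribute.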

That’s it!
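If you would rather exercise the real SillyFileReader than a MagicMock, the same rule applies: patch the property on the class. The sketch below is one way to do that under a few assumptions: Python 3 with unittest.mock, a made-up helper name reverse_with_patched_property, and a placeholder file path that is never opened because _read_file is patched as well.

```python
from unittest import mock


class SillyFileReader(object):
    def __init__(self, file_path):
        self._file_path = file_path
        self._content = self._read_file()

    def _read_file(self):
        with open(self._file_path, 'r') as f:
            return f.read()

    @property
    def content(self):
        return self._content


def silly_reverse_content(silly_file_reader):
    return silly_file_reader.content[::-1]


def reverse_with_patched_property():
    # Patch _read_file so the constructor does not touch the file system,
    # and swap the content property on the class for a PropertyMock.
    with mock.patch.object(SillyFileReader, '_read_file',
                           return_value='unused'), \
         mock.patch.object(SillyFileReader, 'content',
                           new_callable=mock.PropertyMock,
                           return_value='silly') as mock_content:
        reader = SillyFileReader('/placeholder/path')
        result = silly_reverse_content(reader)
        # Reading the property calls the PropertyMock without arguments.
        mock_content.assert_called_once_with()
    return result
```

Both patches are undone when the with block exits, so the real property is back in place for other tests.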

Running Jekyll locally with Docker

This website is built with the static site generator Jekyll and hosted on GitHub Pages. There are a number of reasons why this combination is awesome:

  • I can write everything in Markdown
  • The workflow has source control baked in
  • The website is automatically updated by pushing to master

If you want to give Jekyll a go and you’re running Linux or OS X, you have to make sure you have the correct version of Ruby and some gems installed. Running it on Windows isn’t officially supported, but it can be done using Chocolatey and it also requires Ruby and some gems. Regardless of your operating system, there is an easier way to get Jekyll running on your machine. This is where Docker comes into play.

When you run Jekyll from inside a Docker container, you won’t have to bother with installing a specific version of Ruby, or with gems you already have installed clashing with the versions Jekyll needs. You could of course use the Ruby Version Manager, but working with Docker really is a breeze.

The rest of this post assumes you have Docker and Docker Compose installed and have some basic knowledge of working with them. Docker has some great getting started guides for Linux, Mac and Windows.

The easiest way to get started is probably to download a theme for Jekyll. This website uses a slightly modified version of the Hyde theme. It’s hosted on GitHub, so you can download or clone it from there.

The friendly people of Jekyll have already made a Docker image available on Docker Hub. Let’s take advantage of that! Suppose you have downloaded or cloned the Hyde repository into ~/jekyll-site/. To use the Jekyll image without having to pass in the required options every time to create a container, we can make use of Docker Compose. Create a file called docker-compose.yml in ~/jekyll-site with these contents:

jekyll:
    image: jekyll/jekyll:pages
    command: jekyll serve --watch --incremental
    ports:
        - 4000:4000
    volumes:
        - .:/srv/jekyll

  • With image: jekyll/jekyll:pages you indicate you want to use the jekyll/jekyll image tagged with “pages”. This is a specific image suited for GitHub Pages.
  • The part after command: is the command to execute in the container. The jekyll serve command starts Jekyll’s built-in development web server. The --watch option instructs Jekyll to regenerate the HTML automatically when a file is changed, and --incremental limits each regeneration to the files affected by the change.
  • 4000:4000 forwards port 4000 of the container to your local port 4000.
  • Finally, on the last line you map the current directory (~/jekyll-site/) to /srv/jekyll/. That’s where the image is configured to go looking for a Jekyll site.

That’s all! You can start your Jekyll site by opening a terminal in your site’s directory and running docker-compose up. You will see something like this:

jekyll_1  | Configuration file: /srv/jekyll/_config.yml
jekyll_1  |             Source: /srv/jekyll
jekyll_1  |        Destination: /srv/jekyll/_site
jekyll_1  |  Incremental build: enabled
jekyll_1  |       Generating...
jekyll_1  |                     done in 0.343 seconds.
jekyll_1  |  Auto-regeneration: enabled for '/srv/jekyll'
jekyll_1  | Configuration file: /srv/jekyll/_config.yml
jekyll_1  |     Server address: http://0.0.0.0:4000//
jekyll_1  |   Server running... press ctrl-c to stop.

If you go to http://0.0.0.0:4000 (or the IP of your VM if you’re running Docker on OS X or Windows) in a browser you’ll see your Jekyll site. Easy, right?

Don't dehydrate your code

DRY is one of the most overused principles in software development and the world would be a better place if a lot of code was a bit more MOIST. (I’m still looking for a relevant catch phrase that has MOIST as an acronym.) There, I’ve said it. Before you close your browser, allow me to explain myself.

A lack of pragmatism

While everyone agrees that principles should be applied pragmatically, it appears that DRY is exempt from any form of pragmatism. When I read blog posts or questions and answers on StackOverflow, or talk to people, I often have the impression that all code should be as DRY as possible at all times. I have often seen this lead to unnecessary (and overly complex) abstractions that only exist for the sake of DRYing up code. As we all know, naming things is jokingly said to be one of the hardest things in software development. Unnecessary and complex abstractions combined with bad naming give you code that is hard to read and hard to maintain. DRY should lead to code that is easy to understand and maintain, but a lot of times the exact opposite is achieved.

It’s all about context

I’m all for avoiding repetition, but not all code that looks similar serves the same purpose. You should keep the context of the code in mind when trying to determine if two instances of seemingly similar code are actual repetitions or not.

Let’s look at these two classes:

class Book(object):
    def __init__(self):
        self.pages = []

    def add_page(self, page):
        # Refuse the page once the book already holds 1000 pages.
        if len(self.pages) >= 1000:
            raise Exception('Max 1000 allowed.')

        self.pages.append(page)


class Box(object):
    def __init__(self):
        self.items = []

    def add_item(self, item):
        # Refuse the item once the box already holds 1000 items.
        if len(self.items) >= 1000:
            raise Exception('Max 1000 allowed.')

        self.items.append(item)

The add_page and add_item methods sure look similar. A lot of people might argue some refactoring is needed to DRY things up. They might end up with something like this:

class ContainerBase(object):
    @staticmethod
    def add_to_list(item, item_list):
        # Shared rule: refuse the item once the list already holds 1000.
        if len(item_list) >= 1000:
            raise Exception('Max 1000 allowed.')

        item_list.append(item)


class Book(ContainerBase):
    def __init__(self):
        self.pages = []

    def add_page(self, page):
        self.add_to_list(page, self.pages)


class Box(ContainerBase):
    def __init__(self):
        self.items = []

    def add_item(self, item):
        self.add_to_list(item, self.items)

Look at this lovely DRY code! But then the requirements change. You need different exception messages for books and boxes. Because of thinner paper, you can now have books with 1500 pages. Specific types of items can only appear twice in a box.

Before you know it the add_to_list method takes 6 parameters, including three booleans to toggle behavior. But hey, the code is DRY.
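To make that concrete, here is a rough sketch of where such a shared helper could end up. Everything beyond the names from the earlier listing is an assumption made for illustration: the parameter names, the three flags, and the simplification that the "twice in a box" rule applies to every item that compares equal.

```python
class ContainerBase(object):
    @staticmethod
    def add_to_list(item, item_list, max_size, is_book=False,
                    limit_duplicates=False, raise_on_failure=True):
        # Hypothetical: six parameters, three of them boolean toggles.
        if len(item_list) >= max_size:
            if not raise_on_failure:
                return False
            noun = 'pages' if is_book else 'items'
            raise Exception('Max %d %s allowed.' % (max_size, noun))

        # Only boxes care about this rule, yet it lives in shared code.
        if limit_duplicates and item_list.count(item) >= 2:
            if not raise_on_failure:
                return False
            raise Exception('Max 2 of %r allowed.' % (item,))

        item_list.append(item)
        return True


class Book(ContainerBase):
    def __init__(self):
        self.pages = []

    def add_page(self, page):
        self.add_to_list(page, self.pages, 1500, is_book=True)


class Box(ContainerBase):
    def __init__(self):
        self.items = []

    def add_item(self, item):
        self.add_to_list(item, self.items, 1000, limit_duplicates=True)
```

Every caller now has to know which combination of flags produces the behavior it wants, and every new rule for one class adds a branch that the other class has to step around.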

Books and boxes have nothing in common. They don’t share any business rule. The initial rules “Books can’t contain more than 1000 pages” and “Boxes can’t contain more than 1000 items” are exactly that: two separate rules. They may look similar, but they are two different rules nevertheless and shouldn’t be represented as one in the code.
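For comparison, here is a sketch of the two classes kept separate after the same requirement changes. The constant names, the exception messages, and the LIMITED_KINDS set (with 'battery' as a stand-in for whichever item types are restricted) are assumptions for illustration; the requirements above do not spell them out.

```python
class Book(object):
    MAX_PAGES = 1500  # thinner paper allows more pages

    def __init__(self):
        self.pages = []

    def add_page(self, page):
        if len(self.pages) >= self.MAX_PAGES:
            raise Exception(
                'A book can hold at most %d pages.' % self.MAX_PAGES)

        self.pages.append(page)


class Box(object):
    MAX_ITEMS = 1000
    MAX_PER_LIMITED_KIND = 2
    # Hypothetical: the kinds of items that may only appear twice.
    LIMITED_KINDS = frozenset(['battery'])

    def __init__(self):
        self.items = []

    def add_item(self, item):
        if len(self.items) >= self.MAX_ITEMS:
            raise Exception(
                'A box can hold at most %d items.' % self.MAX_ITEMS)

        if (item in self.LIMITED_KINDS
                and self.items.count(item) >= self.MAX_PER_LIMITED_KIND):
            raise Exception('A box can hold at most %d of %r.'
                            % (self.MAX_PER_LIMITED_KIND, item))

        self.items.append(item)
```

Each rule lives next to the only class it applies to, so changing the page limit cannot affect boxes, and the per-kind limit never has to be switched off for books.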

My advice

Don’t prematurely DRY up your code. We’ve all learned we shouldn’t prematurely optimize our code. It’s time we treat DRY the same. It’s OK to have similar or identical code in two or more different places. There’s absolutely nothing wrong with that.

It can become a problem when you have to make changes to such pieces of code, but as long as you don’t, there’s no problem at all.

Let’s go back to our previous example of books and boxes. Suppose a feature request comes in asking to allow books to have up to 1500 pages. With the ContainerBase there are two possible outcomes:

  1. You forget (or don’t know) the ContainerBase is used by Box as well. You have now introduced boxes that can hold up to 1500 items. Congratulations on your new bug!
  2. You remember the ContainerBase is used by Box as well and take it into account. Everything is OK, but you had to do some unnecessary work. Combine that with the amount of time you spent prematurely DRYing up the code. Now look at your velocity and cry as you realize how much business value you could have added instead.

Without the ContainerBase class, you would only need to change one simple thing in the Book class. But let’s say that after the release of the feature request you receive an email saying that they actually also wanted boxes to be able to hold 1500 items. No problem, just another simple change to the Box class. But at this point, you should make a note that Book and Box were changed at the same time in a similar way. If this becomes a trend for these two classes, it’s time to refactor them and extract shared logic.

I really don’t see the advantage of prematurely DRYing up your code. There’s always the chance you were right and made the right decision of course, but a lot of times you just don’t know how things will have to behave or change in the future. Especially if you’re taking an agile approach, building and releasing features iteratively, you just can’t be sure. It’s more time-consuming to undo incorrect DRYness than it is to wait, make the correct decision retroactively, and DRY things up then. That’s time that could be spent delivering business value or writing unit tests instead.

In the end, it all comes down to this: YAGNI > DRY.