How To Write Testable Code in Python

So, a few weeks back, I stumbled upon this awesome talk by Brandon Rhodes. I was so hooked that I couldn't resist the temptation to dive right into the example he shared and jot down my learnings.

One key takeaway is the importance of separating I/O operations (such as network requests and database calls) from the core logic of our code. Doing so makes our code more modular and testable.

I won't be delving into the intricacies of Clean Architecture or Clean Code in this blog post. While those concepts hold their own value, I want to focus on something that you can put into use immediately.

Let's dive right into an example that will serve as our starting point. This example is taken directly from Brandon Rhodes' slides, which you can find here.

Exploring the find_definition Function

Imagine we have a Python function called find_definition that performs data processing and involves making HTTP requests to an external API.

import requests                      # Listing 1
from urllib.parse import urlencode

def find_definition(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = requests.get(url)     # I/O
    data = response.json()           # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

Writing our first test

To write a unit test for the find_definition function, we can utilize Python's built-in unittest module. Here's an example of how we can approach it:

import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')

To isolate the I/O operations, we use the patch decorator from the unittest.mock module. It allows us to mock the behavior of the requests.get function.

By doing so, we can control the response that our function receives during testing. This way, we can test the find_definition function in isolation without actually making real HTTP requests.
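The chained return_value attributes in the test can look opaque at first. Here is a minimal sketch that uses a bare MagicMock in place of the patched requests.get, with a placeholder URL that is not part of the original example. It shows what each link in the chain configures:

```python
from unittest.mock import MagicMock

# Stand-in for the object that @patch('requests.get') passes into the test.
mock_get = MagicMock()

# Calling mock_get(...) returns mock_get.return_value; calling .json() on
# that object returns whatever is assigned here.
mock_get.return_value.json.return_value = {'Definition': 'Visit tournacat.com'}

response = mock_get('http://example.invalid/')  # placeholder URL, no network involved
data = response.json()

print(data['Definition'])

# The mock also records how it was called, which is what the test asserts on.
mock_get.assert_called_with('http://example.invalid/')
```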

Testing difficulties and tight coupling

By using the patch decorator to mock the behavior of the requests.get function, we tightly couple the tests to the internal workings of the function. This makes the tests more susceptible to breaking if there are changes in the implementation or dependencies.

The tests may need to be updated whenever the implementation of find_definition changes, for example when we:

  1. Use a different HTTP library
  2. Adapt to a change in the structure of the API response
  3. Point to a different API endpoint

For a function like find_definition, writing and maintaining unit tests quickly becomes cumbersome.
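To make the cost concrete, here is a hedged sketch of the first scenario. It assumes the implementation is switched to the standard library's urllib.request, a choice made purely for this illustration and not taken from the talk. The logic of find_definition is unchanged, yet both the patch target and the mock setup in the test have to be rewritten:

```python
import io
import json
import unittest
import urllib.request
from unittest.mock import patch
from urllib.parse import urlencode


def find_definition(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    with urllib.request.urlopen(url) as response:  # I/O
        data = json.load(response)                 # I/O
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class TestFindDefinition(unittest.TestCase):
    # The old @patch('requests.get') would no longer intercept anything.
    @patch('urllib.request.urlopen')
    def test_find_definition(self, mock_urlopen):
        payload = json.dumps({'Definition': 'Visit tournacat.com'}).encode()
        # urlopen is used as a context manager, so the mock has to mimic that.
        mock_urlopen.return_value.__enter__.return_value = io.BytesIO(payload)

        definition = find_definition('tournacat')

        self.assertEqual(definition, 'Visit tournacat.com')
        mock_urlopen.assert_called_with(
            'http://api.duckduckgo.com/?q=define+tournacat&format=json')
```

Nothing about the behavior under test changed here. Only the plumbing did, and the test still had to follow it.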

Hiding I/O: A Common Mistake

When working with functions like find_definition that involve I/O operations, I'd often refactor the code to extract the I/O into a separate function, such as call_json_api, as shown in the updated code below (again, borrowed from Brandon's slides):

def find_definition(word):           # Listing 2
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    data = call_json_api(url)
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

def call_json_api(url):
    response = requests.get(url)     # I/O
    data = response.json()           # I/O
    return data

By extracting the I/O operations into a separate function, we achieve a level of abstraction and encapsulation.

The find_definition function now delegates the responsibility of making the HTTP request and parsing the JSON response to the call_json_api function.

Updating the test

Again, we utilize the patch decorator from the unittest.mock module to mock the behavior of the call_json_api function (instead of requests.get). By doing so, we can control the response that find_definition receives during testing.

import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
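    # patch() needs a dotted path; a bare name raises TypeError.
    # This assumes the test lives in the same module as find_definition.
    @patch(f'{__name__}.call_json_api')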
    def test_find_definition(self, mock_call_json_api):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_call_json_api.return_value = mock_response
        
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        
        self.assertEqual(definition, expected_definition)
        mock_call_json_api.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')

“We have hidden I/O, but have we really decoupled it?”

However, it's important to note that although we have hidden the I/O operations behind the call_json_api function, we haven't completely decoupled them.

The find_definition function still depends on the implementation details of call_json_api and assumes it will handle the I/O operations correctly. Calling find_definition still triggers a network request unless a test patches call_json_api.

Dependency Injection: Decoupling

To achieve a more decoupled design, we could further separate the I/O concerns by using dependency injection.

Here's an updated version of find_definition:

import requests
from urllib.parse import urlencode

def find_definition(word, api_client=requests):  # Dependency injection
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)               # I/O 
    data = response.json()                       # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

The api_client parameter is introduced, which represents the dependency responsible for making the API calls. By default, it is set to requests, allowing us to use the requests library for the I/O operations.

Unit testing with dependency injection

Using dependency injection allows for better control and predictability in unit testing. Here's an example of how we can write unit tests for the find_definition function with dependency injection:

import unittest
from unittest.mock import MagicMock

class TestFindDefinition(unittest.TestCase):
    def test_find_definition(self):
        mock_response = {u'Definition': u'How to add Esports schedules to Google Calendar?'}
        mock_api_client = MagicMock()
        mock_api_client.get.return_value.json.return_value = mock_response

        word = 'example'
        expected_definition = 'How to add Esports schedules to Google Calendar?'

        definition = find_definition(word, api_client=mock_api_client)

        self.assertEqual(definition, expected_definition)
        mock_api_client.get.assert_called_once_with('http://api.duckduckgo.com/?q=define+example&format=json')

In the updated unit test example, we create a mock API client using the MagicMock class from the unittest.mock module. The mock API client is configured to return a predefined response, mock_response, when its get method is called and json is called on the result.
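The same injected mock makes the error path easy to exercise. The sketch below is an illustration rather than part of the original slides. To stay runnable without third-party packages, it repeats find_definition with api_client as a required argument instead of defaulting to requests. The test class name and the sample word are made up:

```python
import unittest
from unittest.mock import MagicMock
from urllib.parse import urlencode


def find_definition(word, api_client):
    # Same body as the dependency-injected version; only the `requests`
    # default is dropped so this sketch has no external dependencies.
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)
    data = response.json()
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class TestFindDefinitionErrors(unittest.TestCase):
    def test_empty_definition_raises_value_error(self):
        mock_api_client = MagicMock()
        # The API signals "no definition" with an empty string.
        mock_api_client.get.return_value.json.return_value = {'Definition': ''}

        with self.assertRaises(ValueError):
            find_definition('qwertyzxcv', api_client=mock_api_client)
```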

Yay! In the case where we want to use a different HTTP library, we're now in a much better spot.
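As a rough sketch of what that swap could look like, the adapter below wraps the standard library's urllib.request so that it offers the same get(url).json() shape find_definition expects. JsonResponse and UrllibClient are hypothetical names invented for this example. find_definition is repeated with a required api_client so the block has no third-party imports:

```python
import json
import urllib.request
from urllib.parse import urlencode


class JsonResponse:
    """Hypothetical wrapper exposing the one method find_definition uses."""

    def __init__(self, body):
        self._body = body

    def json(self):
        return json.loads(self._body)


class UrllibClient:
    """Hypothetical adapter: get(url) backed by urllib instead of requests."""

    def get(self, url):
        with urllib.request.urlopen(url) as response:  # I/O
            return JsonResponse(response.read())


def find_definition(word, api_client):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)
    data = response.json()
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


# Usage (performs a real HTTP request, so it is left commented out):
# print(find_definition('python', api_client=UrllibClient()))
```

find_definition itself did not change. Only the object passed in did.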

Problems with dependency injection

While dependency injection offers several benefits, it can also introduce some challenges. As highlighted by Brandon, there are a few potential problems to be aware of:

  1. Mock vs. Real Library: The mock objects used for testing might not fully replicate the behavior of the real dependencies. This could lead to discrepancies between test results and actual runtime behavior.
  2. Complex Dependencies: Functions or components with multiple dependencies, such as a combination of database, filesystem, and external services, can require significant injection setup and management, making the codebase more complex.
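The first problem is easy to reproduce. In the sketch below, ApiClient is a hypothetical stand-in for a real dependency. A plain MagicMock accepts a call to a method the real class does not have, while a mock built with spec= rejects it:

```python
from unittest.mock import MagicMock


class ApiClient:
    """Hypothetical real dependency with a single method."""

    def get(self, url):
        raise NotImplementedError('would perform I/O')


# A plain MagicMock invents any attribute on demand, so a call to a
# method that ApiClient does not have goes unnoticed.
loose_mock = MagicMock()
loose_mock.fetch('http://example.invalid/')

# Passing spec= restricts the mock to the attributes of the real class.
strict_mock = MagicMock(spec=ApiClient)
try:
    strict_mock.fetch('http://example.invalid/')
    caught = False
except AttributeError:
    caught = True

print(caught)
```

A spec narrows the gap between mock and real object, but it only checks that attributes exist, not that the mocked behavior matches.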

This brings us to the next point.

Separating I/O Operations from Core Logic

In the pursuit of flexible and testable code, we can adopt a different approach that does not rely on explicit dependency injection.

We can achieve a clear separation of concerns by placing the I/O operations at the outermost layer of our code. Here's an example that demonstrates this concept:

def find_definition(word):           # Listing 3
    url = build_url(word)
    data = requests.get(url).json()  # I/O
    return pluck_definition(data)

Here, find_definition becomes a thin outer layer. It performs the I/O (making the HTTP request and decoding the JSON response) and leaves the core logic to code that does no I/O at all.

That core logic lives in two separate functions:

  1. build_url constructs the URL for the API request.
  2. pluck_definition extracts the definition from the API response.

Here are the corresponding code snippets:

def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url

def pluck_definition(data):
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

By putting I/O at the outermost layer, the code becomes more flexible: build_url and pluck_definition perform no I/O, so each can be tested and replaced on its own.

For example, you can easily switch to a different API endpoint by modifying the build_url function, or handle alternative error scenarios in the pluck_definition function.
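One possible way to make the endpoint switch explicit is sketched below. The base_url parameter, the DEFAULT_BASE_URL constant, and the example.invalid address are assumptions added for illustration and do not appear in the original slides:

```python
from urllib.parse import urlencode

# Hypothetical constant holding the endpoint used throughout the article.
DEFAULT_BASE_URL = 'http://api.duckduckgo.com/?'


def build_url(word, base_url=DEFAULT_BASE_URL):
    # base_url is an illustrative extra parameter; callers that omit it
    # get the same URL as the original build_url.
    q = 'define ' + word
    return base_url + urlencode({'q': q, 'format': 'json'})


default_url = build_url('example')
other_url = build_url('example', base_url='https://example.invalid/define?')

print(default_url)
print(other_url)
```

Because build_url does no I/O, checking the new endpoint needs nothing more than a string comparison.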

This separation of concerns enables modifications to the I/O layer without impacting the core functionality of find_definition, enhancing the overall maintainability and adaptability of the codebase.

Updating unit tests (again)

To demonstrate the improved flexibility and control offered by the modular design, let's update our unit tests for the find_definition function.

Here's the updated code snippet:

import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        word = 'example'
        expected_definition = 'Visit tournacat.com'
        
        definition = find_definition(word)
        
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_once_with(build_url(word))
    
    def test_build_url(self):
        word = 'example'
        expected_url = 'http://api.duckduckgo.com/?q=define+example&format=json'
        
        url = build_url(word)
        self.assertEqual(url, expected_url)
    
    def test_pluck_definition(self):
        mock_response = {'Definition': 'What does tournacat.com do?'}
        expected_definition = 'What does tournacat.com do?'
        
        definition = pluck_definition(mock_response)
        self.assertEqual(definition, expected_definition)

if __name__ == '__main__':
    unittest.main()

In the updated unit tests, we now have separate test methods for each of the modular components:

  1. test_find_definition remains largely unchanged from the previous example before dependency injection was introduced, verifying the correct behavior of the find_definition function. However, it now asserts that the requests.get function is called with the URL generated by the build_url function, demonstrating the updated interaction between the modular components.
  2. test_build_url verifies that the build_url function correctly constructs the URL based on the given word.
  3. test_pluck_definition ensures that the pluck_definition function correctly extracts the definition from the provided data.

By updating our unit tests, we can now test each component independently, ensuring that they function correctly in isolation.
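Since build_url and pluck_definition do no I/O, edge cases can be covered without any mocks. The sketch below repeats both functions so it runs on its own. The test class and the specific cases are illustrative additions:

```python
import unittest
from urllib.parse import urlencode


def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url


def pluck_definition(data):
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class TestPureFunctions(unittest.TestCase):
    def test_build_url_escapes_special_characters(self):
        # urlencode turns spaces into '+' and a literal '+' into '%2B'.
        expected_url = 'http://api.duckduckgo.com/?q=define+c%2B%2B&format=json'
        self.assertEqual(build_url('c++'), expected_url)

    def test_pluck_definition_rejects_empty_definition(self):
        with self.assertRaises(ValueError):
            pluck_definition({'Definition': ''})
```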

Summary

In short, we’ve explored different approaches to refactoring to address tight coupling and achieve loose coupling between components. On top of that, we witnessed how unit testing can be enhanced by mocking I/O operations and controlling the behavior of external dependencies.

By placing I/O operations at the outermost layer of our code, we achieve a clear separation of concerns, enhancing the modularity and maintainability of our codebase.

Finally, if you're interested in reading more from me, check out my other articles about Python.
