How To Write Testable Code in Python
So, a few weeks back, I stumbled upon this awesome talk by Brandon Rhodes. I was so hooked that I couldn't resist the temptation to dive right into the example he shared and jot down my learnings.
One key takeaway is the importance of separating I/O operations (e.g. network requests, database calls) from the core logic of our code. Doing so makes the code more modular and easier to test.
I won't be delving into the intricacies of Clean Architecture or Clean Code in this blog post. While those concepts hold their own value, I want to focus on something that you can put into use immediately.
Let's dive right into an example that will serve as our starting point. This example is taken directly from Brandon Rhodes' slides, which you can find here.
Exploring the find_definition function
Imagine we have a Python function called find_definition that performs data processing and involves making HTTP requests to an external API.
import requests  # Listing 1
from urllib.parse import urlencode

def find_definition(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = requests.get(url)  # I/O
    data = response.json()  # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
Writing our first test
To write a unit test for the find_definition function, we can use Python's built-in unittest module. Here's an example of how we can approach it:
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')
To isolate the I/O operations, we use the patch decorator from the unittest.mock module. It allows us to mock the behavior of the requests.get function.
By doing so, we can control the response that our function receives during testing. This way, we can test the find_definition function in isolation without actually making real HTTP requests.
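The chained return_value assignment in the test can look cryptic at first. Here is a minimal sketch of what it sets up, using a plain MagicMock as a stand-in for the patched requests.get (the URL is a placeholder, not part of the original example):

```python
from unittest.mock import MagicMock

# Stand-in for the object that @patch('requests.get') hands to the test.
mock_get = MagicMock()
mock_get.return_value.json.return_value = {'Definition': 'Visit tournacat.com'}

# Calling the mock returns mock_get.return_value (the fake "response")...
response = mock_get('http://example.invalid/')
# ...and calling .json() on that fake response returns the configured dict.
data = response.json()
```

Each return_value step corresponds to one call in the code under test: requests.get(url) first, then response.json().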
Testing difficulties and tight coupling
By using the patch decorator to mock the behavior of the requests.get function, we tightly couple the tests to the internal workings of the function. This makes the tests more susceptible to breaking if there are changes in the implementation or dependencies.
If the implementation of find_definition changes, for example:
- Using a different HTTP library
- Modifying the structure of the API response
- Changing the API endpoint
then the tests may need to be updated accordingly. For a function like find_definition, writing and maintaining unit tests becomes cumbersome.
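To make the first bullet concrete, here is a rough sketch of what might happen if find_definition were rewritten on top of the standard library's urllib. The name find_definition_urllib is hypothetical and this variant is not from the talk; it only illustrates how much of the test has to change even though the function's behavior stays the same.

```python
import json
import urllib.request
from unittest.mock import patch
from urllib.parse import urlencode


def find_definition_urllib(word):
    """Hypothetical rewrite of Listing 1 that uses urllib instead of requests."""
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': 'define ' + word, 'format': 'json'})
    with urllib.request.urlopen(url) as response:  # I/O
        data = json.loads(response.read())  # I/O
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


# patch('requests.get') would no longer intercept anything here. The test now
# has to know about urlopen, its context manager, and the raw bytes it returns.
with patch('urllib.request.urlopen') as mock_urlopen:
    mock_response = mock_urlopen.return_value.__enter__.return_value
    mock_response.read.return_value = b'{"Definition": "Visit tournacat.com"}'
    definition = find_definition_urllib('tournacat')
```

Nothing about the function's contract changed, yet every line of mock setup had to be rewritten.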
Hiding I/O: A Common Mistake
Typically, when working with functions like find_definition that involve I/O operations, I’d often refactor the code to extract the I/O operations into a separate function, such as call_json_api, as shown in the updated code below (again, borrowed from Brandon’s slides):
def find_definition(word):  # Listing 2
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    data = call_json_api(url)
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition

def call_json_api(url):
    response = requests.get(url)  # I/O
    data = response.json()  # I/O
    return data
By extracting the I/O operations into a separate function, we achieve a level of abstraction and encapsulation.
The find_definition function now delegates the responsibility of making the HTTP request and parsing the JSON response to the call_json_api function.
Updating the test
Again, we use the patch decorator from the unittest.mock module to mock the behavior of the call_json_api function (instead of requests.get). By doing so, we can control the response that find_definition receives during testing.
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    # patch() needs a dotted 'module.attribute' target. This assumes
    # call_json_api is defined in the same module as the tests.
    @patch(f'{__name__}.call_json_api')
    def test_find_definition(self, mock_call_json_api):
        mock_response = {u'Definition': 'Visit tournacat.com'}
        mock_call_json_api.return_value = mock_response
        expected_definition = 'Visit tournacat.com'
        definition = find_definition('tournacat')
        self.assertEqual(definition, expected_definition)
        mock_call_json_api.assert_called_with('http://api.duckduckgo.com/?q=define+tournacat&format=json')
“We have hidden I/O, but have we really decoupled it?”
However, it's important to note that although we have hidden the I/O operations behind the call_json_api function, we haven't completely decoupled them.
The find_definition function still depends on the implementation details of call_json_api and assumes it will handle the I/O operations correctly.
Dependency Injection: Decoupling
To achieve a more decoupled design, we could further separate the I/O concerns by using dependency injection.
Here's an updated version of find_definition:
import requests
from urllib.parse import urlencode

def find_definition(word, api_client=requests):  # Dependency injection
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    response = api_client.get(url)  # I/O
    data = response.json()  # I/O
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
The api_client parameter is introduced, which represents the dependency responsible for making the API calls. By default, it is set to requests, allowing us to use the requests library for the I/O operations.
Unit testing with dependency injection
Using dependency injection allows for better control and predictability in unit testing. Here's an example of how we can write unit tests for the find_definition function with dependency injection:
import unittest
from unittest.mock import MagicMock

class TestFindDefinition(unittest.TestCase):
    def test_find_definition(self):
        mock_response = {u'Definition': u'How to add Esports schedules to Google Calendar?'}
        mock_api_client = MagicMock()
        mock_api_client.get.return_value.json.return_value = mock_response
        word = 'example'
        expected_definition = 'How to add Esports schedules to Google Calendar?'
        definition = find_definition(word, api_client=mock_api_client)
        self.assertEqual(definition, expected_definition)
        mock_api_client.get.assert_called_once_with('http://api.duckduckgo.com/?q=define+example&format=json')
In the updated unit test example, we create a mock API client using the MagicMock class from the unittest.mock module. The mock API client is configured to return a predefined response, i.e. mock_response, when its get method is called.
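Because the dependency is now just an argument, a mocking library isn't strictly required. As a rough sketch, a small hand-written stub can play the same role. FakeClient and FakeResponse are hypothetical names introduced for this illustration, and the requests default is dropped from find_definition here so the snippet runs with only the standard library.

```python
from urllib.parse import urlencode


def find_definition(word, api_client):
    # Same logic as the dependency-injection listing, minus the `requests`
    # default (an assumption of this sketch, to keep it stdlib-only).
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': 'define ' + word, 'format': 'json'})
    data = api_client.get(url).json()
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class FakeResponse:
    """Hypothetical stand-in for a response object; only .json() is needed."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class FakeClient:
    """Hypothetical stand-in for the injected client; records requested URLs."""

    def __init__(self, payload):
        self._payload = payload
        self.requested_urls = []

    def get(self, url):
        self.requested_urls.append(url)
        return FakeResponse(self._payload)


client = FakeClient({'Definition': 'Visit tournacat.com'})
definition = find_definition('tournacat', api_client=client)
```

The stub spells out exactly which parts of the client the function relies on: a get(url) method whose result has a json() method.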
Yay! In the case where we want to use a different HTTP library, we're now in a much better spot.
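As an illustration of that claim, here is a hedged sketch of an adapter that lets the standard library's urllib satisfy the same get(url) / json() shape. UrllibClient and UrllibResponse are hypothetical names, not part of the talk, and find_definition is repeated without its requests default so the snippet is self-contained.

```python
import json
import urllib.request
from urllib.parse import urlencode


class UrllibResponse:
    """Hypothetical wrapper exposing the .json() method find_definition expects."""

    def __init__(self, body):
        self._body = body

    def json(self):
        return json.loads(self._body)


class UrllibClient:
    """Hypothetical adapter: same .get(url) shape, backed by urllib."""

    def get(self, url):
        with urllib.request.urlopen(url) as response:  # I/O
            return UrllibResponse(response.read())


def find_definition(word, api_client):
    # Unchanged logic; only the injected client differs.
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': 'define ' + word, 'format': 'json'})
    data = api_client.get(url).json()
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition
```

Calling find_definition('python', api_client=UrllibClient()) would then perform a real request through urllib, while the existing tests keep passing their own mock client untouched.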
Problems with dependency injection
While dependency injection offers several benefits, it can also introduce some challenges. As highlighted by Brandon, there are a few potential problems to be aware of:
- Mock vs. Real Library: The mock objects used for testing might not fully replicate the behavior of the real dependencies. This could lead to discrepancies between test results and actual runtime behavior.
- Complex Dependencies: Functions or components with multiple dependencies, such as a combination of database, filesystem, and external services, can require significant injection setup and management, making the codebase more complex.
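The first problem is easy to reproduce. In this small sketch, fetch is a deliberately made-up method that the injected client would not have. A bare MagicMock accepts it anyway, while a mock restricted with spec rejects it:

```python
from unittest.mock import MagicMock

# A bare MagicMock accepts any attribute, so calling a method the real
# client lacks goes unnoticed in the test.
loose_client = MagicMock()
loose_client.fetch('http://api.duckduckgo.com/')  # no error

# Restricting the mock to the attributes we expect turns that into a failure.
strict_client = MagicMock(spec=['get'])
try:
    strict_client.fetch('http://api.duckduckgo.com/')
    mistake_detected = False
except AttributeError:
    mistake_detected = True
```

Note that spec only narrows the gap. It checks attribute names, not whether the mocked return values behave like the real library's responses.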
This brings us to the next point.
Separating I/O Operations from Core Logic
In pursuit of flexible and testable code, we can adopt a different approach that doesn't rely on explicit dependency injection.
We can achieve a clear separation of concerns by placing the I/O operations at the outermost layer of our code. Here's an example that demonstrates this concept:
def find_definition(word):  # Listing 3
    url = build_url(word)
    data = requests.get(url).json()  # I/O
    return pluck_definition(data)
Here, find_definition becomes a thin outer layer: it performs the I/O itself (making the HTTP request and decoding the JSON response) and delegates everything else to two helper functions that perform no I/O:
- build_url constructs the URL for the API request.
- pluck_definition extracts the definition from the API response.
Here are the corresponding code snippets:
def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url

def pluck_definition(data):
    definition = data[u'Definition']
    if definition == u'':
        raise ValueError('that is not a word')
    return definition
By putting I/O at the outermost layer, code becomes more flexible. We successfully created functions that can be individually tested and replaced as needed.
For example, you can easily switch to a different API endpoint by modifying the build_url function, or handle alternative error scenarios in the pluck_definition function.
This separation of concerns enables modifications to the I/O layer without impacting the core logic in build_url and pluck_definition, enhancing the overall maintainability and adaptability of the codebase.
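One practical consequence is that the two helpers can be exercised with plain values, with no mocks and no network. A quick sketch (the inputs are made up for illustration):

```python
from urllib.parse import urlencode


def build_url(word):
    q = 'define ' + word
    url = 'http://api.duckduckgo.com/?'
    url += urlencode({'q': q, 'format': 'json'})
    return url


def pluck_definition(data):
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


# Plain inputs in, plain outputs out.
url = build_url('ice cream')
definition = pluck_definition({'Definition': 'a frozen dessert'})
```

Here url is a plain string with the spaces encoded as plus signs, and definition is the string taken straight from the dictionary.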
Updating unit tests (again)
To demonstrate the improved flexibility and control offered by the modular design, let's update our unit tests for the find_definition function.
Here's the updated code snippet:
import unittest
from unittest.mock import patch

class TestFindDefinition(unittest.TestCase):
    @patch('requests.get')
    def test_find_definition(self, mock_get):
        mock_response = {'Definition': 'Visit tournacat.com'}
        mock_get.return_value.json.return_value = mock_response
        word = 'example'
        expected_definition = 'Visit tournacat.com'
        definition = find_definition(word)
        self.assertEqual(definition, expected_definition)
        mock_get.assert_called_once_with(build_url(word))

    def test_build_url(self):
        word = 'example'
        expected_url = 'http://api.duckduckgo.com/?q=define+example&format=json'
        url = build_url(word)
        self.assertEqual(url, expected_url)

    def test_pluck_definition(self):
        mock_response = {'Definition': 'What does tournacat.com do?'}
        expected_definition = 'What does tournacat.com do?'
        definition = pluck_definition(mock_response)
        self.assertEqual(definition, expected_definition)

if __name__ == '__main__':
    unittest.main()
In the updated unit tests, we now have separate test methods for each of the modular components:
- test_find_definition remains largely unchanged from the first example, before dependency injection was introduced, verifying the correct behavior of the find_definition function. However, it now asserts that the requests.get function is called with the URL generated by the build_url function, demonstrating the updated interaction between the modular components.
- test_build_url verifies that the build_url function correctly constructs the URL based on the given word.
- test_pluck_definition ensures that the pluck_definition function correctly extracts the definition from the provided data.
By updating our unit tests, we can now test each component independently, ensuring that they function correctly in isolation.
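The tests so far only cover the happy path. Since pluck_definition takes a plain dictionary, its failure modes are just as cheap to test. The sketch below is an addition of mine rather than something from the slides. The KeyError case simply documents what the function, as written, does when the Definition key is missing.

```python
import unittest


def pluck_definition(data):
    definition = data['Definition']
    if definition == '':
        raise ValueError('that is not a word')
    return definition


class TestPluckDefinitionErrors(unittest.TestCase):
    def test_empty_definition_raises_value_error(self):
        with self.assertRaises(ValueError):
            pluck_definition({'Definition': ''})

    def test_missing_key_raises_key_error(self):
        # Current behavior: a response without 'Definition' surfaces as KeyError.
        with self.assertRaises(KeyError):
            pluck_definition({})
```

Covering the same cases through find_definition would need a mocked HTTP response for each one.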
Summary
In short, we’ve explored different approaches to refactoring to address tight coupling and achieve loose coupling between components. On top of that, we witnessed how unit testing can be enhanced by mocking I/O operations and controlling the behavior of external dependencies.
By placing I/O operations at the outermost layer of our code, we achieve a clear separation of concerns, enhancing the modularity and maintainability of our codebase.