From Practices of the Python Pro by Dane Hillard
This article covers
• Recognizing the signs of tightly coupled code
• Strategies for reducing coupling
Loose coupling is what allows you to effect change in different areas of your code without worrying that you’ll break something elsewhere. It’s also what allows you to work on one feature as your coworker tackles another. It’s the foundation for other desirable characteristics like extensibility. Without loose coupling, maintaining your code can quickly get out of hand. In this article you’ll see some of the pains of tight coupling and learn how to address them.
Defining coupling
Because this idea of coupling is such a big part of effective software development, it’s important to get a solid handle on what it means. What is coupling exactly? It’s useful to think of it as the connective tissue between the different areas of your code.
The connective tissue
Coupling can be a tricky concept at first because it’s not necessarily tangible. It’s a kind of mesh that runs throughout your code (figure 1). Where two pieces of code have high interdependency, that mesh is tightly woven and taut. Moving either piece of code around causes the other to move around too. The mesh between areas with little or no interdependence is flexible; maybe it’s made of rubber bands. You’d have to change the code in this looser part of the mesh much more drastically for it to impact the code around it.
Figure 1. Coupling is a measure of the interconnectedness of distinct pieces of software
I like this analogy because it doesn’t prescribe that tight coupling is inherently bad in all cases; rather, it focuses on the qualities where tight and loose coupling differ and helps you get a sense of the resulting outcomes for your code — tight coupling usually means more work shuffling things around. It also implies that coupling is a continuum rather than a binary, all-or-nothing thing.
Although coupling is measured along a continuum, there are common ways it manifests. You can learn to recognize these and reduce coupling in your software as you see fit. First, I want to give you a more fine-grained definition of tight and loose coupling.
Tight coupling
Coupling is considered tight between two pieces of code—these could be modules or classes—when those pieces of code are interconnected. What does interconnectedness look like? In your code, several things create interconnection:
- A class that stores another object as an attribute
- A class whose methods call functions from another module
- A function or method that does a lot of procedural work using methods from another object
Any time a class, method, or function needs to carry a lot of knowledge about another module or class, this is tight coupling. Consider the code in listing 1. The display_book_info function carries knowledge of all the different pieces of information that a Book instance contains.
Listing 1. A function tightly coupled to an object
class Book:
    def __init__(self, title, subtitle, author):  # ❶
        self.title = title
        self.subtitle = subtitle
        self.author = author


def display_book_info(book):
    print(f'{book.title}: {book.subtitle} by {book.author}')  # ❷
❶A book stores several pieces of info as attributes
❷This function has knowledge of all the book’s attributes
If the Book class and the display_book_info function live in the same module, this code might be tolerable. It operates on related information, and it’s together in one place. But as your code base grows, you may find functions like this one in one module operating on classes from other modules.
Tight coupling isn’t inherently bad. Occasionally, it’s just trying to tell you something. Because display_book_info operates only on info from Book and does something book-related, the function and the class have high cohesion. It’s so tightly coupled to Book that it makes sense for you to move it inside the Book class as a method, as shown in the following listing.
Listing 2. Reducing coupling by increasing cohesion
class Book:
    def __init__(self, title, subtitle, author):
        self.title = title
        self.subtitle = subtitle
        self.author = author

    def display_info(self):  # ❶
        print(f'{self.title}: {self.subtitle} by {self.author}')  # ❷
❶ Moved to a method whose only necessary argument is self (still a Book)
❷ All references to book change to self
In general, tight coupling is problematic when it exists between two separate concerns. Some tight coupling is a sign of high cohesion which isn’t structured quite right.
You may have seen or written code similar to listing 3. Imagine you’ve got a search index to which your users can submit queries. The search module provides functionality for cleaning up search queries to make sure they produce more consistent results from the index. You write a main procedure to get a query from the user, clean it up, and print the cleaned-up version.
Listing 3. A procedure tightly coupled to the details of a class
import re


def remove_spaces(query):  # ❶
    query = query.strip()
    query = re.sub(r'\s+', ' ', query)
    return query


def normalize(query):  # ❷
    query = query.casefold()
    return query


if __name__ == '__main__':
    search_query = input('Enter your search query: ')  # ❸
    search_query = remove_spaces(search_query)  # ❹
    search_query = normalize(search_query)
    print(f'Running a search for "{search_query}"')  # ❺
❶ Turns ' George Washington ' into 'George Washington'
❷ Turns ‘Universitätsstraße’ (“University Street”) into ‘universitätsstrasse’
❸ Get a query from the user
❹ Remove spaces and normalize casing
❺ Print the cleaned query
Is the main procedure tightly coupled to the search module?
- A. No, because it could easily do that work itself
- B. Yes, because it calls some of the functions inside the search module
- C. Yes, because it’s likely to change if you change the way cleaning queries works
You can effectively identify coupling by assessing the likelihood that any given change to a module requires a change to the code that uses it (C). Although the main procedure could do the work the cleaning functions do, it’s important to discuss coupling concretely as it currently exists in your code; A is hypothetical and doesn’t help you do that. Calling a few functions from a module (B) is sometimes a sign of coupling, but the important metric is how likely the calling code is to need changes when that module changes.
Suppose your users report that they’re still getting inconsistent results from minor changes to their queries. You do some investigation and realize it’s because some users like to put quotes around their queries, thinking it makes them more specific, but your search index treats quotes literally, matching only records that contain the quotes as written. You decide to discard the quotes before running the query.
The way things are written, this involves adding a new function to search and updating all the places you clean queries to ensure they call the new function, as shown in listing 4. Those points in the code are all tightly coupled to search.
Listing 4. Tight coupling causes changes in one place to ripple outward
def remove_quotes(query):  # ❶
    query = re.sub(r'"', '', query)
    return query


if __name__ == '__main__':
    ...
    search_query = remove_quotes(search_query)  # ❷
    ...
❶A new function for removing quotes
❷Call the new function anywhere you normalize queries
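To make the ripple concrete, here is a minimal sketch in which two call sites each repeat the cleaning steps themselves. The names search_from_prompt and search_from_api are hypothetical and invented for illustration; the cleaning functions mirror the ones in the listings. One call site was updated to remove quotes and the other was missed, which is the kind of mistake this arrangement invites.

```python
import re


def remove_spaces(query):
    query = query.strip()
    query = re.sub(r'\s+', ' ', query)
    return query


def normalize(query):
    return query.casefold()


def remove_quotes(query):
    return re.sub(r'"', '', query)


# Hypothetical call site 1: updated to call the new function.
def search_from_prompt(raw_query):
    query = remove_spaces(raw_query)
    query = remove_quotes(query)
    query = normalize(query)
    return query


# Hypothetical call site 2: missed during the update, so quotes survive.
def search_from_api(raw_query):
    query = remove_spaces(raw_query)
    query = normalize(query)
    return query
```

Every call site that spells out the steps has to be found and edited whenever the steps change, and nothing warns you about the ones you miss.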
Read on to understand what loose coupling is and how it can help you in situations like this.
Loose coupling
Loose coupling is the ability for two pieces of code to interact to accomplish a task, without either relying heavily on the details of the other. This is often achieved through the use of shared abstractions.
Loosely coupled code implements and uses interfaces; at the extreme end, it uses only interfaces for intercommunication. Python’s dynamic typing allows us to relax this a bit, but there’s a philosophy here I’d like to emphasize to you.
IMPORTANT If you begin to think about the intercommunication between pieces of your code in terms of the messages objects send to each other rather than the objects themselves (figure 2), you’ll begin to identify cleaner abstractions and stronger cohesion.
Figure 2. Imagining interconnections between classes as the message they send and receive
What are messages? Messages are the questions you ask of an object or the things you tell it to do.
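As a rough illustration using the Book class from listing 2, compare asking an object three questions with telling it to do one thing. The book data here is made up for the example.

```python
class Book:
    def __init__(self, title, subtitle, author):
        self.title = title
        self.subtitle = subtitle
        self.author = author

    def display_info(self):
        print(f'{self.title}: {self.subtitle} by {self.author}')


# Made-up data, for illustration only.
book = Book('Example Title', 'An Example Subtitle', 'A. Author')

# Three questions: the caller has to know which attributes a Book has.
print(f'{book.title}: {book.subtitle} by {book.author}')

# One instruction: the caller only needs to know that a Book can display itself.
book.display_info()
```

Both lines print the same text, but the second caller keeps working if Book later changes how it stores or formats its information.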
Take another look at the main procedure from your query cleaner (listing 5). You achieve each transform on the query by calling a function to get a new query. Each of these is a message you’re sending.
Listing 5. Calling functions from a module
if __name__ == '__main__':
    search_query = input('Enter your search query: ')
    search_query = remove_spaces(search_query)  # ❶
    search_query = remove_quotes(search_query)  # ❷
    search_query = normalize(search_query)  # ❸
    print(f'Running a search for "{search_query}"')  # ❹
❶ Tell the search module to remove spaces
❷ Tell the search module to remove quotes
❸ Tell the search module to normalize the casing
❹ Print the cleaned query
What you’ve written achieves the task of cleaning the query, but how do the messages feel to you? Does it feel like a lot? If I saw this code, I might say to myself, “I want the cleaned-up query, I don’t care how!” Going through the paces of calling each function is tedious, especially if you’re cleaning queries throughout your code.
Think about this in terms of the message or messages you’d like to send. A cleaner approach might be to send a single message: “Here is my query, clean it please!” What approach might you take to achieve this?
- A. Rewrite it as a single function that removes spaces, removes quotes, and normalizes casing
- B. Wrap the existing function calls in another function you can call anywhere
- C. Use a class to encapsulate the query cleaning logic
Any of these could work. Because separation of concerns is generally a good idea, A might not be the best choice because it combines several concerns into a single function. Wrapping the existing functions into another (B) keeps the concerns separate and provides a single point of entry for the cleaning behavior, which is good. Encapsulating that logic further into a class (C) could make sense later on if you need the cleaning logic to maintain information between steps.
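For a sense of what the class-based option might look like, here is a minimal sketch. QueryCleaner, its clean method, and the applied_steps attribute are all hypothetical names invented for this example. The class remembers which steps changed the most recent query, which is one example of information you might want to keep between steps.

```python
import re


class QueryCleaner:
    """Hypothetical class-based cleaner that records which steps changed the query."""

    def __init__(self):
        self.applied_steps = []

    def _remove_spaces(self, query):
        return re.sub(r'\s+', ' ', query.strip())

    def _remove_quotes(self, query):
        return query.replace('"', '')

    def _normalize(self, query):
        return query.casefold()

    def clean(self, query):
        # Start fresh so applied_steps describes only this query.
        self.applied_steps = []
        for step in (self._remove_spaces, self._remove_quotes, self._normalize):
            cleaned = step(query)
            if cleaned != query:
                self.applied_steps.append(step.__name__)
            query = cleaned
        return query
```

A consumer would create one QueryCleaner, call clean(query), and inspect applied_steps only if it cares why the query changed. If you don't need that state, a plain function is the simpler choice.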
Try refactoring the search module to make each transform function private, providing a clean_query(query) function that performs all the cleaning and returns the cleaned query. Come back here and check your work against listing 6.
Listing 6. Reduce coupling by minimizing the details shared between two areas of code
import re


def _remove_spaces(query):  # ❶
    query = query.strip()
    query = re.sub(r'\s+', ' ', query)
    return query


def _normalize(query):
    query = query.casefold()
    return query


def _remove_quotes(query):
    query = re.sub(r'"', '', query)
    return query


def clean_query(query):  # ❷
    query = _remove_spaces(query)
    query = _remove_quotes(query)
    query = _normalize(query)
    return query


if __name__ == '__main__':
    search_query = input('Enter your search query: ')
    search_query = clean_query(search_query)  # ❸
    print(f'Running a search for "{search_query}"')
❶ Transforms made private because they’re underlying details of cleaning
❷ A single point of entry receives the original query, cleans it, and returns it
❸ The consuming code needs to call only a single function now, reducing coupling
Now, when you think of the next way you need to clean your queries, you’ll be able to do the following (shown in figure 3):
- Create a function to perform the new transform on a query
- Call the new function inside clean_query
- Call it a day, confident that consumers are all cleaning queries properly
Figure 3. Using encapsulation and separation of concerns to maintain loose coupling
You should see that loose coupling, separation of concerns, and encapsulation all work together. The separation and encapsulation of behavior with a carefully thought out interface to the outside world helps achieve the loose coupling you desire.
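The three steps above can be sketched as follows. The transform _remove_trailing_punctuation and the consumer run_search are hypothetical names made up for this example; the point is that only the search module changes while the consumer stays exactly as it was.

```python
import re


def _remove_spaces(query):
    query = query.strip()
    return re.sub(r'\s+', ' ', query)


def _remove_quotes(query):
    return re.sub(r'"', '', query)


def _normalize(query):
    return query.casefold()


def _remove_trailing_punctuation(query):
    # Step 1: a hypothetical new transform, added for illustration.
    return query.rstrip('?!.')


def clean_query(query):
    query = _remove_spaces(query)
    query = _remove_quotes(query)
    # Step 2: call the new transform inside the single entry point.
    query = _remove_trailing_punctuation(query)
    query = _normalize(query)
    return query


def run_search(raw_query):
    # Step 3: a hypothetical consumer, untouched by the change above.
    return f'Running a search for "{clean_query(raw_query)}"'
```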
Recognizing coupling
You’ve seen an example of tight and loose coupling now, but coupling can take on a few specific forms in practice. Giving a name to these forms and understanding the signs helps you mitigate tight coupling early on, keeping you more productive in the long term.
Feature envy
In the early version of your query-cleaning code, the consumer needed to call several functions from the search module. When code performs several tasks using mainly features from another area, it is said to have feature envy. Your main procedure feels like it wants to be the search module, because it uses all of its features explicitly. This is also common in classes, as shown in figure 4.
Figure 4. Feature envy from one class to another
Feature envy can be solved the same way you fixed your query-cleaning logic: roll it up into a single entry point back at the source. In your example, you created a clean_query function in the search module. The search module is where query cleaning logic goes, and a clean_query function is perfectly at home there. Other code can continue on using clean_query, blissfully unaware of what happens underneath and trusting that it will receive a properly cleaned query in return. That code no longer has feature envy; it’s happy letting the search module be in charge of search-related things.
As you refactor to remove feature envy, it feels like you’re giving up a certain amount of control. Before refactoring you see exactly how the information flows through the code, but afterward that flow is often hidden under a layer of abstraction. This requires putting a certain amount of trust in the code you interact with to do what it says. It feels uncomfortable occasionally, but a thorough test suite can help you remain confident in the functionality.
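As a rough idea of what such tests could look like, here is a sketch using the standard library's unittest module. The test class and method names are made up, and clean_query is repeated in compact form so the example stands alone. The expected values come from the annotations on listing 3 and the quote-removal change in listing 4.

```python
import re
import unittest


def clean_query(query):
    # Compact stand-in for the search module's entry point.
    query = re.sub(r'\s+', ' ', query.strip())
    query = query.replace('"', '')
    return query.casefold()


class CleanQueryTests(unittest.TestCase):
    def test_collapses_extra_whitespace(self):
        self.assertEqual(clean_query('  George   Washington '), 'george washington')

    def test_removes_quotes(self):
        self.assertEqual(clean_query('"George Washington"'), 'george washington')

    def test_casefolds_non_ascii_text(self):
        self.assertEqual(clean_query('Universitätsstraße'), 'universitätsstrasse')
```

Saved in a test module, these run with python -m unittest. Because they exercise only clean_query, you can rearrange or add private transforms without rewriting them.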
Shotgun surgery
Shotgun surgery is often what happens as a result of tight coupling. You make one change to a class or module, and that change has to ripple far and wide for other code to keep working. Peppering changes throughout your code each time you need to update behavior is tiresome!
By addressing feature envy, separating concerns, and practicing good encapsulation and abstraction, you’ll minimize the amount of shotgun surgery you need to do. Anytime you find yourself jumping around to different functions, methods, or modules to realize the change you’re trying to make, ask yourself if you’re experiencing tight coupling between those areas of code. Then see what opportunities there are to move a method to a better-suited class, a function to a better-suited module, and so on. A place for everything, and everything in its place.
Leaky abstractions
The goal of abstraction, as you’ve learned, is to hide the details of a particular task from the consumer. The consumer triggers the behavior and receives the result, but doesn’t care about what happens under the hood. If you start to notice feature envy, it might be because of a leaky abstraction.
A leaky abstraction is one that doesn’t sufficiently hide its details. The abstraction claims to provide a simple way to get something done, but ultimately requires you to have some knowledge about what lies beneath when using it. This sometimes manifests as feature envy but can also be subtle, as you’ll see in a moment.
Picture a Python package for making HTTP requests (requests, maybe). If your goal is purely to make a GET request to some URL and get the response back, you’d be best served by an abstraction on the GET behavior. requests.get('https://www.google.com') is one example.
This abstraction works well most of the time, but what happens when you lose your internet connection, or when Google is unavailable? When things are “weird” for a moment and your GET request doesn’t make it anywhere? In these cases, requests generally raises an exception indicating the problem (figure 5). This is useful for error handling, but requires you to know a bit about the possible errors to know which are likely to occur and how to handle them. Once you start handling errors from requests in many places, you’re coupled to it because your code expects a certain set of possible outcomes, which are specific to the requests package.
Figure 5. Abstractions occasionally leak the details they’re trying to hide
Leaks happen because there’s a trade-off to consider with abstraction—generally speaking, the further you abstract a concept in code, the less customization you can provide. This is because abstraction is inherently meant to remove access to detail; the fewer details you can access, the fewer ways you have to change the details. As developers we often want to tweak things to better suit our needs, though, and we provide lower-level access to the details we tried to hide.
When you find yourself providing access to a low-level detail from a high-level layer of abstraction, you’re likely introducing coupling. Remember that loose coupling relies on interfaces — shared abstractions — rather than specific low-level details.
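Returning to the HTTP example, one way to contain that kind of leak is to catch the library's errors in a single place and raise an error of your own. The sketch below uses the standard library's urllib rather than requests so that it runs without third-party packages; the same shape should apply to requests, though the exception classes would differ. PageUnavailable, fetch_page, and the opener parameter are hypothetical names invented for this example.

```python
import urllib.error
import urllib.request


class PageUnavailable(Exception):
    """Hypothetical application-level error; not part of urllib or requests."""


def fetch_page(url, opener=urllib.request.urlopen):
    """Return the body of url as text, or raise PageUnavailable.

    The opener parameter is an assumption of this sketch; it exists so the
    network call can be swapped out in tests.
    """
    try:
        with opener(url, timeout=5) as response:
            return response.read().decode('utf-8', errors='replace')
    except OSError as error:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        raise PageUnavailable(f'Could not fetch {url}') from error
```

Callers now handle one error they own instead of a set of outcomes specific to the HTTP library. The trade-off described above still applies: a caller that needs to tell a timeout from a 404 would have to look at the chained cause, which lets some of the detail leak back out.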