From Seriously Good Software by Marco Faella

This article delves into how thinking about what you want your code to do before you write it can lead to more reliable code.
Take 37% off Seriously Good Software by entering fccfaella into the discount code box at checkout at manning.com.
Software reliability refers to the extent to which the system performs as expected, in a variety of operating conditions. In this article, we’re going to explore the main coding techniques you can use to prevent or expose unexpected program behaviors. But first, let’s discuss how we can define the expected behavior of a piece of software, aka its specification. I’m going to focus on the behavior of a single class. A popular way to organize specifications of OO programs and classes therein is through the design-by-contract methodology.
Design by contract
In ordinary language, a contract is an agreement where each party accepts obligations in exchange for some benefits. In fact, an obligation for one party is the benefit of another. For example, a phone plan is a contract between a carrier and the phone owner. The carrier is obliged to render the phone service and the owner is obliged to pay for it, such that each party benefits from the other party’s obligations.
The design-by-contract methodology suggests attaching contracts to software artifacts, particularly individual methods. A method contract comprises a pre-condition, a post-condition, and possibly a penalty.
Pre- and post-conditions
The pre-condition states the requirements for that method to correctly function. It talks about the legal values for the parameters and about the current state of this object (for an instance method). For example, the pre-condition of a square root method might state that its argument should be non-negative.
It’s the caller’s responsibility to respect the pre-condition of the method being called. In the analogy with an ordinary contract, the pre-condition is an obligation for the caller and a benefit for the callee. The method itself can either passively assume that the pre-condition holds, or actively check whether it holds, and react accordingly.
The pre-condition should include only properties that are under the full control of the caller. For example, a method that takes a file name as an argument and opens that file can't list among its pre-conditions that the file exists, because the caller can't be one hundred percent sure that it does (another process can erase that file at any time). The method can still throw an exception in that case, but that exception should be of the checked variety, forcing the caller to handle it.
Conversely, the post-condition states the effect of the method, and describes its return value and all the changes performed on the state of any object. In most well-designed classes, changes should be limited to the current object, but this isn't always the case. For example, the connectTo method in our running example must modify multiple containers to achieve its intended effect.
The contract should also specify what happens when the caller violates a pre-condition: this is referred to as the penalty. In Java, the typical penalty consists of throwing an unchecked exception. Figure 1 is a graphical depiction of the contract for an instance method.
Figure 1. High-level structure of a contract for an instance method. All consequences of a method besides its return value are called side effects.
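To make these three parts concrete, here's a minimal sketch of how the contract of the earlier square-root example might be written down and enforced in Java. The intSqrt method and its exact wording are illustrative, not taken from the book:

/**
 * Returns the integer square root of the given number.
 *
 * Pre-condition:  n is non-negative.
 * Post-condition: the returned value r satisfies r * r <= n < (r + 1) * (r + 1).
 * Penalty:        throws IllegalArgumentException if n is negative.
 */
public static int intSqrt(int n) {
    if (n < 0) {                                   // enforce the pre-condition
        throw new IllegalArgumentException("n must be non-negative: " + n);
    }
    int r = (int) Math.sqrt(n);                    // candidate root
    // guard against floating-point rounding at the boundary
    while ((long) (r + 1) * (r + 1) <= n) r++;
    while ((long) r * r > n) r--;
    return r;
}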
As another example, here's the contract for the next method of the java.util.Iterator interface:
- Pre-condition: This iterator hasn't reached the end. Equivalently, a call to hasNext returns true.
- Post-condition: It returns the next item in the iteration and advances the iterator by one position.
- Penalty: If the pre-condition is violated, the method throws NoSuchElementException (an unchecked exception).
Calling next when the iterator has reached the end violates the pre-condition and is an error on the side of the client (the error lies outside the next method). Conversely, an implementation of next that doesn't advance the iterator to the next item violates the post-condition. In this case, the error lies inside the method itself.
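As a sketch of what honoring this contract looks like on the implementation side, consider a bare-bones, hypothetical array-backed iterator (not from the book). The if check issues the penalty when the caller violates the pre-condition, and the last line fulfills the post-condition:

import java.util.Iterator;
import java.util.NoSuchElementException;

/** A minimal iterator over an array, enforcing the contract of next(). */
public class ArrayIterator<T> implements Iterator<T> {
    private final T[] items;
    private int position = 0;   // index of the next item to return

    public ArrayIterator(T[] items) {
        this.items = items;
    }

    @Override
    public boolean hasNext() {
        return position < items.length;
    }

    @Override
    public T next() {
        if (!hasNext()) {                       // pre-condition violated by the caller
            throw new NoSuchElementException(); // penalty: an unchecked exception
        }
        return items[position++];               // post-condition: return next item and advance
    }
}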
Figure 2 depicts the detailed data dependencies involving various parts of the contract. The pre-condition dictates the legal values for the arguments and for the state of this object before the call. This is why there are two arrows coming into the “pre-condition” box. For example, the pre-condition of Iterator::next mentions only the state of the iterator, because that method takes no arguments.
Figure 2. Detailed structure of a contract for an instance method. Solid arrows represent data dependencies pertaining to the contract (what the contract talks about). Dashed arrows are standard runtime interactions that occur regardless of the contract (what happens at runtime).
Because the post-condition describes all the changes brought about by the method, it may refer to the following data:
- The return value (as the main effect of the method)
- Both the old and the new state of this object; the old state as an input that can influence the behavior of the method, and the new state as another effect of the method
- The value of the arguments, as inputs
- Other side effects produced using globally available objects or static methods, such as a call to System.out.println
Figure 2 omits the last case, and depicts the others as incoming arrows into the “post-condition” box. For example, the post-condition of Iterator::next refers explicitly to the return value and implicitly to both the old and the new state of the iterator, when it says that “it returns the next item and advances the iterator by one position”.
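A post-condition that relates the old and the new state can be checked by capturing the relevant part of the old state before performing the change. The following hypothetical class (not from the book) illustrates the pattern, using the assert statement discussed later in this article:

import java.util.ArrayList;
import java.util.List;

/** A hypothetical shopping cart whose post-condition relates old and new state. */
public class Cart {
    private final List<String> items = new ArrayList<>();

    /** Post-condition: the cart contains exactly one more item than before the call. */
    public void addItem(String item) {
        int oldSize = items.size();            // old state, captured before the change
        items.add(item);                       // the method's effect on the new state
        assert items.size() == oldSize + 1     // post-condition compares old and new state
            : "addItem must grow the cart by exactly one";
    }
}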
Invariants
Besides method contracts, classes can have associated invariants. An invariant is a condition, expressed on the class fields, which is always true, except when an object is undergoing change due to a method of the class.
Invariants are static consistency rules: they refer to the state of the objects in a single instant in time. By contrast, post-conditions are dynamic consistency rules, as they compare the state of the object(s) before and after a method call.
As the name implies, invariants must hold both before and after a method is called. Accordingly, in figure 2 invariants have incoming arrows from the old and the new state of the object.
The initial state of each object, as established by a constructor, must satisfy the invariants, and all public methods are responsible for preserving their validity. Private methods don’t have this obligation because their role consists in supporting public methods. When a private instance method is run, it’s typically within the context of some ongoing public instance method which invoked it, either directly or indirectly. Because the current object may be undergoing change, the private method may find it in an intermediate state which violates an invariant, and may also leave it in an inconsistent state. It’s only at the end of the public method that the state of the object must be consistent again, and invariants must be restored.
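The following hypothetical class sketches this discipline: the constructor establishes the invariant, the public method restores it before returning, and a private helper is allowed to run while the object is in an intermediate state. It uses Java assertions, which are covered in the next sections:

/** A hypothetical time-of-day class with the invariant 0 <= minutes < 24 * 60. */
public class TimeOfDay {
    private int minutes;   // minutes since midnight

    /** Pre-condition: 0 <= hour < 24 and 0 <= minute < 60. */
    public TimeOfDay(int hour, int minute) {
        this.minutes = hour * 60 + minute;
        assert invariantHolds();              // the constructor must establish the invariant
    }

    /** Advances the time, wrapping around at midnight. */
    public void advance(int deltaMinutes) {
        minutes += deltaMinutes;              // may temporarily break the invariant...
        normalize();                          // ...a private helper restores consistency
        assert invariantHolds();              // the public method restores the invariant on exit
    }

    // Private helper: may run while the object is in an intermediate, inconsistent state.
    private void normalize() {
        minutes = Math.floorMod(minutes, 24 * 60);
    }

    private boolean invariantHolds() {
        return minutes >= 0 && minutes < 24 * 60;
    }
}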
Correctness and robustness
Software reliability can be refined into two separate qualities: correctness and robustness. The difference between them lies in the type of environment you assume for your system. When evaluating correctness, you imagine your system in a nominal environment, which is an environment satisfying the system's expectations. In such a friendly context, method pre-conditions are honored, external inputs arrive in a timely manner and in the right format, and all resources needed by the system are available. If the system is correct, it behaves according to plan in all friendly environments.
In principle, correctness is a boolean property: either it holds or it doesn't. Partial correctness doesn't make a lot of sense. It's usually impractical to devise perfectly formal and complete specifications, and as soon as specifications become blurry, the same is true for correctness. In the small, controlled world of the container example, we're going to put forward clear specifications and make sure that the class is correct with respect to them. Then, we're going to explore techniques that maximize our confidence in its correctness. Those techniques are useful in real-world scenarios where you don't have months to spend on a single class.
On the other hand, robustness refers to the system behavior under exceptional or unanticipated environments. Typical cases include the host machine running out of memory or disk space, external inputs being in the wrong format or outside the legal range, methods being called in breach of their pre-conditions, and so on. A robust system is expected to react gracefully in such situations, where the appropriate definition of “grace” is highly dependent on the context.
For example, if a crucial resource is unavailable, the program might wait a little and request it again, a couple of times, before giving up and terminating. If the problem persists, and in general every time termination is the only way out, the program should clearly inform the user about the nature of the problem, strive to minimize data loss, and ensure that the user can later resume their task as smoothly as possible.
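As a minimal sketch of this kind of graceful behavior, assuming the crucial resource is a file, a method might retry a few times with increasing delays and, if it still fails, surface a clear error that preserves the root cause:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class RetryExample {
    /** Tries to open a file a few times before giving up with a clear error. */
    public static InputStream openWithRetry(Path path) throws IOException {
        final int maxAttempts = 3;
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Files.newInputStream(path);      // the crucial resource
            } catch (IOException e) {
                lastFailure = e;                         // keep the cause for the final report
                try {
                    Thread.sleep(1000L * attempt);       // wait a little, longer each time
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();  // preserve the interrupt status
                    break;
                }
            }
        }
        // Give up gracefully: report the nature of the problem and preserve the root cause.
        throw new IOException("Could not open " + path + " after " + maxAttempts + " attempts", lastFailure);
    }
}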
Figure 3 summarizes the relationships between the two software qualities comprising reliability, the various types of specifications discussed earlier, and the three coding techniques you’re going to use in this article.
Figure 3. Relationship between reliability attributes, contract-based specifications, and coding techniques.
Correctness is defined with respect to a contract, comprising pre- and post-conditions, and optionally to a set of class invariants. The penalty isn’t directly related to correctness, because it only springs into action when the caller violates a pre-condition. As such, it’s a robustness issue.
Three coding techniques help implement and enforce contracts:
- Plain if-based checks make sure that the caller is invoking a method in the proper way, obeying its pre-conditions, and issue the corresponding penalty otherwise.
- Java assert statements are useful to keep post-conditions and invariants in check, particularly in safety-critical software.
- Finally, tests increase your confidence in the reliability of your software, mostly by checking post-conditions and triggering penalties.

Let's delve into some of the best practices regarding each of these techniques. For the moment, notice that the first two are monitoring techniques, active during regular operation of your software. Testing is performed before operation and separately from it.
Checking contracts
Many programming errors have to do with violating method pre-conditions. To expose these problems as soon as possible, methods should check their pre-conditions at runtime and throw a suitable exception if they aren’t met. This is sometimes called defensive programming. Two standard exception classes are commonly used for this purpose:
- IllegalArgumentException: the value of an argument violates the pre-condition.
- IllegalStateException: the current state of this object is incompatible with the instance method being called or with the value of the arguments. For example, attempting to read from a file that has already been closed might throw this exception.

A related but more specific checking mechanism is represented by assertions. An assertion is a statement of the following form:
assert condition : "Error message!";
When executed, the line evaluates the boolean condition and throws an AssertionError if the condition is false. The error message string is passed to the exception being thrown and is printed if the exception isn't caught. The assertion is quite similar to the following statement:
if (!condition) { throw new AssertionError("Error message!"); }
At this point, an assertion looks like a shorter version (aka syntactic sugar) of a regular if-based check. One crucial feature distinguishes the two: by default, assertions aren't executed by the JVM. They have to be explicitly activated with the “-ea” command-line option, or via the corresponding IDE setting. When assertions are turned off, the program doesn't incur the performance overhead due to evaluating the corresponding boolean conditions.
A standard if-based check is always executed; if that check is instead performed by an assertion, you can turn it on or off at each execution. The usual practice is to turn on assertions during development and then revert to the default “off” state for production. It seems that assertions win all the way: they're more succinct and you have more control over them. Should you use them for all runtime error checking? It turns out there are cases when the flexibility that comes with assertions becomes a liability. In those cases, you want some checks to stay in place at all times, even during production.
Design by contract provides simple guidelines for identifying which checks should always be on:
- Pre-condition checks on public methods should always be on. Use regular if-based checks for them.
- All other checks should be on only during development. These include post-condition and invariant checks, and pre-condition checks on non-public methods. Use assertions there (a sketch combining both kinds of checks follows this list).
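Here's how the two guidelines might play out in a small, hypothetical class: public methods guard their pre-conditions with plain if checks that are always on, while invariants and the pre-conditions of private methods are checked with assertions that can be switched off in production:

/** A hypothetical bounded stack illustrating the two kinds of checks. */
public class BoundedStack {
    private final int[] data;
    private int size;   // invariant: 0 <= size <= data.length

    public BoundedStack(int capacity) {
        if (capacity <= 0) {                            // public pre-condition: always checked
            throw new IllegalArgumentException("capacity must be positive: " + capacity);
        }
        this.data = new int[capacity];
        assert invariantHolds();                        // invariant: assertion, on during development
    }

    public void push(int value) {
        if (size == data.length) {                      // public pre-condition: always checked
            throw new IllegalStateException("stack is full");
        }
        insertAt(size, value);
        assert invariantHolds();                        // invariant restored before returning
    }

    public int pop() {
        if (size == 0) {                                // public pre-condition: always checked
            throw new IllegalStateException("stack is empty");
        }
        int result = data[--size];
        assert invariantHolds();
        return result;
    }

    // Non-public method: its pre-condition is checked with an assertion only.
    private void insertAt(int index, int value) {
        assert index >= 0 && index < data.length;       // pre-condition of a private method
        data[index] = value;
        size++;
    }

    private boolean invariantHolds() {
        return size >= 0 && size <= data.length;
    }
}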
The rationale is the following. Pre-condition violations are due to the caller not respecting the method contract. By contrast, a post-condition or invariant violation is due to an issue within the class itself. Now, consider the following key assumption:
Development and testing ensure that each single class is free from internal issues.
By an “internal issue,” I mean a bug that manifests itself even if the class clients respect all the rules put forward by the contracts. For the moment, take this assumption at face value; we'll discuss its plausibility in a moment. If the previous assumption holds, the only way the program can misbehave is by one class misusing another class. In a properly encapsulated system, this can happen only via public methods. Hence, to expose these bugs, pre-condition checks on public methods are sufficient and should be left on. Notice that checking pre-conditions at runtime doesn't fix the problem; it merely exposes it as soon as possible during the execution, allowing the root cause to be more accurately characterized.
How reasonable is the no-internal-issue assumption? That ultimately depends on the quality and intensity of the development process. The higher the quality and intensity, the more likely the assumption is to hold. By quality of the development process, I mean whether the industry's best practices are followed. Intensity (or effort) refers to the number of people and the amount of time spent developing and testing each class. Realistically, only small classes can be expected to be entirely free from internal issues.
Writing small classes is one of the sacred best practices in OOP for good reason.
The broader picture
Figure 4 puts the techniques presented in this article into a wider perspective. It focuses on programming styles and techniques that even a single programmer can employ in their daily activities. Beyond that, at least two more types of intervention can contribute to software quality in general, and to reliability in particular.
First, there’s human oversight: having a fellow developer look at your code and evaluate it according to company standards. This can be arranged in periodic reviews or sometimes as a continuous interaction between two colleagues, a practice known as pair programming.
Then, there are software tools that can automatically check a variety of code properties, enriching the scrutiny already performed by the compiler. Such tools can be roughly divided into three categories, from the most basic to the most advanced:
Figure 4. A broad view on quality-enhancing techniques.
Style checkers: These tools only perform relatively superficial checks targeting readability and uniformity. In turn, those qualities indirectly benefit reliability and maintainability too.
Example feature: check that the indentation is correct and uniform (same number of extra spaces for each nesting level). Example tool: CheckStyle.[1]
Static analyzers: These tools are capable of performing a semantic analysis similar to the compiler’s type-checking phase. Style checkers and static analyzers are also known as linters.
Example feature: check whether an anonymous class contains an uncallable method (a method that doesn't override another method and isn't used by other methods of that class). Example tools: SpotBugs[2] and SonarQube.[3]
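For instance, a hypothetical snippet like the following would be flagged by such a check:

public class UncallableExample {
    Runnable task = new Runnable() {
        @Override
        public void run() {
            System.out.println("working");
        }

        // Dead code: this method overrides nothing and is never called from run(),
        // so a static analyzer can flag it as uncallable.
        void cleanup() {
            System.out.println("cleaning up");
        }
    };
}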
Formal verifiers: These tools, mostly born out of academic research, understand a program at a deeper level than the typical compiler. They can simulate the execution of the program on entire sets of values, a process known as symbolic execution.
Example feature: check whether an integer variable can ever become negative.[4]
Example tool: KeY.[5]
It’s usually up to your organization to choose the set of quality practices and tools deemed appropriate to the task at hand. What’s suitable for developing a videogame is vastly different from what a military or a healthcare client demands.
That’s all for now.
If you want to learn more about the book, check it out on our browser-based liveBook reader here and see this slide deck.
[1] As of this writing, available at https://checkstyle.sourceforge.io.
[2] As of this writing, available at https://spotbugs.github.io.
[3] As of this writing, available at https://www.sonarqube.org.
[4] Technically, this property is undecidable. A formal verifier will attempt to prove or disprove it, but it’s not guaranteed to succeed.
[5] As of this writing, available at https://www.key-project.org.