Hybrid Verification: Mixing Formal Methods and Testing

The ability to verify contracts either statically or dynamically, coupled with recent advances in proof technology, has opened up a new and promising approach to verification. Critical code can be proved with formal methods, and less critical code can be verified using traditional testing, with a clear separation at the interfaces between the two.

Testing has long been the standard method for verifying that software meets its requirements, but it suffers from the well-known weakness that it can never be complete, at least for any realistic program. As famously stated by the late computer scientist Edsger Dijkstra, “Program testing can be used to show the presence of bugs but never to show their absence.”

Demonstrating the absence of bugs requires formal methods: a mathematics-based analysis that can prove specific program properties statically, for all possible inputs. Unfortunately, issues such as scalability and overly restricted language subsets have historically rendered formal methods impractical for real-world projects. But thanks to progress in language design and advances in proof technology and computer hardware, formal methods have moved out of the research labs.

One of the fruits of these efforts is a novel and practical approach known as hybrid verification, which combines testing and formal methods to take advantage of the strengths of each. Key to this approach is the use of “contracts,” or pre- and postconditions, to express a code module’s assumptions and obligations, and the ability to verify the contracts either statically (using formal methods) or dynamically (with testing).

The SPARK language, a formally analyzable subset of Ada 2012, illustrates how hybrid verification can be used to help bring increased assurance to critical software.

Contract-Based Programming

In business, a contract is an agreement between two parties that specifies what each one expects from, and agrees to provide to, the other. Computer programs also have contracts; in this context, the two parties are a subprogram and its invokers. (The terminology depends on the programming language; a subprogram is also known as a procedure, function, method, or subroutine.) Contracts take the form of subprogram preconditions and postconditions.

A precondition is a property (a Boolean expression) that the subprogram requires at the point of call, such as constraints on its parameters or global variables. A postcondition is a property that the subprogram will guarantee when it returns—again, a Boolean expression, but one that can reference values of parameters and globals, both on entry to and return from an invocation. In the other direction, at the point where the subprogram is invoked, it is the responsibility of the invoking context to guarantee that the precondition is met, and after the return, the subprogram’s postcondition can be assumed.

Figure 1 shows a skeletal example, which is valid in Ada and in SPARK (assuming the implementation conforms with the SPARK subset):

Contracts in Ada and SPARK

Append’s precondition is an assertion that the Table parameter is not full; in order for Append to work, this condition must be met. The logic at the point of invocation—statement (1)—should thus ensure that the precondition is met before the procedure is called.

Append’s postcondition is an assertion that the length of the Table on return is one more than the length at the point of call. The logic of the implementation of Append should ensure that this postcondition is met, and following the return, the postcondition can be assumed by the caller.

The pre- and postconditions for Truncate are analogous.

Pre- and postcondition contracts are part of a subprogram's specification, or its interface to its callers. Other contracts take the form of assertions that are part of a subprogram's implementation, such as the state of the program's variables at a particular point of execution.

Contract-based programming raises two critical questions:

  • When are the contracts checked (i.e., dynamically versus statically)?
  • What happens if a contract fails at runtime or cannot be verified statically?

Under Ada semantics, each contract is checked dynamically, and if it fails, an exception is raised. This way, traditional testing methods can be used for verification.

Under SPARK semantics, each contract is checked statically. This means that at each point of call, the proof tools will attempt to show that the called subprogram’s precondition is met. The proof tools will also analyze the subprogram code and attempt to prove that, if the precondition is met and the subprogram terminates, then the postcondition is met at each point of return.

If the analysis is unable to prove a contract, this could be due to several causes:

  • The code or the contract is incorrect
  • The code and the contract are correct, but additional provable contracts need to be added to assist the proof tools
  • The code and the contract are correct, but proving the contract is beyond the capabilities of the proof tools

Review and inspection or testing should be able to detect errors in the code or the contracts. The other situations may require additional contracts to be inserted (for example, loop invariants) or interactions with the proof tools to pinpoint why the contract cannot be proved. In extreme cases where the developer knows that a contract is satisfied but it is not provable, two approaches are available:

  • An assertion (pragma Assume) can be supplied that is strong enough to allow the contract to be proved. Such assertions need to be verified through other means, such as testing
  • The context of the contract can be marked as excluded from proof analysis. In this case, verification would again need to be performed through testing

Combining Testing and Formal Methods

The ability to mix and match SPARK code with full Ada and with code from other languages, such as C, allows the developer to use formal methods for the most critical sections and traditional testing for modules that are either at a lower level of criticality or are not amenable to formal analysis.

This can work in both directions: Formally verified code can call a subprogram that is verified through testing, and tested code can call subprograms that have been formally verified with proof of properties such as data flow security, absence of runtime errors, key integrity constraints, and full functional correctness. The example in figure 1 illustrates both possibilities.

Calling tested units from formally verified code

Suppose that the contracts for the subprograms in Table_Pkg are verified by testing rather than by proof. They can still be used in the formal verification of code that invokes the subprograms. Formal analysis of Table_Client will attempt to show that the precondition for Append is met at (1) and will assume that the postcondition for Append is satisfied on return. Analogously, it will attempt to show that the precondition for Truncate is met at (3) and will assume that the postcondition for Truncate is satisfied on return.

The assumptions about the postconditions can be exploited in the proof. For example, if the code in (2) does not reduce the length of the table, then the precondition for Truncate at (3) is automatically satisfied.

Verifying the Table_Pkg subprograms by testing means enabling runtime checks of the contracts. These checks can either be retained in the final executable (the precondition checks will be redundant if the calls are from formally verified code) or suppressed if necessary for performance once there is sufficient confidence from the breadth of the testing that the checks will not fail.

Calling formally verified units from tested code

In the other direction, suppose that verification is achieved through formal proofs for the Table subprograms but through testing for procedure Table_Client. The Table_Pkg package can be compiled with runtime checks enabled for the preconditions, but not for the postconditions, because these would be redundant with the formal analysis. After the testing of Table_Client provides sufficient confidence that the precondition checks are not needed, they can be suppressed if necessary for improved performance.

A side effect of the formal verification of the postconditions is the potential reduction in the testing effort for Table_Client. If code inspection or other analysis shows that that My_Table could not have its length reduced by any code in (2), then the precondition for Truncate at (3) is known to be satisfied. There is no need to try to formulate robustness tests that violate the precondition.

Postcondition Issues

Although it is usually straightforward to specify a subprogram’s precondition, the choice of a postcondition is not so simple. In the Append example, the postcondition reflects the requirement that after an element is added to the table, the table’s length has increased by 1. But both weaker and stronger postconditions can be supplied. One possible weaker postcondition is that the final value of the table is not empty. One possible stronger postcondition is the complete functional requirement that the procedure is supposed to meet: The final value of the table consists of all the elements of the original table, in the same order, plus exactly one new element (the Item parameter) at the end.

The choice of weaker versus stronger postconditions depends on several factors. If verification is to be done dynamically, then performance needs to be taken into account; a strong and complex postcondition (especially one involving quantification) may introduce too much overhead. In such cases, a stronger postcondition might be used during testing and then replaced by a weaker postcondition (or kept in the source code as documentation but compiled with assertion checks disabled) for production. If verification is to be done statically, then the assurance objectives for the code (which properties need to be demonstrated) and the capabilities of the proof technology need to be taken into account.

A strong and complex postcondition might not be necessary if the intent is to demonstrate absence of runtime errors rather than full functional correctness. Furthermore, proving such a postcondition might be beyond the reach of the proof tools. In such cases, a weaker postcondition should be used.

A Modern Approach to Secure Software

Although contract-based programming is not a new concept—it has been supported for decades in languages such as Eiffel—the ability to verify contracts either statically or dynamically, coupled with recent advances in proof technology, has opened up a new and promising approach to verification. Critical code can be proved with formal methods, and less critical code can be verified using traditional testing, with a clear separation at the interfaces between the two.

And as illustrated by SPARK and Ada, the same code can be subject to both approaches, offering a practical migration path from testing to proof for components that grow in criticality over their lifecycle. Such hybrid verification marks a major advance in the industry’s ongoing effort to produce reliable, safe, and secure software. 

User Comments

1 comment
Jon Hagar's picture

A nice article, but really beyond many software projects and test people.  For me the question comes under what coding situations should formal verification be "expected".  What levels of risk and integrity (see for example IEEE 1012 V&V) imply formal methods of some kind.  The industry really does not have standards on such things nor in many cases the skilled staff to implement. 

December 7, 2017 - 11:10am

About the author

StickyMinds is a TechWell community.

Through conferences, training, consulting, and online resources, TechWell helps you develop and deliver great software every day.