Computing Information Flow Using Symbolic Model-Checking

Several measures have been proposed in the literature for quantifying the information leaked by the public outputs of a program with secret inputs. We consider the problem of computing the information leaked by a deterministic or probabilistic program when the measure of information is based on (a) min-entropy and (b) Shannon entropy. The key challenge in computing these measures is that we need the total number of possible outputs and, for each possible output, the number of inputs that lead to it. A direct computation of these quantities is infeasible because of the state-explosion problem. We therefore propose symbolic algorithms based on binary decision diagrams (BDDs). The advantage of our approach is that these symbolic algorithms can be easily implemented in any BDD-based model-checking tool that checks for reachability in deterministic non-recursive programs by computing program summaries. We demonstrate the validity of our approach by implementing these algorithms in a tool Moped-QLeak, which is built upon Moped, a model checker for Boolean programs. Finally, we show how this symbolic approach extends to probabilistic programs.


Introduction
It is desirable for a program to never leak any information about its confidential inputs. For example, when an adversary can make low-security observations of an execution, these should be independent of the confidential inputs. This property is often too strong in practice, mostly because it clashes with desired functionality. Therefore, many authors (cf. [22,25]) have proposed to evaluate security by the amount of leaked confidential information. This raises foundational questions of (a) how to measure that amount and (b) how to compute it. These challenges have received much attention recently.
A usual approach is to employ information-theoretic tools. In this approach, a program is modeled as an information channel that transforms a random variable taking values from the set of confidential inputs into a random variable taking values from the set of public outputs (i.e., the adversary's observations). Based on this, one quantifies the adversary's uncertainty about the confidential inputs. The amount of information leaked by the program is then modeled as the difference between the initial uncertainty and the uncertainty remaining in the secret inputs after the adversary observes the execution. Commonly used measures of uncertainty are Shannon entropy [22] and min-entropy [25]. Intuitively, leakage based on min-entropy measures the vulnerability of the secret inputs to a single guess by an adversary who observes the program execution, while leakage based on Shannon entropy measures the expected number of guesses required for the adversary to guess the secret input having observed the program execution. We refer to [25] for a detailed comparison of these two measures.
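As a concrete toy illustration of this channel view, the following Python sketch computes min-entropy leakage as the log-ratio of the posterior vulnerability (after observing the output) to the prior vulnerability. The function `min_entropy_leakage`, the uniform prior, and the parity channel are our own illustrative choices, not artifacts from the paper.

```python
import math

def min_entropy_leakage(prior, channel):
    """Min-entropy leakage = log2(posterior vulnerability / prior vulnerability).

    prior:   dict secret -> probability
    channel: dict secret -> dict output -> Pr(output | secret)
    """
    v_prior = max(prior.values())                 # best single guess a priori
    outputs = {o for row in channel.values() for o in row}
    # For each output, the adversary guesses the most likely secret.
    v_post = sum(max(prior[s] * channel[s].get(o, 0.0) for s in prior)
                 for o in outputs)
    return math.log2(v_post / v_prior)

# Deterministic channel revealing the parity of a 2-bit secret, uniform prior.
prior = {s: 0.25 for s in range(4)}
channel = {s: {s % 2: 1.0} for s in range(4)}
print(min_entropy_leakage(prior, channel))  # 1.0 (one bit leaked)
```

Observing the parity halves the search space, so exactly one bit of min-entropy is leaked.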
Though appealing from a conceptual viewpoint, these measures do not readily lend themselves to feasible computation. For example, it has been shown in [27] that when using Shannon entropy for measuring uncertainty, the problem of deciding whether the information leaked by a loop-free deterministic Boolean program is less than a given rational number is harder than counting the number of satisfying assignments of a Boolean formula in Conjunctive Normal Form. The hardness of the problem comes from the fact that one has to compute (a) how many outputs are observable to the adversary and (b) for each possible output, how many inputs lead to that particular output.

Contributions.
We first consider the problem of evaluating the amount of information leaked by the public outputs of Boolean deterministic programs with uniformly distributed secret inputs. We exploit symbolic model-checking techniques to achieve our goals. More precisely, we demonstrate how model checkers based on Binary Decision Diagrams (BDDs) can very easily be enhanced to compute information leakage. As we shall see shortly, our approach is informed by the model-checking algorithms used by these tools.
BDDs [19,1,5] are data structures used to store Boolean functions. Their efficiency has led to many applications in program verification. Broadly, in this approach, the program is viewed as a transition system in which a configuration contains the current line number and the values of the variables. Transitions are encoded as BDDs, and reachability is encoded as the least fixed-point solution to a set of Boolean equations. This solution is the result of a fixed-point iteration with efficient BDD operations (see [5] for a discussion of the complexity of BDD operations). For certain BDD-based tools (cf. [13]), this fixed-point computation yields the relation between the values of the global variables at the start of the program and their values when the queried location is reached. By querying the exit point, we can thus compute the relation between the inputs and the outputs of the program, henceforth referred to as the summary of the program.
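The fixed-point computation of a summary can be sketched explicitly, where a BDD-based tool would perform the same iteration symbolically on the transition relation. The helper `program_summary`, the `step` function, and the one-variable toy program below are hypothetical illustrations, not the tool's actual interface.

```python
def program_summary(init_states, step, exit_line):
    """Least fixed point of forward reachability: relate each entry valuation
    to the valuations reachable at exit_line (explicit-state sketch of what a
    BDD-based model checker computes symbolically)."""
    # Configurations are (entry_valuation, current_line, current_valuation).
    reached = {(s, 0, s) for s in init_states}
    frontier = set(reached)
    while frontier:                       # iterate until no new configurations
        new = set()
        for entry, line, val in frontier:
            for line2, val2 in step(line, val):
                cfg = (entry, line2, val2)
                if cfg not in reached:
                    reached.add(cfg)
                    new.add(cfg)
        frontier = new
    # Project onto the exit point: the input/output relation (the summary).
    return {(entry, val) for entry, line, val in reached if line == exit_line}

# Toy program over one Boolean x:   line 0: x := not x;   line 1: exit
def step(line, x):
    return [(1, not x)] if line == 0 else []

summary = program_summary({True, False}, step, 1)
print(sorted(summary))  # [(False, True), (True, False)]
```

The resulting set of pairs is exactly the relation a BDD-based tool would represent as an OBDD over current and primed variables.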
Our key observation is that this summary (which is given as a BDD) is indeed all the information we need to quantify information leakage. We give symbolic algorithms that extract information leakage from the summary according to either Shannon entropy or min-entropy. This approach is appealing because these algorithms can easily be plugged into existing BDD-based model-checking tools.
We validate this approach by implementing our algorithms in Moped [13], a BDD-based symbolic model checker that checks for assertion errors in programs by modeling them as reachability problems. Apart from providing support for Boolean data, Moped also supports integers of variable length, arrays, and C-like structures. Our experience with these implementations is promising, as the computation of information leakage (for both min-entropy and Shannon entropy) comes with little overhead over the reachability computation.
We then turn our attention to probabilistic non-recursive programs. For such programs, we need to compute, for each possible input-output pair (i, o), the conditional probability that the program outputs o when the input is i. Usually, these quantities are stored as a matrix, also called the channel matrix. We compute the channel matrix as the least fixed-point solution to a system of linear equations [24], which can be done using Algebraic Decision Diagrams (ADDs) [12], a generalization of BDDs. The summary of a probabilistic program now encodes (symbolically) the channel matrix, and we construct symbolic algorithms to extract the leaked information from the computed summary. We validate this approach by extending our tool to compute the summary for probabilistic non-recursive programs and implementing the symbolic algorithms for computing the information leakage.
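The least fixed point of the linear equations defining the channel matrix can be sketched by value iteration over an explicit Markov chain (the tool performs the same iteration on ADDs). The function `channel_matrix` and the coin-retry program below are our own hypothetical illustrations.

```python
def channel_matrix(states, trans, out_of, eps=1e-12):
    """Channel matrix entries P[s][o] = Pr(run from s terminates with output o),
    the least fixed point of  P[s][o] = sum_t trans[s][t] * P[t][o]
    with P[s][o] = 1 when s is a terminal state emitting o (value iteration)."""
    outs = set(out_of.values())
    P = {s: {o: float(out_of.get(s) == o) for o in outs} for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in out_of:            # terminal states are fixed
                continue
            for o in outs:
                new = sum(p * P[t][o] for t, p in trans[s].items())
                delta = max(delta, abs(new - P[s][o]))
                P[s][o] = new
        if delta < eps:
            return P

# Hypothetical program: on input bit b, emit b w.p. 1/2, emit 1-b w.p. 1/4,
# and retry w.p. 1/4.  The fixed point gives Pr(o = b | b) = 2/3.
trans = {'s0': {'o0': 0.5, 'o1': 0.25, 's0': 0.25},
         's1': {'o1': 0.5, 'o0': 0.25, 's1': 0.25}}
out_of = {'o0': '0', 'o1': '1'}
P = channel_matrix(['s0', 's1', 'o0', 'o1'], trans, out_of)
print(round(P['s0']['0'], 4))  # 0.6667
```

Once this matrix is known (explicitly here, symbolically in the tool), leakage can be computed from it directly.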
The tool implementing the algorithms for the entropy calculations is available for download at http://people.cs.missouri.edu/~chadhar/mql/. For space reasons, we have omitted proofs; they can be found in the longer version of the paper available at the same site.

Related work.
The problem of automatically computing information flow was first tackled in [3]. This approach iteratively constructs equivalence classes on inputs: two inputs are equivalent if they lead to the same output. One starts with a single equivalence class and progressively refines it whenever inputs in the same class are found to lead to different outputs. At each step, the equivalence relation is characterized using logical formulas and refined using experimental runs of the program. Once a fixed point is reached, the sizes of the resulting equivalence classes can be used to compute the information leakage. This technique is optimized in [18], where statistical techniques are used to estimate the equivalence classes. The effectiveness of the approach is demonstrated through examples, and the authors suggest that an automated tool based on these techniques can be built. The computation of the sizes of the equivalence classes is further optimized in [15].
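The end result of this refinement can be illustrated in one shot (the actual algorithm of [3] discovers the classes incrementally from experimental runs); `refine_partition` and the top-bit example are our own sketch, not code from that work.

```python
def refine_partition(inputs, output_of):
    """Group inputs into equivalence classes by the output they produce;
    the class sizes then determine the leakage (one-shot sketch of the
    fixed point reached by iterative refinement)."""
    classes = {}
    for s in inputs:
        classes.setdefault(output_of(s), []).append(s)
    return list(classes.values())

# 3-bit secret; the adversary observes only the top bit (s // 4).
blocks = refine_partition(range(8), lambda s: s // 4)
print(sorted(len(b) for b in blocks))  # [4, 4]
```

Two classes of four inputs each: under a uniform prior this corresponds to one bit of Shannon-entropy leakage.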
SMT solvers are used in [21,20,16,23] to estimate min-entropy leakage in Boolean straight-line programs. In this approach, the program summary is encoded as an SMT formula, and various model-counting techniques are used to obtain the information leaked. In [21,20], an upper bound on min-entropy leakage is computed by estimating an upper bound on the number of feasible outputs; it is easy to construct examples where the computed bound is far from the correct value, whereas our techniques yield exact values. [16] provides a toolchain which first computes the program summary as a SAT formula that is then fed to a custom #SAT solver to calculate the information leaked. [23] combines the model-counting techniques of #SMT solvers with symbolic execution; this tool can handle real C and Java programs.
For probabilistic programs, the use of model checking to compute information leakage has been explored in [8,2,10,4]. Note that the models considered in these papers are more general, as they also allow for observations other than just the outputs at the end of the program execution. [8] uses [14] to obtain the channel matrix and then computes the information leakage by hand, [10] implements an explicit-state model-checking algorithm, and [4] computes the information leakage using (forward) symbolic execution. [2] also proposes to compute the channel matrix using fixed-point iterations; once the channel matrix is computed explicitly, the information leakage can be computed. Our approach is different in that we solve the fixed-point equations symbolically and use the symbolic representation of the computed matrix directly in the computation of the information leakage.
Information leakage in programs.
Several measures of information leakage have been considered in the literature. Of these, we consider Shannon entropy and min-entropy. We assume that the reader is familiar with information theory and introduce some abbreviations and results that we shall need. For this section, we fix a Boolean program P. As discussed above, the semantics of P is a function P : S → O. If S is sampled from a distribution µ, then µ gives rise to a joint probability distribution on S and O.
Leakage based on Shannon entropy: In Shannon entropy, the information leaked by the program P is defined as SE_µ(P) := I_µ(S; O), where S and O are random variables taking values in S and O respectively according to the joint distribution µ, and I_µ(S; O) is the mutual information of the random variables S and O. When P is deterministic and µ is U, the uniform distribution on inputs, we have [3,17]

SE_U(P) = Σ_{o ∈ P(S)} (|P^{-1}(o)| / |S|) · log(|S| / |P^{-1}(o)|).

Leakage based on min-entropy: In min-entropy [25], the information leaked by the program P on uniformly distributed inputs is defined as

ME_U(P) := log Σ_{o ∈ O} max_{s ∈ S} Pr(O = o | S = s).

When P is deterministic, this simplifies to ME_U(P) = log |P(S)|, the logarithm of the number of reachable outputs.
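For a deterministic, everywhere-terminating program under uniform inputs, both measures reduce to counting preimage sizes and reachable outputs, which the following sketch computes by brute force. The function `leakages` and the low-bit example are our own illustrations.

```python
import math
from collections import Counter

def leakages(f, inputs):
    """Shannon-entropy and min-entropy leakage of a deterministic,
    everywhere-terminating program f under uniformly distributed inputs."""
    counts = Counter(f(s) for s in inputs)          # |f^-1(o)| for each output o
    n = len(inputs)
    # SE_U(P) = sum_o (|f^-1(o)|/|S|) * log(|S|/|f^-1(o)|)
    se = sum(c / n * math.log2(n / c) for c in counts.values())
    # ME_U(P) = log |f(S)|  (number of reachable outputs)
    me = math.log2(len(counts))
    return se, me

# Hypothetical program on a 3-bit secret that reveals only its lowest bit.
se, me = leakages(lambda s: s & 1, range(8))
print(se, me)  # 1.0 1.0
```

For this program the two measures coincide at one bit; in general they can differ, which is the subject of the comparison in [25].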

Algebraic Decision Diagrams.
We assume that the reader is familiar with Binary Decision Diagrams (BDDs) and merely recall some facts necessary for our presentation. Our presentation closely follows that of [24]. When speaking of BDDs, we always mean their reduced ordered form [5]. BDDs are data structures for storing functions 2^V → {0, 1}, where V = {x_1, ..., x_n} is a finite set of Boolean variables. They take the form of a rooted, directed acyclic labeled graph. Non-terminal nodes are labeled by an element of V, and terminals are labeled either 0 or 1. There are two edges out of each non-terminal node, one labeled then and the other labeled else. Assuming a fixed strict order < on V, every edge from a non-terminal labeled x to a non-terminal labeled y satisfies x < y. From now on we will often identify a function 2^V → {0, 1} with its BDD representation.
Example 1. Figure 1 shows how a BDD over the set V = {x, y, z} with the order x < y < z stores the Boolean assignments satisfying x → (y ↔ z). The figure on the left shows a (non-reduced) diagram exhaustively listing all assignments, and the right-hand side shows the resulting BDD, where for simplicity the terminal 0 and the edges leading to it have been omitted. The solid arrows are then branches and the dashed arrows are else branches.
ADDs generalize BDDs and store elements of the set 2^V → M, where V = {x_1, ..., x_n} and M is an arbitrary set. The main difference between BDDs and ADDs is that the terminal nodes contain elements of M and not just elements of {0, 1}. For our purposes, M will be either R or R^+. Analogous to BDDs, the value of a function f represented by an ADD T at (z_1, ..., z_n) ∈ 2^V is given by the label of the terminal node along the unique path from the root to a terminal such that if a non-terminal node along the path is labeled x_i, then the outgoing edge taken from x_i is labeled then if and only if z_i is true.
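This root-to-terminal evaluation can be captured in a few lines. The minimal `ADD` class below is our own sketch (no reduction or sharing, just path evaluation), instantiated on the formula of Example 1.

```python
class ADD:
    """A node of an Algebraic Decision Diagram: either a terminal holding a
    value in M, or a non-terminal labeled with a variable and two children."""
    def __init__(self, value=None, var=None, then=None, els=None):
        self.value, self.var, self.then, self.els = value, var, then, els

    def is_terminal(self):
        return self.var is None

    def eval(self, assignment):
        """Follow the unique root-to-terminal path selected by `assignment`:
        at a node labeled v, take the then-edge iff assignment[v] is true."""
        node = self
        while not node.is_terminal():
            node = node.then if assignment[node.var] else node.els
        return node.value

# x -> (y <-> z) as a 0/1-ADD with order x < y < z (the BDD of Example 1).
one, zero = ADD(value=1), ADD(value=0)
yz = ADD(var='y', then=ADD(var='z', then=one, els=zero),
         els=ADD(var='z', then=zero, els=one))
root = ADD(var='x', then=yz, els=one)
print(root.eval({'x': True, 'y': True, 'z': False}))  # 0
```

Replacing the terminal values 0 and 1 by arbitrary reals turns this 0/1-ADD into a general ADD with no change to the evaluation procedure.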
Note that a BDD is an ADD in which all terminals are either 0 or 1. Henceforth, we will refer to BDDs as 0/1-ADDs. Many efficient operations can be performed on ADDs. We list the most relevant ones for our paper.

1. The function isConst(T) checks whether T is a constant function; val(T) returns the value of T if isConst(T) is true.

2. If op is a commutative and associative binary operator on R and V_1 is a subset of the variables V, then abstract(op, V_1, T) returns the result of abstracting all the variables in V_1 by applying the operator op over all possible values taken by the variables in V_1. The result abstract(op, V_1, T) is thus a function with domain 2^(V \ V_1) and range R. For example, if T represents the function f(x_1, x_2, x_3, ..., x_n), then abstract(+, {x_1, x_2}, T) returns the ADD representing the function

f(true, true, x_3, ..., x_n) + f(true, false, x_3, ..., x_n) + f(false, true, x_3, ..., x_n) + f(false, false, x_3, ..., x_n).

3. For a 0/1-ADD T, orAbstract(V_1, T) abbreviates abstract(∨, V_1, T), i.e., abstracting all the variables in V_1 by applying disjunction over all possible values taken by the variables in V_1.
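The abstract operation can be sketched on an explicit function representation rather than on ADDs; the function `abstract` below and its example are our own illustration of the semantics (a real implementation would work recursively on the diagram, as in CUDD).

```python
from functools import reduce
from itertools import product

def abstract(op, v1, f, variables):
    """abstract(op, V1, T): eliminate the variables in V1 by combining, with op,
    the values of f over all valuations of V1 (explicit sketch, not an ADD)."""
    rest = [v for v in variables if v not in v1]
    g = {}
    for outer in product([False, True], repeat=len(rest)):
        env = dict(zip(rest, outer))
        vals = []
        for inner in product([False, True], repeat=len(v1)):
            env.update(zip(v1, inner))     # extend env with a valuation of V1
            vals.append(f(env))
        g[outer] = reduce(op, vals)        # fold the 2^|V1| values with op
    return g

# f(x1, x2, x3) = x1 and (x2 or x3); summing out x1, x2 leaves a function of x3.
f = lambda e: int(e['x1'] and (e['x2'] or e['x3']))
g = abstract(lambda a, b: a + b, ['x1', 'x2'], f, ['x1', 'x2', 'x3'])
print(g)  # {(False,): 1, (True,): 2}
```

With op = + this counts satisfying extensions, which is precisely the use made of abstract in the leakage computations below; with disjunction it yields orAbstract.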

Leakage in non-probabilistic programs
In this section, we describe our ADD-based algorithms for computing the information leaked by deterministic programs when the leakage is measured using (a) min-entropy and (b) Shannon entropy. We fix some notation. Consider a set of variables G = {x_1, ..., x_n}, and let G' = {x_1', ..., x_n'} be a set of distinct variables disjoint from G. Note that there is a one-to-one correspondence between elements of 2^G and 2^G': every element (z_1, ..., z_n) of 2^G can be identified with a unique element (z_1', ..., z_n') of 2^G' and vice versa. G shall represent the initial values of the variables of a program and G' their final values. In this section, we assume that all possible valuations of G are valid inputs to the program (hence our input domain is always a power of 2). We discuss how to restrict the input domain in the longer version of the paper.
Observe that, thanks to the correspondence between OBDDs and Boolean functions, T_P can be considered as an OBDD on the set of variables G ∪ G'. Now, T_P can be seen as the least fixed point of a system of Boolean equations, which can be constructed efficiently by iterative methods. For our purposes, it suffices to say that BDD-based model checkers essentially construct this relation for us (and, if not, can be modified to carry out this construction). We assume for the rest of the paper that T_P is constructed by a BDD-based model checker. It remains to show how to exploit T_P to compute the information leaked by P.

As a running example, consider the program Pex. Here, variables s1 and s2 are high-security input-only variables and o1, o2 are low-security output-only variables; this is why we initialize o1 and o2 to false and set s1 and s2 to false before the end of the program. Observe also that the program does not terminate when s1 is true at the beginning of the program. Assuming the order s1 < s1' < s2 < s2' < o1 < o1' < o2 < o2', the transition relation of Pex is shown as a 0/1-ADD in Figure 2 (a).
For the rest of the section, unless otherwise stated, we fix the Boolean program P. We assume that G = {x_1, ..., x_n} is the set of global variables of P and that G' = {x_1', ..., x_n'} is a set of distinct variables disjoint from G. The summary of P will be referred to as T_P.
Leakage measured using min-entropy. The amount of information leaked by the program P when using min-entropy as the measure of information is computed as follows. Let post(2^G) = {o ∈ 2^G' | ∃s ∈ 2^G. (s, o) ∈ T_P} be the set of outputs that P can produce. If P terminates on every input, then ME_U(P) = log |post(2^G)|; otherwise, non-termination is one additional possible observation, and ME_U(P) = log(|post(2^G)| + 1). Thus, to compute the min-entropy leakage, we need to compute |post(2^G)| and check whether there is an input on which the program P never terminates. The following lemma shows how these two tasks can be achieved using ADDs.

Lemma 4. Let T_out,P = orAbstract(G, T_P) and T_term,P = orAbstract(G', T_P). Then:
1. |post(2^G)| = val(abstract(+, G', T_out,P)).
2. P terminates on every input iff isConst(T_term,P) and val(T_term,P) = 1.
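The two abstractions of Lemma 4 can be mirrored on an explicit summary relation. The function `min_entropy_summary` below is our own set-based sketch (the tool performs these steps on ADDs), under the reading that non-termination counts as one extra observation when it can occur.

```python
import math

def min_entropy_summary(summary, num_inputs):
    """ME_U(P) from a summary given as a set of (input, output) pairs.
    t_out mirrors orAbstract(G, T_P)   -- the reachable outputs;
    t_term mirrors orAbstract(G', T_P) -- the inputs on which P terminates."""
    t_out = {o for (_, o) in summary}
    t_term = {i for (i, _) in summary}
    terminates_everywhere = len(t_term) == num_inputs
    k = len(t_out) + (0 if terminates_everywhere else 1)
    return math.log2(k)

# Pex-like behaviour on secrets (s1, s2): diverge when s1 holds, else reveal s2.
summary = {((False, s2), s2) for s2 in (False, True)}
print(min_entropy_summary(summary, 4))  # log2(3), about 1.585
```

Here there are two reachable outputs plus the possibility of divergence, hence log 3 bits of min-entropy leakage.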

Example 5. Consider the program Pex from Figure 2.
Observe that the program terminates only when s1 is false, in which case the final value of s1 is also false. The initial values of o1 and o2 do not affect the output. The final values of s2 and o1 are always false. The final value of o2 is exactly the initial value of s2. Thus, there are two possible outputs, (false, false, false, true) and (false, false, false, false), each of which occurs for exactly 4 inputs. The ADD representing T_out,P, the set of all possible outputs of Pex, is given in Figure 2 (b). Note that o2' does not appear in the picture because the then and else branches of o2' would lead to isomorphic subtrees. Observe that abstract(+, G', T_out,P) is the constant ADD 2. The ADD T_term,P, representing all possible inputs on which Pex terminates, is given in Figure 2 (c).

Theorem 6. For a program P with global variables G = {x_1, ..., x_n}, let G' = {x_1', ..., x_n'} be a set of distinct variables disjoint from G. Let T_P be the summary of P represented as a 0/1-ADD on G ∪ G'. Then Algorithm 1 computes ME_U(P).

Leakage measured using Shannon entropy. We now consider the information leaked by P when measured using Shannon entropy. We need to compute

SE_U(P) = n − (1/2^n) · Σ_{o ∈ post(2^G)} |P^{-1}(o)| log |P^{-1}(o)|.

In order to compute this sum, we need a new auxiliary definition.

Definition 7. Let ⊙ : R^+ × R^+ → R be the binary operator defined as r_1 ⊙ r_2 = r_1 log r_1 + r_2 log r_2.

Implementation and experiments.
Moped does not compute the best variable ordering and gives the user the flexibility to choose the ordering. Hence, the examples for which the default ordering of variables (which follows the order of declaration of the variables in the source file) proved inefficient have been rewritten with supposedly efficient variable orderings. The principal obstacle here is the computation of the summary; the computation of the leakage itself adds little overhead. We illustrate our orderings using the variables O (for the public outputs) and S (for the private inputs). Let O_N (O_1) be the most (least) significant bit of O, and likewise S_N (S_1) the most (least) significant bit of S.
We primarily used two kinds of orderings:

FSTTCS 2014
Contiguous ordering: This is the default ordering of the tool, where we set S_N < ... < S_1 < O_N < ... < O_1.

Interleaved ordering: In this ordering, we set S_N < O_N < S_{N-1} < O_{N-1} < ... < S_1 < O_1.

The choice of an ordering depends largely on the structure and semantics of the program. The ADDs produced are generally smaller if a variable v_1 is closer to a variable v_2 on whose value the value of v_1 depends. Essentially, as long as variables are only compared to and assigned constants in the program, the default ordering works very well, and in that case we do not even attempt the interleaved ordering. For other examples, we typically switch to the interleaved ordering, as the contiguous ordering becomes inefficient very quickly as the number of bits grows and the ADDs become very large. Following this principle, we have also reordered the variable declarations in one example (see Mix and Duplicate below) so that variables with a constant difference in their indices are closer together.
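The effect of the two orderings can be measured on a tiny example. The helper `robdd_size` below is our own sketch: it builds a reduced OBDD by Shannon expansion with a unique table and counts the non-terminal nodes, for the bitwise-equality function where output bit o_i must match secret bit s_i.

```python
def robdd_size(f, order):
    """Number of distinct non-terminal nodes in the ROBDD of f under `order`
    (built by naive Shannon expansion with a unique table; a measurement
    sketch, not an efficient BDD package)."""
    table = {}

    def build(i, env):
        if i == len(order):
            return f(env)                  # terminal: True or False
        v = order[i]
        lo = build(i + 1, {**env, v: False})
        hi = build(i + 1, {**env, v: True})
        if lo == hi:                       # redundant-test elimination
            return lo
        key = (v, lo, hi)
        return table.setdefault(key, key)  # node sharing via the unique table
    build(0, {})
    return len(table)

# f = (s1<->o1) and (s2<->o2) and (s3<->o3): equality of two 3-bit words.
f = lambda e: all(e[f's{i}'] == e[f'o{i}'] for i in (1, 2, 3))
contiguous = ['s1', 's2', 's3', 'o1', 'o2', 'o3']
interleaved = ['s1', 'o1', 's2', 'o2', 's3', 'o3']
print(robdd_size(f, contiguous), robdd_size(f, interleaved))  # 21 9
```

Under the contiguous order the diagram must remember all of S before reading O and grows exponentially in the word width, while the interleaved order keeps it linear, matching the heuristic stated above.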
Table 1 presents some selected benchmark programs that we used to test Moped-QLeak. The examples have been derived from [21]. The experiments were conducted on a 64-bit Xeon-X5650 2.67 GHz Linux machine. Unless otherwise stated, S and O are 32-bit unsigned integers in all the programs. For each example, we give the name, the ordering, the Shannon-entropy (SE) and min-entropy (ME) leakage values, the execution time of the tool in seconds, and the data types that occur in the example, which are either all Boolean or integers with a specified number of bits. If the example uses restricted domains, we mention this in the data types. The order is either the contiguous default order (D), the interleaved order (I), or another example-specific order (S). There is one example from [21], Population Count, for which the computation of the summary never succeeds, as there is no good variable ordering for it. Note that we run the tool to compute the two leakage values separately and report the worse of the two times; the time difference between the two computations is almost always within 3–4 microseconds.

We also extended our approach to probabilistic (non-recursive) programs. Moped does not support probabilistic model-checking, so we implemented the symbolic fixed-point algorithms for computing the summary directly in Moped-QLeak. We used Moped-QLeak to compute the information leakage in the dining cryptographers problem. The symbolic algorithms and the results are discussed in detail in the longer version of the paper available at http://people.cs.missouri.edu/~chadhar/mql/.

Conclusions and future work
We gave symbolic algorithms for computing the information leaked by Boolean programs when information leakage is measured using min-entropy and Shannon entropy. The advantage of our approach is that these algorithms can be integrated into any BDD-based model-checking tool that computes reachability in Boolean programs by computing program summaries. We carried out such an integration with Moped, with promising experimental results. The leakage calculations themselves add little overhead; the main limiting factor seems to be the size of the OBDDs constructed in the computation. As is standard with symbolic approaches, the size of the BDDs is sensitive to the variable ordering. Since Moped by itself does not compute the most efficient ordering (and puts the onus on the user), we sometimes had to rewrite our examples to achieve good performance. We also generalized our symbolic algorithms to compute information leakage in probabilistic programs; these algorithms have likewise been implemented in Moped-QLeak.
In order to make symbolic model-checking more amenable to automation, many automated abstraction-refinement techniques have been proposed in the literature. We plan to investigate these techniques for our symbolic algorithms; in particular, we plan to integrate the counterexample-guided abstraction-refinement framework into them. Currently, our implementation only supports non-recursive programs. However, the algorithms we presented for computing information leakage assume only that program summaries can be computed. Thus, in principle, we can support programs that have both recursion and probabilistic choices, and we plan to extend support to such programs in the future.

Figure 1 An unreduced decision diagram (left) and a corresponding BDD (right).

Figure 2 (a) Transition relation of the program Pex. The ordering assumed is s1 < s1' < s2 < s2' < o1 < o1' < o2 < o2'. (b) All possible outputs of the program Pex as an ADD. (c) All possible inputs on which Pex terminates, represented as an ADD. (d) T_eq-size,P for Pex as an ADD. (e) T_non-term,P for Pex as an ADD.

Table 1
Examples used for evaluation.