Combinatorial redundancy detection

The problem of detecting and removing redundant constraints is fundamental in optimization. We focus on the case of linear programs (LPs) in dictionary form, given by $n$ equality constraints in $n+d$ variables, where the variables are constrained to be nonnegative. A variable $x_r$ is called redundant if, after removing $x_r \ge 0$, the LP still has the same feasible region. The time needed to solve such an LP is denoted by $\textit{LP}(n,d)$. It is easy to see that solving $n+d$ LPs of the above size is sufficient to detect all redundancies.
The currently fastest practical method is the one by Clarkson: it solves $n+d$ linear programs, but each of them has at most $s$ variables, where $s$ is the number of nonredundant constraints. In the first part we show that knowing all of the finitely many dictionaries of the LP is sufficient for the purpose of redundancy detection. A dictionary is a matrix that can be thought of as an enriched encoding of a vertex in the LP. Moreover (and this is the combinatorial aspect) it is enough to know only the signs of the entries; the actual values do not matter. Concretely, we show that for any variable $x_r$ one can find a dictionary such that its sign pattern is either a redundancy or a nonredundancy certificate for $x_r$.
In the second part we show that, considering only the sign patterns of the dictionaries, there is an output-sensitive algorithm of running time $\mathcal{O}(d \cdot (n+d) \cdot s^{d-1} \cdot \textit{LP}(s,d) + d \cdot s^{d} \cdot \textit{LP}(n,d))$ to detect all redundancies. In the case where all constraints are in general position, the running time is $\mathcal{O}(s \cdot \textit{LP}(n,d) + (n+d) \cdot \textit{LP}(s,d))$, which is essentially the running time of the Clarkson method. Our algorithm extends naturally to the more general setting of oriented topological hyperplane arrangements.


Introduction
The problem of detecting and removing redundant constraints is fundamental in optimization. Being able to understand redundancies in a model is an important step towards improvements of the model and faster solutions.
In this paper, we focus on redundancies in systems of linear inequalities. We consider systems of the form
$$x_B = b - A x_N, \quad x_B \ge 0, \quad x_N \ge 0, \tag{1}$$
where $B$ and $N$ are disjoint finite sets of variable indices with $|B| = n$, $|N| = d$, and $b \in \mathbb{R}^B$ and $A \in \mathbb{R}^{B \times N}$ are a given input vector and matrix. We assume that the system (1) has a feasible solution. Any consistent system of linear equalities and inequalities can be reduced to this form.
A variable $x_r$ is called redundant in (1) if $x_B = b - A x_N$ and $x_i \ge 0$ for $i \in (B \cup N) \setminus \{r\}$ implies $x_r \ge 0$, i.e., if after removing the constraint $x_r \ge 0$ from (1) the resulting system still has the same feasible region. Testing redundancy of $x_r$ can be done by solving the linear program (LP)
$$\text{minimize } x_r \quad \text{subject to } x_B = b - A x_N, \;\; x_i \ge 0 \;\; \forall i \in (B \cup N) \setminus \{r\}. \tag{2}$$
Namely, a variable $x_r$ is redundant if and only if the LP has an optimal solution and the optimal value is nonnegative. Let $\textit{LP}(n, d)$ denote the time needed to solve an LP of form (2). Throughout the paper, we work in the real RAM model of computation, where practical algorithms are known, but no polynomial bounds on $\textit{LP}(n, d)$. However, our results translate to the standard Turing machine model, where they would involve bounds of the form $\textit{LP}(n, d, \ell)$, with $\ell$ being the bit size of the input. In this case, $\textit{LP}(n, d, \ell)$ can be polynomially bounded. The notation $\textit{LP}(n, d)$ abstracts from the concrete representation of the LP, and also from the algorithm being used; as a consequence, we can also apply it in the context of LPs given by the signs of their dictionaries.
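As a small illustration of test (2) (the instance is our own choice and not part of the original development), take $N = \{1\}$, $B = \{2, 3\}$, $b = (1, 2)^T$ and $A = (1, 1)^T$, so that system (1) reads
\[
x_2 = 1 - x_1, \qquad x_3 = 2 - x_1, \qquad x_1, x_2, x_3 \ge 0,
\]
with feasible region $0 \le x_1 \le 1$. Applying (2) to each variable gives
\[
\begin{aligned}
r = 3:&\quad \min\{\,2 - x_1 : x_1 \ge 0,\ 1 - x_1 \ge 0\,\} = 1 \ge 0 &&\Rightarrow\ x_3 \text{ is redundant},\\
r = 2:&\quad \min\{\,1 - x_1 : x_1 \ge 0,\ 2 - x_1 \ge 0\,\} = -1 < 0 &&\Rightarrow\ x_2 \text{ is nonredundant},\\
r = 1:&\quad \min\{\,x_1 : 1 - x_1 \ge 0,\ 2 - x_1 \ge 0\,\} \text{ is unbounded} &&\Rightarrow\ x_1 \text{ is nonredundant}.
\end{aligned}
\]
In this instance $n + d = 3$ and the number of nonredundant variables is $2$.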
By solving $n + d$ linear programs, $O((n + d) \cdot \textit{LP}(n, d))$ time is enough to detect all redundant variables in the real RAM model, but it is natural to ask whether there is a faster method. The currently fastest practical method is the one by Clarkson with running time $O((n + d) \cdot \textit{LP}(s, d) + s \cdot n \cdot d)$ (Clarkson 1994). This method also solves $n + d$ linear programs, but each of them has at most $s$ variables, where $s$ is the number of nonredundant variables. Hence, if $s \ll n$ (the relevant case for redundancy removal), this output-sensitive algorithm is a major improvement. This case arises quite naturally. For example, when one needs to compute a projection of a polyhedron, one natural method is the Fourier-Motzkin elimination. The method is known to generate a large number of redundant constraints in each step (a quadratic increase in the worst case) and it is essential to remove redundant constraints frequently for any practical implementation. [The reader is referred to Schrijver (1986, pp. 155-156) for more details.] A related (dual) problem is the one of finding the extreme points among a set $P$ of $n$ points in $\mathbb{R}^d$. A point $p \in P$ is extreme in $P$ if $p$ is not contained in the convex hull of $P \setminus \{p\}$. It is not hard to see that this problem is a special case of redundancy detection in linear systems.
Specialized (and output-sensitive) algorithms for the extreme points problem exist (Ottmann et al. 1995; Dulá et al. 1998), but they essentially follow the ideas of Clarkson's algorithm (Clarkson 1994). For fixed $d$, Chan (1996) uses elaborate data structures from computational geometry to obtain a slight improvement over Clarkson's method. In this paper, we study the combinatorial aspects of redundancy detection in linear systems. The basic questions are: What kind of information about the linear system do we need in order to detect all redundant variables? With this restricted set of information, how fast can we detect all of them? Our motivation is to explore and understand the boundary between geometry and combinatorics with respect to redundancy. For example, Clarkson's method (1994) uses ray shooting, an intrinsically geometric procedure; similarly, the dual extreme points algorithms (Ottmann et al. 1995; Dulá et al. 1998) use scalar products. In a purely combinatorial setting, neither ray shooting nor scalar products are well-defined notions, so it is natural to ask whether we can do without them.
We will show that our results solely depend on the finite combinatorial information given by the signed dictionaries, i.e., the size of this information is bounded by a function of $d$ and $n$ only. A dictionary can be thought of as an encoding of the associated arrangement of hyperplanes; the corresponding signed dictionary only contains the signs of the encoding (see Sect. 2). Clarkson's algorithm, on the other hand, depends on the input data $A$ and $b$.
Our approach is very similar to the combinatorial viewpoint of linear programming pioneered by Matoušek et al. (1996) in the form of the concept of LP-type problems. The question they ask is: how quickly can we optimize, given only combinatorial information? As we consider redundancy detection and removal important for efficient optimization, it is very natural to extend the combinatorial viewpoint to also include the question of redundancy. The results that we obtain are first steps and leave ample space for improvement. An immediate theoretical benefit is that we can handle redundancy detection in structures that are more general than systems of linear inequalities; most notably, our results naturally extend to the realm of oriented matroids (Björner et al. 1993, Chapter 10). An oriented matroid can be thought of as an arrangement of oriented topological halfspaces (also called pseudo halfspaces). Such an arrangement can be encoded in the same manner by its associated dictionaries, and the following methods extend naturally.

Statement of results
First of all, we note that for the purpose of redundancy testing, it is sufficient to know all the finitely many dictionaries associated with the system of inequalities (1). Moreover, we show that it is sufficient to know only the signed dictionaries, i.e., the signs of the dictionary entries. Their actual numerical values do not matter.
In Theorem 1, we give a characterization of such a redundancy certificate. More precisely, we show that, for every redundant variable x r there exists at least one signed dictionary such that its sign pattern is a redundancy certificate of x r . Similarly, as shown in Theorem 2, for every nonredundant variable there exists a nonredundancy certificate. An alternative nonredundancy certificate is given in Theorem 5. Such a single certificate can be detected in time LP(n, d) (see Sect. 4.3). The number of dictionaries needed to detect all redundancies depends on the LP and can vary between constant and linear in n + d (see Sect. 6).
In a second part, we present a Clarkson-type, output-sensitive algorithm that detects all redundancies in running time $O(d \cdot (n + d) \cdot s^{d-1} \cdot \textit{LP}(s, d) + d \cdot s^{d} \cdot \textit{LP}(n, d))$ (Theorem 3). Under some general position assumptions the running time can be improved to $O((n + d) \cdot \textit{LP}(s, d) + s \cdot \textit{LP}(n, d))$, which is basically the running time of Clarkson's algorithm. In these bounds, $\textit{LP}(n, d)$ denotes the time to solve an LP to which we have access only through signed dictionaries. As in the real RAM model, no polynomial bounds are known, but algorithms that are fast in practice exist.
In general our algorithm's running time is worse than Clarkson's, but it only requires the combinatorial information of the system and not its actual numerical values. If the feasible region is not full dimensional (i.e. not of dimension d), then a redundant constraint may become nonredundant after the removal of some other redundant constraints. To avoid these dependencies of the redundant constraints we assume full dimensionality of the feasible region. Because of our purely combinatorial characterizations of redundancy and nonredundancy, our algorithm works in the combinatorial setting of oriented matroids (Björner et al. 1993), and can be applied to remove redundancies from oriented topological hyperplane arrangements.
A preliminary version of these results has appeared in the 31st International Symposium on Computational Geometry (Fukuda et al. 2015).

Basics
Before discussing redundancy removal and combinatorial aspects in linear programs, we fix the basic notation on linear programming (such as dictionaries and pivot operations) and review finite pivot algorithms. [For further details and proofs see e.g. Chvatal (1980, Part 1), Fukuda (2011a, Chapter 4).]

LP in dictionary form
Throughout, if not stated otherwise, we always consider linear programs (LPs) of the form
$$\text{minimize } c^T x_N \quad \text{subject to } x_B = b - A x_N, \;\; x_E \ge 0, \tag{3}$$
where $E := B \cup N$ and, as introduced in (1), $B$ and $N$ are disjoint finite sets of variable indices with $|B| = n$, $|N| = d$, and $b \in \mathbb{R}^B$ and $A \in \mathbb{R}^{B \times N}$ are a given input vector and matrix. An LP of this form is called an LP in dictionary form and its size is $n \times d$. The set $B$ is called an (initial) basis, $N$ an (initial) nonbasis and $c^T x_N$ the objective function.
The feasible region of the LP is defined as the set of $x \in \mathbb{R}^E$ that satisfy all constraints, i.e., the set $\{x \in \mathbb{R}^E \mid x_B = b - A x_N,\ x_E \ge 0\}$. The LP is called unbounded if for every $k \in \mathbb{R}$ there exists a feasible solution $x$ such that $c^T x_N \le k$. If there exists no feasible solution, the LP is called infeasible.
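The dictionary $D(B) \in \mathbb{R}^{(B \cup \{f\}) \times (N \cup \{g\})}$ of an LP (3) w.r.t. a basis $B$ is defined as
$$D = D(B) := \begin{pmatrix} 0 & c^T \\ b & -A \end{pmatrix},$$
where the first row is indexed by $f$ and the first column by $g$. Hence by setting $x_f := c^T x_N$, we can rewrite (3) as
$$\text{minimize } x_f \quad \text{subject to } x_{B \cup \{f\}} = D x_{N \cup \{g\}}, \;\; x_g = 1, \;\; x_E \ge 0. \tag{4}$$
Whenever we do not care about the objective function, we may set $c = 0$, and with abuse of notation write $D = [b, -A]$. The basic solution w.r.t. $B$ is the unique solution $x$ to $x_{B \cup \{f\}} = D x_{N \cup \{g\}}$ such that $x_g = 1$, $x_N = 0$ and hence $x_{B \cup \{f\}} = D_{\bullet g}$.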
The dual LP of LP (4) is defined as
$$\text{minimize } y_g \quad \text{subject to } y_{N \cup \{g\}} = -D^T y_{B \cup \{f\}}, \;\; y_E \ge 0, \;\; y_f = 1. \tag{5}$$
It is useful to define the following four different types of dictionaries (and bases) as shown in Fig. 1 below, where "+" denotes positivity, "⊕" nonnegativity and similarly "−" negativity and "⊖" nonpositivity.
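A dictionary $D$ (or the associated basis $B$) is called feasible, optimal, inconsistent or dual inconsistent if its sign pattern is of the corresponding form shown in Fig. 1. The following proposition follows from standard calculations.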
Proposition 1 For any LP in dictionary form the following statements hold.
1. If the dictionary is feasible then the associated basic solution is feasible.

2. If the dictionary is optimal, then the associated basic solution is optimal.
3. If the dictionary is inconsistent, then the LP is infeasible.

Pivot operations
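We now show how to transform the dictionary of an LP into a modified dictionary using elementary matrix operations, preserving the equivalence of the associated linear system. This operation is called a pivot operation.
Let $p \in B$, $q \in N$ and $d_{pq} \ne 0$. Then it is easy to see that one can transform $x_{B \cup \{f\}} = D x_{N \cup \{g\}}$ to an equivalent system (i.e., with the same solution set) $x_{B' \cup \{f\}} = D' x_{N' \cup \{g\}}$, where $B' = (B \setminus \{p\}) \cup \{q\}$, $N' = (N \setminus \{q\}) \cup \{p\}$ and
$$d'_{ij} = \begin{cases} 1/d_{pq} & \text{if } i = q,\ j = p,\\ -d_{pj}/d_{pq} & \text{if } i = q,\ j \ne p,\\ d_{iq}/d_{pq} & \text{if } i \ne q,\ j = p,\\ d_{ij} - d_{iq} d_{pj}/d_{pq} & \text{if } i \ne q,\ j \ne p. \end{cases} \tag{6}$$
We call a dictionary terminal if it is optimal, inconsistent or dual inconsistent. There are several finite pivot algorithms, such as the simplex and the criss-cross method, that transform any dictionary into one of the terminal dictionaries (Dantzig 1963, Section 4; Terlaky 1987; Wang 1987; Fukuda and Terlaky 1997). This will be discussed further in Sect. 4.3.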
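The pivot operation can be sketched in a few lines of Python. This is a minimal illustration under our own conventions, not code from the paper: the helper names `make_dictionary` and `pivot`, the list-of-lists layout (row `'f'` first, column `'g'` first) and the toy instance $x_2 = 1 - x_1$, $x_3 = 2 - x_1$ are all assumptions made for the example. Exact rational arithmetic is used so that signs are never affected by rounding.

```python
from fractions import Fraction

def make_dictionary(b, A, c, basis, nonbasis):
    """Build D = [[0, c^T], [b, -A]] for min c^T x_N s.t. x_B = b - A x_N.

    Rows are labelled ['f'] + basis and columns ['g'] + nonbasis.
    """
    rows = ['f'] + list(basis)
    cols = ['g'] + list(nonbasis)
    D = [[Fraction(0)] + [Fraction(cj) for cj in c]]
    for bi, Ai in zip(b, A):
        D.append([Fraction(bi)] + [-Fraction(a) for a in Ai])
    return rows, cols, D

def pivot(rows, cols, D, p, q):
    """Exchange basic index p with nonbasic index q; requires d_pq != 0."""
    i0, j0 = rows.index(p), cols.index(q)
    d_pq = D[i0][j0]
    if d_pq == 0:
        raise ValueError("pivot entry must be nonzero")
    new = [[None] * len(cols) for _ in rows]
    for i in range(len(rows)):
        for j in range(len(cols)):
            if i == i0 and j == j0:      # new row q, new column p
                new[i][j] = 1 / d_pq
            elif i == i0:                # new row q, other columns
                new[i][j] = -D[i0][j] / d_pq
            elif j == j0:                # other rows, new column p
                new[i][j] = D[i][j0] / d_pq
            else:                        # all remaining entries
                new[i][j] = D[i][j] - D[i][j0] * D[i0][j] / d_pq
    new_rows = [q if r == p else r for r in rows]
    new_cols = [p if c == q else c for c in cols]
    return new_rows, new_cols, new

# Toy instance (an assumption): x_2 = 1 - x_1, x_3 = 2 - x_1, objective c = 0.
rows, cols, D = make_dictionary(b=[1, 2], A=[[1], [1]], c=[0],
                                basis=[2, 3], nonbasis=[1])
rows2, cols2, D2 = pivot(rows, cols, D, p=2, q=1)
```

After the pivot, the rows of `D2` encode $x_1 = 1 - x_2$ and $x_3 = 1 + x_2$, which describe the same solution set as the original system; pivoting back on the pair $(1, 2)$ restores the original dictionary.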

Combinatorial redundancy
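Consider an LP in dictionary form as given in (3). Then $x_r \ge 0$ is redundant if the removal of the constraint does not change the feasible solution set, i.e., if
$$\text{minimize } c^T x_N \quad \text{subject to } x_B = b - A x_N, \;\; x_i \ge 0 \;\; \forall i \in E \setminus \{r\} \tag{7}$$
has the same feasible solution set as (3). Then the variable $x_r$ and the index $r$ are called redundant.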
If the constraint x r ≥ 0 is not redundant it is called nonredundant, in that case the variable x r and the index r are called nonredundant.
It is not hard to see that solving $n + d$ LPs of the same size as (7) suffices to find all redundancies. Hence running time $O((n + d) \cdot \textit{LP}(n, d))$ suffices to find all redundancies, where $\textit{LP}(n, d)$ is the time needed to solve an LP of size $n \times d$. Clarkson showed that it is possible to find all redundancies in time $O((n + d) \cdot \textit{LP}(s, d) + s \cdot n \cdot d)$, where $s$ is the number of nonredundant variables (Clarkson 1994). In the case where $s \ll n$, this is a major improvement. To be able to execute Clarkson's algorithm, one needs to assume full dimensionality and an interior point of the feasible solution set. In the LP setting this can be done by some preprocessing, including solving a few ($O(d)$) LPs (Fukuda 2016, Section 8).
In the following we focus on the combinatorial aspect of redundancy removal. We give a combinatorial way, the dictionary oracle, to encode LPs in dictionary form, where we are basically only given the signs of the entries of the dictionaries. In Sect. 4 we will show how the signs suffice to find all redundant and nonredundant constraints of an LP in dictionary form.
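Consider an LP of form (3). For any given basis $B$, the dictionary oracle returns a matrix
$$D^\sigma = D^\sigma(B) \in \{+, -, 0\}^{B \times (N \cup \{g\})}, \qquad d^\sigma_{ij} = \operatorname{sign}(d_{ij}) \;\; \text{for } i \in B,\ j \in N \cup \{g\}.$$
Namely, for basis $B$, the oracle simply returns the matrix containing the signs of $D(B)$, without the entries of the objective row $f$.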
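To make the oracle concrete, the following sketch extracts the sign pattern from a numeric dictionary. The function name `signed_dictionary` and the nested-dict return format are illustrative assumptions; in the oracle model itself only the output of such a function would be visible, never the numeric matrix.

```python
def sign(v):
    """Return '+', '-' or '0' according to the sign of v."""
    return '+' if v > 0 else '-' if v < 0 else '0'

def signed_dictionary(rows, cols, D):
    """Sign pattern of D on the basic rows; the objective row 'f' is dropped.

    rows/cols are the row and column labels of the matrix D (list of lists).
    The result maps each basic index to a dict {column label: sign}.
    """
    return {r: {c: sign(D[i][j]) for j, c in enumerate(cols)}
            for i, r in enumerate(rows) if r != 'f'}

# Toy dictionary (an assumption): x_2 = 1 - x_1, x_3 = 2 - x_1, zero objective.
signs = signed_dictionary(['f', 2, 3], ['g', 1], [[0, 0], [1, -1], [2, -1]])
```

For the toy dictionary, both basic rows have a positive entry in column $g$ and a negative entry in column $1$.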

Certificates
We show that the dictionary oracle is enough to detect all redundancies and nonredundancies of the variables in $E$. More precisely, for every $r \in E$ there exists a basis $B$ such that $D^\sigma(B)$ is either a redundancy or a nonredundancy certificate for $x_r$. We give a full characterization of the certificates in Theorems 1 and 2. An alternative characterization of the nonredundancy certificate is given in Theorem 5. The number of dictionaries needed to have all certificates depends on the LP. See Sect. 6 for examples where constantly many suffice and where linearly many are needed. For convenience, throughout we make the following assumptions, which can be satisfied with simple preprocessing.

Assumption 1
The feasible region of (3) is full dimensional (and hence non-empty).
Assumption 2 There is no $j \in N$ such that $d_{ij} = 0$ for all $i \in B$.
In Sect. 4.3 we will see that both the criss-cross and the simplex method can be used on the dictionary oracle for certain objective functions. Testing whether the feasible solution set is empty can hence be done by solving one linear program in the oracle setting. As mentioned in the introduction the full-dimensionality assumption is made to avoid dependencies between the redundant constraints. This can be achieved by some preprocessing on the LP, including solving a few (O(d)) LPs (Fukuda 2016).
It is easy to see that if there exists a column $j$ such that $d_{ij} = 0$ for all $i \in B$, then $x_j$ is nonredundant: it can take any value independently of the other variables, so in particular there are solutions with $x_j < 0$. The redundancies and nonredundancies of all other variables are independent of $x_j$, hence we can mark $x_j$ as nonredundant and simply remove the column.

A certificate for redundancy in the dictionary oracle
We say a basis $B$ is $r$-redundant if $r \in B$ and the row of $D^\sigma(B)$ indexed by $r$ is of the form of Fig. 2, i.e., all entries of this row are nonnegative. Note that an $r$-redundant basis is not necessarily feasible. Since the $r$-th row of the dictionary represents $x_r = d_{rg} + \sum_{j \in N} d_{rj} x_j$, the inequality $x_r \ge 0$ is satisfied as long as $x_j \ge 0$ for all $j \in N$. Hence $x_r \ge 0$ is redundant for (3).
Theorem 1 (Redundancy Certificate) An inequality x r ≥ 0 is redundant for the system (3) if and only if there exists an r -redundant basis.
Proof We only have to show the "only if" part.
Suppose x r ≥ 0 is redundant for the system (3). We will show that there exists an r -redundant basis.
Consider the LP minimizing the variable $x_r$ subject to the system (3) without the constraint $x_r \ge 0$. Since $x_r \ge 0$ is redundant for the system (3), the LP is bounded. By Assumption 1 and the fact that every finite pivot algorithm terminates in a terminal dictionary, the LP has an optimal dictionary. If the initial basis contains $r$, then we can consider the row associated with $r$ as the objective row and apply any finite pivot algorithm to the LP. Otherwise, $r$ is nonbasic. By Assumption 2, one can pivot on the $r$-th column to make $r$ a basic index. This reduces the case to the first case.
Consider an optimal basis and optimal dictionary for the LP where $x_r$ is the objective function. Since it is optimal, all entries $d_{rj}$ for $j \in N$ are nonnegative. Furthermore, $d_{rg}$ is nonnegative, as otherwise we would have found a solution that satisfies all constraints except $x_r \ge 0$, implying nonredundancy of $x_r$.
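From the proof of Theorem 1 the following strengthening of Theorem 1 is immediate.
Corollary 1 An inequality $x_r \ge 0$ is redundant for the system (3) if and only if there exists a feasible $r$-redundant basis (see Fig. 3).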
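The redundancy certificate can be checked mechanically. The sketch below is our own illustration under stated assumptions: it stores a dictionary without objective row as a nested dict, and it finds a certificate by brute-force enumeration of all bases reachable by pivots, which is exponential in general and is not the finite pivot algorithm used in the proof. The names `pivot`, `is_r_redundant` and `find_redundancy_certificate` and the toy instance are hypothetical. Note that the certificate test itself only inspects signs.

```python
from collections import deque
from fractions import Fraction

def pivot(D, p, q):
    """Exchange basic p and nonbasic q in D = {row: {col: value}}."""
    d = D[p][q]
    new = {q: {(p if j == q else j): (1 / d if j == q else -v / d)
               for j, v in D[p].items()}}
    for i, row in D.items():
        if i != p:
            new[i] = {(p if j == q else j):
                      (row[q] / d if j == q else v - row[q] * D[p][j] / d)
                      for j, v in row.items()}
    return new

def is_r_redundant(D, r):
    """Certificate test: r is basic and every entry of row r is >= 0."""
    return r in D and all(v >= 0 for v in D[r].values())

def find_redundancy_certificate(D, r):
    """Breadth-first search over bases; returns a certifying basis or None."""
    seen, queue = {frozenset(D)}, deque([D])
    while queue:
        cur = queue.popleft()
        if is_r_redundant(cur, r):
            return sorted(cur)
        for p, row in cur.items():
            for q, v in row.items():
                if q != 'g' and v != 0:          # never pivot on column g
                    basis = frozenset(cur) - {p} | {q}
                    if basis not in seen:
                        seen.add(basis)
                        queue.append(pivot(cur, p, q))
    return None

# Toy instance (an assumption): x_2 = 1 - x_1, x_3 = 2 - x_1.
F = Fraction
D = {2: {'g': F(1), 1: F(-1)}, 3: {'g': F(2), 1: F(-1)}}
```

On the toy instance, the basis $\{1, 3\}$ has row $x_3 = 1 + x_2$ and therefore certifies that $x_3$ is redundant, while no basis certifies redundancy of $x_1$ or $x_2$.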

A certificate for nonredundancy in the dictionary oracle
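Similarly as in the redundancy case, we introduce a certificate for nonredundancy using the dictionary oracle. We say a basis $B$ is $r$-nonredundant if $r \in N$, $B$ is feasible, and $d_{tr} \le 0$ for every $t \in B$ with $d_{tg} = 0$.
Theorem 2 (Nonredundancy Certificate) An inequality $x_r \ge 0$ is nonredundant for the system (3) if and only if there exists an $r$-nonredundant basis.
Before proving the theorem, we observe the following.
1. Unlike in the redundancy certificate, an $r$-nonredundant basis needs to be feasible. To verify the correctness of a nonredundancy certificate we need to check between $n$ and $2n$ entries, which is typically much larger than the $d + 1$ entries we need for the redundant case.
2. If the $g$-column of a feasible basis does not contain any zeros, then all nonbasic variables are nonredundant.
In general, when $x_r \ge 0$ is nonredundant, not every feasible basis $B$ with $r \in N$ is $r$-nonredundant. Consider a system in the variables $x_1, x_2, x_3$ whose signed dictionaries w.r.t. the bases $\{3\}$ and $\{2\}$ are
$$\begin{array}{c|ccc} & g & 1 & 2 \\ \hline 3 & 0 & + & + \end{array} \qquad\qquad \begin{array}{c|ccc} & g & 1 & 3 \\ \hline 2 & 0 & - & + \end{array}$$
Then the basis $\{3\}$ is not a certificate of nonredundancy of $x_1$, as $d^\sigma_{31} = +$ in the associated dictionary. On the other hand, the basis $\{2\}$ is 1-nonredundant.
Proof (of Theorem 2) Let (LP) be an LP of form (3) and suppose that $x_r \ge 0$ is nonredundant. Then it follows that for $\varepsilon > 0$ small enough, $x_r \ge -\varepsilon$ is nonredundant in
$$\text{minimize } x_r \quad \text{subject to } x_B = b - A x_N, \;\; x_r \ge -\varepsilon, \;\; x_i \ge 0 \;\; \forall i \in E \setminus \{r\}. \tag{8}$$
Note that this LP can easily be transformed to an LP of form (3) by the straightforward variable substitution $x'_r = x_r + \varepsilon$. We denote this LP obtained after perturbation and substitution by (LP$^\varepsilon$).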
(LP$^\varepsilon$) attains its minimum at $x_r = -\varepsilon$ and hence there exists an optimal dictionary where $r$ is nonbasic. Let $B$ be such a feasible optimal basis of (LP$^\varepsilon$) with $r \in N$. We show that if we choose $\varepsilon$ small enough, $B$ is $r$-nonredundant in (LP).
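Let $B_1, B_2, \dots, B_m$ be the set of all bases (feasible and infeasible) of (LP) that have $r$ as a nonbasic variable, and let $D^i = D(B_i)$. Choose $\varepsilon > 0$ such that
$$\varepsilon < \min\left\{ \frac{d^i_{tg}}{d^i_{tr}} \;:\; 1 \le i \le m,\ t \in B_i,\ d^i_{tg} < 0,\ d^i_{tr} < 0 \right\}.$$
If the right hand side (RHS) is undefined, we choose any $\varepsilon < \infty$. Geometrically this means that if for $t \in B_i$ the constraint $x_t \ge 0$ is violated in the basic solution w.r.t. $B_i$ in (LP), then it is still violated in the basic solution w.r.t. $B_i$ in (LP$^\varepsilon$). Let $D$ and $D^\varepsilon$ be the dictionaries w.r.t. $B$ in (LP) and (LP$^\varepsilon$), respectively. The two dictionaries only differ in column $g$, where
$$d^\varepsilon_{tg} = d_{tg} - \varepsilon\, d_{tr} \quad \text{for all } t \in B. \tag{9}$$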
We need to show that $B$ is $r$-nonredundant in (LP). To show that $B$ is a feasible basis we need that $d_{tg} \ge 0$ for all $t \in B$. If $d_{tr} \ge 0$, then this is clear. For the case where $d_{tr} < 0$ assume that $d_{tg} < 0$. Then by choice of $\varepsilon$, substituting into (9) gives $d^\varepsilon_{tg} < 0$, which is a contradiction to the nonnegativity of $d^\varepsilon_{tg}$. If $d_{tg} = 0$, by (9) it follows that $d_{tr} \le 0$. Therefore $B$ is $r$-nonredundant.
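For the other direction let $B$ be $r$-nonredundant and $D$ and $D^\varepsilon$ the corresponding dictionaries in (LP) and (LP$^\varepsilon$), respectively. Choose $\varepsilon > 0$ such that
$$\varepsilon < \min\left\{ \frac{d_{tg}}{d_{tr}} \;:\; t \in B,\ d_{tg} > 0,\ d_{tr} > 0 \right\}.$$
If the RHS is undefined, we choose any $\varepsilon < \infty$. We claim that for such an $\varepsilon$, $B$ is still feasible for (LP$^\varepsilon$) and hence $x_r \ge 0$ is nonredundant. Again the two dictionaries only differ in column $g$, where $d^\varepsilon_{tg} = d_{tg} - \varepsilon\, d_{tr}$. In the case where $d_{tg} = 0$, it follows that $d^\varepsilon_{tg} \ge 0$ by $r$-nonredundancy. If $d_{tg} > 0$, then $d^\varepsilon_{tg} \ge 0$ by the choice of $\varepsilon$.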
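A nonredundancy certificate can be verified from signs alone. The sketch below assumes the conditions used in the proof above: $r$ is nonbasic, the basis is feasible (no negative entry in column $g$), and every row with a zero in column $g$ has a nonpositive entry in column $r$. The function name `is_r_nonredundant` and the nested-dict encoding of a signed dictionary are our own illustrative choices; the two sample dictionaries are the sign patterns of the bases $\{3\}$ and $\{2\}$ from the example preceding the proof.

```python
def is_r_nonredundant(signs, r):
    """Check the (assumed) nonredundancy certificate conditions for index r.

    signs maps each basic index to {column: '+', '-' or '0'}, where the
    column 'g' holds the signs of the constant terms.
    """
    if r in signs:                      # r has to be nonbasic
        return False
    for row in signs.values():
        if row['g'] == '-':             # the basis has to be feasible
            return False
        if row['g'] == '0' and row[r] == '+':
            return False                # degenerate row must have d_tr <= 0
    return True

# Signed dictionaries of the example: bases {3} and {2}.
basis_3 = {3: {'g': '0', 1: '+', 2: '+'}}
basis_2 = {2: {'g': '0', 1: '-', 3: '+'}}
```

With these inputs the basis $\{3\}$ fails the test for $r = 1$ because of the entry $d^\sigma_{31} = +$, whereas the basis $\{2\}$ passes it. The loop inspects at most two entries per basic row, in line with the remark that between $n$ and $2n$ entries need to be checked.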

Finite pivot algorithms for certificates
In this section we discuss how to design finite pivot algorithms for the dictionary oracle model. Both the criss-cross method and the simplex method can be used for the dictionary oracle to find redundancy and nonredundancy certificates. A finite pivot algorithm chooses in every step a pivot according to some given rule and terminates in an optimal, inconsistent or dual inconsistent basis in a finite number of steps. Note that both the criss-cross method and the simplex method may not be polynomial in the worst case, but are known to be fast in practice (Klee and Minty 1972; Roos 1990). Furthermore, there exists no known polynomial algorithm to solve an LP given by the dictionary oracle. Fukuda conjectured that the randomized criss-cross method is an expected polynomial time algorithm (Fukuda 2011b). By the proof of Theorem 1, in order to find a redundancy certificate in (3) it is enough to solve (3) with objective function $x_r$. Similarly, by the proof of Theorem 2, for a nonredundancy certificate it is enough to solve the $\varepsilon$-perturbed version (8).
For the criss-cross method, the pivot rule is solely dependent on the signs of the dictionary entries and not on their actual values (Fukuda 2011a, Chapter 4; Fukuda and Terlaky 1997). Standard calculations show that the signs in the $\varepsilon$-perturbed dictionary (for $\varepsilon > 0$ small enough) are completely determined by the signs of the original dictionary. We recall that the dictionary oracle does not output the objective row, but since we minimize in direction of $x_r$ the signs of the objective row are completely determined. (If $r$ is basic then the objective row has the same entries as the $r$-th row, and if $r$ is nonbasic then $d_{fr} = +$ and all other entries of the objective row are zero.) Therefore the dictionary oracle is enough to decide on the pivot steps of the criss-cross method.
For the simplex method with the smallest index rule, we are given a feasible basis and the nonbasic variable of the pivot element is chosen by its sign only (Chvatal 1980, Part 1 Sect. 3). The basic variable of the pivot is chosen as the smallest index such that feasibility is preserved after a pivot step. Using the dictionary oracle one can test the at most n possibilities and choose the appropriate pivot.
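The selection of the leaving variable through the oracle can be sketched as follows. This is a hedged illustration, not the authors' implementation: the oracle is a hypothetical lookup table for the toy system $x_2 = 1 - x_1$, $x_3 = 2 - x_1$, and the function names are our own. As an additional assumption, candidates are restricted to rows with a negative entry in the entering column, which is the usual ratio-test condition (only these basic variables decrease when $x_q$ increases).

```python
# Hypothetical oracle for the toy system x_2 = 1 - x_1, x_3 = 2 - x_1:
# each basis is mapped to the signs of its dictionary (column 'g' = constants).
TOY_ORACLE = {
    frozenset({2, 3}): {2: {'g': '+', 1: '-'}, 3: {'g': '+', 1: '-'}},
    frozenset({1, 3}): {1: {'g': '+', 2: '-'}, 3: {'g': '+', 2: '+'}},
    frozenset({1, 2}): {1: {'g': '+', 3: '-'}, 2: {'g': '-', 3: '+'}},
}

def oracle(basis):
    """Return the signed dictionary of a basis, or None if it is not a basis."""
    return TOY_ORACLE.get(frozenset(basis))

def is_feasible(signs):
    """A signed dictionary is feasible if column g has no negative entry."""
    return all(row['g'] != '-' for row in signs.values())

def choose_leaving(basis, q, oracle):
    """Smallest basic index p such that exchanging p and q keeps feasibility.

    At most |basis| oracle calls are made, one per candidate p.
    """
    current = oracle(basis)
    for p in sorted(basis):
        if current[p][q] != '-':        # assumed ratio-test restriction
            continue
        candidate = oracle((set(basis) - {p}) | {q})
        if candidate is not None and is_feasible(candidate):
            return p
    return None
```

For example, starting from the basis $\{2, 3\}$ with entering index $q = 1$, the first candidate $p = 2$ leads to the feasible basis $\{1, 3\}$ and is selected; the candidate $p = 3$ would lead to the infeasible basis $\{1, 2\}$.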

An output sensitive redundancy detection algorithm
Throughout this section, we denote by $S'$ the set of nonredundant indices and by $R'$ the set of redundant indices. Denote by $\textit{LP}(n, d)$ the time needed to solve an LP. By the discussion in Sect. 4.3, for any $x_r$, $r \in E$, we can find a certificate in time $\textit{LP}(n, d)$. Theorem 3 presents a Clarkson-type, output-sensitive algorithm with running time $O(d \cdot (n + d) \cdot s^{d-1} \cdot \textit{LP}(s, d) + d \cdot s^{d} \cdot \textit{LP}(n, d))$ that for a given LP outputs the set $S'$, where $s = |S'|$. Typically $s$ and $d$ are much smaller than $n$.

General redundancy detection
Since in every round at least one variable is added to $S$ or $R$, the algorithm terminates. The correctness of the output $S^*$ can easily be verified: if in the outer loop $r$ is added to $R$, then $r$ is redundant w.r.t. $S$ and hence redundant w.r.t. $S^* \supseteq S$. If in the inner loop $r$ is added to $S$, then $r$ is nonredundant w.r.t. $E \setminus R$ and hence nonredundant w.r.t. $S^* \subseteq E \setminus R$. It follows that $S^* = S'$.
The main issue is how to find the sets $S_F$ and $R_F$ efficiently in the last step. This will be discussed in (the proof of) Lemma 1.
Remark 1 A technical problem is that we cannot test for redundancy in the dictionary oracle when S does not contain a nonbasis. Therefore as long as this is the case, we fix an arbitrary nonbasis N and execute the redundancy detection algorithm on S ∪ N instead of S. Since this stronger checking of redundancy does not change correctness or the order of the running time, we will omit this detail in the further discussion.
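The control flow described above (an outer test against $S$, an inner test against $E \setminus R$, and a last step producing $S_F$ and $R_F$) can be sketched as follows. This is our reading of the description in the text, not a verbatim transcription of the authors' algorithm. Both callbacks are hypothetical stand-ins: `redundant_wrt` plays the role of one LP solve in the oracle model, and `find_face_sets` plays the role of the last step. The one-dimensional toy callbacks exist only to make the sketch runnable, and the technicality of Remark 1 is ignored.

```python
def detect_redundancies(E, redundant_wrt, find_face_sets):
    """Return the set of nonredundant indices of E (sketch, see assumptions).

    redundant_wrt(r, T): True iff x_r >= 0 is implied by x_i >= 0, i in T.
    find_face_sets(r, S, R): returns (S_F, R_F), S_F nonredundant indices not
        all contained in S, R_F redundant indices.
    """
    S, R = set(), set()
    while S | R != set(E):
        r = next(e for e in E if e not in S and e not in R)
        if redundant_wrt(r, S):                        # outer loop test
            R.add(r)
        elif not redundant_wrt(r, set(E) - R - {r}):   # inner loop test
            S.add(r)
        else:                                          # last step
            S_F, R_F = find_face_sets(r, S, R)
            S |= S_F
            R |= R_F
    return S

# Toy instance (an assumption): with t = x_1, the constraints are
#   x_1 = t >= 0,   x_2 = 1 - t >= 0,   x_3 = 2 - t >= 0.
BOUNDS = {1: ('lo', 0), 2: ('hi', 1), 3: ('hi', 2)}

def toy_redundant_wrt(r, T):
    kind, bound = BOUNDS[r]
    same = [BOUNDS[i][1] for i in T if BOUNDS[i][0] == kind]
    if kind == 'hi':
        return any(b <= bound for b in same)
    return any(b >= bound for b in same)

def toy_find_face_sets(r, S, R):
    """Return the tightest bound of the same kind as r among E minus R."""
    kind = BOUNDS[r][0]
    same = [i for i in BOUNDS if i not in R and i != r and BOUNDS[i][0] == kind]
    key = lambda i: BOUNDS[i][1]
    pick = min(same, key=key) if kind == 'hi' else max(same, key=key)
    return {pick}, set()
```

When the indices are processed in the order 3, 1, 2, index 3 is first found to be nonredundant w.r.t. the empty set but redundant w.r.t. all other constraints, so the last step is triggered and adds index 2 to $S$; index 3 is then classified as redundant in the next round.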
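Theorem 3 The redundancy detection algorithm outputs $S'$, the set of nonredundant constraints, in time
$$R(n, d, s) = O\left( \sum_{i=0}^{d-1} \left( (n + d - i) \cdot s^{i} \cdot \textit{LP}(s, d - i) + s^{i+1} \cdot \textit{LP}(n, d - i) \right) \right)$$
and consequently in time
$$R(n, d, s) = O\left( d \cdot (n + d) \cdot s^{d-1} \cdot \textit{LP}(s, d) + d \cdot s^{d} \cdot \textit{LP}(n, d) \right).$$
The following Lemma implies Theorem 3.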

Lemma 1 Let $R(n, d, s)$ be the running time of the redundancy detection algorithm for an LP with $n$ basic variables, $d$ nonbasic variables and $s$ nonredundant variables. Then in the last step of the inner loop some sets $S_F \subseteq S'$ and $R_F \subseteq R'$, with $S_F \not\subseteq S$, can be found in time $O(R(n, d - 1, s) + \textit{LP}(n, d))$.
Proof (of Theorem 3) Termination and correctness of the algorithm are discussed above. One iteration of the outer loop of the algorithm takes time $O(\textit{LP}(s, d))$ and is executed at most $n + d$ times. By Lemma 1, the running time of the inner loop is $O(R(n, d - 1, s) + \textit{LP}(n, d))$, and since in each round at least one variable is added to $S$, it is executed at most $s$ times. Therefore the total running time is given recursively by
$$R(n, d, s) = O\left( (n + d) \cdot \textit{LP}(s, d) + s \cdot \left( R(n, d - 1, s) + \textit{LP}(n, d) \right) \right).$$
The claim follows by solving the recursion and noting that $R(n, 0, s)$ can be set to $O(n)$.
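For completeness, the recursion can be unrolled as follows. This is a sketch in which the constants hidden in the $O$-notation are suppressed, and in which we assume (our assumptions, not stated in the text) that $\textit{LP}(\cdot, k)$ is nondecreasing in $k$, that $s \ge 1$, and that $n = O(\textit{LP}(n, d))$. Writing $T_k := (n + k) \cdot \textit{LP}(s, k) + s \cdot \textit{LP}(n, k)$, the recursion $R(n, k, s) \le T_k + s \cdot R(n, k - 1, s)$ gives
\[
\begin{aligned}
R(n, d, s) &\le T_d + s\, R(n, d - 1, s) \\
&\le T_d + s\, T_{d-1} + s^2 R(n, d - 2, s) \\
&\;\;\vdots \\
&\le \sum_{i=0}^{d-1} s^{i}\, T_{d-i} + s^{d} R(n, 0, s).
\end{aligned}
\]
For every $0 \le i \le d - 1$ we have
\[
s^{i}\, T_{d-i} = (n + d - i)\, s^{i}\, \textit{LP}(s, d - i) + s^{i+1}\, \textit{LP}(n, d - i) \le (n + d)\, s^{d-1}\, \textit{LP}(s, d) + s^{d}\, \textit{LP}(n, d),
\]
so the sum of these $d$ terms is at most $d \cdot (n + d) \cdot s^{d-1} \cdot \textit{LP}(s, d) + d \cdot s^{d} \cdot \textit{LP}(n, d)$. The remaining term $s^{d} R(n, 0, s) = O(s^{d} n)$ is absorbed by $d \cdot s^{d} \cdot \textit{LP}(n, d)$ under the assumption $n = O(\textit{LP}(n, d))$.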
It remains to prove Lemma 1, for which we first prove some basic results below, using the dictionary oracle setting.

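Lemma 2 Let $D = D(B)$ be a feasible dictionary of an LP of form (3), denoted by (LP), let $F := \{i \in B \mid d_{ig} = 0\}$, and let (LP$_F$) be the LP given by the dictionary $D_F$ that only contains the rows of $D$ indexed by $F$. Then $r \in F \cup N$ is nonredundant in (LP) if and only if it is nonredundant in (LP$_F$).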
Proof We only need to show the "if" part. Let $r \in F \cup N$ be nonredundant in (LP$_F$) with certificate $D'_F$. Then there exists a sequence of pivot steps from $D_F$ to $D'_F$. Applying the same pivot steps to $D$ yields a dictionary $D'$, which is a nonredundancy certificate for $r$, since $d'_{ig} = d_{ig} > 0$ for all $i \in B \setminus F$ by the definition of $F$.

Lemma 3 Let $D = [b, -A]$ be the dictionary of an LP of form (3). Then a variable $r \in E$ is nonredundant in the LP given by $D$ if and only if it is nonredundant in the LP with dictionary $D^0 = [0, -A]$.
Proof If $D(B)$ is a redundancy certificate for $r$ for some basis $B$, then $D^0(B)$ is a redundancy certificate for $r$ as well.
For the converse, let $D = D(B)$ be a nonredundancy certificate for $r$ for some basis $B$. For simplicity assume that $B = \{1, 2, \dots, n\}$. For now assume that $b_i > 0$ for all $i \in B$ and let $D^i$ be the dictionary obtained from $D^0$ by pivoting on $b_i$, $i = 1, 2, \dots, n$. We will show that at least one of the $D^i$, $i \in \{0, 1, \dots, n\}$, is a nonredundancy certificate for $r$. Since after any pivot the first column of $D^i$ stays zero, $D^i$ is a nonredundancy certificate if and only if $D^i_{\bullet r} \le 0$, i.e., the $r$-th column of $D^i$ is nonpositive. Let $R^i = (r^i_1, r^i_2, \dots, r^i_n)^T := D^i_{\bullet r}$ for $i \ge 1$ and $R^0 = (r_1, r_2, \dots, r_n)^T := D^0_{\bullet r}$.
Claim Assume that $r^i_i < 0$ for any fixed $i$ and there are at least $i - 1$ additional nonpositive entries (w.l.o.g. we assume them to be $r^i_1, r^i_2, \dots, r^i_{i-1}$). If $R^i$ has a positive entry (which w.l.o.g. we assume to be $r^i_{i+1}$), then $r^{i+1}_{i+1} < 0$ and $r^{i+1}_1, r^{i+1}_2, \dots, r^{i+1}_i$ are nonpositive.
If $D^0$ is not a certificate for $r$, then w.l.o.g. $r_1 > 0$ and hence $r^1_1 = -\frac{r_1}{b_1} < 0$. Therefore by induction the lemma follows from the claim.
It remains to prove the claim. Assume that $r^i_1, r^i_2, \dots, r^i_{i-1} \le 0$, $r^i_i < 0$ and $r^i_{i+1} > 0$. Then we have $r_i > 0$, and the claim follows from the relations (10) and (11) by a direct calculation.
Now suppose that $b_i = 0$ for some $i$. Then by the nonredundancy certificate $r_i \le 0$, and it is easy to see that $r^j_i = r_i \le 0$ for all admissible pivots on $b_j$. Hence we can use the above construction on the nonzero entries of $b$; the rows corresponding to the zero entries satisfy the nonredundancy certificate conditions trivially.
Now if there exists $i \in B$ such that $b_i = 0$, define $F = \{i \in B \mid b_i = 0\}$, LP$_F$ and $D_F$ as in Lemma 2. We now recursively find all redundant and nonredundant constraints in LP$_F$ using Lemma 3 as follows. From LP$_F$ we construct another LP, denoted LP$^-$, with one less nonbasic variable, by deleting $(D_F)_{\bullet g}$ (the column of all zeros), choosing any element $t \in N$ and setting $t = g$, i.e., setting $x_t = 1$ in the corresponding LP. Finding all redundancies and nonredundancies in LP$^-$ takes time $R(|F|, d - 1, s)$. By Lemma 3, redundancies and nonredundancies are preserved for LP$_F$. Therefore finding them in LP$_F$ takes time $R(|F|, d - 1, s) + \textit{LP}(n, d) \le R(n, d - 1, s) + \textit{LP}(n, d)$, where the $\textit{LP}(n, d)$ term is needed to check separately whether $t$ is redundant. Choose $S_F$ as the set of nonredundant indices of LP$_F$ and $R_F$ as the set of redundant ones. By Lemma 2, $S_F \subseteq S'$ and $R_F \subseteq R'$. By the same Lemma $r$ is redundant in LP$_F$, therefore it follows that $S_F \not\subseteq S$, as otherwise $r$ would be redundant w.r.t. $S$.

Strong redundancy detection
In this section we show how under certain assumptions the running time of the redundancy algorithm can be improved. If we allow the output to also contain some weakly redundant constraints (see definition below), it is basically the same as the running time of Clarkson's method.
A redundant variable r is called strongly redundant if for any basic feasible solution x, x r > 0. In particular for any basic feasible solution, r ∈ B. If r is redundant but not strongly redundant r is called weakly redundant.
As before, let s be the number of nonredundant constraints, and let R v (with |R v | = r v ) and R w (with |R w | = r w ) be the sets of strongly and weakly redundant constraints, respectively.

Theorem 4 Let S be the set of nonredundant constraints. It is possible to find a set S
The following corollary follows immediately.

Corollary 2
If there are no weakly redundant constraints, the set S of nonredundant constraints can be found in time O((n + d) · LP(s, d) + s · LP(n, d)).
The theorem is proven using the following two lemmas, which can be verified with straightforward variable substitutions. Note that the perturbation method mentioned above can be implemented symbolically without an explicit evaluation of ε. This combinatorial technique is known as the lexicographic method (Chvatal 1980, Part 1 Sect. 3 Page 34).
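The idea behind a symbolic treatment of ε can be illustrated as follows. This is a minimal sketch under the assumption that each perturbed quantity is a polynomial in ε stored as its coefficient list (c0, c1, c2, ...); the particular perturbation used in Lemma 4 is not reproduced here, and the function names are hypothetical. For all sufficiently small ε > 0, the sign of such a polynomial is the sign of its first nonzero coefficient, so comparisons reduce to lexicographic comparisons of coefficient lists.

```python
def sign_for_small_eps(coeffs):
    """Sign of c0 + c1*eps + c2*eps**2 + ... for all sufficiently small eps > 0."""
    for c in coeffs:
        if c != 0:
            return 1 if c > 0 else -1
    return 0


def lex_less(p, q):
    """True if polynomial p is smaller than q for all sufficiently small eps > 0."""
    length = max(len(p), len(q))
    # Coefficients of q - p, padding the shorter list with zeros.
    diff = [(q[i] if i < len(q) else 0) - (p[i] if i < len(p) else 0)
            for i in range(length)]
    return sign_for_small_eps(diff) > 0
```

For example, an entry 0 + ε − 100ε² is treated as strictly positive, because its first nonzero coefficient is positive, even though its unperturbed value is 0.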
Lemma 5 (Chvatal 1980

Proof (of Theorem 4) Replace the given LP by its ε-perturbed version as in Lemma 4 and run the redundancy removal algorithm, which is possible by the same lemma. By Lemma 5, S * ⊇ S and S * ∩ R v = ∅. Since by Lemma 4 the entries of the g-column of any dictionary D σ,ε are strictly positive, the algorithm never runs the recursive step, and the running time follows.
Remark The ε-perturbation makes every feasible LP full-dimensional; therefore the full dimensionality assumption can be dropped for Theorem 4.

Number of dictionaries for all certificates
In Sect. 4 we showed the existence of certificates in the dictionary oracle for both redundant and nonredundant variables. The main question discussed in this section is how many dictionaries are needed to detect all redundancies. As we will show below, this number depends on the given set of linear inequalities and lies between 1 and n + d − s, i.e., the number of redundant constraints. The number of dictionaries needed to detect all nonredundancies is less interesting, as usually s ≪ n. Moreover, a single dictionary is a certificate for at most d nonredundancies (the nonbasic variables), which implies that we always need between s/d and s dictionaries in order to obtain all certificates.
It is not hard to find an example where a single basis B is r-redundant for all redundant constraints r. For instance, the following basis is r-redundant for all r ∈ B and r-nonredundant for all r ∈ N.
For the maximum number of dictionaries needed to detect all redundancies, we give an example of a system of linear equalities where every redundant constraint r has a unique r-redundant basis and all those bases are pairwise distinct. Therefore n + d − s bases are needed to detect all redundancies.

We consider the d-dimensional hypercube and its dual d-cross polytope, such that each vertex of the hypercube lies in the barycenter of its corresponding dual face of the d-cross polytope. We show that any constraint of the d-cross polytope is redundant for the cube and has a unique certificate corresponding to its dual vertex. (For an example in general position, one can move the constraints of the d-cross polytope away from the hypercube by some ε > 0 small enough, and still obtain the same results.) Formally the d-dimensional hypercube is given by The d-cross polytope has 2^d constraints given as follows: For x = (x 1, x 2, ..., x d ) T and each p ∈ {−1, +1}^d we have the constraint where k p denotes the number of +1's in p.
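The geometric claim can be checked numerically for small d. The sketch below rests on an assumption that is consistent with the definition of k p but should be verified against the displayed formulas: the hypercube is taken to be 0 ≤ x i ≤ 1, and the constraint for p is taken to be p T x ≤ k p. Under that reading, each constraint is valid at every cube vertex and tight at exactly one of them, namely the vertex with x i = 1 where p i = +1 and x i = 0 where p i = −1. Since a linear function attains its maximum over the cube at a vertex, validity at all vertices implies redundancy. The function names are hypothetical.

```python
from itertools import product


def tight_cube_vertices(p):
    """Vertices v of [0,1]^d where the assumed constraint p.x <= k_p is tight."""
    k_p = sum(1 for s in p if s == 1)
    tight = []
    for v in product((0, 1), repeat=len(p)):
        value = sum(s * x for s, x in zip(p, v))
        assert value <= k_p  # the constraint holds at every cube vertex
        if value == k_p:
            tight.append(v)
    return tight


def check_cross_polytope_constraints(d):
    """True if every sign vector p yields a constraint tight at exactly its dual vertex."""
    for p in product((-1, 1), repeat=d):
        expected = tuple(1 if s == 1 else 0 for s in p)
        if tight_cube_vertices(p) != [expected]:
            return False
    return True
```

Calling `check_cross_polytope_constraints(d)` for d = 1, ..., 5 confirms the one-to-one correspondence between the 2^d constraints and the 2^d cube vertices in this model.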
The system of linear inequalities in dictionary form is hence given by It is easy to see that x 1 , . . . , x 2d are nonredundant.
and hence B p is p-redundant. We show uniqueness for p − = (−1, ..., −1) T ; the rest follows by symmetry. We prove that x p − is nonredundant in the system induced by E \ {i}, for all i ∈ {1, ..., d}, which implies that the unique p − -redundant basis is given by B p − = {1, ..., d}. Again by symmetry it is enough to prove this for i = 1. Let D be the dictionary corresponding to the linear system (13). Let D' be the dictionary obtained by a pivot step on ( p − , 1), and B' its basis. Then for all i ∈ B' \ {1}

To obtain an example with the maximum number of unique, pairwise distinct redundancy certificates, we use McMullen's Upper Bound Theorem (1970). It states that a polytope given by s constraints can have at most Θ(s^⌊d/2⌋) vertices, and the bound is tight for the dual cyclic polytope. As in the hypercube case, we can define a redundant constraint through each of its vertices and get an example with n + d − s ∈ Θ(s^⌊d/2⌋) redundant constraints that have unique, pairwise distinct redundancy certificates.
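To get a feeling for the size of this bound, the exact maximum can be evaluated. The sketch below uses the standard closed form for the number of facets of the cyclic polytope with s vertices in dimension d, which by duality equals the maximum number of vertices of a d-polytope with s facets. The formula is quoted from the general literature on the Upper Bound Theorem rather than from the text above, and the function name is hypothetical.

```python
from math import comb


def max_vertices(s, d):
    """Maximum number of vertices of a d-polytope with s facets, for s > d >= 2.

    Evaluates C(s - ceil(d/2), floor(d/2)) + C(s - 1 - ceil((d-1)/2), floor((d-1)/2)).
    """
    return comb(s - (d + 1) // 2, d // 2) + comb(s - 1 - d // 2, (d - 1) // 2)
```

For d = 3 this gives 2s − 4, e.g. 8 vertices for s = 6 facets, which the cube attains. For fixed d the value grows like s^⌊d/2⌋, matching the asymptotic count of redundant constraints in the construction.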

Alternative nonredundancy certificate
In this section we introduce two families of dictionaries that serve as an alternative nonredundancy certificate to the one described in Sect. 4.2. We show that from the latter r-nonredundancy certificate one can always obtain a certificate of the new forms (Sect. 7.1) in a single pivot step. The opposite does not hold in general.

Another certificate
In this section we introduce an alternative certificate for nonredundancy, which is given by two different kinds of dictionaries. A basis B is called r-nonredundant type I if r ∈ B, d rg < 0 and d ig ≥ 0 for all i ∈ B \ {r}, i.e., D σ (B) is of the form of Fig. 5.
A basis B is called r-nonredundant type II if r ∈ B, and there exists t ∈ N such that d rt < 0 and d it ≥ 0 for all i ∈ B \ {r}, i.e., D σ (B) is of the form of Fig. 6.
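Both definitions depend only on the signs of dictionary entries, so they are easy to test mechanically. The following is a minimal sketch; the data layout (a mapping from each basic index to its row, with rows keyed by the column label "g" and the nonbasic indices) and the function name are assumptions made for illustration.

```python
def nonredundancy_type(D, r, g="g"):
    """Classify the basis of dictionary D with respect to the variable r.

    D maps each basic index i to a dict {column: entry}; the columns are g
    and the nonbasic indices. Returns "I", "II", or None if neither applies.
    """
    if r not in D:
        return None  # both certificate types require r to be basic
    others = [i for i in D if i != r]

    def column_certifies(col):
        # Negative entry in row r, nonnegative entries in all other rows.
        return D[r][col] < 0 and all(D[i][col] >= 0 for i in others)

    if column_certifies(g):
        return "I"
    if any(column_certifies(t) for t in D[r] if t != g):
        return "II"
    return None
```

For instance, with basic variables 1 and 2 and nonbasic variable 3, the rows `{1: {"g": -1, 3: 1}, 2: {"g": 2, 3: 0}}` form a type I certificate for r = 1, while `{1: {"g": 1, 3: -1}, 2: {"g": 2, 3: 1}}` form a type II certificate for r = 1 with t = 3.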
Note that in both types of certificates r is a basic variable, whereas in the certificate of Theorem 2 r is nonbasic.

Proof Suppose x r is nonredundant for the system (3). W.l.o.g. assume that r ∈ B (by Assumption 2 in Sect. 4). Consider the linear program (LP ) obtained by treating x r = b r − A r • x as the objective row instead of a constraint. As mentioned in Sect. 2.2, (LP ) can be transformed into one of the terminal dictionaries, i.e., into an optimal, inconsistent or dual inconsistent dictionary. Since (LP ) is feasible, the inconsistent case cannot occur. Suppose (LP ) can be transformed into an optimal dictionary D. If d rg ≥ 0, then r is redundant, which is a contradiction. Hence d rg < 0, and by treating x r = b r − A r • x again as a constraint instead of the objective row, we obtain an r-nonredundant type I dictionary.
If (LP ) can be transformed into a dual inconsistent (unbounded) dictionary, this immediately gives us an r-nonredundant type II basis.
For the other direction, suppose D is r-nonredundant type I. Then the corresponding basic solution, where x g = 1, x N = 0 and x B = D •g , satisfies all constraints except x r ≥ 0. Hence r is nonredundant by definition. If D is r-nonredundant type II, then by the nonemptiness assumption there exists a solution y = (y B , y N ) to the linear system. Setting y c B = b − Ay N + c · D •t for c large enough, again all constraints except x r ≥ 0 are satisfied.
We observe that the above certificates can be found with the algorithms described in Sect. 4.3. However, the redundancy detection algorithm cannot be applied to them in the given form. This can already be seen in the general case, since the algorithm uses the fact that any r-redundant basis is t-nonredundant for t ∈ N.

Comparison of certificates
The natural question that arises is how the above certificates relate to the r-nonredundancy certificate of Sect. 4.2. This relation is given in the following theorem.

Theorem 6
We can obtain an r-nonredundant type I or type II dictionary in one pivot step from any r-nonredundant dictionary. The opposite direction does not hold in general.