
Solving Linear Programs with Differential Privacy

Alina Ene Department of Computer Science, Boston University, MA, USA Huy Le Nguyen Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA Ta Duy Nguyen Department of Computer Science, Boston University, MA, USA Adrian Vladu CNRS & IRIF, Université Paris Cité, France
Abstract

We study the problem of solving linear programs of the form Ax ≤ b, x ≥ 0 with differential privacy. For homogeneous LPs Ax ≥ 0, we give an efficient (ϵ,δ)-differentially private algorithm which, with probability at least 1−β, finds in polynomial time a solution that satisfies all but O((d²/ϵ) log²(d/(δβ)) log(1/ρ₀)) constraints, for problems with margin ρ₀ > 0. This improves the bound of O((d⁵/ϵ) log^1.5(1/ρ₀) polylog(d, 1/δ, 1/β)) by [Kaplan-Mansour-Moran-Stemmer-Tur, STOC ’25]. For general LPs Ax ≤ b, x ≥ 0 with potentially zero margin, we give an efficient (ϵ,δ)-differentially private algorithm that w.h.p. drops O((d⁴/ϵ) log^2.5(d/δ) log(dU)) constraints, where U is an upper bound on the entries of A and b in absolute value. This improves the result by Kaplan et al. by at least a factor of d⁵. Our techniques build upon privatizing a rescaling perceptron algorithm by [Hoberg-Rothvoss, IPCO ’17] and a more refined iterative procedure for identifying equality constraints by Kaplan et al.

Keywords and phrases:
Differential Privacy, Linear Programming
Category:
RANDOM
Funding:
Alina Ene: AE was supported in part by an Alfred P. Sloan Research Fellowship.
Huy Le Nguyen: HN was supported in part by NSF CCF 2311649 and a gift from Apple Inc.
Adrian Vladu: AV was supported by the French Agence Nationale de la Recherche (ANR) under grant ANR-21-CE48-0016 (project COMCOPT).
Copyright and License:
© Alina Ene, Huy Le Nguyen, Ta Duy Nguyen, and Adrian Vladu; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Design and analysis of algorithms; Security and privacy
Editors:
Alina Ene and Eshan Chattopadhyay

1 Introduction

Linear programming is a fundamental tool for modeling and solving problems in computer science. Consider the standard feasibility problem of finding x ∈ ℝᵈ subject to Ax ≤ b, x ≥ 0, where the constraints Ax ≤ b are supplied by users. The input can contain sensitive information, such as the users’ private health data or private transactions, which the users may wish to keep protected. For the algorithm designer, this is where differential privacy proves its usefulness. Differential privacy protects sensitive information by requiring that the algorithm produce approximately the same output distribution on similar inputs.

The study of solving linear programs with differential privacy was initiated by [9] and has subsequently been studied in different contexts along with other related problems. Notably, privately solving linear programs is closely related to private learning of subspaces and halfspaces, which are fundamental problems in learning theory. In particular, a line of work by [3, 1, 11, 7, 2] showed reductions from learning halfspaces to the problem of finding a point in a convex hull, which in turn can be solved via linear programming. This means that the existence of an efficient private solver for linear programs implies an upper bound for the sample complexity of efficient private learners for halfspaces, and any improvement for the former problem implies an improvement for the latter.

Imposing differential privacy when solving a linear program makes it impossible to ensure that all constraints are satisfied. Indeed, in the extreme case, adding a single constraint can change the problem from feasible to infeasible. Therefore, to guarantee differential privacy, some constraints must be dropped. It was known (as folklore) that for a linear program whose feasible region has positive volume, privatizing the algorithm of [4] yields a solution that violates poly(d) constraints, where d is the dimension of the problem. The recent work of [10] formalized this claim and generalized it to all linear programs. Their algorithm, however, suffers from a high-degree polynomial dependence on d (at least d⁹), whereas the lower bound (from learning theory) is linear in d. Closing this gap remains a challenging open question.

In our work, we make progress in this direction, and propose new algorithms for solving linear programs with differential privacy. Our algorithms are efficient and they achieve significantly improved guarantees on the number of violated constraints.

1.1 Our contribution

Our first contribution is a new algorithm for privately solving linear programs with positive margin, with the guarantee stated in Theorem 1. Here the margin of the problem is the radius of the largest ball that fits in the intersection of the feasible region and the unit ball, which we define formally in Section 2.

Theorem 1.

Let ϵ, δ, β > 0 be given as input. There exists an efficient (ϵ,δ)-differentially private algorithm for finding a feasible solution to the linear program Ax ≥ 0, x ≥ 0 such that whenever the margin of the system is at least ρ₀, with probability at least 1−β, the algorithm outputs a solution x that satisfies all but O((d²/ϵ) log(1/ρ₀) log²(d/(βδ))) constraints.

In terms of the dependence on the dimension d, our algorithm significantly improves over the prior work by [10], which drops O((d⁵/ϵ) log^1.5(1/ρ₀) polylog(d, 1/δ, 1/β)) constraints.

For general linear programs with potentially zero margin, we give an iterative private algorithm with the following guarantee.

Theorem 2.

Let ϵ, δ, β > 0 be given as input. There exists an efficient (ϵ,δ)-differentially private algorithm for finding a feasible solution to the linear program Ax ≤ b, x ≥ 0 with integer entries such that, with probability at least 1−(β+δ), the algorithm outputs a solution x that satisfies all but O((d⁴/ϵ) log(dU) log^2.5(d/(βδ))) constraints, where U is an upper bound on the absolute values of the entries of A and b.

The algorithm by [10] needs to drop Õ(d⁹) constraints to guarantee privacy. To improve this, one can use the algorithm from Theorem 1 as a subroutine in the algorithm of [10] to reduce the number of dropped constraints to Õ(d⁵). Our algorithm, with a more refined analysis, goes one step further, removing another factor of d and achieving an Õ(d⁴) dependence. Our bound is a significant step towards the lower bound of Ω(d).

1.2 Our techniques

Our technique for proving Theorem 1 is based on privatizing a rescaling perceptron algorithm for solving linear programs of the form Ax ≥ 0, x ≥ 0 with positive margin. Instead of using the algorithm of [4] as in [10], we develop an algorithm based on the work of [8].

A rescaling perceptron algorithm consists of two phases: the Perceptron Phase to find a new solution and the Rescaling Phase to find a direction along which to rescale the input. To privatize the Perceptron Phase, the algorithm of [10] uses the 𝖭𝗈𝗂𝗌𝗒𝖠𝗏𝗀 mechanism of [13] to construct a noisy average of the constraints violated by the current solution, and then updates the solution in that direction. In the Rescaling Phase, the non-private algorithm of [4] and the privatized version of [10] use another perceptron-style procedure to find a rescaling direction. This approach requires Õ(d²) updates, and each call to the Rescaling Phase requires dropping Õ(d^2.5) constraints.

The main difference between our work and [10] lies in the Rescaling Phase. We use the following technique of [14] and [8]. During the Perceptron Phase, the algorithm maintains a weight vector λ that tracks which constraints are violated by the solution in each iteration. Using a random Gaussian vector, the algorithm produces a rescaling direction as a convex combination, weighted by λ, of the rows satisfied by the random vector. [14] and [8] show that with constant probability, rescaling the input matrix along that direction increases the volume of the feasible region inside the unit ball by a constant factor.

The benefit of using the weight vector to rescale is that, since we only need O(log(1/β)) steps to boost the success probability, the amount of noise needed to make this phase private is much smaller. Further, we can keep all constraints around until we find a good solution and discard constraints only once, at the end of the algorithm. We discard only O(d²) constraints in total.

For general linear programs with potentially zero margin, we use the standard technique of adding a small perturbation to the constraints that does not change the feasibility of the problem. This perturbation increases the problem’s margin and allows the application of the private perceptron algorithm. Similarly to [10], our algorithm iteratively identifies tight (equality) constraints in the LP, privatizes these equality constraints, and uses them to eliminate variables. The approach of [10] needs to account for the blow-up of the input entries after performing the variable-elimination steps, which can quickly diminish the margin. The cost of this shows up as an extra factor of d in the number of dropped constraints. By contrast, our algorithm always returns to the initial LP after identifying tight constraints. We show that in this way the margin decreases at a slower rate, saving the factor of d.

1.3 Related work

Rescaling Perceptron algorithms.

To find a solution x ≥ 0 that satisfies Ax ≥ 0, one can use the classic perceptron algorithm, which converges after at most 1/ρ² iterations on problems with positive margin ρ, where the margin is the radius of the largest ball that fits in the intersection of the feasible region and the unit ball. [4] show a modification of the classic algorithm with an additional rescaling procedure that runs in time Õ(nd⁴ log(1/ρ)) for problems with ρ > 0. The rescaling procedure of [4] is another variant of the perceptron algorithm, which finds a rescaling direction by moving a random unit vector along the direction of a violated constraint. Subsequent works [14, 8] explore different rescaling operations. In particular, our work relies on the technique of [8], in which the rescaling direction is a convex combination of the rows whose corresponding constraints are satisfied by a random Gaussian vector.
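As a concrete reference point, the classic (non-private, non-rescaled) perceptron just described can be sketched in a few lines of NumPy; this is an illustrative sketch, not code from the paper, and it handles only the homogeneous constraints Ax > 0 (the constraint x ≥ 0 can be folded in by appending rows of the identity matrix):

```python
import numpy as np

def perceptron(A, max_iters=10_000):
    """Classic perceptron for the homogeneous system Ax > 0.
    With normalized rows, it converges within 1/rho^2 updates
    whenever the margin rho of the system is positive."""
    Abar = A / np.linalg.norm(A, axis=1, keepdims=True)  # normalize rows
    x = Abar[0].copy()                 # start from any constraint direction
    for _ in range(max_iters):
        viol = Abar @ x <= 0           # constraints violated by current x
        if not viol.any():
            return x                   # all <a_i, x> > 0: done
        x = x + Abar[np.argmax(viol)]  # step along the first violated row
    return None                        # margin may be zero or too small
```

For rows clustered around a common direction, the loop terminates after a handful of updates, consistent with the 1/ρ² bound.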

Solving linear programs with privacy.

Solving linear programs with differential privacy has been the focus of several prior works. Under various notions of neighboring inputs (such as inputs differing by a row or a column), [9] give algorithms that approximately satisfy most constraints. [12] show an algorithm that satisfies most constraints exactly, but only considers neighboring inputs that differ in the right-hand-side scalars. [11] provide an algorithm for the general feasibility problem Ax ≤ b, but it requires running time exponential in the dimension. Most relevant to our work is [10], which studies the same setting. We provide a detailed comparison of the results and techniques in Sections 1.1 and 1.2.

Beyond solving linear programs.

Solving linear programs is closely related to privately learning subspaces and halfspaces [3]. [1] show a reduction from learning halfspaces to the problem of finding a point in the convex hull, where an efficient algorithm for the latter problem implies an upper bound on the sample complexity of efficient algorithms for the former. Subsequent works [11, 7, 2] have targeted this question. [10] build techniques for solving this problem via privately solving LPs and achieve the first algorithm with polynomial dependence on the dimension and polylogarithmic dependence on the domain size. Our algorithm can be used as a subroutine to improve the runtime of the algorithm of [10].

2 Preliminaries

Notation.

We let ‖·‖₁ denote the ℓ₁-norm and ‖·‖₂ the ℓ₂-norm. When it is clear from context, ‖·‖ also denotes the ℓ₂-norm. For a matrix A, we denote by aᵢ the i-th row of A. For a vector a, we let ā be the normalized vector ā = a/‖a‖₂. We also use Ā to denote the matrix A with normalized rows. We denote by Lap(b) the Laplace distribution with density f(x) = (1/(2b))·exp(−|x|/b), by N(0,σ²) the Gaussian distribution with mean zero and variance σ², and by N(0,σ²I) the multivariate Gaussian distribution with mean zero and covariance σ²I. The dimension will be clear from context.

Linear programs.

We consider the problem of finding a feasible solution to a linear program in general standard form Ax ≤ b, x ≥ 0, where A has dimension n×d, b is a vector of dimension n, and the entries of A and b are integers. Following [4], we refer to the problem of finding a solution satisfying Ax ≥ 0, x ≥ 0 as a homogeneous LP. A homogeneous LP Ax ≥ 0, x ≥ 0 is characterized by a quantity ρ(A), namely the margin (or roundness) parameter, given as

ρ(A) = max_{‖x‖₂ ≤ 1} min_i ⟨āᵢ, x⟩.

Geometrically, ρ(A) is the radius of the largest ball that fits into the intersection of the feasible region and the unit ball. The classic perceptron algorithm for LPs with ρ(A) > 0 converges within 1/ρ(A)² iterations. Rescaling algorithms such as [4, 14, 8] have total runtime poly(n,d)·log(1/ρ(A)).
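Computing ρ(A) exactly requires solving an optimization problem over the unit ball, but a crude Monte Carlo check is easy to write; the sketch below (illustrative only, with names of our choosing) certifies a lower bound on ρ(A) by sampling random unit vectors:

```python
import numpy as np

def margin_lower_bound(A, trials=20_000, seed=0):
    """Monte Carlo lower bound on rho(A) = max_{||x||2<=1} min_i <a_bar_i, x>.
    Keeps the best min-inner-product over random unit vectors, so the
    returned value never exceeds the true margin."""
    rng = np.random.default_rng(seed)
    Abar = A / np.linalg.norm(A, axis=1, keepdims=True)  # normalized rows
    best = -np.inf
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        x /= np.linalg.norm(x)          # uniform direction on the sphere
        best = max(best, float(np.min(Abar @ x)))
    return best
```

For A = I₂ the true margin is 1/√2 ≈ 0.707 (attained at x = (1,1)/√2), and the estimate approaches it from below.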

Homogenization.

[4] give a simple reduction (called homogenization) from a general LP Ax ≤ b, x ≥ 0 to a homogeneous LP A′x′ ≥ 0, x′ ≥ 0 by introducing an auxiliary variable x₀ and setting A′ = [−A  b] and x′ = (x, x₀), so that A′x′ ≥ 0 encodes bx₀ − Ax ≥ 0. We refer to the homogeneous LP constructed via this reduction as the homogenized LP.
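The exact block form of the homogenized matrix could not be fully recovered from the source, so the sketch below uses the standard version of this reduction (an auxiliary variable x₀ with constraints bx₀ − Ax ≥ 0); treat the function names and block layout as our assumptions:

```python
import numpy as np

def homogenize(A, b):
    """One standard homogenization of Ax <= b, x >= 0: add a variable x0
    and require b*x0 - A*x >= 0, i.e. A' = [-A | b] acting on x' = (x, x0)
    with A'x' >= 0 and x' >= 0."""
    return np.hstack([-A, b.reshape(-1, 1)])

def dehomogenize(xprime):
    """Recover x for the original LP from x' = (x, x0) with x0 > 0."""
    x, x0 = xprime[:-1], xprime[-1]
    assert x0 > 0, "x0 must be positive to recover a solution"
    return x / x0
```

Any x′ ≥ 0 with A′x′ ≥ 0 and x₀ > 0 yields the feasible point x/x₀ for the original LP.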

Differential privacy.

We use the notation (A,b) as shorthand for the LP Ax ≤ b, x ≥ 0. We say that two LPs (A,b) and (A′,b′) are neighbors if they differ by only one constraint (one LP has an extra constraint). A randomized algorithm 𝒜 is said to be (ϵ,δ)-differentially private (DP) if for all neighboring LPs (A,b) and (A′,b′) and every subset of possible outcomes 𝒪,

Pr[𝒜(A,b) ∈ 𝒪] ≤ e^ϵ · Pr[𝒜(A′,b′) ∈ 𝒪] + δ.

In the case δ = 0, we say the algorithm is ϵ-DP. Two commonly used mechanisms for achieving differential privacy are the Laplace mechanism and the Gaussian mechanism. Let f be a function whose output has dimension d. We say f has ℓ₁ sensitivity k if on any two neighboring inputs x and x′, ‖f(x) − f(x′)‖₁ ≤ k, and f has ℓ₂ sensitivity k if on any two neighboring inputs x and x′, ‖f(x) − f(x′)‖₂ ≤ k.

Theorem 3 (Laplace mechanism [6]).

Let f be a function of ℓ₁ sensitivity k. The mechanism 𝒜 that on input x adds independently generated noise with the Laplace distribution Lap(k/ϵ) to each of the d coordinates of f(x) is ϵ-DP.

Theorem 4 (Gaussian mechanism [5]).

Let f be a function of ℓ₂ sensitivity k. The mechanism 𝒜 that on input x adds noise generated from the Gaussian distribution N(0,σ²), where σ ≥ (k/ϵ)·√(2 ln(2/δ)), to each of the d coordinates of f(x) is (ϵ,δ)-DP.
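Both mechanisms are one-liners in NumPy. The sketch below uses σ = (k/ϵ)√(2 ln(2/δ)) per our reading of Theorem 4; the function names are ours:

```python
import numpy as np

def laplace_mechanism(value, l1_sensitivity, eps, rng):
    """Theorem 3 sketch: add Lap(k/eps) noise per coordinate for eps-DP."""
    scale = l1_sensitivity / eps
    return value + rng.laplace(0.0, scale, size=np.shape(value))

def gaussian_mechanism(value, l2_sensitivity, eps, delta, rng):
    """Theorem 4 sketch: add N(0, sigma^2) noise per coordinate with
    sigma = (k/eps) * sqrt(2 ln(2/delta)) for (eps, delta)-DP."""
    sigma = (l2_sensitivity / eps) * np.sqrt(2.0 * np.log(2.0 / delta))
    return value + rng.normal(0.0, sigma, size=np.shape(value))
```

A typical use is privatizing a count (ℓ₁ sensitivity 1) or an average of unit vectors (ℓ₂ sensitivity 2/m), exactly the two situations appearing in Section 3.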

3 Private Perceptron Algorithm for Positive Margin LPs

3.1 Algorithm

Algorithm 1 Private Perceptron.
Algorithm 2 𝖯𝗋𝗂𝗏𝖺𝗍𝖾𝖯𝖾𝗋𝖼𝖾𝗉𝗍𝗋𝗈𝗇𝖤𝗉𝗈𝖼𝗁(A,ϵ,δ,β).

We describe our Private Perceptron algorithm in Algorithm 1. Given a matrix A ∈ ℝ^{n×d}, margin parameter ρ₀, privacy parameters ϵ, δ, and failure probability β, the algorithm runs for at most τ = O(d log(1/ρ₀)) iterations and makes calls to the 𝖯𝗋𝗂𝗏𝖺𝗍𝖾𝖯𝖾𝗋𝖼𝖾𝗉𝗍𝗋𝗈𝗇𝖤𝗉𝗈𝖼𝗁 procedure given in Algorithm 2. Algorithm 2 has three possible outcomes. If it outputs a solution, Algorithm 1 terminates and returns this solution with a suitable rescaling. If it outputs a rescaling direction, Algorithm 1 rescales the input matrix A and repeats. Otherwise, Algorithm 1 terminates and returns ⊥.

Our main novel contribution lies in Algorithm 2. This algorithm consists of two phases: the Perceptron Phase, in which the algorithm attempts to find a solution to the LP, and the Rescaling Phase, in which the algorithm finds a good direction along which to rescale the input if the solution from the Perceptron Phase is not satisfactory. In each phase, we maintain privacy by adding appropriate noise. During the Perceptron Phase, the algorithm maintains a solution and updates it along the direction of a noisy average of the violated constraints. The algorithm also maintains a weight vector λ(t) which picks up the constraints violated by the current solution. We keep λ(t) private and use the average value λ̄ to determine the rescaling direction. To this end, during the Rescaling Phase, the algorithm takes a random Gaussian vector and computes a noisy sum of all rows satisfied by this vector, weighted by λ̄. We will show that with constant probability this noisy sum provides a good rescaling direction, and the algorithm repeats the process for a number of iterations to boost the success probability.
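The two phases can be summarized by the following heavily simplified Python sketch. All names and the noise scales `nu`, `sigma`, `theta` are placeholders and do not reproduce the paper's calibration; in particular, the sketch omits the shifted Laplace threshold test and the repetition of the Rescaling Phase:

```python
import numpy as np

def private_epoch_sketch(Abar, T, nu, eps, sigma, theta, rng):
    """Schematic epoch: Perceptron Phase, then one Rescaling Phase attempt.
    Returns ('solution', x) or ('rescale', c)."""
    n, d = Abar.shape
    x = np.zeros(d)
    lam = np.zeros(n)                        # weight vector over constraints
    for _ in range(T):                       # --- Perceptron Phase ---
        viol = Abar @ x <= 0                 # constraints violated by x
        m = int(viol.sum())
        m_hat = m + rng.laplace(0, 1.0 / eps)   # noisy violation count
        if m_hat <= nu or m == 0:            # (m == 0 guard: sketch only)
            return 'solution', x
        u = Abar[viol].mean(axis=0)          # average of violated rows
        x = x + u + rng.normal(0, sigma, d)  # noisy perceptron update
        lam[viol] += 1.0 / m                 # record who was violated
    lam_bar = lam / T                        # --- Rescaling Phase ---
    g = rng.standard_normal(d)               # random Gaussian direction
    sat = Abar @ g >= 0                      # rows satisfied by g
    c = (lam_bar[sat, None] * Abar[sat]).sum(axis=0) + rng.normal(0, theta, d)
    return 'rescale', c
```

On a well-conditioned instance with tiny noise, the Perceptron Phase terminates almost immediately with a feasible point.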

In the next subsections, we show the privacy and utility guarantees of our algorithm.

3.2 Privacy analysis

Throughout, we let (A,b) and (A′,b′) be two neighboring LPs. The corresponding computed quantities in Algorithm 2 for (A′,b′) are denoted with an extra prime symbol (for example, S(t) and S′(t)).

Proposition 5.

Algorithm 1 is (ϵ′,δ′)-DP for ϵ′ = 2d³ log(1/ρ₀)·ϵ² + √(2d³ϵ² log(1/ρ₀) log(1/δ)) and δ′ = (d+1)δ.

To prove this proposition, we show the following lemmas.

Lemma 6.

Each iteration of the Perceptron Phase is (ϵ, δ/(2T))-DP.

The following claim follows similarly to the proof of the 𝖭𝗈𝗂𝗌𝗒𝖠𝗏𝗀 algorithm by [13]. We include the proof in the appendix.

Claim 7.

For all t ∈ [T], m(t) has ℓ₁ sensitivity 1, and u(t) has ℓ₂ sensitivity 2/m(t).

Proof of Lemma 6.

For the purpose of the analysis, in each iteration of the Perceptron Phase, we define the output of the algorithm to be either the solution x(t) or the new update vector û(t). Let O, O′ be the outputs of the iteration on (A,b) and (A′,b′) respectively, and let F be an arbitrary subset of ℝᵈ. First, due to the privacy of the Laplace mechanism,

Pr[O = x(t)] = Pr[m̂(t) ≤ ν] ≤ e^ϵ · Pr[m̂′(t) ≤ ν] = e^ϵ · Pr[O′ = x′(t)].

Next, consider the case where the output is an update vector u^(t). If m(t)<ν:

Pr[O = û(t)] ≤ Pr[m̂(t) > m(t)] ≤ Pr[Lap(1/ϵ) > (1/ϵ) log(2T/δ)] ≤ δ/(2T).

If m(t) ≥ ν, then σ² ≥ (8/((m(t))²ϵ²)) log(4T/δ), and due to the privacy of the Gaussian mechanism,

Pr[O = û(t) ∧ û(t) ∈ F] ≤ δ/(2T) + e^ϵ · Pr[O′ = û′(t) ∧ û′(t) ∈ F].

Lemma 8.

Each iteration of the Rescaling Phase is (dϵ, δ/(2000 log(τ/β)))-DP.

To show this lemma, first, we show the sensitivity of c(s) in the following claim, whose proof is deferred to the appendix.

Claim 9.

Let m = min_{t∈[T]} m(t). The ℓ₂ sensitivity of c(s) is 2/m.

Proof of Lemma 8.

If m < ν, the Rescaling Phase only happens when m̂(t) > ν for all t ∈ [T]. Hence, for any set of outcomes F ⊆ ℝᵈ, by a union bound,

Pr[Rescaling happens ∧ ĉ ∈ F] ≤ Pr[m̂(t) > ν, ∀t ∈ [T]] ≤ Pr[m̂(t) > m, ∀t ∈ [T]]
≤ Σ_{t∈[T]} Pr[m̂(t) > m(t)] ≤ Σ_{t∈[T]} Pr[Lap(1/ϵ) > (1/ϵ) log(2000T log(τ/β)/δ)] ≤ δ/(2000 log(τ/β)).

Next, consider the case m ≥ ν. Since θ² ≥ (8/(d²ϵ²ν²)) log(4000 log(τ/β)/δ), by the privacy guarantee of the Gaussian mechanism,

Pr[ĉ ∈ F] ≤ δ/(2000 log(τ/β)) + e^{dϵ} · Pr[ĉ′ ∈ F].

Proof of Proposition 5.

The algorithm is (ϵ′,δ′)-DP by advanced composition over all iterations of the Perceptron and Rescaling Phases.

3.3 Utility analysis

To analyze the runtime and utility of Algorithm 1, we let B be the unit ball and 𝒫 be the feasible region defined by Ax ≥ 0. We will show the following proposition about the guarantee on the output of Algorithm 1.

Proposition 10.

With probability at least 1−5β, Algorithm 1 outputs a solution x that satisfies all but O((√d/ϵ) log^1.5(d/(βδ))) constraints.

We outline the proof of Proposition 10. First, if in some iteration Algorithm 1 obtains a solution output by Algorithm 2, we show that this solution must be correct with probability 1−β.

Lemma 11.

If Algorithm 2 terminates in the Perceptron Phase and outputs a solution x, then with probability at least 1−β, x satisfies all but O((√d/ϵ) log^1.5(d/(βδ))) constraints.

If Algorithm 1 obtains a rescaling vector c from the output of Algorithm 2, we show that the volume of the feasible region inside the unit ball increases by a constant factor with high probability. To start, assuming the initial margin parameter is at least ρ₀, [14] give a lower bound on the volume of the initial feasible region inside the unit ball:

Lemma 12 (Lemma 3 [14]).

Suppose max_{x∈𝒫∩B} min_i ⟨āᵢ, x⟩ ≥ ρ₀. Then vol(𝒫 ∩ B) = Ω(ρ₀^d)·vol(B).

Note that the rescaling operation A ← A(I − ½ c̄c̄ᵀ) is equivalent to a linear map F_c: ℝᵈ → ℝᵈ such that F_c(c) = 2c and F_c(x) = x for all x ⊥ c. [8] show the following lemma.

Lemma 13 (Lemma 4 [14]).

Suppose that c satisfies (1/‖c‖)·max_{x∈𝒫∩B} ⟨c, x⟩ ≤ 2/(3√d). Then vol(F(𝒫 ∩ B)) ≥ 1.02·vol(𝒫 ∩ B).
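The algebra behind the rescaling map is easy to verify numerically. The following self-contained check (plain NumPy, illustrative only) confirms that the inverse of I − ½c̄c̄ᵀ is I + c̄c̄ᵀ, which doubles c and fixes c's orthogonal complement:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
c = rng.standard_normal(d)
cbar = c / np.linalg.norm(c)
M = np.eye(d) - 0.5 * np.outer(cbar, cbar)  # rescaling applied to A's rows
F = np.linalg.inv(M)                        # induced map on the feasible region

assert np.allclose(F, np.eye(d) + np.outer(cbar, cbar))  # closed form of F
assert np.allclose(F @ c, 2 * c)            # F doubles c ...
v = rng.standard_normal(d)
v -= (v @ cbar) * cbar                      # project v onto c's complement
assert np.allclose(F @ v, v)                # ... and fixes the complement
```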

Next, we show that the rescaling vector c outputted by Algorithm 2 satisfies the condition of Lemma 13 with high probability.

Lemma 14.

If Algorithm 2 does not return a solution, then with probability at least 1 − 4β/τ, it outputs a rescaling vector c that satisfies (1/‖c‖)·max_{x∈𝒫∩B} ⟨c, x⟩ ≤ 2/(3√d).

Equipped with Lemmas 11–14, we are now ready to prove Proposition 10.

Proof of Proposition 10.

The algorithm fails if either it outputs a solution that violates more than Ω((√d/ϵ) log^1.5(d/(βδ))) constraints, or it fails to output a rescaling vector that satisfies the condition of Lemma 13. This happens with probability at most β + τ·(4β/τ) = 5β.

Otherwise, in each iteration, either the algorithm outputs a satisfactory solution, or the volume of the feasible region inside the unit ball increases by a factor of at least 1.02, by Lemma 13. Therefore, by Lemma 12, the algorithm stops after O(log(1/ρ₀^d)) = O(d log(1/ρ₀)) iterations. The remaining work is to prove Lemmas 11 and 14.

Proof of Lemma 11.

If Algorithm 2 terminates in iteration t of the Perceptron Phase, we have m̂(t) ≤ ν, where ν = O((√d/ϵ) log^1.5(d/(βδ))). That is,

m(t) ≤ ν − Lap(1/ϵ) + (1/ϵ) log(2000T log(1/β)/δ).

Here, m(t) is the number of constraints not satisfied by the solution, and we have

Pr[m(t) ≥ ν + (1/ϵ) log(1/β) + (1/ϵ) log(2000T log(1/β)/δ)] ≤ Pr[Lap(1/ϵ) ≤ −(1/ϵ) log(1/β)] ≤ β.

To show Lemma 14 we start with the following claim:

Claim 15.

Pr[‖x(t)‖ ≤ 10√t, ∀t ∈ [T]] ≥ 1 − 2β/τ.

To show Claim 15, we use the following facts about Gaussian random variables.

Fact 16.

If X ∼ N(0,σ²), then for a ≥ 0, Pr[X ≥ σa] ≤ exp(−a²/2). If X ∼ N(0,σ²I) in ℝᵈ, then for u ∈ ℝᵈ, ⟨u, X⟩ follows N(0, σ²‖u‖²), and (1/σ²)‖X‖² follows a chi-squared distribution with d degrees of freedom. Furthermore, for a ≥ 0, Pr[(1/σ²)‖X‖² ≥ d + 2√(da) + 2a] ≤ exp(−a).
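These tail bounds are standard (the last one is the Laurent–Massart chi-squared bound). A quick Monte Carlo sanity check, not a proof:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, a, N = 1.0, 2.0, 200_000

# P[X >= sigma*a] <= exp(-a^2/2) for X ~ N(0, sigma^2):
X = rng.normal(0, sigma, N)
assert (X >= sigma * a).mean() <= np.exp(-a * a / 2)

# For X ~ N(0, sigma^2 I_d), ||X||^2 / sigma^2 is chi-squared with d
# degrees of freedom, and P[||X||^2/sigma^2 >= d + 2 sqrt(d a) + 2a] <= exp(-a):
d = 5
Xd = rng.normal(0, sigma, (N, d))
q = (np.linalg.norm(Xd, axis=1) ** 2) / sigma ** 2
assert (q >= d + 2 * np.sqrt(d * a) + 2 * a).mean() <= np.exp(-a)
```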

Proof of Claim 15.

We give an outline of this proof. First, notice that, by the update rule for x(t), we have with high probability (the proof will follow):

‖x(t+1)‖² ≤ ‖x(t)‖² + 2⟨x(t), η(t)⟩ + 3.

Then the main task is to bound Σ_s ⟨x(s), η(s)⟩. However, x(s) is not a bounded variable, so we cannot directly apply concentration inequalities. Instead, we will define a bounded sequence (Y(s)) that behaves like (⟨x(s), η(s)⟩) and proceed to bound Σ_{s=1}^t Y(s). Finally, we will show that with high probability (Y(s)) coincides with (⟨x(s), η(s)⟩), from which we can bound ‖x(t)‖.

For each iteration t of Algorithm 2, consider the event, called M_t, that both of the following happen: 1) ⟨u(t), η(t)⟩ ≤ σ√(2 log(2Tτ/β)) ≤ 1/2, and 2) ‖η(t)‖² ≤ 5σ²d log(2Tτ/β) ≤ 1/log(2Tτ/β) ≤ 1. Since u(t) is the average of unit-length vectors, ‖u(t)‖ ≤ 1. Then, because η(t) ∼ N(0,σ²I), ⟨u(t), η(t)⟩ follows N(0,κ²) with κ² ≤ σ² (Fact 16). Additionally, (1/σ²)‖η(t)‖² follows a chi-squared distribution with d degrees of freedom (Fact 16). Therefore, by Fact 16,

Pr[M_t] ≥ 1 − (Pr[⟨u(t), η(t)⟩ > σ√(2 log(2Tτ/β))] + Pr[‖η(t)‖² > 5σ²d log(2Tτ/β)])
≥ 1 − β/(Tτ).

We define the following random variable:

Y(t) = ⟨x(t), η(t)⟩ if ∩_{s=1}^t M_s happens and ‖x(t)‖ ≤ 10√t, and Y(t) = 0 otherwise.

Due to the symmetry of η(t), 𝔼[Y(t) | ℱ_{t−1}] = 0. If ∩_{s=1}^t M_s happens and ‖x(t)‖ ≤ 10√t, we have

|Y(t)| = |⟨x(t), η(t)⟩| ≤ ‖x(t)‖·‖η(t)‖ ≤ 10√(t/log(2Tτ/β)).

Then (Y(t)) forms a bounded martingale difference sequence. By Azuma’s inequality, for all t,

Pr[|Σ_{s=1}^t Y(s)| ≥ 40t] ≤ 2 exp(−(40t)² / (2 Σ_{s=1}^t (20√(s/log(2Tτ/β)))²))
≤ 2 exp(−1600t² log(2Tτ/β) / (800t²)) ≤ β/(Tτ).

Thus, by a union bound, with probability at least 1 − 2β/τ, M_t happens and |Σ_{s=1}^t Y(s)| ≤ 40t for all t simultaneously. When this happens, we show by induction that ‖x(t)‖ ≤ 10√t. This is true for t = 1. Suppose that ‖x(s)‖ ≤ 10√s for all s ≤ t. We have

‖x(t+1)‖² = ‖x(t) + u(t) + η(t)‖²
≤ ‖x(t)‖² + ‖u(t)‖² + ‖η(t)‖² + 2⟨x(t), u(t)⟩ + 2⟨u(t), η(t)⟩ + 2⟨x(t), η(t)⟩
≤ ‖x(t)‖² + 2⟨x(t), η(t)⟩ + 3,

where in the last inequality, conditioned on M_t, we have ‖η(t)‖² ≤ 1 and ⟨u(t), η(t)⟩ ≤ 1/2. Since u(t) = (1/m(t)) Σ_{i∈S(t)} āᵢ and S(t) = {i ∈ [n] : ⟨āᵢ, x(t)⟩ ≤ 0}, we have ‖u(t)‖² ≤ 1 and ⟨x(t), u(t)⟩ ≤ 0. Continuing to expand this recursion, we have

‖x(t+1)‖² ≤ ‖x(1)‖² + 2 Σ_{s=1}^t ⟨x(s), η(s)⟩ + 3t = 2 Σ_{s=1}^t ⟨x(s), η(s)⟩ + 3t (using ‖x(1)‖ = 0).

Due to the induction hypothesis ‖x(s)‖ ≤ 10√s for all s ≤ t, and since M_s happens for all s, we have Y(s) = ⟨x(s), η(s)⟩ for all s ≤ t. It follows that 2 Σ_{s=1}^t ⟨x(s), η(s)⟩ = 2 Σ_{s=1}^t Y(s) ≤ 2·40t. Therefore,

‖x(t+1)‖² ≤ 2·40t + 3t ≤ 100(t+1).

Claim 17.

With probability at least 1 − 3β/τ, ‖λ̄Ā‖ ≤ 11/√T, where for convenience λ̄ is treated as a row vector, so that λ̄Ā = Σᵢ λ̄ᵢ āᵢ.

Proof.

First, with probability at least 1 − β/τ, we have ‖Σ_{s=1}^T η(s)‖² ≤ 5σ²Td log(Tτ/β) ≤ T. By Claim 15 and a union bound, we have, with probability at least 1 − 3β/τ, both ‖Σ_{s=1}^T η(s)‖ ≤ √T and ‖x(T)‖ ≤ 10√T. We have that

(λ(t+1) − λ(t))Ā = u(t) = û(t) − η(t) = x(t+1) − x(t) − η(t).

Hence λ(T)Ā = x(T) − Σ_{s=1}^{T−1} η(s). Thus ‖λ(T)Ā‖ ≤ ‖x(T)‖ + ‖Σ_{s=1}^{T−1} η(s)‖ ≤ 11√T. The claim follows since λ̄ = λ(T)/T.

Claim 18.

Conditioned on ‖λ̄Ā‖ ≤ 11/√T, with probability at least 10⁻³, we have ‖ĉ(s)‖ ≥ 3/(16π√d), in which case (1/‖ĉ(s)‖)·max_{x∈𝒫∩B} ⟨ĉ(s), x⟩ ≤ 2/(3√d).

In order to prove Claim 18, we need Lemma 6 from [8].

Lemma 19 (Lemma 6 [8]).

Let λ ∈ ℝⁿ with λ ≥ 0 and A ∈ ℝ^{n×d} be such that Σᵢ λᵢ = 1 and ‖Aᵢ‖ = 1 for all i. Take a random Gaussian vector g ∼ N(0, I) in ℝᵈ and let J = {i : ⟨Aᵢ, g⟩ ≥ 0}. With probability at least 5·10⁻³, ‖Σ_{i∈J} λᵢ Aᵢ‖ ≥ 1/(4π√d).

Proof.

For simplicity, we drop the superscript s. The proof follows similarly to [8]; however, we need to take into account the noise component γ. Since γ ∼ N(0, θ²I) with θ² = (8/(d²ϵ²ν²)) log(4000 log(τ/β)/δ) ≤ O(1/(d³ log(Td/(βδ)))), (1/θ²)‖γ‖² follows a chi-squared distribution with d degrees of freedom (Fact 16). By Fact 16, for a sufficiently large constant in ν, Pr[‖γ‖ ≤ 1/(16πd)] ≥ 1 − β/τ. Further, by Lemma 19, with probability 5·10⁻³ we have ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≥ 1/(4π√d).

The two events ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≥ 1/(4π√d) and ‖γ‖ ≤ 1/(16πd) are independent. With probability at least (1 − β/τ)·5·10⁻³ ≥ 10⁻³, both events happen. Then,

‖ĉ‖ = ‖Σ_{i∈P} λ̄ᵢ āᵢ + γ‖ ≥ ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ − ‖γ‖ ≥ (3/4)·‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≥ 3/(16π√d).

To complete the proof, we now show that, conditioned on the two events ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≥ 1/(4π√d) and ‖γ‖ ≤ 1/(16πd), we have (1/‖ĉ‖)·max_{x∈𝒫∩B} ⟨ĉ, x⟩ ≤ 2/(3√d). Since ‖ĉ‖ ≥ 3/(16π√d) and ‖γ‖ ≤ 1/(16πd) ≤ (1/3)‖ĉ‖, we have ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≥ ‖ĉ‖ − ‖γ‖ ≥ 1/(8π√d) and ‖Σ_{i∈P} λ̄ᵢ āᵢ‖ ≤ ‖ĉ‖ + ‖γ‖ ≤ (4/3)‖ĉ‖. This gives us

(1/‖ĉ‖)·max_{x∈𝒫∩B} ⟨ĉ, x⟩ = (1/‖ĉ‖)·max_{x∈𝒫∩B} ⟨Σ_{i∈P} λ̄ᵢ āᵢ + γ, x⟩
≤ (4/(3‖Σ_{i∈P} λ̄ᵢ āᵢ‖))·max_{x∈𝒫∩B} ⟨Σ_{i∈P} λ̄ᵢ āᵢ, x⟩ + (16π√d/3)·‖γ‖·‖x‖
≤ (4/3)·(‖λ̄Ā‖ / ‖Σ_{i∈P} λ̄ᵢ āᵢ‖) + (16π√d/3)·(1/(16πd))
≤ (4·11·8π√d)/(3√T) + 1/(3√d) ≤ 2/(3√d).

The third inequality is due to ‖γ‖ ≤ 1/(16πd) and ‖x‖ ≤ 1 since x ∈ B. Besides, since Ax ≥ 0 for x ∈ 𝒫, we also have

max_{x∈𝒫∩B} ⟨Σ_{i∈P} λ̄ᵢ āᵢ, x⟩ ≤ max_{x∈𝒫∩B} Σ_{i∈[n]} λ̄ᵢ ⟨āᵢ, x⟩ ≤ max_{‖x‖≤1} Σ_{i∈[n]} λ̄ᵢ ⟨āᵢ, x⟩ = ‖λ̄Ā‖.

The last inequality holds for large enough T = Θ(d²).

Proof of Lemma 14.

The proof of Lemma 14 follows immediately from Claims 17 and 18 when we repeat the process O(log(τ/β)) times in the for-loop (Line 11 of Algorithm 2).

3.4 Putting it all together

Proof of Theorem 1.

Theorem 1 follows from Propositions 5 and 10, where we set δ ← δ/d and ϵ ← Θ(ϵ/√(d³ log(1/ρ₀) log(d/δ))).

4 When the LP is Not Fully Dimensional

Algorithm 1 can only guarantee that the number of dropped constraints is Õ(d²) when the LP has a strictly positive margin. In case the LP has zero margin, i.e., tight (equality) constraints, we can use a standard perturbation technique to create a margin without changing the feasibility of the problem. However, the challenge is to output a solution that satisfies the original constraints. The main idea for handling this case, due to [10], is to iteratively identify tight constraints based on the following observation. An LP with integer entries bounded by U in absolute value has the same feasibility as a relaxed LP with slackness η = 1/(2(d+1)((d+1)U)^{d+1}) added to each constraint, and if a solution to the relaxed LP violates a constraint of the initial LP, that constraint must be tight. This suggests that we can solve the relaxed LP, which has a positive margin, then identify tight constraints and recurse at most d times (we can stop after identifying d linearly independent tight constraints).
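The observation can be illustrated with a small helper (a sketch; `tight_candidates` is our name, not from the paper): given a solution of the η-relaxed LP, the original constraints it violates are exactly the candidates that must hold with equality.

```python
import numpy as np

def tight_candidates(A, b, x, eta):
    """Given x feasible for the relaxed LP Ax <= b + eta*1, flag the original
    constraints it violates (a_i x > b_i); by the observation above, these
    must be tight (a_i x = b_i) in every feasible solution. Sketch only."""
    slack = A @ x - b
    assert np.all(slack <= eta + 1e-12), "x must satisfy the relaxed LP"
    return np.where(slack > 0)[0]      # violated => tight in the original LP
```

For example, with constraints x₁ ≤ 1, x₂ ≤ 1, x₁+x₂ ≤ 2 and η = 0.1, the relaxed-feasible point (1.05, 1.05) flags all three constraints, while an interior point flags none.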

Our improvement over the algorithm of [10] is two-fold. First, we use Algorithm 1 as the solver in each iteration, which improves the number of dropped constraints from Õ((d⁵/ϵ) log^1.5(1/ρ₀)) to O((d²/ϵ) log(1/ρ₀) log²(d/(βδ))). Second, we avoid the fast decrease of the margin during the course of the algorithm, which saves another factor of d.

4.1 Synthetic systems of linear equations

During its course, the algorithm needs to identify the set of tight constraints and privatize them for subsequent use. To this end, we use the algorithm from [10] for privately generating (or sanitizing) synthetic systems of linear equations, stated in the following result.

Theorem 20 (Theorem C [10]).

There exists an (ϵ,δ)-DP algorithm such that for any input system of linear equations, with probability at least 1−δ, the algorithm outputs a system of linear equations 𝒪 which satisfies

  1. Any solution to the input system is a solution to 𝒪, and

  2. Each solution to 𝒪 satisfies all but O((d²/ϵ) log(d/δ)) equations of the input system.

Note that for the analysis we also use the fact that this algorithm first privately selects a set of tight constraints in the original LP, then outputs a canonical basis for the subspace defined by these tight constraints.

4.2 Algorithm

The algorithm runs for at most d iterations. In each iteration, it solves the LP Ax ≤ b + η𝟙 together with a set E of sanitized equality constraints identified in previous iterations. First, the algorithm selects k = |E| independent columns in E to eliminate k variables, then solves the LP without equality constraints using Algorithm 1 (after homogenizing the LP via the reduction from Section 2). The algorithm terminates if it has found d tight constraints or if the number of constraints violated by the solution is small. Otherwise, the algorithm sanitizes the set of constraints violated by the solution, adds them to E, and recurses. Our algorithm is described in Algorithm 3.

Algorithm 3 Private solver for Ax ≤ b.
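The variable-elimination step can be sketched as follows. This is an illustration in floating point that assumes the first k columns of C are independent; the paper instead selects k independent columns and performs the elimination over the integers to control entry growth:

```python
import numpy as np

def eliminate_variables(A, b, C, g):
    """Sketch of one elimination step: given equalities Cx = g where the
    FIRST k columns of C are assumed independent, solve for the k pivot
    variables x_B = CB^{-1}(g - CN x_N) and substitute into Ax <= b.
    Returns (A_tilde, b_tilde) over the remaining d - k variables."""
    k = C.shape[0]
    CB, CN = C[:, :k], C[:, k:]
    AB, AN = A[:, :k], A[:, k:]
    CBinv = np.linalg.inv(CB)
    A_t = AN - AB @ CBinv @ CN     # coefficients on the free variables x_N
    b_t = b - AB @ CBinv @ g       # right-hand side after substitution
    return A_t, b_t
```

For instance, eliminating x₁ via x₁ + x₂ = 1 from the constraints 2x₁ + x₂ ≤ 3 and x₂ ≤ 1 yields −x₂ ≤ 1 and x₂ ≤ 1.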

4.3 Analysis

For the privacy guarantee, we have the following lemma.

Proposition 21.

The algorithm is (ϵ′,δ′)-DP for ϵ′ = 3dϵ² + √(3dϵ² log(1/δ)) and δ′ = (3d+1)δ.

Proof.

This follows directly from advanced composition over the d iterations, each of which uses three (ϵ,δ)-private mechanisms.

Next, for the utility guarantee, we first show that during the iterations the entries of the LP (after variable elimination) do not blow up by more than a factor of U^d, and thereby we can lower bound the margin of the relaxed LP.

Lemma 22.

For each step t of Algorithm 3, consider the LP Ax ≤ b, x ≥ 0 with the entries of A, b bounded by U in absolute value, and let E be the set of k equality constraints Cx = g where C has rank k. We can use k independent columns of C and Gaussian elimination to eliminate k variables and obtain an LP Ãx ≤ b̃, x ≥ 0 with integer entries and d−k variables such that the entries of Ã, b̃ are bounded by k²(k−1)!·U^{k+1}. Furthermore, if the LP {Ax ≤ b; Cx = g; x ≥ 0} is feasible, then the LP resulting from the same elimination applied to {Ax ≤ b + η𝟙; Cx = g; x ≥ 0} has margin parameter ρ_t ≥ η/((d+1)U)^{2(d+1)}.

Proof.

See Appendix.

Equipped with Lemma 22, we can show the bound on the number of dropped constraints.

Proposition 23.

With probability at least 1 − d(β+δ), Algorithm 3 outputs a solution that satisfies all but O((d^3.5/ϵ) log²(d/(βδ)) log(dU)) constraints.

Proof.

See Appendix.

4.4 Proof of Theorem 2

Proof.

The proof of Theorem 2 follows from Propositions 21 and 23, where we select δ ← δ/d, ϵ ← Θ(ϵ/√(d log(1/δ))), and β ← β/d.

References

  • [1] Amos Beimel, Shay Moran, Kobbi Nissim, and Uri Stemmer. Private center points and learning of halfspaces. In COLT, volume 99 of Proceedings of Machine Learning Research, pages 269–282. PMLR, 2019. URL: http://proceedings.mlr.press/v99/beimel19a.html.
  • [2] Omri Ben-Eliezer, Dan Mikulincer, and Ilias Zadik. Archimedes meets privacy: On privately estimating quantiles in high dimensions under minimal assumptions. In NeurIPS, 2022.
  • [3] Mark Bun, Kobbi Nissim, Uri Stemmer, and Salil P. Vadhan. Differentially private release and learning of threshold functions. In FOCS, pages 634–649. IEEE Computer Society, 2015. doi:10.1109/FOCS.2015.45.
  • [4] John Dunagan and Santosh S. Vempala. A simple polynomial-time rescaling algorithm for solving linear programs. Math. Program., 114(1):101–114, 2008. doi:10.1007/S10107-007-0095-7.
  • [5] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, volume 4004 of Lecture Notes in Computer Science, pages 486–503. Springer, 2006. doi:10.1007/11761679_29.
  • [6] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. J. Priv. Confidentiality, 7(3):17–51, 2016. doi:10.29012/JPC.V7I3.405.
  • [7] Yue Gao and Or Sheffet. Differentially private approximations of a convex hull in low dimensions. In ITC, volume 199 of LIPIcs, pages 18:1–18:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.ITC.2021.18.
  • [8] Rebecca Hoberg and Thomas Rothvoss. An improved deterministic rescaling for linear programming algorithms. In IPCO, volume 10328 of Lecture Notes in Computer Science, pages 267–278. Springer, 2017. doi:10.1007/978-3-319-59250-3_22.
  • [9] Justin Hsu, Aaron Roth, Tim Roughgarden, and Jonathan R. Ullman. Privately solving linear programs. In ICALP (1), volume 8572 of Lecture Notes in Computer Science, pages 612–624. Springer, 2014. doi:10.1007/978-3-662-43948-7_51.
  • [10] Haim Kaplan, Yishay Mansour, Shay Moran, Uri Stemmer, and Nitzan Tur. On differentially private linear algebra. CoRR, abs/2411.03087, 2024. doi:10.48550/arXiv.2411.03087.
  • [11] Haim Kaplan, Yishay Mansour, Uri Stemmer, and Eliad Tsfadia. Private learning of halfspaces: Simplifying the construction and reducing the sample complexity. In NeurIPS, 2020.
  • [12] Andrés Muñoz Medina, Umar Syed, Sergei Vassilvitskii, and Ellen Vitercik. Private optimization without constraint violations. In AISTATS, volume 130 of Proceedings of Machine Learning Research, pages 2557–2565. PMLR, 2021. URL: http://proceedings.mlr.press/v130/munoz21a.html.
  • [13] Kobbi Nissim, Uri Stemmer, and Salil P. Vadhan. Locating a small cluster privately. In PODS, pages 413–427. ACM, 2016. doi:10.1145/2902251.2902296.
  • [14] Javier Peña and Negar Soheili. A deterministic rescaled perceptron algorithm. Math. Program., 155(1-2):497–510, 2016. doi:10.1007/S10107-015-0860-Y.

Appendix A Missing Proofs from Section 3

A.1 From Section 3.2

As a reminder, we let $(A,b)$ and $(A',b')$ be the neighboring LPs. The corresponding quantities computed in Algorithm 2 for $V'$ are denoted with an extra prime symbol (for example, $S^{(t)}$ and $S'^{(t)}$).

Proof of Claim 7.

Fix two neighboring LPs $V$ and $V'$. Suppose $V'$ has one extra constraint. Then $S'^{(t)}$ can have at most one more constraint than $S^{(t)}$, so it is immediate that $m^{(t)}$ has $\ell_2$ sensitivity 1. We first consider the case where $S'^{(t)}$ has one extra constraint $\bar a$:

$$u^{(t)} - u'^{(t)} = \left(\frac{1}{m^{(t)}}\sum_{i\in S^{(t)}} \bar a_i\right) - \left(\frac{\bar a}{m^{(t)}+1} + \frac{1}{m^{(t)}+1}\sum_{i\in S^{(t)}} \bar a_i\right) = \frac{\sum_{i\in S^{(t)}} \bar a_i}{m^{(t)}(m^{(t)}+1)} - \frac{\bar a}{m^{(t)}+1}.$$

Then by the triangle inequality,

$$\left\|u^{(t)} - u'^{(t)}\right\| \le \frac{\bigl\|\sum_{i\in S^{(t)}} \bar a_i\bigr\|}{m^{(t)}(m^{(t)}+1)} + \frac{\|\bar a\|}{m^{(t)}+1} \le \frac{2}{m^{(t)}+1} \le \frac{2}{m^{(t)}}.$$

When $V'$ has one less constraint, we proceed similarly:

$$u^{(t)} - u'^{(t)} = \left(\frac{\bar a}{m^{(t)}} + \frac{1}{m^{(t)}}\sum_{i\in S'^{(t)}} \bar a_i\right) - \left(\frac{1}{m^{(t)}-1}\sum_{i\in S'^{(t)}} \bar a_i\right) = -\frac{\sum_{i\in S'^{(t)}} \bar a_i}{m^{(t)}(m^{(t)}-1)} + \frac{\bar a}{m^{(t)}}.$$

Then by the triangle inequality,

$$\left\|u^{(t)} - u'^{(t)}\right\| \le \frac{\bigl\|\sum_{i\in S'^{(t)}} \bar a_i\bigr\|}{m^{(t)}(m^{(t)}-1)} + \frac{\|\bar a\|}{m^{(t)}} \le \frac{2}{m^{(t)}}.$$
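The $2/m^{(t)}$ bound admits a quick numerical sanity check. The sketch below uses hypothetical random data; it assumes each row $\bar a_i$ has unit Euclidean norm, which is the property the $\le 2/(m^{(t)}+1)$ step relies on:

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 50, 10
# m + 1 random unit-norm rows standing in for the constraints a-bar_i
rows = rng.normal(size=(m + 1, d))
rows /= np.linalg.norm(rows, axis=1, keepdims=True)
u = rows[:m].mean(axis=0)        # average over S^(t), with |S^(t)| = m
u_prime = rows.mean(axis=0)      # neighbor: one extra constraint a-bar
diff = np.linalg.norm(u - u_prime)
assert diff <= 2 / m             # the sensitivity bound of Claim 7
print(diff)
```

The assertion holds for any unit-norm rows, not just this random draw, since it is exactly the inequality proved above.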

Proof of Claim 9.

For simplicity, we drop the superscript $s$ for iteration $s$. Consider the case where $V'$ and $P'$ have one extra constraint $\bar a_k$. For all $t\in[T]$ and $i \ne k$ we have

$$0 \le \lambda_i^{(t)} - \lambda_i'^{(t)} \le \frac{1}{m^{(t)}} - \frac{1}{m^{(t)}+1} = \frac{1}{m^{(t)}(m^{(t)}+1)} \le \frac{\lambda_i^{(t)}}{m^{(t)}} \le \frac{\lambda_i^{(t)}}{m}.$$

Hence $|\bar\lambda_i - \bar\lambda_i'| \le \frac{\bar\lambda_i}{m}$. Besides, we also have $\bar\lambda_k' \le \frac{1}{m}$. For coordinate $j$,

$$c_j - c_j' = \sum_{i\in P}\bar\lambda_i \bar a_{i,j} - \sum_{i\in P}\bar\lambda_i' \bar a_{i,j} - \bar\lambda_k' \bar a_{k,j} = \sum_{i\in P}(\bar\lambda_i - \bar\lambda_i')\,\bar a_{i,j} - \bar\lambda_k' \bar a_{k,j}.$$

Then, to compute the $\ell_2$ sensitivity of $c$, we bound $\sqrt{\sum_j (c_j - c_j')^2}$:

$$\begin{aligned}
\sum_j (c_j - c_j')^2 &= \sum_j \Bigl(\sum_{i\in P}\underbrace{(\bar\lambda_i - \bar\lambda_i')}_{\ge 0}\,\bar a_{i,j} - \bar\lambda_k' \bar a_{k,j}\Bigr)^2 \\
&\le \sum_j \Bigl(\sum_{i\in P}(\bar\lambda_i - \bar\lambda_i')\,|\bar a_{i,j}| + \bar\lambda_k' |\bar a_{k,j}|\Bigr)^2 \\
&\le \sum_j \Bigl(\frac{1}{m}\sum_{i\in P}\bar\lambda_i |\bar a_{i,j}| + \frac{1}{m}|\bar a_{k,j}|\Bigr)^2 \\
&\le \frac{2}{m^2}\sum_j \Bigl(\sum_{i=1}^{n}\bar\lambda_i |\bar a_{i,j}|\Bigr)^2 + \frac{2}{m^2}\underbrace{\sum_j |\bar a_{k,j}|^2}_{=1} \\
&\le \frac{2}{m^2}\sum_j \sum_i \bar\lambda_i |\bar a_{i,j}|^2 + \frac{2}{m^2} \qquad \text{(Jensen's inequality, using } \textstyle\sum_i \bar\lambda_i = 1\text{)} \\
&= \frac{2}{m^2}\sum_i \bar\lambda_i \underbrace{\sum_j |\bar a_{i,j}|^2}_{=1} + \frac{2}{m^2} = \frac{4}{m^2}.
\end{aligned}$$

Similarly, consider the case where $V$ and $P$ have one extra constraint $\bar a_k$ compared with the neighbor $V'$. For all $t\in[T]$ and $i \ne k$ we have

$$0 \le \lambda_i'^{(t)} - \lambda_i^{(t)} \le \frac{1}{m^{(t)}-1} - \frac{1}{m^{(t)}} = \frac{1}{m^{(t)}(m^{(t)}-1)} \le \frac{\lambda_i'^{(t)}}{m^{(t)}}.$$

Hence $|\bar\lambda_i - \bar\lambda_i'| \le \frac{\bar\lambda_i'}{m}$. Besides, we also have $\bar\lambda_k \le \frac{1}{m}$. For coordinate $j$,

$$c_j - c_j' = \sum_{i\in P'}\bar\lambda_i \bar a_{i,j} - \sum_{i\in P'}\bar\lambda_i' \bar a_{i,j} + \bar\lambda_k \bar a_{k,j} = \sum_{i\in P'}(\bar\lambda_i - \bar\lambda_i')\,\bar a_{i,j} + \bar\lambda_k \bar a_{k,j}.$$

Then

$$\begin{aligned}
\sum_j (c_j - c_j')^2 &= \sum_j \Bigl(\sum_{i\in P'}\underbrace{(\bar\lambda_i - \bar\lambda_i')}_{\le 0}\,\bar a_{i,j} + \bar\lambda_k \bar a_{k,j}\Bigr)^2 \\
&\le \sum_j \Bigl(\sum_{i\in P'}|\bar\lambda_i - \bar\lambda_i'|\,|\bar a_{i,j}| + \bar\lambda_k |\bar a_{k,j}|\Bigr)^2 \\
&\le \sum_j \Bigl(\frac{1}{m}\sum_{i\in P'}\bar\lambda_i' |\bar a_{i,j}| + \frac{1}{m}|\bar a_{k,j}|\Bigr)^2 \\
&\le \frac{2}{m^2}\sum_j \Bigl(\sum_{i=1}^{n}\bar\lambda_i' |\bar a_{i,j}|\Bigr)^2 + \frac{2}{m^2}\underbrace{\sum_j |\bar a_{k,j}|^2}_{=1} \\
&\le \frac{2}{m^2}\sum_j \sum_i \bar\lambda_i' |\bar a_{i,j}|^2 + \frac{2}{m^2} \qquad \text{(Jensen's inequality, using } \textstyle\sum_i \bar\lambda_i' = 1\text{)} \\
&= \frac{2}{m^2}\sum_i \bar\lambda_i' \underbrace{\sum_j |\bar a_{i,j}|^2}_{=1} + \frac{2}{m^2} = \frac{4}{m^2}.
\end{aligned}$$

Appendix B Missing Proofs from Section 4

Proof of Lemma 22.

Recall that the equality constraints $Cx = g$ are obtained by sanitizing a set of equality constraints with entries bounded by $U$ in absolute value. In particular, there is a set of tight constraints $A'x = b'$ in the original LP, with entries bounded by $U$ in absolute value, that spans the same subspace as $Cx = g$; that is, there is an invertible matrix $M \in \mathbb{R}^{k\times k}$ such that $[C \mid g] = M\,[A' \mid b']$.

Let $C_K$ be $k$ linearly independent columns of $C$, where $K$ is the set of indices of the selected columns; $K$ corresponds to the set of variables that we will eliminate. Eliminating the variables in $K$ from $Ax \le b$ using $Cx = g$ means switching to the new linear constraints $(A - A_K C_K^{-1} C)x \le b - A_K C_K^{-1} g$. Eliminating the same variables using $A'x = b'$ instead yields $(A - A_K A_K'^{-1} A')x \le b - A_K A_K'^{-1} b'$. Notice that $C_K^{-1}C = (MA_K')^{-1}MA' = A_K'^{-1}A'$ and $C_K^{-1}g = (MA_K')^{-1}Mb' = A_K'^{-1}b'$, so the two results are identical.

Therefore, we can think of using $A'x = b'$ for the variable elimination instead of $Cx = g$. Note that the entries of $A, b, A', b'$ are bounded by $U$ and the entries of $(A_K')^{-1}$ are fractions with denominator $\det(A_K')$, so to maintain integrality during the elimination we can multiply each row of $A$ by $\det(A_K')$; each entry of $\det(A_K')\,(A_K')^{-1}$ is then a cofactor of $A_K'$, bounded by $(k-1)!\,U^{k-1}$ in absolute value. Therefore, the resulting LP has entries bounded by $k^2(k-1)!\,U^{k+1}$.
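A small numerical illustration of this elimination, using exact rational arithmetic on hypothetical random data with $k=2$. The entry bound checked is the crude estimate $(k+k^2)(k-1)!\,U^{k+1}$, which is of the same order as the bound above (the extra $k(k-1)!\,U^{k+1}$ term accounts for the scaled $\det(A_K')\cdot A$ part):

```python
from fractions import Fraction
import random

random.seed(1)
U, k, d, m = 3, 2, 4, 5
# k tight constraints A'x = b' with integer entries in [-U, U]; retry until
# the k x k block A'_K (columns K = {0, 1}) is invertible
while True:
    Ap = [[random.randint(-U, U) for _ in range(d)] for _ in range(k)]
    det = Ap[0][0] * Ap[1][1] - Ap[0][1] * Ap[1][0]
    if det != 0:
        break
A = [[random.randint(-U, U) for _ in range(d)] for _ in range(m)]
# exact inverse of the 2x2 block A'_K via its adjugate
inv = [[Fraction(Ap[1][1], det), Fraction(-Ap[0][1], det)],
       [Fraction(-Ap[1][0], det), Fraction(Ap[0][0], det)]]
# crude entry bound of the same order as k^2 (k-1)! U^(k+1); (k-1)! = 1 here
bound = (k + k ** 2) * U ** (k + 1)
for row in A:
    # coefficients of A_K (A'_K)^{-1} for this row
    coeff = [sum(row[i] * inv[i][j] for i in range(k)) for j in range(k)]
    # row of det(A'_K) * (A - A_K (A'_K)^{-1} A'): integral after scaling
    new_row = [det * (row[j] - sum(coeff[i] * Ap[i][j] for i in range(k)))
               for j in range(d)]
    assert all(v.denominator == 1 for v in new_row)   # integrality preserved
    assert all(abs(v) <= bound for v in new_row)      # entries stay bounded
print("integral after scaling; entries within the crude bound")
```

The eliminated columns ($j \in K$) come out exactly zero, as they must, since $A_K (A_K')^{-1} A_K' = A_K$.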

We now show the second claim. First, we show that the LP $\{Ax \le b;\ Cx = g;\ x \ge 0\} = \{Ax \le b;\ A'x = b';\ x \ge 0\}$ has a solution $x^*$ such that $x_i^* \le ((d+1)U)^{d+1}$ for all $i$. Let $x^*$ be any vertex solution. By Cramer's rule, $x_i^*$ is a ratio of two determinants of matrices with integer entries bounded by $U$. We can upper bound the numerator of this ratio by $((d+1)U)^{d+1}$ and lower bound the denominator by $1$.
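Both determinant bounds are elementary. Writing $B$ for the (at most $(d+1)\times(d+1)$) basis matrix defining the vertex $x^*$ and $B_i$ for the same matrix with the $i$-th column replaced by the right-hand side (the labels $B, B_i$ are ours, for illustration):

```latex
x_i^* \;=\; \frac{|\det(B_i)|}{|\det(B)|}
\;\le\; \frac{(d+1)!\,U^{d+1}}{1}
\;\le\; \bigl((d+1)U\bigr)^{d+1},
```

since the numerator is a sum of $(d+1)!$ products of entries bounded by $U$, while the denominator is a nonzero integer, hence at least $1$ in absolute value.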

Next, consider the LP with added slack $\{Ax \le b + \eta\mathbb{1};\ A'x = b';\ x \ge 0\}$. After performing the variable elimination described above, we obtain an LP $\{\tilde A x \le \tilde b + \eta\mathbb{1};\ x \ge 0\}$. Finally, we bound the margin of the vertex solution $x^*$ (restricted to the variables that were not eliminated) for the homogenized version of the latter LP. Since $\tilde A, \tilde b$ have entries bounded by $k^2(k-1)!\,U^{k+1}$ and $x^*$ has entries bounded by $((d+1)U)^{d+1}$, the margin of the homogenized LP is at least

$$\frac{\eta}{(\|x^*\|_1 + 1)\max_i\bigl[\|\tilde A_i\| + |\tilde b_i|\bigr]} \;\ge\; \frac{\eta}{(d+1)\,\bigl(k^2(k-1)!\,U^{k+1}\bigr)\,\bigl((d+1)U\bigr)^{d+1}} \;\ge\; \eta\,\bigl((d+1)U\bigr)^{-2(d+1)}.$$

To prove Proposition 23, we restate the following lemma from [10], which shows that tightness in the relaxed system implies tightness in the original system.

Lemma 24 (Lemma 36 [10]).

Let $A_1, A_2$ be matrices of dimensions $m_1 \times d$ and $m_2 \times d$, and let $b_1, b_2$ be vectors of lengths $m_1$ and $m_2$, all with integer entries bounded by $U$ in absolute value. Let $\eta_2 \ge 0$ be a vector of length $m_2$ whose entries are bounded by $\frac{1}{2(d+1)((d+1)U)^{d+1}}$. If the system

$$A_1 x \le b_1,\qquad A_2 x = b_2 + \eta_2,\qquad x \ge 0$$

is feasible, then the system

$$A_1 x \le b_1,\qquad A_2 x = b_2,\qquad x \ge 0$$

is feasible.

Proof of Proposition 23.

In each iteration, the algorithm solves the LP $\tilde A x \le \tilde b + \eta\mathbb{1}$ with a set of equality constraints $Cx = g$, where $\tilde A x \le \tilde b$ is a subset of the constraints of the original LP $Ax \le b$. Note that, by construction, $Cx = g$ is equivalent to a set of tight constraints $A'x = b'$ in the original LP. Solving this LP gives us a solution that satisfies

$$A_1 x \le b_1,\qquad A_2 x = b_2 + \eta_2,\qquad A'x = b'$$

for a vector $\eta_2 \le \eta\mathbb{1}$, where $A_1 x \le b_1$ and $A_2 x \le b_2$ are constraints in the original LP. Since all entries of the above LP are bounded by $U$ and $\eta = \frac{1}{2(d+1)((d+1)U)^{d+1}}$, Lemma 24 implies that the LP

$$A_1 x \le b_1,\qquad A_2 x \le b_2,\qquad A'x = b' \quad (\text{equivalently, } Cx = g)$$

is feasible. The $\mathsf{PrivatePerceptron}$ algorithm succeeds with probability $1-\beta$ and the sanitization step succeeds with probability $1-\delta$. Therefore, in each iteration, with probability at least $1-(\beta+\delta)$, the set of tight (equality) constraints grows by at least one. Hence, with probability at least $1 - d(\beta+\delta)$, the algorithm terminates after at most $d$ iterations.

By Lemma 22, in the homogenized LP after variable elimination, the margin is at least $\eta\,((d+1)U)^{-2(d+1)} \ge \eta^3 =: \rho$. Therefore, with probability $1-\beta$, the number of dropped constraints when using $\mathsf{PrivatePerceptron}$ in Line 5 is at most

$$O\Bigl(\frac{d^2}{\epsilon}\log^2\frac{d}{\beta\delta}\sqrt{\log\frac{1}{\rho}}\Bigr) = O\Bigl(\frac{d^{2.5}}{\epsilon}\log^2\frac{d}{\beta\delta}\log(dU)\Bigr).$$
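The second equality is the following arithmetic (a sketch; it assumes the perceptron guarantee scales with $\sqrt{\log(1/\rho)}$, which is what produces the extra $d^{0.5}$ factor). With $\eta = \frac{1}{2(d+1)((d+1)U)^{d+1}}$ and $\rho = \eta^3$,

```latex
\log\frac{1}{\rho} \;=\; 3\log\frac{1}{\eta}
\;=\; 3\log\Bigl(2(d+1)\bigl((d+1)U\bigr)^{d+1}\Bigr)
\;=\; O\bigl(d\log(dU)\bigr),
```

so $\sqrt{\log(1/\rho)} = O\bigl(\sqrt{d}\,\log(dU)\bigr)$, turning the $d^2$ factor into $d^{2.5}$.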

If in iteration $t$ the algorithm returns at Line 7, the number of dropped constraints in $J_2$ is at most $\frac{d^2}{\epsilon}$ with probability at least

$$1 - \Pr\Bigl[\mathrm{Lap}\Bigl(\frac{1}{\epsilon}\Bigr) < -\frac{1}{\epsilon}\log\frac{1}{\delta}\Bigr] \;\ge\; 1 - \delta.$$
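The Laplace tail probability invoked here can be checked directly. A minimal numerical sketch, assuming noise of scale $1/\epsilon$ and a threshold of $\frac{1}{\epsilon}\log\frac{1}{\delta}$:

```python
import math

eps, delta = 0.5, 1e-6
t = math.log(1 / delta) / eps          # threshold (1/eps) * log(1/delta)
# For X ~ Lap(1/eps): Pr[X < -t] = 0.5 * exp(-t / (1/eps)) = 0.5 * exp(-eps * t)
tail = 0.5 * math.exp(-eps * t)        # equals delta / 2
assert math.isclose(tail, delta / 2, rel_tol=1e-9)
assert tail <= delta
print(tail)
```

The tail evaluates to $\delta/2$, which is below $\delta$ as required.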

If the algorithm returns at Line 10, again the number of dropped constraints is at most $O\bigl(\frac{d^2}{\epsilon}\log^2\frac{d}{\beta\delta}\sqrt{\log\frac{1}{\rho}}\bigr)$ with probability $1-\delta$. By Theorem 20, the number of constraints dropped by sanitization in each iteration is at most $O\bigl(\frac{d^2}{\epsilon}\log\frac{d}{\delta}\bigr)$ with probability $1-\delta$.

Combining these, over all iterations, with probability at least $1 - d(\beta+\delta)$, the number of dropped constraints is at most $O\bigl(\frac{d^{3.5}}{\epsilon}\log^2\frac{d}{\beta\delta}\log(dU)\bigr)$.