
Decremental (1+ϵ)-Approximate Maximum Eigenvector: Dynamic Power Method

Deeksha Adil Institute for Theoretical Studies, ETH Zürich, Switzerland Thatchaphol Saranurak Department of Computer Science, University of Michigan, Ann Arbor, MI, USA
Abstract

We present a dynamic algorithm for maintaining a $(1+\epsilon)$-approximate maximum eigenvector and eigenvalue of a positive semi-definite matrix $\mathbf{A}$ undergoing decreasing updates, i.e., updates which may only decrease eigenvalues. Given a vector $\mathbf{v}$ updating $\mathbf{A}\leftarrow\mathbf{A}-\mathbf{v}\mathbf{v}^\top$, our algorithm takes $\tilde{O}(\mathrm{nnz}(\mathbf{v}))$ amortized update time, i.e., polylogarithmic per non-zero in the update vector.

Our technique is based on a novel analysis of the influential power method in the dynamic setting. The two previous sets of techniques have the following drawbacks (1) algebraic techniques can maintain exact solutions but their update time is at least polynomial per non-zeros, and (2) sketching techniques admit polylogarithmic update time but suffer from a crude additive approximation.

Our algorithm exploits an oblivious adversary. Interestingly, we show that any algorithm with polylogarithmic update time per non-zeros that works against an adaptive adversary and satisfies an additional natural property would imply a breakthrough for checking psd-ness of matrices in O~(n2) time, instead of O(nω) time.

Keywords and phrases:
Power Method, Dynamic Algorithms
Category:
Track A: Algorithms, Complexity and Games
Funding:
Deeksha Adil: Supported by Dr. Max Rössler, the Walter Haefner Foundation and the ETH Zürich Foundation.
Thatchaphol Saranurak: Supported by NSF grant CCF-2238138.
Copyright and License:
© Deeksha Adil and Thatchaphol Saranurak; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Algorithm design techniques
Related Version:
Full Version: https://arxiv.org/abs/2402.17929
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

Computing eigenvalues and eigenvectors of a matrix is predominant in several applications, including principal component analysis, clustering of high-dimensional data, semidefinite programming, spectral graph partitioning, and algorithms such as Google’s PageRank. In the era of massive and dynamic datasets, developing algorithms capable of efficiently updating spectral information as the input changes has become indispensable.

The study of the change in the spectrum of a matrix undergoing updates spans over five decades, with [13] providing the first algebraic characterization of the change for positive semi-definite matrices via the secular equation. This result offered an explicit formula for exactly computing all new eigenvalues when a matrix undergoes a single rank-one update, assuming knowledge of the entire eigen-decomposition of the initial matrix. Subsequently, [12] showed how to additionally compute eigenvectors explicitly, and handle the case of repeated eigenvalues of the initial matrix. A decade later, [5] extended the works of [13, 12] to compute all new eigenvalues and eigenvectors of a positive semi-definite matrix undergoing an update of the form of a small-rank matrix, or a single batch update. Further works extend these techniques to computing singular values [27], as well as computing new eigenvalues and eigenvectors when only the partial eigen-decomposition of the initial matrix is known [21]. However, all these works require a full eigen-decomposition of the initial matrix and only handle a single update to the matrix.

Another independent line of work aims at finding a small and sparse matrix that has eigenvalues close to the original matrix. A recent work in this direction by [10] provides a universal sparsifier $\mathbf{S}$ for positive semi-definite matrices $\mathbf{A}$, such that $\mathbf{S}$ has at most $n/\epsilon^4$ non-zero entries and $\|\mathbf{A}-\mathbf{A}\circ\mathbf{S}\|\le\epsilon n$. [28] give an alternate technique using Gaussian sketches with $O(1/\epsilon^2)$ rows to approximate all eigenvalues of $\mathbf{A}$ to an additive error of $\epsilon\|\mathbf{A}\|_F$. The algorithms that find sparse matrices approximating the eigenvalues in these works may be extended to handle updates to the initial matrix quickly. However, the approximation of the eigenvalues achieved is quite crude, i.e., at least an $\epsilon n$ additive factor. This approximation is also shown to be tight for such techniques.

Now consider the simpler task of only maintaining the maximum eigenvalue and eigenvector of a matrix $\mathbf{A}$ as it undergoes updates. As we have seen above, known methods based on algebraic techniques require full spectral information before computing the new eigenvalues, and maintaining the entire eigen-decomposition can be slow. Sparsification-based algorithms are fast but only achieve large additive approximations of at least $\epsilon n$, which is not quite satisfactory. Works on streaming PCA, which have the similar goal of maintaining large eigenvectors, focus on optimizing the space complexity instead of the runtime [2]. Previous works based on dynamically maintaining the matrix inverse can maintain $(1+\epsilon)$-multiplicative approximations to the maximum eigenvalue of an $n\times n$ symmetric matrix undergoing single-entry updates with an update time of $O(n^{1.447}/\mathrm{poly}(\epsilon))$ [26]. This was further improved to $O(n^{1.407}/\mathrm{poly}(\epsilon))$ [32]. The update time is even slower for row, column, or rank-one updates. In any case, it takes polynomial time per non-zero of the update. This leads to the following natural question:

Is there a dynamic algorithm that can maintain a multiplicative approximation of the maximum eigenvalue of a matrix using polylogarithmic update time per non-zeros in the update?

In this paper, we study the problem of maintaining a (1+ϵ)-multiplicative approximation to the maximum eigenvalue and eigenvector of a positive semi-definite matrix as it undergoes a sequence of rank-one updates that may only decrease the eigenvalues. We note that this is equivalent to finding the minimum eigenvalue and eigenvector of a positive semi-definite matrix undergoing rank-one updates that may only increase the eigenvalues. We now formally state our problem.

Problem 1 (Decremental Approximate Maximum Eigenvalue and Eigenvector).

We are given a size parameter $n$, an accuracy parameter $\epsilon\in(0,1)$, a psd matrix $\mathbf{A}_0\succeq 0$ of size $n\times n$, and an online sequence of vectors $\mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_T$ that update $\mathbf{A}_t\leftarrow\mathbf{A}_{t-1}-\mathbf{v}_t\mathbf{v}_t^\top$ with a promise that $\mathbf{A}_t\succeq 0$ for all $t$. The goal is to maintain an $\epsilon$-approximate eigenvalue $\lambda_t$ and an $\epsilon$-approximate eigenvector $\mathbf{w}_t\in\mathbb{R}^n$ of $\mathbf{A}_t$ for all times $t$. That is, $\lambda_t$ and a unit vector $\mathbf{w}_t$ satisfy

$$\max_{\mathbf{u}:\,\text{unit}}\mathbf{u}^\top\mathbf{A}_t\mathbf{u}\ \ge\ \lambda_t\ \ge\ (1-\epsilon)\max_{\mathbf{u}:\,\text{unit}}\mathbf{u}^\top\mathbf{A}_t\mathbf{u}, \tag{1}$$

and

$$\mathbf{w}_t^\top\mathbf{A}_t\mathbf{w}_t\ \ge\ (1-\epsilon)\max_{\mathbf{u}:\,\text{unit}}\mathbf{u}^\top\mathbf{A}_t\mathbf{u}. \tag{2}$$
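For concreteness, here is a minimal numpy sketch of the setting of Problem 1; the random matrix, the particular choice of decreasing updates, and the brute-force eigendecomposition standing in for a dynamic algorithm are all illustrative assumptions, not part of our method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 50, 0.1

# Start from a random PSD matrix A_0 = B B^T.
B = rng.standard_normal((n, n))
A = B @ B.T

for t in range(5):
    # A decreasing update A_t <- A_{t-1} - v_t v_t^T that keeps A_t PSD:
    # shrink 10% of the mass along the current maximum eigenvector.
    evals, evecs = np.linalg.eigh(A)
    u = evecs[:, -1]
    v = np.sqrt(0.1 * evals[-1]) * u
    A = A - np.outer(v, v)

    # Any valid output (lambda_t, w_t) must satisfy Eq. (1) and Eq. (2);
    # here brute force plays the role of the dynamic algorithm.
    lam_max = np.linalg.eigvalsh(A)[-1]
    lam_t, w_t = lam_max, u
    assert (1 - eps) * lam_max <= lam_t <= lam_max + 1e-9
    assert w_t @ A @ w_t >= (1 - eps) * lam_max - 1e-9
```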

1.1 Our Results

We give an algorithm for the Decremental Approximate Maximum Eigenvalue and Eigenvector Problem that has an amortized update time of $\tilde{O}(\mathrm{nnz}(\mathbf{v}_t))\le\tilde{O}(n)$, where $\tilde{O}$ hides polynomials in $\log n$. In other words, the total time required by our algorithm over $T$ updates is at most $\tilde{O}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{i=1}^{T}\mathrm{nnz}(\mathbf{v}_i)\right)$. Observe that our algorithm only requires time that is $\tilde{O}(1)\times$ (time required to read the input) and can handle any number of decremental updates. Our algorithm works against an oblivious adversary, i.e., the update sequence is fixed from the beginning. This is the first algorithm that can handle a sequence of updates while providing a multiplicative approximation to the eigenvalues and eigenvectors in a total runtime of $\tilde{O}(n^2+nT)$, with an amortized update time faster than previous algorithms by a factor of $n^{\Omega(1)}$. Formally, we prove the following:

Theorem 2.

There is an algorithm for Problem 1 under a sequence of $T$ decreasing updates that, given parameters $n$, $\mathbf{A}_0$, and $\epsilon>1/n$ as input, works against an oblivious adversary with probability at least $1-1/n$ in total time

$$O\left(\frac{\log^3 n\,\log^6\frac{n}{\epsilon}\,\log\frac{\lambda_{\max}(\mathbf{A}_0)}{\lambda_{\max}(\mathbf{A}_T)}}{\epsilon^4}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{i=1}^{T}\mathrm{nnz}(\mathbf{v}_i)\right)\right).$$

Our algorithm is a novel adaptation of the classical power method (see, e.g., [29]) to the dynamic setting, along with a new analysis that may be of independent interest.

Our work can also be viewed as the first step towards generalizing the dynamic algorithms for solving positive linear programs of [11] to solving dynamic positive semi-definite programs. We discuss this connection in detail in Appendix A.

1.2 Towards Separation between Oblivious and Adaptive adversaries

We also explore the possibility of removing the assumption of an oblivious adversary and working against an adaptive adversary. Recall that an update sequence is generated by an adaptive adversary when it may depend on the outputs of the algorithm.

We show that if there is an algorithm for Problem 1 with a total running time of at most $\tilde{O}(n^2)$ such that the output $\mathbf{w}_t$'s satisfy an additional natural property (which is satisfied by the output of the power method, but not by our algorithm), then it would contradict a well-known hardness barrier in numerical linear algebra, thereby ruling out such algorithms.

We first state the barrier formally, and then state our result for adaptive adversaries. Recall that, given a matrix $\mathbf{A}$, the condition number of $\mathbf{A}$ is $\frac{\max_{\mathbf{x}:\,\text{unit}}\|\mathbf{A}\mathbf{x}\|}{\min_{\mathbf{x}:\,\text{unit}}\|\mathbf{A}\mathbf{x}\|}$.

Problem 3 (Checking psdness with certificate).

Given $\delta\in(0,1)$, a parameter $\kappa$, and a symmetric matrix $\mathbf{A}$ of size $n\times n$ with condition number at most $\kappa$, either

  • Compute a matrix $\mathbf{X}$ where $\|\mathbf{A}-\mathbf{X}\mathbf{X}^\top\|\le\delta\min_{\|\mathbf{x}\|=1}\|\mathbf{A}\mathbf{x}\|$, certifying that $\mathbf{A}$ is a psd matrix, or

  • Report that 𝑨 is not a psd matrix.

Recall that $\mathbf{A}$ is a psd matrix iff there exists $\mathbf{X}$ such that $\mathbf{A}=\mathbf{X}\mathbf{X}^\top$. The matrix $\mathbf{X}$ is called a vector realization of $\mathbf{A}$, and note that $\mathbf{X}$ is not unique. The problem above asks to compute a (highly accurate) vector realization of $\mathbf{A}$. Both the eigendecomposition and the Cholesky decomposition of $\mathbf{A}$ give a solution for Problem 3. (Given an eigendecomposition $\mathbf{A}=\mathbf{Q}\Lambda\mathbf{Q}^\top$ where $\Lambda$ is a non-negative diagonal matrix and $\mathbf{Q}$ is an orthogonal matrix, we set $\mathbf{X}=\mathbf{Q}\Lambda^{1/2}$. Given a Cholesky decomposition $\mathbf{A}=\mathbf{L}\mathbf{L}^\top$ where $\mathbf{L}$ is a lower triangular matrix, we set $\mathbf{X}=\mathbf{L}$.) Banks et al. [7] showed how to compute, with high accuracy, an eigendecomposition of a psd matrix in $O(n^{\omega+\eta})$ time for any constant $\eta>0$, where $\omega>2.37$ is the matrix multiplication exponent. Hence, a vector realization of $\mathbf{A}$ can be found in the same time. Observe that, when $\delta<\kappa$, Problem 3 is at least as hard as certifying that a given matrix $\mathbf{A}$ is a psd matrix. It is a notorious open problem whether certifying the psd-ness of a matrix can be done in time $o(n^\omega)$, even when the condition number is polynomial in $n$. Therefore, we view Problem 3 as a significant barrier and formalize it as follows.

Hypothesis 4 (PSDness Checking Barrier).

There is a constant $\eta>0$ such that every randomized algorithm that solves Problem 3 correctly with probability at least $2/3$ on instances with $\kappa/\delta\le\mathrm{poly}(n)$ requires $n^{2+\eta}$ time.

Our negative result states that, assuming Hypothesis 4, there is no algorithm for Problem 1 against adaptive adversaries with sub-polynomial update time that maintains 𝒘t satisfying an additional property stated below.

Property 5.

For every $t$, let $\mathbf{u}_i(\mathbf{A}_t)$ denote the eigenvectors of $\mathbf{A}_t$ and $\lambda_i(\mathbf{A}_t)$ the corresponding eigenvalues. For all $i$ such that $\lambda_i(\mathbf{A}_t)\le\lambda_1(\mathbf{A}_t)/2$,

$$\left(\mathbf{w}_t^\top\mathbf{u}_i(\mathbf{A}_t)\right)^2\ \le\ \frac{1}{n^2}\cdot\frac{\lambda_i(\mathbf{A}_t)}{\lambda_1(\mathbf{A}_t)}.$$
Theorem 6.

Assuming Hypothesis 4, there is no algorithm for Problem 1 that maintains $\mathbf{w}_t$'s additionally satisfying Property 5 and, given parameters $n$ and $\epsilon=\min\left\{1-\frac{1}{n^{o(1)}},\ \frac{1-\delta}{1+\delta}\right\}$ as input, works against an adaptive adversary in time $n^{o(1)}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{t=1}^{T}\mathrm{nnz}(\mathbf{v}_t)\right)$.

Let us motivate Property 5 from several angles. First, in the static setting, this property can be easily satisfied, since the static power method strongly satisfies it (see Lemma 14). Second, the statement itself is a natural property that we might expect from an algorithm. Consider a “bad” eigenvector $\mathbf{u}_i$ whose corresponding eigenvalue is very small, i.e., less than half of the maximum one. The property states that the projection of the output $\mathbf{w}_t$ along $\mathbf{u}_i$ should be very small, i.e., a polynomial factor smaller than the projection along the maximum eigenvector. This is intuitively useful because we do not want the output $\mathbf{w}_t$ to point in the “bad” direction; it should mainly point along the approximately maximum eigenvector. Third, we formalize this intuition and crucially exploit Property 5 to prove the reduction of Theorem 6. Specifically, Property 5 allows us to decrease the eigenvalues of a psd matrix while maintaining psd-ness, which is crucial for us. See Section 4 for more details. Lastly, our current algorithm from Theorem 2, a dynamic version of the power method, actually maintains $\mathbf{w}_t$'s that satisfy this property at certain snapshots, but not at every step. Unfortunately, we see neither how to strengthen our algorithm to satisfy Property 5 nor how to remove it from the requirement of Theorem 6. We leave both possibilities as open problems.

Understanding the power and limitations of dynamic algorithms against an oblivious adversary vs. an adaptive adversary has become one of the main research programs in the area of dynamic algorithms. However, finding a natural dynamic problem that separates oblivious and adaptive adversaries is still a wide-open problem. (Beimel et al. [9] gave the first separations for artificial problems assuming a strong cryptographic assumption. Bateni et al. [8] show a separation between the adversaries for the $k$-center clustering problem; however, their result relies on the existence of a black box which the adaptive adversary can control.) In this paper, we suggest the problem of maintaining approximate maximum eigenvectors as a natural candidate for such a separation and give some preliminary evidence for it.

Organization

In the following sections, we begin with some preliminaries required to prove our results in Section 2, followed by our algorithm and its analysis in Section 3, and our conditional lower bound in Section 4. In Section 5, we present some open problems. In the appendix (Section A), we show connections with positive semi-definite programs.

2 Preliminaries

Let $\mathbf{A}_0$ denote the initial matrix, and let $\lambda_0=\lambda_{\max}(\mathbf{A}_0)=\|\mathbf{A}_0\|$ denote its maximum eigenvalue. The following are the key definitions we use in our analysis.

Definition 7 (ϵ-max span and dimension).

We define $\mathrm{span}(\epsilon,\mathbf{A})$ to be the space spanned by all eigenvectors of $\mathbf{A}$ corresponding to eigenvalues $\lambda$ satisfying $\lambda\ge(1-\epsilon)\lambda_0$. Let $\dim(\epsilon,\mathbf{A})$ denote the dimension of the space $\mathrm{span}(\epsilon,\mathbf{A})$.

We emphasize that λ0 depends only on 𝑨0. So, it is a static value that does not change through time. We will use the following linear algebraic notations.

Definition 8.

Let $S_1$ and $S_2$ be two subspaces of $\mathbb{R}^n$. The sum $S_1+S_2$ is the space

$$S_1+S_2=\{s=s_1+s_2\ :\ s_1\in S_1,\ s_2\in S_2\}.$$

The complement $\bar{S}$ of a subspace $S$ is the vector space such that $S+\bar{S}=\mathbb{R}^n$ and $S\cap\bar{S}=\{0\}$. The difference $S_1\setminus S_2$ is defined as

$$S_1\setminus S_2 = S_1\cap\bar{S_2}.$$

Next, we list standard facts about high-dimensional probability needed in our analysis.

Lemma 9 (Chernoff Bound).

Let $x_1,\ldots,x_m$ be independent random variables such that $a\le x_i\le b$ for all $i$. Let $x=\sum_i x_i$ and let $\mu=\mathbb{E}[x]$. Then for all $\delta>0$,

$$\Pr[x\ge(1+\delta)\mu]\le\exp\left(-\frac{2\delta^2\mu^2}{m(b-a)^2}\right),$$
$$\Pr[x\le(1-\delta)\mu]\le\exp\left(-\frac{\delta^2\mu^2}{m(b-a)^2}\right).$$
Lemma 10 (Norm of Gaussian Vector).

A random vector $\mathbf{v}\in\mathbb{R}^n$ with every coordinate chosen independently from the normal distribution $N(0,1)$ satisfies

$$\Pr\left[\left|\|\mathbf{v}\|^2-n\right|\le 2(1+\delta)\delta n\right]\ \ge\ 1-e^{-\delta^2 n}.$$
Proof.

The vector $\mathbf{v}$ has entries drawn from $N(0,1)$, so each $\mathbf{v}_i^2$ follows a $\chi^2$ distribution. From Lemma 1 of [18] we have the following tail bound for a sum of $\chi^2$ random variables:

$$\Pr\left[\left|\sum_i\mathbf{v}_i^2-n\right|>2\sqrt{nx}+2x\right]\le e^{-x}.$$

Choosing $x=n\delta^2$ gives

$$\Pr\left[\left|\|\mathbf{v}\|^2-n\right|\le 2\delta(1+\delta)n\right]\ge 1-e^{-\delta^2 n},$$

as required.
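As a quick numerical sanity check (not part of the proof), the following snippet estimates the failure probability in Lemma 10 empirically; the values of $n$, $\delta$, and the number of trials are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, delta, trials = 1000, 0.1, 2000

# Each row is an n-dimensional standard Gaussian vector; sum of squares is ||v||^2.
norms_sq = (rng.standard_normal((trials, n)) ** 2).sum(axis=1)
emp_fail = np.mean(np.abs(norms_sq - n) > 2 * (1 + delta) * delta * n)
print(emp_fail, np.exp(-delta ** 2 * n))  # empirical failure rate vs. the e^{-delta^2 n} bound
```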

Lemma 11 (Distribution of χ2 Variable).

Let $x\sim N(0,1)$ be a Gaussian random variable. Then,

$$\Pr\left[x^2\ge\frac{1}{n^4}\right]\ \ge\ 1-\frac{1}{n^2}.$$
Proof.

The probability density function of $y=x^2$ is

$$f(y)=\frac{1}{\sqrt{2}\,\Gamma(1/2)}\,y^{-1/2}e^{-y/2}.$$

It is known that $\Gamma(1/2)=\sqrt{\pi}$. Now,

$$\Pr\left[x^2\le\frac{1}{n^4}\right]=\int_0^{1/n^4}\frac{1}{\sqrt{2}\,\Gamma(1/2)}\,y^{-1/2}e^{-y/2}\,dy\ \le\ \frac{e^0}{\sqrt{2\pi}}\int_0^{1/n^4}y^{-1/2}\,dy\ =\ \frac{2}{\sqrt{2\pi}}\cdot\frac{1}{n^2}\ \le\ \frac{1}{n^2}.$$

3 Algorithms against an Oblivious Adversary

To prove Theorem 2, we first reduce Problem 1 to solving a normalized threshold version of the problem where we assume that initially, the maximum eigenvalue is not much bigger than one. Then we want to maintain a certificate that the maximum eigenvalue is not much less than one until no such certificate exists. This is formalized below.

Problem 12 (DecMaxEV$(\epsilon,\mathbf{A}_0,\mathbf{v}_1,\ldots,\mathbf{v}_T)$).

Let $\mathbf{A}_0$ be an $n\times n$ symmetric PSD matrix such that $\lambda_{\max}(\mathbf{A}_0)\le 1+\frac{\epsilon}{\log n}$. The DecMaxEV$(\epsilon,\mathbf{A}_0,\mathbf{v}_1,\ldots,\mathbf{v}_T)$ problem asks to find, for every $t$, a vector $\mathbf{w}_t$ such that

$$\|\mathbf{w}_t\|=1\quad\text{and}\quad\mathbf{w}_t^\top\mathbf{A}_t\mathbf{w}_t\ge 1-40\epsilon,$$

or return False, indicating that $\lambda_{\max}(\mathbf{A}_t)\le 1-\frac{\epsilon}{\log n}$.

We defer the proof of the standard reduction stated below to the appendix.

Lemma 13.

Given an algorithm $\mathcal{A}$ that solves the decision problem DecMaxEV$(\epsilon,\mathbf{A}_0,\mathbf{v}_1,\ldots,\mathbf{v}_T)$ (Problem 12) for any $\epsilon>0$, $\mathbf{A}_0\succeq 0$, and vectors $\mathbf{v}_1,\ldots,\mathbf{v}_T$ in time $\mathcal{T}$, we can solve Problem 1 in total time $O\left(\frac{\log^2 n\,\log\frac{n}{\epsilon}}{\epsilon}\,\mathrm{nnz}(\mathbf{A}_0)+\log\frac{n}{\epsilon}\,\log\frac{\lambda_{\max}(\mathbf{A}_0)}{\lambda_{\max}(\mathbf{A}_T)}\,\mathcal{T}\right)$.

Next, we describe Algorithm 1, which can be viewed as an algorithm for Problem 12 when there are no updates. Our algorithm essentially applies power iteration, the standard algorithm for finding an approximate maximum eigenvalue and eigenvector of a matrix. In the algorithm, we make $R=O(\log n)$ independent copies to boost the success probability.

Algorithm 1 DecMaxEV with no update.
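The pseudocode of Algorithm 1 is deferred to the full version. The following is a minimal numpy sketch of the static power method with $R=O(\log n)$ independent Gaussian starts, in the spirit of PowerMethod$(\epsilon,\mathbf{A})$; the exact values of $R$, the iteration count $K$, and the function name are illustrative choices rather than the constants used in the analysis.

```python
import numpy as np

def power_method(A, eps, R=None, rng=None):
    """Return R candidate unit vectors, each obtained by K power iterations.

    A is assumed symmetric PSD.  With K = O(log(n/eps)/eps) iterations, some
    candidate w satisfies w^T A w >= (1 - eps/2) * lambda_max(A) w.h.p.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    R = 10 * int(np.ceil(np.log(n))) if R is None else R
    K = int(np.ceil(np.log(n / eps) / eps))
    W = []
    for _ in range(R):
        w = rng.standard_normal(n)        # random Gaussian start
        for _ in range(K):                # repeated multiplication by A
            w = A @ w
            w /= np.linalg.norm(w)
        W.append(w)
    return W

# Picking the best candidate gives an approximate maximum eigenvector:
#   W = power_method(A, eps); w_best = max(W, key=lambda w: w @ A @ w)
```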

Below, we state the guarantees of the power method.

Lemma 14.

Let $\epsilon>0$ and $\mathbf{A}\succeq 0$. Let $\mathbf{W}$ be as defined in Line 9 in the execution of PowerMethod$(\epsilon,\mathbf{A})$. With probability at least $1-1/n^{10}$, for some $\mathbf{w}\in\mathbf{W}$, it holds that $\mathbf{w}^\top\mathbf{A}\mathbf{w}\ge\left(1-\frac{\epsilon}{2}\right)\lambda_{\max}(\mathbf{A})$. The total time taken by the algorithm is at most $O\left(\mathrm{nnz}(\mathbf{A})\frac{\log n\,\log\frac{n}{\epsilon}}{\epsilon}\right)$.

Furthermore, let $\lambda_i$ and $\mathbf{u}_i$ denote the eigenvalues and eigenvectors of $\mathbf{A}$. For all $i$ such that $\lambda_i(\mathbf{A})\le\frac{\lambda_{\max}(\mathbf{A})}{2}$, with probability at least $1-2/n^{10}$, $\left[\mathbf{w}^\top\mathbf{u}_i\right]^2\le\frac{1}{n^8}\cdot\frac{\lambda_i}{\lambda_1}$.

We note that the last statement of the above lemma says that the vectors returned by the power method satisfy Property 5; we state this for completeness, but it is not required by our algorithm. The following result is a direct consequence of Lemma 14.

Corollary 15.

Let $\epsilon>0$ and $\mathbf{A}\succeq 0$. Let $\mathbf{W}$ be as defined in Line 9 in the execution of PowerMethod$(\epsilon,\mathbf{A})$. If $\lambda_{\max}(\mathbf{A})\ge 1-\epsilon$, then with probability at least $1-1/n^{10}$, $\mathbf{w}^\top\mathbf{A}\mathbf{w}\ge 1-5\epsilon$ for some $\mathbf{w}\in\mathbf{W}$. Furthermore, if $\lambda_{\max}(\mathbf{A})\ge 1-\epsilon/\log n$, then with probability at least $1-1/n^{10}$, $\mathbf{w}^\top\mathbf{A}\mathbf{w}\ge 1-\epsilon$ for some $\mathbf{w}\in\mathbf{W}$. The total time taken by the algorithm is at most $O\left(\mathrm{nnz}(\mathbf{A})\frac{\log n\,\log\frac{n}{\epsilon}}{\epsilon}\right)$.

Observe that, if the algorithm returns $[r_0,\mathbf{W}]$, then $(\mathbf{w}^{(r)})^\top\mathbf{A}\mathbf{w}^{(r)}\ge 1-5\epsilon$ for $r=r_0$, and $\mathbf{w}^{(r_0)}$ is therefore a solution to Problem 12 when there is no update. The power method and its analysis are standard, and we thus defer the proof of Lemma 14 to the appendix.

Next, in Algorithms 2 and 3, we describe an algorithm for Problem 12 when we have an online sequence of updates $\mathbf{v}_1,\ldots,\mathbf{v}_T$. The algorithm starts by initializing $R=O(\log n)$ copies of the approximate maximum eigenvectors from the power method. Given a sequence of updates, as long as one of the copies witnesses that the current matrix $\mathbf{A}_t$ still has a large eigenvalue, i.e., there exists $r$ with $(\mathbf{w}_t^{(r)})^\top\mathbf{A}_t\mathbf{w}_t^{(r)}\ge 1-40\epsilon$, we can simply return $\mathbf{w}_t^{(r)}$ as the solution to Problem 12. Otherwise, $(\mathbf{w}_t^{(r)})^\top\mathbf{A}_t\mathbf{w}_t^{(r)}<1-40\epsilon$ for all $r\le R$, and none of the vectors from the previous call to the power method witnesses a large eigenvalue anymore. In this case, we simply recompute these vectors by calling the power method again. If the power method reports that there is no large eigenvector, then we return False from then on. Otherwise, we continue in the same manner. Note that our algorithm is very simple, but, as we will see, the analysis is not straightforward.

Algorithm 2 Initialization.
Algorithm 3 Update algorithm at time $t$ ($\mathbf{A}_{t-1}$, $r_t$, $\mathbf{W}_{t-1}=[\mathbf{w}_{t-1}^{(r)}:r=1,\ldots,R]$, and $\epsilon$ are maintained).
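The pseudocode of Algorithms 2 and 3 is likewise deferred to the full version. The sketch below (reusing the `power_method` stub above) mirrors the structure just described: keep $R$ candidate witnesses, check them against the threshold $1-40\epsilon$, recompute them via the power method when they all fail, and report False once even a fresh call finds no witness. The class interface and the explicit dense maintenance of $\mathbf{A}_t$ are simplifications for illustration; the actual algorithm only needs the maintained quadratic forms, which change by exactly $(\mathbf{w}^\top\mathbf{v})^2$ per update.

```python
import numpy as np

class DecMaxEV:
    """Sketch of Algorithms 2-3: maintain a witness that lambda_max(A_t) is large."""

    def __init__(self, A0, eps):
        self.A = np.array(A0, dtype=float)
        self.eps = eps
        self.dead = False
        self._recompute()

    def _recompute(self):
        self.W = power_method(self.A, self.eps)          # R = O(log n) candidates
        self.vals = [w @ self.A @ w for w in self.W]     # maintained quadratic forms

    def _witness(self):
        for w, val in zip(self.W, self.vals):
            if val >= 1 - 40 * self.eps:                 # threshold from Problem 12
                return w
        return None

    def update(self, v):
        """Process A <- A - v v^T; return a witness w_t, or False forever after."""
        if self.dead:
            return False
        self.A -= np.outer(v, v)
        # Each quadratic form drops by exactly (w^T v)^2, so this step costs
        # O(R * nnz(v)) -- this is where the polylog-per-nonzero update time comes from.
        self.vals = [val - (w @ v) ** 2 for w, val in zip(self.W, self.vals)]
        w = self._witness()
        if w is not None:
            return w
        self._recompute()                                # all old candidates died
        w = self._witness()
        if w is None:
            self.dead = True                             # no large eigenvalue remains
            return False
        return w
```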

3.1 Proof Overview

The overall proof of Theorem 2, including both correctness and the runtime bound, depends on the number of executions of Line 4 in Algorithm 3. If the number of executions of Line 4 is bounded by $\mathrm{poly}(\log n/\epsilon)$, then the remaining analysis is straightforward. Therefore, the majority of our analysis is dedicated to proving this key lemma, i.e., a $\mathrm{poly}(\log n/\epsilon)$ bound on the number of calls to the power method:

Lemma 16 (Key Lemma).

The number of executions of Line 4 over all updates is bounded by $O\left(\frac{\log n\,\log^5\frac{n}{\epsilon}}{\epsilon^2}\right)$ with probability at least $1-\frac{1}{n}$.

Given the key lemma, the correctness and runtime analyses are quite straightforward. We now give an overview of the proof of Lemma 16.

Let us consider what happens between two consecutive calls to Line 4, say at $\mathbf{A}$ and $\tilde{\mathbf{A}}=\mathbf{A}-\sum_{i=1}^{k}\mathbf{v}_i\mathbf{v}_i^\top$. We first define the following subspaces of $\mathbf{A}$ and $\tilde{\mathbf{A}}$, using the notation of Definition 8.

Definition 17 (Subspaces of $\mathbf{A}$ and $\tilde{\mathbf{A}}$).

Given $\epsilon>0$, $\mathbf{A}$, and $\tilde{\mathbf{A}}$, define for $\nu=0,1,\ldots,15\log\frac{n}{\epsilon}-1$:

$$T_\nu=\mathrm{span}\left(\frac{(\nu+1)\epsilon}{5\log\frac{n}{\epsilon}},\mathbf{A}\right)\setminus\mathrm{span}\left(\frac{\nu\epsilon}{5\log\frac{n}{\epsilon}},\mathbf{A}\right),$$

and

$$\tilde{T}_\nu=\mathrm{span}\left(\frac{(\nu+1)\epsilon}{5\log\frac{n}{\epsilon}},\tilde{\mathbf{A}}\right)\setminus\mathrm{span}\left(\frac{\nu\epsilon}{5\log\frac{n}{\epsilon}},\tilde{\mathbf{A}}\right).$$

That is, the spaces $T_\nu$ and $\tilde{T}_\nu$ are spanned by eigenvectors of $\mathbf{A}$ and $\tilde{\mathbf{A}}$, respectively, corresponding to eigenvalues between $\left(1-\frac{(\nu+1)\epsilon}{5\log\frac{n}{\epsilon}}\right)\lambda_0$ and $\left(1-\frac{\nu\epsilon}{5\log\frac{n}{\epsilon}}\right)\lambda_0$.

Let $d_\nu=\dim(T_\nu)$ and $\tilde{d}_\nu=\dim(\tilde{T}_\nu)$. Also define

$$\tilde{T}=\mathrm{span}(3\epsilon,\tilde{\mathbf{A}}),\qquad T=\mathrm{span}(3\epsilon,\mathbf{A}),$$

and let $d=\dim(T)$ and $\tilde{d}=\dim(\tilde{T})$.

Observe that $T=\sum_{\nu=0}^{15\log\frac{n}{\epsilon}-1}T_\nu$ and similarly $\tilde{T}=\sum_{\nu=0}^{15\log\frac{n}{\epsilon}-1}\tilde{T}_\nu$. We next define some indices/levels corresponding to large subspaces, which we call “important levels”.

Definition 18 (Important ν).

We say a level $\nu$ is important if

$$d_\nu\ \ge\ \frac{\epsilon}{600\log^3\frac{n}{\epsilon}}\sum_{\nu'<\nu}d_{\nu'}.$$

We will use $\mathcal{I}$ to denote the set of levels $\nu$ that are important.

The main technical lemma that implies Lemma 16 is the following:

Lemma 19 (Measure of Progress).

Let $\epsilon>0$ and let $\mathbf{W}=[\mathbf{w}^{(1)},\ldots,\mathbf{w}^{(R)}]$ be as defined in Line 9 in the execution of PowerMethod$(\epsilon,\mathbf{A})$. Let $\mathbf{v}_1,\ldots,\mathbf{v}_k$ be a sequence of updates generated by an oblivious adversary and define $\tilde{\mathbf{A}}=\mathbf{A}-\sum_{i=1}^{k}\mathbf{v}_i\mathbf{v}_i^\top$.

Suppose that $\lambda_{\max}(\mathbf{A})\ge 1-\epsilon$ and $\mathbf{w}^\top\tilde{\mathbf{A}}\mathbf{w}<1-40\epsilon$ for all $\mathbf{w}\in\mathbf{W}$. Then, with probability at least $1-\frac{50\log\frac{n}{\epsilon}}{n^2}$, for some $\nu\in\mathcal{I}$,

  • $\dim(T_\nu\setminus\tilde{T})\ge\frac{\epsilon}{300\log\frac{n}{\epsilon}}\,d_\nu$ if $d_\nu\ge\frac{3000\log n\,\log\frac{n}{\epsilon}}{\epsilon}$, or

  • $\dim(T_\nu\setminus\tilde{T})\ge 1$ if $d_\nu<\frac{3000\log n\,\log\frac{n}{\epsilon}}{\epsilon}$.

We prove this lemma in the full version. Intuitively, it means that whenever Line 4 of Algorithm 3 is executed, there is some important level $\nu$ such that an $\Omega(\epsilon/\mathrm{polylog}(n/\epsilon))$-fraction of the eigenvalues of $\mathbf{A}$ at level $\nu$ have decreased in value. This is the crucial place where we exploit the oblivious adversary.

Given Lemma 19, the remaining proof of Lemma 16 follows a potential function analysis, presented in detail in the full version. We consider the potentials $\Phi_j=\sum_{\nu=0}^{j}d_\nu$ for $j=0,\ldots,15\log\frac{n}{\epsilon}-1$. The main observation is that $\Phi_j$ is non-increasing over time for all $j$, and whenever there exists $\nu_0$ that satisfies the condition of Lemma 19, $\Phi_{\nu_0}$ decreases by $\dim(T_{\nu_0}\setminus\tilde{T})$. Since $\dim(T_{\nu_0}\setminus\tilde{T})\ge\Omega(\epsilon/\mathrm{polylog}(n/\epsilon))\,d_{\nu_0}$ and $\nu_0\in\mathcal{I}$, i.e., $\Phi_{\nu_0}=\sum_{\nu<\nu_0}d_\nu+d_{\nu_0}\le d_{\nu_0}\left(\frac{O(\log^3\frac{n}{\epsilon})}{\epsilon}+1\right)$, we can prove that $\Phi_{\nu_0}$ decreases by a multiplicative factor of $\left(1-\Omega(\epsilon^2/\mathrm{polylog}(n/\epsilon))\right)$. As a result, every time our algorithm executes Line 4, $\Phi_j$ decreases by a multiplicative factor for some $j$, and since we have at most $15\log\frac{n}{\epsilon}$ values of $j$, we can only have $\mathrm{poly}(\log n/\epsilon)$ executions of Line 4.
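To make the counting explicit (a rough sketch with constants suppressed, not the exact bound of the full version): each $\Phi_j$ starts at most $n$, never increases, and is a non-negative integer, so it can shrink by a factor of $\left(1-\Omega(\epsilon^2/\mathrm{polylog}(n/\epsilon))\right)$ at most

$$O\left(\frac{\log n}{\log\frac{1}{1-\Omega(\epsilon^2/\mathrm{polylog}(n/\epsilon))}}\right)=O\left(\frac{\log n\cdot\mathrm{polylog}(n/\epsilon)}{\epsilon^2}\right)$$

times before it reaches zero. Summing over the at most $15\log\frac{n}{\epsilon}$ values of $j$, and recalling that every execution of Line 4 causes at least one such drop, gives the $\mathrm{poly}(\log n/\epsilon)$ bound of Lemma 16 up to the exact powers of the logarithms.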

It remains to describe how we prove Lemma 19 at a high level. We can write $\mathbf{w}^\top\tilde{\mathbf{A}}\mathbf{w}$ for any $\mathbf{w}\in\mathbf{W}$ as

$$\mathbf{w}^\top\tilde{\mathbf{A}}\mathbf{w}=\mathbf{w}^\top\mathbf{A}\mathbf{w}-\mathbf{w}^\top\mathbf{V}\mathbf{w},$$

for $\mathbf{V}=\sum_{i=1}^{k}\mathbf{v}_i\mathbf{v}_i^\top$. Our strategy is to show that:

If $\dim(T_\nu\setminus\tilde{T})$ does not satisfy the inequalities in Lemma 19 for any $\nu$,
then $\mathbf{w}^\top\mathbf{V}\mathbf{w}\le 35\epsilon$ for all $\mathbf{w}\in\mathbf{W}$. (3.1)

Given (3.1), which is formalized in the full version, we can conclude Lemma 19 because, from the definitions of $\mathbf{A}$ and $\tilde{\mathbf{A}}$, we have that for some $\mathbf{w}\in\mathbf{W}$, $\mathbf{w}^\top\mathbf{A}\mathbf{w}\ge 1-5\epsilon$ by Corollary 15 and $\mathbf{w}^\top\tilde{\mathbf{A}}\mathbf{w}<1-40\epsilon$. As a result, for this $\mathbf{w}$, $\mathbf{w}^\top\mathbf{V}\mathbf{w}>35\epsilon$. Now, by the contrapositive of (3.1), we have that $\dim(T_\nu\setminus\tilde{T})$ is large for some $\nu$.

To prove (3.1), we further decompose $\mathbf{w}^\top\mathbf{V}\mathbf{w}$ as

$$\mathbf{w}^\top\mathbf{V}\mathbf{w}=\mathbf{w}^\top\mathbf{V}_{\tilde{T}}\mathbf{w}+\sum_{\nu=0}^{15\log\frac{n}{\epsilon}-1}\mathbf{w}^\top\mathbf{V}_{T_\nu\setminus\tilde{T}}\mathbf{w}+\mathbf{w}^\top\mathbf{V}_{\bar{T}}\mathbf{w}.$$

In the above equation, $\mathbf{V}_{\tilde{T}}=\Pi_{\tilde{T}}\mathbf{V}\Pi_{\tilde{T}}$, $\mathbf{V}_{T_\nu\setminus\tilde{T}}=\Pi_\nu\mathbf{V}\Pi_\nu$, and $\mathbf{V}_{\bar{T}}=\Pi_{\bar{T}}\mathbf{V}\Pi_{\bar{T}}$, where $\Pi_{\tilde{T}}$, $\Pi_\nu$, and $\Pi_{\bar{T}}$ denote the projection matrices onto the spaces $\tilde{T}$, $T_\nu\setminus\tilde{T}$, and $\bar{T}$, respectively. (If a subspace $S$ is spanned by vectors $\mathbf{u}_1,\ldots,\mathbf{u}_k$ and $\mathbf{U}=[\mathbf{u}_1,\ldots,\mathbf{u}_k]$, recall that the projection matrix onto $S$ is $\mathbf{U}(\mathbf{U}^\top\mathbf{U})^{-1}\mathbf{U}^\top$.) Refer to the full version for a proof of why such a split is possible. Our proof of (3.1) then bounds the terms on the right-hand side. Let us consider each term separately.

  1. $\mathbf{w}^\top\mathbf{V}_{\tilde{T}}\mathbf{w}$: We prove that this is always at most $10\epsilon(1+\epsilon)$. From the definition of $\mathbf{V}$,

$$\mathbf{w}^\top\mathbf{V}_{\tilde{T}}\mathbf{w}=\mathbf{w}^\top\Pi_{\tilde{T}}\mathbf{A}\Pi_{\tilde{T}}\mathbf{w}-\mathbf{w}^\top\Pi_{\tilde{T}}\tilde{\mathbf{A}}\Pi_{\tilde{T}}\mathbf{w}.$$

Since $\Pi_{\tilde{T}}\mathbf{w}$ is the projection of $\mathbf{w}$ onto the large eigenspace of $\tilde{\mathbf{A}}$, the second term on the right-hand side above is large, i.e., at least $(1-10\epsilon)\lambda_0\|\Pi_{\tilde{T}}\mathbf{w}\|^2$. The first term on the right-hand side can be bounded as $\mathbf{w}^\top\Pi_{\tilde{T}}\mathbf{A}\Pi_{\tilde{T}}\mathbf{w}\le\|\mathbf{A}\|\,\|\Pi_{\tilde{T}}\mathbf{w}\|^2\le\lambda_0\|\Pi_{\tilde{T}}\mathbf{w}\|^2$. Therefore, the difference is at most $10\epsilon\lambda_0\|\Pi_{\tilde{T}}\mathbf{w}\|^2\le 10\epsilon\lambda_0\|\mathbf{w}\|^2=10\epsilon\lambda_0\le 10\epsilon(1+\epsilon)$.

  2. $\mathbf{w}^\top\mathbf{V}_{\bar{T}}\mathbf{w}$: Observe that this term corresponds to the projection of $\mathbf{w}$ onto the space spanned by the eigenvectors of $\mathbf{A}$ with eigenvalues at most $1-3\epsilon$. Let $\mathbf{u}_i$ and $\lambda_i$ denote such an eigenvector and eigenvalue pair with $\lambda_i<1-3\epsilon$. The power method guarantees that $\mathbf{w}^\top\mathbf{u}_i\le\lambda_i^{2K}$ (where $K$ is the number of power iterations), and $\lambda_i^{2K}\le(1-3\epsilon)^{2K}\le\mathrm{poly}(\epsilon/n)$ is tiny. So we have that $\mathbf{w}^\top\mathbf{V}_{\bar{T}}\mathbf{w}\le\epsilon$.



    Before we look at the final case, we define a basis for the space Tν.

    Definition 20 (Basis for Tν).

Let $T_\nu$ be as defined in Definition 17. Define indices $a_\nu$ and $b_\nu$ with $b_\nu-a_\nu+1=d_\nu$ such that a basis of $T_\nu$ is given by $\mathbf{u}_{a_\nu},\ldots,\mathbf{u}_{b_\nu}$, where $\mathbf{u}_1,\mathbf{u}_2,\ldots,\mathbf{u}_n$ are the eigenvectors of $\mathbf{A}$ in decreasing order of eigenvalues.

  3. $\mathbf{w}^\top\mathbf{V}_{T_\nu\setminus\tilde{T}}\mathbf{w}$: For this discussion, we will ignore constant factors and assume that the high-probability events hold. Let $\Pi_\nu$ denote the projection matrix onto the space $T_\nu\setminus\tilde{T}$. Observe that $\mathbf{w}^\top\mathbf{V}_{T_\nu\setminus\tilde{T}}\mathbf{w}=\mathbf{w}^\top\Pi_\nu\mathbf{V}\Pi_\nu\mathbf{w}\le\|\mathbf{V}\|\,\|\Pi_\nu\mathbf{w}\|^2\le(1+\epsilon)\|\Pi_\nu\mathbf{w}\|^2$, where the last inequality holds because $\tilde{\mathbf{A}}=\mathbf{A}-\mathbf{V}\succeq 0$ and therefore $\|\mathbf{V}\|\le\|\mathbf{A}\|\le(1+\epsilon)$. Hence, it suffices to bound $\|\Pi_\nu\mathbf{w}\|^2=O(\epsilon)$.

We can write $\mathbf{w}=\frac{\sum_{i=1}^{n}\lambda_i^K\alpha_i\mathbf{u}_i}{\sqrt{\sum_{i=1}^{n}\lambda_i^{2K}\alpha_i^2}}$, where the $\lambda_i,\mathbf{u}_i$'s are the eigenvalues and eigenvectors of $\mathbf{A}$ and $\alpha_i\sim N(0,1)$. Define $\mathbf{z}=\sum_{i=1}^{n}z_i\mathbf{u}_i$ where $z_i=\lambda_i^K\alpha_i$; that is, $\mathbf{w}=\frac{\mathbf{z}}{\|\mathbf{z}\|}$. Since $\Pi_\nu\mathbf{w}=\Pi_\nu\mathbf{z}/\|\mathbf{z}\|$, it suffices to show that $\|\Pi_\nu\mathbf{z}\|^2\le O(\epsilon)\|\mathbf{z}\|^2$. We show this in two separate cases. In both cases, we start with the following bound,

$$\|\Pi_\nu\mathbf{z}\|^2\ \lesssim\ \lambda_{a_\nu}^{2K}\dim(T_\nu\setminus\tilde{T}),$$

which holds with high probability. To see this, let $\mathbf{g}_\nu$ be a vector with i.i.d. $N(0,1)$ coordinates in the space $T_\nu\setminus\tilde{T}$. We can couple $\mathbf{g}_\nu$ with $\Pi_\nu\mathbf{z}$ so that $\|\Pi_\nu\mathbf{z}\|$ is dominated by $\lambda_{a_\nu}^K\|\mathbf{g}_\nu\|$, so $\|\Pi_\nu\mathbf{z}\|^2\le\lambda_{a_\nu}^{2K}\|\mathbf{g}_\nu\|^2$. By Lemma 10, the squared norm of a Gaussian vector concentrates around its dimension, so $\|\mathbf{g}_\nu\|^2\lesssim\dim(T_\nu\setminus\tilde{T})$ with high probability, proving the inequality. Next, we bound $\dim(T_\nu\setminus\tilde{T})$ in terms of $\|\mathbf{z}\|$ in two cases.

When $\nu\notin\mathcal{I}$.

From the definition of the important levels, we have

$$\dim(T_\nu\setminus\tilde{T})\ \le\ d_\nu\ \le\ \frac{O(\epsilon)}{\log^3\frac{n}{\epsilon}}\sum_{\nu'<\nu}d_{\nu'}.$$

Now, we have $\sum_{\nu'<\nu}d_{\nu'}\lesssim\sum_{i=1}^{b_{\nu-1}}\alpha_i^2$ because the $\alpha_i\sim N(0,1)$ are Gaussian and the squared norm of a Gaussian vector concentrates around its dimension (Lemma 10). Since $\alpha_i=z_i/\lambda_i^K$, we have that

$$\sum_{\nu'<\nu}d_{\nu'}\ \lesssim\ \sum_{i=1}^{b_{\nu-1}}\alpha_i^2=\sum_{i=1}^{b_{\nu-1}}\frac{z_i^2}{\lambda_i^{2K}}\ \le\ \|\mathbf{z}\|^2/\lambda_{b_{\nu-1}}^{2K}.$$

Therefore, we have

$$\|\Pi_\nu\mathbf{z}\|^2\ \lesssim\ \lambda_{a_\nu}^{2K}\dim(T_\nu\setminus\tilde{T})\ \le\ \left(\frac{\lambda_{a_\nu}}{\lambda_{b_{\nu-1}}}\right)^{2K}\frac{O(\epsilon)}{\log^3\frac{n}{\epsilon}}\|\mathbf{z}\|^2\ \le\ O(\epsilon)\|\mathbf{z}\|^2,$$

where the last inequality is trivial because $\lambda_{b_{\nu-1}}\ge\lambda_{a_\nu}$ by definition.

When $\nu\in\mathcal{I}$.

In this case, according to (3.1), we can assume

$$\dim(T_\nu\setminus\tilde{T})\ \le\ \epsilon\,d_\nu.$$

Again, by Lemma 10, we have that $d_\nu\lesssim\sum_{i=a_\nu}^{b_\nu}\alpha_i^2$ because the $\alpha_i\sim N(0,1)$ are Gaussian. Since $\alpha_i=z_i/\lambda_i^K$, we have

$$d_\nu\ \lesssim\ \sum_{i=a_\nu}^{b_\nu}\alpha_i^2=\sum_{i=a_\nu}^{b_\nu}\frac{z_i^2}{\lambda_i^{2K}}\ \le\ \|\mathbf{z}\|^2/\lambda_{b_\nu}^{2K}.$$

Therefore,

$$\|\Pi_\nu\mathbf{z}\|^2\ \lesssim\ \lambda_{a_\nu}^{2K}\dim(T_\nu\setminus\tilde{T})\ \le\ \left(\frac{\lambda_{a_\nu}}{\lambda_{b_\nu}}\right)^{2K}\epsilon\,\|\mathbf{z}\|^2\ \le\ O(\epsilon)\|\mathbf{z}\|^2,$$

where the last inequality holds because

$$\left(\frac{\lambda_{a_\nu}}{\lambda_{b_\nu}}\right)^{2K}\le\left(\frac{1-\frac{\nu\epsilon}{5\log\frac{n}{\epsilon}}}{1-\frac{(\nu+1)\epsilon}{5\log\frac{n}{\epsilon}}}\right)^{2K}\le\left(1+\frac{\epsilon}{2\log\frac{n}{\epsilon}}\right)^{2K}\le O(1).$$

From these three cases, we conclude that if $\dim(T_\nu\setminus\tilde{T})$ is small for all $\nu$, then $\mathbf{w}^\top\mathbf{V}\mathbf{w}\le 35\epsilon$, proving our claim.

We defer the formal proofs of the claims made in this section to the full version.

4 Conditional Lower Bounds for an Adaptive Adversary

In this section, we will prove a conditional hardness result for algorithms against adaptive adversaries. In particular, we will prove Theorem 6. Consider Algorithm 4 for solving Problem 3. The only step in Algorithm 4 whose implementation is not specified is Line 8. We will implement this step using an algorithm for Problem 1.

Algorithm 4 Algorithm for Checking PSDness.

High-level idea

Overall, our hardness result uses the idea that an adaptive adversary can use the returned maximum eigenvectors to perform updates. This can happen $n$ times, and in the process we would recover the entire eigen-decomposition of the matrix, which is hard. Now consider Algorithm 4. We claim that Algorithm 4 solves Problem 3. At first glance, this claim looks suspicious, because the input matrix for Problem 3 might not be PSD, while the dynamic algorithm for Problem 1 at Line 8 has guarantees only when the matrices remain PSD. However, the reduction does work by crucially exploiting Property 5. The high-level idea is as follows (see also the sketch after the two cases below).

  • If the input matrix $\mathbf{A}$ is initially PSD, then we can show that $\mathbf{A}_t$ remains PSD for all $t$ by exploiting Property 5 (see Lemma 21). So the approximation guarantee of the algorithm at Line 8 is valid at all steps. From this guarantee, $\|\mathbf{A}_T\|$ must be tiny, since we keep decreasing the approximately maximum eigenvalues (see Lemma 22). At the end, the reduction returns $\mathbf{X}$.

  • If the input matrix $\mathbf{A}$ is initially not PSD, there must exist a direction $\mathbf{v}$ such that $\mathbf{v}^\top\mathbf{A}\mathbf{v}<0$. Since in the reduction we update $\mathbf{A}_T=\mathbf{A}-\mathbf{W}$ for some $\mathbf{W}\succeq 0$, we must have $\mathbf{v}^\top\mathbf{A}_T\mathbf{v}\le\mathbf{v}^\top\mathbf{A}\mathbf{v}$. That is, this negative direction remains negative or becomes even more negative, no matter what guarantees the algorithm at Line 8 has. So $\|\mathbf{A}_T\|$ cannot be tiny. We can distinguish whether $\|\mathbf{A}_T\|$ is tiny using the static power method at Line 11, and hence we return False in this case (see Lemma 22).
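To make the reduction concrete, here is a hedged numpy sketch in the spirit of the two cases above. The interface `dyn_max_eig.init`/`dyn_max_eig.update` for the dynamic Problem 1 routine, the parameter `num_rounds` (playing the role of the number of updates $T$ chosen in Algorithm 4), and the exact-norm final check (the paper instead uses the static power method at Line 11) are all illustrative assumptions, not the paper's exact pseudocode.

```python
import numpy as np

def check_psd(A, delta, dyn_max_eig, num_rounds):
    """Hedged sketch of Algorithm 4: test PSD-ness via a dynamic max-eigenvector routine.

    `dyn_max_eig` is a hypothetical black box for Problem 1 with methods
    init(A) -> (mu_0, w_0) and update(v) -> (mu_t, w_t), each returning an
    approximate maximum eigenvalue/eigenvector pair of the current matrix.
    """
    A_t = np.array(A, dtype=float)
    cols = []
    mu, w = dyn_max_eig.init(A_t)
    for _ in range(num_rounds):
        mu = max(mu, 0.0)                 # guard: mu can be meaningless if A is not PSD
        v = np.sqrt(mu / 10.0) * w        # remove roughly a tenth of the top eigenvalue
        A_t -= np.outer(v, v)             # decremental update A_t <- A_{t-1} - v v^T
        cols.append(v)
        mu, w = dyn_max_eig.update(v)     # adaptive: the next update depends on this output
    X = np.column_stack(cols)             # candidate vector realization of A
    sigma_min = np.linalg.svd(A, compute_uv=False)[-1]
    # If A was PSD the residual A - X X^T (= A_T) is tiny; a surviving negative
    # direction keeps it large otherwise.
    if np.linalg.norm(A - X @ X.T, 2) <= delta * sigma_min:
        return X
    return False
```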

Analysis

We prove the guarantees of the output of Algorithm 4 when 𝒘t’s satisfy Property 5 for all t.

Lemma 21.

In Algorithm 4, let the $\mathbf{w}_t$'s, $t=1,\ldots,T$, be generated such that they additionally satisfy Property 5. If $\mathbf{A}_0\succeq 0$, then $\mathbf{A}_t\succeq 0$ for all $t$.

We would like to point out that our parameter $\epsilon$ is quite large. This means that our reduction works even if we only find a crude approximation to the maximum eigenvector, as long as it points along directions with large eigenvalues, since $\mathbf{w}$ also has to satisfy Property 5. We defer the proof of the above to the appendix.

Lemma 22.

In Algorithm 4, let the $\mathbf{w}_t$'s, $t=1,\ldots,T$, be generated such that they additionally satisfy Property 5.

  • If $\mathbf{A}\succeq 0$, then Algorithm 4 returns $\mathbf{X}$ such that $\|\mathbf{A}-\mathbf{X}\mathbf{X}^\top\|\le\delta\min_{\|\mathbf{x}\|=1}\|\mathbf{A}\mathbf{x}\|$.

  • If 𝑨 is not psd, then Algorithm 4 returns False.

Proof of Theorem 6

We are now ready to prove our conditional lower bound.

Proof.

Let $\mathcal{A}(\epsilon,\mathbf{A}_0,\mathbf{v}_1,\ldots,\mathbf{v}_T)$ denote an algorithm for Problem 1 that maintains an $\epsilon$-approximate maximum eigenvalue (1), $\mu_t$, and eigenvector (2), $\mathbf{w}_t$, for matrices $\mathbf{A}_t=\mathbf{A}_{t-1}-\mathbf{v}_t\mathbf{v}_t^\top$ such that the $\mathbf{w}_t$'s satisfy Property 5. We will show that if the total update time of $\mathcal{A}$ is $n^{o(1)}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{t=1}^{T}\mathrm{nnz}(\mathbf{v}_t)\right)$, then there is an $n^{2+o(1)}$-time algorithm for Problem 3, which contradicts Hypothesis 4.

Given an instance $(\delta,\kappa,\mathbf{A})$ of Problem 3, we will run Algorithm 4 where Line 8 is implemented using $\mathcal{A}$. We generate the input and the update sequence for $\mathcal{A}$ as follows. Set $\mathbf{A}_0\leftarrow\mathbf{A}$. Set $\epsilon$ and $T$ according to Algorithm 4. For $1\le t\le T$, we set $\mathbf{v}_t=\frac{1}{\sqrt{10}}\sqrt{\mu_{t-1}}\,\mathbf{w}_{t-1}$ according to Line 7 of Algorithm 4. We note that this is a valid update sequence for Problem 1 when $\mathbf{A}\succeq 0$, since from Lemma 21, if $\mathbf{A}\succeq 0$ then $\mathbf{A}_t=\mathbf{A}-\sum_{i=0}^{t-1}\frac{\mu_i}{10}\mathbf{w}_i\mathbf{w}_i^\top\succeq 0$.

Now we describe what we return as an answer for Problem 3. From Lemma 22, if $\mathbf{A}$ is not PSD, then Algorithm 4 returns False and we report that the matrix is not PSD. Additionally, if $\mathbf{A}\succeq 0$, the algorithm returns $\mathbf{X}=\frac{1}{\sqrt{10}}\left[\sqrt{\mu_1}\mathbf{w}_1\ \sqrt{\mu_2}\mathbf{w}_2\ \cdots\ \sqrt{\mu_T}\mathbf{w}_T\right]$ as a certificate that $\mathbf{A}$ is PSD. This completes the reduction from Problem 3 to Problem 1.

The total time required by the reduction is

$$O\left(n^{o(1)}\left(\mathrm{nnz}(\mathbf{A})+\sum_{t=1}^{T}\mathrm{nnz}(\mathbf{w}_t)\right)\right)\ \le\ O\left(n^{o(1)}\left(n^2+Tn\right)\right)\ \le\ O\left(n^{2+o(1)}\log\frac{\kappa}{\delta}\right),$$

which is at most $n^{2+o(1)}$ when $\kappa/\delta\le\mathrm{poly}(n)$.

To conclude, we indeed obtain an $n^{2+o(1)}$-time algorithm for Problem 3. Assuming Hypothesis 4, the algorithm $\mathcal{A}$ cannot have $n^{o(1)}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{t=1}^{T}\mathrm{nnz}(\mathbf{v}_t)\right)$ total update time.

5 Conclusion and Open Problems

Upper Bounds

We have presented a novel extension of the power method to the dynamic setting. Our algorithm from Theorem 2 maintains a multiplicative (1+ϵ)-approximate maximum eigenvalue and eigenvector of a positive semi-definite matrix that undergoes decremental updates from an oblivious adversary. The algorithm has polylogarithmic amortized update time per non-zeros in the updates.

Our algorithm is simple, but our analysis is quite involved. While we believe a tighter analysis that improves our logarithmic factors is possible, it is an interesting open problem to give a simpler analysis for our algorithm. Other natural questions are whether we can get similar algorithms in incremental or fully dynamic settings and whether one can get a worst-case update time.

Lower Bounds

We have shown a conditional lower bound for a class of algorithms against an adaptive adversary in Theorem 6. It would also be very exciting to generalize our lower bound to hold for any algorithm against an adaptive adversary, as that would imply an oblivious-vs-adaptive separation for a natural dynamic problem.

Incremental Updates

We believe that the corresponding incremental problem, i.e., where the matrix is updated as $\mathbf{A}_t\leftarrow\mathbf{A}_{t-1}+\mathbf{v}_t\mathbf{v}_t^\top$, cannot be solved in polylogarithmic amortized update time, even when the update sequence $\mathbf{v}_t$ comes from an oblivious adversary. At a high level, the incremental version of our problem seems significantly harder for the following reasons. When we perform decremental updates to a matrix, the new maximum eigenvector must lie in the eigenspace spanned by the original maximum eigenvectors. Furthermore, it is easy to detect whether the maximum eigenvalue has gone down, as we have shown in this paper. In the incremental setting, it is possible that after an update the maximum eigenvalue has gone up and the new maximum eigenvector is a direction that is neither a previous maximum eigenvector nor the update direction; in such cases, we cannot detect this quickly using known information about previous eigenvectors and update directions. This can happen $n$ times, and in every such case we have to compute the eigenvalue and eigenvector from scratch. Therefore, we leave lower bounds and algorithms for the incremental setting as an open problem.

Dynamic SDPs

As discussed in Appendix A, Theorem 2 can be viewed as a starting point towards a dynamic algorithm for general positive semi-definite programs. Can we make further progress? The dynamic semi-definite program problem, even with just two matrix constraints, already seems to be challenging.

One promising approach to attack this problem is to dynamize the matrix multiplicative weights update (MMWU) method for solving a packing/covering SDP [23], since the corresponding approach was successful for linear programs: the near-optimal algorithms of [11] are essentially dynamized multiplicative weight update methods for positive linear programs. However, in our preliminary study exploring this approach, we could only obtain an algorithm that solves Problem 24, which has a single matrix constraint, and solves Problem 1 partially, i.e., maintains an approximate eigenvalue only. The main barrier in this approach is that the algorithm requires maintaining the exponential of the sum of the constraint matrices, and to do this fast, we need that for any two constraint matrices $\mathbf{A}$ and $\mathbf{B}$, $e^{\mathbf{A}+\mathbf{B}}=e^{\mathbf{A}}e^{\mathbf{B}}$, which holds only when $\mathbf{A}$ and $\mathbf{B}$ commute, i.e., $\mathbf{A}\mathbf{B}=\mathbf{B}\mathbf{A}$. When $\mathbf{A}$ and $\mathbf{B}$ are diagonal this is true, and therefore we can obtain the required algorithms for positive LPs. Even when we have just two constraint matrices, one of which is diagonal, this remains an issue, as the matrices still do not commute in general.
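The non-commutativity obstacle is easy to see numerically. The snippet below (purely illustrative, using scipy's matrix exponential) shows $e^{\mathbf{A}+\mathbf{B}}\ne e^{\mathbf{A}}e^{\mathbf{B}}$ already for one diagonal and one non-diagonal PSD matrix, the case mentioned above.

```python
import numpy as np
from scipy.linalg import expm

A = np.diag([1.0, 2.0])                  # diagonal PSD constraint matrix
B = np.array([[1.0, 1.0],
              [1.0, 1.0]])               # non-diagonal PSD constraint matrix

print(np.allclose(A @ B, B @ A))                     # False: A and B do not commute
print(np.allclose(expm(A + B), expm(A) @ expm(B)))   # False: the exponential does not factor
```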

References

  • [1] Zeyuan Allen-Zhu, Yin Tat Lee, and Lorenzo Orecchia. Using optimization to obtain a width-independent, parallel, simpler, and faster positive sdp solver. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1824–1831. SIAM, 2016. doi:10.1137/1.9781611974331.CH127.
  • [2] Zeyuan Allen-Zhu and Yuanzhi Li. First efficient convergence for streaming k-pca: a global, gap-free, and near-optimal rate. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 487–492. IEEE, 2017. doi:10.1109/FOCS.2017.51.
  • [3] Zeyuan Allen-Zhu and Lorenzo Orecchia. Nearly-linear time positive lp solver with faster convergence rate. In Proceedings of the forty-seventh annual ACM symposium on Theory of Computing, pages 229–236, 2015. doi:10.1145/2746539.2746573.
  • [4] Zeyuan Allen-Zhu and Lorenzo Orecchia. Nearly linear-time packing and covering lp solvers: Achieving width-independence and-convergence. Mathematical Programming, 175:307–353, 2019. doi:10.1007/S10107-018-1244-X.
  • [5] Peter Arbenz, Walter Gander, and Gene H Golub. Restricted rank modification of the symmetric eigenvalue problem: Theoretical considerations. Linear Algebra and its Applications, 104:75–95, 1988.
  • [6] Sanjeev Arora and Satyen Kale. A combinatorial, primal-dual approach to semidefinite programs. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 227–236, 2007. doi:10.1145/1250790.1250823.
  • [7] Jess Banks, Jorge Garza-Vargas, Archit Kulkarni, and Nikhil Srivastava. Pseudospectral shattering, the sign function, and diagonalization in nearly matrix multiplication time. Foundations of Computational Mathematics, pages 1–89, 2022.
  • [8] MohammadHossein Bateni, Hossein Esfandiari, Hendrik Fichtenberger, Monika Henzinger, Rajesh Jayaram, Vahab Mirrokni, and Andreas Wiese. Optimal fully dynamic k-center clustering for adaptive and oblivious adversaries. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2677–2727. SIAM, 2023. doi:10.1137/1.9781611977554.CH101.
  • [9] Amos Beimel, Haim Kaplan, Yishay Mansour, Kobbi Nissim, Thatchaphol Saranurak, and Uri Stemmer. Dynamic algorithms against an adaptive adversary: Generic constructions and lower bounds. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 1671–1684, 2022. doi:10.1145/3519935.3520064.
  • [10] Rajarshi Bhattacharjee, Gregory Dexter, Cameron Musco, Archan Ray, and David P Woodruff. Universal matrix sparsifiers and fast deterministic algorithms for linear algebra. arXiv preprint arXiv:2305.05826, 2023.
  • [11] Sayan Bhattacharya, Peter Kiss, and Thatchaphol Saranurak. Dynamic algorithms for packing-covering lps via multiplicative weight updates. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1–47. SIAM, 2023. doi:10.1137/1.9781611977554.CH1.
  • [12] James R Bunch, Christopher P Nielsen, and Danny C Sorensen. Rank-one modification of the symmetric eigenproblem. Numerische Mathematik, 31(1):31–48, 1978.
  • [13] Gene H Golub. Some modified matrix eigenvalue problems. SIAM review, 15(2):318–334, 1973.
  • [14] Garud Iyengar, David J Phillips, and Cliff Stein. Feasible and accurate algorithms for covering semidefinite programs. In Scandinavian Workshop on Algorithm Theory, pages 150–162. Springer, 2010.
  • [15] Garud Iyengar, David J Phillips, and Clifford Stein. Approximating semidefinite packing programs. SIAM Journal on Optimization, 21(1):231–268, 2011. doi:10.1137/090762671.
  • [16] Arun Jambulapati, Yin Tat Lee, Jerry Li, Swati Padmanabhan, and Kevin Tian. Positive semidefinite programming: Mixed, parallel, and width-independent, 2021. arXiv:2002.04830.
  • [17] Philip Klein and Hsueh-I Lu. Efficient approximation algorithms for semidefinite programs arising from max cut and coloring. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages 338–347, 1996. doi:10.1145/237814.237980.
  • [18] Beatrice Laurent and Pascal Massart. Adaptive estimation of a quadratic functional by model selection. Annals of statistics, pages 1302–1338, 2000.
  • [19] Yin Tat Lee and He Sun. An sdp-based algorithm for linear-sized spectral sparsification. In Proceedings of the 49th annual acm sigact symposium on theory of computing, pages 678–687, 2017. doi:10.1145/3055399.3055477.
  • [20] Michael Luby and Noam Nisan. A parallel approximation algorithm for positive linear programming. In Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 448–457, 1993. doi:10.1145/167088.167211.
  • [21] Roy Mitz, Nir Sharon, and Yoel Shkolnisky. Symmetric rank-one updates from partial spectrum with an application to out-of-sample extension. SIAM Journal on Matrix Analysis and Applications, 40(3):973–997, 2019. doi:10.1137/18M1172120.
  • [22] Lorenzo Orecchia, Sushant Sachdeva, and Nisheeth K Vishnoi. Approximating the exponential, the Lanczos method and an Õ(m)-time spectral algorithm for balanced separator. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 1141–1160, 2012. doi:10.1145/2213977.2214080.
  • [23] Richard Peng, Kanat Tangwongsan, and Peng Zhang. Faster and simpler width-independent parallel algorithms for positive semidefinite programming. In Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures, pages 101–108, 2012.
  • [24] Serge A Plotkin, David B Shmoys, and Éva Tardos. Fast approximation algorithms for fractional packing and covering problems. Mathematics of Operations Research, 20(2):257–301, 1995. doi:10.1287/MOOR.20.2.257.
  • [25] Kent Quanrud. Nearly linear time approximations for mixed packing and covering problems without data structures or randomization. In Symposium on Simplicity in Algorithms, pages 69–80. SIAM, 2020. doi:10.1137/1.9781611976014.11.
  • [26] Piotr Sankowski. Dynamic transitive closure via dynamic matrix inverse. In 45th Annual IEEE Symposium on Foundations of Computer Science, pages 509–517. IEEE, 2004.
  • [27] Peter Stange. On the efficient update of the singular value decomposition. In PAMM: Proceedings in Applied Mathematics and Mechanics, volume 8, pages 10827–10828. Wiley Online Library, 2008.
  • [28] William Swartworth and David P Woodruff. Optimal eigenvalue approximation via sketching. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 145–155, 2023. doi:10.1145/3564246.3585102.
  • [29] Lloyd N Trefethen and David Bau. Numerical linear algebra, volume 181. Siam, 2022.
  • [30] Luca Trevisan. Parallel approximation algorithms by positive linear programming. Algorithmica, 21(1):72–88, 1998. doi:10.1007/PL00009209.
  • [31] Luca Trevisan. Max cut and the smallest eigenvalue. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 263–272, 2009. doi:10.1145/1536414.1536452.
  • [32] Jan van den Brand, Danupon Nanongkai, and Thatchaphol Saranurak. Dynamic matrix inverse: Improved algorithms and matching conditional lower bounds. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 456–480. IEEE, 2019. doi:10.1109/FOCS.2019.00036.
  • [33] Di Wang, Satish Rao, and Michael W Mahoney. Unified acceleration method for packing and covering problems via diameter reduction. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016.

Appendix A Connections to Dynamic Positive Semi-definite Programs

This section discusses connections between our Problem 1 and the dynamic versions of positive semi-definite programs. Using this connection, we conclude that Theorem 2 implies a dynamic algorithm for a special case of the dynamic covering SDP problem.

We first define packing and covering semi-definite programs (SDPs). (Some papers, e.g., [16], flip our definition of packing and covering SDPs by considering their dual forms.)

Definition 23 (Packing/Covering SDP).

Let $\mathbf{C}$ and $\mathbf{A}_i$, $i=1,2,\ldots,m$, be $n\times n$ symmetric PSD matrices, and let the $b_i$'s denote positive real numbers. The packing SDP problem asks to find

$$\max_{\mathbf{Y}\succeq 0}\ \mathrm{Tr}[\mathbf{C}\mathbf{Y}]\quad\text{s.t. }\ \mathrm{Tr}[\mathbf{A}_i\mathbf{Y}]\le b_i,\ i=1,\ldots,m.$$

The covering SDP problem asks to find

$$\min_{\mathbf{Y}\succeq 0}\ \mathrm{Tr}[\mathbf{C}\mathbf{Y}]\quad\text{s.t. }\ \mathrm{Tr}[\mathbf{A}_i\mathbf{Y}]\ge b_i,\ i=1,\ldots,m.$$

Note that when the matrices 𝑪 and 𝑨i’s are all diagonal matrices, the packing and covering SDP problems above are precisely the well-studied packing and covering LP problems, respectively. Near-linear time algorithms for (1+ϵ)-approximately solving packing and covering LPs are very well-studied in a long line of work, some of which are [3, 4, 33, 25], and these problems have many applications, such as in graph embedding [24], approximation algorithms [20, 30], scheduling [24], to name a few.

Dynamic LPs

Near-optimal dynamic algorithms for packing and covering LPs were shown in [11]. The paper studies two kinds of updates: restricting and relaxing updates. Restricting updates can only shrink the feasible region, whereas relaxing updates can only grow it. In [11], the authors gave a deterministic algorithm that maintains a $(1+\epsilon)$-approximate solution to packing and covering LPs undergoing only restricting updates or only relaxing updates in total time $\tilde{O}(N/\epsilon^3+t/\epsilon)$, where $N$ is the total number of non-zeros in the initial input and the updates, and $t$ is the number of updates. Hence, this is optimal up to logarithmic factors.

A natural question is whether one can generalize the near-optimal dynamic LP algorithms with polylogarithmic overhead by [11] to work with SDPs since SDPs capture many further applications such as maximum cuts [15, 17], Sparse PCA [15], sparsest cuts [14], and balanced separators [22], among many others.

Static SDPs

Unfortunately, the algorithms for solving packing and covering SDPs are much more limited, even in the static setting. Near-linear time algorithms are known only for covering SDPs when the cost matrix 𝑪=𝑰 is the identity [23, 1].

The fundamental barrier to working with a general psd matrix $\mathbf{C}$ in covering SDPs is that doing so is as hard as approximating the minimum eigenvalue of $\mathbf{C}$ (consider the program $\min_{\mathbf{Y}\succeq 0}\mathrm{Tr}[\mathbf{C}\mathbf{Y}]$ such that $\mathrm{Tr}[\mathbf{Y}]\ge 1$). To the best of our knowledge, near-linear time algorithms for approximating the minimum eigenvalue assume a near-linear-time solver for $\mathbf{C}$, i.e., the ability to compute $\mathbf{C}^{-1}\mathbf{x}$ in near-linear time given $\mathbf{x}$. This can be done, for example, by applying the power method to $\mathbf{C}^{-1}$. When $\mathbf{C}^{-1}$ admits a fast solver, one can sometimes approximately solve a covering SDP fast, such as for spectral sparsification of graphs [19] and the max-cut problem [6, 31].

For packing SDPs, there is simply no near-linear time algorithm known. An algorithm for approximately solving packing SDPs and even the generalization to mixed packing-covering SDPs was claimed in [16], but there is an issue in the convergence analysis even for pure packing SDPs. Fast algorithms for this problem, hence, remain open.

Dynamic SDPs

Since near-linear time static algorithms are prerequisites for dynamic algorithms with polylogarithmic overhead, we can only hope for a dynamic algorithm for covering SDPs when $\mathbf{C}$ is the identity. Below, we show that our algorithm from Theorem 2 implies a dynamic algorithm for maintaining a covering SDP solution when there is a single constraint and the updates are restricting. This follows because this problem is equivalent to Problem 1.

We first define the dynamic covering problem with a single constraint under restricting updates.

Problem 24 (Covering SDP with a Single Matrix Constraint under Restricting Updates).

Given $\mathbf{A}_0\succeq 0$, an accuracy parameter $\epsilon>0$, and an online sequence of vectors $\mathbf{v}_1,\mathbf{v}_2,\ldots,\mathbf{v}_T$ that update $\mathbf{A}_t\leftarrow\mathbf{A}_{t-1}-\mathbf{v}_t\mathbf{v}_t^\top$ such that $\mathbf{A}_t\succeq 0$, the problem asks to explicitly maintain, for all $t$, a $(1+\epsilon)$-approximate optimal value $\nu_t$ of the SDP, i.e.,

$$\nu_t\ \le\ (1+\epsilon)\,\mathrm{OPT}_t,\qquad\text{where }\ \mathrm{OPT}_t\stackrel{\mathrm{def}}{=}\min_{\mathbf{Y}\succeq 0,\ \mathrm{Tr}[\mathbf{A}_t\mathbf{Y}]\ge 1}\mathrm{Tr}[\mathbf{Y}].$$

Furthermore, given a query request, return a matrix $\mathbf{Q}^{(t)}$ where $\mathbf{Y}^{(t)}=\mathbf{Q}^{(t)}(\mathbf{Q}^{(t)})^\top\in\mathbb{R}^{n\times n}$ is a $(1+\epsilon)$-approximate optimal solution, i.e.,

$$\mathrm{Tr}[\mathbf{A}_t\mathbf{Y}^{(t)}]\ge 1\ \text{ and }\ \mathrm{Tr}[\mathbf{Y}^{(t)}]\le(1+\epsilon)\,\mathrm{OPT}_t.$$

Problem 24 is equivalent to Problem 1 in the following sense: given an algorithm for Problem 1, we can obtain an algorithm for Problem 24 with the same total update time and optimal query time. Conversely, given an algorithm for Problem 24, we can obtain an algorithm for the eigenvalue-only version of Problem 1 with the same total update time.

Proposition 25.

The following holds,

  1. Given an algorithm for Problem 1 with update time $\mathcal{T}$, there is an algorithm for Problem 24 with update time $\mathcal{T}$ and query time $O(n)$, i.e., the time required to produce $\mathbf{Q}^{(t)}$ at any time $t$.

  2. Given an algorithm for Problem 24 with update time $\tilde{\mathcal{T}}$, there is an algorithm for Problem 1 that only maintains the approximate eigenvalues, with update time at most $\tilde{\mathcal{T}}$.

Proof.

Let us first characterize the solution to Problem 24 at any $t$ for $\epsilon=0$. Since any feasible solution $\mathbf{Y}$ is PSD, it can be written as $\mathbf{Y}=\sum_{i=1}^{n}p_i\mathbf{y}_i\mathbf{y}_i^\top$ for some unit vectors $\mathbf{y}_i$ and $p_i\ge 0$. Now $\mathrm{Tr}[\mathbf{Y}]=\sum_i p_i$ and

$$1\ \le\ \mathrm{Tr}[\mathbf{A}_t\mathbf{Y}]=\sum_{i=1}^{n}p_i\,\mathrm{Tr}[\mathbf{A}_t\mathbf{y}_i\mathbf{y}_i^\top]=\sum_{i=1}^{n}p_i\,\mathbf{y}_i^\top\mathbf{A}_t\mathbf{y}_i\ \le\ \lambda_{\max}(\mathbf{A}_t)\sum_{i=1}^{n}p_i.$$

The above implies that for any feasible solution $\mathbf{Y}^{(t)}$, it must hold that $\sum_{i=1}^{n}p_i\ge 1/\lambda_{\max}(\mathbf{A}_t)$, and equality holds iff the $\mathbf{y}_i$'s are maximum eigenvectors of $\mathbf{A}_t$. Since the problem asks to minimize $\sum_{i=1}^{n}p_i$, the optimal solution is $\mathbf{Y}^{(t)}=\frac{1}{\lambda_{\max}(\mathbf{A}_t)}\mathbf{u}\mathbf{u}^\top$, where $\mathbf{u}$ is the maximum eigenvector of $\mathbf{A}_t$. Note that here $\mathbf{Q}^{(t)}=\frac{1}{\sqrt{\lambda_{\max}(\mathbf{A}_t)}}\mathbf{u}$.

We now show the first part by proving that, at any $t$, given $\epsilon$, the solution to Problem 1 gives a solution to Problem 24. Let $\lambda_t$ and $\mathbf{w}_t$ denote an $(\epsilon/2)$-approximate solution to Problem 1 at some time $t$. Consider the solution $\mathbf{Q}^{(t)}=\frac{1}{\sqrt{(1-\epsilon/2)\lambda_t}}\mathbf{w}_t$, which gives $\mathbf{Y}^{(t)}=\frac{1}{(1-\epsilon/2)\lambda_t}\mathbf{w}_t\mathbf{w}_t^\top$. Then,

$$\mathrm{Tr}[\mathbf{A}_t\mathbf{Y}^{(t)}]=\frac{\mathbf{w}_t^\top\mathbf{A}_t\mathbf{w}_t}{(1-\epsilon/2)\lambda_t}\ \ge\ \frac{(1-\epsilon/2)\lambda_{\max}(\mathbf{A}_t)}{(1-\epsilon/2)\lambda_{\max}(\mathbf{A}_t)}\ \ge\ 1.$$

Therefore, $\mathbf{Y}^{(t)}$ satisfies the constraint of Problem 24. Next, we look at the objective value: $\nu_t=\mathrm{Tr}[\mathbf{Y}^{(t)}]=\frac{1}{(1-\epsilon/2)\lambda_t}\le\frac{1}{(1-\epsilon/2)^2\lambda_{\max}(\mathbf{A}_t)}\le(1+\epsilon)\,\mathrm{OPT}_t$. We can maintain $\mathbf{Q}^{(t)}$ by just maintaining $\mathbf{w}_t$ and $\lambda_t$, which requires no extra time. The time required to obtain $\mathbf{Q}^{(t)}$ from $\mathbf{w}_t$ and $\lambda_t$, which is the query time, is at most $O(\mathrm{nnz}(\mathbf{w}_t))=O(n)$, and the value of $\nu_t$ can be obtained in $O(1)$ time.
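The first direction is just a rescaling; the following minimal numpy check (with an exact eigenpair standing in for the output of Problem 1, and tolerances chosen arbitrarily) verifies the mapping $\mathbf{Q}^{(t)}=\mathbf{w}_t/\sqrt{(1-\epsilon/2)\lambda_t}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, eps = 30, 0.1
B = rng.standard_normal((n, n))
A_t = B @ B.T                                    # current PSD matrix

# Stand-in for an (eps/2)-approximate answer to Problem 1 (here: the exact top pair).
evals, evecs = np.linalg.eigh(A_t)
lam_t, w_t = evals[-1], evecs[:, -1]

# Mapping from the proof: Q^(t) = w_t / sqrt((1 - eps/2) * lam_t).
Q = w_t / np.sqrt((1 - eps / 2) * lam_t)
Y = np.outer(Q, Q)

opt = 1.0 / evals[-1]                            # OPT_t = 1 / lambda_max(A_t)
assert np.trace(A_t @ Y) >= 1 - 1e-9             # feasible for Problem 24
assert np.trace(Y) <= (1 + eps) * opt + 1e-9     # within the (1 + eps) objective bound
```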

To see the other direction, consider the solution $\mathbf{Q}^{(t)},\nu_t$ of Problem 24. We can set $\lambda_t=\frac{1}{\nu_t}$. This implies that

$$\lambda_t=\frac{1}{\nu_t}\ \ge\ \frac{\lambda_{\max}(\mathbf{A}_t)}{1+\epsilon}\ \ge\ (1-\epsilon)\lambda_{\max}(\mathbf{A}_t),$$

as required. In this case, we do not require any extra time.

By plugging Theorem 2 into Proposition 25, we conclude the following.

Corollary 26.

There is a randomized algorithm for Problem 24 under a sequence of $T$ restricting updates that, given $n$, $\mathbf{A}_0$, and $\epsilon>1/n$ as input, works against an oblivious adversary with probability at least $1-1/n$ in total update time

$$O\left(\frac{\log^3 n\,\log^6\frac{n}{\epsilon}\,\log\frac{\lambda_{\max}(\mathbf{A}_0)}{\lambda_{\max}(\mathbf{A}_T)}}{\epsilon^4}\left(\mathrm{nnz}(\mathbf{A}_0)+\sum_{i=1}^{T}\mathrm{nnz}(\mathbf{v}_i)\right)\right),$$

and query time O(n).