
Learning-Augmented Streaming Algorithms for Approximating MAX-CUT

Yinhao Dong, School of Computer Science and Technology, University of Science and Technology of China, Hefei, China; Pan Peng, School of Computer Science and Technology, University of Science and Technology of China, Hefei, China; Ali Vakilian, Toyota Technological Institute at Chicago (TTIC), IL, USA
Abstract

We study learning-augmented streaming algorithms for estimating the value of MAX-CUT in a graph. In the classical streaming model, while a 1/2-approximation for estimating the value of MAX-CUT can be trivially achieved with O(1) words of space, Kapralov and Krachun [STOC’19] showed that this is essentially the best possible: for any ϵ > 0, any (randomized) single-pass streaming algorithm that achieves an approximation ratio of at least 1/2 + ϵ requires Ω(n/2^{poly(1/ϵ)}) space.

We show that it is possible to surpass the 1/2-approximation barrier using just O(1) words of space by leveraging a (machine learned) oracle. Specifically, we consider streaming algorithms that are equipped with an ϵ-accurate oracle that, for each vertex in the graph, returns its correct label in {−1,+1}, corresponding to an optimal MAX-CUT solution in the graph, with probability 1/2 + ϵ, and the incorrect label otherwise.

Within this framework, we present a single-pass algorithm that approximates the value of MAX-CUT to within a factor of 1/2 + Ω(ϵ²) with probability at least 2/3 for insertion-only streams, using only poly(1/ϵ) words of space. We also extend our algorithm to fully dynamic streams while maintaining a space complexity of poly(1/ϵ, log n) words.

Keywords and phrases:
Learning-Augmented Algorithms, Graph Streaming Algorithms, MAX-CUT
Copyright and License:
© Yinhao Dong, Pan Peng, and Ali Vakilian; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Streaming, sublinear and near linear time algorithms
Funding:
The work was conducted in part while Pan Peng and Ali Vakilian were long-term visitors at the Simons Institute for the Theory of Computing as part of the Sublinear Algorithms program. Yinhao Dong and Pan Peng are supported in part by NSFC grant 62272431.
Related Version:
Full Version: https://arxiv.org/abs/2412.09773
Editors:
Raghu Meka

1 Introduction

Given an undirected, unweighted graph, the MAX-CUT problem seeks to find a partition of the vertices (a cut) that maximizes the number of edges with endpoints on opposite sides of the partition (such edges are said to be “cut” by the partition). MAX-CUT is a well-known NP-hard problem. The best-known approximation algorithm, developed by Goemans and Williamson [22], achieves a 0.878-approximation ratio, which is the best possible under the Unique Games Conjecture [33].

We consider the problem of estimating the value of MAX-CUT in a graph under the streaming model of computation, where MAX-CUT denotes some fixed optimal solution, and the value of a solution is defined as the number of edges in the graph that are cut by the partition. In this model, the graph is presented as a sequence of edge insertions and deletions, known as a graph stream. The objective is to analyze the graph’s structure using minimal space. The model is called insertion-only if the stream contains only edge insertions; if both insertions and deletions are allowed, it is referred to as the dynamic model.

In the streaming model, a 1/2-approximation for estimating the value of MAX-CUT can be trivially achieved using O(1) words of space, where a word represents the space required to encode the size of the graph. This is done by simply counting the total number m of edges in the graph and outputting m/2. However, Kapralov and Krachun [31] showed that this is essentially the best possible: for any ϵ > 0, any (randomized) single-pass streaming algorithm in the insertion-only model that achieves an approximation ratio of at least 1/2 + ϵ requires Ω(n/2^{poly(1/ϵ)}) space.

We focus on the problem of estimating the value of MAX-CUT in a streaming setting within the framework of learning-augmented algorithms. These algorithms utilize predictions from a machine learning model to solve a problem, where the predictions typically include some information about the optimal solution, future events, or yet-unread data in the input stream (see, e.g., [24, 27, 13]). Learning-augmented algorithms have gained significant attention recently, partly due to numerous breakthroughs in machine learning. Ideally, such algorithms should be both robust and consistent: when the predictions are accurate, the algorithm should outperform the best-known classical algorithms, and when the predictions are inaccurate, it should still provide performance guarantees that are close to or match those of the best-known classical algorithms. Despite the extensive research in the area of learning-augmented algorithms over the past few years (see https://algorithms-with-predictions.github.io/ for an up-to-date repository of publications on this topic), our understanding of this framework within the streaming model remains limited.

In this work, we consider the noisy prediction model, also referred to as ϵ-accurate predictions, where the algorithm has oracle access to a prediction vector Y ∈ {−1,+1}ⁿ. Each entry Y_v is independently correct with probability 1/2 + ϵ, where ϵ ∈ (0, 1/2] represents the bias of the predictions. Specifically, for each v ∈ V, we have Pr[Y_v = x*_v] = 1/2 + ϵ and Pr[Y_v = −x*_v] = 1/2 − ϵ, where x* ∈ {−1,+1}ⁿ denotes some fixed but unknown optimal solution for MAX-CUT. This model captures a scenario where an ML algorithm (or oracle) provides predictions for the values of x* that are unreliable and noisy, being only slightly better than random guesses (i.e., just above the 1/2 probability that a random solution would satisfy). Recently, variants of this prediction model have been used to design improved approximation algorithms for MAX-CUT and constraint satisfaction problems (CSPs) [14, 21].
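To make the model concrete, here is a minimal simulation of an ϵ-accurate oracle (our illustration, not part of the paper's algorithms; `x_star`, the fixed optimal assignment, is of course unknown to a real algorithm):

```python
import random

def make_epsilon_accurate_oracle(x_star, eps, seed=0):
    """Simulate an ϵ-accurate oracle: each label is drawn once and
    independently, and equals the optimal label x_star[v] ∈ {−1,+1}
    with probability 1/2 + eps (it is flipped otherwise)."""
    rng = random.Random(seed)
    Y = {v: (x if rng.random() < 0.5 + eps else -x) for v, x in x_star.items()}
    return lambda v: Y[v]  # 𝒪(v): the prediction stays fixed for the whole stream
```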

Specifically, we study the following question:

Given an oracle 𝒪 that provides ϵ-accurate predictions Y about the optimal MAX-CUT of a streaming graph G, can we improve upon the worst-case trade-off between approximation ratio and space established by [31], namely that a (1/2 + ϵ) ratio requires Ω(n) space, for estimating the MAX-CUT value?

1.1 Our Results

We provide an affirmative answer by presenting a single-pass streaming algorithm that surpasses the 1/2-approximation barrier for the value of MAX-CUT using only poly(1/ϵ) (resp., poly(1/ϵ, log n)) words of space in insertion-only (resp., fully dynamic) streams. Formally, we establish the following result:

Theorem 1.

Let ϵ ∈ (0, 1/2]. Given oracle access to an ϵ-accurate predictor, there exists a single-pass streaming algorithm that provides a (1/2 + Ω(ϵ²))-approximation for estimating the MAX-CUT value of a graph in insertion-only (resp., fully dynamic) streams using poly(1/ϵ) (resp., poly(1/ϵ, log n)) words of space, with probability at least 2/3.

By using the median trick, we can boost the success probability to 1 − δ for any δ ∈ (0,1), at the cost of an additional log(1/δ) factor in the space complexity.
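For illustration, a minimal sketch of this amplification, assuming `estimator` is any randomized estimator (e.g., a fresh, independent run of our algorithm) that lands in the desired range with probability at least 2/3; the repetition constant is illustrative:

```python
import math
import statistics

def median_boost(estimator, delta):
    """Median trick: run Θ(log(1/δ)) independent copies of an estimator
    that succeeds with probability ≥ 2/3 and return the median; by a
    Chernoff bound, the median is in range with probability ≥ 1 − δ."""
    t = math.ceil(48 * math.log(1 / delta))  # illustrative constant
    return statistics.median(estimator() for _ in range(t))
```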

We remark that the “robustness” of our learning-augmented algorithm comes for free: regardless of the quality of the predictions, we can always count the total number of edges in the graph and report half of it, which yields a 1/2-approximation. With predictions, the algorithm gains an additional advantage of Ω(ϵ²) in the approximation ratio. Moreover, our algorithm does not require the exact value of ϵ; a constant-factor approximation of ϵ suffices, and this estimate is only needed to determine the sample size.

Our algorithm is based on the observation that when the maximum degree of a graph is relatively small (i.e., smaller than poly(ϵ)·m), the number of edges whose endpoints have different predicted labels already provides a strictly better-than-1/2 approximation of the MAX-CUT value. For general graphs, we employ well-known techniques, such as the CountMin sketch and ℓ₀-sampling, to distinguish between high-degree and low-degree vertices in both insertion-only and dynamic streams. By combining a natural greedy algorithm with an algorithm tailored to the low-degree part, we achieve a non-trivial approximation. The space complexity of the resulting algorithm is primarily determined by the subroutines for identifying the set of high-degree vertices, which can be bounded by poly(1/ϵ, log n).

We note that it is standard to assume that the noisy oracle 𝒪 has a compact representation and that its space complexity is not included in the space usage of our algorithm, following the conventions in streaming learning-augmented algorithms [24, 27, 13, 1]. Indeed, as noted in [24], a reliable oracle can often be learned space-efficiently in practice as well. Moreover, we show that in the case of random-order streams, our algorithm only needs to query 𝒪 for the labels of a constant number of vertices (see Appendix B).

Additionally, our algorithm actually works in a weaker model: we can assume that in our streaming framework, the predicted label of each vertex is attached to its edges, so the elements of the stream can be represented as (e = (u,v), Y_u, Y_v), where the predictions (Y_u)_{u∈V} remain consistent throughout the stream. The case of dynamic streams is defined analogously. Notably, in this model, no additional space is required to store the predictions.

1.2 Related Work

Learning-augmented algorithms have been actively researched in online algorithms [42, 35, 4, 6, 7, 25, 41, 5, 36], data structures [43, 20, 46, 39, 44], graph algorithms [18, 12, 8, 37, 16, 10, 40, 23, 17, 14, 21, 11], and sublinear-time algorithms [26, 19, 38, 45]. Our algorithms fit into the category of learning-augmented streaming algorithms. Hsu et al. [24] introduced learning-augmented streaming algorithms for frequency estimation, and Jiang et al. [27] extended this framework to various norm estimation problems in data streams. Recently, Aamand et al. [2, 1] developed learning-augmented frequency estimation algorithms that improve upon the work of [24]. Additionally, Chen et al. [13] studied single-pass streaming algorithms for estimating the number of triangles and four-cycles in a graph. It is worth mentioning that both our work and previous efforts on learning-augmented streaming algorithms mainly focus on using predictors to improve the corresponding space-accuracy trade-offs. Furthermore, recent studies have explored learning-augmented algorithms for MAX-CUT and constraint satisfaction problems (CSPs) [14, 21] within a variant of our prediction model. In particular, for the MAX-CUT problem, Cohen-Addad et al. [14] achieved a (0.878 + Ω̃(ϵ⁴))-approximation using SDP. However, it is important to emphasize that their setting differs significantly from ours: they focus on finding the MAX-CUT in the offline setting, whereas our goal is to estimate its size in the streaming setting. Additionally, Ghoshal et al. [21] developed an algorithm that yields near-optimal solutions when the average degree is larger than a threshold determined only by the bias of the predictions, independent of the instance size. They further extended this result to weighted graphs and the MAX-3LIN problem.

The aforementioned lower bound proven in [31] is, in fact, the culmination of a series of works exploring the trade-offs between space and approximation ratio for the streaming MAX-CUT value problem, including [29, 34, 30]. In the main body of this paper, we focus on the problem of estimating the value of MAX-CUT in a streaming setting. Another related problem is that of finding the MAX-CUT itself in a streaming model. Since even outputting the MAX-CUT requires Ω(n) space, research in this area primarily considers streaming algorithms that use Õ(n) space, known as the semi-streaming model, where Õ(·) hides polylogarithmic factors. Beyond the trivial 1/2-approximation algorithm, which simply partitions the vertices randomly into two parts, there exists a (1−ϵ)-approximation algorithm for finding MAX-CUT. This algorithm works by constructing a (1±ϵ) cut or spectral sparsifier of the input streaming graph, which can be achieved in Õ(n) space (see, e.g., [3, 32]), and then finding the MAX-CUT of the sparsified graph (which may require exponential time).

2 Preliminaries and Problem Statement

Notations.

Throughout the paper we let G = (V,E) be an undirected, unweighted graph with n vertices and m edges. Given a vertex v ∈ V and disjoint sets S, T ⊆ V, e(v,S) denotes the number of edges incident to v whose other endpoint belongs to S, i.e., e(v,S) = |{(u,v) : (u,v) ∈ E and u ∈ S}|. Similarly, we define e(S,T) to denote the number of edges with one endpoint in S and the other endpoint in T, i.e., e(S,T) = Σ_{v∈T} e(v,S). A cut (S,T) of G is a bipartition of V into two disjoint subsets S and T. The value of the cut (S,T) is e(S,T).

In the descriptions of our algorithms, for a vertex set S ⊆ V, we define S⁺ (resp. S⁻) as the set of vertices with predicted + (resp. −) signs by the given ϵ-accurate oracle 𝒪. Note that S⁺ and S⁻ form a bipartition of S. Throughout the paper, all space complexities are measured in words unless otherwise specified. The space complexity in bits is larger by a factor of log m + log n.
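As a concrete reading of this notation, a small helper (our illustration) that computes e(S,T) and the value of a cut from an explicit edge list:

```python
def e_between(edges, S, T):
    """e(S,T): number of edges with one endpoint in S and the other in T
    (S and T are disjoint vertex sets)."""
    return sum(1 for u, v in edges if (u in S and v in T) or (u in T and v in S))

def cut_value(edges, S):
    """Value of the cut (S, V∖S): the number of edges crossing the bipartition."""
    return sum(1 for u, v in edges if (u in S) != (v in S))
```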

In our algorithms, we utilize several well-known techniques in the streaming model, which we define below.

CountMin Sketch [15].

Given a stream of m integers from [n], let f_i denote the number of occurrences of i in the stream, for any i ∈ [n]. In the CountMin sketch, there are k uniform and independent random hash functions h₁, h₂, …, h_k : [n] → [w] and an array C of size k × w. The algorithm maintains C such that C[ℓ, s] = Σ_{j : h_ℓ(j) = s} f_j at the end of the stream. Upon each query of an arbitrary i ∈ [n], the algorithm returns f̃_i = min_{ℓ∈[k]} C[ℓ, h_ℓ(i)].

Theorem 2 (CountMin Sketch [15]).

For any i ∈ [n] and any ℓ ∈ [k], we have f̃_i ≥ f_i and 𝔼[C[ℓ, h_ℓ(i)]] ≤ f_i + m/w. The space complexity is O(kw) words. Let ϵ, δ ∈ (0,1). If we set k = e/ϵ and w = ln(1/δ), then for any i ∈ [n], we have f̃_i ≥ f_i, and with probability at least 1 − δ, f̃_i ≤ f_i + ϵm. The corresponding space complexity is O((1/ϵ)·ln(1/δ)) words.
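A minimal CountMin sketch along these lines (our sketch for illustration: k rows of width w, with simple modular hashing standing in for the 2-wise independent hash functions):

```python
import random

class CountMin:
    """CountMin sketch [15]: k hash rows of width w. A query returns the
    minimum counter over the rows, which never underestimates f_i."""
    def __init__(self, k, w, seed=0):
        rng = random.Random(seed)
        self.p = (1 << 61) - 1  # a prime larger than the universe size
        self.hashes = [(rng.randrange(1, self.p), rng.randrange(self.p))
                       for _ in range(k)]
        self.w = w
        self.C = [[0] * w for _ in range(k)]

    def _h(self, row, x):
        a, b = self.hashes[row]
        return ((a * x + b) % self.p) % self.w  # 2-universal-style hash

    def update(self, i, delta=1):
        for r in range(len(self.hashes)):
            self.C[r][self._h(r, i)] += delta

    def query(self, i):
        return min(self.C[r][self._h(r, i)] for r in range(len(self.hashes)))
```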

Reservoir Sampling [47].

The algorithm samples k elements from a stream such that, at any time m ≥ k, the sample consists of a uniformly random subset of size k of the elements seen so far. The space complexity of the algorithm is O(k(log n + log m)) bits.
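A standard implementation of this sampler ("Algorithm R"; our sketch):

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Reservoir sampling [47]: after any m ≥ k items, `sample` is a
    uniformly random size-k subset of the items seen so far."""
    rng = random.Random(seed)
    sample = []
    for m, item in enumerate(stream, start=1):
        if m <= k:
            sample.append(item)
        else:
            j = rng.randrange(m)  # the new item replaces a slot w.p. k/m
            if j < k:
                sample[j] = item
    return sample
```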

Definition 3 (ℓ₀-sampler [28]).

Let x ∈ ℝⁿ be a non-zero vector and δ ∈ (0,1). An ℓ₀-sampler for x returns FAIL with probability at most δ and otherwise returns some index i such that x_i ≠ 0, with each such index returned with probability 1/|supp(x)|, where supp(x) = {i : x_i ≠ 0} is the support of x.

The following theorem states that ℓ₀-samplers can be maintained using a single pass in dynamic streams.

Theorem 4 (Theorem 2 in [28]).

Let δ ∈ (0,1). There exists a single-pass streaming algorithm for maintaining an ℓ₀-sampler for a non-zero vector x ∈ ℝⁿ (with failure probability δ) in the dynamic model using O(log²n · log(1/δ)) bits of space.
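For intuition, here is a simplified, idealized ℓ₀-sampler in the spirit of [28] (our illustration only: it stores per-index randomness explicitly in O(n) space rather than hashing, assumes nonnegative entries as in edge-multiplicity vectors, and omits the failure-probability bookkeeping of the real construction). Each index survives to level j with probability 2^{−j}; at the level where a single index survives, a (count, index-weighted sum, fingerprint) triple recovers it:

```python
import random

class L0Sampler:
    """Idealized ℓ₀-sampler for a dynamic nonnegative vector x ∈ Z^n
    under increments and decrements (insertions/deletions)."""
    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        self.n = n
        self.levels = n.bit_length() + 1
        self.rank = [rng.random() for _ in range(n)]           # level membership
        self.tag = [rng.randrange(1 << 61) for _ in range(n)]  # fingerprint tags
        self.cnt = [0] * self.levels  # Σ x_i over surviving i
        self.sm = [0] * self.levels   # Σ i·x_i over surviving i
        self.fp = [0] * self.levels   # Σ tag[i]·x_i (verifies 1-sparsity)

    def update(self, i, delta):
        for j in range(self.levels):
            if self.rank[i] < 2.0 ** (-j):  # i survives to level j
                self.cnt[j] += delta
                self.sm[j] += i * delta
                self.fp[j] += self.tag[i] * delta

    def sample(self):
        for j in range(self.levels):
            c, s, f = self.cnt[j], self.sm[j], self.fp[j]
            if c > 0 and s % c == 0:
                i = s // c
                if 0 <= i < self.n and f == self.tag[i] * c:
                    return i  # level j is (almost surely) 1-sparse
        return None  # FAIL
```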

Problem Statement.

In this work, we consider the problem of estimating the value of MAX-CUT of a graph G in the learning-augmented streaming model. Formally, given an undirected, unweighted graph G = (V,E), represented as a sequence of edges σ := ⟨e₁, e₂, …⟩ with e_i ∈ E, our goal is to scan the sequence in one pass and output an estimate of the MAX-CUT value while minimizing space usage. When the sequence contains only edge insertions, it is referred to as an insertion-only stream. When the sequence contains both edge insertions and deletions, it is referred to as a dynamic stream. Note that in dynamic streams, the stream is often represented as σ := ⟨(e₁,Δ₁), (e₂,Δ₂), …⟩, where for each i, e_i ∈ E and Δ_i = +1 (resp. −1) denotes an edge insertion (resp. deletion).

Furthermore, we assume that the algorithms are equipped with an oracle 𝒪, which provides ϵ-accurate predictions Y ∈ {−1,+1}ⁿ, where ϵ ∈ (0, 1/2]. This information is provided via an external oracle, and for the purpose of our algorithm, we assume that the edges in the stream are annotated with the predictions of their endpoints.

Specifically, for each vertex v ∈ V, Pr[Y_v = x*_v] = 1/2 + ϵ and Pr[Y_v = −x*_v] = 1/2 − ϵ, where x* ∈ {−1,+1}ⁿ is some fixed but unknown optimal solution. Following previous works on MAX-CUT with ϵ-accurate predictions [14, 21], we also assume that the (Y_v)_{v∈V} are independent.

Finally, since we assume that the accuracy parameter of the predictions, ϵ, is known to the algorithm up to a constant factor in advance, and the space complexity of all algorithms in the paper is at most poly(log n, 1/ϵ, 1/δ), we will assume throughout that m = Ω(1/(ϵ¹¹δ⁷)). Otherwise, when m = O(1/(ϵ¹¹δ⁷)), we can simply store all edges in poly(1/ϵ, 1/δ) space and compute the MAX-CUT exactly.

3 The Simple Case of Low-Degree Graphs

In this section, we show that when all vertices have low degree, then, following the ϵ-accurate predictions, we can beat the 1/2-approximation barrier for streaming MAX-CUT. Informally, when an edge (u,v) arrives in the stream, we consult the noisy oracle 𝒪, and if the endpoints are predicted to lie on different sides, i.e., 𝒪(u) ≠ 𝒪(v), we increase our estimate of the cut value by one. This can be implemented using O(log n) bits of space. Refer to Algorithm 1 for a formal description.

Algorithm 1 A learning-augmented streaming algorithm for estimating the MAX-CUT value in low-degree graphs.
1: Input: Graph G with Δ < ϵ²δm/4 − 1/(4ϵ²), given as a stream of edges; an ϵ-accurate oracle 𝒪: V → {−1,+1}
2: Output: An estimate of the MAX-CUT value of G
3: initialize X ← 0.
4: for each edge (u,v) in the stream do
5:   Y_u ← 𝒪(u), Y_v ← 𝒪(v)
6:   if Y_u ≠ Y_v then X ← X + 1
7: return X
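A direct Python rendering of Algorithm 1 (our sketch; `oracle` can be, e.g., the simulated ϵ-accurate predictor from the introduction):

```python
def low_degree_maxcut_estimate(stream, oracle):
    """Algorithm 1: count the edges whose endpoints receive different
    predicted labels; X is exactly the value of the cut (V⁺, V⁻)."""
    X = 0
    for u, v in stream:
        if oracle(u) != oracle(v):
            X += 1
    return X
```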

Next, we analyze the space complexity and approximation guarantee of Algorithm 1.

Theorem 5.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). Given an ϵ-accurate oracle 𝒪 for the MAX-CUT of G, if the maximum degree Δ in G satisfies Δ < ϵ²δm/4 − 1/(4ϵ²), then with probability at least 1 − δ, Algorithm 1 outputs a (1/2 + ϵ²)-approximation of the MAX-CUT value of G. The algorithm uses O(1) words of space.

Proof.

We first analyze the space complexity. In Algorithm 1, we maintain a counter X for the edges whose endpoints have different predicted signs, which takes O(log n) bits of space. Therefore, the space complexity of the algorithm is O(1) words.

Next, we show that Algorithm 1 with high probability outputs a (1/2 + ϵ²)-approximation of the MAX-CUT value when all vertices have low degree.

Since (V⁺, V⁻) is a feasible cut of G, where V⁺ (resp. V⁻) is the set of vertices with predicted + (resp. −) signs by the given ϵ-accurate oracle 𝒪, we have X ≤ OPT, where OPT denotes the (optimal) MAX-CUT value of G. For each edge (u,v) ∈ E, define an indicator random variable X_uv such that X_uv = 1 if Y_u ≠ Y_v, and X_uv = 0 otherwise.

Let x* be the assignment vector corresponding to the MAX-CUT of G: if (S, V∖S) is the MAX-CUT of G, then x*_v = 1 if v ∈ S, and x*_v = −1 otherwise. Consider the following two cases:

1. (I.1) If x*_u ≠ x*_v, then Pr[Y_u ≠ Y_v] = Pr[Y_u = x*_u ∧ Y_v = x*_v] + Pr[Y_u ≠ x*_u ∧ Y_v ≠ x*_v] = (1/2+ϵ)² + (1/2−ϵ)² = 1/2 + 2ϵ².

2. (I.2) If x*_u = x*_v, then Pr[Y_u ≠ Y_v] = Pr[Y_u = x*_u ∧ Y_v ≠ x*_v] + Pr[Y_u ≠ x*_u ∧ Y_v = x*_v] = 2(1/2+ϵ)(1/2−ϵ) = 1/2 − 2ϵ².

Note that X = Σ_{(u,v)∈E} X_uv and 𝔼[X_uv] = Pr[Y_u ≠ Y_v]. Then, by the linearity of expectation,

𝔼[X] = Σ_{(u,v)∈E : x*_u≠x*_v} 𝔼[X_uv] + Σ_{(u,v)∈E : x*_u=x*_v} 𝔼[X_uv]
     = OPT·(1/2 + 2ϵ²) + (m − OPT)·(1/2 − 2ϵ²)
     = (1/2 − 2ϵ²)·m + 4ϵ²·OPT
     ≥ (1/2 + 2ϵ²)·OPT,

where the last inequality holds since 1/2 − 2ϵ² ≥ 0 for ϵ ∈ (0, 1/2] and m ≥ OPT.

Next, we bound Var[X]. Note that Var[X_uv] = 𝔼[X_uv] − 𝔼[X_uv]², since the (X_uv)_{(u,v)∈E} are indicator random variables. By (I.1) and (I.2), Var[X_uv] = 1/4 − 4ϵ⁴ in both cases:

1. (II.1) If x*_u ≠ x*_v, then Var[X_uv] = (1/2 + 2ϵ²) − (1/2 + 2ϵ²)² = 1/4 − 4ϵ⁴.

2. (II.2) If x*_u = x*_v, then Var[X_uv] = (1/2 − 2ϵ²) − (1/2 − 2ϵ²)² = 1/4 − 4ϵ⁴.

For any pair of distinct edges (u,v), (u,w) ∈ E sharing a vertex, we compute 𝔼[X_uv·X_uw] = Pr[X_uv = X_uw = 1] = Pr[Y_v = Y_w ≠ Y_u] and Cov[X_uv, X_uw] = 𝔼[X_uv·X_uw] − 𝔼[X_uv]·𝔼[X_uw].

1. (III.1) If x*_u = x*_v and x*_u = x*_w, then

   𝔼[X_uv·X_uw] = Pr[Y_u = x*_u ∧ Y_v ≠ x*_v ∧ Y_w ≠ x*_w] + Pr[Y_u ≠ x*_u ∧ Y_v = x*_v ∧ Y_w = x*_w]
                = (1/2+ϵ)(1/2−ϵ)² + (1/2−ϵ)(1/2+ϵ)² = 1/4 − ϵ².

   Therefore, Cov[X_uv, X_uw] = (1/4 − ϵ²) − (1/2 − 2ϵ²)² = ϵ² − 4ϵ⁴.

2. (III.2) If x*_u = x*_v and x*_u ≠ x*_w, then

   𝔼[X_uv·X_uw] = Pr[Y_u = x*_u ∧ Y_v ≠ x*_v ∧ Y_w = x*_w] + Pr[Y_u ≠ x*_u ∧ Y_v = x*_v ∧ Y_w ≠ x*_w]
                = (1/2+ϵ)²(1/2−ϵ) + (1/2−ϵ)²(1/2+ϵ) = 1/4 − ϵ².

   Therefore, Cov[X_uv, X_uw] = (1/4 − ϵ²) − (1/2 − 2ϵ²)(1/2 + 2ϵ²) = 4ϵ⁴ − ϵ².

3. (III.3) If x*_u ≠ x*_v and x*_u = x*_w, then

   𝔼[X_uv·X_uw] = Pr[Y_u = x*_u ∧ Y_v = x*_v ∧ Y_w ≠ x*_w] + Pr[Y_u ≠ x*_u ∧ Y_v ≠ x*_v ∧ Y_w = x*_w]
                = (1/2+ϵ)²(1/2−ϵ) + (1/2−ϵ)²(1/2+ϵ) = 1/4 − ϵ².

   Therefore, Cov[X_uv, X_uw] = (1/4 − ϵ²) − (1/2 + 2ϵ²)(1/2 − 2ϵ²) = 4ϵ⁴ − ϵ².

4. (III.4) If x*_u ≠ x*_v and x*_u ≠ x*_w, then

   𝔼[X_uv·X_uw] = Pr[Y_u = x*_u ∧ Y_v = x*_v ∧ Y_w = x*_w] + Pr[Y_u ≠ x*_u ∧ Y_v ≠ x*_v ∧ Y_w ≠ x*_w]
                = (1/2+ϵ)³ + (1/2−ϵ)³ = 1/4 + 3ϵ².

   Therefore, Cov[X_uv, X_uw] = (1/4 + 3ϵ²) − (1/2 + 2ϵ²)² = ϵ² − 4ϵ⁴.

Hence, in all cases Cov[X_uv, X_uw] ≤ ϵ² − 4ϵ⁴. So,

Var[X] = Σ_{(u,v)∈E} Var[X_uv] + Σ_{(u,v),(u,w)∈E, v≠w} Cov[X_uv, X_uw] + Σ_{(u,v),(w,z)∈E, u,v,w,z all distinct} Cov[X_uv, X_wz]
       ≤ m·(1/4 − 4ϵ⁴) + mΔ·(ϵ² − 4ϵ⁴) + 0 ≤ (1/4 + Δϵ²)·m.

Then, by applying Chebyshev’s inequality,

Pr[X ≤ (1/2 + ϵ²)·OPT] ≤ Pr[|X − 𝔼[X]| ≥ ϵ²·OPT]
                       ≤ Var[X]/(ϵ⁴·OPT²)
                       ≤ (1/4 + Δϵ²)·m/(ϵ⁴·OPT²)      (Var[X] ≤ (1/4 + Δϵ²)·m)
                       ≤ 4(1/4 + Δϵ²)/(ϵ⁴·m)           (OPT ≥ m/2)
                       < δ.                             (Δ < ϵ²δm/4 − 1/(4ϵ²))

So, with probability at least 1 − δ, X ≥ (1/2 + ϵ²)·OPT.

4 Our Algorithm for General Graphs

In this section, we present our (1/2 + Ω(ϵ²))-approximation for streaming MAX-CUT in poly(1/ϵ, 1/δ) space using ϵ-accurate predictions. To better present our algorithmic ideas, we first consider the problem in random-order streams in Section 4.1. Next, in Section 4.2, we show how to extend our algorithm to arbitrary-order streams using more advanced sketching techniques. Finally, in Section 4.3, we show that our algorithm also works in dynamic streams, where both edge insertions and deletions are allowed.

Offline Implementation.

We first describe our algorithm in the offline setting. Recall that our algorithm from Section 3 is effective when the maximum degree of the graph is smaller than a pre-specified threshold ϕ = Θ(ϵ²δm), where δ is the target failure probability of the algorithm. Let H and L respectively denote the sets of high-degree (i.e., degree at least ϕ) and low-degree (i.e., degree less than ϕ) vertices of the input graph G = (V,E). Now, by the guarantee of Theorem 5, suppose we have a (1/2 + ϵ²)-approximation of MAX-CUT on the induced subgraph G[L], denoted by (L⁺, L⁻); this cut is exactly the cut suggested by the given ϵ-accurate oracle 𝒪. Next, we extend this cut using the vertices in H in a greedy manner: we pick the vertices of H one by one in an arbitrary order, and add each to either the L⁺ or the L⁻ side so that the size of the cut is maximized. We denote the resulting cut by (C, V∖C). The other cut we consider is simply (H, L). We show that at least one of these two cuts is a (1/2 + Ω(ϵ²))-approximation for MAX-CUT on G with high probability (a code sketch of the greedy extension is given below).

Throughout the paper, we define high-degree vertices as those with degree at least ϵ²m/c, where c = 80/δ. Vertices with degree below this threshold are considered low-degree.
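To illustrate the offline greedy extension described above, a sketch under the assumption that the edge list and the partition (H, L⁺, L⁻) are available offline (every endpoint is assumed to lie in H ∪ L⁺ ∪ L⁻):

```python
def greedy_extend(edges, H, L_plus, L_minus):
    """Place each high-degree vertex v ∈ H, in arbitrary order, on the
    side that cuts more of its edges to already-placed vertices, and
    return the value of the resulting cut (C, V∖C)."""
    side = {v: +1 for v in L_plus}
    side.update({v: -1 for v in L_minus})
    for v in H:
        to_plus = sum(1 for u, w in edges
                      if (u == v and side.get(w) == +1) or (w == v and side.get(u) == +1))
        to_minus = sum(1 for u, w in edges
                       if (u == v and side.get(w) == -1) or (w == v and side.get(u) == -1))
        side[v] = -1 if to_plus >= to_minus else +1  # join the side cutting more edges
    return sum(1 for u, w in edges if side[u] != side[w])
```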

Theorem 6.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). The better of the cuts (C, V∖C) and (H, L) is a (1/2 + ϵ²/16)-approximation for MAX-CUT on G with probability at least 1 − δ.

Without loss of generality, we assume that e(H,L) < (1/2 + ϵ²)·OPT; otherwise we are done. Then it suffices to show that e(C, V∖C) ≥ (1/2 + ϵ²/16)·OPT. Recall that the cut (C, V∖C) is obtained by extending the cut (L⁺, L⁻) of G[L], suggested by the given ϵ-accurate oracle, using the vertices in H in a greedy manner. We first show that Algorithm 1 works for the induced subgraph G[L].

Lemma 7.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). Suppose that e(H,L) < (1/2 + ϵ²)·OPT, c = 80/δ, and m > c³/(δϵ⁴) + 1/(4ϵ⁴). Then there exists a (1/2 + ϵ²)-approximation of the MAX-CUT value of G[L] with probability at least 1 − δ.

Proof.

Note that m = m_L + m_H + e(H,L), where m_L (resp. m_H) is the number of edges of G[L] (resp. G[H]). Then we have m_L = m − m_H − e(H,L) > m − 4c²/ϵ⁴ − (1/2 + ϵ²)·m, since |H| ≤ 2m/(ϵ²m/c) = 2c/ϵ² (so that m_H ≤ |H|² ≤ 4c²/ϵ⁴) and e(H,L) < (1/2 + ϵ²)·OPT ≤ (1/2 + ϵ²)·m.

Since the high-degree threshold is ϵ²m/c, we have Δ_L < ϵ²m/c, where Δ_L is the maximum degree of G[L]. It is easy to check that ϵ²m/c < ϵ²δ·(m − 4c²/ϵ⁴ − (1/2 + ϵ²)·m)/4 − 1/(4ϵ²) by substituting in the conditions on c and m. It follows that Δ_L < ϵ²δm_L/4 − 1/(4ϵ²).

Therefore, by the guarantee of Theorem 5, with probability at least 1 − δ, there exists a (1/2 + ϵ²)-approximation of MAX-CUT on G[L], denoted by (L⁺, L⁻); i.e., e(L⁺, L⁻) ≥ (1/2 + ϵ²)·OPT_L, where OPT_L is the MAX-CUT value of G[L].

Lemma 8.

Let ϵ ∈ (0, 1/2]. Suppose that e(H,L) = α·OPT (where α < 1/2 + ϵ²) and m > 8c²/ϵ⁸. Then OPT_L > (1 − α − ϵ⁴)·OPT, where OPT_L is the MAX-CUT value of G[L].

Proof.

Note that OPT ≤ OPT_L + OPT_H + e(H,L), where OPT_L (resp. OPT_H) is the MAX-CUT value of G[L] (resp. G[H]). It follows that

OPT_L ≥ OPT − OPT_H − e(H,L)
      ≥ OPT − 4c²/ϵ⁴ − e(H,L)      (OPT_H ≤ m_H ≤ |H|² ≤ 4c²/ϵ⁴)
      = OPT − 4c²/ϵ⁴ − α·OPT        (e(H,L) = α·OPT)
      > OPT − ϵ⁴·OPT − α·OPT        (4c²/ϵ⁴ < ϵ⁴m/2 ≤ ϵ⁴·OPT)
      = (1 − α − ϵ⁴)·OPT.

Proof of Theorem 6.

Since the algorithm returns the better of the two cuts, if the value of the cut (H,L) is at least (1/2 + ϵ²)·OPT (i.e., e(H,L) ≥ (1/2 + ϵ²)·OPT), then we are done. Hence, without loss of generality, we can assume that the value of the cut (H,L) is strictly less than (1/2 + ϵ²)·OPT. Based on Lemma 7 and Lemma 8, we have

e(C, V∖C) ≥ e(L⁺, L⁻) + Σ_{v∈H} max{e(v,L⁺), e(v,L⁻)}
          ≥ (1/2 + ϵ²)·OPT_L + Σ_{v∈H} max{e(v,L⁺), e(v,L⁻)}      (e(L⁺,L⁻) ≥ (1/2 + ϵ²)·OPT_L)
          ≥ (1/2 + ϵ²)·OPT_L + e(H,L)/2                            (max{e(v,L⁺), e(v,L⁻)} ≥ e(v,L)/2)
          > (1/2 + ϵ²)·(1 − α − ϵ⁴)·OPT + (α/2)·OPT                (OPT_L > (1 − α − ϵ⁴)·OPT)
          = (1/2 + (1 − α)·ϵ² − ϵ⁴/2 − ϵ⁶)·OPT
          > (1/2 + (1/2 − ϵ²)·ϵ² − ϵ⁴/2 − ϵ⁶)·OPT                  (α < 1/2 + ϵ²)
          ≥ (1/2 + ϵ²/16)·OPT,                                     ((1/2 − ϵ²)·ϵ² − ϵ⁴/2 − ϵ⁶ ≥ ϵ²/16)

with probability at least 1 − δ.

4.1 Warm-up: Random Order Streams

Overview of the Algorithm.

Suppose that the edges of G = (V,E) arrive one by one in a random-order stream. At a high level, we would like to run our algorithm from Section 3 on the low-degree vertices L (i.e., on the induced subgraph G[L]), and then, using a greedy approach, add the high-degree vertices H to the constructed cut. However, the sets H and L are not known a priori, and it is not clear how to store the required information to run the described algorithm space-efficiently.

To detect high-degree vertices and collect sufficient information to run the greedy approach on them at the end of the stream, we rely on the random order of the stream. We store a small number of edges from the beginning of the stream and then gather degree information for those vertices that are candidates for being high-degree. More precisely, we store the first poly(1/ϵ, 1/δ) edges of the (random-order) stream and use them to identify a set H̃ of size poly(1/ϵ, 1/δ) that contains all high-degree vertices H with probability 1 − δ. We then store all edges between vertices in H̃ throughout the stream, which requires poly(1/ϵ, 1/δ) words of space. Additionally, for every vertex v ∈ H̃, we maintain the number of edges between v and L̃⁺ := V⁺∖H̃, as well as between v and L̃⁻ := V⁻∖H̃. It is straightforward to check that these counters require poly(1/ϵ, 1/δ) words of space in total. We then apply Algorithm 1 to the graph G[V∖H̃] to approximate the MAX-CUT value of G[V∖H̃].

Finally, at the end of the stream, since we can compute the degrees of all vertices in H̃ exactly, we can determine the set of high-degree vertices H ⊆ H̃. First, for the remaining vertices S̃ := H̃∖H, we (hypothetically) feed them to our algorithm for the low-degree vertices. Note that we have all edges between any pair of vertices in S̃, as well as the number of their incident edges to L̃⁺ and L̃⁻. Therefore, we can compute the size of the (L⁺, L⁻) cut exactly. Now it only remains to run the greedy algorithm for the high-degree vertices H. Similarly, since we have computed the number of incident edges of the high-degree vertices to L̃⁺ and L̃⁻, and we have stored all edges with both endpoints in H̃, we can perform the greedy extension of (L̃⁺∪S̃⁺, L̃⁻∪S̃⁻) using H. Moreover, using the same degree information, we can compute the size of (H, V∖H) as well. We then return the better of these two cuts as our estimate of the MAX-CUT value of G.

Next, we prove the approximation guarantee and the space complexity of the algorithm. The algorithm is formally given in Algorithm 2.

Algorithm 2 Estimating the MAX-CUT value in (insertion-only) random-order streams.
1: Input: Graph G as a random-order stream of edges; an ϵ-accurate oracle 𝒪: V → {−1,+1}; a high-degree threshold θ = ϵ²m/c (where c = 80/δ).
2: Output: An estimate of the MAX-CUT value of G.
3: ▷ Preprocessing phase
4: initialize F ← ∅, H̃ ← ∅, e(L⁺,L⁻) ← 0.
5: ▷ Streaming phase
6: for each edge (u,v) among the first β/(δ³ϵ⁴) edges of the stream (where β is a sufficiently large universal constant) do
7:   F ← F ∪ {(u,v)}, H̃ ← H̃ ∪ {u,v}
8: for each vertex v ∈ H̃ do
9:   initialize e(v, V∖H̃) ← 0, e(v, L⁺) ← 0, e(v, L⁻) ← 0.
10: for each remaining edge (u,v) in the stream do
11:   Y_u ← 𝒪(u), Y_v ← 𝒪(v)
12:   if u ∈ H̃ and v ∈ H̃ then F ← F ∪ {(u,v)}
13:   else
14:     if u ∈ H̃ and v ∈ V∖H̃ then e(u, V∖H̃) ← e(u, V∖H̃) + 1
15:     if v ∈ H̃ and u ∈ V∖H̃ then e(v, V∖H̃) ← e(v, V∖H̃) + 1
16:   if u ∈ H̃ and v ∈ V∖H̃ then
17:     if Y_v = 1 then e(u, L⁺) ← e(u, L⁺) + 1 else e(u, L⁻) ← e(u, L⁻) + 1 (symmetrically, if v ∈ H̃ and u ∈ V∖H̃, update e(v, L⁺) or e(v, L⁻) according to Y_u)
18:   if u ∈ V∖H̃ and v ∈ V∖H̃ and Y_u ≠ Y_v then
19:     e(L⁺,L⁻) ← e(L⁺,L⁻) + 1 (i.e., apply Algorithm 1 on G[V∖H̃])
20: ▷ Postprocessing phase
21: H ← {v ∈ H̃ : |{e ∈ F : v ∈ e}| + e(v, V∖H̃) ≥ θ}, L ← V∖H
22: for each edge (u,v) ∈ F do
23:   Y_u ← 𝒪(u), Y_v ← 𝒪(v)
24:   if u ∈ H and v ∈ L then
25:     if Y_v = 1 then e(u, L⁺) ← e(u, L⁺) + 1 else e(u, L⁻) ← e(u, L⁻) + 1 (symmetrically if v ∈ H and u ∈ L)
26:   if u ∈ H̃∖H and v ∈ H̃∖H and Y_u ≠ Y_v then e(L⁺,L⁻) ← e(L⁺,L⁻) + 1
27: for each vertex v ∈ H̃∖H do
28:   Y_v ← 𝒪(v)
29:   if Y_v = 1 then e(L⁺,L⁻) ← e(L⁺,L⁻) + e(v, L⁻) else e(L⁺,L⁻) ← e(L⁺,L⁻) + e(v, L⁺)
30: ALG₁ ← e(L⁺,L⁻) + Σ_{v∈H} max{e(v, L⁻), e(v, L⁺)}
31: ALG₂ ← Σ_{v∈H} (e(v, L⁻) + e(v, L⁺))
32: return max{ALG₁, ALG₂}
Theorem 9.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). Given an ϵ-accurate oracle 𝒪 for the MAX-CUT of G, there exists a single-pass (1/2 + ϵ²/16)-approximation algorithm for estimating the MAX-CUT value of G in insertion-only random-order streams. The algorithm uses O(1/(δ⁶ϵ⁸)) words of space. The approximation guarantee holds with probability at least 1 − δ.

Lemma 10.

Let δ ∈ (0,1). Then H̃ contains all high-degree vertices of G with probability at least 1 − δ/2.

Proof.

Suppose that v is a high-degree vertex of G, i.e., deg(v) ≥ ϵ²m/c. For each edge e_i incident to v, define an indicator random variable

X_i = 1 if e_i is among the first β/(δ³ϵ⁴) edges of the random-order stream, and X_i = 0 otherwise.

So, 𝔼[X_i] = Pr[X_i = 1] = β/(δ³ϵ⁴m). We define X := Σ_{i∈[deg(v)]} X_i to denote the number of edges incident to v that appear among the first β/(δ³ϵ⁴) edges of the random-order stream. Then, 𝔼[X] = Σ_{i∈[deg(v)]} 𝔼[X_i] = β·deg(v)/(δ³ϵ⁴m).

Next, we bound Var[X]. For any i ∈ [deg(v)], Var[X_i] = 𝔼[X_i²] − (𝔼[X_i])² = 𝔼[X_i] − (𝔼[X_i])² = β/(δ³ϵ⁴m) − β²/(δ⁶ϵ⁸m²). Since the (X_i)_{i∈[deg(v)]} are negatively correlated, Var[X] ≤ Σ_{i∈[deg(v)]} Var[X_i] = deg(v)·(β/(δ³ϵ⁴m) − β²/(δ⁶ϵ⁸m²)) ≤ β·deg(v)/(δ³ϵ⁴m). Therefore, by Chebyshev’s inequality,

Pr[v ∉ H̃] = Pr[X = 0] ≤ Pr[|X − 𝔼[X]| ≥ β·deg(v)/(δ³ϵ⁴m)]
           ≤ Var[X]·δ⁶ϵ⁸m²/(β²·deg(v)²)
           ≤ δ³ϵ⁴m/(β·deg(v))        (Var[X] ≤ β·deg(v)/(δ³ϵ⁴m))
           ≤ δ³ϵ²c/β.                 (deg(v) ≥ ϵ²m/c)

By our definition of high-degree vertices, there are at most 2m/(ϵ²m/c) = 2c/ϵ² high-degree vertices in G. By a union bound, with probability at least 1 − (2c/ϵ²)·(δ³ϵ²c/β) = 1 − 2c²δ³/β ≥ 1 − δ/2 (since c = 80/δ and β is sufficiently large), H̃ contains all high-degree vertices of G.

Proof of Theorem 9.

In Algorithm 2, we store the first β/(δ³ϵ⁴) edges of the stream (Line 7) and the remaining edges with both endpoints in H̃ (Line 12). We also maintain several counters for the vertices in H̃. Therefore, the total space complexity of Algorithm 2 is O(1/(δ⁶ϵ⁸)) words.

Next, we show the approximation guarantee of Algorithm 2. Note that at the end of the stream we can identify H exactly, using the information stored during the streaming phase. In Algorithm 2, we apply Algorithm 1 to the subgraph G[V∖H̃] during the streaming phase (Line 19). Hypothetically, during the postprocessing phase, we apply Algorithm 1 to the subgraph G[H̃∖H] (Line 26) and to the edges with one endpoint in H̃∖H and the other endpoint in V∖H̃ (Line 29). This is equivalent to applying Algorithm 1 to G[L], since L = (V∖H̃) ∪ (H̃∖H). The approximation guarantee of Algorithm 2 then follows directly from Theorem 6 (with failure probability δ/2) and Lemma 10 by a union bound.

4.2 Arbitrary Order Streams

Overview of the Algorithm.

Unlike random-order streams, where we can identify a set H̃ containing the high-degree vertices H by storing only a small number of edges from the beginning of the stream, finding high-degree vertices is more challenging in arbitrary-order streams. To handle this, we employ reservoir sampling [47]. Specifically, we uniformly sample poly(1/ϵ, 1/δ) edges from the stream. Then, at the end of the stream, we use these sampled edges to compute a small set H̃ such that H̃ ⊇ H. This approach is similar to the one we used in the random-order setting; however, here we can only retrieve H̃, rather than H, at the end of the stream.

We also need to estimate the number of edges incident to the high-degree vertices. To this end, we use techniques from vector sketching, particularly those developed for frequency estimation and heavy hitters. Specifically, we maintain two randomized summaries of the CountMin sketch [15], corresponding to V⁺ and V⁻ and denoted by CM[V⁺] and CM[V⁻], respectively. Intuitively, by the end of the stream, CM[V⁺] and CM[V⁻] contain estimates of the number of edges from each high-degree vertex to the sets V⁺ and V⁻, respectively. More precisely, as an edge (u,v) arrives, we increment the counters related to u in CM[V^{𝒪(v)}] and the counters related to v in CM[V^{𝒪(u)}], where 𝒪(v) and 𝒪(u) are the predicted signs of v and u by the given ϵ-accurate oracle 𝒪.

Note that we cannot detect H exactly by the end of the stream; instead, we have H̃. We use CM[V⁺] and CM[V⁻] to approximately compute the value of the cut (L̃⁺, L̃⁻) on the induced subgraph G[L̃], where L̃ := V∖H̃. Then, using CM[V⁺] and CM[V⁻], we can approximately run the greedy approach on H̃. Unlike the random-order setting, where we could implement the greedy extension exactly, here we only have approximate values. However, we show that our estimates of the number of edges from the high-degree vertices to the sets V⁺ and V⁻ are off by only poly(ϵ,δ)·m additive terms with high probability, and they suffice to provide a strictly better than 1/2-approximation for MAX-CUT on G[L̃] and its extension with H̃. Similarly, we can use CM[V⁺] and CM[V⁻] to approximately compute the size of (H̃, L̃) as well. Then, we show that for a sufficiently small value of ϵ, the better of the two candidate cuts is a (1/2 + Ω(ϵ²))-approximation for MAX-CUT on G with probability 1 − δ.
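A sketch of the streaming update this overview describes, reusing the CountMin class from Section 2 (our illustration; parameters as in Algorithm 3):

```python
def process_edge(cm, e_cross, u, v, oracle):
    """cm is a dict {+1: CountMin, -1: CountMin} for the sides V⁺ and V⁻.
    On edge (u,v), u gains a neighbor on v's predicted side, so index u
    is updated in cm[oracle(v)] (and symmetrically for v); afterwards
    cm[+1].query(u) overestimates e(u, V⁺) by at most the sketch error.
    e_cross accumulates e(V⁺, V⁻), the cut suggested by the oracle."""
    cm[oracle(v)].update(u)
    cm[oracle(u)].update(v)
    return e_cross + (1 if oracle(u) != oracle(v) else 0)
```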

Next, we prove the approximation guarantee and the space complexity of our algorithm for arbitrary order streams. The algorithm is formally given in Algorithm 3.

Algorithm 3 Estimating the MAX-CUT value in (insertion-only) arbitrary-order streams.
1: Input: Graph G as an arbitrary-order stream of edges; an ϵ-accurate oracle 𝒪: V → {−1,+1}
2: Output: An estimate of the MAX-CUT value of G
3: ▷ Preprocessing phase
4: Initialize e(V⁺,V⁻) ← 0.
5: Let h₁, …, h_k : [n] → [w] be 2-wise independent hash functions, with k = e/(ϵ⁷δ³) and w = ln(8β/(ϵ⁴δ⁴)) (where β is a sufficiently large universal constant).
6: Initialize CM[V⁺] and CM[V⁻] to zero.
7: ▷ Streaming phase
8: for each edge (u,v) in the stream do
9:   Y_u ← 𝒪(u), Y_v ← 𝒪(v)
10:  if Y_u ≠ Y_v then e(V⁺,V⁻) ← e(V⁺,V⁻) + 1
11:  for each ℓ ∈ [k] do
12:    CM[V^{𝒪(u)}][ℓ, h_ℓ(v)] ← CM[V^{𝒪(u)}][ℓ, h_ℓ(v)] + 1
13:    CM[V^{𝒪(v)}][ℓ, h_ℓ(u)] ← CM[V^{𝒪(v)}][ℓ, h_ℓ(u)] + 1
14: In parallel, uniformly sample β/(δ³ϵ⁴) edges F from the stream via reservoir sampling [47].
15: ▷ Postprocessing phase
16: for each v ∈ V do
17:   f_v⁺ ← min_{ℓ∈[k]} CM[V⁺][ℓ, h_ℓ(v)]
18:   f_v⁻ ← min_{ℓ∈[k]} CM[V⁻][ℓ, h_ℓ(v)]
19: H̃ := ∪_{(u,v)∈F} {u, v}
20: H̃⁺ ← {v ∈ H̃ : Y_v = 1}, H̃⁻ ← H̃∖H̃⁺
21: ẽ(L̃⁺,L̃⁻) ← e(V⁺,V⁻) − Σ_{v∈H̃⁺} f_v⁻ − Σ_{v∈H̃⁻} f_v⁺
22: ALG₁ ← ẽ(L̃⁺,L̃⁻) + Σ_{v∈H̃} max{f_v⁻, f_v⁺}
23: ALG₂ ← Σ_{v∈H̃} (f_v⁻ + f_v⁺)
24: return max{ALG₁, ALG₂}

Theorem 11.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). Given an ϵ-accurate oracle 𝒪 for the MAX-CUT of G, there exists a single-pass (1/2 + Ω(ϵ²))-approximation algorithm for estimating the MAX-CUT value of G in insertion-only arbitrary-order streams. The algorithm uses O((1/(ϵ⁷δ³))·ln(1/(ϵδ))) words of space. The approximation guarantee holds with probability at least 1 − δ.

Lemma 12.

Let δ ∈ (0,1). Then H̃ contains all high-degree vertices H of G with probability at least 1 − δ/2.

Proof.

The proof is essentially the same as that of Lemma 10. Suppose that v is a high-degree vertex of G, i.e., deg(v) ≥ ϵ²m/c. For each edge e_i incident to v, define an indicator random variable

X_i = 1 if e_i ∈ F, and X_i = 0 otherwise.

So, 𝔼[X_i] = Pr[X_i = 1] = β/(δ³ϵ⁴m). We define X := Σ_{i∈[deg(v)]} X_i to denote the number of edges incident to v that are sampled by the end of the stream. Then, 𝔼[X] = Σ_{i∈[deg(v)]} 𝔼[X_i] = β·deg(v)/(δ³ϵ⁴m).

Next, we bound Var[X]. For any i ∈ [deg(v)], Var[X_i] = 𝔼[X_i²] − (𝔼[X_i])² = 𝔼[X_i] − (𝔼[X_i])² = β/(δ³ϵ⁴m) − β²/(δ⁶ϵ⁸m²). Since the (X_i)_{i∈[deg(v)]} are negatively correlated, Var[X] ≤ Σ_{i∈[deg(v)]} Var[X_i] = deg(v)·(β/(δ³ϵ⁴m) − β²/(δ⁶ϵ⁸m²)) ≤ β·deg(v)/(δ³ϵ⁴m). Therefore, by Chebyshev’s inequality,

Pr[v ∉ H̃] = Pr[X = 0] ≤ Pr[|X − 𝔼[X]| ≥ β·deg(v)/(δ³ϵ⁴m)]
           ≤ Var[X]·δ⁶ϵ⁸m²/(β²·deg(v)²)
           ≤ δ³ϵ⁴m/(β·deg(v))        (Var[X] ≤ β·deg(v)/(δ³ϵ⁴m))
           ≤ δ³ϵ²c/β.                 (deg(v) ≥ ϵ²m/c)

By our definition of high-degree vertices, there are at most 2m/(ϵ²m/c) = 2c/ϵ² high-degree vertices in G. By a union bound, with probability at least 1 − (2c/ϵ²)·(δ³ϵ²c/β) = 1 − 2c²δ³/β ≥ 1 − δ/2 (since c = 80/δ and β is sufficiently large), H̃ contains all high-degree vertices of G.

Lemma 13.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). For all vertices v ∈ H̃: (1) f_v⁺ ≥ e(v,V⁺) and f_v⁻ ≥ e(v,V⁻); (2) with probability at least 1 − δ/4, f_v⁺ ≤ e(v,V⁺) + 2ϵ⁷δ³m and f_v⁻ ≤ e(v,V⁻) + 2ϵ⁷δ³m.

Proof.

(1) By Theorem 2, we have f_v⁺ ≥ e(v,V⁺) and f_v⁻ ≥ e(v,V⁻) for all v ∈ V.

(2) By Theorem 2, for any v ∈ V, we have f_v⁺ ≤ e(v,V⁺) + 2ϵ⁷δ³m and f_v⁻ ≤ e(v,V⁻) + 2ϵ⁷δ³m with probability at least 1 − ϵ⁴δ⁴/(8β). Since |H̃| ≤ 2β/(δ³ϵ⁴), by a union bound, with probability at least 1 − (ϵ⁴δ⁴/(8β))·(2β/(δ³ϵ⁴)) = 1 − δ/4, we have f_v⁺ ≤ e(v,V⁺) + 2ϵ⁷δ³m and f_v⁻ ≤ e(v,V⁻) + 2ϵ⁷δ³m for all v ∈ H̃.

Proof of Theorem 11.

In Algorithm 3, we use a CountMin sketch with k = e/(ϵ⁷δ³) and w = ln(8β/(ϵ⁴δ⁴)) to estimate the number of edges from the high-degree vertices to V⁺ and V⁻, which takes O((1/(ϵ⁷δ³))·ln(1/(ϵ⁴δ⁴))) = O((1/(ϵ⁷δ³))·ln(1/(ϵδ))) words of space by Theorem 2. Also, we use reservoir sampling to uniformly sample O(1/(δ³ϵ⁴)) edges, which takes O(1/(δ³ϵ⁴)) words of space. Therefore, the total space complexity of Algorithm 3 is O((1/(ϵ⁷δ³))·ln(1/(ϵδ))) words.

Next, we show the approximation guarantee of Algorithm 3. Recall that in arbitrary-order streams we cannot detect H by the end of the stream; we have H̃ instead. Let A₁ := e(L̃⁺,L̃⁻) + Σ_{v∈H̃} max{e(v,L̃⁻), e(v,L̃⁺)} denote the value of the cut (C̃, V∖C̃) obtained by running Algorithm 1 on G[L̃] and then assigning the vertices in H̃ in a greedy manner. Let A₂ := Σ_{v∈H̃} (e(v,L̃⁻) + e(v,L̃⁺)) denote the value of the cut (H̃, L̃). Suppose that the better of the cuts (C̃, V∖C̃) and (H̃, L̃) is a (1/2 + ϵ²/16)-approximation of the MAX-CUT value of G. Recall that ALG₁ = ẽ(L̃⁺,L̃⁻) + Σ_{v∈H̃} max{f_v⁻, f_v⁺} and ALG₂ = Σ_{v∈H̃} (f_v⁻ + f_v⁺). In the following, we show that ALG₁ and ALG₂ are good approximations of A₁ and A₂, respectively. Therefore, max{ALG₁, ALG₂}, returned by Algorithm 3, is also a strictly better than 1/2-approximation of the MAX-CUT value of G.

Since V⁺ = H̃⁺ ∪ L̃⁺ and V⁻ = H̃⁻ ∪ L̃⁻, we have e(v,V⁺) = e(v,H̃⁺) + e(v,L̃⁺) and e(v,V⁻) = e(v,H̃⁻) + e(v,L̃⁻) for any vertex v ∈ V. By Lemma 13, with probability at least 1 − δ/4, for all vertices v ∈ H̃ we have

e(v,H̃⁺) + e(v,L̃⁺) ≤ f_v⁺ ≤ e(v,H̃⁺) + e(v,L̃⁺) + 2ϵ⁷δ³m,
e(v,H̃⁻) + e(v,L̃⁻) ≤ f_v⁻ ≤ e(v,H̃⁻) + e(v,L̃⁻) + 2ϵ⁷δ³m.

Since |H̃| ≤ 2β/(δ³ϵ⁴), we have 0 ≤ e(v,H̃⁺), e(v,H̃⁻) ≤ 2β/(δ³ϵ⁴). Therefore, we have

e(v,L̃⁺) ≤ f_v⁺ ≤ 2β/(δ³ϵ⁴) + 2ϵ⁷δ³m + e(v,L̃⁺),
e(v,L̃⁻) ≤ f_v⁻ ≤ 2β/(δ³ϵ⁴) + 2ϵ⁷δ³m + e(v,L̃⁻).

Then we have

0 ≤ f_v⁺ − e(v,L̃⁺) ≤ 2β/(δ³ϵ⁴) + 2ϵ⁷δ³m and 0 ≤ f_v⁻ − e(v,L̃⁻) ≤ 2β/(δ³ϵ⁴) + 2ϵ⁷δ³m,

and

max{e(v,L̃⁺), e(v,L̃⁻)} ≤ max{f_v⁺, f_v⁻} ≤ 2β/(δ³ϵ⁴) + 2ϵ⁷δ³m + max{e(v,L̃⁺), e(v,L̃⁻)}.

Note that

e(L̃⁺,L̃⁻) = e(V⁺,V⁻) − e(H̃⁺,H̃⁻) − Σ_{v∈H̃⁺} e(v,L̃⁻) − Σ_{v∈H̃⁻} e(v,L̃⁺),
ẽ(L̃⁺,L̃⁻) = e(V⁺,V⁻) − Σ_{v∈H̃⁺} f_v⁻ − Σ_{v∈H̃⁻} f_v⁺.

We have

e(L̃⁺,L̃⁻) − ẽ(L̃⁺,L̃⁻) = Σ_{v∈H̃⁺} (f_v⁻ − e(v,L̃⁻)) + Σ_{v∈H̃⁻} (f_v⁺ − e(v,L̃⁺)) − e(H̃⁺,H̃⁻)
                       ≤ (2β/(δ³ϵ⁴))·(2β/(δ³ϵ⁴) + 2ϵ⁷δ³m) + (2β/(δ³ϵ⁴))·(2β/(δ³ϵ⁴) + 2ϵ⁷δ³m) − 0
                       = 8β²/(δ⁶ϵ⁸) + 8βϵ³m.

Therefore,

A₁ − ALG₁ = (e(L̃⁺,L̃⁻) − ẽ(L̃⁺,L̃⁻)) + Σ_{v∈H̃} (max{e(v,L̃⁻), e(v,L̃⁺)} − max{f_v⁻, f_v⁺})
          ≤ (8β²/(δ⁶ϵ⁸) + 8βϵ³m) + (2β/(δ³ϵ⁴))·0 = 8β²/(δ⁶ϵ⁸) + 8βϵ³m.

So, ALG₁ ≥ A₁ − (8β²/(δ⁶ϵ⁸) + 8βϵ³m) = A₁ − Θ(ϵ³m).

Similarly,

A₂ − ALG₂ = Σ_{v∈H̃} ((e(v,L̃⁻) − f_v⁻) + (e(v,L̃⁺) − f_v⁺)) ≤ (2β/(δ³ϵ⁴))·(0 + 0) = 0.

So, ALG₂ ≥ A₂.

Since we assume that max{A₁, A₂} is a (1/2 + ϵ²/16)-approximation of the MAX-CUT value of G, we have ALG := max{ALG₁, ALG₂} ≥ max{A₁, A₂} − Θ(ϵ³m) ≥ (1/2 + ϵ²/16)·OPT − Θ(ϵ³·OPT) = (1/2 + Ω(ϵ²))·OPT.

Finally, it remains to show that the better of the cuts (C̃, V∖C̃) and (H̃, L̃) is a (1/2 + ϵ²/16)-approximation of the MAX-CUT value of G. This follows directly from Theorem 6 (with failure probability δ/4), by substituting H and L with H̃ and L̃, respectively. Together with Lemma 12 and Lemma 13, and applying a union bound, this concludes the proof.

4.3 Extension to Dynamic Streams

Overview of the Algorithm.

Our algorithm for dynamic streams is essentially the same as Algorithm 3. The only difference is that, to compute a small set H̃ ⊇ H at the end of the stream, we use ℓ₀-sampling [28] instead of reservoir sampling.

Next, we prove the approximation guarantee and the space complexity of our algorithm for dynamic streams. The algorithm is formally given in Algorithm 4.

Algorithm 4 Estimating the MAX-CUT value in dynamic streams.
1: Input: Graph G as a dynamic stream of edges; an ϵ-accurate oracle 𝒪: V → {−1,+1}
2: Output: An estimate of the MAX-CUT value of G
3: ▷ Preprocessing phase
4: Initialize e(V⁺,V⁻) ← 0.
5: Let h₁, …, h_k : [n] → [w] be 2-wise independent hash functions, with k = e/(ϵ⁷δ³) and w = ln(8β/(ϵ⁴δ⁴)) (where β is a sufficiently large universal constant).
6: Initialize CM[V⁺] and CM[V⁻] to zero.
7: ▷ Streaming phase
8: for each item ((u,v), Δ_{(u,v)}) in the stream do
9:   Y_u ← 𝒪(u), Y_v ← 𝒪(v)
10:  if Y_u ≠ Y_v then e(V⁺,V⁻) ← e(V⁺,V⁻) + Δ_{(u,v)}
11:  for each ℓ ∈ [k] do
12:    CM[V^{𝒪(u)}][ℓ, h_ℓ(v)] ← CM[V^{𝒪(u)}][ℓ, h_ℓ(v)] + Δ_{(u,v)}
13:    CM[V^{𝒪(v)}][ℓ, h_ℓ(u)] ← CM[V^{𝒪(v)}][ℓ, h_ℓ(u)] + Δ_{(u,v)}
14: In parallel, maintain β/(δ³ϵ⁴) ℓ₀-samplers (each with failure probability δ′ = δ⁴ϵ⁴/(8β)) to obtain β/(δ³ϵ⁴) different edges F.
15: ▷ Postprocessing phase
16: for each v ∈ V do
17:   f_v⁺ ← min_{ℓ∈[k]} CM[V⁺][ℓ, h_ℓ(v)]
18:   f_v⁻ ← min_{ℓ∈[k]} CM[V⁻][ℓ, h_ℓ(v)]
19: H̃ := ∪_{(u,v)∈F} {u, v}
20: H̃⁺ ← {v ∈ H̃ : Y_v = 1}, H̃⁻ ← H̃∖H̃⁺
21: ẽ(L̃⁺,L̃⁻) ← e(V⁺,V⁻) − Σ_{v∈H̃⁺} f_v⁻ − Σ_{v∈H̃⁻} f_v⁺
22: ALG₁ ← ẽ(L̃⁺,L̃⁻) + Σ_{v∈H̃} max{f_v⁻, f_v⁺}
23: ALG₂ ← Σ_{v∈H̃} (f_v⁻ + f_v⁺)
24: return max{ALG₁, ALG₂}
Theorem 14.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). Given an ϵ-accurate oracle 𝒪 for the MAX-CUT of G, there exists a single-pass (1/2 + Ω(ϵ²))-approximation algorithm for estimating the MAX-CUT value of G in dynamic streams. The algorithm uses O((log²n/(ϵ⁷δ³))·log(1/(ϵδ))) bits of space. The approximation guarantee holds with probability at least 1 − δ.

Note that Theorem 1 follows from Theorem 11 and Theorem 14 by setting δ=1/3.

Proof Sketch of Theorem 14.

The proof follows a structure similar to that of Theorem 11. First, we show that ℓ₀-sampling in dynamic streams can effectively replace reservoir sampling in insertion-only streams, allowing us to retrieve a set H̃ ⊇ H with high probability by the end of the stream.

Next, we show that for all vertices v ∈ H̃, our estimates of the number of edges from v to the sets V⁺ and V⁻ have additive error at most poly(ϵ,δ)·m, with high probability.

Finally, using arguments similar to those in the proof of Theorem 11, we can show that Algorithm 4 provides a (1/2 + Ω(ϵ²))-approximation of the MAX-CUT value of G with probability 1 − δ. For completeness, we provide the detailed proof in Appendix A.

5 Conclusion

We present the first learning-augmented algorithm for estimating the value of MAX-CUT in the streaming setting by leveraging ϵ-accurate predictions. Specifically, we provide a single-pass streaming algorithm that achieves a (1/2 + Ω(ϵ²))-approximation for estimating the MAX-CUT value of a graph in insertion-only (respectively, fully dynamic) streams using poly(1/ϵ) (respectively, poly(1/ϵ, log n)) words of space. This result contrasts with the lower bound in the classical streaming setting (without predictions), where any (randomized) single-pass streaming algorithm that achieves an approximation ratio of at least 1/2 + ϵ requires Ω(n/2^{poly(1/ϵ)}) space.

Our work leaves several questions for further research. For example, it would be interesting to extend our algorithm from unweighted to weighted graphs. Currently, our algorithm for handling the low-degree part does not seem to work for weighted graphs, as an edge with a heavy weight but incorrectly predicted signs can significantly affect our estimator. Another open question is to establish lower bounds on the trade-offs between approximation ratio and space complexity when the streaming algorithm is provided with predictions.

References

  • [1] Anders Aamand, Justin Chen, Huy Nguyen, Sandeep Silwal, and Ali Vakilian. Improved frequency estimation algorithms with and without predictions. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems (NeurIPS), 2023.
  • [2] Anders Aamand, Piotr Indyk, and Ali Vakilian. (Learned) frequency estimation algorithms under Zipfian distribution. arXiv preprint, 2019. arXiv:1908.05198.
  • [3] Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Graph sketches: sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems (PODS), pages 5–14, 2012. doi:10.1145/2213556.2213560.
  • [4] Spyros Angelopoulos, Christoph Dürr, Shendan Jin, Shahin Kamali, and Marc Renault. Online computation with untrusted advice. In 11th Innovations in Theoretical Computer Science Conference (ITCS), volume 151 of LIPIcs, pages 52:1–52:15, 2020. doi:10.4230/LIPICS.ITCS.2020.52.
  • [5] Antonios Antoniadis, Christian Coester, Marek Eliás, Adam Polak, and Bertrand Simon. Online metric algorithms with untrusted predictions. ACM Trans. Algorithms, 19(2):19:1–19:34, 2023. doi:10.1145/3582689.
  • [6] Antonios Antoniadis, Themis Gouleakis, Pieter Kleer, and Pavel Kolev. Secretary and online matching problems with machine learned advice. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.
  • [7] Etienne Bamas, Andreas Maggiori, and Ola Svensson. The primal-dual method for learning augmented algorithms. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.
  • [8] Siddhartha Banerjee, Vincent Cohen-Addad, Anupam Gupta, and Zhouzi Li. Graph searching with predictions. In 14th Innovations in Theoretical Computer Science Conference (ITCS), volume 251 of LIPIcs, pages 12:1–12:24, 2023. doi:10.4230/LIPICS.ITCS.2023.12.
  • [9] Arnab Bhattacharyya and Yuichi Yoshida. Property Testing - Problems and Techniques. Springer, 2022. doi:10.1007/978-981-16-8622-1.
  • [10] Jan van den Brand, Sebastian Forster, Yasamin Nazari, and Adam Polak. On dynamic graph algorithms with predictions. In Proceedings of the 2024 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3534–3557, 2024. doi:10.1137/1.9781611977912.126.
  • [11] Vladimir Braverman, Prathamesh Dharangutte, Vihan Shah, and Chen Wang. Learning-augmented maximum independent set. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), volume 317 of LIPIcs, pages 24:1–24:18, 2024. doi:10.4230/LIPICS.APPROX/RANDOM.2024.24.
  • [12] Justin Chen, Sandeep Silwal, Ali Vakilian, and Fred Zhang. Faster fundamental graph algorithms via learned predictions. In International Conference on Machine Learning (ICML), volume 162 of Proceedings of Machine Learning Research, pages 3583–3602, 2022. URL: https://proceedings.mlr.press/v162/chen22v.html.
  • [13] Justin Y. Chen, Talya Eden, Piotr Indyk, Honghao Lin, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner, David P. Woodruff, and Michael Zhang. Triangle and four cycle counting with predictions in graph streams. In The Tenth International Conference on Learning Representations (ICLR), 2022. URL: https://openreview.net/forum?id=8in_5gN9I0.
  • [14] Vincent Cohen-Addad, Tommaso d’Orsi, Anupam Gupta, Euiwoong Lee, and Debmalya Panigrahi. Learning-augmented approximation algorithms for maximum cut and related problems. In Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems (NeurIPS), 2024.
  • [15] Graham Cormode and Shan Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58–75, 2005. doi:10.1016/J.JALGOR.2003.12.001.
  • [16] Sami Davies, Benjamin Moseley, Sergei Vassilvitskii, and Yuyan Wang. Predictive flows for faster ford-fulkerson. In International Conference on Machine Learning (ICML), volume 202 of Proceedings of Machine Learning Research, pages 7231–7248, 2023. URL: https://proceedings.mlr.press/v202/davies23b.html.
  • [17] Adela F DePavia, Erasmo Tani, and Ali Vakilian. Learning-based algorithms for graph searching problems. In International Conference on Artificial Intelligence and Statistics (AISTATS), volume 238 of Proceedings of Machine Learning Research, pages 928–936, 2024. URL: https://proceedings.mlr.press/v238/depavia24a.html.
  • [18] Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, and Sergei Vassilvitskii. Faster matchings via learned duals. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems (NeurIPS), pages 10393–10406, 2021. URL: https://proceedings.neurips.cc/paper/2021/hash/5616060fb8ae85d93f334e7267307664-Abstract.html.
  • [19] Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, and Tal Wagner. Learning-based support estimation in sublinear time. In 9th International Conference on Learning Representations (ICLR), 2021. URL: https://openreview.net/forum?id=tilovEHA3YS.
  • [20] Paolo Ferragina and Giorgio Vinciguerra. The pgm-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proc. VLDB Endow., 13(8):1162–1175, 2020. doi:10.14778/3389133.3389135.
  • [21] Suprovat Ghoshal, Konstantin Makarychev, and Yury Makarychev. Constraint satisfaction problems with advice. In Proceedings of the 2025 ACM-SIAM Symposium on Discrete Algorithms (SODA), 2025.
  • [22] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995. doi:10.1145/227683.227684.
  • [23] Monika Henzinger, Barna Saha, Martin P Seybold, and Christopher Ye. On the complexity of algorithms with predictions for dynamic graph problems. In 15th Innovations in Theoretical Computer Science Conference (ITCS), volume 287 of LIPIcs, pages 62:1–62:25, 2024. doi:10.4230/LIPICS.ITCS.2024.62.
  • [24] Chen-Yu Hsu, Piotr Indyk, Dina Katabi, and Ali Vakilian. Learning-based frequency estimation algorithms. In 7th International Conference on Learning Representations (ICLR), 2019.
  • [25] Sungjin Im, Ravi Kumar, Mahshid Montazer Qaem, and Manish Purohit. Online knapsack with frequency predictions. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems (NeurIPS), pages 2733–2743, 2021. URL: https://proceedings.neurips.cc/paper/2021/hash/161c5c5ad51fcc884157890511b3c8b0-Abstract.html.
  • [26] Piotr Indyk, Ali Vakilian, and Yang Yuan. Learning-based low-rank approximations. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems (NeurIPS), pages 7400–7410, 2019. URL: https://proceedings.neurips.cc/paper/2019/hash/1625abb8e458a79765c62009235e9d5b-Abstract.html.
  • [27] Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, and David P Woodruff. Learning-augmented data stream algorithms. In 8th International Conference on Learning Representations (ICLR), 2020.
  • [28] Hossein Jowhari, Mert Sağlam, and Gábor Tardos. Tight bounds for lp samplers, finding duplicates in streams, and related problems. In Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS), pages 49–58, 2011.
  • [29] Michael Kapralov, Sanjeev Khanna, and Madhu Sudan. Streaming lower bounds for approximating max-cut. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1263–1282, 2015. doi:10.1137/1.9781611973730.84.
  • [30] Michael Kapralov, Sanjeev Khanna, Madhu Sudan, and Ameya Velingker. (1+Ω(1))-approximation to MAX-CUT requires linear space. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1703–1722, 2017. doi:10.1137/1.9781611974782.112.
  • [31] Michael Kapralov and Dmitry Krachun. An optimal space lower bound for approximating MAX-CUT. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 277–288, 2019. doi:10.1145/3313276.3316364.
  • [32] Michael Kapralov, Aida Mousavifar, Cameron Musco, Christopher Musco, Navid Nouri, Aaron Sidford, and Jakab Tardos. Fast and space efficient spectral sparsification in dynamic streams. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1814–1833, 2020. doi:10.1137/1.9781611975994.111.
  • [33] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for max-cut and other 2-variable csps? SIAM Journal on Computing, 37(1):319–357, 2007. doi:10.1137/S0097539705447372.
  • [34] Dmitry Kogan and Robert Krauthgamer. Sketching cuts in graphs and hypergraphs. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science (ITCS), pages 367–376, 2015. doi:10.1145/2688073.2688093.
  • [35] Ravi Kumar, Manish Purohit, and Zoya Svitkina. Improving online algorithms via ml predictions. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems (NeurIPS), pages 9684–9693, 2018. URL: https://proceedings.neurips.cc/paper/2018/hash/73a427badebe0e32caa2e1fc7530b7f3-Abstract.html.
  • [36] Silvio Lattanzi, Thomas Lavastida, Benjamin Moseley, and Sergei Vassilvitskii. Online scheduling via learned weights. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1859–1877, 2020. doi:10.1137/1.9781611975994.114.
  • [37] Silvio Lattanzi, Ola Svensson, and Sergei Vassilvitskii. Speeding up bellman ford via minimum violation permutations. In International Conference on Machine Learning (ICML), volume 202 of Proceedings of Machine Learning Research, pages 18584–18598, 2023. URL: https://proceedings.mlr.press/v202/lattanzi23a.html.
  • [38] Yi Li, Honghao Lin, Simin Liu, Ali Vakilian, and David P. Woodruff. Learning the positions in countsketch. In The Eleventh International Conference on Learning Representations (ICLR), 2023.
  • [39] Honghao Lin, Tian Luo, and David P. Woodruff. Learning augmented binary search trees. In International Conference on Machine Learning (ICML), volume 162 of Proceedings of Machine Learning Research, pages 13431–13440, 2022. URL: https://proceedings.mlr.press/v162/lin22f.html.
  • [40] Quanquan C Liu and Vaidehi Srinivas. The predicted-deletion dynamic model: Taking advantage of ml predictions, for free. arXiv preprint, 2023. doi:10.48550/arXiv.2307.08890.
  • [41] Thodoris Lykouris and Sergei Vassilvitskii. Competitive caching with machine learned advice. J. ACM, 68(4):24:1–24:25, 2021. doi:10.1145/3447579.
  • [42] Mohammad Mahdian, Hamid Nazerzadeh, and Amin Saberi. Allocating online advertisement space with unreliable estimates. In Proceedings of the 8th ACM conference on Electronic commerce, pages 288–294, 2007. doi:10.1145/1250910.1250952.
  • [43] Michael Mitzenmacher. A model for learned bloom filters and optimizing by sandwiching. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems (NeurIPS), pages 462–471, 2018. URL: https://proceedings.neurips.cc/paper/2018/hash/0f49c89d1e7298bb9930789c8ed59d48-Abstract.html.
  • [44] Atsuki Sato and Yusuke Matsui. Fast partitioned learned bloom filter. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems (NeurIPS), 2023.
  • [45] Nicholas Schiefer, Justin Y Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal, and Tal Wagner. Learned interpolation for better streaming quantile approximation with worst-case guarantees. In SIAM Conference on Applied and Computational Discrete Algorithms (ACDA), pages 87–97, 2023. doi:10.1137/1.9781611977714.8.
  • [46] Kapil Vaidya, Eric Knorr, Michael Mitzenmacher, and Tim Kraska. Partitioned learned bloom filters. In 9th International Conference on Learning Representations (ICLR), 2021.
  • [47] Jeffrey S Vitter. Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS), 11(1):37–57, 1985. doi:10.1145/3147.3165.

Appendix A Proof of Theorem 14

Lemma 15.

Let ϵ ∈ (0, 1/2] and δ ∈ (0,1). With probability at least 1 − δ/4, Line 14 of Algorithm 4 returns a set F which contains β/(δ³ϵ⁴) different edges.

Proof.

The failure probability comes from the ℓ₀-samplers themselves and from the event that sampling β/(δ³ϵ⁴) uniformly random edges in parallel does not yield β/(δ³ϵ⁴) different edges.

By Theorem 4, each ℓ₀-sampler fails with probability at most δ′. Since we maintain β/(δ³ϵ⁴) ℓ₀-samplers, by a union bound, the failure probability introduced by the ℓ₀-samplers is at most (β/(δ³ϵ⁴))·δ′.

Suppose that all the ℓ₀-samplers succeed, i.e., |F| = β/(δ³ϵ⁴). Now we bound the probability that two edges in F coincide. Consider two fixed edges e₁, e₂ ∈ F. Then Pr[e₁ = e₂] = 1/m. By a union bound, the probability that there is at least one pair of identical edges is at most (β/(δ³ϵ⁴) choose 2)·(1/m) ≤ β²/(δ⁶ϵ⁸m).

Therefore, the probability that F contains β/(δ³ϵ⁴) different edges is at least 1 − (β/(δ³ϵ⁴))·δ′ − β²/(δ⁶ϵ⁸m) ≥ 1 − δ/8 − δ/8 ≥ 1 − δ/4 (since δ′ = δ⁴ϵ⁴/(8β) and m = Ω(1/(ϵ¹¹δ⁷))).

Lemma 16.

Let δ ∈ (0,1). By the end of the stream, H̃ contains all high-degree vertices of G with probability at least 1 − δ/4.

Proof.

Suppose that $v$ is a high-degree vertex in $G$, i.e., $\deg(v)\ge\frac{\epsilon^2 m}{c}$. For each edge $e_i$ incident to $v$, define an indicator random variable

$$X_i=\begin{cases}1 & \text{if } e_i\in F,\\ 0 & \text{otherwise.}\end{cases}$$

So, $\mathbb{E}[X_i]=\Pr[X_i=1]=1-\left(1-\frac{1}{m}\right)^{\beta/(\delta^3\epsilon^4)}\ge 1-\exp\left(-\frac{\beta}{\delta^3\epsilon^4 m}\right)\ge\frac{\beta}{\delta^3\epsilon^4 m}-\frac{\beta^2}{2\delta^6\epsilon^8 m^2}\ge\frac{\beta}{5\delta^3\epsilon^4 m}$ for sufficiently large $m$. We define $X:=\sum_{i\in[\deg(v)]}X_i$ to denote the number of edges incident to $v$ that are sampled by the end of the stream. Then, $\mathbb{E}[X]=\sum_{i\in[\deg(v)]}\mathbb{E}[X_i]\ge\frac{\beta\deg(v)}{5\delta^3\epsilon^4 m}$.

Next, we bound $\mathrm{Var}[X]$. For any $i\in[\deg(v)]$, $\mathrm{Var}[X_i]=\mathbb{E}[X_i^2]-(\mathbb{E}[X_i])^2=\mathbb{E}[X_i]-(\mathbb{E}[X_i])^2\le\frac{\beta}{5\delta^3\epsilon^4 m}-\left(\frac{\beta}{5\delta^3\epsilon^4 m}\right)^2=\frac{\beta}{5\delta^3\epsilon^4 m}-\frac{\beta^2}{25\delta^6\epsilon^8 m^2}$. Since the $(X_i)_{i\in[\deg(v)]}$ are negatively correlated, $\mathrm{Var}[X]\le\sum_{i\in[\deg(v)]}\mathrm{Var}[X_i]\le\deg(v)\left(\frac{\beta}{5\delta^3\epsilon^4 m}-\frac{\beta^2}{25\delta^6\epsilon^8 m^2}\right)\le\frac{\beta\deg(v)}{5\delta^3\epsilon^4 m}$. Therefore, by Chebyshev's inequality,

\begin{align*}
\Pr[v\notin\tilde{H}]=\Pr[X=0] &\le \Pr\left[\left|X-\mathbb{E}[X]\right|\ge\frac{\beta\deg(v)}{5\delta^3\epsilon^4 m}\right] && \text{(since } \mathbb{E}[X]\ge\tfrac{\beta\deg(v)}{5\delta^3\epsilon^4 m}\text{)}\\
&\le \mathrm{Var}[X]\cdot\frac{25\delta^6\epsilon^8 m^2}{\beta^2\deg^2(v)} \\
&\le \frac{5\delta^3\epsilon^4 m}{\beta\deg(v)} && \text{(since } \mathrm{Var}[X]\le\tfrac{\beta\deg(v)}{5\delta^3\epsilon^4 m}\text{)}\\
&\le \frac{5\delta^3\epsilon^2 c}{\beta}. && \text{(since } \deg(v)\ge\tfrac{\epsilon^2 m}{c}\text{)}
\end{align*}

By our definition of high-degree vertices, there are at most $\frac{2m}{\epsilon^2 m/c}=\frac{2c}{\epsilon^2}$ high-degree vertices in $G$. By a union bound, with probability at least $1-\frac{2c}{\epsilon^2}\cdot\frac{5\delta^3\epsilon^2 c}{\beta}=1-\frac{10c^2\delta^3}{\beta}\ge 1-\frac{\delta}{4}$ (since $c=\frac{80}{\delta}$ and $\beta$ is sufficiently large), $\tilde{H}$ contains all high-degree vertices in $G$.
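To make this concrete, the following Python sketch (a simplification, not the paper's dynamic-stream implementation) simulates the sampling step on an insertion-only edge list: it replaces the $\ell_0$-samplers with direct uniform sampling with replacement, with `num_samplers` standing in for $\frac{\beta}{\delta^3\epsilon^4}$, and reads $\tilde{H}$ off as the set of endpoints of the sampled edge set $F$, as the proof above suggests.

```python
import random

def sample_edge_set(edges, num_samplers, rng=random.Random(0)):
    """Simulates num_samplers parallel l0-samplers by drawing that many
    uniform edges with replacement (insertion-only simplification; a true
    l0-sampler works from linear sketches and also handles deletions)."""
    return [rng.choice(edges) for _ in range(num_samplers)]

def candidate_high_degree_set(edges, num_samplers):
    """H_tilde = all endpoints of the sampled edge set F. A high-degree
    vertex v (deg(v) >= eps^2 * m / c) is missed only if none of its
    incident edges is sampled; Lemma 16 bounds that probability."""
    F = sample_edge_set(edges, num_samplers)
    return {endpoint for e in F for endpoint in e}
```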

Lemma 17.

Let $\epsilon\in(0,\frac{1}{2}]$ and $\delta\in(0,1)$. For all vertices $v\in\tilde{H}$: (1) $f_v^+\ge e(v,V^+)$ and $f_v^-\ge e(v,V^-)$; (2) with probability at least $1-\frac{\delta}{4}$, $f_v^+\le e(v,V^+)+2\epsilon^7\delta^3 m$ and $f_v^-\le e(v,V^-)+2\epsilon^7\delta^3 m$.

Proof.

(1) By Theorem 2, we have $f_v^+\ge e(v,V^+)$ and $f_v^-\ge e(v,V^-)$ for all $v\in V$.

(2) By Theorem 2, for any fixed $v\in V$, we have $f_v^+\le e(v,V^+)+2\epsilon^7\delta^3 m$ and $f_v^-\le e(v,V^-)+2\epsilon^7\delta^3 m$ with probability at least $1-\frac{\epsilon^4\delta^4}{8\beta}$. Since $|\tilde{H}|\le\frac{2\beta}{\delta^3\epsilon^4}$, by a union bound, with probability at least $1-\frac{\epsilon^4\delta^4}{8\beta}\cdot\frac{2\beta}{\delta^3\epsilon^4}=1-\frac{\delta}{4}$, we have $f_v^+\le e(v,V^+)+2\epsilon^7\delta^3 m$ and $f_v^-\le e(v,V^-)+2\epsilon^7\delta^3 m$ for all $v\in\tilde{H}$.

Proof of Theorem 14.

In Algorithm 4, we use a CountMin sketch with $k=\frac{e}{\epsilon^7\delta^3}$ and $w=\ln\frac{8\beta}{\epsilon^4\delta^4}$ to estimate the number of edges incident to high-degree vertices with respect to $V^+$ and $V^-$, which takes $O\left(\frac{1}{\epsilon^7\delta^3}\ln\frac{1}{\epsilon^4\delta^4}\right)=O\left(\frac{1}{\epsilon^7\delta^3}\ln\frac{1}{\epsilon\delta}\right)$ words of space, by Theorem 2. We also maintain $O\left(\frac{1}{\delta^3\epsilon^4}\right)$ $\ell_0$-samplers, which take $O\left(\frac{\log^2 n}{\delta^3\epsilon^4}\log\frac{1}{\delta^4\epsilon^4}\right)=O\left(\frac{\log^2 n}{\delta^3\epsilon^4}\log\frac{1}{\delta\epsilon}\right)$ bits of space, by Theorem 4. Therefore, the total space complexity of Algorithm 4 is $O\left(\frac{\log^2 n}{\epsilon^7\delta^3}\log\frac{1}{\epsilon\delta}\right)$ bits.
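For intuition about the $f_v^{\pm}$ counters, here is a minimal CountMin sketch in Python. It is an illustrative sketch under our own assumptions (salted built-in hashes rather than the paper's hash family); the key property it shares with Theorem 2 is that estimates are minima over rows and hence never undercount, matching part (1) of Lemma 17.

```python
import random

class CountMin:
    """Minimal CountMin sketch: w rows of k counters each. The estimate
    is the minimum over rows, so it never undercounts; each row overcounts
    only through hash collisions."""
    def __init__(self, k, w, seed=0):
        rng = random.Random(seed)
        self.k, self.w = k, w
        self.salts = [rng.getrandbits(64) for _ in range(w)]
        self.rows = [[0] * k for _ in range(w)]

    def _bucket(self, r, key):
        return hash((self.salts[r], key)) % self.k

    def add(self, key, amount=1):
        for r in range(self.w):
            self.rows[r][self._bucket(r, key)] += amount

    def estimate(self, key):
        return min(self.rows[r][self._bucket(r, key)] for r in range(self.w))
```

In the setting of Algorithm 4, one would maintain two such sketches and, on each edge $(u,v)$, add $u$ to the sketch corresponding to the predicted label of $v$ and vice versa, so that $f_v^+$ and $f_v^-$ are the estimates for $v$ in the respective sketches.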

Next, we show the approximation guarantee of Algorithm 4. As in arbitrary-order streams, we cannot identify $H$ exactly by the end of the stream; instead, we have $\tilde{H}$. Let $A_1:=e(\tilde{L}^+,\tilde{L}^-)+\sum_{v\in\tilde{H}}\max\{e(v,\tilde{L}^-),e(v,\tilde{L}^+)\}$ denote the value of the cut $(\tilde{C},V\setminus\tilde{C})$ obtained by running Algorithm 1 on $G[\tilde{L}]$ and then assigning the vertices in $\tilde{H}$ greedily. Let $A_2:=\sum_{v\in\tilde{H}}\left(e(v,\tilde{L}^-)+e(v,\tilde{L}^+)\right)$ denote the value of the cut $(\tilde{H},\tilde{L})$. Suppose for now that the better of the cuts $(\tilde{C},V\setminus\tilde{C})$ and $(\tilde{H},\tilde{L})$ is a $(\frac{1}{2}+\frac{\epsilon^2}{16})$-approximation of the MAX-CUT value of $G$; we verify this at the end of the proof. In the following, we show that $ALG_1$ and $ALG_2$ are good approximations of $A_1$ and $A_2$, respectively. Therefore, $\max\{ALG_1,ALG_2\}$ returned by Algorithm 4 is also a strictly better than $\frac{1}{2}$-approximation of the MAX-CUT value of $G$.

Since $V^+=\tilde{H}^+\cup\tilde{L}^+$ and $V^-=\tilde{H}^-\cup\tilde{L}^-$, we have $e(v,V^+)=e(v,\tilde{H}^+)+e(v,\tilde{L}^+)$ and $e(v,V^-)=e(v,\tilde{H}^-)+e(v,\tilde{L}^-)$ for any vertex $v\in V$. By Lemma 17, with probability at least $1-\frac{\delta}{4}$, for all vertices $v\in\tilde{H}$, we have

\begin{align*}
e(v,\tilde{H}^+)+e(v,\tilde{L}^+)&\le f_v^+\le e(v,\tilde{H}^+)+e(v,\tilde{L}^+)+2\epsilon^7\delta^3 m,\\
e(v,\tilde{H}^-)+e(v,\tilde{L}^-)&\le f_v^-\le e(v,\tilde{H}^-)+e(v,\tilde{L}^-)+2\epsilon^7\delta^3 m.
\end{align*}

Since $|\tilde{H}|\le\frac{2\beta}{\delta^3\epsilon^4}$, we have $0\le e(v,\tilde{H}^+),e(v,\tilde{H}^-)\le\frac{2\beta}{\delta^3\epsilon^4}$. Therefore, we have

\begin{align*}
e(v,\tilde{L}^+)&\le f_v^+\le\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m+e(v,\tilde{L}^+),\\
e(v,\tilde{L}^-)&\le f_v^-\le\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m+e(v,\tilde{L}^-).
\end{align*}

Then we have

$$0\le f_v^+-e(v,\tilde{L}^+)\le\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m \quad\text{and}\quad 0\le f_v^--e(v,\tilde{L}^-)\le\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m,$$

and

$$\max\{e(v,\tilde{L}^+),e(v,\tilde{L}^-)\}\le\max\{f_v^+,f_v^-\}\le\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m+\max\{e(v,\tilde{L}^+),e(v,\tilde{L}^-)\}.$$

Note that

\begin{align*}
e(\tilde{L}^+,\tilde{L}^-)&=e(V^+,V^-)-e(\tilde{H}^+,\tilde{H}^-)-\sum_{v\in\tilde{H}^+}e(v,\tilde{L}^-)-\sum_{v\in\tilde{H}^-}e(v,\tilde{L}^+),\\
\tilde{e}(\tilde{L}^+,\tilde{L}^-)&=e(V^+,V^-)-\sum_{v\in\tilde{H}^+}f_v^--\sum_{v\in\tilde{H}^-}f_v^+.
\end{align*}

We have

\begin{align*}
e(\tilde{L}^+,\tilde{L}^-)-\tilde{e}(\tilde{L}^+,\tilde{L}^-)&=\sum_{v\in\tilde{H}^+}\left(f_v^--e(v,\tilde{L}^-)\right)+\sum_{v\in\tilde{H}^-}\left(f_v^+-e(v,\tilde{L}^+)\right)-e(\tilde{H}^+,\tilde{H}^-)\\
&\le\frac{2\beta}{\delta^3\epsilon^4}\left(\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m\right)+\frac{2\beta}{\delta^3\epsilon^4}\left(\frac{2\beta}{\delta^3\epsilon^4}+2\epsilon^7\delta^3 m\right)-0\\
&=\frac{8\beta^2}{\delta^6\epsilon^8}+8\beta\epsilon^3 m.
\end{align*}

Therefore,

\begin{align*}
A_1-ALG_1&=\left(e(\tilde{L}^+,\tilde{L}^-)-\tilde{e}(\tilde{L}^+,\tilde{L}^-)\right)+\sum_{v\in\tilde{H}}\left(\max\{e(v,\tilde{L}^-),e(v,\tilde{L}^+)\}-\max\{f_v^-,f_v^+\}\right)\\
&\le\left(\frac{8\beta^2}{\delta^6\epsilon^8}+8\beta\epsilon^3 m\right)+\frac{2\beta}{\delta^3\epsilon^4}\cdot 0=\frac{8\beta^2}{\delta^6\epsilon^8}+8\beta\epsilon^3 m.
\end{align*}

So, $ALG_1\ge A_1-\left(\frac{8\beta^2}{\delta^6\epsilon^8}+8\beta\epsilon^3 m\right)=A_1-\Theta(\epsilon^3 m)$.

Similarly,

$$A_2-ALG_2=\sum_{v\in\tilde{H}}\left(\left(e(v,\tilde{L}^-)-f_v^-\right)+\left(e(v,\tilde{L}^+)-f_v^+\right)\right)\le\frac{2\beta}{\delta^3\epsilon^4}\cdot(0+0)=0.$$

So, $ALG_2\ge A_2$.

Since we assume that $\max\{A_1,A_2\}$ is a $(\frac{1}{2}+\frac{\epsilon^2}{16})$-approximation of the MAX-CUT value of $G$, and since $\mathrm{OPT}\ge\frac{m}{2}$, we have $ALG:=\max\{ALG_1,ALG_2\}\ge\max\{A_1,A_2\}-\Theta(\epsilon^3 m)\ge\left(\frac{1}{2}+\frac{\epsilon^2}{16}\right)\mathrm{OPT}-\Theta(\epsilon^3\,\mathrm{OPT})=\left(\frac{1}{2}+\Omega(\epsilon^2)\right)\mathrm{OPT}$.

Finally, it remains to show that the better of the cuts $(\tilde{C},V\setminus\tilde{C})$ and $(\tilde{H},\tilde{L})$ is a $(\frac{1}{2}+\frac{\epsilon^2}{16})$-approximation of the MAX-CUT value of $G$. This follows directly from Theorem 6 (with failure probability $\frac{\delta}{4}$), by substituting $H$ and $L$ with $\tilde{H}$ and $\tilde{L}$, respectively. Together with Lemma 15, Lemma 16, and Lemma 17, and applying a union bound, this concludes the proof.

Appendix B Constant Query Complexity in Random Order Streams

In this section, we show that in the case of random-order streams, it suffices to query the $\epsilon$-accurate oracle $\mathcal{O}$ for the labels of only a constant number of vertices. For simplicity, we provide only a sketch of the main ideas and omit the formal proof. We will use the following lemma.

Lemma 18 (Lemma 2.2 in [9]).

Let $a,b\in\mathbb{R}$ with $a<b$, let $n\in\mathbb{N}$, and let $x\in[a,b]^n$. Let $\Sigma=\sum_{i=1}^n x_i$. For any $\eta,\delta\in(0,1)$, there exists an algorithm which samples a set $T$ of $t=O(\eta^{-2}\log\delta^{-1})$ indices from $[n]$ and returns an estimate $\tilde{\Sigma}=\frac{n}{t}\sum_{i\in T}x_i$ such that $|\Sigma-\tilde{\Sigma}|\le\eta(b-a)n$ with probability at least $1-\delta$.
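A minimal Python sketch of this estimator follows, assuming an arbitrary illustrative constant (here 2) in $t$ rather than the one from [9]:

```python
import math, random

def estimate_sum(x, eta, delta, rng=random.Random(0)):
    """Lemma 18 estimator sketch: average t = O(eta^-2 * log(1/delta))
    uniformly sampled entries of x and rescale by n. Guarantees an
    additive error of eta*(b-a)*n with probability at least 1 - delta."""
    n = len(x)
    t = max(1, math.ceil(2 * eta ** -2 * math.log(1 / delta)))
    T = [rng.randrange(n) for _ in range(t)]   # sampled indices (with repl.)
    return (n / t) * sum(x[i] for i in T)      # Sigma-tilde
```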

Similar to Algorithm 2, we store the first $\mathrm{poly}(1/\epsilon,1/\delta)$ edges of the random-order stream and use them to identify a set $\tilde{H}$ of size $\mathrm{poly}(1/\epsilon,1/\delta)$ that contains all high-degree vertices $H$ with high probability, as in Lemma 10. For this part, we do not need any information about the vertex labels provided by the oracle $\mathcal{O}$. Recall that the algorithm considers two candidate cuts and returns the one with the larger size. Let $\tilde{L}:=V\setminus\tilde{H}$ and $\tilde{S}:=\tilde{H}\setminus H$. The first candidate is obtained by performing the greedy extension of $(\tilde{L}^+\cup\tilde{S}^+,\tilde{L}^-\cup\tilde{S}^-)$ using $H$. The second candidate is simply the cut $(H,L)$. Formally, the sizes of these two cuts are given as follows:

\begin{align*}
ALG_1&=e(L^+,L^-)+\sum_{v\in H}\max\{e(v,L^-),e(v,L^+)\}\\
&=e(\tilde{L}^+,\tilde{L}^-)+e(\tilde{S}^+,\tilde{S}^-)+\sum_{v\in\tilde{S}^+}e(v,\tilde{L}^-)+\sum_{v\in\tilde{S}^-}e(v,\tilde{L}^+)+\sum_{v\in H}\max\{e(v,L^-),e(v,L^+)\},\\
ALG_2&=e(H,L)=\sum_{v\in H}e(v,L)=\sum_{v\in H}\left(e(v,\tilde{L})+e(v,\tilde{S})\right).
\end{align*}

Observe that computing the second cut size $ALG_2$ requires no queries to the oracle $\mathcal{O}$: it suffices to count $e(v,\tilde{L})$ for each vertex $v\in\tilde{H}$ using the remaining edges of the stream (those after the first $\mathrm{poly}(1/\epsilon,1/\delta)$ edges). At the end of the stream, we can recover $H$ from $\tilde{H}$ and compute $ALG_2$ exactly.
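As a small illustration (in Python, with the hypothetical names `stream_tail` and `H_tilde`), the counts $e(v,\tilde{L})$ can be accumulated on the fly: an edge contributes to $e(v,\tilde{L})$ for $v\in\tilde{H}$ exactly when its other endpoint lies outside $\tilde{H}$.

```python
def count_edges_to_L_tilde(stream_tail, H_tilde):
    """For each v in H_tilde, counts e(v, L_tilde) exactly, where
    L_tilde = V \\ H_tilde. Edges with both endpoints in H_tilde are
    stored separately, so e(v, S_tilde) is computable at stream end."""
    cnt = {v: 0 for v in H_tilde}
    for u, v in stream_tail:
        if (u in cnt) != (v in cnt):   # exactly one endpoint in H_tilde
            cnt[u if u in cnt else v] += 1
    return cnt
```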

Next, we focus on estimating $ALG_1$ by querying the oracle $\mathcal{O}$ only a constant number of times. Specifically, we decompose $ALG_1$ into the four parts (I) through (IV) below and estimate each part separately.

  1. (I) $e(\tilde{L}^+,\tilde{L}^-)$. During the stream (after the first $\mathrm{poly}(1/\epsilon,1/\delta)$ edges), we employ reservoir sampling (see the sketch after this list) to sample a set $T$ of $O(\epsilon^{-4}\log\delta^{-1})$ edges uniformly at random from $E(G[\tilde{L}])$, the set of edges with both endpoints in $\tilde{L}$. Let $U:=\bigcup_{(u,v)\in T}\{u,v\}$. By Lemma 18, we query $\mathcal{O}$ for the vertices in $U$ and use $\frac{e(G[\tilde{L}])}{|T|}\cdot e(U^+,U^-)$ to estimate $e(\tilde{L}^+,\tilde{L}^-)$, with an additive error of $\epsilon^2\,e(G[\tilde{L}])$.

  2. (II) $e(\tilde{S}^+,\tilde{S}^-)$. Since we store all edges with both endpoints in $\tilde{H}$ during the stream and $|\tilde{H}|=\mathrm{poly}(1/\epsilon,1/\delta)$, we can compute $e(\tilde{S}^+,\tilde{S}^-)$ exactly by querying $\mathcal{O}$ a constant number of times.

  3. (III) $\sum_{v\in\tilde{S}^+}e(v,\tilde{L}^-)+\sum_{v\in\tilde{S}^-}e(v,\tilde{L}^+)$. For each vertex $v\in\tilde{H}$, we use reservoir sampling to sample a set $T_v$ of $O(\epsilon^{-4}\log\delta^{-1})$ edges uniformly at random from $E(v,\tilde{L})$, the set of edges with one endpoint at $v$ and the other in $\tilde{L}$. Let $U_v:=\bigcup_{(u,v)\in T_v}\{u\}$. By Lemma 18, we query $\mathcal{O}$ for the vertices in $U_v$ and use $\frac{e(v,\tilde{L})}{|T_v|}\cdot e(v,U_v^+)$ to estimate $e(v,\tilde{L}^+)$ and $\frac{e(v,\tilde{L})}{|T_v|}\cdot e(v,U_v^-)$ to estimate $e(v,\tilde{L}^-)$, both with an additive error of $\epsilon^2\,e(v,\tilde{L})$. Therefore, we can estimate $\sum_{v\in\tilde{S}^+}e(v,\tilde{L}^-)+\sum_{v\in\tilde{S}^-}e(v,\tilde{L}^+)$ with an additive error of $\epsilon^2\sum_{v\in\tilde{S}}e(v,\tilde{L})=\epsilon^2\,e(\tilde{S},\tilde{L})$.

  4. (IV) $\sum_{v\in H}\max\{e(v,L^-),e(v,L^+)\}$. Since $L^-=\tilde{S}^-\cup\tilde{L}^-$ and $L^+=\tilde{S}^+\cup\tilde{L}^+$, we have $e(v,L^-)=e(v,\tilde{S}^-)+e(v,\tilde{L}^-)$ and $e(v,L^+)=e(v,\tilde{S}^+)+e(v,\tilde{L}^+)$ for each vertex $v\in H$. Similar to (II) and (III), we can estimate $\sum_{v\in H}\max\{e(v,L^-),e(v,L^+)\}$ with an additive error of $\epsilon^2\sum_{v\in H}e(v,\tilde{L})=\epsilon^2\,e(H,\tilde{L})$.
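Items (I) and (III) draw their uniform edge samples with reservoir sampling [47], which maintains $t$ uniform samples from a stream of unknown length in a single pass. A minimal Python sketch of the classic algorithm:

```python
import random

def reservoir_sample(stream, t, rng=random.Random(0)):
    """Algorithm R [47]: keeps t uniform samples (without replacement)
    from a stream of unknown length in one pass and O(t) space."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < t:
            reservoir.append(item)
        else:
            j = rng.randrange(i + 1)   # keep item with probability t/(i+1)
            if j < t:
                reservoir[j] = item
    return reservoir
```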

Let $\widetilde{ALG}_1$ denote our estimator for $ALG_1$. Then we have $|\widetilde{ALG}_1-ALG_1|\le\epsilon^2\left(e(G[\tilde{L}])+e(\tilde{S},\tilde{L})+e(H,\tilde{L})\right)\le\epsilon^2\left(e(G[L])+e(H,L)\right)\le\epsilon^2 m$.

If $ALG_2=e(H,L)\ge(\frac{1}{2}+\epsilon^2)\mathrm{OPT}$, then we are done. Hence, without loss of generality, we assume that $ALG_2=e(H,L)<(\frac{1}{2}+\epsilon^2)\mathrm{OPT}$. By the proof of Theorem 6, we then have $ALG_1\ge(\frac{1}{2}+\frac{\epsilon^2}{16})\mathrm{OPT}\ge(\frac{1}{2}+\frac{\epsilon^2}{16})\cdot\frac{m}{2}\ge\epsilon m$ if we set $\epsilon\le\frac{1}{4}$, using $\mathrm{OPT}\ge\frac{m}{2}$. Therefore, $|\widetilde{ALG}_1-ALG_1|\le\epsilon^2 m\le\epsilon\cdot ALG_1$. Since $ALG_1\ge(\frac{1}{2}+\frac{\epsilon^2}{16})\mathrm{OPT}$ and $\widetilde{ALG}_1$ approximates $ALG_1$ within a multiplicative factor of $(1\pm\epsilon)$, we are done.

Query Complexity.

In the above analysis, the total number of queries to the $\epsilon$-accurate oracle $\mathcal{O}$ is $\mathrm{poly}(1/\epsilon,1/\delta,\log(1/\delta))$, which is independent of $n$.