
Query Efficient Weighted Stochastic Matching

Mahsa Derakhshan, Northeastern University, Boston, MA, USA
Mohammad Saneian, Northeastern University, Boston, MA, USA
Abstract

In this paper, we study the weighted stochastic matching problem. Let G=(V,E) be a given edge-weighted graph, and let its realization 𝒢 be a random subgraph of G that includes each edge e ∈ E independently with a known probability p_e. The goal in this problem is to pick a sparse subgraph Q of G, without prior knowledge of 𝒢, such that the maximum weight matching among the realized edges of Q (i.e., the subgraph Q ∩ 𝒢) in expectation approximates the maximum weight matching of the entire realization 𝒢.

It is established by previous work that attaining any constant approximation ratio for this problem requires selecting a subgraph of max-degree Ω(1/p), where p = min_{e∈E} p_e. On the positive side, there exists a (1−ε)-approximation algorithm by Behnezhad and Derakhshan [FOCS'20], albeit at the cost of a max-degree with exponential dependence on 1/p. Within the O(1/p) query regime, however, the best-known algorithm achieves a 0.536 approximation ratio, due to Dughmi, Kalayci, and Patel [ICALP'23], improving over the 0.501-approximation algorithm by Behnezhad, Farhadi, Hajiaghayi, and Reyhani [SODA'19].

In this work, we present a 0.68-approximation algorithm with the asymptotically optimal O(1/p) queries per vertex. Our result not only substantially improves the approximation ratio for weighted graphs, but also breaks the well-known 2/3 barrier with the optimal number of queries – even for unweighted graphs. Our analysis involves reducing the problem to designing a randomized matching algorithm on a given stochastic graph with some variance-bounding properties. To achieve these properties, we leverage a randomized algorithm by MacRury and Ma [STOC’24] for a variant of online stochastic matching.

Keywords and phrases:
Sublinear algorithms, Stochastic, Matching
Category:
Track A: Algorithms, Complexity and Games
Copyright and License:
© Mahsa Derakhshan and Mohammad Saneian; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Streaming, sublinear and near linear time algorithms
Related Version:
Full Version: https://arxiv.org/abs/2311.08513
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

In the stochastic weighted matching problem, we are given an n-vertex weighted graph G=(V,E) along with a parameter p_e ∈ (0,1] for each edge e ∈ E. A random subgraph 𝒢 of G is then generated by independently including (or realizing) each edge e ∈ E with probability p_e. Here, we refer to G as the base graph and 𝒢 as the realized subgraph. The objective of this problem is to select a subgraph Q of the base graph without knowledge of its realization such that: (1) Q has a small max-degree, namely a constant with respect to n, and (2) the realized edges of Q (i.e., the graph Q ∩ 𝒢) contain a large-weight approximate matching. We define the approximation ratio as the expected weight of the maximum matching among the realized edges of Q divided by the expected weight of the maximum weight matching of 𝒢.

One immediate application of the stochastic weighted matching problem is its use as a matching sparsifier, which approximates the maximum weighted matching even when random edge failures occur [1]. Additionally, it finds various applications in matching markets, including kidney exchange [11], online labor markets [9, 10], and dating platforms. In these applications, we are provided with the base graph G, but we are tasked with finding a matching in the realized subgraph 𝒢. To achieve this, an algorithm can query each edge of G to determine whether it is realized. However, these queries often involve time-consuming or costly operations, such as conducting candidate interviews or medical exams. Hence, it is crucial to minimize the number of queries. This can be accomplished by non-adaptively querying a subgraph Q with a small degree while still expecting to find a matching with a large approximation ratio among its realized edges.

A simple lower bound – see e.g. [2] – shows that attaining any constant approximation ratio for this problem requires selecting a subgraph of max-degree Ω(1/p), where p = min_{e∈E} p_e. This simply follows from the fact that if the graph G is a clique, one needs Ω(1/p) queries per vertex to avoid too many singleton vertices. This raises a natural question:

What is the best approximation achievable with a subgraph of optimal max-degree O(1/p)?

Prior to this work, the best known approximation with this many queries was obtained in the work of Dughmi, Kalayci, and Patel [12], who obtained a 0.536 approximation, improving over the prior 0.501 approximation of Behnezhad, Farhadi, Hajiaghayi, and Reyhani [9] (we note that the result of [9] requires a maximum degree of O(log(1/p)/p)). There is also extensive work on achieving higher approximations by increasing the maximum degree, e.g., to exp(1/p) [6]. We overview these results later in Subsection 1.1.

Our main result is a significant improvement of the approximation from 0.536 to 0.68. Our approximation improves to 0.73 in the case of bipartite graphs.

Theorem 1.

For the weighted stochastic matching problem, there exists an algorithm that picks an O(1/p)-degree subgraph Q of G such that the expected weight of the max-weight realized matching in Q is at least 0.68 times the expected weight of the max-weight realized matching in G. The approximation improves to 0.73 for bipartite graphs.

It is also worth noting that our Theorem 1 breaks through the intriguing 2/3-approximation barrier (see [2, 8]) for this problem. Before this work, it was not known whether this bound could be broken by querying a graph of max degree O(1/p) even in the case of unweighted graphs. To achieve this result, we demonstrate that the problem can be reduced to designing approximate matching algorithms with specific properties, which we term variance-bounding matching algorithms. This reduction implies that further exploration of the stochastic matching problem may be focused on developing such algorithms.

1.1 Related Work

Since its introduction in the pioneering work of Blum, Dickerson, Haghtalab, Procaccia, Sandholm, and Sharma [11], the stochastic matching problem has received considerable attention [11, 2, 3, 18, 10, 9, 1, 7, 8, 6]. It is established by Assadi, Khanna, and Li [2] that attaining any constant approximation ratio for this problem requires selecting a subgraph of maximum degree Ω(1/p), where p = min_{e∈E} p_e.

Yamaguchi and Maehara [18] were the first to consider the weighted version of the problem. They provided a (0.5 − ε)-approximation algorithm via O(W·log n / p) queries per vertex, where W is the largest edge weight. Their query complexity was later improved to poly(1/p) by Behnezhad and Reyhani [10]. Subsequently, Behnezhad, Farhadi, Hajiaghayi, and Reyhani [9] provided the first algorithm breaking the 0.5 approximation via O(1/p) queries per vertex. Finally, Dughmi, Kalayci, and Patel [12] improved this approximation ratio to 0.536 using an asymptotically optimal number of queries. In a separate line of work, Behnezhad and Derakhshan [6] showed that if exp(1/p) queries per vertex are allowed, it is possible to improve this approximation ratio up to 1 − ε.

For the unweighted version of the problem, the pioneering work of Blum et al. [11] provides a 0.5 approximation via poly(1/p) queries per vertex. After a series of works [9, 3], this approximation ratio was improved to 2/3 via O(log(1/p)/p) queries by Assadi and Bernstein [1]. For the special case of unweighted bipartite graphs, Behnezhad, Blum, and Derakhshan improved this approximation ratio to 0.73 [5]. Behnezhad, Derakhshan, and Hajiaghayi [8] were the first to obtain a (1−ε)-approximation, albeit using exp(1/p) queries per vertex. Finally, in a recent breakthrough result, Azarmehr, Behnezhad, Ghafari, and Rubinfeld [4] improved this query complexity to (1/p)^{exp(1/ε)}.

2 Technical Overview

The algorithm we use to construct Q is quite a simple one, introduced by [9] and subsequently studied by [8, 5]. Given a parameter t, the algorithm starts by drawing t realizations of G from the same distribution as 𝒢. Let us represent these random subgraphs by 𝒢_1, …, 𝒢_t. We then let Q be the union of the max-weight matchings of these graphs. That is,

Q := ⋃_{i≤t} MWM(𝒢_i),

where MWM(·) is a deterministic algorithm returning the max-weight matching of a given graph. Since Q is a union of t matchings, it clearly has max-degree at most t. The challenge, however, is proving that for t as small as O(1/p), the realization of Q contains a large-weight matching. We provide a constructive proof for this. That is, we design an algorithm for finding a matching with a large approximation ratio in 𝒬, the actual realization of Q. Below, we first briefly review the ideas used by previous work and then discuss the ingredients we add to achieve our desired result.

2.1 Crucial/Non-crucial Edge Decomposition

The framework utilized to analyze the aforementioned algorithm involves partitioning the edges into two categories: crucial and non-crucial. Separate arguments are then presented to demonstrate how these edges can be integrated to construct a matching of large enough weight in 𝒬. Let x_e denote the probability of edge e being part of the optimal solution, i.e.,

x_e = Pr[e ∈ MWM(𝒢)].

We define the set of crucial edges, denoted by C, and the set of non-crucial edges, denoted by N, as follows:

C = {e ∈ E : x_e ≥ τ}  and  N = {e ∈ E : x_e < τ},

where τ is a fixed threshold on the order of p. Note that by choosing a sufficiently large value of t = O(1/p), we can ensure that Q contains nearly all of the crucial edges. To establish the existence of a large-weight matching in 𝒬, the first step is to construct a matching M_c exclusively on the crucial edges which is an α-approximation with respect to the contribution of the crucial edges to the optimal solution. (M_c should satisfy some other useful properties, which we will discuss later.) The next step is constructing a fractional matching 𝒇 on the subgraph of non-crucial edges whose endpoints are unmatched in M_c. This fractional matching should satisfy the two following properties: first, for any edge e ∈ N, it holds that E[f_e] ≈ x_e; second, the values of f_e should be small enough to ensure that 𝒇 has almost no integrality gap. By combining these steps, the framework constructs a matching with weight almost α·𝖶(MWM(𝒢) ∩ C) + 𝖶(MWM(𝒢) ∩ N). Here 𝖶(·) is a function returning the weight of a given matching.

All papers utilizing this analysis framework require the algorithm used for constructing M_c to match the endpoints of any non-crucial edge e independently; otherwise, the edge is discarded. This requirement is the main reason why Behnezhad and Derakhshan [6] need to query an exponential number of edges per vertex. To achieve this property, they employ a distributed LOCAL algorithm for constructing M_c, which can lead to a vertex being dependent on the vertices within its Ω(log Δ)-radius ball, where Δ denotes the crucial degree of a vertex. Since potentially (1/p)^{log(1/p)} non-crucial edges may be discarded for each vertex, these edges need to have small x_e values. Consequently, a small threshold τ and a large t must be chosen. Due to known lower bounds for matching in the LOCAL model [15], one cannot hope to prove desirable approximation ratios for a Q of max-degree poly(1/p) following this approach.

In this work, we demonstrate that it is possible to relax the requirement regarding the independent matching of endpoints of any non-crucial edge in Mc. Instead, we replace it with an upper bound on the variance of a parameter related to the neighborhood of each vertex. Specifically, it should be possible to pick a subset A of the vertices unmatched by Mc such that:

  1. Any non-crucial edge e=(u,v) satisfies Pr[{u,v} ⊆ A] ≥ δ for a fixed constant δ > 0.

  2. Let us define the random variable Z_v related to the neighborhood of any vertex v as

    Z_v = Σ_{e=(u,v)∈N} (x_e / Pr[{u,v} ⊆ A]) · 𝟙_{u∈A}.   (1)

    Note that the randomization here is due to A and M_c being random variables themselves. We require the variance of this random variable to be upper-bounded as follows: Var(Z_v) ≤ 10τ/δ².

We define an algorithm for finding M_c and A to be a variance-bounding matching algorithm (see Definition 10) if it satisfies the above-mentioned properties (and a few others). We then provide a reduction demonstrating that if M_c is an α-approximation with respect to the contribution of the crucial edges to the optimal solution, then it is possible to find an almost 1/(2−α)-approximate solution on the realized edges of Q. Our proof strongly relies on independent edge realizations, hence enabling us to break the 2/3 barrier.

The second step of our analysis involves proving the existence of variance-bounding matching algorithms with approximation ratios of almost 0.535 and 1 − 1/e, respectively, for general graphs and bipartite graphs. In order to do this, we utilize algorithms designed for a variant of online stochastic matching, particularly batched random-order contention resolution schemes (RCRS). We prove that any α-selectable RCRS can be used to obtain a variance-bounding matching algorithm with an approximation ratio of almost α. We discuss this in Section 6.

3 Preliminaries

3.1 Notation

In the stochastic weighted matching problem, the input is an n-vertex graph G=(V,E), a vector of weights 𝒘 = {w_e : e ∈ E}, and a probability vector 𝒑 = {p_e : e ∈ E}. The subgraph 𝒢 is a random subgraph of G which contains each edge e independently with probability p_e. The goal in this problem is to pick a subgraph Q of G without the knowledge of its realization such that: (1) Q has a small max-degree, namely a constant with respect to n, and (2) the realized edges of Q (i.e., the graph Q ∩ 𝒢) contain a large-weight approximate matching. We define the approximation ratio as

E[𝖶(MWM(𝒬))] / E[𝖶(MWM(𝒢))],

where 𝒬 = 𝒢 ∩ Q is the realization of Q, MWM(·) is a deterministic algorithm returning a maximum weight matching of a given graph, and 𝖶(M) = Σ_{e∈M} w_e is a function returning the weight of a given matching M. We will use opt = MWM(𝒢) to refer to the maximum matching of the actual realization. We may sometimes abuse notation and use opt to refer to its expected weight when it is clear from the context. Note that while opt is a random variable, E[𝖶(opt)] is just a number. For any edge e ∈ E, we define

x_e = Pr[e ∈ opt],

where the probability is taken over the randomization in 𝒢. Similarly, for any vertex v ∈ V we let x_v = Pr[v ∈ opt] be the probability that v is matched in opt. By the definition stated below, 𝒙 is a fractional matching, as each vertex joins opt w.p. at most one.

Definition 2 (Fractional matching).

A fractional matching 𝐱 of a graph G=(V,E) is an assignment {x_e}_{e∈E} of values to the edges, where x_e ∈ [0,1] for each edge e ∈ E, and for each vertex v ∈ V we have x_v := Σ_{e∋v} x_e ≤ 1. We use |𝐱| := Σ_{e∈E} x_e to denote the size of a fractional matching, and for any subset E′ ⊆ E, we use 𝐱(E′) := Σ_{e∈E′} x_e.
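As a quick aid for Definition 2, here is a minimal validity checker (a sketch of ours; the dict-of-edges representation is our own convention, not the paper's):

```python
def is_fractional_matching(x, eps=1e-9):
    """Check Definition 2: x maps edges (u, v) to values x_e in [0, 1]
    such that every vertex's incident values sum to at most 1."""
    load = {}  # fractional degree x_v of each vertex
    for (u, v), xe in x.items():
        if not 0.0 <= xe <= 1.0:
            return False
        load[u] = load.get(u, 0.0) + xe
        load[v] = load.get(v, 0.0) + xe
    return all(xv <= 1.0 + eps for xv in load.values())
```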

Definition 3 (Graph hallucination).

We say a graph ℋ is a hallucination of a graph H, where H is a subgraph of G, if every edge e=(u,v) of H is present in ℋ independently with probability p_e.

Throughout the paper, we use the notation O_ε(f(n)), which means we have assumed ε to be a constant in calculating the complexity of f(n). The max-degree of the subgraph Q we find in this paper depends on the smallest p_e amongst all edges, which we refer to as p. In other words,

p = min_{e∈E} p_e.

In the following table, we list a set of variables and their values, which we will use throughout the paper. Values are defined as functions of ε ∈ (0,1), which is a sufficiently small constant, and δ ∈ (0,1), which we will introduce in Definition 10.

Table 1: Values of the parameters used throughout the paper.
δ = ε^{0.5},  τ = 20pε⁵δ²,  η = ε/10,  β = ε²/100,  γ = (1 − ε²)/(1 + 3η),  c = 10/ε.

3.2 Concentration Inequalities and Probabilistic Tools

In this section, we state the concentration inequalities and some of the probabilistic tools that will be used throughout the paper.

Proposition 4 (The Efron–Stein Inequality [17]).

Suppose X_1, …, X_n, X_1′, …, X_n′ are independent random variables with X_i and X_i′ having the same distribution for all i. Let X = (X_1, …, X_n) and X^{(i)} = (X_1, …, X_{i−1}, X_i′, X_{i+1}, …, X_n). Then:

Var(f(X)) ≤ (1/2) · Σ_{i=1}^n E[(f(X) − f(X^{(i)}))²]
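As a sanity check of ours (not from the paper), applying the inequality to f(X) = Σ_i X_i recovers the variance of an independent sum exactly, since E[(X_i − X_i′)²] = 2·Var(X_i):

(1/2) · Σ_{i=1}^n E[(f(X) − f(X^{(i)}))²] = (1/2) · Σ_{i=1}^n E[(X_i − X_i′)²] = Σ_{i=1}^n Var(X_i) = Var(f(X)).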
Proposition 5 (Chebyshev’s Inequality).

Let X be a random variable with finite non-zero standard deviation s (and thus finite expected value μ). Then for any real number c > 0, we have

Pr[|X − μ| ≥ c·s] ≤ 1/c².
Proposition 6 (Law of Total Variance).

Let X and Y be random variables defined on the same sample space. Then the variance of X can be expressed as

Var(X) = E[Var(X ∣ Y)] + Var(E[X ∣ Y]).
Definition 7 (Negative Association).

A set of random variables X_1, …, X_n is said to be negatively associated if for any two disjoint index sets I, J ⊆ [n] and any two functions f, g, both monotone increasing or both monotone decreasing, it holds that:

E[f(X_i : i ∈ I) · g(X_j : j ∈ J)] ≤ E[f(X_i : i ∈ I)] · E[g(X_j : j ∈ J)]

4 The Algorithm for Selecting 𝑸

In this section, we present a formal statement of the algorithm employed to construct the subgraph Q. We then explain how we can use the tools we provide later in the paper to show that querying Q proves Theorem 1 (the main theorem).

In summary, for a given parameter t = O_ε(1/p), we draw t matchings from the same distribution as opt (the optimal solution) and define Q as the union of these matchings.

Algorithm 1 Algorithm for constructing Q.
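The pseudocode body of Algorithm 1 did not survive extraction; the following Python sketch is a hedged reconstruction from the description in Section 2. The function names are ours, networkx's max_weight_matching stands in for the deterministic MWM oracle, and edges are assumed to carry "weight" and "p" attributes:

```python
import random
import networkx as nx

def sample_realization(G, rng):
    """Draw one hallucination of G: keep each edge e independently w.p. p_e."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from((u, v, d) for u, v, d in G.edges(data=True)
                     if rng.random() < d["p"])
    return H

def construct_Q(G, t, rng=None):
    """Algorithm 1 (sketch): Q is the union of max-weight matchings of t
    independent hallucinations G_1, ..., G_t, so deg_Q(v) <= t."""
    rng = rng or random.Random()
    Q = nx.Graph()
    Q.add_nodes_from(G.nodes)
    for _ in range(t):
        Gi = sample_realization(G, rng)
        Mi = nx.max_weight_matching(Gi, weight="weight")
        Q.add_edges_from((u, v, G.edges[u, v]) for u, v in Mi)
    return Q
```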

Let us define the subsets of crucial and non-crucial edges as follows.

C = {e ∈ E : x_e ≥ τ}  and  N = {e ∈ E : x_e < τ},   (2)

where τ = Θ(1/t) and t = 1/(τε) for a sufficiently small ε ∈ (0,1). (The actual value of τ and the other variables used in the paper are presented in Table 1.) Note that in the above algorithm, the matchings M_1, …, M_t are independent from each other and come from the same distribution as opt. This means that for any edge e and i ∈ [t] we have Pr[e ∈ M_i] = Pr[e ∈ opt] = x_e. As a result, the subgraph output by this algorithm contains almost all the crucial edges. Moreover, it picks any non-crucial edge e ∈ N with a large enough probability as a function of its x_e. We formally state these properties below in Claim 8 and Claim 9. While the proofs are pretty straightforward, we include them in the full version for the sake of completeness.

Claim 8.

Given constants τ, ε ∈ (0,1), let Q be the output of Algorithm 1 with parameter t ≥ 1/(τε). Any crucial edge e ∈ C with x_e ≥ τ is present in Q with probability at least 1 − ε.

Claim 9.

Any edge e ∈ E is present in Q with probability at least min(1/3, t·x_e/3).
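Although the formal proofs are deferred, the calculation behind Claim 8 is short (our sketch, not the paper's verbatim proof): since M_1, …, M_t are independent and each contains a crucial edge e with probability x_e ≥ τ,

Pr[e ∉ Q] = (1 − x_e)^t ≤ (1 − τ)^{1/(τε)} ≤ e^{−1/ε} ≤ ε,

where the final inequality holds for every ε ∈ (0,1).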

4.1 Proof of the Main Theorem

As discussed previously in Section 2, to prove Theorem 1, we will show that 𝒬 contains a 0.68-approximate matching for general graphs and a 0.73-approximate matching for bipartite graphs. Since Q is the union of t = O_ε(1/p) matchings, this proves our main result. In Definition 10, we define variance-bounding matching algorithms. In Lemma 11, we prove that for any α ∈ (0,1) and a small enough constant ε > 0, the existence of an α-approximate variance-bounding algorithm implies that 𝒬 contains a (1/(2−α) − ε)-approximate matching. In Lemma 21, we prove the existence of variance-bounding matching algorithms with approximation ratios of (0.535 − 6δ) and (1 − 1/e − 6δ), respectively, for general and bipartite graphs, where δ ∈ (0,1) is a parameter in Definition 10. In Table 1, we choose δ = ε^{0.5}. By picking a sufficiently small ε, this implies that 𝒬 contains a matching with an approximation ratio of

1/(2 − 0.535 + 6ε^{0.25}) − ε ≥ 0.68

for general graphs and approximation ratio of

1/(1 + 1/e + 6ε^{0.25}) − ε ≥ 0.73

for bipartite graphs.
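As a quick numeric sanity check (ours), ignoring the ε-dependent terms in the two displayed bounds:

1/(2 − 0.535) = 1/1.465 ≈ 0.6826 ≥ 0.68  and  1/(1 + 1/e) ≈ 1/1.3679 ≈ 0.7311 ≥ 0.73.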

5 The Reduction

In this section, we first introduce variance-bounding matching algorithms and then show that the existence of an α-approximation variance-bounding matching algorithm implies that it is possible to find a (1/(2−α) − ε)-approximate matching with O_ε(1/p) queries per vertex.

Definition 10 (Variance-bounding (VB) matching algorithm).

We call a matching algorithm 𝒱 an α-approximation variance-bounding algorithm if it has the following properties. It takes as input (1) a graph H=(V,E) whose edges are realized independently, each with a given probability p_e, forming subgraph ℋ, and (2) a matching M_𝒪 of ℋ found by an arbitrary (potentially randomized) algorithm. The algorithm then outputs a matching M_c of ℋ and a subset A of the vertices that are unmatched in M_c (A is just a subset of the unmatched vertices, so some unmatched vertices may not be in A) such that:

  1. M_c is, in expectation, an α-approximate matching with respect to M_𝒪.

  2. For any vertex v ∈ V, Pr[v ∈ A] ≥ Pr[v ∉ M_𝒪].

  3. For any two vertices u, v that do not have an edge in H, the following holds: Pr[{u,v} ⊆ A] ≥ δ for a δ ∈ (0,1).

  4. Given a parameter τ ∈ (0,1), let 𝒙 be a fractional matching on H̄ = (V, Ē) (the complement of H) with x_e ≤ τ for every e ∈ Ē. For any vertex v, the variable Z_v = Σ_{e=(u,v)∈Ē} (x_e / Pr[{u,v} ⊆ A]) · 𝟙_{u∈A} (cf. Equation (1)) satisfies Var(Z_v) ≤ 6τ/δ².

Let us briefly explain why we need a variance-bounding matching algorithm. We will use this algorithm on all the realized crucial edges (i.e., H=(V,C)) and define M_𝒪 to be a matching whose expected weight equals the contribution of the crucial edges to the optimal solution. We formally define these inputs in Definition 12. This gives us a matching M_c on the crucial edges and a subset A of vertices unmatched in M_c. We will then construct a fractional matching 𝐟 with a small integrality gap, exclusively using the (queried and realized) non-crucial edges between vertices in A. We need Property 1 to ensure that M_c is large with respect to the contribution of the crucial edges to the optimal solution. Property 2 ensures that each vertex is available in A with a large enough probability for its non-crucial edges to be able to contribute to 𝐟 almost as much as they contribute to opt. We need Property 3 to ensure that each edge is available for potential contribution to 𝐟 with a large enough probability. Finally, we will use Property 4 to prove that constructing 𝐟 in a particular way does not result in the fractional degrees of vertices exceeding one too often. For more details about the importance of this property, see Section 5.2.

Lemma 11 (The Reduction).

For constants α ∈ (0.5,1) and ε ∈ (0,0.1), the existence of an α-approximation variance-bounding algorithm 𝒱 (from Definition 10) implies a (1/(2−α) − ε)-approximation algorithm for the weighted stochastic matching problem with O_ε(1/p) queries per vertex.

We will prove that the existence of an α-approximation variance-bounding algorithm implies that querying the subgraph Q, the output of Algorithm 1 with parameter t = O_ε(1/p), gives us a (1/(2−α) − ε)-approximate solution. Before formally proving this in Section 5.3, we need to prove a series of other claims and provide some definitions. Below, we give a brief overview of the proof.

The first step of the proof is using the variance-bounding algorithm on the subgraph of all the crucial edges. Recall that 𝒱 takes as input (1) a graph H whose edges are realized independently, each with a given probability p_e, forming subgraph ℋ, and (2) a matching M_𝒪 of ℋ found by an arbitrary (potentially randomized) algorithm. Below, we detail the values we assign to these parameters in our reduction.

Definition 12 (H and M_𝒪).

In our reduction we choose the following values for H and M_𝒪:

  1. We set H to be the subgraph of all the crucial edges C. In other words, H=(V,C).

  2. We set M_𝒪 = MWM(ℋ ∪ 𝒩) ∩ ℋ, where ℋ = 𝒢 ∩ H is the actual realization of all the crucial edges, and 𝒩 is a random hallucination (refer to Definition 3) of the non-crucial edges, containing each edge independently with probability p_e. Note that M_𝒪 is a matching only on the realized crucial edges.

In the remainder of this paper, when referring to a variance-bounding algorithm without specifying the input, we assume that the variables H and M_𝒪 are defined according to Definition 12. Executing the variance-bounding algorithm 𝒱 with these predefined inputs gives us a matching M_c on the crucial edges and a subset of unmatched vertices denoted A. Using Property 1, we prove (in Claim 13) that M_c is an α-approximation with respect to the contribution of the crucial edges to the optimal solution. Since, due to Claim 8, Q contains any crucial edge with probability at least 1 − ε, this implies that M_c ∩ Q weighs, in expectation, at least (1−ε)·α times the contribution of the crucial edges to opt. It is important to note that we apply the algorithm 𝒱 to all realized crucial edges, not exclusively those within Q (the queried ones). This approach ensures that the output of 𝒱, consisting of M_c and the set A, is independent of the choice of Q, as outlined in Claim 13.

The next step of the reduction is using the non-crucial edges among vertices in A to construct a fractional matching 𝐟. In Lemma 20, we use properties of 𝒱 to ensure that the expected contribution of any non-crucial edge to 𝐟 is almost the same as its contribution to the optimal solution. We then use the fact that these edges are non-crucial (hence have small f_e values) to prove in Lemma 15 that 𝐟 has almost no integrality gap. Putting these pieces together, we prove that either the union of this rounded matching and M_c is a (1/(2−α) − ε)-approximate solution, or simply using only the crucial edges in 𝒬 gives us this approximation.

In the following claim, we prove two basic properties of M_c and the set A and their relation to the set of non-crucial edges in 𝒬.

Claim 13.

Let M_c and A be the outputs of an α-approximation variance-bounding algorithm that takes as input the subgraph H=(V,C) and the matching M_𝒪 defined in Definition 12. We have the following for M_c and A:

  1. The expected weight of the matching M_c is at least α times the expected weight of opt ∩ C (i.e., the contribution of the crucial edges to the optimal solution).

  2. Let Q be the subgraph of edges we choose to query. For any non-crucial edge e ∈ N, the event e ∈ 𝒬 is independent of both M_c and A.

Proof.

To prove the first item of this claim, we will first show that the matching M_𝒪 defined in Definition 12 has the same expected weight as the contribution of the crucial edges to the optimal solution. In other words, E[𝖶(opt ∩ C)] = E[𝖶(M_𝒪)]. Recall that we have defined M_𝒪 = MWM(ℋ ∪ 𝒩) ∩ ℋ, where ℋ = 𝒢 ∩ H is the actual realization of all the crucial edges, and 𝒩 is a random hallucination of the non-crucial edges containing each edge independently with probability p_e. This implies that ℋ ∪ 𝒩 comes from the same distribution as 𝒢, and as a result MWM(ℋ ∪ 𝒩) is drawn from the same distribution as opt. For any crucial edge e ∈ C this gives us Pr[e ∈ M_𝒪] = Pr[e ∈ opt ∩ C] and E[𝖶(opt ∩ C)] = E[𝖶(M_𝒪)]. This proves the first part of the claim since, due to Property 1 of Definition 10, we know M_c is an α-approximation with respect to M_𝒪.

To prove the second part of this claim, observe that the event e ∈ 𝒬 is a function of Q and the realization of the non-crucial edges, while M_c and A are obtained from running a variance-bounding matching algorithm with inputs H=(V,C) and M_𝒪 = MWM(ℋ ∪ 𝒩) ∩ ℋ. Here, ℋ is the actual realization of all the crucial edges, while 𝒩 is a random hallucination of the non-crucial edges (not the actual realization). The graph H=(V,C) and the function MWM(·) are deterministic, which means the only randomization in determining the values of M_c and A comes from ℋ and 𝒩. Since the edges of 𝒢 are realized independently, ℋ and 𝒩 are independent of the actual realization of the non-crucial edges. They are clearly also independent of the choice of Q. This implies that knowing the outcome of the event e ∈ 𝒬 does not change the distribution of M_c and A; hence they are independent.

5.1 A Fractional Matching on the Non-crucial Edges

In this section, we will construct a fractional matching on the non-crucial edges to augment the matching we get from running a variance-bounding matching algorithm on the crucial edges. Given a variance-bounding algorithm 𝒱, let M_c and A be the output of 𝒱 with inputs given according to Definition 12. To begin, let us define a variable g_e for any non-crucial edge e=(u,v) as follows:

g_e = x_e / (Pr[e ∈ 𝒬] · Pr[{u,v} ⊆ A]),   (3)

where x_e is defined as

x_e = Pr[e ∈ opt].

Ideally, for constructing our fractional matching, we would assign a fractional value of g_e to edge e whenever e ∈ 𝒬 and both of its endpoints are in A. Since these events are independent due to Claim 13, their joint probability is Pr[e ∈ 𝒬] · Pr[{u,v} ⊆ A]. By constructing a fractional matching in this manner, we achieve E[f_e] = x_e for any edge e ∈ N, making E[𝒇 · 𝒘] equal to the contribution of the non-crucial edges to E[𝖶(opt)].

However, the challenge lies in the fact that constructing 𝒇 in this way may result in it not being a valid fractional matching, as certain vertices may have a fractional degree greater than one. In other words, Σ_{e=(u,v)∈N} f_e > 1 may occur for some vertices v ∈ V. To address this issue, we first scale down the fractional values by a small amount. Subsequently, we discard any vertex whose fractional degree still exceeds one. The challenge then becomes demonstrating that this event does not significantly reduce the expected size of the fractional matching. We formally state the algorithm for constructing a fractional matching on the non-crucial edges in Algorithm 2.

Algorithm 2 A fractional matching on the realized non-crucial edges.
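The pseudocode body of Algorithm 2 is likewise missing from the extraction; the sketch below is a hedged reconstruction from the surrounding description (the names g, gamma, in_Q_realized, and A mirror the notation above; the zeroing-out loop corresponds to the step the text refers to as Line 13):

```python
def fractional_matching_on_noncrucial(N_edges, g, gamma, in_Q_realized, A):
    """Sketch of Algorithm 2. N_edges: list of non-crucial edges (u, v);
    g[e]: the value g_e from Eq. (3); in_Q_realized(e): True iff e is both
    queried and realized (e in calligraphic Q); A: vertex set from the
    variance-bounding algorithm."""
    f = {}
    for (u, v) in N_edges:
        if u in A and v in A and in_Q_realized((u, v)):
            f[(u, v)] = gamma * g[(u, v)]   # scaled-down tentative value
    # Kill every vertex whose fractional degree exceeds one
    # (the zeroing-out step the text calls Line 13).
    deg = {}
    for (u, v), fe in f.items():
        deg[u] = deg.get(u, 0.0) + fe
        deg[v] = deg.get(v, 0.0) + fe
    killed = {w for w, d in deg.items() if d > 1.0}
    for (u, v) in list(f):
        if u in killed or v in killed:
            f[(u, v)] = 0.0
    return f
```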

Since our ultimate goal is to demonstrate the existence of a large-weight integral matching on 𝒬 rather than a fractional one, let us first address the integrality gap of the fractional matching produced by this algorithm. We first prove an upper bound of ε³ on g_e for non-crucial edges in Claim 14. We then use this in Lemma 15 to prove that the output of Algorithm 2 has a small integrality gap. To help with the flow of the paper, both proofs are deferred to the full version.

Claim 14.

By choosing a sufficiently small constant ε > 0 in Algorithm 2, we get g_e ≤ ε³ for all non-crucial edges.

Lemma 15.

Consider the fractional matching 𝐟 produced by Algorithm 2. There exists an integral matching on the non-crucial edges of 𝒬 between vertices in A with weight at least (1 − ε/2) · 𝐟 · 𝐰.

Survival of vertices and non-crucial edges.

For any vertex v ∈ V, we say v survives Algorithm 2 iff it is in set A and it is not killed in Line 13 of the algorithm (i.e., its fractional degree is not reduced to zero). We also say an edge e survives the algorithm iff both its endpoints survive (regardless of whether it is in 𝒬 or not).

5.2 Expected Weight of the Fractional Matching

Let 𝒇′ denote the value of 𝒇 constructed by Algorithm 2 before it zeroes out certain f_e values in Line 13. As discussed earlier in this section, it is evident that E[f_e′] = γ·x_e for any edge e ∈ N. Since γ deviates from one by a small constant, the expected weight of 𝒇′ is a sufficiently large approximation relative to the contribution of the non-crucial edges to the optimal solution. Thus, the primary challenge lies in proving that we do not incur a substantial loss by zeroing out certain f_e values in Line 13. Roughly speaking, we only have the opportunity to use an edge e=(u,v) whenever it is in 𝒬 and its endpoints are in A (i.e., f_e′ ≠ 0), and we lose this opportunity if at least one of its endpoints does not survive Algorithm 2. That is, we have:

Pr[f_e ≠ 0] = Pr[f_e′ ≠ 0] − Pr[u or v does not survive, f_e′ ≠ 0].

To quantify the amount of loss per edge, we need to upper-bound Pr[u or v does not survive, f_e′ ≠ 0] and show that it is significantly smaller than Pr[f_e′ ≠ 0]. Note that here, f_e′ ≠ 0 is not independent of e's endpoints surviving, since e contributes to their fractional degrees. Furthermore, e ∈ Q is correlated, albeit negatively (see Lemma 25), with the presence of its neighboring non-crucial edges in Q, which may also impact the fractional degrees of u and v in 𝒇′. However, it is still helpful to first upper-bound the probability of u and v not surviving without conditioning on f_e′ ≠ 0. The intuition behind this is that since f_e′ is very small (i.e., upper-bounded by ε³ due to Claim 14), its impact on the fractional degree of each endpoint is insignificant. Moreover, since e ∈ Q is negatively associated with e′ ∈ Q for any non-crucial edge e′ ≠ e connected to u or v, conditioning on e ∈ Q does not increase their f_{e′}. To upper-bound Pr[v does not survive] for any vertex v, let us define

Y_v = Σ_{e=(u,v)∈N} g_e · 𝟙_{u∈A} · 𝟙_{e∈𝒬}.   (4)

Since we set f_e′ = γ·g_e iff e ∈ 𝒬 and {u,v} ⊆ A, whenever vertex v is present in A we have

Σ_{e=(u,v)∈N} f_e′ = γ·Y_v.

Hence, vertex v survives Algorithm 2 iff γ·Y_v ≤ 1. In Lemma 16, we prove that the random variable Y_v is concentrated around its mean for any vertex. This analysis crucially relies on Property 4 of variance-bounding algorithms. This would have been enough if we knew E[Y_v] were sufficiently close to one. While we do not exactly have this, we can use the second property of variance-bounding algorithms to show E[Y_v ∣ v ∈ A] ≤ 1. This is only doable thanks to Property 2. Combining all of these together, we are finally able to prove in Lemma 17 that Y_v is sufficiently close to one with a sufficiently large probability.

Since both Lemma 16 and Lemma 17 have lengthy and technical proofs, we present the detailed proof of the former in Section A.1 and defer that of the latter to the full version. Finally, we put the pieces together in Lemma 20 to demonstrate that E[f_e] is sufficiently large compared to x_e (the contribution of e to the optimal solution).

Lemma 16.

For any vertex v ∈ V define the random variable

Y_v = Σ_{e=(u,v)∈N} g_e · 𝟙_{u∈A} · 𝟙_{e∈𝒬},

where g_e = x_e / (Pr[e ∈ 𝒬] · Pr[{u,v} ⊆ A]). The following inequality holds for these random variables:

Pr[|Y_v − E[Y_v]| ≥ η] ≤ β

for β = ε²/100 and η = ε/10.

Lemma 17.

For any vertex v ∈ V we have:

Pr[Y_v ≥ 1 + 3η] ≤ β

The proof is deferred to the full version of the paper.

Definition 18.

For a vertex u, we define Y_v^{(u)} to be the same summation as Y_v, but excluding the edge e=(u,v). Formally:

Y_v^{(u)} = Σ_{e=(u′,v)∈N, u′≠u} g_e · 𝟙_{u′∈A} · 𝟙_{e∈𝒬}.
Lemma 19.

For every edge e=(v,u) and constant λ ∈ (0,1) we have:

Pr[Y_v^{(u)} > λ] ≥ Pr[Y_v^{(u)} > λ ∣ e ∈ 𝒬]
Lemma 20.

For every non-crucial edge e=(u,v), we have E[f_e] ≥ (1 − ε/2)·x_e.

Due to space constraints, we defer the proofs of these two lemmas to the full version.

5.3 Proof of Lemma 11 (The Reduction)

In this section, we will put all the pieces together to formally prove Lemma 11. Let Q be the subgraph output by Algorithm 1. We prove that the existence of an α-approximation variance-bounding matching algorithm means 𝒬, the realization of Q, contains a (1/(2−α) − ε)-approximate solution. Since Q is the union of t = 1/(τε) matchings, plugging in the value of τ = 20pε⁵δ² from Table 1 implies Q has max-degree O_ε(1/p). Therefore, to prove this lemma, it suffices to show that 𝒬 contains a (1/(2−α) − ε)-approximate solution.

Let M_c and A be the outputs of the α-approximation variance-bounding algorithm on the inputs specified in Definition 12. Recall that M_c is a matching on the crucial edges and A is a subset of vertices unmatched in M_c. Let σ be the ratio of the optimal solution that comes from the crucial edges. That is,

σ = (Σ_{e∈C} Pr[e ∈ opt] · w_e) / E[𝖶(opt)].

Due to Claim 13, we know that the expected weight of M_c is at least an α·σ fraction of the optimal solution. Furthermore, we showed in Claim 8 that any crucial edge belongs to Q with probability at least 1 − ε. As a result, we have

E[𝖶(M_c ∩ 𝒬)] ≥ (1 − ε)·α·σ·E[𝖶(opt)].   (5)

The next step is to use the non-crucial edges among vertices in A to augment M_c ∩ 𝒬. In Lemma 20, we prove that it is possible to construct a fractional matching 𝒇 on the non-crucial edges among vertices in A such that for any non-crucial edge e,

E[f_e] ≥ (1 − ε/2)·Pr[e ∈ opt].

Hence, E[𝒇 · 𝒘] ≥ (1 − σ)·(1 − ε/2)·E[𝖶(opt)]. As a result of Lemma 15, it is possible to round 𝒇 and achieve an integral matching M_n such that

E[𝖶(M_n)] ≥ (1 − ε/2)·(1 − σ)·(1 − ε/2)·E[𝖶(opt)] ≥ (1 − ε)·(1 − σ)·E[𝖶(opt)].   (6)

Putting Equation 5 and Equation 6 together implies the existence of a matching in 𝒬 with expected weight at least

(1 − ε)·α·σ·E[𝖶(opt)] + (1 − ε)·(1 − σ)·E[𝖶(opt)] = (1 − ε)·E[𝖶(opt)]·(1 − σ + α·σ).

We claim that the better of this matching and simply taking the max-weight matching among the crucial edges of 𝒬 gives us the desired approximation ratio. Since each crucial edge belongs to Q w.p. at least 1 − ε, the realization of Q contains a matching with expected weight at least (1 − ε) times the contribution of the crucial edges to the optimal solution, which is (1 − ε)·σ·E[𝖶(opt)]. The better of these two solutions gives us an approximation ratio of at least

(1 − ε)·max(σ, 1 − σ + α·σ) ≥ (1 − ε)·(1/(2 − α)) ≥ 1/(2 − α) − ε.
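To see the first inequality (a one-line check of ours): σ ↦ σ is increasing while σ ↦ 1 − σ + α·σ is decreasing (as α < 1), so the max is minimized where the two cross,

σ = 1 − σ + α·σ  ⟺  σ·(2 − α) = 1  ⟺  σ = 1/(2 − α),

at which point both terms equal 1/(2 − α). The second inequality holds since ε/(2 − α) ≤ ε.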

Hence, this implies that the realization of the subgraph Q with max-degree O_ε(1/p) contains a (1/(2−α) − ε)-approximate solution, completing the proof of Lemma 11.

6 A Variance-Bounding Matching Algorithm

In this section, we discuss the existence of variance-bounding matching algorithms (defined in Definition 10) for general and bipartite graphs. We will show that any α-selectable batched random-order contention resolution scheme (RCRS) [13] can be used to get a variance-bounding matching algorithm with an approximation ratio of almost α.

Lemma 21 (Variance-bounding Matching Lemma).

For any sufficiently small constant δ ∈ (0,1), there exists a (0.535 − 6δ)-approximation variance-bounding matching algorithm for general graphs and a (1 − 1/e − 6δ)-approximation one for bipartite graphs, both of which satisfy the third property with parameter δ.

To prove this lemma, we will design an algorithm that achieves the properties of a variance-bounding matching algorithm discussed in Definition 10. Let us start with the definition of batched RCRS, which we will use in the design of our algorithms.

Batched RCRS.

Suppose we are given a graph G=(V,E) along with a fractional matching 𝒚 on the graph. The graph is revealed in an online manner as follows. Vertices arrive in a uniformly random order given by a permutation π. Upon the arrival of a vertex v, the statuses of the edges connecting v to the vertices before it (i.e., all vertices u with π_u < π_v) are revealed; we call these edges a batch. Then at most one of the edges in the batch becomes active, such that

Pr[e becomes active] = y_e

for any edge e in the batch. A batched RCRS decides, upon the arrival of each vertex, irrevocably whether to select the active edge (if any exists). At any point in time, the selected edges must form a matching. Given a parameter α, a batched RCRS is called α-selectable if it selects each active edge with probability at least α.

The best-known batched RCRS for general graphs is by MacRury and Ma [16] and has α = 0.535. For bipartite graphs, there is a batched RCRS with α = 1 − 1/e due to Gamlath, Kale, and Svensson [14].

Proposition 22 ([16] and [14]).

There exists a 0.535-selectable batched RCRS for general graphs and a (1 − 1/e)-selectable batched RCRS for bipartite graphs.
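To make the batched RCRS interface concrete, here is a toy simulation under simplifying assumptions of ours: activation is driven directly by the y_e values, and the selection rule is plain greedy, which serves only as a placeholder and does not achieve the selectability guarantees of Proposition 22:

```python
import random
from collections import defaultdict

def simulate_batched_rcrs(V, y, rng=None):
    """Toy batched random-order CRS. y maps edges (u, v) to fractional-matching
    values, so the y_e incident to each vertex sum to at most 1. Upon each
    arrival, at most one batch edge is activated with Pr = y_e; greedy selects
    it iff both endpoints are still unmatched."""
    rng = rng or random.Random(0)
    order = list(V)
    rng.shuffle(order)                       # uniformly random arrival order
    nbrs = defaultdict(list)
    for (u, v), ye in y.items():
        nbrs[u].append((v, (u, v), ye))
        nbrs[v].append((u, (u, v), ye))
    arrived, matched, selected = set(), set(), []
    for v in order:
        batch = [(e, ye) for (u, e, ye) in nbrs[v] if u in arrived]
        r, acc, active = rng.random(), 0.0, None
        for e, ye in batch:                  # categorical draw: Pr[e active] = y_e
            acc += ye
            if r < acc:
                active = e
                break
        arrived.add(v)
        if active and not (set(active) & matched):
            selected.append(active)          # greedy (placeholder) selection rule
            matched.update(active)
    return selected
```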

We state our variance-bounding matching algorithm formally in Algorithm 3. The algorithm starts by drawing a permutation π over the vertices uniformly at random. We then let the vertices arrive in the order given by this permutation. Upon the arrival of a vertex v, we look at the realization of its edges to the vertices with smaller π_u. Then a random process decides which one of its edges (if any) becomes active. We explain this process in Definition 23. The process is designed in such a way that the probability of each edge becoming active is (1 − 6δ)·Pr[e ∈ M_𝒪], where M_𝒪 is the random matching in the statement of Lemma 21 and δ is the parameter from the third property of variance-bounding matching algorithms.

Definition 23.

(Edge Activation Process.) The activation probabilities of the edges in this process come from the matching M_𝒪 and the parameter δ in the statement of Definition 10. (Recall that M_𝒪 is a matching on the realized crucial edges; see Definition 12 for how we set M_𝒪.) Let us define

y_e = Pr[e ∈ M_𝒪],   (7)

where the randomization comes from the realization of the edges in ℋ and the algorithm for finding M_𝒪. Note that 𝐲 is a fractional matching since each vertex joins M_𝒪 with probability at most one. Moreover, define the set E_{v,π} = {e = (u,v) ∈ E : π_u < π_v} to be all of v's edges to vertices u with π_u < π_v. After looking at the realization of all these edges, let y_e′ be the probability of e being in M_𝒪 conditioned on the realization of E_{v,π}. That is,

y_e′ = Pr[e ∈ M_𝒪 ∣ realization of the edges in E_{v,π}].   (8)

We activate at most one of the realized edges at random such that the probability of any remaining realized edge e being the active one is g(e) = (1 − 6δ)·y_e′. This is possible since the y_e′ of the realized edges sum up to at most one.

We now have all the required tools to design our algorithm (stated below), which we claim satisfies the properties stated in Lemma 21.

Algorithm 3 Variance-bounding Matching on H=(V,E).
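The body of Algorithm 3 is also lost to extraction; below is a hedged Python reconstruction based on Definition 23 and the description above. The conditional probability y_e′ of Equation (8) is abstracted as an oracle y_cond (in practice it would be computed from the distribution of M_𝒪), rcrs_select stands for an arbitrary α-selectable batched RCRS decision rule, and the exact choice of A is deferred to the full version, so we simply return all unmatched vertices:

```python
import random

def variance_bounding_matching(V, H_edges, realized, y_cond, delta,
                               rcrs_select, rng=None):
    """Sketch of Algorithm 3 on H = (V, E). realized(e): whether edge e is
    realized; y_cond(e, revealed): oracle for y'_e = Pr[e in M_O | statuses
    revealed so far]; rcrs_select: batched RCRS selection rule."""
    rng = rng or random.Random(0)
    order = list(V)
    rng.shuffle(order)                       # uniformly random arrival order
    pos = {v: i for i, v in enumerate(order)}
    revealed, Mc, matched = {}, [], set()
    for v in order:
        other = lambda e: e[0] if e[1] == v else e[1]
        batch = [e for e in H_edges if v in e and pos[other(e)] < pos[v]]
        for e in batch:
            revealed[e] = realized(e)        # reveal this batch's statuses
        # Edge activation (Definition 23): Pr[e active] = (1 - 6*delta) * y'_e.
        r, acc, active = rng.random(), 0.0, None
        for e in batch:
            if revealed[e]:
                acc += (1 - 6 * delta) * y_cond(e, dict(revealed))
                if r < acc:
                    active = e
                    break
        if active and not (set(active) & matched) and rcrs_select(active, Mc):
            Mc.append(active)
            matched.update(active)
    # The paper admits a particular subset of the unmatched vertices into A;
    # as a placeholder, return all of them.
    A = {w for w in V if w not in matched}
    return Mc, A
```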
Claim 24.

For any permutation π in Algorithm 3, the probability of any edge e becoming active is (1 − 6δ)·Pr[e ∈ M_𝒪].

Proof.

We defer the proof of this claim to the full version due to space constraints.

Now, we are ready to prove Lemma 21 by showing that Algorithm 3 satisfies the properties of a variance-bounding matching algorithm. We defer this to the full version.

References

  • [1] Sepehr Assadi and Aaron Bernstein. Towards a unified theory of sparsification for matching problems. In Jeremy T. Fineman and Michael Mitzenmacher, editors, 2nd Symposium on Simplicity in Algorithms, SOSA 2019, January 8-9, 2019, San Diego, CA, USA, volume 69 of OASIcs, pages 11:1–11:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/OASICS.SOSA.2019.11.
  • [2] Sepehr Assadi, Sanjeev Khanna, and Yang Li. The Stochastic Matching Problem with (Very) Few Queries. In Proceedings of the 2016 ACM Conference on Economics and Computation, EC ’16, Maastricht, The Netherlands, July 24-28, 2016, pages 43–60, 2016. doi:10.1145/2940716.2940769.
  • [3] Sepehr Assadi, Sanjeev Khanna, and Yang Li. The Stochastic Matching Problem: Beating Half with a Non-Adaptive Algorithm. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC ’17, Cambridge, MA, USA, June 26-30, 2017, pages 99–116, 2017. doi:10.1145/3033274.3085146.
  • [4] Amir Azarmehr, Soheil Behnezhad, Alma Ghaffari, and Ronitt Rubinfeld. Stochastic matching via in-n-out local computation algorithms. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, 2025.
  • [5] Soheil Behnezhad, Avrim Blum, and Mahsa Derakhshan. Stochastic vertex cover with few queries. In Joseph (Seffi) Naor and Niv Buchbinder, editors, Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 - 12, 2022, pages 1808–1846. SIAM, 2022. doi:10.1137/1.9781611977073.73.
  • [6] Soheil Behnezhad and Mahsa Derakhshan. Stochastic weighted matching: (1−ε) approximation. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1392–1403. IEEE, 2020.
  • [7] Soheil Behnezhad, Mahsa Derakhshan, Alireza Farhadi, MohammadTaghi Hajiaghayi, and Nima Reyhani. Stochastic matching on uniformly sparse graphs. In Dimitris Fotakis and Evangelos Markakis, editors, Algorithmic Game Theory - 12th International Symposium, SAGT 2019, Athens, Greece, September 30 - October 3, 2019, Proceedings, volume 11801 of Lecture Notes in Computer Science, pages 357–373. Springer, 2019. doi:10.1007/978-3-030-30473-7_24.
  • [8] Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Stochastic matching with few queries: (1-ε) approximation. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors, Proccedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 1111–1124. ACM, 2020. doi:10.1145/3357713.3384340.
  • [9] Soheil Behnezhad, Alireza Farhadi, MohammadTaghi Hajiaghayi, and Nima Reyhani. Stochastic Matching with Few Queries: New Algorithms and Tools. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 2855–2874, 2019. doi:10.1137/1.9781611975482.177.
  • [10] Soheil Behnezhad and Nima Reyhani. Almost Optimal Stochastic Weighted Matching with Few Queries. In Proceedings of the 2018 ACM Conference on Economics and Computation, Ithaca, NY, USA, June 18-22, 2018, pages 235–249, 2018. doi:10.1145/3219166.3219226.
  • [11] Avrim Blum, John P. Dickerson, Nika Haghtalab, Ariel D. Procaccia, Tuomas Sandholm, and Ankit Sharma. Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries. In Proceedings of the Sixteenth ACM Conference on Economics and Computation, EC ’15, Portland, OR, USA, June 15-19, 2015, pages 325–342, 2015. doi:10.1145/2764468.2764479.
  • [12] Shaddin Dughmi, Yusuf Hakan Kalayci, and Neel Patel. On sparsification of stochastic packing problems. In Kousha Etessami, Uriel Feige, and Gabriele Puppis, editors, 50th International Colloquium on Automata, Languages, and Programming, ICALP 2023, July 10-14, 2023, Paderborn, Germany, volume 261 of LIPIcs, pages 51:1–51:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ICALP.2023.51.
  • [13] Tomer Ezra, Michal Feldman, Nick Gravin, and Zhihao Gavin Tang. Online stochastic max-weight matching: prophet inequality for vertex and edge arrival models. In Proceedings of the 21st ACM Conference on Economics and Computation, pages 769–787, 2020. doi:10.1145/3391403.3399513.
  • [14] Buddhima Gamlath, Sagar Kale, and Ola Svensson. Beating greedy for stochastic bipartite matching. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2841–2854. SIAM, 2019. doi:10.1137/1.9781611975482.176.
  • [15] Fabian Kuhn, Thomas Moscibroda, and Roger Wattenhofer. Local computation: Lower and upper bounds. Journal of the ACM (JACM), 63(2):1–44, 2016. doi:10.1145/2742012.
  • [16] Calum MacRury and Will Ma. Random-order contention resolution via continuous induction: Tightness for bipartite matching under vertex arrivals. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 1629–1640, 2024. doi:10.1145/3618260.3649788.
  • [17] J Michael Steele. An efron-stein inequality for nonsymmetric statistics. The Annals of Statistics, 14(2):753–758, 1986.
  • [18] Yutaro Yamaguchi and Takanori Maehara. Stochastic Packing Integer Programs with Few Queries. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 293–310, 2018. doi:10.1137/1.9781611975031.21.

Appendix A Appendix

A.1 Proof of Lemma 16

We devote this section to the proof of Lemma 16, as it is lengthy and technical.

Lemma 16 (restated). For any vertex v ∈ V define the random variable

Y_v = Σ_{e=(u,v)∈N} g_e · 𝟙_{u∈A} · 𝟙_{e∈𝒬},

where g_e = x_e / (Pr[e ∈ 𝒬] · Pr[{u,v} ⊆ A]). The following inequality holds for these random variables:

Pr[|Y_v − E[Y_v]| ≥ η] ≤ β

for β = ε²/100 and η = ε/10.

To prove the desired concentration bound on Y_v, we begin by bounding its variance. This will allow us to apply Chebyshev's inequality (Proposition 5) to prove our desired bound. Let us first examine the random variables that affect Y_v's value. One collection is the set of indicators for the presence of vertices in the set A after running Algorithm 3, i.e., S_A = {𝟙_{u∈A} : u ∈ V}; the second collection is the indicators of the edges being present in 𝒬, i.e., S_Q = {𝟙_{e∈𝒬} : e=(u,v) ∈ N}.

By the law of total variance (Proposition 6) we have:

Var[Y_v] = E[Var(Y_v ∣ S_A)] + Var[E(Y_v ∣ S_A)]

We will later prove that

E[Var[Y_v ∣ S_A]] ≤ 60(ε⁶ + ε⁵)   (9)

To bound the term Var[E(Y_v ∣ S_A)], let us first examine what E(Y_v ∣ S_A) is.

E[Y_v ∣ S_A] = Σ_{e=(u,v)∈N} E[g_e · 𝟙_{u∈A} · 𝟙_{e∈𝒬} ∣ S_A]
 = Σ_{e=(u,v)∈N} E[(x_e / (Pr[e∈𝒬] · Pr[{u,v}⊆A])) · 𝟙_{u∈A} · 𝟙_{e∈𝒬} ∣ S_A]
 = Σ_{e=(u,v)∈N} E[𝟙_{e∈𝒬} / Pr[e∈𝒬]] · E[(x_e / Pr[{u,v}⊆A]) · 𝟙_{u∈A} ∣ S_A]
 = Σ_{e=(u,v)∈N} (x_e / Pr[{u,v}⊆A]) · 𝟙_{u∈A}

To go from the second line to the third, we use the fact that A and 𝒬 are independent due to Claim 13; the last line then follows since E[𝟙_{e∈𝒬}] = Pr[e ∈ 𝒬] and 𝟙_{u∈A} is determined by S_A. Note that the term in the last line is exactly the variable Z_v from Property 4 of Definition 10. Applying that property with the fractional matching 𝒙 being the edges having x_e < τ (the non-crucial edges), we get:

Var[E(Y_v ∣ S_A)] ≤ 10τ/δ²

Adding this to what we have from Equation 9 and applying the law of total variance, we get

Var(Y_v) ≤ 60(ε⁶ + ε⁵) + 10τ/δ²   (10)
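Concretely (our arithmetic, using p ≤ 1 and ε < 1), plugging τ = 20pε⁵δ² into Equation 10 gives

Var(Y_v) ≤ 60ε⁶ + 60ε⁵ + 10·(20pε⁵δ²)/δ² = 60ε⁶ + 60ε⁵ + 200pε⁵ ≤ 320ε⁵,

which is at most ε⁴/10⁴ once ε ≤ 1/(3.2·10⁶).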

Since τ = 20pε⁵δ², by setting ε to be a small enough constant we can get the bound Var[Y_v] ≤ ε⁴/10⁴. This bounds the standard deviation of Y_v by s ≤ ε²/100, which is what we use when applying Chebyshev's inequality (see Proposition 5) to the random variable Y_v. Now that we have s ≤ ε²/100, by applying Chebyshev's inequality we get

Pr[|Y_v − E[Y_v]| ≥ c·s] ≤ 1/c²   (11)

Note that we want to bound the probability that Y_v deviates from its mean by η. If η ≥ c·s, we have

Pr[|Y_v − E[Y_v]| ≥ η] ≤ Pr[|Y_v − E[Y_v]| ≥ c·s]   (12)

Plugging in the value η = ε/10 and using the fact that s ≤ ε²/100, we can see that it is enough to set c = 10/ε to satisfy η ≥ c·s. Therefore, by combining (11) and (12), we get

Pr[|Y_v − E[Y_v]| ≥ η] ≤ Pr[|Y_v − E[Y_v]| ≥ c·s] ≤ 1/c² = ε²/100 = β

Now that we have proved the statement of the lemma using Equation 9, we prove that equation, which states E[Var[Y_v ∣ S_A]] ≤ 60(ε⁶ + ε⁵). Our first step is to see how the random variables in S_Q behave. First of all, the random variables in S_Q are not independent: since 𝒬 is a collection of matchings, for two incident edges e₁ and e₂, when e₁ is present in one of the matchings, e₂ cannot be in that matching. This intuition might lead us to believe that, for the edges relevant to S_Q, because they all intersect at the vertex v, their presence in 𝒬 is pairwise negatively correlated. This is in fact true, and to prove it we establish a stronger fact about these random variables, namely negative association, which implies negative correlation (see Definition 7 for the definition).

Lemma 25.

Random variables in SQ are negatively associated.

Proof.

See the full version for the proof.

By definition, negative association implies negative correlation. This means Lemma 25 implies that for two edges e₁=(u₁,v), e₂=(u₂,v) such that e₁, e₂ ∈ N, we have:

Cov(𝟙_{e₁∈𝒬}, 𝟙_{e₂∈𝒬}) ≤ 0   (13)

Let us take an arbitrary realization of the variables in S_A and call it 𝑨. Our plan is, given this fixed 𝑨, to first upper-bound Var[Y_v ∣ 𝑨]. Then, using that, we find an upper bound for Var[Y_v]. At last, we apply Proposition 5 to prove the statement of the lemma.

Define the random variable

X_u = (g_{(u,v)} · 𝟙_{u∈A} · 𝟙_{e∈𝒬} ∣ 𝑨).

We can see that if 𝟙_{u∈A} = 0, X_u is always equal to zero, and the inequalities discussed further will be trivial for Var[X_u]. In the case that 𝟙_{u∈A} = 1, X_u = g_{(u,v)} · 𝟙_{e∈𝒬}. We can see that (Y_v ∣ 𝑨) = Σ_{(u,v)∈N} X_u. Now we are ready to bound the variance of Y_v conditioned on 𝑨. The first step is to bound the variance of X_u:

Var[X_u] = Var[g_e · 𝟙_{u∈A} · 𝟙_{e∈𝒬} ∣ 𝑨] ≤ Var[g_e · 𝟙_{e∈𝒬} ∣ 𝑨]   (14)

This is because when we have fixed 𝑨, in the case that 𝟙_{u∈A} = 0 the variance of X_u is zero, and in the case that 𝟙_{u∈A} = 1 the bound in Equation 14 holds.

Now, since Var[X_u] = E[X_u²] − E[X_u]² ≤ E[X_u²], from Equation 14 we get:

Var[X_u] ≤ E[X_u²] ≤ E[(g_e · 𝟙_{e∈𝒬} ∣ 𝑨)²] = E[(g_e · 𝟙_{e∈𝒬})²] = Pr[e ∈ 𝒬] · g_e²   (15)

Note that we can remove the conditioning on 𝑨 because the variables in S_Q and S_A are independent. The last step comes from the fact that (g_e · 𝟙_{e∈𝒬})² equals g_e² with probability Pr[e ∈ 𝒬] and is zero otherwise. To make further progress, we need a bound on Pr[e ∈ Q], which Claim 9 provides.

Expanding g_e in Equation 15, we get:

Var[X_u] ≤ Pr[e ∈ 𝒬] · (x_e / (p_e · Pr[e ∈ Q] · Pr[{u,v} ⊆ A]))²
 = (x_e² / (p_e · Pr[e ∈ Q])) · (1 / Pr[{u,v} ⊆ A]²)
 ≤ x_e² / (p_e · Pr[e ∈ Q] · δ²)   (16)

To go from the first line to the second, first note the distinction between Q and 𝒬 in the equation above. By the definition of e ∈ 𝒬 as (e ∈ Q) ∧ (e ∈ 𝒢), we can see that Pr[e ∈ 𝒬] = p_e · Pr[e ∈ Q]. This is because e ∈ 𝒢 is independent of e ∈ Q, since Q is constructed on hallucinations of 𝒢. To go from the second line to the third, note that in Lemma 21 we showed Pr[{u,v} ⊆ A] ≥ δ.

Moreover, from Claim 9 we know that Pr[e ∈ Q] ≥ min(1/3, t·x_e/3), so we consider two cases:

Case 1: Pr[e ∈ Q] ≥ t·x_e/3.

Combining this with (16), we get:

Var[X_u] ≤ x_e² / (p_e · Pr[e ∈ Q] · δ²) ≤ 3x_e² / (p_e · t · x_e · δ²) = 3x_e / (p_e · t · δ²)
Case 2: Pr[e ∈ Q] ≥ 1/3.

Combining this with (16), we get:

Var[X_u] ≤ x_e² / (p_e · Pr[e ∈ Q] · δ²) ≤ 3x_e² / (p_e · δ²)

Now that we have a bound on all the Var[X_u]'s, we are ready to bound Var[Y_v ∣ 𝑨]. The following proposition is what we need.

Proposition 26.

Let X be a random variable written as the sum of random variables X_1, …, X_n, so that X = Σ_{i=1}^n X_i. Then we have:

Var[X] = Σ_{i=1}^n Var[X_i] + 2·Σ_{i=1}^n Σ_{j>i} Cov(X_i, X_j)

In (13) we argued that all variables in S_Q are pairwise negatively correlated. Recall the definition X_u = g_{(u,v)} · 𝟙_{u∈A} · 𝟙_{e∈𝒬}. Because we have fixed 𝑨, each X_u is identically zero or equals g_{(u,v)} · 𝟙_{e∈𝒬}. Hence we can argue that Cov(X_u, X_w) ≤ 0: if at least one of them is identically zero, then Cov(X_u, X_w) = 0; otherwise, since the g_e's are constants, the sign of Cov(X_u, X_w) is the same as that of Cov(𝟙_{(u,v)∈𝒬}, 𝟙_{(w,v)∈𝒬}).

Therefore, by applying Proposition 26 to the X_u's and using the fact that they are pairwise negatively correlated, we get:

Var[Y_v ∣ 𝑨] ≤ Σ_u Var[X_u] ≤ Σ_u max(3x_e / (p_e·t·δ²), 3x_e² / (p_e·δ²)) ≤ Σ_u 3x_e / (p_e·t·δ²) + Σ_u 3x_e² / (p_e·δ²)   (17)

For brevity, we write Σ_u instead of Σ_{(u,v)∈N} in all the equations here. To bound the first sum, note that t = 1/(20ε⁶δ²p) and also Σ_{(u,v)∈N} x_e ≤ 1; therefore we have:

Σ_u 3x_e / (p_e·t·δ²) = Σ_u 3x_e · (20ε⁶δ²p) / (p_e·δ²) ≤ Σ_u 60ε⁶·x_e ≤ 60ε⁶   (18)

To bound the second sum, note that for non-crucial edges we have x_e ≤ τ. Since τ = 20pε⁵δ², we get:

Σ_u 3x_e² / (p_e·δ²) ≤ Σ_u 3τ·x_e / (p_e·δ²) = Σ_u 3x_e · (20pε⁵δ²) / (p_e·δ²) ≤ Σ_u 60ε⁵·x_e ≤ 60ε⁵   (19)

Putting things together, we get Var[Y_v ∣ 𝑨] ≤ 60(ε⁶ + ε⁵). Since we have proved this for an arbitrary 𝑨, we can remove the conditioning on 𝑨 and get:

E[Var[Y_v ∣ S_A]] ≤ 60(ε⁶ + ε⁵)   (20)

which is exactly Equation 9, so the proof is complete.