Sink-Free Orientations: A Local Sampler with Applications
Abstract
For sink-free orientations in graphs of minimum degree at least , we show that there is a deterministic approximate counting algorithm that runs in time , a near-linear time sampling algorithm, and a randomised approximate counting algorithm that runs in time , where denotes the number of vertices of the input graph and is the desired accuracy. All three algorithms are based on a local implementation of the sink popping method (Cohn, Pemantle, and Propp, 2002) under the partial rejection sampling framework (Guo, Jerrum, and Liu, 2019).
Keywords and phrases:
Sink-free orientations, local sampling, deterministic countingCategory:
RANDOMCopyright and License:
2012 ACM Subject Classification:
Theory of computation Random walks and Markov chainsAcknowledgements:
We thank Guus Regts for helpful comments on an earlier version of this paper.Funding:
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 947778). Jiaheng Wang also acknowledges support from the ERC (grant agreement No. 101077083, CountHom).Editors:
Alina Ene and Eshan ChattopadhyaySeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
The significance of counting has been recognised in the theory of computing since the pioneering work of Valiant [44, 43]. In the late 80s, a number of landmark approximate counting algorithms [30, 18, 31] were discovered. A common ingredient of these algorithms is the computational equivalence between approximate counting and sampling for self-reducible problems [32]. The reduction from counting to sampling decomposes the task into a sequence of marginal probability estimations, each of which is tractable for sampling techniques such as Markov chains. However, while only the marginal probability of one variable is in question, simulating Markov chains requires keeping track of the whole state of the instance, and thus is obviously wasteful. It is more desirable to draw samples while accessing only some local structure of the target variable. We call such algorithms local samplers.
The first such local sampler was found by Anand and Jerrum [2], who showed how to efficiently generate perfect local samples for spin systems even when the underlying graph is infinite. Using local information is essential here as it is not possible to perfectly simulate the whole state. Subsequently, Feng, Guo, Wang, Wang, and Yin [20] found an alternative local sampler, namely the so-called coupling towards the past method, which yields local implementations of rapid mixing Markov chains. It is also observed that sufficiently efficient local samplers lead to immediate derandomisation via brute-force enumeration. Moreover, local samplers are crucial to obtain sub-quadratic time approximate counting algorithms for spin systems [1]. Thus, local samplers are highly desirable algorithms as they can lead to fast sampling, fast approximate counting, and deterministic approximate counting algorithms.
Guo, Jerrum, and Liu [25] introduced partial rejection sampling (PRS) as yet another efficient sampling technique. This method generalises the so-called cycle-popping algorithm for sampling spanning trees [46] and the sink-popping algorithm for sampling sink-free orientations [16]. It also has close connections with the Lovász local lemma [19]. For extremal instances (in the sense of [33]), PRS is just the celebrated Moser-Tardos algorithm for the constructive local lemma [39]. The most notable application of PRS is the first fully polynomial-time randomised approximation scheme (FPRAS) for all-terminal network reliability [23]. On the other hand, it is still open if all-terminal reliability and counting sink-free orientations admit deterministic fully polynomial-time approximation schemes (FPTASes). Thus, in view of the aforementioned derandomisation technique [20], a local implementation of PRS is a promising way to resolve these open problems.
In this paper, we make some positive progress for sink-free orientations (SFOs). Given an undirected graph , a sink-free orientation of is an orientation of its edges such that each vertex has at least one outgoing edge. SFOs were first studied by Bubley and Dyer [10] as a restricted case of Sat.111We remark that a variant of SFOs is also introduced in the context of distributed computing under the name of sinkless orientations, where they are used to give a lower bound for the distributed Lovász local lemma [8]. The main difference is that for sinkless orientations, usually only vertices of sufficiently high degrees are required to be non-sinks. They showed that exact counting of SFOs is #P-complete, and thus is unlikely to have a polynomial-time algorithm. For approximate counting and sampling, in [11], they showed that a natural Markov chain has an mixing time, where is the number of edges. Later, Cohn, Pemantle, and Propp [16] introduced an exact sampler, namely the aforementioned sink-popping algorithm that runs in time in expectation, where is the number of vertices. Using the partial rejection sampling framework, Guo and He [22] improved the running time of sink-popping to , and constructed instances where this running time is tight. It is open whether a faster sampling algorithm or an FPTAS exists.
Our main result is a local sampler based on partial rejection sampling for SFOs. Using this local sampler, for graphs of minimum degree , we obtain a deterministic approximate counting algorithm that runs in time , a near-linear time sampling algorithm, and a randomised approximate counting algorithm that runs in time , where is the given accuracy. All three algorithms appear difficult to obtain using previous techniques. We will describe the results in more detail in the next section.
1.1 Our contribution and technique overview
Our local sampler works for a slight generalisation of SFOs, which are intermediate problems required by the standard counting to sampling reduction [32]. In these problems, a subset of vertices is specified, which are required to be sink-free, and the task is to estimate the probability of a vertex not in not being a sink. We denote by the uniform distribution of orientations where no vertex in is a sink.
Before describing our technique, let us first review the sink-popping algorithm, (which is a special case of partial rejection sampling and the same as the Moser-Tardos algorithm [39] as the instance is extremal). We orient each edge uniformly at random. As long as there is a sink in , we select one such vertex, arbitrarily, and rerandomise all edges incident to it, until there is no sink belonging to .
Our key observation is that it is unnecessary to simulate all edges to decide if is a sink. In particular, if, at any point of the execution of the algorithm, is a sink, then no adjacent edges will ever be resampled and stays a sink till the algorithm finishes. On the other hand, if at any point , using only the already oriented edges, belongs to a cycle, a path leading to a cycle, or a nonempty path leading to some vertex not in , then the orientations of all edges involved will not be resampled, and stays a non-sink until the algorithm terminates. Thus, this observation gives us an early termination criterion for determining whether is a sink or not. Moreover, since in the sink-popping algorithm, the order of sinks popped can be arbitrary, we can reveal the random orientation of edges strategically, and pop sinks if necessary. To be more precise, we first reveal the edges adjacent to one by one. Once there is an outgoing edge , we then move to and repeat this process. If any sink is revealed, we erase the orientations of all its adjacent edges and backtrack. Eventually, one of the two early termination rules above will kick in, and this gives us a local sample.
Ideally, we want our local sampler to run in time, where is the number of vertices. Unfortunately, the one described above does not necessarily terminate this fast. To see this, consider a sequence of degree vertices, where at each step there is equal probability to move forward or backtrack. Resolving such a path of length would require time. On the other hand, when the minimum degree of the input graph is at least , the length of the path followed by the sampler forms a submartingale. The vertex can be a sink only if this path has length . Thus, once the length of the path is at least for some constant , the probability of being a sink is very small. This allows us to truncate the local sampler with only a small error. Recall that is the uniform distribution of orientations under which no vertex in is a sink. Our main theorem is as follows.
Theorem 1 (local sampler).
There exists an algorithm that, given a graph with minimum degree at least , , and , samples a Boolean random variable in time such that .
Moreover, there exists an algorithm that, in the same setting, given any edge , samples a random orientation of that is -close in total variation distance from the marginal distribution of under .
We note that for the algorithm in Theorem 1, we assume that the input graph can be accessed by an adjacent neighbour oracle. That is, given a vertex and a number , it returns the -th neighbour of . Also, all our algorithms work for not necessarily simple graphs.
The FPTAS for #SFO follows from the derandomisation method of [20] to the truncated local sampler. By -approximation, we mean an estimate such that , where is the target quantity.
Corollary 2 (deterministic approximate counting).
For graphs with minimum degree at least , there exists a deterministic algorithm that, given , outputs an -approximation to the number of sink-free orientations with running time , where is the number of vertices.
Although high, the constant exponent in the running time of Corollary 2 is actually the most interesting feature of our algorithm. This exponent is derived from the hidden constant in Theorem 1. In contrast, the running time of most known FPTASes [45, 4, 6, 28, 5, 40, 37, 20, 12] has an exponent that depends on some parameter (such as the maximum degree) of the input graph. There are exceptions, for example, [34, 27], but the exponents of their running times still depend on the parameters of the problem (not of the instance).
Another corollary of Theorem 1 is the following fast approximate sampler.
Corollary 3 (fast sampling).
For graphs with minimum degree at least , there exists a sampling algorithm that, given , outputs a random orientation such that is -close to a uniform random sink-free orientation in total variation distance, with running time , where is the number of edges.
We note that when the minimum degree is at least , the same running time can be achieved by the Moser-Tardos algorithm [38], as the symmetric local lemma condition holds in that case. However, when the minimum degree is , the symmetric local lemma condition no longer holds and our analysis relies on Shearer’s condition [42]. Under Shearer’s condition, we are not aware of a similar running time bound of the MT algorithm.
Our sampler runs in time222The notation hides logarithmic factors. instead of the time that sink-popping requires, at the cost of generating an approximate sample instead of a perfect sample. This improves over sink-popping when and leads to a faster FPRAS using the counting-to-sampling reduction [32]. In fact, the running time of the FPRAS can be improved further by directly invoking the truncated local sampler in the reduction.
Corollary 4 (fast approximate counting).
For graphs with minimum degree at least , there exists a (randomised) algorithm that, given , outputs a quantity that is an -approximation with probability at least to the number of sink-free orientations. The running time is , where is the number of vertices.
The success probability in Corollary 4 is standard in the definition of FPRAS, and can be easily amplified by taking the median of repeated trials and applying the Chernoff bound.
Note that directly combining Corollary 3 with the counting-to-sampling reduction results in an running time. Corollary 4 is faster when . Previously, the best running time for approximate counting is , via combining the time sink-popping algorithm [22] with simulated annealing (see, for example, [22, Lemma 12]). Corollary 4 improves over this by roughly a factor of . In very dense graphs (when ), Corollary 4 achieves near-linear time, which appears to be rare for approximate counting.
1.2 Related work
Our local sampler in Theorem 1 also falls under the framework of local access of huge random objects, a question first proposed by Goldreich, Goldwasser, and Nussboim [21]. This is a sampling version of local computation algorithms (LCAs) [41]. As such, the LCA lower bound for sinkless orientations by Brandt, Grunau, and Rozhon [9] implies that the running time of Theorem 1 is nearly optimal. Moreover, an early termination implementation of the general Moser-Tardos algorithm was developed by Davies-Peck [17] for LCAs and distributed algorithms. However, the algorithm in [17] requires a “polynomially-weakened” symmetric local lemma condition, whereas in the setting of Theorem 1 only Shearer’s condition for the local lemma holds. In addition, the algorithm in [17] appears to be too slow for our derandomisation purposes.
There are a plethora of fast sampling and deterministic approximate counting techniques by now. However, it appears difficult to achieve our results without the new local sampler. For example, the coupling of Bubley and Dyer [11] does not seem to improve with the minimum degree requirement. On a similar note, the recent deterministic counting technique of [12] requires a distance-decreasing Markov chain coupling, whereas the Bubley-Dyer coupling is distance non-increasing. In any case, even if the technique of [12] applied, it would not imply a running time with a constant exponent. Other fast sampling and FPTAS techniques, such as spectral independence [3, 15, 14], correlation decay [45, 35], and zero-freeness of polynomials [5, 40, 26], all seem difficult to apply. The main obstacle is that these techniques typically make use of properties that hold under arbitrary conditionings. However, for SFO, even if we start with a graph of minimum degree , conditioning the edges can result in a graph that is effectively a cycle, in which case no nice property holds. Our techniques, in contrast, require no hereditary properties and thus can benefit from the minimum degree requirement.
One much less obvious alternative approach to FPTAS is via the connection of the local lemma. In particular, because SFOs form extremal instances, their number can be computed via the independence polynomial evaluated at negative weights on the dependency graph. (We also see this fact in Section 4.1.) Normally this approach would not be efficient, because the dependency graph is usually exponentially large (for example for all-terminal reliability), but in the case of SFOs, the dependency graph is just the input graph itself. There are more than one FPTASes [40, 28] for the independence polynomial at negative weights. However, neither appears able to recover Corollary 2. With the minimum degree assumption, the probability vector for SFOs is within the so-called Shearer’s region, where both algorithms apply.333In [40], only a uniform bound is stated, but one can introduce a scaling variable and make a new polynomial in , so that their algorithm works in the Shearer’s region. The downside is that the running time of both algorithms has the form ,444To be more precise, the hidden constants in the exponents decrease in the multiplicative “slack” of how close the evaluated point is to the boundary of Shearer’s region. For SFOs, when constant degree vertices are present, the slack is a constant, and so are the hidden constants in the exponents. where is the maximum degree of the graph. Thus, in the setting of Corollary 2, these algorithms run in quasi-polynomial time instead. A more detailed discussion is given in Section 4.2.
Very recently, after our paper was posted on arXiv, Bencs and Regts [7] showed that in fact, the connection above can lead to an FPTAS for SFOs via Barvinok’s interpolation method. The main novel idea is that one need to instantiate the independence polynomial in a different way. As in Corollary 2, their algorithm also requires minimum degree , but their running time is faster, being , where is the number of vertices and is the number of edges of the graph.
1.3 Organisation
2 A local sampler for sink-free orientations
Fix as an undirected graph. An orientation of is an assignment of a direction to each edge, turning the initial graph into a directed graph. For any , let be the set of -sink-free orientations of , i.e., the set of orientations such that each vertex is not a sink. Thus, is the set of all (normal) sink-free orientations of . When , we use to denote the uniform distribution over . For two adjacent vertices , we use to denote the undirected edge and to denote the directed edge, from to .
We apply the following standard counting-to-sampling reduction [32]. Let be arbitrarily ordered and, for each , define . Then, can be decomposed into a telescopic product of marginal probabilities:
| (1) |
Thus, our goal becomes to estimate for any and .
We view -sink-free orientations under the variable framework of the Lovász local lemma. Here, each edge corresponds to a variable that indicates its direction, and each vertex in represents a bad event of being a sink. An instance is called extremal if any two bad events are independent (namely, they share no common variable) or disjoint. It is easy to see that all instances to the -sink-free orientation problem are extremal: if a vertex is a sink then none of its neighbors can be a sink. For extremal instances like this, the celebrated Moser-Tardos algorithm [39] is guaranteed to output an assignment avoiding all bad events uniformly at random [25]. This is summarised in Algorithm 1. Note that when , Algorithm 1 is the sink-popping algorithm by Cohn, Pemantle, and Propp [16].
The following lemma is a direct corollary from [25, Theorem 8] and SFOs being extremal.
Lemma 5.
If , Algorithm 1 terminates almost surely and returns an orientation distributed exactly as .
We remark that the only possible case for is when forms a tree and is not connected to any vertex not in .
Algorithm 1 requires one to generate a global sample when estimating for some , which is wasteful. The following observation is crucial to turning it into a local sampler.
Lemma 6 (criteria for early termination).
Suppose . For any , is a sink upon the termination of Algorithm 1 if and only if
- (a)
-
becomes a sink at some iteration.
Conversely, is not a sink upon the termination of Algorithm 1 if and only if one of the following holds:
- (b1)
-
a directed cycle containing , or a directed path containing which ends in a directed cycle is formed in some iteration, or
- (b2)
-
a nonempty directed path from to some is formed in some iteration.
Proof.
First, consider (a). If becomes a sink at any point, then for every which is a neighbour of , the edge is oriented towards . Since , the edge will not be resampled via , and could only be resampled by becoming a sink. Since cannot become a sink without resampling , will remain a sink. The other implication is obvious.
Now for (b1) and (b2), we first consider the forward implications. For a cycle , every vertex has an edge pointing outwards towards some which also has an edge pointing outwards. None of these edges can be resampled without another edge being resampled first, so no vertex in the cycle will ever become a sink again.
Consider a path that ends outside of or in a cycle. Inductively, we see that no edge on this path will be resampled without the edge after it being resampled. The edge connected to the cycle or will not be resampled, since that vertex may never be a sink, so no vertex in can become a sink.
For the reverse implication, suppose is eventually not a sink. In that case, there must be some edge pointing towards a neighbour . If is a sink, then and we are in case (b2). Otherwise, is not a sink, and there is an adjacent edge pointing away from , and we repeat this process. As the set of vertices is finite, if vertices considered in this process are all in , then there must be repeated vertices eventually, in which case we are in (b1).
An illustration of Lemma 6 is given in Figure 1. Based on Lemma 6, we design a local sampling algorithm for determining whether some vertex is a sink or not, given in Algorithm 2. We assume that the undirected graph is stored as an adjacency list where the neighbors of each vertex are arbitrarily ordered. Algorithm 2 takes as input some and , returns an indicator variable such that . We treat the path as a subgraph, and denotes the vertex set of . When we remove the last vertex from , we remove it and the adjacent edge as well. Informally, Algorithm 2 starts from the vertex , and reveals adjacent edges one by one. If there is any edge pointing outward, say , we move to and reveal edges adjacent to . If a sink is formed, then we mark all adjacent edges of as unvisited and backtrack. This induces a directed path starting from . We repeat this process until any of the early termination criteria of Lemma 6 is satisfied, in which case we halt and output accordingly.
3 Analysis of the local sampler
In this section, we analyse the correctness and efficiency of Algorithm 2.
Lemma 7 (correctness of Algorithm 2).
Let be a graph. For any such that and any , terminates almost surely and upon termination, returns an such that
Proof.
We claim that there exists a coupling between the execution of Algorithm 1 (with input and output ), and (with output ), such that
| (2) |
The claim implies the lemma because of Lemma 5.
To prove the claim, we first construct our coupling. We use the resampling table idea of Moser and Tardos [39]. For each edge, we associate it with an infinite sequence of independent random variables, each of which is a uniform orientation. This forms the “resampling table”. Our coupling uses the same resampling table for both Algorithm 1 and Algorithm 2. As showed in [39], the execution of Algorithm 1 can be viewed as first constructing this (imaginary) table, and whenever the orientation of an edge is randomised, we just reveal and use the next random variable in the sequence. For Algorithm 2, we reveal the orientation of an edge when its status changes from unvisited to visited in Line 11. We execute Line 14 if the random orientation from the resampling table is , and otherwise do nothing. Namely, in the latter case, the revealed orientation is , and we just move forward the “frontier” of that edge by one step in the resampling table. We claim that (2) holds with this coupling.
Essentially, the claim holds since for extremal instances, given a resampling table, the order of the bad events resampled in Algorithm 1 does not affect the output. This fact is shown in [24, Section 4]. (See also [16, Lemma 2.2] for the case of .) We can “complete” Algorithm 2 after it finishes. Namely, once Algorithm 2 terminates, we randomise all edges that are not yet oriented, and resample edges adjacent to sinks until there are none, using the same resampling table. Note that at the end of Algorithm 2, some edges may be marked unvisited but are still oriented. Suppose that the output of the completed algorithm is , an orientation of all edges. This completed algorithm is just another implementation of Algorithm 1 with a specific order of resampling bad events. Thus the fact mentioned earlier implies that .
On the other hand, the termination conditions of Algorithm 2 correspond to the cases of Lemma 6. One can show, via a simple induction over the while loop, that at the beginning of the loop, the path is always a directed path starting from , and all other visited edges point towards the path . This implies that Line 13 corresponds to case (b1) in Lemma 6, Line 5 corresponds to case (b2), and exiting the while loop in Line 3 corresponds to case (a). When Algorithm 2 terminates, we have decided whether or not is a sink. By Lemma 6, this decision stays the same under . As , (2) holds.
We then analyse the efficiency of Algorithm 2. The main bottleneck is when there are degree vertices. It would take time to resolve an induced path of length . We then focus on the case where the minimum vertex degree is at least . Note that in this case we have for any . For two distributions and over the state space , their total variation distance is defined by . For two random variables and , we also write to denote .
We will need the notion of martingales in the next lemma, so let us briefly recap the relevant notions. A sequence of random variables is a submartingale if , and . Submartingales enjoy some nice concentration properties. What we will need is the Azuma-Hoeffding inequality, which states that if is a submartingale and for all , then for any and ,
| (3) |
Then we have the following lemma.
Lemma 8 (efficient truncation of Algorithm 2).
Let be a graph with minimum degree at least . Let , and . Let be the output of , and constructed as
-
if terminates within executions of Line 12, let ;
-
otherwise, let .
Then, it holds that
Proof.
We track the length of the path during the execution of Algorithm 2. When an edge is chosen in Line 4 and sampled in Line 12 of Algorithm 2, the following happens:
-
with probability , is appended to and the length of increases by one;
-
with probability , is marked as visited, and the length of decreases by one in the next iteration if and only if was the last unvisited edge incident to .
Let be the random variable denoting the length of after the -th execution of Line 12 in Algorithm 2. Then the observation above implies that forms a submartingale. We construct another sequence of random variables modified from as follows:
-
.
-
At the -th execution of Line 12 in Algorithm 2:
-
–
if is the only unvisited edge incident to , set ,
-
–
otherwise, set .
-
–
It can be verified that the sequence is a martingale.
Claim 9.
For any , .
Proof.
Note that where if is the only unvisited edge incident to at the -th execution of Line 12 in Algorithm 2, and otherwise. Then we can write as
For any such that , let be the last index such that , or if no such exists. Since the minimum degree of is at least , when we append any vertex to , there are at least two unvisited edges incident to . It implies that there must be some such that and . Thus .
Next we show that if Algorithm 2 doesn’t terminate after steps, with high probability the length of the path will not return to . As is a martingale and for all , the Azuma–Hoeffding inequality, namely (3), implies that, for any and ,
| (4) |
Thus,
where the first inequality is by Claim 9, and the last inequality is by plugging into (4). Then, we have
| (5) | ||||
To finish the proof, we couple and by the same execution of Algorithm 2. Thus, if it terminates within executions of Line 12, then with probability . If not, (5) implies that with probability at most . As we always output in this case, with probability at most , which finishes the proof.
Note that Lemma 8 does not require . This is because it is implied by the minimum degree requirement. This implication is an easy consequence of the symmetric Shearer’s bound. It is also directly implied by Lemma 10 which we show next.
Lemma 8 implies the first part of Theorem 1. For the second part, we will need a modified version of Algorithm 2 to sample from the marginal distributions of the orientation of edges. This is given in Algorithm 3. It takes as input a subset of vertices and an edge , then outputs a random orientation following the marginal distribution induced from on . The differences between Algorithm 2 and Algorithm 3 are:
-
In Algorithm 2, the number of vertices in is initialised as , while in Algorithm 3, it is initialised as .
-
Algorithm 2 can terminate when while Algorithm 3 can not terminate due to (a) of Lemma 6 unless or are not in .
The correctness of Algorithm 3 is due to a coupling argument similar to Lemma 7. We couple Algorithm 1 and Algorithm 3 by using the same resampling table. By the same argument as in Lemma 7, given the same resampling table, the orientation of is the same in the outputs of both Algorithm 1 and Algorithm 3. Thus, follows the desired marginal distribution by Lemma 5. As for efficiency, we notice that the same martingale argument as in Lemma 8 applies to the length of as well. Early truncation of the edge sampler only incurs a small error. This finishes the proof of Theorem 1.
4 Applications of the local sampler
We derive the main applications of our local sampler in this section. Lemma 8 implies an additive error on the truncated estimator. As we are after relative errors in approximate counting, we need a lower bound of the marginal ratio.
Lemma 10.
Let be a graph with a minimum degree at least . For any and , it holds that and
The proof of Lemma 10 can be viewed as an application of the symmetric Shearer’s Lemma [42] on SFO, and is deferred to Section 4.1. Note that the minimum degree requirement is essential for such a marginal lower bound to hold, as the marginal ratio in Lemma 10 can be of order when is a cycle and .
We then show the two approximate counting algorithms first, namely Corollary 2 and Corollary 4.
Proof of Corollary 2.
By (1), we just need to -approximate for each to -approximate , the number of sink-free orientations to . The only random choice Algorithm 2 makes is Line 12. In view of Lemma 8, we enumerate the first random choices of Algorithm 2, and just output if the algorithm does not terminate by then. Let the estimator be the average of all enumeration. Note that Lemma 10 implies that for any . Then, Lemmas 7 and 8 imply that the estimator is an additive approximation. By Lemma 10, it is also an relative approximation, which is what we need.
For the running time, there are marginals, it takes enumerations for each marginal probability, and each enumeration takes at most time. Therefore, the overall running time is bounded by .
Proof of Corollary 4.
We use (1) again. Denote and . Let be the average of independent samples from Algorithm 2 truncated after executions of Line 12. Let be an estimator for .
For any and , let be the expectation of (note that it does not depend on ). By Lemmas 7 and 8, . By Lemma 10, . Let so that . Then, as ,
| (6) |
We bound and next. First,
as each is an indicator variable. Then,
| (by Lemma 10) |
To further reduce the variance, let be the average of independent samples of , where . Then, . By Chebyshev’s bound, we have
Thus with probability at least , we have that . By (6), when this holds, . It is then easy to have an -approximation of .
For the running time, each sample takes time. We draw samples for each of the vertices, and we repeat this process times. Thus, the total running time is bounded by .
For Corollary 3, we need the second part of Theorem 1 and Algorithm 3. In addition, some extra care for the self-reduction in the overall sampling algorithm is required.
Proof of Corollary 3.
We sequentially sample the orientation of edges in (approximately) from its conditional marginal distribution. Suppose we choose an edge , and the sampled orientation is . Then, we can remove from the graph, and let . The conditional distribution is effectively the same as in the remaining graph.
One subtlety here is that doing so may create vertices of degree . To cope with this, we keep sampling edges adjacent to one vertex in as much as possible before moving on to the next. Suppose the current focus is on . We use SampleEdge to sample the orientation of edges adjacent to one at a time until either is removed from or the degree of becomes . In the latter case, the leftover edge must be oriented away from , which also results in removing from . Note that, when either condition holds, the last edge sampled is oriented as for some neighbour of . We then move our focus to if , and move to an arbitrary vertex in otherwise. The key property of choosing edges this way is that, whenever is invoked, there can only be at most one vertex of degree in , and if it exists, it must be an endpoint of . If all vertices are removed from , we finish by simply outputing a uniformly at random orientation of the remaining edges.
To maintain efficiency, we truncate in each step of the sampling process. More specifically, for some constant , we output the first edge of once the number of executions of Line 15 in Algorithm 3 exceeds . We claim that there is a constant such that the truncation only incurs an error in total variation distance between the output and the marginal distribution. This is because the same martingale argument as in Lemma 8 still applies. Note that if visits any vertex not in , the algorithm immediately terminates. Thus degrees of vertices not in do not affect the argument. Moreover, the only degree vertex in , say , is adjacent to the first edge of . If is initialised as , then when returns to , the algorithm immediately terminates. Otherwise is initialised as , in which case there is no drift in the first step of . Thus, as long as we adjust the constant to compensate the potential lack of drift in the first step, the martingale argument in Lemma 8 still works and the claim holds. As the truncation error is , we may couple the untruncated algorithm with the truncated version, and a union bound implies that the overall error is at most .
As we process each edge in at most time, the overall running time is then . This finishes the proof of the fast sampling algorithm.
4.1 Proof of the marginal lower bound
Now we prove the lower bound of marginal ratios for SFOs, namely, Lemma 10. Let us first recall the variable framework of the local lemma. Consider the probability space of a uniformly random orientation of (namely orienting each edge independently and uniformly at random), and each corresponding to a bad event of being a sink. We then have
where denotes the degree of . We also need some definitions, essentially from [29] and small variations from those in [42].
Definition 11.
We define the following notations.
-
Let denote all independent sets of , i.e.,
-
For , let
(7) and
which is the probability under that all vertices in are sink-free.
We then proceed to the proof.
Proof of Lemma 10.
For , let denote the set of neighbours of in . We claim that for any and :
-
1.
;
-
2.
Lemma 10 immediately follows from the claim as and
where the last equality is by Item 1 and the inequality is by Item 2.
We then prove the claim by induction on the size of . The base case is when , and all items directly hold as . For the induction step, we first prove Item 1. For and , denote We have
where in the second equality we used the fact that -SFO instances are extremal. Thus,
| (8) |
Also, by separating independent sets according to whether they contain or not, we have
| (9) |
Combining (8), (9), and the induction hypothesis, we have that . For the positivity, let be listed as for some . For , let (with ). Then we have
where the first inequality is by Item 2 of the induction hypothesis. Thus, by (9) and Item 1 holds.
It remains to show Item 2. Recall that is listed as . For any , because and . Then by the induction hypothesis on the second case of Item 2,
| (10) |
By (9), we have
| (11) |
If , and as ,
If , and, again, as ,
Together with (11), this finishes the proof of Item 2 and the lemma.
In the proof above, Item 1 holds mainly because the instance is extremal. For general non-extremal cases, we would have for and instead.
4.2 Independence polynomial at negative weights
An interesting consequence of Item 1 in the proof of Lemma 10 is that the number of SFOs can be computed using the independence polynomial evaluated at negative activities. More specifically, similar to (7), let
where is a vector of weights for each vertex. Then, where is defined in (7), and thus , where is the vector of failure probabilities at the vertices. Namely where is the degree of .
There are more than one FPTASes [40, 28] that can efficiently approximate the independence polynomial at negative weights. These algorithms work in the so-called Shearer’s region [42]. To explain Shearer’s region, let us abuse the notation slightly and extend the definition in (7) to a function to take an input weight vector . Then, a vector is in Shearer’s region if and only if for all . Lemma 10 implies that the probability vector for SFOs is in Shearer’s region. Moreover, we say a vector has slack if is in Shearer’s region. For a vector with slack , the algorithm by Patel and Regts [40] -approximates in time , and the algorithm by Harvey, Srivastava, and Vondrák [28] runs in time , where is the maximum degree of the graph. They do not recover Corollary 2 as the slack is a constant when constant degree vertices exist. If, in the meantime, some other vertices have unbounded degrees, these algorithms run in quasi-polynomial time instead.
To see the last point, we construct a graph that contains vertices of unbounded degrees but with constant slack for SFOs. Consider the wheel graph , which consists of a cycle of length , and a central vertex that connects to all vertices of . Thus, and for any in . For the cycle, as there are two SFOs, we see that (where we use Item 1 in the proof of Lemma 10). Thus, by (9),
Therefore, the slack here is at most , despite the existence of a high degree vertex.
In summary, the existing FPTASes on the independence polynomial with negative weights do not handle the mixture of high and low degree vertices well enough for the case of SFOs. However, it might provide an alternative approach to derive FPTASes to count solutions to extremal instances of the local lemma, which is worthy of further study.
5 Concluding remarks
Originally, Bubley and Dyer [10] introduced sink-free orientations as a special case of read-twice Sat. Here, “read-twice” means that each variable in a CNF formula appears exactly twice, and it corresponds to an edge of the graph. Vertices of the graph correspond to clauses of the formula. The assignment of the edge is an orientation. This represents that the variable appears with opposite signs in the formula. In fact, Bubley and Dyer showed an FPRAS for all read-twice #Sat. It is natural to ask if they admit FPTAS as well. This question was first raised by Lin, Liu, and Lu [35], who also gave an FPTAS for monotone read-twice #Sat (which is equivalent to counting edge-covers in graphs). The monotone requirement means that the two appearances of any variable have the same sign. From this perspective, our FPTAS is complementary to that of [35]. However, as our techniques are drastically different from [35], to give an FPTAS for all read-twice #Sat, one may need to find a way to combine these two techniques to handle mixed appearances of variables.
Another immediate question is to generalise our local sampler under the partial rejection sampling framework. The first step would be to be able to handle degree 2 vertices for SFOs, which breaks our current submartingale argument. To go a bit further, a local sampler for all extremal instances would yield an FPTAS for all-terminal reliability, whose existence is a major open problem. Also, for all-terminal reliability, one may also attempt to localise the near-linear time sampler in [13].
Lastly, in addition to the discussion of Section 4.2, let us discuss another polynomial associated with SFOs and its zero-freeness. Fix a SFO . Let , where is the number of edges, and indicates how many SFOs exist that agree with in exactly edges. It is easy to evaluate , and is the total number of SFOs. However, for a cycle, this polynomial becomes , which can have a zero arbitrarily close to . This zero defeats, at least, the standard application of Barvinok’s method [5, 40]. Although one could exclude cycles by requiring the minimum degree to be at least (like we did in this paper), current techniques of proving zero-freeness seem to hinge on handling all subgraphs. For example, to use Ruelle’s contraction like in [26], one has to start from small fragments of the graph and gradually rebuild it. The obstacle then is to avoid starting from or encountering cycles in the rebuilding process. Other methods, such as the recursion-based one in [36], require hereditary properties (similar to the so-called strong spatial mixing) that break in cycles as well. It would be interesting to see if any of our arguments can help in proving zero-freeness of the polynomial above.
References
- [1] Konrad Anand, Weiming Feng, Graham Freifeld, Heng Guo, and Jiaheng Wang. Approximate counting for spin systems in sub-quadratic time. TheoretiCS, 4:3:1–3:27, 2025. doi:10.46298/theoretics.25.3.
- [2] Konrad Anand and Mark Jerrum. Perfect sampling in infinite spin systems via strong spatial mixing. SIAM J. Comput., 51(4):1280–1295, 2022. doi:10.1137/21M1437433.
- [3] Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. Spectral independence in high-dimensional expanders and applications to the hardcore model. In FOCS, pages 1319–1330. IEEE, 2020. doi:10.1109/FOCS46700.2020.00125.
- [4] Antar Bandyopadhyay and David Gamarnik. Counting without sampling: new algorithms for enumeration problems using statistical physics. In SODA, pages 890–899. ACM Press, 2006. URL: http://dl.acm.org/citation.cfm?id=1109557.1109655.
- [5] Alexander I. Barvinok. Combinatorics and Complexity of Partition Functions, volume 30 of Algorithms and combinatorics. Springer, 2016. doi:10.1007/978-3-319-51829-9.
- [6] Mohsen Bayati, David Gamarnik, Dimitriy A. Katz, Chandra Nair, and Prasad Tetali. Simple deterministic approximation algorithms for counting matchings. In STOC, pages 122–127. ACM, 2007. doi:10.1145/1250790.1250809.
- [7] Ferenc Bencs and Guus Regts. Barvinok’s interpolation method meets Weitz’s correlation decay approach. arXiv, abs/2507.03135, 2025. arXiv:2507.03135.
- [8] Sebastian Brandt, Orr Fischer, Juho Hirvonen, Barbara Keller, Tuomo Lempiäinen, Joel Rybicki, Jukka Suomela, and Jara Uitto. A lower bound for the distributed Lovász local lemma. In STOC, pages 479–488. ACM, 2016. doi:10.1145/2897518.2897570.
- [9] Sebastian Brandt, Christoph Grunau, and Václav Rozhon. The randomized local computation complexity of the Lovász local lemma. In PODC, pages 307–317. ACM, 2021. doi:10.1145/3465084.3467931.
- [10] Russ Bubley and Martin E. Dyer. Graph orientations with no sink and an approximation for a hard case of #SAT. In SODA, pages 248–257. ACM/SIAM, 1997. URL: http://dl.acm.org/citation.cfm?id=314161.314263.
- [11] Russ Bubley and Martin E. Dyer. Path coupling: A technique for proving rapid mixing in markov chains. In FOCS, pages 223–231. IEEE Computer Society, 1997. doi:10.1109/SFCS.1997.646111.
- [12] Xiaoyu Chen, Weiming Feng, Heng Guo, Xinyuan Zhang, and Zongrui Zou. Deterministic counting from coupling independence. arXiv, abs/2410.23225, 2024. doi:10.48550/arXiv.2410.23225.
- [13] Xiaoyu Chen, Heng Guo, Xinyuan Zhang, and Zongrui Zou. Near-linear time samplers for matroid independent sets with applications. In APPROX/RANDOM, volume 317 of LIPIcs, pages 32:1–32:12. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.APPROX/RANDOM.2024.32.
- [14] Zongchen Chen and Yuzhou Gu. Fast sampling of b-matchings and b-edge covers. In SODA, pages 4972–4987. SIAM, 2024. doi:10.1137/1.9781611977912.178.
- [15] Zongchen Chen, Kuikui Liu, and Eric Vigoda. Optimal mixing of glauber dynamics: Entropy factorization via high-dimensional expansion. In STOC, pages 1537–1550. ACM, 2021. doi:10.1145/3406325.3451035.
- [16] Henry Cohn, Robin Pemantle, and James Gary Propp. Generating a random sink-free orientation in quadratic time. Electron. J. Comb., 9(1), 2002. doi:10.37236/1627.
- [17] Peter Davies-Peck. On the locality of the Lovász local lemma. In STOC. ACM, 2025. to appear. doi:10.1145/3717823.3718103.
- [18] Martin E. Dyer, Alan M. Frieze, and Ravi Kannan. A random polynomial time algorithm for approximating the volume of convex bodies. J. ACM, 38(1):1–17, 1991. doi:10.1145/102782.102783.
- [19] Paul Erdős and László Lovász. Problems and results on -chromatic hypergraphs and some related questions. In Infinite and finite sets (Colloq., Keszthely, 1973; dedicated to P. Erdős on his 60th birthday), Vols. I, II, III, volume Vol. 10 of Colloq. Math. Soc. János Bolyai, pages 609–627. North-Holland, Amsterdam-London, 1975.
- [20] Weiming Feng, Heng Guo, Chunyang Wang, Jiaheng Wang, and Yitong Yin. Towards derandomising Markov chain Monte Carlo. In FOCS, pages 1963–1990. IEEE, 2023. doi:10.1109/FOCS57990.2023.00120.
- [21] Oded Goldreich, Shafi Goldwasser, and Asaf Nussboim. On the implementation of huge random objects. SIAM J. Comput., 39(7):2761–2822, 2010. doi:10.1137/080722771.
- [22] Heng Guo and Kun He. Tight bounds for popping algorithms. Random Struct. Algorithms, 57(2):371–392, 2020. doi:10.1002/RSA.20928.
- [23] Heng Guo and Mark Jerrum. A polynomial-time approximation algorithm for all-terminal network reliability. SIAM J. Comput., 48(3):964–978, 2019. doi:10.1137/18M1201846.
- [24] Heng Guo and Mark Jerrum. Approximately counting bases of bicircular matroids. Comb. Probab. Comput., 30(1):124–135, 2021. doi:10.1017/S0963548320000292.
- [25] Heng Guo, Mark Jerrum, and Jingcheng Liu. Uniform sampling through the Lovász local lemma. J. ACM, 66(3):Art. 18, 31, 2019. doi:10.1145/3310131.
- [26] Heng Guo, Chao Liao, Pinyan Lu, and Chihao Zhang. Zeros of holant problems: Locations and algorithms. ACM Trans. Algorithms, 17(1):4:1–4:25, 2021. doi:10.1145/3418056.
- [27] Heng Guo and Pinyan Lu. Uniqueness, spatial mixing, and approximation for ferromagnetic -spin systems. ACM Trans. Comput. Theory, 10(4):17:1–17:25, 2018. doi:10.1145/3265025.
- [28] Nicholas J. A. Harvey, Piyush Srivastava, and Jan Vondrák. Computing the independence polynomial: from the tree threshold down to the roots. In SODA, pages 1557–1576. SIAM, 2018. doi:10.1137/1.9781611975031.102.
- [29] Nicholas J. A. Harvey and Jan Vondrák. Short proofs for generalizations of the Lovász local lemma: Shearer’s condition and cluster expansion. arXiv, abs/1711.06797, 2017. arXiv:1711.06797.
- [30] Mark Jerrum and Alistair Sinclair. Approximating the permanent. SIAM J. Comput., 18(6):1149–1178, 1989. doi:10.1137/0218077.
- [31] Mark Jerrum and Alistair Sinclair. Polynomial-time approximation algorithms for the Ising model. SIAM J. Comput., 22(5):1087–1116, 1993. doi:10.1137/0222066.
- [32] Mark R. Jerrum, Leslie G. Valiant, and Vijay V. Vazirani. Random generation of combinatorial structures from a uniform distribution. Theoret. Comput. Sci., 43(2-3):169–188, 1986. doi:10.1016/0304-3975(86)90174-X.
- [33] Kashyap Babu Rao Kolipaka and Mario Szegedy. Moser and Tardos meet Lovász. In STOC, pages 235–244. ACM, 2011. doi:10.1145/1993636.1993669.
- [34] Liang Li, Pinyan Lu, and Yitong Yin. Correlation decay up to uniqueness in spin systems. In SODA, pages 67–84. SIAM, 2013. doi:10.1137/1.9781611973105.5.
- [35] Chengyu Lin, Jingcheng Liu, and Pinyan Lu. A simple FPTAS for counting edge covers. In SODA, pages 341–348. SIAM, 2014. doi:10.1137/1.9781611973402.25.
- [36] Jingcheng Liu, Alistair Sinclair, and Piyush Srivastava. A deterministic algorithm for counting colorings with 2-Delta colors. In FOCS, pages 1380–1404. IEEE Computer Society, 2019. doi:10.1109/FOCS.2019.00085.
- [37] Ankur Moitra. Approximate counting, the Lovász local lemma, and inference in graphical models. J. ACM, 66(2):10:1–10:25, 2019. doi:10.1145/3268930.
- [38] Robin A. Moser. A constructive proof of the Lovász local lemma. In STOC, pages 343–350. ACM, 2009. doi:10.1145/1536414.1536462.
- [39] Robin A. Moser and Gábor Tardos. A constructive proof of the general Lovász local lemma. J. ACM, 57(2):11, 2010. doi:10.1145/1667053.1667060.
- [40] Viresh Patel and Guus Regts. Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials. SIAM J. Comput., 46(6):1893–1919, 2017. doi:10.1137/16M1101003.
- [41] Ronitt Rubinfeld, Gil Tamir, Shai Vardi, and Ning Xie. Fast local computation algorithms. In ICS, pages 223–238. Tsinghua University Press, 2011. URL: http://conference.iiis.tsinghua.edu.cn/ICS2011/content/papers/36.html.
- [42] James B. Shearer. On a problem of Spencer. Combinatorica, 5(3):241–245, 1985. doi:10.1007/BF02579368.
- [43] Leslie G. Valiant. The complexity of computing the permanent. Theor. Comput. Sci., 8:189–201, 1979. doi:10.1016/0304-3975(79)90044-6.
- [44] Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM J. Comput., 8(3):410–421, 1979. doi:10.1137/0208032.
- [45] Dror Weitz. Counting independent sets up to the tree threshold. In STOC, pages 140–149. ACM, 2006. doi:10.1145/1132516.1132538.
- [46] David Bruce Wilson. Generating random spanning trees more quickly than the cover time. In STOC, pages 296–303. ACM, 1996. doi:10.1145/237814.237880.
