Perfect simulation of the Hard Disks Model by Partial Rejection Sampling

We present a perfect simulation of the hard disks model via the partial rejection sampling method. Provided the density of disks is not too high, the method produces exact samples in $O(\log n)$ rounds, and total time $O(n)$, where $n$ is the expected number of disks. The method extends easily to the hard spheres model in $d>2$ dimensions. In order to apply the partial rejection method to this continuous setting, we provide an alternative perspective of its correctness and run-time analysis that is valid for general state spaces.


Introduction
The hard disks model is one of the simplest gas models in statistical physics. Its configurations are non-overlapping disks of uniform radius $r$ in a bounded region of $\mathbb{R}^2$. For convenience, in this paper, we take this region to be the unit square $[0,1]^2$. This model was precisely the one studied by Metropolis et al. [14] in their pioneering work on the Markov chain Monte Carlo (MCMC) method. They used Los Alamos' MANIAC computer to simulate a system with 224 disks.
There are two variants of this model. To obtain the canonical ensemble, we fix the number (or equivalently, density) of disks and decree that all configurations are "equally likely", subject only to the disks not overlapping. In the grand canonical ensemble, we fix the "average" number of disks. To be more specific, centres of the disks are distributed according to a Poisson point process of intensity $\lambda_r = \lambda/(\pi r^2)$, conditioned on the disks being non-overlapping. The hard disks model, and its higher dimensional generalisation (called the hard spheres model), are also related to the optimal sphere packing density [6,20,2]. See [8,1] and references therein for more details. See also [13] for the physics perspective.
Our main aim in this work is to describe and analyse a very simple algorithm for exactly sampling from the grand canonical ensemble, based on the partial rejection sampling paradigm introduced by Guo, Jerrum and Liu [5].
More precisely, the challenge is the following: produce a realisation $P \subset [0,1]^2$ of a Poisson point process of intensity $\lambda_r$ in the unit square, conditioned on the event that no pair of points in $P$ is closer than $2r$ in Euclidean distance. We refer to this target measure as the hard disks distribution. It describes an arrangement of open disks of radius $r$ with centres in $[0,1]^2$ that are not allowed to overlap, but which otherwise do not interact. It is a special case of the Strauss process [19]. Note that, although the disks do not overlap each other, they may extend beyond the boundary of the unit square. Also, the intensity of the underlying Poisson process is normalised so that the expected number of points of $P$ lying in a disk of radius $r$ is $\lambda$. This normalisation gives us sensible asymptotics as the radius of the disks tends to zero (equivalently, as the number of disks tends to infinity).
Classical rejection sampling applied to this problem yields the following algorithm: repeatedly sample a realisation $P$ of the Poisson process of intensity $\lambda_r$ in the unit square until $P$ satisfies the condition that no two points are closer than $2r$, and return $P$. Unfortunately, for every $\lambda > 0$, however small, the expected number of unsuccessful trials using this approach increases exponentially in $r^{-1}$ as $r \to 0$. Partial rejection sampling [5] requires only a subset of $P$ to be resampled at each iteration. Algorithm 1 below arises from a routine application of the paradigm to the problem at hand.
The original partial rejection method [5] and its analysis are tailored to the discrete case. In this paper we provide an alternative view of the correctness of the method, which is also valid in the continuous setting. In other words, as with classical rejection sampling, Algorithm 1 terminates with probability 1, producing a realisation of the exact hard disks distribution.

Theorem 1. Algorithm 1 is correct: conditional on halting, Algorithm 1 produces a sample from the hard disks distribution with intensity $\lambda_r = \lambda/(\pi r^2)$.
The proof of this result forms the content of Section 3. In contrast to classical rejection sampling, the expected number of iterations (resampling steps) is now asymptotically $O(\log(r^{-1}))$ as $r \to 0$, provided $\lambda$ is not too large. Furthermore, with a suitable implementation, the total runtime is $O(r^{-2})$, i.e., linear in the number of disks. We prove that rapid termination occurs when $\lambda < 0.21027$. This analysis is not tight, and experiments suggest that the actual threshold for rapid termination is around $\lambda \approx 0.5$. Figure 1 shows a realisation with $\lambda = 0.5$ and $r = 1/200$; the resulting density is $\alpha = 0.189{+}$.

Theorem 2. Fix $\lambda < 0.21027$. Then the expected number of iterations of the while-loop in Algorithm 1 is $O(\log r^{-1})$. Moreover, with a suitable implementation, the overall runtime of the algorithm is $O(r^{-2})$.
The proof of this result forms the content of Section 4. The method extends naturally to the hard spheres model in $d > 2$ dimensions. Here, the desired distribution is a Poisson point process in $[0,1]^d$ conditioned on no pair of points being closer than $2r$. The natural normalisation for the intensity of the Poisson process in $d$ dimensions is $\lambda_{r,d} = \lambda/(v_d r^d)$, where $v_d$ is the volume of a ball of radius 1 in $\mathbb{R}^d$. With this convention, we prove that rapid termination occurs in $d$ dimensions provided $\lambda < 2^{-(d+\frac12)}$. The expected packing density $\alpha(\lambda)$, or simply $\alpha$, for this model is the expected total volume of the spheres. (Note that, neglecting boundary effects, $\alpha$ is the proportion of the unit cube occupied by spheres.) The quantity $\alpha(\lambda)$ grows monotonically with $\lambda$, but intuitively we expect its rate of growth to slow down dramatically as the spheres pack more tightly. The connection between expected packing density $\alpha$ and intensity $\lambda$ has recently been thoroughly explored by Jenssen, Joos and Perkins [8]. Using their results, we show that partial rejection sampling can achieve expected packing density $\Omega(2^{-d})$ while retaining rapid termination in $O(\log(r^{-1}))$ iterations.
Although sphere packings of density $\Omega(d\,2^{-d})$ have been proved to exist, there is no polynomial-time sampler that provably achieves packing density beyond $O(2^{-d})$, as far as we are aware.
Other approaches to exact sampling include Coupling From The Past (CFTP), which was adapted to point processes by Kendall [10] and Kendall and Møller [11]. Recently, Moka, Juneja and Mandjes [15] proposed an algorithm based on rejection and importance sampling. Although their algorithm, like ours, is based on rejection sampling, it does not share our asymptotic performance guarantee. Indeed, its runtime appears to grow exponentially as the number of disks goes to infinity, with the density of disks held constant.
The most widely used approach to sampling configurations from the hard disks model is Markov chain Monte Carlo (MCMC). Here the desired distribution is approached in the limit as the runtime t of the sampler tends to infinity. In this sense, MCMC produces an approximate sampler, though the error (in total variation distance) decays exponentially with t. The problem lies in deciding how large t should be in order to ensure that the samples obtained are close enough to the desired distribution. There are two possibilities. The runtime t may be chosen heuristically, in which case the quality of the output from the sampler is not guaranteed. Otherwise, an analytical upper bound on mixing time may be used to determine a suitable t, but then the tendency is for this bound to be very conservative. The experimental advantage of partial rejection sampling is its simple termination rule, combined with the property of generating perfect samples from the desired distribution.
Approximate sampling via Markov chain simulation has been studied by Kannan, Mahoney and Montenegro [9] and Hayes and Moore [7] in the context of the canonical ensemble, where the number of spheres in a configuration is fixed. (So in the MCMC approach, it is the density $\alpha$ that is chosen in advance, while in the approach taken here we choose $\lambda$ in advance and then $\alpha$ follows.) Kannan et al. [9] show that rapid mixing (convergence to near-stationarity in time polynomial in the number of disks) occurs for densities $\alpha \le 2^{-(d+1)}$ in dimension $d$. The best rigorously derived density bound guaranteeing rapid mixing in two dimensions is given by Hayes and Moore [7] and is $\alpha \approx 0.154$. (Note that this improves on the $\alpha = 1/8$ obtained by Kannan et al.) It should be noted that these results are not directly comparable with our $\lambda < 0.21027$ owing to the difference in models. To obtain canonical ensembles, we could use Algorithm 1 and further condition on the number of desired disks. However, the only rigorous guarantee for this approach, via [8], is $\alpha(0.21027) > 0.0887$.
It is believed that the hard disks model in two dimensions undergoes a phase transition at a certain critical density $\alpha_c$: at lower densities configurations are disordered, while at higher densities long range correlations can be observed. Unfortunately, $\alpha_c$ is well beyond the densities that can be achieved by samplers (either based on MCMC or rejection sampling) with rigorous performance guarantees. However, heuristic approaches using sophisticated MCMC samplers [3] suggest that the critical density is around $\alpha_c \approx 0.7$.
Finally, determining the maximum achievable density α is the classical sphere packing problem. Despite extensive study, the exact solution is known only for dimensions d = 1, 2, 3, 8 and 24. See [8,16] for rigorous bounds on packing densities in general dimension d.
A preliminary version of this paper was presented at ICALP 2018 [4]. Subsequently, the constant in Theorem 2 has been improved by Jake Wellens [21] to 0.2344+.

The sampling algorithm
The following notation will be used throughout. If $P$ is a finite subset of $[0,1]^2$ then
$$\mathrm{BadPairs}(P) = \bigl\{\{x,y\} \subseteq P : x \ne y \text{ and } \|x - y\| < 2r\bigr\},$$
where $\|\cdot\|$ denotes Euclidean norm, and
$$\mathrm{BadPoints}(P) = \bigl\{x \in P : \{x,y\} \in \mathrm{BadPairs}(P) \text{ for some } y \in P\bigr\}.$$
The open disk of radius $r$ with centre $x \in [0,1]^2$ is denoted by $D_r(x)$. The finite set $\Pi \subset [0,1]^2$ always denotes a realisation of the Poisson point process of intensity $\lambda_r$ on $[0,1]^2$. For a random variable $X$ and event $E$ we use $\mathcal D(X)$ to denote the distribution (law) of $X$, and $\mathcal D(X \mid E)$ the distribution of $X$ conditioned on $E$ occurring. Thus, $\mathcal D(\Pi \mid \mathrm{BadPoints}(\Pi) = \emptyset)$ is the hard disks distribution that we are interested in. Our goal is to analyse the correctness and runtime of a sampling algorithm for the hard disks model (see Algorithm 1 below), specifically to determine the largest value of $\lambda$ for which it terminates quickly, i.e., in $O(\log r^{-1})$ iterations. The algorithm is an example application of "Partial Rejection Sampling" [5], adapted to the continuous state space setting.
Algorithm 1 Partial Rejection Sampling for the hard disks model

PRSforHardDisks(λ, r)   // r is the disk radius and λ_r = λ/(πr²) the intensity
  Let P be a sample from the Poisson point process of intensity λ_r on the unit square
  while (B ← BadPoints(P)) ≠ ∅ do
    S ← B + D_{2r}(0)   // the resampling set is the Minkowski sum of B with a disk of radius 2r
    Let P_S be a sample from the Poisson point process of intensity λ_r on S
    P ← (P \ B) ∪ P_S
  end while
  return P

Let $T$ (a random variable) be the number of iterations of the while-loop. On each iteration, the while-loop terminates with probability bounded away from 0; thus $T$ is finite with probability 1. (Indeed, $T$ has finite expectation.) Let $P_t$, for $1 \le t \le T$, be the point set $P$ after $t$ iterations of the loop, and $P_0$ be the initial value of $P$ (which is just a realisation of the Poisson point process on $[0,1]^2$). We say that $B_0, B_1, \ldots, B_t \subset [0,1]^2$ is a feasible sequence of (finite) point sets if there exists a run of Algorithm 1 with $\mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_t) = B_t$. Theorem 1 will follow easily from the following lemma.
Lemma 3. Let $B_0, \ldots, B_t$ be a feasible sequence of point sets. Then
$$\mathcal D\bigl(P_t \bigm| \mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_t) = B_t\bigr) = \mathcal D\bigl(\Pi \bigm| \mathrm{BadPoints}(\Pi) = B_t\bigr).$$

Proof. We prove the result by induction on $t$. The base case, $t = 0$, holds by construction: $P_0$ is just a realisation of the Poisson point process on $[0,1]^2$. Our induction hypothesis is
$$\mathcal D\bigl(P_t \bigm| \mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_t) = B_t\bigr) = \mathcal D\bigl(\Pi \bigm| \mathrm{BadPoints}(\Pi) = B_t\bigr) \tag{1}$$
for every feasible sequence $B_0, \ldots, B_t$. Extend the feasible sequence to $B_{t+1}$. For the inductive step, we assume (1) and aim to derive
$$\mathcal D\bigl(P_{t+1} \bigm| \mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_{t+1}) = B_{t+1}\bigr) = \mathcal D\bigl(\Pi \bigm| \mathrm{BadPoints}(\Pi) = B_{t+1}\bigr). \tag{2}$$
The resampling set on iteration $t+1$ is $S = B_t + D_{2r}(0)$; write $\bar S = [0,1]^2 \setminus S$. As a first step we argue below that
$$\mathcal D\bigl(P_{t+1} \bigm| \mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_t) = B_t\bigr) = \mathcal D\bigl(\Pi \bigm| \mathrm{BadPairs}(\Pi) \cap \bar S^{(2)} = \emptyset\bigr), \tag{3}$$
where $\bar S^{(2)}$ denotes the set of unordered pairs of elements from $\bar S$. We have noted that (1) implies that, outside of the resampling set $S$, the point set $P_t$ is a realisation of the hard disks distribution. Also, the algorithm does not resample points outside of $S$. Thus $P_{t+1} \cap \bar S = P_t \cap \bar S$ is Poisson distributed, conditioned on there being no bad pairs within $\bar S$. Inside $S$, resampling has left behind a fresh Poisson point process $P_{t+1} \cap S$. These considerations give (3). Next, we condition on $B_{t+1}$. Since $B_{t+1}$ contains no bad pairs with both endpoints in $\bar S$, the event $\mathrm{BadPoints}(\Pi) = B_{t+1}$ entails the event $\mathrm{BadPairs}(\Pi) \cap \bar S^{(2)} = \emptyset$. Thus, we have
$$\mathcal D\bigl(P_{t+1} \bigm| \mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_{t+1}) = B_{t+1}\bigr) = \mathcal D\bigl(\Pi \bigm| \mathrm{BadPoints}(\Pi) = B_{t+1}\bigr).$$
The right hand side of this equation does not involve $B_0, \ldots, B_t$, and so the conditional distribution of $P_{t+1}$ depends on the feasible sequence only through $B_{t+1}$. This completes the induction step (2) and the proof.
We can now complete the proof of Theorem 1. As we observed earlier, $T$, the number of iterations of the while-loop, is finite with probability 1. By Lemma 3, noting that $B_T = \emptyset$, we have $\mathcal D(P_T) = \mathcal D(\Pi \mid \mathrm{BadPoints}(\Pi) = \emptyset)$.
In other words, at termination, Algorithm 1 produces a realisation of the hard disks distribution on [0, 1] 2 .
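To make the procedure concrete, here is a direct, unoptimised Python sketch of Algorithm 1 (the function names and the quadratic-time bad-point search are ours, purely for illustration). It relies on the fact that a Poisson point process on the unit square, restricted to the resampling set $S = B + D_{2r}(0)$, is again a Poisson point process on $S \cap [0,1]^2$:

```python
import math
import random

def poisson_count(mean):
    # Knuth's product-of-uniforms method; adequate for the moderate
    # means used here (it underflows for very large means).
    threshold, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def poisson_points(intensity):
    """Poisson point process of the given intensity on the unit square."""
    return [(random.random(), random.random())
            for _ in range(poisson_count(intensity))]

def bad_points(points, r):
    """Points involved in at least one pair at distance < 2r (naive O(n^2) scan)."""
    bad = set()
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            if math.dist(x, y) < 2 * r:
                bad.add(x)
                bad.add(y)
    return bad

def prs_hard_disks(lam, r):
    """Partial rejection sampling for the hard disks model (Algorithm 1)."""
    intensity = lam / (math.pi * r * r)      # lambda_r = lambda / (pi r^2)
    P = poisson_points(intensity)
    while True:
        B = bad_points(P, r)
        if not B:
            return P
        # Resample only within S = B + D_{2r}(0): generate a fresh Poisson
        # process on the whole square and keep the points falling in S.
        P_S = [x for x in poisson_points(intensity)
               if any(math.dist(x, b) < 2 * r for b in B)]
        P = [x for x in P if x not in B] + P_S
```

For example, `prs_hard_disks(0.2, 0.05)` returns a point set in which no two centres are closer than $2r = 0.1$; since $0.2$ is below the proven threshold $0.21027$, the loop terminates after few iterations in practice.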

Run-time analysis (Proof of Theorem 2)
We consider how the number of "bad events", i.e., the cardinality of the set $\mathrm{BadPairs}(P_t)$, evolves with time. As usual, $\Pi$ denotes a realisation of the Poisson point process of intensity $\lambda_r$. Denote by $\Delta$ a realisation of the hard disks process of the same intensity. We need the following stochastic domination result.
Lemma 4. The hard disks distribution is stochastically dominated by the Poisson point process with the same intensity. That is, we can construct a joint sample space for Π and ∆ such that ∆ ⊆ Π.
Holley's criterion is a useful test for stochastic domination, but it is not of direct use to us in the proof of Lemma 4, because it applies only to finite state spaces. Fortunately, Preston [17, Theorem 9.1] has derived a version of Holley's criterion that fits our situation. We will mostly follow Preston's notation, except that, to save confusion, we will use $P$ and $Q$, rather than $x$ and $y$, to denote finite sets of points. In order to state his result, we need some notation. Let $\Omega$ denote the set of finite subsets of $[0,1]^2$, and let $F$ be the set of measurable functions $f : \Omega \to [0,\infty)$ satisfying
$$\int_\Omega f \, d\omega = 1 \tag{4}$$
and
$$f(P) > 0 \text{ and } Q \subseteq P \implies f(Q) > 0. \tag{5}$$
In our application, $\omega_n$ is the distribution on $([0,1]^2)^{(n)}$ obtained by sampling $n$ points independently and uniformly at random from $[0,1]^2$, and regarding the points as indistinguishable; furthermore, $\omega = \sum_{n=0}^\infty \omega_n / n!$.

Lemma 5 (Theorem 9.1 of [17]). Let $f_1, f_2 \in F$ and suppose that
$$\frac{f_1(P \cup \{\xi\})}{f_1(P)} \ge \frac{f_2(Q \cup \{\xi\})}{f_2(Q)}, \tag{6}$$
for all $Q \subseteq P \in \Omega$ and $\xi \in [0,1]^2 \setminus P$ (where, by convention, $0/0 = 0$). Then $\int_\Omega g f_1 \, d\omega \ge \int_\Omega g f_2 \, d\omega$ for all bounded, measurable, non-decreasing functions $g : \Omega \to \mathbb{R}$; i.e., if $\mu_i$ is the probability measure having density $f_i$ with respect to $\omega$, then $\mu_1$ stochastically dominates $\mu_2$.
Proof of Lemma 4. We set
$$f_1(P) = C_1 \lambda_r^{|P|} \quad\text{and}\quad f_2(P) = \begin{cases} C_2 \lambda_r^{|P|}, & \text{if } \mathrm{BadPairs}(P) = \emptyset, \\ 0, & \text{otherwise.} \end{cases}$$
The normalising constants $C_1$ and $C_2$ are chosen so that both $f = f_1$ and $f = f_2$ satisfy (4). (There is an explicit expression for $C_1$, namely $C_1 = \exp(-\lambda_r)$, but not for $C_2$.) Notice that both $f_1$ and $f_2$ also satisfy (5). Notice also that the probability measures $\mu_1$ and $\mu_2$ of the Poisson point process and the hard disks process have densities $f_1$ and $f_2$ with respect to $\omega$. The premise (6) of Lemma 5 holds, since the left-hand side is always $\lambda_r$ and the right-hand side is either $\lambda_r$ or 0. The conclusion is that $\mu_1$ dominates $\mu_2$. Strassen's Theorem [12,18] allows us to conclude the existence of a coupling of $\Pi$ and $\Delta$ as advertised in the statement of the lemma (except, possibly, on a set of measure zero).

Lemma 6. There is a bound $\bar\lambda > 0$ such that the expected number of iterations of the while-loop in Algorithm 1 is $O(\log r^{-1})$ when $\lambda < \bar\lambda$.
Proof. First observe that $\mathrm{BadPairs}(P)$ determines $\mathrm{BadPoints}(P)$ and vice versa, so conditioning on the set $\mathrm{BadPairs}(P)$ is equivalent to conditioning on $\mathrm{BadPoints}(P)$. Introduce random variables $Z_t = |\mathrm{BadPairs}(P_t)|$, for $0 \le t \le T$. Our strategy is to show that, for some constant $\alpha > 1$,
$$\mathbb{E}(Z_{t+1} \mid Z_0, \ldots, Z_t) \le Z_t/\alpha; \tag{7}$$
in other words, $(\alpha^t Z_t)_{t=0}^\infty$ is a supermartingale. It then follows that $\mathbb{E} Z_t \le \alpha^{-t}\,\mathbb{E} Z_0 \le \alpha^{-t}\,\mathbb{E}\bigl(|P_0|^2\bigr) = O(\alpha^{-t} r^{-4})$. Here, we have used the fact that $|P_0|$ is a Poisson random variable with expectation $\lambda_r = O(r^{-2})$. Setting $t = \lceil \log_\alpha(\varepsilon^{-1}\,\mathbb{E} Z_0) \rceil$, we obtain $\mathbb{E} Z_t < \varepsilon$, and hence $\Pr(T > t) \le \Pr(Z_t \ge 1) \le \varepsilon$. It follows that the expected number of iterations of the while-loop of Algorithm 1 is $O(\log r^{-1})$. Note that the probability of non-termination decreases exponentially with $t$, so the probability of large deviations above the expected value of $T$ is low.
Crude estimates give $\bar\lambda = 1/(4\sqrt2)$. The calculation goes as follows. Suppose, in (7), we condition on the random variables $\mathrm{BadPoints}(P_0), \ldots, \mathrm{BadPoints}(P_t)$, rather than simply on $Z_0, \ldots, Z_t$. This is more stringent conditioning, since the former random variables determine the latter. It is enough to establish (7) under the more stringent conditioning. So fix $\mathrm{BadPoints}(P_0) = B_0, \ldots, \mathrm{BadPoints}(P_t) = B_t$, and note that this choice also fixes the resampling sets $S_0, \ldots, S_t$. Suppose $Z_t = |\mathrm{BadPairs}(P_t)| = k$. Inside the resampling set $S_t$ we have a Poisson point process $P_{t+1} \cap S_t$ of intensity $\lambda_r$. Outside, by Lemma 3, there is a realisation of the hard disks process. Since we are seeking an upper bound on $Z_{t+1}$ we may, by Lemma 4, replace $P_{t+1} \setminus S_t$ by a Poisson point process of intensity $\lambda_r$ on the complement of $S_t$.
Let $\bar k = \mathbb{E}(Z_{t+1} \mid Z_0, \ldots, Z_t)$. From the above considerations, every bad pair of $P_{t+1}$ involves at least one point of the Poisson process in $S_t$, and for each such point the expected number of further points within distance $2r$ of it is at most $\lambda_r \pi (2r)^2 = 4\lambda$; hence
$$\bar k \le \lambda_r |S_t| \times \lambda_r \pi (2r)^2 = 4\lambda\, \lambda_r\, |S_t|. \tag{8}$$
This is an overestimate, as we are double-counting overlapping disks whose centres both lie within $S_t$. Now, $S_t$ is a union of at most $2k$ disks of radius $2r$, so $|S_t| \le 2k \cdot \pi(2r)^2 = 8\pi k r^2$. Thus
$$\bar k \le 4\lambda\, \lambda_r \times 8\pi k r^2 = 32\lambda^2 k. \tag{9}$$
There are further sources of slack here: there may be fewer than $2k$ disks, the disks comprising $S_t$ certainly overlap, and, for points $x$ near the boundary, some of the disks $D_{2r}(x)$ will lie partly outside the unit square. (The last of these presumably has no effect asymptotically, as $r \to 0$.) Setting $\bar\lambda = 1/(4\sqrt2) = 0.17677{+}$, we see that $\alpha = k/\bar k \ge 1/(32\lambda^2) > 1$ for any $\lambda < \bar\lambda$, and $(\alpha^t Z_t)_{t=0}^\infty$ is a supermartingale, as required.
The constant $\bar\lambda$ may seem quite small. Note, however, that classical rejection sampling cannot achieve any $\bar\lambda > 0$. The argument goes as follows. Divide $[0,1]^2$ into $r \times r$ squares. If there are two points in the same square then they will certainly be less than distance $2r$ apart. The number of points in each square is Poisson distributed with mean $\lambda/\pi$. Thus for any $\lambda > 0$ the probability that a particular square has at least two points is bounded away from zero. The number of points in each square is independent of all the others. It follows that the runtime of classical rejection sampling is exponential in $r^{-2}$.
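This arithmetic is easy to make concrete (the computation below is ours, purely illustrative): since counts in disjoint squares are independent Poisson variables, the per-trial success probability of classical rejection is at most the product, over squares, of the probability of seeing at most one point.

```python
import math

def classical_rejection_success_bound(lam, r):
    """Upper bound on the probability that one classical-rejection trial
    succeeds: every one of the ~r^-2 disjoint r x r squares must contain
    at most one point (a necessary, not sufficient, condition)."""
    mu = lam / math.pi                        # mean number of points per square
    p_two_or_more = 1 - math.exp(-mu) * (1 + mu)   # P(Poisson(mu) >= 2)
    squares = int(1 / r) ** 2
    return (1 - p_two_or_more) ** squares

# even a modest lambda makes a single trial all but hopeless as r shrinks
assert classical_rejection_success_bound(0.2, 0.01) < 1e-8
```

The bound decays like $\exp(-c\,r^{-2})$ for a constant $c = c(\lambda) > 0$, matching the exponential-in-$r^{-2}$ runtime claimed above.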
The above derivation of $\bar\lambda$ is quite crude and can be improved.
Lemma 7. The constant $\bar\lambda$ in Lemma 6 can be taken to be $0.21027$.
Proof. For each of the $2k$ disks, the right-hand side of inequality (9) counts pairs of points $(x, y)$ with $x$ in the disk, and $y$ anywhere within distance $2r$ of $x$. Since a bad event is determined by an unordered pair of points, this gives rise to significant double counting. In particular, pairs $(x, y)$ with $x$ and $y$ lying in the same ball are double counted. We can subtract off these pairs to obtain a better estimate. For a single ball, the correction is
$$\frac{\lambda_r^2}{2} \int_{D_{2r}(0)} \int_{D_{2r}(0)} \mathbf{1}\bigl[\|x - y\| < 2r\bigr] \, dy \, dx.$$
(The initial factor of one half arises because we want to count unordered pairs.) With the change of variables $x = 2rx'$ and $y = 2ry'$ this expression simplifies to
$$\frac{\lambda_r^2 (2r)^4}{2} \int_{D_1(0)} L(\|x'\|) \, dx' = \frac{8\lambda^2}{\pi^2} \int_{D_1(0)} L(\|x'\|) \, dx',$$
where $L(\|x'\|)$ is the area of the "lens" $D_1(0) \cap D_1(x')$. Letting $\ell = \|x'\|$ denote the offset of the centres of the two disks, the area of the lens is given by
$$L(\ell) = 2\cos^{-1}(\ell/2) - \frac{\ell}{2}\sqrt{4 - \ell^2}.$$
(This is by elementary geometry: the lens is the intersection of two sectors, one from each of the disks, and its area can be computed by inclusion-exclusion.) An illustration (before shifting $y$ to 0) is given in Figure 2; the shaded area is $L$.
Figure 2. An illustration of double counting.
Translating to polar coordinates $(\ell, \theta)$,
$$\int_{D_1(0)} L(\|x'\|) \, dx' = 2\pi \int_0^1 \ell\, L(\ell) \, d\ell = \pi^2 - \frac{3\sqrt3}{4}\pi.$$
(The integral was evaluated using the Maple computer algebra system.) Our revised upper bound on $\bar k$ is thus
$$\bar k \le 2k\Bigl(16\lambda^2 - 8\lambda^2\Bigl(1 - \frac{3\sqrt3}{4\pi}\Bigr)\Bigr) = 16\Bigl(1 + \frac{3\sqrt3}{4\pi}\Bigr)\lambda^2 k,$$
which yields the improved bound of $\bar\lambda = \bigl(16(1 + \frac{3\sqrt3}{4\pi})\bigr)^{-1/2} = 0.21027{+}$.
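As a numerical sanity check on this calculation (our own verification, not part of the paper), one can evaluate the lens integral by Simpson's rule, compare it with the closed form, and recompute the improved threshold:

```python
import math

def lens_area(ell):
    # area of the intersection of two unit disks with centres at distance ell
    return 2 * math.acos(ell / 2) - (ell / 2) * math.sqrt(4 - ell * ell)

# 2*pi * integral_0^1 ell * L(ell) d(ell), by composite Simpson's rule
m = 10000                                    # number of subintervals (even)
h = 1.0 / m
s = sum((1 if k in (0, m) else 4 if k % 2 else 2) * (k * h) * lens_area(k * h)
        for k in range(m + 1))
integral = 2 * math.pi * s * h / 3

closed_form = math.pi ** 2 - (3 * math.sqrt(3) / 4) * math.pi
assert abs(integral - closed_form) < 1e-9

# threshold from the revised bound: 16(1 + 3*sqrt(3)/(4*pi)) * lambda^2 < 1
lam_bar = (16 * (1 + 3 * math.sqrt(3) / (4 * math.pi))) ** -0.5
assert abs(lam_bar - 0.21027) < 1e-4
```

The numerical integral agrees with $\pi^2 - \frac{3\sqrt3}{4}\pi \approx 5.7886$, and the threshold evaluates to $0.21027{+}$ as claimed.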
There are other factors that could in principle be used to increase $\bar\lambda$ further -- each disk necessarily overlaps with at least one other disk, and some bad events are triple or quadruple counted -- but the computational difficulties rapidly increase when attempting to account for these.
We have just seen that the number of iterations of the while-loop in Algorithm 1 is small when λ is below some threshold. It just remains to check that the body of the loop can be implemented efficiently.

Lemma 8. There is an implementation of Algorithm 1 such that, for any fixed $\lambda < \bar\lambda$, the expected runtime of the algorithm is $O(r^{-2})$.
Proof. No sophisticated data structures are required, but the runtime analysis requires some work.
Divide the unit square into a grid of $r \times r$ squares, indexed by $I = \{0, 1, \ldots, n-1\}^2$ where $n \approx r^{-1}$; let the grid squares be $\{\Gamma_i : i \in I\}$. Note that if two points of a point set $P$ lie in the same grid square then they necessarily form a bad pair, i.e., lie in $\mathrm{BadPairs}(P)$. Moreover, if $\{x, y\} \in \mathrm{BadPairs}(P)$ with $x \in \Gamma_i$ and $y \in \Gamma_j$, then necessarily $\|i - j\|_\infty \le 2$. So it seems, intuitively, that we should be able to implement the body of the loop to run in time $O(n^2) = O(r^{-2})$, since we only have to examine $O(n^2)$ pairs of grid squares in order to complete the computation. Given Lemma 6, this would give an upper bound on total runtime of $O(r^{-2} \log r^{-1})$.
However, we can do better than this by noting that the amount of work to be done in each iteration decays exponentially. Recall the notation employed in Algorithm 1. Assume that, in addition to $P$ and $B = \mathrm{BadPoints}(P)$, we maintain a list $L$ of grid squares containing points in $B$. In order to perform the computations within the while-loop, it is only necessary to consider grid squares that are within constant distance of a grid square in $L$. As the length of the list $L$ decays exponentially, according to Lemma 6, we expect the overall runtime to be $O(r^{-2})$ rather than just $O(r^{-2} \log r^{-1})$. This is indeed the case. However, there is a technical issue: the time taken for the $t$th iteration of the loop is a random variable, say $T_t$. Recall the notation of Lemma 6, in particular that $Z_t = |\mathrm{BadPairs}(P_t)|$ denotes the number of bad pairs of points (ones within distance $2r$ of each other) after $t$ iterations of the loop. As may be expected, we can bound the expectation of $T_t$ by a linear function of $Z_{t-1}$. Unfortunately, we can't just sum these estimates over $t \in \{1, \ldots, T\}$ to get an upper bound on expected total runtime, as the random variable $T_t$ is presumably correlated with $Z_t$. Instead, we shall "charge" the operations within the $t$th iteration of the loop either to $Z_{t-1}$ or to $Z_t$, and hence bound the expected runtime by $O\bigl(\sum_{t=0}^T Z_t\bigr)$. Consider the work done during the $t$th iteration of the while-loop. Assume that the point sets $P = P_{t-1}$ and $B = B_{t-1}$ are available either from the initialisation phase or from the previous iteration of the loop. Also assume that we have the list $L$ of grid squares containing points in $B$. Note that $|L| \le 2Z_{t-1}$, since each bad pair contributes at most two bad points. To compute $P_S$, we could sample a realisation of the Poisson point process in the unit square and reject all those points that are not within distance $2r$ of some point in $B$. However, this would be inefficient.
Instead, we just produce realisations of a Poisson point process within grid squares that are within distance two of a grid square in $L$; here, we take the distance between grid squares $\Gamma_i$ and $\Gamma_j$ to be $\|j - i\|_\infty$. We then reject each new point unless it is within distance $2r$ of a point in $B$; the result is the required point set $P_S$. Restricted to grid square $\Gamma_i$, let $b_i$ be the number of bad points carried forward from the previous iteration, $n_i$ the number of fresh points added during the current iteration, and $p_i$ the total number of points at the end of the current iteration. Also, let $\bar n_i$ be the number of points generated during the current iteration that did not survive because they were not within distance $2r$ of some bad point. Let $\mathcal P = \{(i, j) : b_i > 0 \text{ and } \|j - i\|_\infty \le 2\}$. Note that, in order to compute the points that need to be added to $P$ during the current iteration, only pairs of grid squares $\{(\Gamma_i, \Gamma_j) : (i, j) \in \mathcal P\}$ need to be examined. (Each new point must be within distance $2r$ of an existing bad point.) The work done during iteration $t$ in computing $P_t$ is then proportional to $A + B$, where $A = |\mathcal P| + \sum_{(i,j) \in \mathcal P} \bar n_j$ represents the fixed cost of cycling through all the relevant pairs of grid squares, together with the time spent on fresh points that are immediately discarded, and $B = \sum_{(i,j) \in \mathcal P} n_j$ is the additional time required to examine each surviving fresh point (i.e., each point added to $P_S$). Now $|\mathcal P| \le 25|L| \le 50 Z_{t-1}$. Also, $\bar n_j$ is stochastically dominated by a Poisson random variable with mean $r^2 \lambda_r = \lambda/\pi = O(1)$; thus $\mathbb{E}(A) = O(Z_{t-1})$. Note that the $\bar n_j$ discarded points in grid square $\Gamma_j$ are freshly generated during the current iteration, and are discarded before the end of the iteration, and so have no effect on the subsequent evolution of the algorithm; the sequence $Z_0, \alpha Z_1, \alpha^2 Z_2, \ldots$ remains a supermartingale, and the analysis in Lemma 6 is unaffected. (All other estimates will be deterministic, so this is the only point where we need to consider the potential for conditioning.)
The remaining term $B$ may be bounded as follows:
$$B \le 25 \sum_{j : n_j > 0} n_j \le 25 \sum_{j : n_j > 0} \bigl(1 + n_j(n_j - 1)\bigr) = O(Z_{t-1} + Z_t).$$
Here we use the fact that any pair of distinct points in a single grid square is certainly a bad pair, so that $\sum_j n_j(n_j - 1)/2 \le Z_t$, together with the fact that any grid square with $n_j > 0$ lies within distance two of a grid square in $L$, so that the number of such squares is $O(|L|) = O(Z_{t-1})$. The final task is to compute the new set of bad points. Since each of the new bad pairs must involve at least one point in $P_S$, the time to complete this task is proportional to
$$\sum_{\{(i,j)\,:\, n_i > 0,\ \|j - i\|_\infty \le 2\}} n_i\, p_j.$$
Now any grid square $\Gamma_i$ with $n_i > 0$ must be within distance 2 of a grid square in the list $L$. Therefore, by similar reasoning to that used earlier (bounding $n_i p_j \le (n_i^2 + p_j^2)/2$ and again using the fact that any pair of distinct points in a single grid square is a bad pair), the time for this phase of the loop is again bounded by $O(Z_{t-1} + Z_t)$. It remains to analyse the initialisation phase. Generating the realisation of a Poisson point process of intensity $\lambda_r$ in the unit square clearly takes time $O(r^{-2})$. To identify the bad pairs, we cycle through pairs of grid squares $(\Gamma_i, \Gamma_j)$ with $\|j - i\|_\infty \le 2$; there are $O(r^{-2})$ of these. An argument identical to those above shows that the time taken to identify the bad pairs is $O(r^{-2} + Z_0)$. We are almost done, but we do need a better estimate for $\mathbb{E} Z_0$ than the one used in Lemma 6, which was $O(r^{-4})$. (That crude estimate was adequate at the time, as we were only interested in the logarithm of this quantity.) But we can now see that bad pairs can only come from pairs of grid squares separated by distance at most two. There are $O(r^{-2})$ such pairs, and each of them generates $O(1)$ bad pairs in expectation, so that $\mathbb{E} Z_0 = O(r^{-2})$. We saw in Lemma 6 that $Z_0, \alpha Z_1, \alpha^2 Z_2, \ldots$ is a supermartingale with $\alpha > 1$ (with the convention that $Z_t = 0$ for $t > T$). Thus the overall runtime of Algorithm 1 is
$$O\Bigl(r^{-2} + \mathbb{E}\sum_{t=0}^{\infty} Z_t\Bigr) = O\Bigl(r^{-2} + \sum_{t=0}^{\infty} \alpha^{-t}\, \mathbb{E} Z_0\Bigr) = O(r^{-2})$$
in expectation, assuming $\lambda < \bar\lambda$.
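The grid data structure underlying this implementation is easy to sketch (a simplified illustration of the bad-pair search only, with names of our own choosing, not the paper's code). Two points at distance less than $2r$ land in cells whose indices differ by at most 2 in the $L_\infty$ norm, so each occupied cell is compared only with the $5 \times 5$ block of cells around it:

```python
import math
from collections import defaultdict

def bad_pairs_grid(points, r):
    """All unordered pairs of points at distance < 2r, via an r x r grid."""
    n = math.ceil(1.0 / r)                     # number of cells per side
    cell = lambda v: min(int(v / r), n - 1)    # clamp boundary points
    grid = defaultdict(list)
    for p in points:
        grid[(cell(p[0]), cell(p[1]))].append(p)
    bad = set()
    for (i, j), bucket in grid.items():
        for di in range(-2, 3):
            for dj in range(-2, 3):
                # .get avoids inserting empty cells while iterating
                for q in grid.get((i + di, j + dj), ()):
                    for p in bucket:
                        if p < q and math.dist(p, q) < 2 * r:
                            bad.add((p, q))   # stored with p < q, so each
                                              # unordered pair appears once
    return bad
```

On a realisation with expected $\Theta(r^{-2})$ points, this runs in time $O(r^{-2})$ plus the (small) number of bad pairs found, in line with the accounting in the proof above.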

Three or more dimensions
In higher dimensions, the hard disks model is known as the hard spheres model. Everything in Sections 3 and 4 carries across to $d > 2$ dimensions with little change. For general $d$, the appropriate scaling for the intensity is $\lambda_{r,d} = \lambda/(v_d r^d)$, where $v_d$ is the volume of a ball of unit radius in $d$ dimensions. Note that in a realisation of a Poisson point process with intensity $\lambda_{r,d}$, the expected number of points in a ball of radius $r$ is $\lambda$.
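The normalisation is mechanical to compute (a small illustration of ours, using the standard formula $v_d = \pi^{d/2}/\Gamma(d/2 + 1)$):

```python
import math

def unit_ball_volume(d):
    # v_d = pi^(d/2) / Gamma(d/2 + 1)
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def intensity(lam, r, d):
    """lambda_{r,d} = lambda / (v_d r^d); with this scaling the expected
    number of points of the process in a ball of radius r is exactly lam."""
    return lam / (unit_ball_volume(d) * r ** d)

assert abs(unit_ball_volume(2) - math.pi) < 1e-12
assert abs(unit_ball_volume(3) - 4 * math.pi / 3) < 1e-12
# expected points in a ball of radius r: intensity * v_d * r^d == lam
assert abs(intensity(0.3, 0.05, 3) * unit_ball_volume(3) * 0.05 ** 3 - 0.3) < 1e-12
```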
By a result of Jenssen, Joos and Perkins [8], we lose just a constant factor when translating from intensity $\lambda$ to packing density $\alpha$. (It is partly to connect with their work that we measure intensity in terms of the expected number of points in a ball of radius $r$.) In the proof of [8, Thm 2], an inequality is derived that bounds $\alpha(\lambda)$ from below in terms of $\lambda$ and $d$. Assuming $\lambda \le 2^{-(d+\frac12)}$, which holds in the range of validity of our algorithm, we have $\sqrt2\,\lambda \le 2^{-d}$, and hence $\alpha(\lambda) \ge c_d\, 2^{-d}$ for a constant $c_d$ depending only on $d$. The sequence $(c_d)$ is monotonically increasing, with $c_2 = 0.42220{+}$ and $\lim_{d \to \infty} c_d = 0.63724{+}$. It follows that we can reach expected packing density $\Omega(2^{-d})$ with $O(\log r^{-1})$ expected iterations. This is currently the best that can be achieved by any provably correct sampling algorithm with polynomial (in $1/r$) runtime [9]. The asymptotically best packing density currently rigorously known is $\Omega(d\,2^{-d})$, but achieving this would require $\lambda$ to grow exponentially fast in $d$. This is clearly beyond the capability of partial rejection sampling, but also beyond the capability of any known efficient sampling algorithm.