
Evolving Distributions Under Local Motion

Aditya Acharya, Department of Computer Science, University of Maryland, College Park, MD, USA
David M. Mount, Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA
Abstract

Geometric data sets that arise in modern applications are often very large and change dynamically over time. A popular framework for dealing with such data sets is the evolving data framework, where a discrete structure continuously varies over time due to the unseen actions of an evolver, which makes small changes to the data. An algorithm probes the current state through an oracle, and the objective is to maintain a hypothesis of the data set’s current state that is close to its actual state at all times. In this paper, we apply this framework to maintaining a set of n point objects in motion in d-dimensional Euclidean space. To model the uncertainty in the object locations, both the ground truth and hypothesis are based on spatial probability distributions, and the distance between them is measured by the Kullback-Leibler divergence (relative entropy). We introduce a simple and intuitive motion model in which, with each time step, the distance that any object can move is a fraction of the distance to its nearest neighbor. We present an algorithm that, in steady state, guarantees a distance of O(n) between the true and hypothesized placements. We also show that for any algorithm in this model, there is an evolver that can generate a distance of Ω(n), implying that our algorithm is asymptotically optimal.

Keywords and phrases:
Evolving data, tracking, imprecise points, local-motion model, online algorithms
Copyright and License:
© Aditya Acharya and David M. Mount; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Computational geometry
Related Version:
Full Version: https://arxiv.org/pdf/2409.11779
Acknowledgements:
The authors would like to thank Michael Goodrich for suggesting the question of how to model multidimensional evolving point sets.
Editors:
Pat Morin and Eunjin Oh

1 Introduction

Many modern computational applications are characterized by two qualities: data sets are massive and vary over time. A fundamental question is how to maintain some combinatorial structure that is a function of such a data set. The combination of size and dynamics makes maintaining such a structure challenging. Given the large data sizes, single-shot algorithms may be too slow, and common dynamic algorithms, which support explicit requests for insertions and deletions, may not be applicable because structural changes are unseen by the algorithm.

Anagnostopoulos, Kumar, Mahdian, Upfal, and Vandin introduced a model for handling such data sets, called the evolving data framework [2]. In this framework, the structure varies repeatedly over time through the unseen actions of an evolver, which makes small, stochastic changes to the data set. The algorithm can probe the current state locally using an oracle. With the aid of this oracle, the algorithm maintains a hypothesis of the current state of the structure that is as close as possible to its actual state at all times. The similarity between the hypothesis and the current state is measured through some distance function. The algorithm’s objective is to achieve, in the steady state, a small distance between the hypothesis and the actual state. This framework has been applied to a variety of problems [1, 6, 21, 17, 36].

Consider, for example, sorting. The data consists of a set of objects over some total order. The evolver repeatedly selects a random pair of adjacent objects and swaps them. The oracle is given two objects and returns their relative order. The objective of the algorithm is to maintain an order that is as close to the current state as possible, where the distance is measured in terms of the Kendall tau distance, that is, the number of pairwise order inversions [22]. It has been shown that a Kendall tau distance of $O(n)$ is achievable, and that this is optimal [6, 5]. It is noteworthy that the optimal algorithm in the evolving framework is based on a simple quadratic-time sequential algorithm, and not on $O(n \log n)$-time algorithms, as one might expect. The sorting problem has been generalized to tracking labels on a tree in [1], laying the foundations for a geometric framework for evolving data.
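To make the evolver/algorithm interplay concrete, here is a minimal simulation sketch of evolving sorting (ours, not the algorithm of [6, 5]; the names `evolver_step`, `algorithm_step`, and `kendall_tau` are illustrative, and the `index` lookups stand in for the comparison oracle):

```python
import random

def evolver_step(true_order):
    """The evolver's hidden move: swap a random adjacent pair."""
    i = random.randrange(len(true_order) - 1)
    true_order[i], true_order[i + 1] = true_order[i + 1], true_order[i]

def algorithm_step(hypothesis, j, true_order):
    """One oracle probe: query the relative order of two adjacent
    hypothesized items and swap them if they are inverted."""
    if true_order.index(hypothesis[j]) > true_order.index(hypothesis[j + 1]):
        hypothesis[j], hypothesis[j + 1] = hypothesis[j + 1], hypothesis[j]

def kendall_tau(true_order, hypothesis):
    """Distance: the number of pairwise order inversions."""
    pos = {x: r for r, x in enumerate(true_order)}
    n = len(hypothesis)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if pos[hypothesis[a]] > pos[hypothesis[b]])
```

Sweeping $j = 0, \ldots, n-2$ repeatedly is the simple quadratic-time strategy alluded to above; interleaving such sweeps with `evolver_step` lets one observe the steady-state $O(n)$ distance empirically.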

This paper focuses on the question of how to maintain a set of points whose positions evolve continuously in real $d$-dimensional space, $\mathbb{R}^d$. In motion tracking applications, object movement is recorded through various technologies, including GPS-enabled mobile devices [31], RFID tags [28], and camera-based sensing [35]. Examples include the movement and migration of animals on land and in oceans, traffic and transport, defense and surveillance, and the analysis of human behavior (see, e.g., [13, 24]).

Imprecision and uncertainty are unavoidable problems that arise in evolving systems. Due to sensor latency, the exact location of any object cannot be known with certainty. In the best case, any algorithm can maintain only an approximation to the current state. In order to bound the degree of uncertainty, it is necessary to impose restrictions on object motion. In traditional applications of the evolving data framework, the evolver acts randomly. This is not a reasonable assumption in practice, where moving objects are subject to physical laws or may have a sense of agency [15, 32, 34]. Much work has focused on realistic models of motion, but these can be difficult to analyze theoretically. A widely used model assumes that objects have a maximum speed limit, so that as time passes, an object can be inferred to lie within a ball whose radius grows linearly over time based on this speed limit [9, 12, 20]. However, in practice, the speed at which any object can move depends on the congestion within its local environment.

In multidimensional space, there is no intrinsic total order, and it is less clear what it means to accurately track imprecise moving objects. Given the inherent uncertainty, we model the location of each object in terms of a spatial probability distribution. The distance between the actual state of the system and our hypothesis is naturally defined as the relative entropy (Kullback-Leibler divergence) between these two distributions. The Kullback-Leibler (KL) divergence is a fundamental measure in statistics of the difference between two probability distributions [23]. It has numerous applications in statistical inference, coding theory, and machine learning, among other areas [10]. In our case, it serves two intuitive purposes. First, it measures how different the truth is from our hypothesis. Second, it can be used to quantify the additional information required to encode the actual location of each object, based on our hypothesis [10].

For the evolution of our data, we adopt a locality-sensitive stochastic motion model, which we call the local-motion model: with each time step, the motion of an object is constrained by its immediate environment. In this model, the distance that any object can move in a single time step is a fixed fraction of the distance to its nearest neighbor, and the support of the object's probability distribution is a region of size proportional to this nearest-neighbor distance. This model has the advantages that it does not impose arbitrary speed limits on the movement of objects, that it is invariant under transformations that preserve relative distances (e.g., translation, rotation, and uniform scaling), and that it is consistent with the observed phenomena that objects in dense environments have less personal space [19, 16] and exhibit slower movement than those in sparse environments [29, 4].

To control the communication complexity in determining object locations, we do not require that the oracle return exact object positions. Instead, we assume access to an oracle that is given a Euclidean ball and the index $i$ of an object. It returns a pair of bits indicating whether the $i$th object lies within this ball and whether any other object of the set lies within this ball. Since every object's location is subject to uncertainty, the same oracle query might produce different outcomes at different times, and our algorithm is robust to such variations.

In our framework, the evolver and the algorithm operate asynchronously and in parallel. With each time step, the evolver selects an arbitrary object of the set and moves it in accordance with the local-motion model. (This need not be random and may even be adversarial.) This information is hidden from our algorithm. The algorithm selects an object and invokes the oracle on this object. Based on the oracle’s response, the algorithm updates the current hypothesized distribution for this point. Thus, the evolver and algorithm are involved in a pursuit game, with the evolver incrementally changing object distributions (possibly in an adversarial manner) and the algorithm updating its hypothesized distributions. The algorithm’s goal is to minimize the relative entropy between these distributions at all times. Our computational model and a formal statement of our results are presented in Section 2.

2 Problem Formulation and Results

Objects

The objects that populate our system can be thought of as imprecise points [8, 14, 25] that are drawn independently from probability distributions that depend on the proximity of other objects. More formally, at any fixed time step, let $Q = \{\mathbf{q}_1, \ldots, \mathbf{q}_n\} \subset \mathbb{R}^d$ denote a point set of size $n$ in $d$-dimensional Euclidean space, and let $[n]$ denote the index set $\{1, \ldots, n\}$. For each $\mathbf{q}_i \in Q$, let $N_i$ denote the distance to its nearest neighbor in $Q \setminus \{\mathbf{q}_i\}$. Given $\mathbf{x} \in \mathbb{R}^d$ and a nonnegative real $r$, let $B(\mathbf{x}, r)$ denote the closed Euclidean ball of radius $r$ centered at $\mathbf{x}$, and let $U(B(\mathbf{x}, r))$ denote the uniform probability distribution over this ball. Given a real $\beta$, where $0 < \beta < 1$, define the $\beta$-local feature region of $\mathbf{q}_i$ to be $B_i = B(\mathbf{q}_i, \beta N_i)$. Let $P_i = P_i(\beta)$ denote the uniform probability distribution $U(B(\mathbf{q}_i, \beta N_i))$, and let $\mathcal{P} = \{P_1, \ldots, P_n\}$ denote this set of distributions. (Our choice of a uniform probability distribution is not critical to our approach; in the full version of the paper, we show that our results apply to a much broader class of distributions.) We refer to $\mathcal{P}$ as the truth or ground truth, as it represents the true state of the system, subject to the given imprecision. Together, $Q$ and $\beta$ define a set of $n$ independent random variables $\{X_1, \ldots, X_n\}$, with $X_i$ distributed as $P_i$ (see Figure 1(a)).
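For illustration, the ground truth of a single time step can be simulated as follows (a sketch under our model assumptions; the names `local_feature_radii` and `sample_objects` are ours):

```python
import numpy as np

def local_feature_radii(Q, beta):
    """Return beta * N_i for each point, where N_i is the distance
    from q_i to its nearest neighbor in Q."""
    diff = Q[:, None, :] - Q[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)          # exclude q_i itself
    return beta * dist.min(axis=1)

def sample_objects(Q, beta, rng):
    """Draw X_i uniformly from the ball B(q_i, beta * N_i)."""
    n, d = Q.shape
    radii = local_feature_radii(Q, beta)
    dirs = rng.normal(size=(n, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # uniform directions
    r = radii * rng.random(n) ** (1.0 / d)                # uniform radius in a ball
    return Q + dirs * r[:, None]
```

For example, with `rng = np.random.default_rng()`, the call `sample_objects(rng.random((8, 2)), 0.25, rng)` produces one joint sample $(X_1, \ldots, X_8)$ in the plane.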

(a) Illustration of the objects and the notation used. The evolver's action $\varepsilon$ is shown in red.
(b) Objects affected by $\varepsilon$ are shown in pink. Tiled regions show the previous state.
Figure 1: The model and an action by the evolver. Shaded regions represent objects.

The Local-Motion Model

As mentioned in the introduction, there are many ways to model the realistic motion of a collection of objects in an environment. A natural requirement is that each object’s motion is affected by the presence of nearby objects. Although there are many ways to incorporate this information (see, e.g., [29]), we have chosen a very simple model, where velocities are influenced by the distance to the nearest neighbor.

We think of the objects of our system as moving continuously over time, and our algorithm queries their state at regular discrete time steps. From the algorithm's perspective, objects move, or "evolve," over time in the following manner. Given a parameter $\alpha$, where $0 < \alpha < 1$, at each time step, an entity called the evolver selects an object $i \in [n]$ and moves $\mathbf{q}_i$ by a distance of at most $\alpha N_i$ in any direction of its choosing (see Figure 1). While the value of $\alpha$ is known, the action of the evolver, including the choice of the object and the movement, is hidden from the algorithm. Throughout, we assume that the evolver is a strong adversary, which means that it has access to our algorithm and the input set [7].
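A single evolver step might be sketched as follows (our sketch; the choice of index $i$ and `direction` is arbitrary, possibly adversarial, and we show the maximal allowed displacement $\alpha N_i$):

```python
import numpy as np

def evolver_move(Q, i, direction, alpha):
    """Move q_i by (at most) alpha * N_i in the evolver's chosen direction."""
    dists = np.linalg.norm(Q - Q[i], axis=1)
    dists[i] = np.inf
    N_i = dists.min()                        # nearest-neighbor distance of q_i
    unit = direction / np.linalg.norm(direction)
    Q = Q.copy()
    Q[i] += alpha * N_i * unit
    return Q
```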

For a nonnegative integer time step $t$, let $Q^{(t)}$ and $\mathcal{P}^{(t)}$ denote the underlying centers and distributions, respectively, at this time. To simplify notation, we omit the superscript when referring to the current time. We assume that there exists a bounding region in the form of a Euclidean ball $B_0$ centered at the origin. At all times, the points $Q^{(t)}$ are restricted to lie within $B_0$. The algorithm has knowledge of this ball. Define the system's initial aspect ratio to be $\Lambda_0 = \mathrm{radius}(B_0) / \min_{i \in [n]} N_i^{(0)}$. Given any positive constant $c$, let $cB_0$ denote the factor-$c$ expansion of this ball about the origin.

Oracle

Consider the state of the system at any fixed time $t$. Knowledge about the current state of the system is provided by an entity $\mathcal{O}$, called the oracle. It is given a Euclidean ball $B(\mathbf{x}, r)$ and an object index $i \in [n]$. Recall that for each $i \in [n]$, $X_i$ denotes a random variable distributed as $P_i$. The oracle is a function $\mathcal{O}: [n] \times \mathbb{R}^d \times \mathbb{R}^+ \to \{Y, N\} \times \{+, -\}$, where $\mathcal{O}(i, \mathbf{x}, r)$ returns:

  • "Y" or "N", depending on whether $X_i \in B(\mathbf{x}, r)$, and

  • "+" or "−", depending on whether there exists $j \neq i$ such that $X_j \in B(\mathbf{x}, r)$.

The first element of the pair is used to estimate the location of the object i, and the second is used to estimate the distance to its nearest neighbor. Because Xi and Xj are random variables, so is 𝒪(i,𝐱,r). Consistent with previous applications of the evolving data framework (see, e.g., [3]), we intentionally made the oracle as weak as possible, implying that our algorithm can be used with stronger oracles.
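In a simulation, the oracle might be realized as follows (our sketch; a faithful model would draw a fresh sample of every $X_j$ from $P_j$ on each query, reflecting the variability noted above):

```python
import numpy as np

def oracle(i, x, r, X):
    """O(i, x, r): ('Y'/'N', '+'/'-') for the current sample X of all objects."""
    inside = np.linalg.norm(X - x, axis=1) <= r
    first = 'Y' if inside[i] else 'N'
    others = inside.copy()
    others[i] = False                        # ignore object i for the second bit
    second = '+' if others.any() else '-'
    return first, second
```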

Hypotheses and Distance

The algorithm maintains a hypothesis of the current object locations, which is defined to be a set $\mathcal{H} = \{H_1, \ldots, H_n\}$ of spatial probability distributions over $\mathbb{R}^d$. The distance between the truth $\mathcal{P}$ and the current hypothesis $\mathcal{H}$, denoted $\mathcal{D}(\mathcal{P}, \mathcal{H})$, is defined as the sum of the $n$ Kullback-Leibler divergences from the hypothesized distributions to the true ones. For $i \in [n]$, let

$$D_i = D_{\mathrm{KL}}(P_i \,\|\, H_i) = \int_{\mathbf{x} \in B_i} P_i(\mathbf{x}) \log \frac{P_i(\mathbf{x})}{H_i(\mathbf{x})} \, \mu(d\mathbf{x}),$$

where $\mu(\cdot)$ denotes the measure over $B_i = B(\mathbf{q}_i, \beta N_i)$, and define

$$\mathcal{D}(\mathcal{P}, \mathcal{H}) = \sum_{i=1}^{n} D_i = \sum_{i=1}^{n} D_{\mathrm{KL}}(P_i \,\|\, H_i). \qquad (1)$$

As a baseline for comparisons, we introduce a naïve hypothesis, denoted $\overline{\mathcal{H}}$, which assumes no information about the locations of the objects, other than the fact that they lie within the bounding region. Recall that $B_0$ denotes this bounding region and $3B_0$ denotes its factor-3 expansion about the origin. Since all the points of $Q$ lie within $B_0$ and $0 < \beta < 1$, $3B_0$ is guaranteed to contain all the $\beta$-local feature regions. For all $i \in [n]$, let $\overline{H}_i$ be the uniform distribution over $3B_0$, and let $\overline{\mathcal{H}} = \{\overline{H}_1, \ldots, \overline{H}_n\}$. Regardless of the initial truth, the initial distance for the $i$th object satisfies the following bound.

$$\overline{D}_i = \int_{\mathbf{x} \in B_i} P_i(\mathbf{x}) \log \frac{P_i(\mathbf{x})}{\overline{H}_i(\mathbf{x})}\,\mu(d\mathbf{x}) = \int_{\mathbf{x} \in B_i} P_i(\mathbf{x}) \log \frac{1/(\beta N_i)^d}{1/(3\,\mathrm{radius}(B_0))^d}\,\mu(d\mathbf{x})$$
$$= \log\left(\frac{3\,\mathrm{radius}(B_0)}{\beta N_i}\right)^{d} \int_{\mathbf{x} \in B_i} P_i(\mathbf{x})\,\mu(d\mathbf{x}) \;\le\; d\left(\log \Lambda_0 + \log\frac{3}{\beta}\right).$$

Let $\overline{\mathcal{D}} = \sum_{i \in [n]} \overline{D}_i$. Under our assumption that $d$ and $\beta$ are constants, we have $\overline{\mathcal{D}} \in \Theta(n \log \Lambda_0)$. The initial aspect ratio $\Lambda_0$ depends on the initial configuration of the points of $Q$ and can be arbitrarily large. Even in the best case, when the points are uniformly distributed in $B_0$, we have $\Lambda_0 = \Omega(n^{1/d})$, and hence $\overline{\mathcal{D}} \in \Omega(n \log n)$. The objective of our algorithm is to maintain a significantly smaller bound on this distance.
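Since $P_i$ is uniform, $D_i = \mathbb{E}_{\mathbf{x} \sim P_i}[\log P_i(\mathbf{x}) - \log H_i(\mathbf{x})]$, so the distance can be estimated by sampling from $P_i$ alone. A Monte Carlo sketch (ours; `log_h` is any function returning $\log H_i$ at a batch of points):

```python
from math import pi, lgamma, log
import numpy as np

def kl_from_uniform_ball(q_i, l_i, log_h, rng, m=100_000):
    """Monte Carlo estimate of D_i = D_KL(P_i || H_i), P_i = U(B(q_i, l_i))."""
    d = q_i.shape[0]
    # log density of P_i on its support: -log vol(B(q_i, l_i))
    log_vol = (d / 2) * log(pi) - lgamma(d / 2 + 1) + d * log(l_i)
    dirs = rng.normal(size=(m, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    x = q_i + dirs * (l_i * rng.random(m) ** (1.0 / d))[:, None]
    return float(np.mean(-log_vol - log_h(x)))
```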

The combination of evolver, oracle, and distance function constitutes a model of evolving motion, which we henceforth call the $(\alpha, \beta)$-local-motion model.

Class of Algorithms

We assume that the algorithm that maintains the hypothesis runs in discrete steps over time. With each step, it may query the oracle a constant number of times, perform a constant amount of work, and then update the current set of hypotheses. The number of oracle queries is independent of n but can depend on dimension d, local feature scale factor β, and motion factor α. In our case, this work takes the form of updating the hypothesis for the object that was queried. In the purest form of the evolving data framework, the evolver and algorithm alternate [2]. Instead, similar to the generalized framework proposed in [1], we assume that there is a fixed speed-up factor, denoted σ. When amortized over the entire run, the ratio of the number of steps taken by the algorithm and the evolver does not exceed σ.

Objective and Results

The objective of the algorithm is as follows. Given any starting ground-truth configuration, after an initial “burn-in” period, the algorithm guarantees that the hypothesis is within a bounded distance of the truth subject to model assumptions and the given speed-up factor. Our main result is that there exists an algorithm with constant speed-up factor σ that maintains a distance of O(n) in steady state.

Theorem 1.

Consider a set of $n$ evolving objects in $\mathbb{R}^d$ under the $(\alpha, \beta)$-local-motion model, for constants $\alpha$ and $\beta$, where $0 < \alpha, \beta < 1/3$. There exists an algorithm with constant speed-up $\sigma$ and burn-in time $t_0 \in O(n \log \Lambda_0 \log\log \Lambda_0)$ such that for all $t \ge t_0$, this algorithm maintains a distance of $O(n)$.

The algorithm and its analysis are presented in Section 4. Given that no algorithm can guarantee an exact match between hypothesis and truth, it is natural to wonder how close this is to optimal. In Section 3, we show that it is asymptotically optimal: for any algorithm and any constant speed-up factor, there exists an evolver that can force a distance of $\Omega(n)$ in steady state.

Why the KL Divergence?

The principal challenge in generalizing the evolving framework from simple 1-dimensional applications like sorting to a multidimensional setting is the lack of an obvious distance measure that captures how close the hypothesis is to the current state. Our approach is motivated by an information-theoretic perspective. The KL divergence $D_i = D_{\mathrm{KL}}(P_i \,\|\, H_i)$ serves as a measure of how different the actual object distribution $P_i$ is from our hypothesis $H_i$. (Note that the KL divergence is asymmetric, which is to be expected, given the asymmetric roles of the truth and our hypothesis.)

As an example of this information-theoretic approach, consider the following application in coding theory and space quantization [10]. The objective is to communicate the location of an imprecise point $X_i$ at an arbitrary resolution $\delta$. From Shannon's source coding theorem [26], the theoretical lower bound for the expected number of bits required for communication is given by the Shannon entropy [30]

$$b_{P_i} = \sum_{C_\delta \subseteq B_i} P_i(C_\delta) \log \frac{1}{P_i(C_\delta)},$$

where $C_\delta$ is a cell (a $d$-dimensional box) of size $\delta$ in $\mathbb{R}^d$, and $P(C)$, by a slight abuse of notation, is the probability associated with a cell $C$ under the underlying probability distribution $P$. This is roughly achieved by a Huffman coding [18] of the space around $\mathbf{q}_i$, which intuitively assigns a number of bits proportional to the log of the inverse of the probability associated with an event, which in our case is $X_i$ lying within a cell $C_\delta$ of size $\delta$.

Since the Pi’s are not available to us, we use a similar strategy of encoding the space using Hi. Therefore, in expectation, we use roughly

$$b_{H_i} = \sum_{C_\delta \subseteq B_i} P_i(C_\delta) \log \frac{1}{H_i(C_\delta)}$$

bits. When $\delta$ is arbitrarily small, the difference $(b_{H_i} - b_{P_i}) \to D_i$. Therefore, $\mathcal{D}(\mathcal{P}, \mathcal{H}) = \sum_{i \in [n]} D_i$ roughly represents the expected number of additional bits needed to encode the locations of the $X_i$'s individually using just the $H_i$'s, compared to the absolute theoretical limit.

Paper Organization

The remainder of the paper is organized as follows. In the next section, we present Theorem 2, which provides a lower bound on the distance achievable by any algorithm in our model. In Section 4, we present the algorithm and analyze its steady-state performance. In Section 5, we discuss possible relaxations to the assumptions made in our model.

3 Lower Bound

In this section, we show that maintaining $\mathcal{D}(\mathcal{P}, \mathcal{H}) \in O(n)$ is the best we can hope to achieve. This is established in the following theorem. Due to space limitations, the proof has been deferred to Section A.1.

Theorem 2.

For any algorithm $\mathcal{A}$, there exists a starting configuration $Q^{(0)}$ and an evolver (with knowledge of $\mathcal{A}$) in the local-motion model such that, for any positive integer $t_0$, there exists $t \ge t_0$ such that $\mathcal{D}^{(t)} = \mathcal{D}(\mathcal{P}^{(t)}, \mathcal{H}^{(t)}) \in \Omega(n)$.

The proof follows a structure similar to other lower-bound proofs in the evolving data framework [3, 6, 1]. Intuitively, over a period of time of length $cn$, for some constant $c < 1$, the algorithm can inspect the locations of only a constant fraction of the points. During this time, the evolver can move a sufficient number of uninspected points so that the overall distance increases by $\Omega(n)$.

4 Algorithm

In this section, we present our algorithm and analyze its performance. We begin by defining our hypotheses. For every index $i$, we define two parameters: $h_i \in \mathbb{R}^+$ and $\mathbf{k}_i = (k_{i,1}, \ldots, k_{i,d}) \in \mathbb{R}^d$. For $\mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$, we set the hypothesis probability density for the $i$th point to be the $d$-fold product of independent Cauchy distributions, one per coordinate, centered at $\mathbf{k}_i$ with scaling parameter $h_i$ [33]. That is,

$$H_i(\mathbf{x}) = f_{h_i, \mathbf{k}_i}(\mathbf{x}) = \frac{1}{\pi^d h_i^d} \prod_{j=1}^{d} \left(1 + \frac{(x_j - k_{i,j})^2}{h_i^2}\right)^{-1}. \qquad (2)$$

(Note that this differs from the standard multivariate Cauchy distribution [33].) If we let $f_{1,\mathbf{0}}(\mathbf{x})$ denote the probability density function of the standard $d$-fold product of Cauchy distributions (centered at the origin, unit scale), we can express this equivalently as

$$f_{h_i, \mathbf{k}_i}(\mathbf{x}) = \frac{1}{h_i^d}\, f_{1,\mathbf{0}}\!\left(\frac{\mathbf{x} - \mathbf{k}_i}{h_i}\right), \quad\text{where}\quad f_{1,\mathbf{0}}(\mathbf{x}) = \frac{1}{\pi^d} \prod_{j=1}^{d} \left(1 + x_j^2\right)^{-1}.$$

It will be convenient to define the hypothesis ball $B_i^H = B(\mathbf{k}_i, h_i)$, which we use to illustrate $H_i$. Note that, unlike our truth distributions $P_i$, our hypothesis distributions have unbounded support. Our algorithm modifies the parameters $\mathbf{k}_i$ and $h_i$ in response to information received from the oracle.
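For concreteness, the log-density of Eq. (2), vectorized over a batch of query points, is as follows (a sketch; it can serve as the `log_h` argument of the earlier KL estimator):

```python
import numpy as np

def log_hypothesis_density(x, k, h):
    """log f_{h,k}(x) for the d-fold product of Cauchy distributions (Eq. 2)."""
    d = k.shape[0]
    z = (x - k) / h                          # broadcasts over an (m, d) batch
    return -d * np.log(np.pi * h) - np.log1p(z ** 2).sum(axis=-1)
```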

4.1 Potential Function

Due to the subtleties of tracking the Kullback-Leibler divergence directly, we introduce a potential function $\Phi$ to aid in the analysis. For each object index $i \in [n]$, we define an individual potential function $\Phi_i$, which bounds the distance $D_i$ up to additive and multiplicative constants. The total potential $\Phi$ is the sum of these functions.

Let $s_i = \|\mathbf{q}_i - \mathbf{k}_i\|$ denote the Euclidean distance between the center of the actual distribution and the center of the hypothesis ball. Let $l_i = \beta N_i$ denote the radius of the local feature region of $\mathbf{q}_i$. Recall that $h_i$ is the radius of the hypothesis ball (see Figure 2). We define the individual and system potentials to be

$$\Phi_i = \Phi(P_i, H_i) = \log\frac{\max(s_i, l_i, h_i)^2}{l_i\, h_i} \quad\text{and}\quad \Phi = \Phi(\mathcal{P}, \mathcal{H}) = \sum_{i \in [n]} \Phi_i. \qquad (3)$$
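In code, the individual potential is a one-liner (a sketch transcribing Eq. (3)):

```python
import numpy as np

def potential_i(q_i, k_i, l_i, h_i):
    """Phi_i = log( max(s_i, l_i, h_i)^2 / (l_i * h_i) ), s_i = ||q_i - k_i||."""
    s_i = float(np.linalg.norm(q_i - k_i))
    return 2.0 * np.log(max(s_i, l_i, h_i)) - np.log(l_i) - np.log(h_i)
```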

The following lemma shows that, up to additive and scalar constants that depend on the dimension, distances can be bounded above by this potential.

Figure 2: The definition of the individual potential Φi.
Lemma 3.

There exists a constant $c_d$ (depending only on the dimension) such that for any $i \in [n]$, $D_i \le c_d(1 + \Phi_i)$, and hence $\mathcal{D} \le c_d(n + \Phi)$.

Proof.

Recall that $P_i(\mathbf{x})$ is the probability density function of a uniform distribution over a ball of radius $l_i$. Thus, for $\mathbf{x} \in B_i$, $P_i(\mathbf{x}) = 1/(\omega_d\, l_i^d)$, where $\omega_d$ denotes the volume of the unit Euclidean ball in $\mathbb{R}^d$, and $P_i(\mathbf{x}) = 0$ for $\mathbf{x} \notin B_i$. Therefore,

$$D_i = \int_{B_i} P_i(\mathbf{x}) \log\frac{P_i(\mathbf{x})}{H_i(\mathbf{x})}\,\mu(d\mathbf{x}) = \int_{B_i} P_i(\mathbf{x}) \log\frac{1/(\omega_d\, l_i^d)}{(\pi^d h_i^d)^{-1} \prod_{j=1}^{d}\bigl(1 + (x_j - k_{i,j})^2/h_i^2\bigr)^{-1}}\,\mu(d\mathbf{x})$$
$$= \int_{B_i} P_i(\mathbf{x}) \log\frac{\pi^d}{\omega_d}\,\mu(d\mathbf{x}) + \int_{B_i} P_i(\mathbf{x}) \log\left[\frac{h_i^d}{l_i^d}\prod_{j=1}^{d}\left(1 + \frac{(x_j - k_{i,j})^2}{h_i^2}\right)\right]\mu(d\mathbf{x}). \qquad (4)$$

For $\mathbf{x} \in B_i$ and $j \in [d]$, by the triangle inequality, we have

$$|x_j - k_{i,j}| \le \|\mathbf{x} - \mathbf{k}_i\| \le \|\mathbf{x} - \mathbf{q}_i\| + \|\mathbf{q}_i - \mathbf{k}_i\| \le l_i + s_i.$$

Therefore,

$$D_i \le \log\frac{\pi^d}{\omega_d} + \int_{\mathbf{x}\in B_i} P_i(\mathbf{x}) \log\left[\frac{h_i^d}{l_i^d}\prod_{j=1}^{d}\left(1 + \frac{(s_i + l_i)^2}{h_i^2}\right)\right]\mu(d\mathbf{x})$$
$$\le \log\frac{\pi^d}{\omega_d} + \log\left[\frac{h_i^d}{l_i^d}\left(1 + \frac{(s_i + l_i)^2}{h_i^2}\right)^{d}\right]\int_{\mathbf{x}\in B_i} P_i(\mathbf{x})\,\mu(d\mathbf{x})$$
$$= \log\frac{\pi^d}{\omega_d} + d\log\frac{h_i^2 + (s_i + l_i)^2}{l_i\, h_i} \le c_d\left(1 + \log\frac{\max(s_i, l_i, h_i)^2}{l_i\, h_i}\right),$$

for a suitably chosen cd that depends only on the dimension.

 Remark (Potential function and approximating pairwise distances).

The potential function $\Phi_i$ defined here is symmetric in its arguments. It is remarkable that a symmetric function bounds the asymmetric quantity $D_i$ under the choice of Cauchy distributions as hypotheses.

To demonstrate the value of our potential function, suppose that for all $i$, $\Phi_i < \log c$, for some constant $c$. This implies that $h_i^2 < c\, l_i h_i$, which further implies $h_i < c' l_i$ for some constant $c'$. Similarly, $s_i^2 < c\, l_i h_i < c\, c' l_i^2$, so $s_i < c'' l_i$ for some constant $c''$. By the triangle inequality, the distance between the centers of two hypothesis balls satisfies

$$\|\mathbf{k}_i - \mathbf{k}_j\| \le s_i + s_j + \|\mathbf{q}_i - \mathbf{q}_j\| \le c'' l_i + c'' l_j + \|\mathbf{q}_i - \mathbf{q}_j\|.$$

Since $l_i = \beta N_i \le \beta\,\|\mathbf{q}_i - \mathbf{q}_j\|$ (and likewise for $l_j$), we have $\|\mathbf{k}_i - \mathbf{k}_j\| \le c\,\|\mathbf{q}_i - \mathbf{q}_j\|$, for some constant $c$.

Since $\|\mathbf{k}_i - \mathbf{k}_j\|$ is thus a constant-factor approximation of $\|\mathbf{q}_i - \mathbf{q}_j\|$ (a symmetric argument yields the matching lower bound), it immediately follows that we can maintain a constant-factor approximation of the weight of structures like the Euclidean minimum spanning tree and the Euclidean traveling salesman tour on $Q$ by constructing the same structures on the $\mathbf{k}_i$'s, the centers of the hypotheses $H_i$.

An important feature of our choice of potential function is that the evolver cannot change its value by more than a constant with each step. Recall that the evolver selects an object index $i$ and moves $\mathbf{q}_i$ by a distance of at most $\alpha N_i$ in some direction. This changes the values of $l_i$ and $s_i$ by only a bounded fraction, and hence the resulting change in $\Phi_i$ is bounded by a constant. However, the movement of $\mathbf{q}_i$ could potentially affect the local feature sizes of a number of other objects in the system. We can show by a packing argument that there is only a constant number of indices $j$ whose $\Phi_j$ values are affected, and that these values change by only a constant. This is encapsulated in the following lemma. Its proof has been deferred to Section A.1.

Lemma 4.

Each step of the evolver increases the potential Φ by at most a constant.

4.2 The Algorithm

Next, we present our algorithm. The algorithm runs in parallel to the evolver, running faster by a speed-up factor of σ (whose value will be derived in our analysis). Each step of the algorithm involves a probe of an object by the oracle, and following that, a possible modification of a hypothesis, Hi. The total running time of the algorithm is proportional to the number of steps.

Let us begin with a high-level explanation; the details are provided in Algorithm TrackByZoom. Recall that we are given a set of $n$ objects, where the $i$th object $X_i$ is distributed uniformly over a ball $B_i = B(\mathbf{q}_i, \beta N_i)$, where $N_i$ is the distance from $\mathbf{q}_i$ to its nearest neighbor and $\beta$ is the local feature scale. The points $\mathbf{q}_i$ are restricted to lie within a bounding ball $B_0$ centered at the origin. The algorithm maintains a hypothesis in the form of $n$ Cauchy-product distributions, each represented by a hypothesis ball $B_i^H = B(\mathbf{k}_i, h_i)$, where $\mathbf{k}_i$ is the center of the distribution and $h_i$ is the distribution's scaling parameter.

The algorithm begins with an initial hypothesis, where each hypothesis ball is just $B_0$. This reflects the fact that we effectively assume nothing about the location of $\mathbf{q}_i$, except that it lies within the bounding ball. The algorithm then proceeds in a series of iterations, where each iteration handles all $n$ indices in order. The handling of each index involves two processes, called zoom-out and zoom-in.

In the zoom-out process, we query the oracle on index $i$ to check whether (1) the sampled point $X_i$ lies within its hypothesis ball and (2) at least one other $X_j$ lies within a concentric ball whose radius is larger by a factor of $1/\beta$. If so, we expand the hypothesis ball by a factor of $3/(1-2\beta)$. (We show in Lemma 5 that this guarantees that the hypothesis ball $B_i^H$ now contains the local feature ball $B_i$.) We then proceed to the zoom-in process. If not, we double the hypothesis ball and repeat (see Figure 3).
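In outline, the zoom-out loop might read as follows (our schematic, not the paper's pseudocode; `query(i, center, radius)` stands for the oracle $\mathcal{O}$, and the two conditions are tested with two calls where the pseudocode may batch them):

```python
def zoom_out(i, k_i, h_i, beta, query):
    """Grow B(k_i, h_i) until it captures X_i and a neighbor appears in the
    1/beta-expansion; then pad by 3/(1 - 2*beta) so that B_i is contained."""
    while True:
        self_in, _ = query(i, k_i, h_i)
        _, nbr_in = query(i, k_i, h_i / beta)
        if self_in == 'Y' and nbr_in == '+':
            return k_i, h_i * 3 / (1 - 2 * beta)    # Lemma 5: B_i is inside
        h_i *= 2
```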

Algorithm TrackByZoom: Tracking Evolving Distributions.
(a) Multiple stages of the zoom-out process starting at $t = 0$. Stage $t = 2$ is the first time $B_i^{H,(t)}$ contains $X_i$ and its $1/\beta$-expansion contains some $X_j$.
(b) $B_i^{H,(3)}$ is the hypothesis ball at the end of the zoom-out process. Note that $B_i \subseteq B_i^{H,(3)}$.
Figure 3: The zoom-out process of TrackByZoom (Line 6).

In the zoom-in process, we first check whether $X_i$ lies within the hypothesis ball. (The evolver may have moved $\mathbf{q}_i$.) If not, we return to the zoom-out process. If so, we next check the $1/\beta$-expansion of this ball. If there is no other $X_j$ in this expanded ball, we accept the current hypothesis for $i$ and go on to the next point in the set. (We show in Lemma 10 that $\Phi_i$ at this point in time is bounded by a constant.) If there is such an $X_j$, however, we may need to shrink the hypothesis ball for the current index. We cover the hypothesis ball with a collection of $O(1)$ balls whose radii are smaller by a constant factor, and we test whether $X_i$ lies within any of them (see Figure 4). As soon as we find one, we shrink the hypothesis ball about this covering ball. We repeat this process until one of the earlier conditions applies (causing us to return to the zoom-out process or move on to the next point).
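The zoom-in loop, in the same schematic style (ours; `cover(k, h)` is a hypothetical helper returning the centers and radii of the $O(1)$ covering balls of Figure 4):

```python
def zoom_in(i, k_i, h_i, beta, query, cover):
    """Shrink the hypothesis ball while keeping B_i inside it (Lemma 5)."""
    while True:
        if query(i, k_i, h_i)[0] != 'Y':
            return k_i, h_i, 'zoom_out'      # evolver moved q_i; start over
        if query(i, k_i, h_i / beta)[1] == '-':
            return k_i, h_i, 'done'          # accept: Phi_i <= phi_0 (Lemma 10)
        for c, r in cover(k_i, h_i):         # covering balls of smaller radius;
            if query(i, c, r)[0] == 'Y':     # one of them must contain X_i
                k_i, h_i = c, r * 3 / (1 - 2 * beta)   # shrink about this ball
                break
```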

(a) The set of covering balls for $B_i^{H,(0)}$, which we call $\Psi$. Here $\mathrm{radius}(\Psi_i) = \mathrm{radius}(B_i^{H,(0)}) / \bigl(2 \cdot \frac{3}{1-2\beta}\bigr)$, and $|\Psi| \in O(1)$.
(b) $\Psi_k$ is the first covering ball found to contain $X_i$. $B_i^{H,(k)}$ contains $X_i$, and its $1/\beta$-expansion does not contain any $X_j$. TrackByZoom moves on to index $i+1$.
Figure 4: Illustration of the zoom-in process in TrackByZoom (Line 14).

4.3 Analysis

In this section, we analyze the distance that the algorithm maintains with the ground truth. Recall that the initial hypotheses are based on the bounding ball $B_0$, which is centered at the origin. Letting $R_0 = \mathrm{radius}(B_0)$, we have $H_i^{(0)}(\mathbf{x}) = f_{R_0, \mathbf{0}}(\mathbf{x})$, as defined in Eq. (2). Using Eq. (3), the initial potential function satisfies

$$\Phi_i^{(0)} = \log\bigl(R_0 / l_i^{(0)}\bigr), \quad\text{and}\quad \Phi^{(0)} \in O(n \log \Lambda_0), \qquad (5)$$

where $\Lambda_0$ is the initial aspect ratio.

Suppose that the conditions in lines 8 and 23 are satisfied. We prove the following lemma to justify the expansion factor $3/(1-2\beta)$. The proof is rather technical and has been deferred to the full version of the paper.

Lemma 5 (Expansion Factor).

If the oracle reports that $X_i \in B(\mathbf{k}_i, h_i)$ and that some $X_j$, $j \neq i$, lies in $B(\mathbf{k}_i, h_i/\beta)$, then $B_i \subseteq B\bigl(\mathbf{k}_i, \frac{3}{1-2\beta}\, h_i\bigr)$.

Let us consider how Algorithm TrackByZoom affects the potential. For a particular index $i$, during the zoom-out process, we double the radius $h_i$ of the hypothesis ball. If the Euclidean distance $s_i$ between the centers of the hypothesis ball and the local-feature ball is much larger than $h_i$, then $\Phi_i = \log(\max(s_i, l_i, h_i)^2 / (l_i h_i)) = \log(s_i^2 / (l_i h_i))$ decreases by an additive constant. However, once $h_i$ is greater than $s_i$, the algorithm risks increasing $\Phi$. We show that the number of steps in which $\Phi$ increases is a constant for every index $i$. During the zoom-in process, on the other hand, we maintain the invariant that the hypothesis ball contains the local-feature ball (Lemma 5), which implies that $\max(s_i, l_i, h_i) = h_i$. Therefore, $\Phi$ decreases by a constant amount whenever we shrink the hypothesis ball by a constant factor. However, if the evolver moves $\mathbf{q}_i$ while index $i$ is being processed, our algorithm may transition back to the zoom-out process. Hence, if the evolver executes some $\varepsilon_z$ steps in the $z$th iteration, our algorithm may increase $\Phi$ in only $O(n + \varepsilon_z)$ of its steps. We obtain the following lemma; the proof is deferred to Section A.1.

Lemma 6 (Algorithm’s contribution towards Φ).

For any iteration $z$, let $\varepsilon_z$ denote the number of steps executed by the evolver during that iteration. Then TrackByZoom increases $\Phi$ in at most $O(n + \varepsilon_z)$ of its steps, each time by $O(1)$. In each of the remaining steps of the $z$th iteration, it decreases $\Phi$ by $\Theta(1)$, in an amortized sense.

Let $\Phi_z$ denote the potential at the start of the $z$th iteration, and let $\Delta_z$ denote the time taken by the $z$th iteration of the algorithm. At the conclusion of the zoom-in process for a particular index $i$, the algorithm has fixed a hypothesis $H_i$ that is at a reasonable distance from the truth $P_i$; in fact, we show in Lemma 10 that at this point in time, $\Phi_i$ is at most $\phi_0$, a constant. Therefore, by Lemma 6, TrackByZoom running at a constant speed-up $\sigma$ spends, up to constant factors, $\Phi_i^z$ time reducing $\Phi_i^z$ to $\phi_0$. Summing over all indices, we see that the time taken by the $z$th iteration, $\Delta_z$, is proportional to $n + \Phi_z$. Intuitively, for a sufficiently large speed-up factor, the actions of the evolver can affect $\Delta_z$ by only a small fraction of this amount. We summarize these observations in the following lemma; the proof is deferred to Section A.1.

Lemma 7 (Iteration time proportional to Potential).

There exists a constant speed-up factor $\sigma$ for TrackByZoom such that $\Delta_z \in \Theta_\sigma(n + \Phi_z)$.

We are now ready to derive how $\Phi_z$ changes over the course of multiple iterations. We show that for a sufficiently large (constant) speed-up factor $\sigma$, after $O(\log\log \Lambda_0)$ iterations, $\Phi$ converges to $O(n)$. (Recall that $\Lambda_0$ is the initial aspect ratio.)

Lemma 8.

There exist a constant speed-up factor $\sigma$ for Algorithm TrackByZoom and a $z_0$ such that for all $z > z_0$, $\Phi_z \in O(n)$.

Proof.

By Lemma 4, in the $z$th iteration the evolver takes at most $\varepsilon_z = O(\Delta_z)$ steps and increases $\Phi$ by at most $O(\Delta_z)$. By Lemma 6, TrackByZoom increases $\Phi$ by $O(n + \Delta_z)$. Since the algorithm runs at a speed-up factor of $\sigma$, it takes $\sigma\Delta_z$ steps, and by Lemma 6, it decreases $\Phi$ by at least $\lambda\bigl(\sigma\Delta_z - O(n + \Delta_z)\bigr)$, for some constant $\lambda > 0$. Accounting for all the increases and decreases, we have

$$\Phi_{z+1} \le \Phi_z + O(\Delta_z) + O(n + \Delta_z) - \lambda\bigl(\sigma\Delta_z - O(n + \Delta_z)\bigr) \le \Phi_z + O(\Delta_z) - \sigma\lambda\Delta_z + O(n).$$

There exists a sufficiently large $\sigma \in O(1)$ such that $\sigma\lambda\Delta_z - O(\Delta_z) \ge a\Delta_z$, for some constant $a > 0$. So,

$$\Phi_{z+1} \le \Phi_z - a\Delta_z + O(n).$$

By Lemma 7, for a sufficiently large (constant) $\sigma$, there exists $c_\sigma > 0$ such that $\Delta_z \ge c_\sigma(n + \Phi_z)$. Therefore,

$$\Phi_{z+1} \le (1 - ac_\sigma)\Phi_z + (1 - ac_\sigma)\,O(n) \le (1 - ac_\sigma)^{z+1}\Phi_0 + \left(\sum_{y=1}^{z+1}(1 - ac_\sigma)^y\right)O(n).$$

By Eq. (5), $\Phi_0 = \Phi^{(0)} \in O(n \log \Lambda_0)$, and since there exists $b > 1$ such that $(1 - ac_\sigma) < 1/b < 1$, we have

$$\Phi_{z+1} \le \frac{O(n \log \Lambda_0)}{b^{z+1}} + O(n). \qquad (6)$$

Observe that this is $O(n)$ whenever $z > \log_b \log \Lambda_0$, that is, after $O(\log\log \Lambda_0)$ iterations.

If $\Phi_z \in O(n)$, then by Lemma 7, $\Delta_z \in \Theta(n)$. Thus, $\Phi$ increases by at most $O(n)$ over the course of iteration $z$, and therefore $\Phi^{(t)} \in O(n)$ for any time $t$ during iteration $z$ as well.

Using Eq. (6), we observe that for all $z \ge 1$, $\Phi_z \in O(n \log \Lambda_0)$. By Lemma 7, this implies that $\Delta_z \in O(n \log \Lambda_0)$. Hence, for some $t_0 \in O(n \log \Lambda_0 \log\log \Lambda_0)$, we have $\Phi^{(t)} \in O(n)$ for all $t > t_0$.

By combining Lemmas 8 and 3, we complete the proof of Theorem 1. It is noteworthy that there is a trade-off between the speed-up factor $\sigma$ and the motion parameter $\alpha$. Recall that with each step, the evolver can move $\mathbf{q}_i$ by a distance of up to $\alpha N_i$. If we reduce this parameter to $\alpha' = (1+\alpha)^{1/c} - 1$, then it takes the evolver at least $c$ steps to effect the same change as before. Thus, by reducing the local-motion factor, it is possible to achieve a speed-up factor of $\sigma = 1$. Formally, we have:

Corollary 9.

There exists a motion parameter $\alpha' < 1$ such that, for a set of $n$ evolving objects in $\mathbb{R}^d$ under the $(\alpha', \beta)$-local-motion model with $\beta < 1/3$, Algorithm TrackByZoom (with speed-up factor 1) maintains a distance of $O(n)$ for all $t \ge t_0$, where $t_0 \in O(n \log \Lambda_0 \log\log \Lambda_0)$.
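As a quick check of the substitution behind Corollary 9: $c$ evolver steps at rate $\alpha'$ compound to at most a single step at rate $\alpha$, since each step scales distances by a factor of at most $1 + \alpha'$:

$$(1 + \alpha')^{c} = \bigl((1+\alpha)^{1/c}\bigr)^{c} = 1 + \alpha.$$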

5 Extensions and Conclusion

The local-motion model introduced in Section 2 assumes a uniform probability distribution for each object. However, the algorithm TrackByZoom, with an appropriate speed-up factor, can be applied to a wider class of probability distributions. A distribution $P_i$ in this class has the local feature region $B_i$ of the $i$th object as its support and is scaled according to $l_i$, the local feature size. Many natural truncated distributions [11], such as the uniform distribution, the truncated normal distribution [27], and the truncated Cauchy distribution, belong to this family. Details are presented in the full version of the paper.

An additional refinement of the motion model might involve redefining the local feature region and the speed of a point based on the distance to its $k$ nearest neighbors, for a fixed $k$, instead of just the nearest-neighbor distance. We conjecture that our principal results extend to this generalization as well, given a suitable range-counting oracle. Yet another variation of the model could involve letting the uncertainty associated with a point have unbounded support (e.g., a normal rather than a truncated normal distribution). This is a significantly harder problem, as a point may lie outside its local feature region with constant probability, requiring any maintenance algorithm to have a super-constant speed-up. We aim to study these variants in the future.

References

  • [1] A. Acharya and D. M. Mount. Optimally tracking labels on an evolving tree. In Proc. 34th Canad. Conf. Comput. Geom., pages 1–8, 2022.
  • [2] A. Anagnostopoulos, R. Kumar, M. Mahdian, and E. Upfal. Sorting and selection on dynamic data. Theoretical Computer Science, 412(24):2564–2576, 2011. doi:10.1016/j.tcs.2010.10.003.
  • [3] A. Anagnostopoulos, R. Kumar, M. Mahdian, E. Upfal, and F. Vandin. Algorithms on evolving graphs. In Proc. 3rd Innov. Theor. Comput. Sci. Conf., pages 149–160, 2012. doi:10.1145/2090236.2090249.
  • [4] M. Beermann and A. Sieben. The connection between stress, density, and speed in crowds. Scientific Reports, 13:13626, 2023. doi:10.1038/s41598-023-39006-8.
  • [5] J. J. Besa, W. E. Devanny, D. Eppstein, M. T. Goodrich, and T. Johnson. Optimally sorting evolving data. In 45th Internat. Colloq. Autom., Lang., and Prog., pages 81:1–81:13, 2018. doi:10.4230/LIPIcs.ICALP.2018.81.
  • [6] J. J. Besa, W. E. Devanny, D. Eppstein, M. T. Goodrich, and T. Johnson. Quadratic time algorithms appear to be optimal for sorting evolving data. In Proc. 20th Workshop Algorithm Eng. and Exp., pages 87–96, 2018.
  • [7] A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 2005.
  • [8] K. Buchin, M. Löffler, P. Morin, and W. Mulzer. Delaunay triangulation of imprecise points simplified and extended. Algorithmica, 61:674–693, 2011. doi:10.1007/s00453-010-9430-0.
  • [9] D. Busto, W. Evans, and D. Kirkpatrick. Minimizing interference potential among moving entities. In Proc. 30th Annu. ACM-SIAM Sympos. Discrete Algorithms, pages 2400–2418, 2019. doi:10.5555/3310435.3310582.
  • [10] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 2012. doi:10.1002/047174882X.
  • [11] Y. Dodge. The Oxford Dictionary of Statistical Terms. Oxford Univ. Press, 2003. doi:10.1002/sim.1812.
  • [12] W. Evans and D. Kirkpatrick. Minimizing query frequency to bound congestion potential for moving entities at a fixed target time. Algorithms, 17:246, 2024. doi:10.3390/a17060246.
  • [13] J. Gudmundsson, P. Laube, and T. Wolle. Computational movement analysis. In W. Kresse and D. Danko, editors, Handbook of Geogr. Inf., pages 725–741. Springer, 2012. doi:10.1007/978-3-540-72680-7_22.
  • [14] L. J. Guibas, D. Salesin, and J. Stolfi. Epsilon geometry: Building robust algorithms from imprecise computations. In Proc. Fifth Annu. Sympos. Comput. Geom., pages 208–217, 1989. doi:10.1145/73833.73857.
  • [15] D. Helbing, L. Buzna, A. Johansson, and T. Werner. Self-organized pedestrian crowd dynamics: Experiments, simulations, and design solutions. Transportation Science, 39:1–24, 2005. doi:10.1287/trsc.1040.0108.
  • [16] O. Hesham and G. Wainer. Context-sensitive personal space for dense crowd simulation. In Proc. Symp. Simulation for Architecture and Urban Design, pages 1–8, 2017. URL: https://dl.acm.org/doi/10.5555/3289787.3289806.
  • [17] Q. Huang, X. Liu, X. Sun, and J. Zhang. Partial sorting problem on evolving data. Algorithmica, 79:960–983, 2017. doi:10.1007/s00453-017-0295-3.
  • [18] D. A. Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE, 40:1098–1101, 1952. doi:10.1109/JRPROC.1952.273898.
  • [19] U. Jain. Effects of population density and resources on the feeling of crowding and personal space. J. Social Psych., 127:331–338, 1987. doi:10.1080/00224545.1987.9713699.
  • [20] S. Kahan. A model for data in motion. In Proc. 23rd Annu. ACM Sympos. Theory Comput., pages 265–277, 1991. doi:10.1145/103418.103449.
  • [21] V. Kanade, N. Leonardos, and F. Magniez. Stable matching with evolving preferences. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 2016.
  • [22] M. G. Kendall. A new measure of rank correlation. Biometrika, 30:81–93, 1938. doi:10.2307/2332226.
  • [23] S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statist., 22:79–86, 1951. doi:10.1214/aoms/1177729694.
  • [24] P. Laube. Computational Movement Analysis. Springer Cham, 2014. doi:10.1007/978-3-319-10268-9.
  • [25] M. Löffler and M. van Kreveld. Largest and smallest convex hulls for imprecise points. Algorithmica, 56:235–269, 2010. doi:10.1007/s00453-008-9174-2.
  • [26] D. J. C. MacKay. Information Theory, Inference and Learning Algorithms. Cambridge University Press, 2003. URL: https://books.google.com/books?id=AKuMj4PN_EMC.
  • [27] F. Nielsen. Statistical divergences between densities of truncated exponential families with nested supports: Duo Bregman and duo Jensen divergences. Entropy, 24:421, 2022. doi:10.3390/e24030421.
  • [28] R. Pan, Z. Han, T. Liu, H. Wang, J. Huang, and W. Wang. An RFID tag movement trajectory tracking method based on multiple RF characteristics for electronic vehicle identification ITS applications. Sensors, 23:7001, 2023. doi:10.3390/s23157001.
  • [29] A. Seyfried, B. Steffen, W. Klingsch, and M. Boltes. The fundamental diagram of pedestrian movement revisited. J. Stat. Mech. Theory Exp., 2005:P10002, 2005. doi:10.1088/1742-5468/2005/10/P10002.
  • [30] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:623–656, 1948. doi:10.1002/j.1538-7305.1948.tb00917.x.
  • [31] S. M. Tomkiewicz, M. R. Fuller, J. G. Kie, and K. K. Bates. Global positioning system and associated technologies in animal behaviour and ecological research. Phil. Trans. R. Soc. B, 365:2163–2176, 2010. doi:10.1098/rstb.2010.0090.
  • [32] M. van Kreveld, M. Löffler, F. Staals, and L. Wiratma. A refined definition for groups of moving entities and its computation. Internat. J. Comput. Geom. Appl., 28(2):181–196, 2018. doi:10.4230/LIPIcs.ISAAC.2016.130.
  • [33] S. Verdú. The Cauchy distribution in information theory. Entropy, 25:346, 2023. doi:10.3390/e25020346.
  • [34] L. Wiratma, M. Löffler, and F. Staals. An experimental comparison of two definitions for groups of moving entities. In Proc. 10th Internat. Conf. Geogr. Inf. Sci., pages 64:1–64:6, 2018. doi:10.4230/LIPIcs.GIScience.2018.64.
  • [35] S. Yi, H. Li, and X. Wang. Understanding pedestrian behaviors from stationary crowd groups. In Proc. IEEE Conf. Comput. Vis. Pattern Recogn., pages 3488–3496, 2015. doi:10.1109/CVPR.2015.7298971.
  • [36] Y. Zou, G. Zeng, Y. Wang, X. Liu, X. Sun, J. Zhang, and Q. Li. Shortest paths on evolving graphs. In H. T. Nguyen and V. Snasel, editors, Computational Social Networks, pages 1–13. Springer, 2016. doi:10.1007/978-3-319-42345-6_1.

Appendix A Appendix

A.1 Deferred Proofs

Note that some of the restatements of the lemmas given here provide additional details, which were not present in the original statements.

The next lemma applies whenever Algorithm TrackByZoom has completed its processing of an object in Line 20. It shows that the potential value for this object does not exceed a constant.

Lemma 10 (Done tracking Xi).

There exists a constant $\phi_0$ such that whenever Algorithm TrackByZoom completes its processing of any object $i$, $\Phi_i \le \phi_0 \in O(1)$.

Proof.

The algorithm moves on to the next point when the condition in line 19 is satisfied, namely that $X_i$ lies in $B(\mathbf{k}_i, h_i)$ and no $X_j$, $j \neq i$, lies in $B(\mathbf{k}_i, h_i/\beta)$. Since $X_i \in B_i^H = B(\mathbf{k}_i, h_i)$ and $X_i \in B_i$, the balls $B_i$ and $B_i^H$ intersect. Moreover, if $\mathbf{q}_j$ is the current nearest neighbor of $\mathbf{q}_i$, the second condition implies that $X_j$ lies outside the $1/\beta$-expansion of $B_i^H$ (see Figure 5).

Figure 5: Proof of Lemma 10.

Now, because $\mathbf{q}_j$ is the nearest neighbor of $\mathbf{q}_i$, we have $N_j \le N_i$, and hence $l_j \le l_i$. By the triangle inequality, the following series of inequalities holds:

$$\frac{h_i}{\beta} \le \|X_j - \mathbf{k}_i\| \le \|X_j - \mathbf{q}_i\| + \|\mathbf{q}_i - \mathbf{k}_i\| \le \left(\frac{l_i}{\beta} + l_j\right) + (h_i + l_i) \le \left(\frac{1}{\beta} + 2\right) l_i + h_i,$$

implying that $\frac{1-\beta}{1+2\beta}\, h_i \le l_i$.

Now consider the hypothesis ball $B_i^{H\prime}$ set by the algorithm in the previous step (via Line 26), and let $h_i' = \mathrm{radius}(B_i^{H\prime})$. From Lemma 5, we have $B_i \subseteq B_i^{H\prime}$, so $l_i \le h_i'$. Since the algorithm shrinks the hypothesis ball by a factor of 2 with each step of the zoom-in process, $h_i' \le 2h_i$, and therefore $l_i \le 2h_i$. We also have $s_i \le h_i + l_i$ (see Figure 5). Therefore,

$$\Phi_i = \log\frac{\max(s_i, l_i, h_i)^2}{l_i\, h_i} \le \log\frac{(l_i + h_i)^2}{l_i\, h_i} \le \log\frac{(2h_i + h_i)^2}{\frac{1-\beta}{1+2\beta}\, h_i \cdot h_i} = \phi_0, \quad\text{for } \phi_0 = \log\frac{9(1+2\beta)}{1-\beta}.$$

Theorem 2 (Lower Bound on the Distance).

For any algorithm $\mathcal{A}$, there exists a starting configuration $Q^{(0)}$ and an evolver (with knowledge of $\mathcal{A}$) in the local-motion model such that, for any positive integer $t_0$, there exists $t \ge t_0$ such that $\mathcal{D}^{(t)} = \mathcal{D}(\mathcal{P}^{(t)}, \mathcal{H}^{(t)}) \in \Omega(n)$.

Proof.

We construct a point set on the real line $\mathbb{R}$ and explain at the end how to adapt the proof to any dimension $d$. Given the local feature constant $\beta$, we define $Q^{(0)}$ to be the following set of $n = 2m$ points in $\mathbb{R}$:

$$Q^{(0)} = \bigcup_{i \in [m]} \{a_i^{(0)} = 100i\} \;\cup\; \bigcup_{i \in [m]} \{b_i^{(0)} = 100i + 1\}.$$

For $k \in [m]$, let $\mathcal{D}_{a,k} = \mathcal{D}_{2(k-1)+1}$ denote the distance corresponding to the first point of the $k$th pair; we define $P_{a,k}$ and $H_{a,k}$ similarly. Let $N_i$ denote the current distance from $a_i$ to its nearest neighbor. The local feature interval of $a_i$, denoted $I_i$, is $100i + \beta N_i \cdot [-1, 1]$. Recall that $P_{a,i}$ is the uniform distribution over $I_i$; therefore, $P_{a,i}(\mathbf{x}) = 1/|I_i|$, where $|I_i| = 2\beta N_i$ is the length of $I_i$. Now,

$$\mathcal{D}_{a,i}^{(t_0)} = \int_{\mathbf{x}\in I_i} P_{a,i}^{(t_0)}(\mathbf{x}) \log\frac{P_{a,i}^{(t_0)}(\mathbf{x})}{H_{a,i}^{(t_0)}(\mathbf{x})}\, d\mathbf{x} = \int_{\mathbf{x}\in I_i} P_{a,i}^{(t_0)}(\mathbf{x}) \log P_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x} - \int_{\mathbf{x}\in I_i} P_{a,i}^{(t_0)}(\mathbf{x}) \log H_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x}.$$

By the concavity of the log function and Jensen’s inequality, we have

$$\mathcal{D}_{a,i}^{(t_0)} \ge -\log|I_i| \int_{\mathbf{x}\in I_i} P_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x} - \log \int_{\mathbf{x}\in I_i} H_{a,i}^{(t_0)}(\mathbf{x})\, P_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x}$$
$$= -\log|I_i| - \log\frac{\int_{\mathbf{x}\in I_i} H_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x}}{|I_i|} = -\log \int_{\mathbf{x}\in I_i} H_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x}. \qquad (7)$$

We may assume that $\mathcal{D}^{(t_0)} \in o(n)$, for otherwise we can set $t = t_0$ and are done. This implies that there are only $o(n)$ indices $i$ for which $\mathcal{D}_{a,i}^{(t_0)} \in \Omega(1)$. Call the remaining $n/2 - o(n)$ of the $a_i$'s the proximal set, denoted $T_a$.

We are now ready to describe the evolver's actions. It chooses to stay dormant until time $t_0$. For a large enough constant $M$, consider the time interval $[t_0, t_0 + n/M]$. The algorithm $\mathcal{A}$, running at a speed-up factor of $\sigma$, can modify at most $\sigma n/M$ hypotheses during this interval, and the evolver chooses not to move any of the corresponding points. Therefore, $\mathcal{A}$ can reduce $\mathcal{D}^{(t_0)}$ by only $o(n)$ over this time interval. Now, there are at least $n/2 - \sigma n/M - o(n)$ members of the proximal set $T_a$ whose hypotheses were not altered by $\mathcal{A}$. Call this the stable set, denoted $S_a$.

For $\kappa = \lceil \log_{1+\alpha} 2 \rceil$, the evolver selects some $n/(\kappa M)$ members of $S_a$; call this set $S_a'$. Specifically, for each $a_i \in S_a'$, the evolver moves $a_i$ away from $b_i$ $\kappa$ times, each time by a distance of exactly $\alpha N_{a,i}$, where $N_{a,i}$ is the distance from $a_i$ to its current nearest neighbor. (Note that the nearest neighbor remains $b_i$ throughout these operations.) Given this value of $\kappa$, after the conclusion of these operations, the distance between $a_i$ and $b_i$ is at least 2. For $\beta < 1/3$, the local feature of $a_i$ changes to an interval $I_i'$ with $I_i \cap I_i' = \emptyset$. Now, for $t = t_0 + n/M$, an analysis similar to that of Eq. (7) gives

$$\mathcal{D}_{a,i}^{(t)} \ge -\log \int_{\mathbf{x}\in I_i'} H_{a,i}^{(t)}(\mathbf{x})\, d\mathbf{x}. \qquad (8)$$

Now, for all $a_i \in S_a'$, we have $H_{a,i}^{(t)} = H_{a,i}^{(t_0)}$ and $\mathcal{D}_{a,i}^{(t_0)} \in o(1)$. Therefore, by Eq. (7),

$$\int_{\mathbf{x}\in I_i} H_{a,i}^{(t_0)}(\mathbf{x})\, d\mathbf{x} \ge e^{-o(1)}.$$

If $\mathcal{D}_{a,i}^{(t)} \in o(1)$ as well, then similarly, from Eq. (8), we have

$$\int_{\mathbf{x}\in I_i'} H_{a,i}^{(t)}(\mathbf{x})\, d\mathbf{x} \ge e^{-o(1)}.$$

But this yields a contradiction, since $I_i \cap I_i' = \emptyset$ and $H_{a,i}^{(t)} = H_{a,i}^{(t_0)}$ imply that

$$\int_{\mathbf{x}\in I_i \cup I_i'} H_{a,i}^{(t)}(\mathbf{x})\, d\mathbf{x} \ge 2 e^{-o(1)} > 1.$$

Therefore, for all $a_i \in S_a'$, $\mathcal{D}_{a,i}^{(t)} \in \Omega(1)$, and since $|S_a'| = \Omega(n)$, we have $\mathcal{D}^{(t)} \in \Omega(n)$.

For a general dimension $d$, we define $Q$ in the same way, except that all the points lie on a single coordinate axis, and the evolver moves the points only along that axis. The rest of the analysis involves integration over a region of space rather than an interval, but it is straightforward and carries over to $\mathbb{R}^d$.

Lemma 11 (Evolver’s contribution to Φ).

Every step of the evolver increases the potential function Φ by at most a constant.

Proof.

Suppose the evolver chooses to move $\mathbf{q}_i$ at time 0. Now, $l_i^{(0)}/\beta$ is the nearest-neighbor distance of $\mathbf{q}_i$ at time 0. Therefore, $\mathbf{q}_i$ moves by a distance of at most $(\alpha/\beta)\, l_i^{(0)}$, implying $s_i^{(1)} \le s_i^{(0)} + (\alpha/\beta)\, l_i^{(0)} \le (1 + \alpha/\beta) \max(s_i^{(0)}, l_i^{(0)}, h_i^{(0)})$. Similarly, $(1-\alpha)\, l_i^{(0)} \le l_i^{(1)} \le (1+\alpha)\, l_i^{(0)}$. Therefore,

$$\Phi_i^{(1)} - \Phi_i^{(0)} = \log\frac{\max(s_i^{(1)}, l_i^{(1)}, h_i^{(1)})^2}{l_i^{(1)} h_i^{(1)}} - \log\frac{\max(s_i^{(0)}, l_i^{(0)}, h_i^{(0)})^2}{l_i^{(0)} h_i^{(0)}} \le \log\frac{\max\bigl(s_i^{(0)} + (\alpha/\beta)\, l_i^{(0)},\; (1+\alpha)\, l_i^{(0)},\; h_i^{(0)}\bigr)^2}{(1-\alpha)\, l_i^{(0)} h_i^{(0)}} - \log\frac{\max(s_i^{(0)}, l_i^{(0)}, h_i^{(0)})^2}{l_i^{(0)} h_i^{(0)}} \le \log\frac{(1 + \alpha/\beta)^2}{1-\alpha} \in O(1). \qquad (9)$$

Let $\mathcal{N}_i$ be the set of indices of points whose nearest neighbor at time 0 was $\mathbf{q}_i^{(0)}$, or whose nearest neighbor at time 1 is $\mathbf{q}_i^{(1)}$. Only points with indices in this set have their potential functions changed. For $j \in \mathcal{N}_i$, the quantities $s_j^{(0)}$ and $h_j^{(0)}$ do not change. Since $\mathbf{q}_i$ moves by a distance of at most $(\alpha/\beta)\, l_i^{(0)}$, the nearest-neighbor distance $N_j = l_j/\beta$ changes by at most $(\alpha/\beta)\, l_i^{(0)}$ as well. Therefore,

$$-\frac{\alpha}{\beta}\, l_i^{(0)} \le N_j^{(1)} - N_j^{(0)} \le \frac{\alpha}{\beta}\, l_i^{(0)} \quad\Longrightarrow\quad -\alpha\, l_i^{(0)} \le l_j^{(1)} - l_j^{(0)} \le \alpha\, l_i^{(0)}. \qquad (10)$$

Since $(1-\alpha)\, l_i^{(0)} \le l_i^{(1)}$, we also have

$$-\frac{\alpha}{1-\alpha}\, l_i^{(1)} \le l_j^{(1)} - l_j^{(0)} \le \frac{\alpha}{1-\alpha}\, l_i^{(1)}. \qquad (11)$$

If $\mathbf{q}_i^{(0)}$ was the nearest neighbor of $\mathbf{q}_j^{(0)}$, then $N_j^{(0)} \ge N_i^{(0)}$, and hence $l_j^{(0)} \ge l_i^{(0)}$. Using this fact and dividing Eq. (10) by $l_j^{(0)}$, we obtain $1-\alpha \le l_j^{(1)}/l_j^{(0)} \le 1+\alpha$. Similarly, if $\mathbf{q}_i^{(1)}$ is the nearest neighbor of $\mathbf{q}_j^{(1)}$, then using Eq. (11) we obtain $1 - \frac{\alpha}{1-\alpha} \le l_j^{(1)}/l_j^{(0)} \le 1 + \frac{\alpha}{1-\alpha}$. Therefore, $l_j^{(1)}/l_j^{(0)} \in \Theta(1)$ for $j \in \mathcal{N}_i$, and hence,

$$\Phi_j^{(1)} - \Phi_j^{(0)} = \log\frac{\max(s_j^{(0)}, l_j^{(1)}, h_j^{(0)})^2}{l_j^{(1)} h_j^{(0)}} - \log\frac{\max(s_j^{(0)}, l_j^{(0)}, h_j^{(0)})^2}{l_j^{(0)} h_j^{(0)}} = \log\frac{\max(s_j^{(0)}, l_j^{(1)}, h_j^{(0)})^2}{\max(s_j^{(0)}, l_j^{(0)}, h_j^{(0)})^2} - \log\frac{l_j^{(1)}}{l_j^{(0)}} \le 2\left|\log\frac{l_j^{(1)}}{l_j^{(0)}}\right| + \left|\log\frac{l_j^{(0)}}{l_j^{(1)}}\right| \in O(1), \quad j \in \mathcal{N}_i. \qquad (12)$$

Finally, we show that $|\mathcal{N}_i| \in O(1)$. To that end, we solve a more general problem: what is the maximum number of points of $Q$ whose nearest neighbor is $\mathbf{q}_1$? Without loss of generality (after rescaling), let $\mathbf{q}_2$ be the nearest neighbor of $\mathbf{q}_1$, with $\|\mathbf{q}_2 - \mathbf{q}_1\| = 1$. Suppose that $\mathbf{q}_3$'s nearest neighbor is $\mathbf{q}_1$. Consider the line segment $\overline{\mathbf{q}_1 \mathbf{q}_3}$ and its intersection with $B(\mathbf{q}_1, 1)$, the ball centered at $\mathbf{q}_1$ passing through $\mathbf{q}_2$; call the intersection point $\mathbf{q}_3'$. For a general point $\mathbf{q}_x \in Q$, we define $\mathbf{q}_x'$ similarly; note that $\mathbf{q}_2' = \mathbf{q}_2$. By the triangle inequality, we have $\|\mathbf{q}_3' - \mathbf{q}_2'\| \ge \|\mathbf{q}_3 - \mathbf{q}_2\| - \|\mathbf{q}_3 - \mathbf{q}_3'\| = \|\mathbf{q}_3 - \mathbf{q}_2\| - \|\mathbf{q}_3 - \mathbf{q}_1\| + \|\mathbf{q}_1 - \mathbf{q}_3'\|$. Since $\mathbf{q}_1$ is the nearest neighbor of $\mathbf{q}_3$, we have $\|\mathbf{q}_3 - \mathbf{q}_2\| \ge \|\mathbf{q}_3 - \mathbf{q}_1\|$, and therefore $\|\mathbf{q}_3' - \mathbf{q}_2'\| \ge \|\mathbf{q}_1 - \mathbf{q}_3'\| = 1$. More generally, for any $x, y \in [n]$ with $x \ne y$ such that $\mathbf{q}_1$ is the nearest neighbor of both $\mathbf{q}_x$ and $\mathbf{q}_y$, we have $\|\mathbf{q}_x' - \mathbf{q}_y'\| \ge 1$. Let $\Gamma_1 = \{\mathbf{q}_x' : \mathbf{q}_1 \text{ is the nearest neighbor of } \mathbf{q}_x\}$. Draw a ball of radius $1/2$ around $\mathbf{q}_1$ and around every member of $\Gamma_1$. These balls are pairwise disjoint, yet all of them are contained in a ball of radius $3/2$ around $\mathbf{q}_1$. Therefore, $|\Gamma_1| < (3/2)^d / (1/2)^d = 3^d \in O(1)$. Observing that $|\mathcal{N}_i| \le 2\,|\Gamma_1|$ (accounting for both time 0 and time 1), and using Eqs. (9) and (12), we have the result.

Lemma 12 (Algorithm’s contribution towards Φ).

For any iteration $z$, let $\varepsilon_z$ denote the number of steps executed by the evolver during that iteration. Then TrackByZoom increases $\Phi$ in at most $O(n + \varepsilon_z)$ of its steps, each time by $O(1)$. In each of the remaining steps of the $z$th iteration, it decreases $\Phi$ by $\Theta(1)$, in an amortized sense.

Proof.

For a particular index $i$, we first concentrate on the zoom-out routine of Algorithm TrackByZoom (see line 6). Assume for now that the evolver leaves $\mathbf{q}_i$ untouched; we show below how to handle the case where the evolver moves $\mathbf{q}_i$ while the algorithm is processing index $i$. The algorithm increases $h_i$ by a constant factor in line 12. Therefore, $\Phi_i = \log(\max(s_i, l_i, h_i)^2 / (l_i h_i))$ decreases by a constant amount whenever $\max(s_i, l_i, h_i) \neq h_i$. Now suppose that, at time $t$, $\max(s_i, l_i, h_i) = h_i$ for the first time during the zoom-out routine. This implies that the 2-expansion of $B_i^H$ in line 12 contains the entire ball $B_i$. Therefore, a further $1/\beta$-expansion definitely contains another $X_j$, $j \neq i$, satisfying the condition in line 8. We conclude that once $\max(s_i, l_i, h_i) = h_i$, the algorithm increases $\Phi$ by only a constant amount before moving on to the zoom-in routine of the algorithm (line 14).

Next, we look at the zoom-in routine of the algorithm (line 14). The algorithm continues to zoom in only as long as the condition in line 23 is satisfied. Since we expand with the factor $3/(1-2\beta)$ in line 26, using Lemma 5 we conclude that $B_i \subseteq B_i^H$ at every step of the zoom-in routine. This implies that $\max(s_i, l_i, h_i) = h_i$ throughout the zoom-in process, which further implies $\Phi_i = \log(h_i/l_i)$.

In line 24, we cover the hypothesis ball with $|\Psi| = \bigl(2 \cdot \frac{3}{1-2\beta}\sqrt{d}\bigr)^{d} \in O(1)$ balls and make a constant number of queries to the oracle. Therefore, $h_i$ is reduced by a factor of $\bigl(\frac{3}{1-2\beta}\bigr) / \bigl(2 \cdot \frac{3}{1-2\beta}\bigr) = \frac{1}{2}$, and hence $\Phi$ is reduced by at least a constant amount. We amortize this reduction over the $|\Psi|$ calls to the oracle, yielding a constant reduction in $\Phi$ for every zoom-in step.

If the evolver moved $\mathbf{q}_i$ while the algorithm was processing index $i$, there is a possibility that $X_i$ moves outside of $B_i^H$ during the zoom-in routine. We handle this condition in line 16 by reinitiating the zoom-out process. Since the algorithm increases $\Phi$ by at most a constant amount when switching from zoom-out to zoom-in, we conclude that for every step of the evolver, our algorithm might increase $\Phi$ by at most a constant amount as well.

Lemma 13 (Iteration time proportional to Potential).

There exists a constant speed-up factor $\sigma$ for TrackByZoom such that $\Delta_z \in \Theta_\sigma(n + \Phi_z)$.

Proof.

By Lemma 4, the evolver can increase the potential by only $\Phi_{\varepsilon_z} = O(\Delta_z)$ during the iteration. By Lemma 6, we observe that the algorithm increases $\Phi$ by some amount that is $O(n + \Delta_z)$. Since, by Lemma 10, the algorithm reduces each $\Phi_i$ to at most $\phi_0$, it reduces the overall potential by at most $\Phi_z - n\phi_0 + \Phi_{\varepsilon_z} + O(n + \Delta_z) = O(n + \Phi_z + \Delta_z)$. For a speed-up factor $\sigma$, the algorithm TrackByZoom spends $O(n + \Phi_z + \Delta_z)/\sigma$ time doing so, implying $\Delta_z \le O(n + \Phi_z + \Delta_z)/\sigma$, which further implies $\Delta_z \in O_\sigma(n + \Phi_z)$ for a sufficiently large constant $\sigma$.

To see why $\Delta_z \in \Omega(n + \Phi_z)$, we follow similar logic, except that we observe that the evolver can also decrease the potential by at most $\Phi_{\varepsilon_z} = O(\Delta_z)$. By Lemma 10, the algorithm must reduce the overall potential $\Phi$ by at least $\Phi_z - n\phi_0 - \Phi_{\varepsilon_z} = \Omega(n + \Phi_z - \Delta_z)$, which takes it at least $\Omega(n + \Phi_z - \Delta_z)/\sigma$ time, implying that $\Delta_z \ge \Omega(n + \Phi_z - \Delta_z)/\sigma$, and hence $\Delta_z \in \Omega_\sigma(n + \Phi_z)$.