Testing C_k-Freeness in Bounded-Arboricity Graphs

Authors: Talya Eden, Reut Levi, and Dana Ron

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

We study the problem of testing C_k-freeness (k-cycle-freeness) for fixed constant k > 3 in graphs with bounded arboricity (but unbounded degrees). In particular, we are interested in one-sided error algorithms, so that they must detect a copy of C_k with high constant probability when the graph is ε-far from C_k-free. We next state our results for constant arboricity and constant ε with a focus on the dependence on the number of graph vertices, n. The query complexity of all our algorithms grows polynomially with 1/ε. 1) As opposed to the case of k = 3, where the complexity of testing C₃-freeness grows with the arboricity of the graph but not with the size of the graph (Levi, ICALP 2021), this is no longer the case already for k = 4. We show that Ω(n^{1/4}) queries are necessary for testing C₄-freeness, and that Õ(n^{1/4}) are sufficient. The same bounds hold for C₅. 2) For every fixed k ≥ 6, any one-sided error algorithm for testing C_k-freeness must perform Ω(n^{1/3}) queries. 3) For k = 6 we give a testing algorithm whose query complexity is Õ(n^{1/2}). 4) For any fixed k, the query complexity of testing C_k-freeness is upper bounded by O(n^{1-1/⌊k/2⌋}). The last upper bound builds on another result in which we show that for any fixed subgraph F, the query complexity of testing F-freeness is upper bounded by O(n^{1-1/𝓁(F)}), where 𝓁(F) is a parameter of F that is always upper bounded by the number of vertices in F (and in particular is k/2 in C_k for even k). We extend some of our results to bounded (non-constant) arboricity, where in particular, we obtain sublinear upper bounds for all k. Our Ω(n^{1/4}) lower bound for testing C₄-freeness in constant arboricity graphs provides a negative answer to an open problem posed by Goldreich (2021).

Talya Eden, Reut Levi, and Dana Ron. Testing C_k-Freeness in Bounded-Arboricity Graphs. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 60:1-60:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Sample-Based Distance-Approximation for Subsequence-Freeness

Authors: Omer Cohen Sidon and Dana Ron

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

In this work, we study the problem of approximating the distance to subsequence-freeness in the sample-based distribution-free model. For a given subsequence (word) w = w_1 … w_k, a sequence (text) T = t_1 … t_n is said to contain w if there exist indices 1 ≤ i_1 < … < i_k ≤ n such that t_{i_{j}} = w_j for every 1 ≤ j ≤ k. Otherwise, T is w-free. Ron and Rosin (ACM TOCT 2022) showed that the number of samples both necessary and sufficient for one-sided error testing of subsequence-freeness in the sample-based distribution-free model is Θ(k/ε). Denoting by Δ(T,w,p) the distance of T to w-freeness under a distribution p:[n] → [0,1], we are interested in obtaining an estimate Δ̂, such that |Δ̂ - Δ(T,w,p)| ≤ δ with probability at least 2/3, for a given distance parameter δ. Our main result is an algorithm whose sample complexity is Õ(k²/δ²). We first present an algorithm that works when the underlying distribution p is uniform, and then show how it can be modified to work for any (unknown) distribution p. We also show that a quadratic dependence on 1/δ is necessary.
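The containment condition defined in the abstract above (indices i_1 < … < i_k with t_{i_j} = w_j) can be checked by a single greedy left-to-right scan. The following is a minimal sketch of that check only, not of the paper's distance-approximation algorithm; the function names `contains_subsequence` and `is_free` are illustrative and do not come from the paper.

```python
def contains_subsequence(text, word):
    """Return True if `word` occurs in `text` as a (not necessarily contiguous) subsequence.

    Greedy scan: match each symbol of `word` at its earliest possible position in `text`.
    """
    matched = 0  # number of symbols of `word` matched so far
    for symbol in text:
        if matched < len(word) and symbol == word[matched]:
            matched += 1
    return matched == len(word)


def is_free(text, word):
    """Return True if `text` is w-free for w = `word` (hypothetical helper name)."""
    return not contains_subsequence(text, word)
```

For instance, `contains_subsequence("abcde", "ace")` holds because positions 1, 3, 5 match, whereas `"aec"` is `"ace"`-free since no `e` follows a `c`.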

Omer Cohen Sidon and Dana Ron. Sample-Based Distance-Approximation for Subsequence-Freeness. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 44:1-44:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs

Authors: Talya Eden, Dana Ron, and Will Rosenbaum

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Counting and sampling small subgraphs are fundamental algorithmic tasks. Motivated by the need to handle massive datasets efficiently, recent theoretical work has examined the problems in the sublinear time regime. In this work, we consider the problem of sampling a k-clique in a graph from an almost uniform distribution. Specifically, the algorithm should output each k-clique with probability (1±ε)/n_k, where n_k denotes the number of k-cliques in the graph and ε is a given approximation parameter. To this end, the algorithm may perform degree, neighbor, and pair queries. We focus on the class of graphs with arboricity at most α, and prove that the query complexity of the problem is Θ^*(min{nα, max{((nα)^{k/2}/n_k)^{1/(k-1)}, nα^{k-1}/n_k}}), where n is the number of vertices in the graph, and Θ^*(⋅) suppresses dependencies on (log n/ε)^{O(k)}. Our upper bound is based on defining a special auxiliary graph H_k, such that sampling edges almost uniformly in H_k translates to sampling k-cliques almost uniformly in the original graph G. We then build on a known edge-sampling algorithm (Eden, Ron and Rosenbaum, ICALP 2019) to sample edges in H_k. The challenge is simulating queries to H_k while being given query access only to G. Our lower bound follows from a construction of a family of graphs with arboricity α such that in each graph there are n_k k-cliques, where one of these cliques is "hidden" and hence hard to sample.

Talya Eden, Dana Ron, and Will Rosenbaum. Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 56:1-56:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Testing Distributions of Huge Objects

Authors: Oded Goldreich and Dana Ron

Published in: LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)

We initiate a study of a new model of property testing that is a hybrid of testing properties of distributions and testing properties of strings. Specifically, the new model refers to testing properties of distributions, but these are distributions over huge objects (i.e., very long strings). Accordingly, the model accounts for the total number of local probes into these objects (resp., queries to the strings) as well as for the distance between objects (resp., strings). Specifically, the distance between distributions is defined as the earth mover’s distance with respect to the relative Hamming distance between strings. We study the query complexity of testing in this new model, focusing on three directions. First, we try to relate the query complexity of testing properties in the new model to the sample complexity of testing these properties in the standard distribution testing model. Second, we consider the complexity of testing properties that arise naturally in the new model (e.g., distributions that capture random variations of fixed strings). Third, we consider the complexity of testing properties that were extensively studied in the standard distribution testing model. Two such cases are uniform distributions and pairs of identical distributions, where we obtain the following results.
- Testing whether a distribution over n-bit long strings is uniform on some set of size m can be done with query complexity Õ(m/ε³), where ε > (log₂m)/n is the proximity parameter.
- Testing whether two distributions over n-bit long strings that have support size at most m are identical can be done with query complexity Õ(m^{2/3}/ε³).
Both upper bounds are quite tight; that is, for ε = Ω(1), the first task requires Ω(m^c) queries for any c < 1 and n = ω(log m), whereas the second task requires Ω(m^{2/3}) queries. Note that the query complexity of the first task is higher than the sample complexity of the corresponding task in the standard distribution testing model, whereas in the case of the second task the bounds almost match.

Oded Goldreich and Dana Ron. Testing Distributions of Huge Objects. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 78:1-78:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Testing Dynamic Environments: Back to Basics

Authors: Yonatan Nakar and Dana Ron

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

We continue the line of work initiated by Goldreich and Ron (Journal of the ACM, 2017) on testing dynamic environments and propose to pursue a systematic study of the complexity of testing basic dynamic environments and local rules. As a first step, in this work we focus on dynamic environments that correspond to elementary cellular automata that evolve according to threshold rules. Our main result is the identification of a set of conditions on local rules, and a meta-algorithm that tests evolution according to local rules that satisfy the conditions. The meta-algorithm has query complexity poly(1/ε), is non-adaptive and has one-sided error. We show that all the threshold rules satisfy the set of conditions, and therefore are poly(1/ε)-testable. We believe that this is a rich area of research and suggest a variety of open problems and natural research directions that may extend and expand our results.

Yonatan Nakar and Dana Ron. Testing Dynamic Environments: Back to Basics. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 98:1-98:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Almost Optimal Distribution-Free Sample-Based Testing of k-Modality

Authors: Dana Ron and Asaf Rosin

Published in: LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

For an integer k ≥ 0, a sequence σ = σ₁, …, σ_n over a fully ordered set is k-modal if there exist indices 1 = a₀ < a₁ < … < a_{k+1} = n such that for each i, the subsequence σ_{a_i}, …, σ_{a_{i+1}} is either monotonically non-decreasing or monotonically non-increasing. The property of k-modality is a natural extension of monotonicity, which has been studied extensively in the area of property testing. We study one-sided error property testing of k-modality in the distribution-free sample-based model. We prove an upper bound of O(√(kn) log k / ε) on the sample complexity, and an almost matching lower bound of Ω(√(kn)/ε). When the underlying distribution is uniform, we obtain a completely tight bound of Θ(√(kn/ε)), which generalizes what is known for sample-based testing of monotonicity under the uniform distribution.
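The definition of k-modality above can be illustrated by an exact, linear-time check that reads the whole sequence (unlike the sample-based testers studied in the paper). Because consecutive pieces share an endpoint, the minimum number of monotone pieces is one plus the number of direction changes among the nonzero consecutive differences. The sketch below assumes that a sequence with fewer than k direction changes also counts as k-modal and ignores the edge case of sequences too short to admit k+2 distinct indices; the function names are illustrative.

```python
def count_direction_changes(seq):
    """Count switches between strictly increasing and strictly decreasing steps.

    Equal consecutive elements are skipped: they fit both a non-decreasing
    and a non-increasing piece.
    """
    changes = 0
    prev_sign = 0  # sign of the last nonzero step seen so far
    for a, b in zip(seq, seq[1:]):
        sign = (b > a) - (b < a)  # +1, -1, or 0
        if sign == 0:
            continue
        if prev_sign != 0 and sign != prev_sign:
            changes += 1
        prev_sign = sign
    return changes


def is_k_modal(seq, k):
    """True if `seq` splits into at most k+1 monotone pieces sharing endpoints."""
    return count_direction_changes(seq) <= k
```

Under this reading, `[1, 3, 2, 4]` has two direction changes, so it is 2-modal but not 1-modal, and k = 0 recovers plain monotonicity.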

Dana Ron and Asaf Rosin. Almost Optimal Distribution-Free Sample-Based Testing of k-Modality. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 27:1-27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

The Arboricity Captures the Complexity of Sampling Edges

Authors: Talya Eden, Dana Ron, and Will Rosenbaum

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

In this paper, we revisit the problem of sampling edges in an unknown graph G = (V, E) from a distribution that is (pointwise) almost uniform over E. We consider the case where there is some a priori upper bound on the arboricity of G. Given query access to a graph G over n vertices and of average degree d and arboricity at most alpha, we design an algorithm that performs O(alpha/d * (log^3 n)/epsilon) queries in expectation and returns an edge in the graph such that every edge e in E is sampled with probability (1 +/- epsilon)/m, where m = |E|. The algorithm performs two types of queries: degree queries and neighbor queries. We show that the upper bound is tight (up to poly-logarithmic factors and the dependence on epsilon), as Omega(alpha/d) queries are necessary for the easier task of sampling edges from any distribution over E that is close to uniform in total variation distance. We also prove that even if G is a tree (i.e., alpha = 1 so that alpha/d = Theta(1)), Omega((log n)/(loglog n)) queries are necessary to sample an edge from any distribution that is pointwise close to uniform, thus establishing that a poly(log n) factor is necessary for constant alpha. Finally, we show how our algorithm can be applied to obtain a new result on approximately counting subgraphs, based on the recent work of Assadi, Kapralov, and Khanna (ITCS, 2019).

Talya Eden, Dana Ron, and Will Rosenbaum. The Arboricity Captures the Complexity of Sampling Edges. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 52:1-52:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

The Subgraph Testing Model

Authors: Oded Goldreich and Dana Ron

Published in: LIPIcs, Volume 124, 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)

We initiate a study of testing properties of graphs that are presented as subgraphs of a fixed (or an explicitly given) graph. The tester is given free access to a base graph G=([n],E), and oracle access to a function f:E -> {0,1} that represents a subgraph of G. The tester is required to distinguish between subgraphs that possess a predetermined property and subgraphs that are far from possessing this property. We focus on bounded-degree base graphs and on the relation between testing graph properties in the subgraph model and testing the same properties in the bounded-degree graph model. We identify cases in which testing is significantly easier in one model than in the other as well as cases in which testing has approximately the same complexity in both models. Our proofs are based on the design and analysis of efficient testers and on the establishment of query-complexity lower bounds.

Oded Goldreich and Dana Ron. The Subgraph Testing Model. In 10th Innovations in Theoretical Computer Science Conference (ITCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 124, pp. 37:1-37:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

On the Testability of Graph Partition Properties

Authors: Yonatan Nakar and Dana Ron

Published in: LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

In this work we study the testability of a family of graph partition properties that generalizes a family previously studied by Goldreich, Goldwasser, and Ron (Journal of the ACM, 1998). While the family studied by Goldreich, Goldwasser, and Ron includes a variety of natural properties, such as k-colorability and containing a large cut, it does not include other properties of interest, such as split graphs, and more generally (p,q)-colorable graphs. The generalization we consider allows us to impose constraints on the edge-densities within and between parts (relative to the sizes of the parts). We denote the family studied in this work by GPP. We first show that all properties in GPP have a testing algorithm whose query complexity is polynomial in 1/epsilon, where epsilon is the given proximity parameter (and there is no dependence on the size of the graph). As the testing algorithm has two-sided error, we next address the question of which properties in GPP can be tested with one-sided error and query complexity polynomial in 1/epsilon. We answer this question by establishing a characterization result. Namely, we define a subfamily GPP_{0,1} of GPP and show that a property P in GPP is testable by a one-sided error algorithm that has query complexity poly(1/epsilon) if and only if P is in GPP_{0,1}.

Yonatan Nakar and Dana Ron. On the Testability of Graph Partition Properties. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 53:1-53:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection

Authors: Talya Eden, Dana Ron, and C. Seshadhri

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)

We revisit the classic problem of estimating the degree distribution moments of an undirected graph. Consider an undirected graph G=(V,E) with n (non-isolated) vertices, and define (for s > 0) mu_s = (1/n) * sum_{v in V} d_v^s. Our aim is to estimate mu_s within a multiplicative error of (1+epsilon) (for a given approximation parameter epsilon>0) in sublinear time. We consider the sparse graph model that allows access to: uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of s=1 (the average degree), Õ(√n) queries suffice for any constant epsilon (Feige, SICOMP 06 and Goldreich-Ron, RSA 08). Gonen-Ron-Shavitt (SIDMA 11) extended this result to all integral s > 0, by designing an algorithm that performs Õ(n^{1-1/(s+1)}) queries. (Strictly speaking, their algorithm approximates the number of star-subgraphs of a given size, but a slight modification gives an algorithm for moments.) We design a new, significantly simpler algorithm for this problem. In the worst-case, it exactly matches the bounds of Gonen-Ron-Shavitt, and has a much simpler proof. More importantly, the running time of this algorithm is connected to the degeneracy of G. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with degeneracy at most alpha, it has a query complexity of Õ((n^{1-1/s}/mu_s^{1/s}) * (alpha^{1/s} + min{alpha, mu_s^{1/s}})) = Õ(n^{1-1/s} * alpha/mu_s^{1/s}). Thus, for the class of bounded degeneracy graphs (which includes all minor closed families and preferential attachment graphs), we can estimate the average degree in Õ(1) queries, and can estimate the variance of the degree distribution in Õ(√n) queries. This is a major improvement over the previous worst-case bounds. Our key insight is in designing an estimator for mu_s that has low variance when G does not have large dense subgraphs.
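To make the quantity being estimated concrete, the s-th degree moment mu_s = (1/n) * sum_{v in V} d_v^s can be computed exactly by reading every vertex. The sketch below is this linear-time baseline only, not the sublinear estimator of the paper; the function name `degree_moment` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def degree_moment(adjacency, s):
    """Exact s-th degree moment: the average of d_v ** s over all vertices.

    `adjacency` maps each vertex to a collection of its neighbors
    (hypothetical input format; every vertex must appear as a key).
    """
    n = len(adjacency)
    return sum(len(neighbors) ** s for neighbors in adjacency.values()) / n


# A star with center 0 and three leaves: degrees are 3, 1, 1, 1.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
first_moment = degree_moment(star, 1)   # average degree
second_moment = degree_moment(star, 2)  # average squared degree
```

On the star, the average degree is (3+1+1+1)/4 = 1.5 and the second moment is (9+1+1+1)/4 = 3.0, which shows how a single high-degree vertex dominates higher moments.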

Talya Eden, Dana Ron, and C. Seshadhri. Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 7:1-7:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

A Local Algorithm for Constructing Spanners in Minor-Free Graphs

Authors: Reut Levi, Dana Ron, and Ronitt Rubinfeld

Published in: LIPIcs, Volume 60, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)

Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. We consider this problem in the setting of local algorithms: one wants to quickly determine whether a given edge e is in a specific spanning tree, without computing the whole spanning tree, but rather by inspecting the local neighborhood of e. The challenge is to maintain consistency. That is, to answer queries about different edges according to the same spanning tree. Since it is known that this problem cannot be solved without essentially viewing all the graph, we consider the relaxed version of finding a spanning subgraph with (1+c)n edges instead of n-1 edges (where n is the number of vertices and c is a given approximation/sparsity parameter). It is known that this relaxed problem requires inspecting order of n^{1/2} edges in general graphs (for any constant c), which motivates the study of natural restricted families of graphs. One such family is the family of graphs with an excluded minor (which in particular includes planar graphs). For this family there is an algorithm that achieves constant success probability, and inspects (d/c)^{poly(h)log(1/c)} edges (for each edge it is queried on), where d is the maximum degree in the graph and h is the size of the excluded minor. The distances between pairs of vertices in the spanning subgraph G' are at most a factor of poly(d, 1/c, h) larger than in G. In this work, we show that for an input graph that is H-minor free for any H of size h, this task can be performed by inspecting only poly(d, 1/c, h) edges in G. The distances between pairs of vertices in the spanning subgraph G' are at most a factor of h log(d)/c (up to poly-logarithmic factors) larger than in G. Furthermore, the error probability of the new algorithm is significantly improved to order of 1/n. This algorithm can also be easily adapted to yield an efficient algorithm for the distributed (message passing) setting.

Reut Levi, Dana Ron, and Ronitt Rubinfeld. A Local Algorithm for Constructing Spanners in Minor-Free Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 38:1-38:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Local Algorithms for Sparse Spanning Graphs

Authors: Reut Levi, Dana Ron, and Ronitt Rubinfeld

Published in: LIPIcs, Volume 28, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)

We initiate the study of the problem of designing sublinear-time (local) algorithms that, given an edge (u,v) in a connected graph G=(V,E), decide whether (u,v) belongs to a sparse spanning graph G' = (V,E') of G. Namely, G' should be connected and |E'| should be upper bounded by (1+epsilon)|V| for a given parameter epsilon > 0. To this end the algorithms may query the incidence relation of the graph G, and we seek algorithms whose query complexity and running time (per given edge (u,v)) are as small as possible. Such an algorithm may be randomized but (for a fixed choice of its random coins) its decision on different edges in the graph should be consistent with the same spanning graph G' and independent of the order of queries. We first show that for general (bounded-degree) graphs, the query complexity of any such algorithm must be Omega(sqrt(|V|)). This lower bound holds for graphs that have high expansion. We then turn to design and analyze algorithms both for graphs with high expansion (obtaining a result that roughly matches the lower bound) and for graphs that are (strongly) non-expanding (obtaining results in which the complexity does not depend on |V|). The complexity of the problem for graphs that do not fall into these two categories is left as an open question.
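The two requirements on G' stated above (connectivity and |E'| at most (1+epsilon)|V|) can be verified globally once E' is written down in full. The sketch below is such a global checker, useful only for validating the output of a local algorithm on small inputs; it is not a local algorithm itself. The function name, the argument format, and the assumption of a simple undirected graph without self-loops are all illustrative choices.

```python
from collections import deque


def is_sparse_spanning_graph(vertices, edges, kept_edges, epsilon):
    """Check that E' is a subset of E, |E'| <= (1 + epsilon)|V|, and (V, E') is connected.

    Assumes a simple undirected graph: no self-loops, edges given as (u, v) pairs,
    and `vertices` given as a list.
    """
    all_edges = {frozenset(e) for e in edges}
    kept = {frozenset(e) for e in kept_edges}
    if not kept <= all_edges:
        return False  # E' uses an edge that is not in G
    if len(kept) > (1 + epsilon) * len(vertices):
        return False  # E' is not sparse enough
    if not vertices:
        return True

    adjacency = {v: [] for v in vertices}
    for edge in kept:
        u, v = tuple(edge)
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Breadth-first search from an arbitrary vertex to test connectivity.
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(vertices)
```

For a 4-cycle with one chord, keeping a Hamiltonian path passes the check for epsilon = 0.1, while keeping all five edges fails the size bound because 5 > 1.1 * 4.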

Reut Levi, Dana Ron, and Ronitt Rubinfeld. Local Algorithms for Sparse Spanning Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 28, pp. 826-842, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

Approximating Average Parameters of Graphs

Authors: Oded Goldreich and Dana Ron

Published in: Dagstuhl Seminar Proceedings, Volume 5291, Sublinear Algorithms (2006)

Inspired by Feige (36th STOC, 2004), we initiate a study of sublinear randomized algorithms for approximating average parameters of a graph. Specifically, we consider the average degree of a graph and the average distance between pairs of vertices in a graph. Since our focus is on sublinear algorithms, these algorithms access the input graph via queries to an adequate oracle. We consider two types of queries. The first type is standard neighborhood queries (i.e., what is the i'th neighbor of vertex v?), whereas the second type are queries regarding the quantities that we need to find the average of (i.e., what is the degree of vertex v? and what is the distance between u and v, respectively). Loosely speaking, our results indicate a difference between the two problems: For approximating the average degree, the standard neighbor queries suffice and in fact are preferable to degree queries. In contrast, for approximating average distances, the standard neighbor queries are of little help whereas distance queries are crucial.

Oded Goldreich and Dana Ron. Approximating Average Parameters of Graphs. In Sublinear Algorithms. Dagstuhl Seminar Proceedings, Volume 5291, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

