##### Is the Algorithmic Kadison-Singer Problem Hard?

Authors: Ben Jourdan, Peter Macgregor, and He Sun

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)

##### Abstract
We study the following KS₂(c) problem: let c ∈ ℝ^+ be some constant, and v₁,…, v_m ∈ ℝ^d be vectors such that ‖v_i‖² ≤ α for any i ∈ [m] and ∑_{i=1}^m ⟨v_i, x⟩² = 1 for any x ∈ ℝ^d with ‖x‖ = 1. The KS₂(c) problem asks to find some S ⊂ [m], such that it holds for all x ∈ ℝ^d with ‖x‖ = 1 that |∑_{i∈S} ⟨v_i, x⟩² - 1/2| ≤ c⋅√α, or report no if such S doesn't exist. Based on the work of Marcus et al. [Adam Marcus et al., 2013] and Weaver [Nicholas Weaver, 2004], the KS₂(c) problem can be seen as the algorithmic Kadison-Singer problem with parameter c ∈ ℝ^+. Our first result is a randomised algorithm with one-sided error for the KS₂(c) problem such that (1) our algorithm finds a valid set S ⊂ [m] with probability at least 1-2/d, if such S exists, or (2) reports no with probability 1, if no valid sets exist. The algorithm has running time O(binom(m,n)⋅poly(m, d)) for n = O(d/ε² log(d) log(1/(c√α))), where ε is a parameter which controls the error of the algorithm. This presents the first algorithm for the Kadison-Singer problem whose running time is quasi-polynomial in m in a certain regime, although having exponential dependency on d. Moreover, it shows that the algorithmic Kadison-Singer problem is easier to solve in low dimensions. Our second result is on the computational complexity of the KS₂(c) problem. We show that the KS₂(1/(4√2)) problem is FNP-hard for general values of d, and solving the KS₂(1/(4√2)) problem is as hard as solving the NAE-3SAT problem.
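The two conditions in the abstract can be restated in matrix form, which may make the problem easier to read. The block below is a standard linear-algebra reformulation rather than a statement taken from the paper; the symbol I_d for the d × d identity matrix is notation introduced here.

```latex
% Isotropy: the quadratic form equals 1 on every unit vector.
\sum_{i=1}^{m} \langle v_i, x\rangle^2 = 1 \ \ \forall\, \|x\| = 1
\quad\Longleftrightarrow\quad
\sum_{i=1}^{m} v_i v_i^{\top} = I_d .

% Validity of S: the symmetric matrix below has spectral norm at most c\sqrt{\alpha}.
\Bigl|\sum_{i\in S} \langle v_i, x\rangle^2 - \tfrac12\Bigr| \le c\sqrt{\alpha}
\ \ \forall\, \|x\| = 1
\quad\Longleftrightarrow\quad
\Bigl\|\sum_{i\in S} v_i v_i^{\top} - \tfrac12 I_d\Bigr\|_2 \le c\sqrt{\alpha} .
```

Equivalently, every eigenvalue of ∑_{i∈S} v_i v_i^⊤ must lie in the interval [1/2 - c√α, 1/2 + c√α].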

Ben Jourdan, Peter Macgregor, and He Sun. Is the Algorithmic Kadison-Singer Problem Hard?. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 43:1-43:18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

##### Subset Wavelet Trees

Authors: Jarno N. Alanko, Elena Biagi, Simon J. Puglisi, and Jaakko Vuohtoniemi

Published in: LIPIcs, Volume 265, 21st International Symposium on Experimental Algorithms (SEA 2023)

##### Abstract
Given an alphabet Σ of σ = |Σ| symbols, a degenerate (or indeterminate) string X is a sequence X = X[0], X[1], …, X[n-1] of n subsets of Σ. Since their introduction in the mid-70s, degenerate strings have been widely studied, with applications driven by their being a natural model for sequences in which there is a degree of uncertainty about the precise symbol at a given position, such as those arising in genomics and proteomics. In this paper we introduce a new data structural tool for degenerate strings, called the subset wavelet tree (SubsetWT). A SubsetWT supports two basic operations on degenerate strings: subset-rank(i,c), which returns the number of subsets up to the i-th subset in the degenerate string that contain the symbol c; and subset-select(i,c), which returns the index in the degenerate string of the i-th subset that contains symbol c. These queries are analogs of rank and select queries that have been widely studied for ordinary strings. Via experiments in a real genomics application in which degenerate strings are fundamental, we show that subset wavelet trees are practical data structures, and in particular offer an attractive space-time tradeoff. Along the way we investigate data structures for supporting (normal) rank queries on base-4 and base-3 sequences, which may be of independent interest. Our C++ implementations of the data structures are available at https://github.com/jnalanko/SubsetWT.
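To make the two queries concrete, here is a naive linear-scan sketch of their semantics in Python. It is not the subset wavelet tree and has none of its space or time guarantees. The function names and the indexing conventions (rank counts the first i subsets, select uses a 1-based i and returns -1 when there is no answer) are assumptions chosen for illustration.

```python
from typing import List, Set


def subset_rank(X: List[Set[str]], i: int, c: str) -> int:
    """Number of subsets among X[0], ..., X[i-1] that contain c (assumed convention)."""
    return sum(1 for s in X[:i] if c in s)


def subset_select(X: List[Set[str]], i: int, c: str) -> int:
    """Index of the i-th (1-based) subset containing c, or -1 if there is none."""
    seen = 0
    for idx, s in enumerate(X):
        if c in s:
            seen += 1
            if seen == i:
                return idx
    return -1


# A small degenerate string over the DNA alphabet; the empty set is allowed.
X = [{"A", "C"}, {"G"}, {"A"}, set(), {"A", "T"}]
print(subset_rank(X, 3, "A"))    # 2: X[0] and X[2] contain "A"
print(subset_select(X, 3, "A"))  # 4: the third subset containing "A" is X[4]
```

Both functions take O(n) time per query, so they are mainly useful as a reference against which a succinct implementation can be checked.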

Jarno N. Alanko, Elena Biagi, Simon J. Puglisi, and Jaakko Vuohtoniemi. Subset Wavelet Trees. In 21st International Symposium on Experimental Algorithms (SEA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 265, pp. 4:1-4:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

##### The Support of Open Versus Closed Random Walks

Authors: Thomas Sauerwald, He Sun, and Danny Vagnozzi

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

##### Abstract
A closed random walk of length 𝓁 on an undirected and connected graph G = (V,E) is a random walk that returns to the start vertex at step 𝓁, and its properties have been recently related to problems in different mathematical fields, e.g., geometry and combinatorics (Jiang et al., Annals of Mathematics '21) and spectral graph theory (McKenzie et al., STOC '21). For instance, in the context of analyzing the eigenvalue multiplicity of graph matrices, McKenzie et al. show that, with high probability, the support of a closed random walk of length 𝓁 ⩾ 1 is Ω(𝓁^{1/5}) on any bounded-degree graph, and leave as an open problem whether a stronger bound of Ω(𝓁^{1/2}) holds for any regular graph. First, we show that the support of a closed random walk of length 𝓁 is at least Ω(𝓁^{1/2} / √(log n)) for any regular or bounded-degree graph on n vertices. Secondly, we prove for every 𝓁 ⩾ 1 the existence of a family of bounded-degree graphs, together with a start vertex such that the support is bounded by O(𝓁^{1/2} / √(log n)). Besides addressing the open problem of McKenzie et al., these two results also establish a subtle separation between closed random walks and open random walks, for which the support on any regular (or bounded-degree) graph is well-known to be Ω(𝓁^{1/2}) for all 𝓁 ⩾ 1. For irregular graphs, we prove that even if the start vertex is chosen uniformly, the support of a closed random walk may still be O(log 𝓁). This rules out a general polynomial lower bound in 𝓁 for all graphs. Finally, we apply our results on random walks to obtain new bounds on the multiplicity of the second largest eigenvalue of the adjacency matrices of graphs.
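As a rough illustration of the object being studied, the sketch below samples a closed random walk by rejection: it draws ordinary (open) walks and keeps the first one that ends at its start vertex. It assumes that "support" means the number of distinct vertices visited. The function name, the adjacency-list representation, and the `max_tries` guard are all choices made for this example, not taken from the paper.

```python
import random
from typing import List


def closed_walk(adj: List[List[int]], start: int, length: int,
                rng: random.Random, max_tries: int = 100_000) -> List[int]:
    """Sample a random walk of the given length conditioned on ending at `start`."""
    for _attempt in range(max_tries):
        walk = [start]
        for _step in range(length):
            walk.append(rng.choice(adj[walk[-1]]))
        if walk[-1] == start:
            return walk
    raise RuntimeError("no closed walk found; it may be impossible (e.g. parity) or too rare")


# Example: the cycle on 8 vertices, a 2-regular graph.
n = 8
adj = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
walk = closed_walk(adj, start=0, length=10, rng=random.Random(1))
support = len(set(walk))  # distinct vertices visited (assumed definition of support)
print(walk, support)
```

Rejection sampling is only practical when the return probability is not too small, so this is a toy for short walks on small graphs rather than a tool for checking the asymptotic bounds.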

Thomas Sauerwald, He Sun, and Danny Vagnozzi. The Support of Open Versus Closed Random Walks. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 103:1-103:21, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

##### Susceptibility to Image Resolution in Face Recognition and Training Strategies to Enhance Robustness

Authors: Martin Knoche, Stefan Hörmann, and Gerhard Rigoll

Published in: LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1

##### Abstract
Many face recognition approaches expect the input images to have similar image resolution. However, in real-world applications, the image resolution varies due to different image capture mechanisms or sources, affecting the performance of face recognition systems. This work first analyzes the image resolution susceptibility of modern face recognition. Face verification on the very popular LFW dataset drops from 99.23% accuracy to almost 55% when image dimensions of both images are reduced to arguably very poor resolution. With cross-resolution image pairs (one HR and one LR image), face verification accuracy is even worse. This characteristic is investigated more in-depth by analyzing the feature distances utilized for face verification. To increase the robustness, we propose two training strategies applied to a state-of-the-art face recognition model: 1) Training with 50% low resolution images within each batch and 2) using the cosine distance loss between high and low resolution features in a siamese network structure. Both methods significantly boost face verification accuracy for matching training and testing image resolutions. Training a network with different resolutions simultaneously instead of adding only one specific low resolution showed improvements across all resolutions and made a single model applicable to unknown resolutions. However, models trained for one particular low resolution perform better when using the exact resolution for testing. We improve the face verification accuracy from 96.86% to 97.72% on the popular LFW database with uniformly distributed image dimensions between 112 × 112 px and 5 × 5 px. Our approaches improve face verification accuracy even more from 77.56% to 87.17% for distributions focusing on lower image resolutions. Lastly, we propose specific image dimension sets focusing on high, mid, and low resolution for five well-known datasets to benchmark face verification accuracy in cross-resolution scenarios.

Martin Knoche, Stefan Hörmann, and Gerhard Rigoll. Susceptibility to Image Resolution in Face Recognition and Training Strategies to Enhance Robustness. In LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1, pp. 01:1-01:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

##### Micro- and Macroscopic Road Traffic Analysis using Drone Image Data

Authors: Friedrich Kruber, Eduardo Sánchez Morales, Robin Egolf, Jonas Wurst, Samarjit Chakraborty, and Michael Botsch

Published in: LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1

##### Abstract
Current developments in drone technology, together with machine learning based image processing, open new possibilities for various applications. Thus, the market volume is expected to grow rapidly over the next years. The goal of this paper is to demonstrate the capabilities and limitations of drone based image data processing for the purpose of road traffic analysis. In the first part a method for generating microscopic traffic data is proposed. More precisely, the state of vehicles and the resulting trajectories are estimated. The method is validated by conducting experiments with reference sensors and proves to achieve precise vehicle state estimation results. It is also shown how the computational effort can be reduced by incorporating the tracking information into a neural network. A discussion on current limitations supplements the findings. By collecting a large number of vehicle trajectories, macroscopic statistics, such as traffic flow and density, can be obtained from the data. In the second part, a publicly available drone based data set is analyzed to evaluate the suitability for macroscopic traffic modeling. The results show that the method is well suited for gaining detailed information about macroscopic statistics, such as traffic flow dependent time headway or lane change occurrences. In conclusion, this paper presents methods to exploit the remarkable opportunities of drone based image processing for joint macro- and microscopic traffic analysis.

Friedrich Kruber, Eduardo Sánchez Morales, Robin Egolf, Jonas Wurst, Samarjit Chakraborty, and Michael Botsch. Micro- and Macroscopic Road Traffic Analysis using Drone Image Data. In LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1, pp. 02:1-02:27, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

##### HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology

Authors: Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, Emanuele Valpreda, Manfredi Camalleri, Qi Zhao, Christian Unger, Naveen-Shankar Nagaraja, Maurizio Martina, and Walter Stechele

Published in: LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1

##### Abstract
Convolutional neural networks (CNNs) have produced unprecedented accuracy for many computer vision problems in the recent past. In power and compute-constrained embedded platforms, deploying modern CNNs can present many challenges. Most CNN architectures do not run in real-time due to the high number of computational operations involved during the inference phase. This emphasizes the role of CNN optimization techniques in early design space exploration. To estimate their efficacy in satisfying the target constraints, existing techniques are either hardware (HW) agnostic, pseudo-HW-aware by considering parameter and operation counts, or HW-aware through inflexible hardware-in-the-loop (HIL) setups. In this work, we introduce HW-Flow, a framework for optimizing and exploring CNN models based on three levels of hardware abstraction: Coarse, Mid and Fine. Through these levels, CNN design and optimization can be iteratively refined towards efficient execution on the target hardware platform. We present HW-Flow in the context of CNN pruning by augmenting a reinforcement learning agent with key metrics to understand the influence of its pruning actions on the inference hardware. With 2× reduction in energy and latency, we prune ResNet56, ResNet50, and DeepLabv3 with minimal accuracy degradation on the CIFAR-10, ImageNet, and CityScapes datasets, respectively.

Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, Emanuele Valpreda, Manfredi Camalleri, Qi Zhao, Christian Unger, Naveen-Shankar Nagaraja, Maurizio Martina, and Walter Stechele. HW-Flow: A Multi-Abstraction Level HW-CNN Codesign Pruning Methodology. In LITES, Volume 8, Issue 1 (2022): Special Issue on Embedded Systems for Computer Vision. Leibniz Transactions on Embedded Systems, Volume 8, Issue 1, pp. 03:1-03:30, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

##### Improved Bounds for Randomly Colouring Simple Hypergraphs

Authors: Weiming Feng, Heng Guo, and Jiaheng Wang

Published in: LIPIcs, Volume 245, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)

##### Abstract
We study the problem of sampling almost uniform proper q-colourings in k-uniform simple hypergraphs with maximum degree Δ. For any δ > 0, if k ≥ 20(1+δ)/δ and q ≥ 100Δ^{(2+δ)/(k-4/δ-4)}, the running time of our algorithm is Õ(poly(Δ k)⋅ n^{1.01}), where n is the number of vertices. Our result requires fewer colours than previous results for general hypergraphs (Jain, Pham, and Vuong, 2021; He, Sun, and Wu, 2021), and does not require Ω(log n) colours unlike the work of Frieze and Anastos (2017).
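For readers unfamiliar with the terminology, the sketch below checks the two structural notions used in the abstract under their usual definitions: a hypergraph is simple if any two hyperedges share at most one vertex, and a colouring is proper if no hyperedge is monochromatic. It only verifies a given colouring and is unrelated to the sampling algorithm itself. The function names and the tuple-based representation are assumptions made for this example.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple

Edge = Tuple[int, ...]


def is_simple(edges: Sequence[Edge]) -> bool:
    """True if every pair of distinct hyperedges intersects in at most one vertex."""
    return all(len(set(e) & set(f)) <= 1 for e, f in combinations(edges, 2))


def is_proper(edges: Sequence[Edge], colouring: Dict[int, int]) -> bool:
    """True if every hyperedge contains at least two different colours."""
    return all(len({colouring[v] for v in e}) > 1 for e in edges)


# A 3-uniform "triangle" hypergraph on 6 vertices: consecutive edges share one vertex.
edges = [(0, 1, 2), (2, 3, 4), (4, 5, 0)]
alternating = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
constant = {v: 0 for v in range(6)}

print(is_simple(edges))               # True
print(is_proper(edges, alternating))  # True
print(is_proper(edges, constant))     # False: every edge is monochromatic
```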

Weiming Feng, Heng Guo, and Jiaheng Wang. Improved Bounds for Randomly Colouring Simple Hypergraphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 25:1-25:17, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

##### Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary

Authors: Aaron Bernstein, Jan van den Brand, Maximilian Probst Gutenberg, Danupon Nanongkai, Thatchaphol Saranurak, Aaron Sidford, and He Sun

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Aaron Bernstein, Jan van den Brand, Maximilian Probst Gutenberg, Danupon Nanongkai, Thatchaphol Saranurak, Aaron Sidford, and He Sun. Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 20:1-20:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

##### Dynamic Inference in Probabilistic Graphical Models

Authors: Weiming Feng, Kun He, Xiaoming Sun, and Yitong Yin

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

##### Abstract
Probabilistic graphical models, such as Markov random fields (MRFs), are useful for describing high-dimensional distributions in terms of local dependence structures. Probabilistic inference is a fundamental problem related to graphical models, and sampling is a main approach to the problem. In this paper, we study probabilistic inference problems when the graphical model itself is changing dynamically with time. Such dynamic inference problems arise naturally in today’s applications, e.g. multivariate time-series data analysis and practical learning procedures. We give a dynamic algorithm for sampling-based probabilistic inferences in MRFs, where each dynamic update can change the underlying graph and all parameters of the MRF simultaneously, as long as the total amount of changes is bounded. More precisely, suppose that the MRF has n variables and polylogarithmic-bounded maximum degree, and N(n) independent samples are sufficient for the inference for a polynomial function N(⋅). Our algorithm dynamically maintains an answer to the inference problem using Õ(n N(n)) space cost, and Õ(N(n) + n) incremental time cost upon each update to the MRF, as long as the Dobrushin-Shlosman condition is satisfied by the MRFs. This well-known condition has long been used for guaranteeing the efficiency of Markov chain Monte Carlo (MCMC) sampling in the traditional static setting. Compared to the static case, which requires Ω(n N(n)) time cost for redrawing all N(n) samples whenever the MRF changes, our dynamic algorithm gives a Ω̃(min{n, N(n)})-factor speedup. Our approach relies on a novel dynamic sampling technique, which transforms local Markov chains (a.k.a. single-site dynamics) to dynamic sampling algorithms, and an "algorithmic Lipschitz" condition that we establish for sampling from graphical models, namely, when the MRF changes by a small difference, samples can be modified to reflect the new distribution, with cost proportional to the difference in the MRF.

Weiming Feng, Kun He, Xiaoming Sun, and Yitong Yin. Dynamic Inference in Probabilistic Graphical Models. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 25:1-25:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)

##### Augmenting the Algebraic Connectivity of Graphs

Authors: Bogdan-Adrian Manghiuc, Pan Peng, and He Sun

Published in: LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)

##### Abstract
For any undirected graph G = (V,E) and a set E_W of candidate edges with E ∩ E_W = ∅, the (k,γ)-spectral augmentability problem is to find a set F of k edges from E_W with appropriate weighting, such that the algebraic connectivity of the resulting graph H = (V, E ∪ F) is at least γ. Because of a tight connection between the algebraic connectivity and many other graph parameters, including the graph’s conductance and the mixing time of random walks in a graph, maximising the resulting graph’s algebraic connectivity by adding a small number of edges has been studied over the past 15 years, and has many practical applications in network optimisation. In this work we present an approximate and efficient algorithm for the (k,γ)-spectral augmentability problem, and our algorithm runs in almost-linear time under a wide regime of parameters. Our main algorithm is based on the following two novel techniques developed in the paper, which might have applications beyond the (k,γ)-spectral augmentability problem:
- We present a fast algorithm for solving a feasibility version of an SDP for the algebraic connectivity maximisation problem from [Ghosh and Boyd, 2006]. Our algorithm is based on the classic primal-dual framework for solving SDP, which in turn uses the multiplicative weight update algorithm. We present a novel approach of unifying SDP constraints of different matrix and vector variables and give a good separation oracle accordingly.
- We present an efficient algorithm for the subgraph sparsification problem, and for a wide range of parameters our algorithm runs in almost-linear time, in contrast to the previously best known algorithm running in at least Ω(n²mk) time [Kolla et al., 2010]. Our analysis shows how the randomised BSS framework can be generalised in the setting of subgraph sparsification, and how the potential functions can be applied to approximately keep track of different subspaces.

Bogdan-Adrian Manghiuc, Pan Peng, and He Sun. Augmenting the Algebraic Connectivity of Graphs. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 70:1-70:22, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)

##### Contraction: A Unified Perspective of Correlation Decay and Zero-Freeness of 2-Spin Systems

Authors: Shuai Shao and Yuxin Sun

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

##### Abstract
We study complex zeros of the partition function of 2-spin systems, viewed as a multivariate polynomial in terms of the edge interaction parameters and the uniform external field. We obtain new zero-free regions in which all these parameters are complex-valued. Crucially based on the zero-freeness, we are able to extend the existence of correlation decay to these complex regions from real parameters. As a consequence, we obtain an FPTAS for computing the partition function of 2-spin systems on graphs of bounded degree for these parameter settings. We introduce the contraction property as a unified sufficient condition to devise FPTAS via either Weitz’s algorithm or Barvinok’s algorithm. Our main technical contribution is a very simple but general approach to extend any real parameter of which the 2-spin system exhibits correlation decay to its complex neighborhood where the partition function is zero-free and correlation decay still exists. This result formally establishes the inherent connection between two distinct notions of phase transition for 2-spin systems: the existence of correlation decay and the zero-freeness of the partition function via a unified perspective, contraction.

Shuai Shao and Yuxin Sun. Contraction: A Unified Perspective of Correlation Decay and Zero-Freeness of 2-Spin Systems. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 96:1-96:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)

##### Hermitian Laplacians and a Cheeger Inequality for the Max-2-Lin Problem

Authors: Huan Li, He Sun, and Luca Zanetti

Published in: LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

##### Abstract
We study spectral approaches for the MAX-2-LIN(k) problem, in which we are given a system of m linear equations of the form x_i - x_j ≡ c_{ij} mod k, and required to find an assignment to the n variables {x_i} that maximises the total number of satisfied equations. We consider Hermitian Laplacians related to this problem, and prove a Cheeger inequality that relates the smallest eigenvalue of a Hermitian Laplacian to the maximum number of satisfied equations of a MAX-2-LIN(k) instance I. We develop an Õ(kn²) time algorithm that, for any (1-ε)-satisfiable instance, produces an assignment satisfying a (1 - O(k)√ε)-fraction of equations. We also present a subquadratic-time algorithm that, when the graph associated with I is an expander, produces an assignment satisfying a (1 - O(k²)ε)-fraction of the equations. Our Cheeger inequality and first algorithm can be seen as generalisations of the Cheeger inequality and algorithm for MAX-CUT developed by Trevisan.
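The objective in MAX-2-LIN(k) can be illustrated with a few lines of Python. The sketch below counts the equations an assignment satisfies and finds the optimum of a tiny instance by exhaustive search over all k^n assignments. It is a brute-force baseline for intuition only, not the spectral algorithm from the paper; the triple encoding `(i, j, c)` for x_i - x_j ≡ c mod k and the function names are assumptions made here.

```python
from itertools import product
from typing import Sequence, Tuple

Equation = Tuple[int, int, int]  # (i, j, c) encodes x_i - x_j ≡ c (mod k)


def satisfied(equations: Sequence[Equation], x: Sequence[int], k: int) -> int:
    """Number of equations satisfied by the assignment x."""
    return sum(1 for i, j, c in equations if (x[i] - x[j]) % k == c % k)


def brute_force_max(equations: Sequence[Equation], n: int, k: int) -> int:
    """Maximum number of simultaneously satisfiable equations (exponential time)."""
    return max(satisfied(equations, x, k) for x in product(range(k), repeat=n))


# Three equations mod 3 that cannot all hold: the first two force x_0 - x_2 ≡ 2.
equations = [(0, 1, 1), (1, 2, 1), (0, 2, 0)]
print(satisfied(equations, (2, 1, 0), k=3))   # 2
print(brute_force_max(equations, n=3, k=3))   # 2
```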

Huan Li, He Sun, and Luca Zanetti. Hermitian Laplacians and a Cheeger Inequality for the Max-2-Lin Problem. In 27th Annual European Symposium on Algorithms (ESA 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 144, pp. 71:1-71:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)

##### Querying a Matrix Through Matrix-Vector Products

Authors: Xiaoming Sun, David P. Woodruff, Guang Yang, and Jialin Zhang

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

##### Abstract
We consider algorithms with access to an unknown matrix M ∈ 𝔽^{n×d} via matrix-vector products, namely, the algorithm chooses vectors v^1, ..., v^q, and observes Mv^1, ..., Mv^q. Here the v^i can be randomized as well as chosen adaptively as a function of Mv^1, ..., Mv^{i-1}. Motivated by applications of sketching in distributed computation, linear algebra, and streaming models, as well as connections to areas such as communication complexity and property testing, we initiate the study of the number q of queries needed to solve various fundamental problems. We study problems in three broad categories, including linear algebra, statistics problems, and graph problems. For example, we consider the number of queries required to approximate the rank, trace, maximum eigenvalue, and norms of a matrix M; to compute the AND/OR/Parity of each column or row of M, to decide whether there are identical columns or rows in M or whether M is symmetric, diagonal, or unitary; or to compute whether a graph defined by M is connected or triangle-free. We also show separations for algorithms that are allowed to obtain matrix-vector products only by querying vectors on the right, versus algorithms that can query vectors on both the left and the right. We also show separations depending on the underlying field the matrix-vector product occurs in. For graph problems, we show separations depending on the form of the matrix (bipartite adjacency versus signed edge-vertex incidence matrix) to represent the graph. Surprisingly, this fundamental model does not appear to have been studied on its own, and we believe a thorough investigation of problems in this model would be beneficial to a number of different application areas.
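The query model can be sketched as an oracle that hides M and counts right-multiplication queries. The example below shows only the trivial baseline: querying the d standard basis vectors recovers M exactly, so q = d queries always suffice in this model. The interesting question in the abstract is when far fewer are enough. The class and function names are hypothetical, and the code assumes d ≥ 1 and integer entries for simplicity.

```python
from typing import List, Sequence


class MatVecOracle:
    """Hides a matrix M and answers queries v -> Mv, counting how many were made."""

    def __init__(self, M: Sequence[Sequence[int]]):
        self._M = [list(row) for row in M]
        self.queries = 0

    def query(self, v: Sequence[int]) -> List[int]:
        self.queries += 1
        return [sum(a * b for a, b in zip(row, v)) for row in self._M]


def recover_matrix(oracle: MatVecOracle, d: int) -> List[List[int]]:
    """Recover M column by column using the d standard basis vectors (d >= 1)."""
    columns = []
    for j in range(d):
        e_j = [0] * d
        e_j[j] = 1
        columns.append(oracle.query(e_j))  # M e_j is the j-th column of M
    n = len(columns[0])
    return [[columns[j][i] for j in range(d)] for i in range(n)]


M = [[1, 2, 3], [4, 5, 6]]
oracle = MatVecOracle(M)
recovered = recover_matrix(oracle, d=3)
print(recovered, oracle.queries)  # [[1, 2, 3], [4, 5, 6]] 3
```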

Xiaoming Sun, David P. Woodruff, Guang Yang, and Jialin Zhang. Querying a Matrix Through Matrix-Vector Products. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 94:1-94:16, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)

##### On the Decision Tree Complexity of String Matching

Authors: Xiaoyu He, Neng Huang, and Xiaoming Sun

Published in: LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)

##### Abstract
String matching is one of the most fundamental problems in computer science. A natural problem is to determine the number of characters that need to be queried (i.e. the decision tree complexity) in a string in order to decide whether this string contains a certain pattern. Rivest showed that for every pattern p, in the worst case any deterministic algorithm needs to query at least n-|p|+1 characters, where n is the length of the string and |p| is the length of the pattern. He further conjectured that this bound is tight. By using the adversary method, Tuza disproved this conjecture and showed that more than one half of binary patterns are evasive, i.e. any algorithm needs to query all the characters (see Section 1.1 for more details). In this paper, we give a query algorithm which settles the decision tree complexity of string matching except for a negligible fraction of patterns. Our algorithm shows that Tuza's criteria of evasive patterns are almost complete. Using the algebraic approach of Rivest and Vuillemin, we also give a new sufficient condition for the evasiveness of patterns, which is beyond Tuza's criteria. In addition, our result reveals an interesting connection to Skolem's Problem in mathematics.

Xiaoyu He, Neng Huang, and Xiaoming Sun. On the Decision Tree Complexity of String Matching. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 45:1-45:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

##### Balls into bins via local search: cover time and maximum load

Authors: Karl Bringmann, Thomas Sauerwald, Alexandre Stauffer, and He Sun

Published in: LIPIcs, Volume 25, 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)

##### Abstract
We study a natural process for allocating m balls into n bins that are organized as the vertices of an undirected graph G. Balls arrive one at a time. When a ball arrives, it first chooses a vertex u in G uniformly at random. Then the ball performs a local search in G starting from u until it reaches a vertex with local minimum load, where the ball is finally placed. Then the next ball arrives and this procedure is repeated. For the case m = n, we give an upper bound for the maximum load on graphs with bounded degrees. We also propose the study of the cover time of this process, which is defined as the smallest m so that every bin has at least one ball allocated to it. We establish an upper bound for the cover time on graphs with bounded degrees. Our bounds for the maximum load and the cover time are tight when the graph is vertex transitive or sufficiently homogeneous. We also give upper bounds for the maximum load when m ≥ n.
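The allocation process described above can be simulated in a few lines. The sketch below is one possible reading of it: the abstract does not say which neighbour the ball moves to, so this version assumes the ball repeatedly moves to a least-loaded neighbour while that neighbour has strictly smaller load. The function names and the cycle-graph example are also choices made for illustration.

```python
import random
from typing import List


def local_search_allocation(adj: List[List[int]], m: int, rng: random.Random) -> List[int]:
    """Allocate m balls; each starts at a uniform vertex and descends to a local minimum."""
    load = [0] * len(adj)
    for _ball in range(m):
        u = rng.randrange(len(adj))
        while True:
            best = min(adj[u], key=lambda w: load[w])  # least-loaded neighbour (assumed rule)
            if load[best] >= load[u]:
                break  # u is a local minimum: no neighbour has strictly smaller load
            u = best
        load[u] += 1
    return load


def cycle(n: int) -> List[List[int]]:
    """Adjacency lists of the cycle on n vertices, a bounded-degree graph."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


load = local_search_allocation(cycle(16), m=16, rng=random.Random(0))
print(load, max(load))
```

The inner loop terminates because the load of the current vertex strictly decreases with every move. Running the simulation with increasing m until `min(load) >= 1` gives an empirical estimate of the cover time defined in the abstract.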

Karl Bringmann, Thomas Sauerwald, Alexandre Stauffer, and He Sun. Balls into bins via local search: cover time and maximum load. In 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 25, pp. 187-198, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2014)

