Document

**Published in:** LIPIcs, Volume 284, 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023)

The fundamental task of rank aggregation is to combine multiple rankings on a group of candidates into a single ranking to mitigate biases inherent in individual input rankings. This task has a myriad of applications, such as in social choice theory, collaborative filtering, web search, statistics, databases, sports, and admission systems. One popular version of this task, maximum rank aggregation (or the center ranking problem), aims to find a ranking (not necessarily from the input set) that minimizes the maximum distance to the input rankings. However, even for four input rankings, this problem is NP-hard (Dwork et al., WWW'01, and Biedl et al., Discrete Math.'09), and only a (folklore) polynomial-time 2-approximation algorithm is known for finding an optimal aggregate ranking under the commonly used Kendall-tau distance metric. Achieving a better approximation factor in polynomial time, ideally, a polynomial time approximation scheme (PTAS), is one of the major challenges.
This paper presents significant progress in solving this problem by considering the Mallows model, a classical probabilistic model. Our proposed algorithm outputs an (1+ε)-approximate aggregate ranking for any ε > 0, with high probability, as long as the input rankings come from a Mallows model, even in a streaming fashion. Furthermore, the same approximation guarantee is achieved even in the presence of outliers, presumably a more challenging task.

Yan Hong Yao Alvin and Diptarka Chakraborty. Approximate Maximum Rank Aggregation: Beyond the Worst-Case. In 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 284, pp. 12:1-12:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{alvin_et_al:LIPIcs.FSTTCS.2023.12, author = {Alvin, Yan Hong Yao and Chakraborty, Diptarka}, title = {{Approximate Maximum Rank Aggregation: Beyond the Worst-Case}}, booktitle = {43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023)}, pages = {12:1--12:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-304-1}, ISSN = {1868-8969}, year = {2023}, volume = {284}, editor = {Bouyer, Patricia and Srinivasan, Srikanth}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2023.12}, URN = {urn:nbn:de:0030-drops-193857}, doi = {10.4230/LIPIcs.FSTTCS.2023.12}, annote = {Keywords: Rank Aggregation, Center Problem, Mallows Model, Approximation Algorithms, Clustering with Outliers} }

Document

**Published in:** LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)

In this paper, we focus on the matrix completion problem and aim to minimize the diameter over an arbitrary alphabet. Given a matrix M with missing entries, our objective is to complete the matrix by filling in the missing entries in a way that minimizes the maximum (Hamming) distance between any pair of rows in the completed matrix (also known as the diameter of the matrix). It is worth noting that this problem is already known to be NP-hard. Currently, the best-known upper bound is a 4-approximation algorithm derived by applying the triangle inequality together with a well-known 2-approximation algorithm for the radius minimization variant.
In this work, we make the following contributions:
- We present a novel 3-approximation algorithm for the diameter minimization variant of the matrix completion problem. To the best of our knowledge, this is the first approximation result that breaks below the straightforward 4-factor bound.
- Furthermore, we establish that the diameter minimization variant of the matrix completion problem is (2-ε)-inapproximable, for any ε > 0, even when considering a binary alphabet, under the assumption that 𝖯 ≠ NP. This is the first result that demonstrates a hardness of approximation for this problem.

Diptarka Chakraborty and Sanjana Dey. Matrix Completion: Approximating the Minimum Diameter. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 17:1-17:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ISAAC.2023.17, author = {Chakraborty, Diptarka and Dey, Sanjana}, title = {{Matrix Completion: Approximating the Minimum Diameter}}, booktitle = {34th International Symposium on Algorithms and Computation (ISAAC 2023)}, pages = {17:1--17:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-289-1}, ISSN = {1868-8969}, year = {2023}, volume = {283}, editor = {Iwata, Satoru and Kakimura, Naonori}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.17}, URN = {urn:nbn:de:0030-drops-193197}, doi = {10.4230/LIPIcs.ISAAC.2023.17}, annote = {Keywords: Incomplete Data, Matrix Completion, Hamming Distance, Diameter Minimization, Approximation Algorithms, Hardness of Approximation} }

Document

**Published in:** LIPIcs, Volume 272, 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)

We consider the problem of estimating the support size of a distribution D. Our investigations are pursued through the lens of distribution testing and seek to understand the power of conditional sampling (denoted as COND), wherein one is allowed to query the given distribution conditioned on an arbitrary subset S. The primary contribution of this work is to introduce a new approach to lower bounds for the COND model that relies on using powerful tools from information theory and communication complexity.
Our approach allows us to obtain surprisingly strong lower bounds for the COND model and its extensions.
- We bridge the longstanding gap between the upper bound O(log log n + 1/ε²) and the lower bound Ω(√{log log n}) for the COND model by providing a nearly matching lower bound. Surprisingly, we show that even if we get to know the actual probabilities along with COND samples, still Ω(log log n + 1/{ε² log (1/ε)}) queries are necessary.
- We obtain the first non-trivial lower bound for the COND equipped with an additional oracle that reveals the actual as well as the conditional probabilities of the samples (to the best of our knowledge, this subsumes all of the models previously studied): in particular, we demonstrate that Ω(log log log n + 1/{ε² log (1/ε)}) queries are necessary.

Diptarka Chakraborty, Gunjan Kumar, and Kuldeep S. Meel. Support Size Estimation: The Power of Conditioning. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 33:1-33:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.MFCS.2023.33, author = {Chakraborty, Diptarka and Kumar, Gunjan and Meel, Kuldeep S.}, title = {{Support Size Estimation: The Power of Conditioning}}, booktitle = {48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)}, pages = {33:1--33:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-292-1}, ISSN = {1868-8969}, year = {2023}, volume = {272}, editor = {Leroux, J\'{e}r\^{o}me and Lombardy, Sylvain and Peleg, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2023.33}, URN = {urn:nbn:de:0030-drops-185675}, doi = {10.4230/LIPIcs.MFCS.2023.33}, annote = {Keywords: Support-size estimation, Distribution testing, Conditional sampling, Lower bound} }

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

**Published in:** LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

Given a Boolean formula ϕ over n variables, the problem of model counting is to compute the number of solutions of ϕ. Model counting is a fundamental problem in computer science with wide-ranging applications in domains such as quantified information leakage, probabilistic reasoning, network reliability, neural network verification, and more. Owing to the #P-hardness of the problems, Stockmeyer initiated the study of the complexity of approximate counting. Stockmeyer showed that log n calls to an NP oracle are necessary and sufficient to achieve (ε,δ) guarantees. The hashing-based framework proposed by Stockmeyer has been very influential in designing practical counters over the past decade, wherein the SAT solver substitutes the NP oracle calls in practice. It is well known that an NP oracle does not fully capture the behavior of SAT solvers, as SAT solvers are also designed to provide satisfying assignments when a formula is satisfiable, without additional overhead. Accordingly, the notion of SAT oracle has been proposed to capture the behavior of SAT solver wherein given a Boolean formula, an SAT oracle returns a satisfying assignment if the formula is satisfiable or returns unsatisfiable otherwise. Since the practical state-of-the-art approximate counting techniques use SAT solvers, a natural question is whether an SAT oracle is more powerful than an NP oracle in the context of approximate model counting.
The primary contribution of this work is to study the relative power of the NP oracle and SAT oracle in the context of approximate model counting. The previous techniques proposed in the context of an NP oracle are weak to provide strong bounds in the context of SAT oracle since, in contrast to an NP oracle that provides only one bit of information, a SAT oracle can provide n bits of information. We therefore develop a new methodology to achieve the main result: a SAT oracle is no more powerful than an NP oracle in the context of approximate model counting.

Diptarka Chakraborty, Sourav Chakraborty, Gunjan Kumar, and Kuldeep S. Meel. Approximate Model Counting: Is SAT Oracle More Powerful Than NP Oracle?. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 123:1-123:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ICALP.2023.123, author = {Chakraborty, Diptarka and Chakraborty, Sourav and Kumar, Gunjan and Meel, Kuldeep S.}, title = {{Approximate Model Counting: Is SAT Oracle More Powerful Than NP Oracle?}}, booktitle = {50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)}, pages = {123:1--123:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-278-5}, ISSN = {1868-8969}, year = {2023}, volume = {261}, editor = {Etessami, Kousha and Feige, Uriel and Puppis, Gabriele}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.123}, URN = {urn:nbn:de:0030-drops-181750}, doi = {10.4230/LIPIcs.ICALP.2023.123}, annote = {Keywords: Model counting, Approximation, Satisfiability, NP oracle, SAT oracle} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

We study the classical metric k-median clustering problem over a set of input rankings (i.e., permutations), which has myriad applications, from social-choice theory to web search and databases. A folklore algorithm provides a 2-approximate solution in polynomial time for all k = O(1), and works irrespective of the underlying distance measure, so long it is a metric; however, going below the 2-factor is a notorious challenge. We consider the Ulam distance, a variant of the well-known edit-distance metric, where strings are restricted to be permutations. For this metric, Chakraborty, Das, and Krauthgamer [SODA, 2021] provided a (2-δ)-approximation algorithm for k = 1, where δ≈ 2^{-40}.
Our primary contribution is a new algorithmic framework for clustering a set of permutations. Our first result is a 1.999-approximation algorithm for the metric k-median problem under the Ulam metric, that runs in time (k log (nd))^{O(k)} nd³ for an input consisting of n permutations over [d]. In fact, our framework is powerful enough to extend this result to the streaming model (where the n input permutations arrive one by one) using only polylogarithmic (in n) space. Additionally, we show that similar results can be obtained even in the presence of outliers, which is presumably a more difficult problem.

Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Clustering Permutations: New Techniques with Streaming Applications. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 31:1-31:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ITCS.2023.31, author = {Chakraborty, Diptarka and Das, Debarati and Krauthgamer, Robert}, title = {{Clustering Permutations: New Techniques with Streaming Applications}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {31:1--31:24}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.31}, URN = {urn:nbn:de:0030-drops-175340}, doi = {10.4230/LIPIcs.ITCS.2023.31}, annote = {Keywords: Clustering, Approximation Algorithms, Ulam Distance, Rank Aggregation, Streaming} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

In this paper, we consider reachability oracles and reachability preservers for directed graphs/networks prone to edge/node failures. Let G = (V, E) be a directed graph on n-nodes, and P ⊆ V× V be a set of vertex pairs in G. We present the first non-trivial constructions of single and dual fault-tolerant pairwise reachability oracle with constant query time. Furthermore, we provide extremal bounds for sparse fault-tolerant reachability preservers, resilient to two or more failures. Prior to this work, such oracles and reachability preservers were widely studied for the special scenario of single-source and all-pairs settings. However, for the scenario of arbitrary pairs, no prior (non-trivial) results were known for dual (or more) failures, except those implied from the single-source setting. One of the main questions is whether it is possible to beat the O(n |P|) size bound (derived from the single-source setting) for reachability oracle and preserver for dual failures (or O(2^k n|P|) bound for k failures). We answer this question affirmatively. Below we summarize our contributions.
- For an n-vertex directed graph G = (V, E) and P ⊆ V× V, we present a construction of O(n √{|P|}) sized dual fault-tolerant pairwise reachability oracle with constant query time. We further provide a matching (up to the word size) lower bound of Ω(n √{|P|}) on the size (in bits) of the oracle for the dual fault setting, thereby proving that our oracle is (near-)optimal.
- Next, we provide a construction of O(n + min{|P|√ n,~n√{|P|}}) sized oracle with O(1) query time, resilient to single node/edge failure. In particular, for |P| bounded by O(√n) this yields an oracle of just O(n) size. We complement the upper bound with a lower bound of Ω(n^{2/3}|P|^{1/2}) (in bits), refuting the possibility of a linear-sized oracle for P of size ω(n^{2/3}).
- We also present a construction of O(n^{4/3} |P|^{1/3}) sized pairwise reachability preservers resilient to dual edge/vertex failures. Previously, such preservers were known to exist only under single failure and had O(n+min{|P|√n,~n√ {|P|}}) size [Chakraborty and Choudhary, ICALP'20]. We also show a lower bound of Ω(n √{|P|}) edges on the size of dual fault-tolerant reachability preservers, thereby providing a sharp gap between single and dual fault-tolerant reachability preservers for |P| = o(n).
- Finally, we provide a generic pairwise reachability preserver construction that provides a o(2^k n |P|) sized subgraph resilient to k failures, for any k ≥ 1. Before this work, we only knew of an O(2^k n |P|) bound implied from the single-source setting [Baswana, Choudhary, and Roditty, STOC'16].

Diptarka Chakraborty, Kushagra Chatterjee, and Keerti Choudhary. Pairwise Reachability Oracles and Preservers Under Failures. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ICALP.2022.35, author = {Chakraborty, Diptarka and Chatterjee, Kushagra and Choudhary, Keerti}, title = {{Pairwise Reachability Oracles and Preservers Under Failures}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {35:1--35:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.35}, URN = {urn:nbn:de:0030-drops-163768}, doi = {10.4230/LIPIcs.ICALP.2022.35}, annote = {Keywords: Fault-tolerant, Reachability Oracle, Reachability Preservers, Graph sparsification, Lower bounds} }

Document

**Published in:** LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)

We consider an approximate version of the trace reconstruction problem, where the goal is to recover an unknown string s ∈ {0,1}ⁿ from m traces (each trace is generated independently by passing s through a probabilistic insertion-deletion channel with rate p). We present a deterministic near-linear time algorithm for the average-case model, where s is random, that uses only three traces. It runs in near-linear time Õ(n) and with high probability reports a string within edit distance Õ(p² n) from s, which significantly improves over the straightforward bound of O(pn).
Technically, our algorithm computes a (1+ε)-approximate median of the three input traces. To prove its correctness, our probabilistic analysis shows that an approximate median is indeed close to the unknown s. To achieve a near-linear time bound, we have to bypass the well-known dynamic programming algorithm that computes an optimal median in time O(n³).

Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Approximate Trace Reconstruction via Median String (In Average-Case). In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 11:1-11:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.FSTTCS.2021.11, author = {Chakraborty, Diptarka and Das, Debarati and Krauthgamer, Robert}, title = {{Approximate Trace Reconstruction via Median String (In Average-Case)}}, booktitle = {41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)}, pages = {11:1--11:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-215-0}, ISSN = {1868-8969}, year = {2021}, volume = {213}, editor = {Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.11}, URN = {urn:nbn:de:0030-drops-155228}, doi = {10.4230/LIPIcs.FSTTCS.2021.11}, annote = {Keywords: Trace Reconstruction, Approximation Algorithms, Edit Distance, String Median} }

Document

**Published in:** LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)

We study the problem of approximating a center under the Ulam metric. The Ulam metric, defined over a set of permutations over [n], is the minimum number of move operations (deletion plus insertion) to transform one permutation into another. The Ulam metric is a simpler variant of the general edit distance metric. It provides a measure of dissimilarity over a set of rankings/permutations. In the center problem, given a set of permutations, we are asked to find a permutation (not necessarily from the input set) that minimizes the maximum distance to the input permutations. This problem is also referred to as maximum rank aggregation under Ulam. So far, we only know of a folklore 2-approximation algorithm for this NP-hard problem. Even for constantly many permutations, we do not know anything better than an exhaustive search over all n! permutations.
In this paper, we achieve a (3/2 - 1/(3m))-approximation of the Ulam center in time n^O(m² ln m), for m input permutations over [n]. We therefore get a polynomial time bound while achieving better than a 3/2-approximation for constantly many permutations. This problem is of special interest even for constantly many permutations because under certain dissimilarity measures over rankings, even for four permutations, the problem is NP-hard.
In proving our result, we establish a surprising connection between the approximate Ulam center problem and the closest string with wildcards problem (the center problem over the Hamming metric, allowing wildcards). We further study the closest string with wildcards problem and show that there cannot exist any (2-ε)-approximation algorithm (for any ε > 0) for it unless 𝖯 = NP. This inapproximability result is in sharp contrast with the same problem without wildcards, where we know of a PTAS.

Diptarka Chakraborty, Kshitij Gajjar, and Agastya Vibhuti Jha. Approximating the Center Ranking Under Ulam. In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 12:1-12:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.FSTTCS.2021.12, author = {Chakraborty, Diptarka and Gajjar, Kshitij and Jha, Agastya Vibhuti}, title = {{Approximating the Center Ranking Under Ulam}}, booktitle = {41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)}, pages = {12:1--12:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-215-0}, ISSN = {1868-8969}, year = {2021}, volume = {213}, editor = {Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.12}, URN = {urn:nbn:de:0030-drops-155230}, doi = {10.4230/LIPIcs.FSTTCS.2021.12}, annote = {Keywords: Center Problem, Ulam Metric, Edit Distance, Closest String, Approximation Algorithms} }

Document

APPROX

**Published in:** LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

The problem of finding longest common subsequence (LCS) is one of the fundamental problems in computer science, which finds application in fields such as computational biology, text processing, information retrieval, data compression etc. It is well known that (decision version of) the problem of finding the length of a LCS of an arbitrary number of input sequences (which we refer to as Multi-LCS problem) is NP-complete. Jiang and Li [SICOMP'95] showed that if Max-Clique is hard to approximate within a factor of s then Multi-LCS is also hard to approximate within a factor of Θ(s). By the NP-hardness of the problem of approximating Max-Clique by Zuckerman [ToC'07], for any constant δ > 0, the length of a LCS of arbitrary number of input sequences of length n each, cannot be approximated within an n^{1-δ}-factor in polynomial time unless {P}={NP}. However, the reduction of Jiang and Li assumes the alphabet size to be Ω(n). So far no hardness result is known for the problem of approximating Multi-LCS over sub-linear sized alphabet. On the other hand, it is easy to get 1/|Σ|-factor approximation for strings of alphabet Σ.
In this paper, we make a significant progress towards proving hardness of approximation over small alphabet by showing a polynomial-time reduction from the well-studied densest k-subgraph problem with perfect completeness to approximating Multi-LCS over alphabet of size poly(n/k). As a consequence, from the known hardness result of densest k-subgraph problem (e.g. [Manurangsi, STOC'17]) we get that no polynomial-time algorithm can give an n^{-o(1)}-factor approximation of Multi-LCS over an alphabet of size n^{o(1)}, unless the Exponential Time Hypothesis is false.

Amey Bhangale, Diptarka Chakraborty, and Rajendra Kumar. Hardness of Approximation of (Multi-)LCS over Small Alphabet. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 38:1-38:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{bhangale_et_al:LIPIcs.APPROX/RANDOM.2020.38, author = {Bhangale, Amey and Chakraborty, Diptarka and Kumar, Rajendra}, title = {{Hardness of Approximation of (Multi-)LCS over Small Alphabet}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)}, pages = {38:1--38:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-164-1}, ISSN = {1868-8969}, year = {2020}, volume = {176}, editor = {Byrka, Jaros{\l}aw and Meka, Raghu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.38}, URN = {urn:nbn:de:0030-drops-126418}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2020.38}, annote = {Keywords: Longest common subsequence, Hardness of approximation, ETH-hardness, Densest k-subgraph problem} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

In this paper, we consider the question of computing sparse subgraphs for any input directed graph G = (V,E) on n vertices and m edges, that preserves reachability and/or strong connectivity structures.
- We show O(n+min{|P|√n, n√|P|}) bound on a subgraph that is an 1-fault-tolerant reachability preserver for a given vertex-pair set P ⊆ V× V, i.e., it preserves reachability between any pair of vertices in P under single edge (or vertex) failure. Our result is a significant improvement over the previous best O(n |P|) bound obtained as a corollary of single-source reachability preserver construction. We prove our upper bound by exploiting the special structure of single fault-tolerant reachability preserver for any pair, and then considering the interaction among such structures for different pairs.
- In the lower bound side, we show that a 2-fault-tolerant reachability preserver for a vertex-pair set P ⊆ V×V of size Ω(n^ε), for even any arbitrarily small ε, requires at least Ω(n^(1+ε/8)) edges. This refutes the existence of linear-sized dual fault-tolerant preservers for reachability for any polynomial sized vertex-pair set.
- We also present the first sub-quadratic bound of at most Õ(k 2^k n^(2-1/k)) size, for strong-connectivity preservers of directed graphs under k failures. To the best of our knowledge no non-trivial bound for this problem was known before, for a general k. We get our result by adopting the color-coding technique of Alon, Yuster, and Zwick [JACM'95].

Diptarka Chakraborty and Keerti Choudhary. New Extremal Bounds for Reachability and Strong-Connectivity Preservers Under Failures. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 25:1-25:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ICALP.2020.25, author = {Chakraborty, Diptarka and Choudhary, Keerti}, title = {{New Extremal Bounds for Reachability and Strong-Connectivity Preservers Under Failures}}, booktitle = {47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)}, pages = {25:1--25:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-138-2}, ISSN = {1868-8969}, year = {2020}, volume = {168}, editor = {Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.25}, URN = {urn:nbn:de:0030-drops-124327}, doi = {10.4230/LIPIcs.ICALP.2020.25}, annote = {Keywords: Preservers, Strong-connectivity, Reachability, Fault-tolerant, Graph sparsification} }

Document

**Published in:** LIPIcs, Volume 150, 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)

We consider the approximate pattern matching problem under edit distance. In this problem we are given a pattern P of length m and a text T of length n over some alphabet Sigma, and a positive integer k. The goal is to find all the positions j in T such that there is a substring of T ending at j which has edit distance at most k from the pattern P. Recall, the edit distance between two strings is the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. For a position t in {1,...,n}, let k_t be the smallest edit distance between P and any substring of T ending at t. In this paper we give a constant factor approximation to the sequence k_1,k_2,...,k_n. We consider both offline and online settings.
In the offline setting, where both P and T are available, we present an algorithm that for all t in {1,...,n}, computes the value of k_t approximately within a constant factor. The worst case running time of our algorithm is O~(n m^(3/4)).
In the online setting, we are given P and then T arrives one symbol at a time. We design an algorithm that upon arrival of the t-th symbol of T computes k_t approximately within O(1)-multiplicative factor and m^(8/9)-additive error. Our algorithm takes O~(m^(1-(7/54))) amortized time per symbol arrival and takes O~(m^(1-(1/54))) additional space apart from storing the pattern P. Both of our algorithms are randomized and produce correct answer with high probability. To the best of our knowledge this is the first algorithm that takes worst-case sublinear (in the length of the pattern) time and sublinear extra space for the online approximate pattern matching problem. To get our result we build on the technique of Chakraborty, Das, Goldenberg, Koucký and Saks [FOCS'18] for computing a constant factor approximation of edit distance in sub-quadratic time.

Diptarka Chakraborty, Debarati Das, and Michal Koucký. Approximate Online Pattern Matching in Sublinear Time. In 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 150, pp. 10:1-10:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.FSTTCS.2019.10, author = {Chakraborty, Diptarka and Das, Debarati and Kouck\'{y}, Michal}, title = {{Approximate Online Pattern Matching in Sublinear Time}}, booktitle = {39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)}, pages = {10:1--10:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-131-3}, ISSN = {1868-8969}, year = {2019}, volume = {150}, editor = {Chattopadhyay, Arkadev and Gastin, Paul}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2019.10}, URN = {urn:nbn:de:0030-drops-115726}, doi = {10.4230/LIPIcs.FSTTCS.2019.10}, annote = {Keywords: Approximate Pattern Matching, Online Pattern Matching, Edit Distance, Sublinear Algorithm, Streaming Algorithm} }

Document

**Published in:** LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)

A quasi-Gray code of dimension n and length l over an alphabet Sigma is a sequence of distinct words w_1,w_2,...,w_l from Sigma^n such that any two consecutive words differ in at most c coordinates, for some fixed constant c>0. In this paper we are interested in the read and write complexity of quasi-Gray codes in the bit-probe model, where we measure the number of symbols read and written in order to transform any word w_i into its successor w_{i+1}.
We present construction of quasi-Gray codes of dimension n and length 3^n over the ternary alphabet {0,1,2} with worst-case read complexity O(log n) and write complexity 2. This generalizes to arbitrary odd-size alphabets. For the binary alphabet, we present quasi-Gray codes of dimension n and length at least 2^n - 20n with worst-case read complexity 6+log n and write complexity 2. This complements a recent result by Raskin [Raskin '17] who shows that any quasi-Gray code over binary alphabet of length 2^n has read complexity Omega(n).
Our results significantly improve on previously known constructions and for the odd-size alphabets we break the Omega(n) worst-case barrier for space-optimal (non-redundant) quasi-Gray codes with constant number of writes. We obtain our results via a novel application of algebraic tools together with the principles of catalytic computation [Buhrman et al. '14, Ben-Or and Cleve '92, Barrington '89, Coppersmith and Grossman '75].

Diptarka Chakraborty, Debarati Das, Michal Koucký, and Nitin Saurabh. Space-Optimal Quasi-Gray Codes with Logarithmic Read Complexity. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 12:1-12:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ESA.2018.12, author = {Chakraborty, Diptarka and Das, Debarati and Kouck\'{y}, Michal and Saurabh, Nitin}, title = {{Space-Optimal Quasi-Gray Codes with Logarithmic Read Complexity}}, booktitle = {26th Annual European Symposium on Algorithms (ESA 2018)}, pages = {12:1--12:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-081-1}, ISSN = {1868-8969}, year = {2018}, volume = {112}, editor = {Azar, Yossi and Bast, Hannah and Herman, Grzegorz}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.12}, URN = {urn:nbn:de:0030-drops-94750}, doi = {10.4230/LIPIcs.ESA.2018.12}, annote = {Keywords: Gray code, Space-optimal counter, Decision assignment tree, Cell probe model} }

Document

**Published in:** LIPIcs, Volume 101, 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)

In this paper we address the problem of computing a sparse subgraph of any weighted directed graph such that the exact distances from a designated source vertex to all other vertices are preserved under bounded weight increment. Finding a small sized subgraph that preserves distances between any pair of vertices is a well studied problem. Since in the real world any network is prone to failures, it is natural to study the fault tolerant version of the above problem. Unfortunately, it turns out that there may not always exist such a sparse subgraph even under single edge failure [Demetrescu et al. '08]. However in real applications it is not always the case that a link (edge) in a network becomes completely faulty. Instead, it can happen that some links become more congested which can be captured by increasing weight on the corresponding edges. Thus it makes sense to try to construct a sparse distance preserving subgraph under the above weight increment model where total increase in weight in the whole network (graph) is bounded by some parameter k. To the best of our knowledge this problem has not been studied so far.
In this paper we show that given any weighted directed graph with n vertices and a source vertex, one can construct a subgraph of size at most e * (k-1)!2^kn such that it preserves distances between the source and all other vertices as long as the total weight increment is bounded by k and we are allowed to only have integer valued (can be negative) weight on edges and also weight of an edge can only be increased by some positive integer. Next we show a lower bound of c * 2^kn, for some constant c >= 5/4, on the size of such a subgraph. We further argue that the restrictions of integral weight and integral weight increment are actually essential by showing that if we remove any one of these two we may need to store Omega(n^2) edges to preserve the distances.

Diptarka Chakraborty and Debarati Das. Sparse Weight Tolerant Subgraph for Single Source Shortest Path. In 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 101, pp. 15:1-15:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.SWAT.2018.15, author = {Chakraborty, Diptarka and Das, Debarati}, title = {{Sparse Weight Tolerant Subgraph for Single Source Shortest Path}}, booktitle = {16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)}, pages = {15:1--15:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-068-2}, ISSN = {1868-8969}, year = {2018}, volume = {101}, editor = {Eppstein, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2018.15}, URN = {urn:nbn:de:0030-drops-88413}, doi = {10.4230/LIPIcs.SWAT.2018.15}, annote = {Keywords: Shortest path, fault tolerant, distance preserver, graph algorithm} }

Document

**Published in:** LIPIcs, Volume 45, 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015)

In this paper we propose a quantification of distributions on a set of strings, in terms of how close to pseudorandom a distribution is. The quantification is an adaptation of the theory of dimension of sets of infinite sequences introduced by Lutz. Adapting Hitchcock's work, we also show that the logarithmic loss incurred by a predictor on a distribution is quantitatively equivalent to the notion of dimension we define. Roughly, this captures the equivalence between pseudorandomness defined via indistinguishability and via unpredictability. Later we show some natural properties of our notion of dimension. We also do a comparative study among our proposed notion of dimension and two well known notions of computational analogue of entropy, namely HILL-type pseudo min-entropy and next-bit pseudo Shannon entropy.
Further, we apply our quantification to the following problem. If we know that the dimension of a distribution on the set of n-length strings is s in (0,1], can we extract out O(sn) pseudorandom bits out of the distribution? We show that to construct such extractor, one need at least Omega(log n) bits of pure randomness. However, it is still open to do the same using O(log n) random bits. We show that deterministic extraction is possible in a special case - analogous to the bit-fixing sources introduced by Chor et al., which we term nonpseudorandom bit-fixing source. We adapt the techniques of Gabizon, Raz and Shaltiel to construct a deterministic pseudorandom extractor for this source.
By the end, we make a little progress towards P vs. BPP problem by showing that existence of optimal stretching function that stretches O(log n) input bits to produce n output bits such that output distribution has dimension s in (0,1], implies P=BPP.

Manindra Agrawal, Diptarka Chakraborty, Debarati Das, and Satyadev Nandakumar. Dimension, Pseudorandomness and Extraction of Pseudorandomness. In 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 45, pp. 221-235, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{agrawal_et_al:LIPIcs.FSTTCS.2015.221, author = {Agrawal, Manindra and Chakraborty, Diptarka and Das, Debarati and Nandakumar, Satyadev}, title = {{Dimension, Pseudorandomness and Extraction of Pseudorandomness}}, booktitle = {35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015)}, pages = {221--235}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-97-2}, ISSN = {1868-8969}, year = {2015}, volume = {45}, editor = {Harsha, Prahladh and Ramalingam, G.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2015.221}, URN = {urn:nbn:de:0030-drops-56184}, doi = {10.4230/LIPIcs.FSTTCS.2015.221}, annote = {Keywords: Pseudorandomness, Dimension, Martingale, Unpredictability, Pseudoentropy, Pseudorandom Extractor, Hard Function} }

Document

**Published in:** LIPIcs, Volume 29, 34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014)

We obtain the following new simultaneous time-space upper bounds for the directed reachability problem. (1) A polynomial-time, O(n^{2/3} * g^{1/3})-space algorithm for directed graphs embedded on orientable surfaces of genus g. (2) A polynomial-time, O(n^{2/3})-space algorithm for all H-minor-free graphs given the tree decomposition, and (3) for K_{3,3}-free and K_5-free graphs, a polynomial-time, O(n^{1/2 + epsilon})-space algorithm, for every epsilon > 0.
For the general directed reachability problem, the best known simultaneous time-space upper bound is the BBRS bound, due to Barnes, Buss, Ruzzo, and Schieber, which achieves a space bound of O(n/2^{k * sqrt(log(n))}) with polynomial running time, for any constant k. It is a significant open question to improve this bound for reachability over general directed graphs. Our algorithms beat the BBRS bound for graphs embedded on surfaces of genus n/2^{omega(sqrt(log(n))}, and for all H-minor-free graphs. This significantly broadens the class of directed graphs for which the BBRS bound can be improved.

Diptarka Chakraborty, A. Pavan, Raghunath Tewari, N. V. Vinodchandran, and Lin Forrest Yang. New Time-Space Upperbounds for Directed Reachability in High-genus and H-minor-free Graphs. In 34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 29, pp. 585-595, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.FSTTCS.2014.585, author = {Chakraborty, Diptarka and Pavan, A. and Tewari, Raghunath and Vinodchandran, N. V. and Yang, Lin Forrest}, title = {{New Time-Space Upperbounds for Directed Reachability in High-genus and H-minor-free Graphs}}, booktitle = {34th International Conference on Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014)}, pages = {585--595}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-77-4}, ISSN = {1868-8969}, year = {2014}, volume = {29}, editor = {Raman, Venkatesh and Suresh, S. P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2014.585}, URN = {urn:nbn:de:0030-drops-48730}, doi = {10.4230/LIPIcs.FSTTCS.2014.585}, annote = {Keywords: Reachability, Space complexity, Time-Space Efficient Algorithms, Graphs on Surfaces, Minor Free Graphs, Savitch's Algorithm, BBRS Bound} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail