DROPS

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2022.65

Fully Functional Parameterized Suffix Trees in Compact Space

Authors: Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

Two equal length strings are a parameterized match (p-match) iff there exists a one-to-one function that renames the symbols in one string to those in the other. The Parameterized Suffix Tree (PST) [Baker, STOC' 93] is a fundamental data structure that handles various string matching problems under this setting. The PST of a text T[1,n] over an alphabet Σ of size σ takes O(nlog n) bits of space. It can report any entry in (parameterized) (i) suffix array, (ii) inverse suffix array, and (iii) longest common prefix (LCP) array in O(1) time. Given any pattern P as a query, a position i in T is an occurrence iff T[i,i+|P|-1] and P are a p-match. The PST can count the number of occurrences of P in T in time O(|P|log σ) and then report each occurrence in time proportional to that of accessing a suffix array entry. An important question is, can we obtain a compressed version of PST that takes space close to the text’s size of nlogσ bits and still support all three functionalities mentioned earlier? In SODA' 17, Ganguly et al. answered this question partially by presenting an O(nlogσ) bit index that can support (parameterized) suffix array and inverse suffix array operations in O(log n) time. However, the compression of the (parameterized) LCP array and the possibility of faster suffix array and inverse suffix array queries in compact space were left open. In this work, we obtain a compact representation of the (parameterized) LCP array. With this result, in conjunction with three new (parameterized) suffix array representations, we obtain the first set of PST representations in o(nlog n) bits (when logσ = o(log n)) as follows. Here ε > 0 is an arbitrarily small constant. - Space O(n logσ) bits and query time O(log_σ^ε n); - Space O(n logσlog log_σ n) bits and query time O(log log_σ n); and - Space O(n logσ log^ε_σ n) bits and query time O(1). The first trade-off is an improvement over Ganguly et al.’s result, whereas our third trade-off matches the optimal time performance of Baker’s PST while squeezing the space by a factor roughly log_σ n. We highlight that our trade-offs match the space-and-time bounds of the best-known compressed text indexes for exact pattern matching and further improvement is highly unlikely.

Cite as

Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Fully Functional Parameterized Suffix Trees in Compact Space. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 65:1-65:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.ICALP.2022.65,
  author =	{Ganguly, Arnab and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Fully Functional Parameterized Suffix Trees in Compact Space}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{65:1--65:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.65},
  URN =		{urn:nbn:de:0030-drops-164061},
  doi =		{10.4230/LIPIcs.ICALP.2022.65},
  annote =	{Keywords: Data Structures, Suffix Trees, String Algorithms, Compression}
}

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2022.118

Dynamic Meta-Theorems for Distance and Matching

Authors: Samir Datta, Chetan Gupta, Rahul Jain, Anish Mukherjee, Vimal Raj Sharma, and Raghunath Tewari

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

Reachability, distance, and matching are some of the most fundamental graph problems that have been of particular interest in dynamic complexity theory in recent years [Samir Datta et al., 2018; Samir Datta et al., 2018; Samir Datta et al., 2020]. Reachability can be maintained with first-order update formulas, or equivalently in DynFO in general graphs with n nodes [Samir Datta et al., 2018], even under O(log(n)/log log(n)) changes per step [Samir Datta et al., 2018]. In the context of how large the number of changes can be handled, it has recently been shown [Samir Datta et al., 2020] that under a polylogarithmic number of changes, reachability is in DynFOpar in planar, bounded treewidth, and related graph classes - in fact in any graph where small non-zero circulation weights can be computed in NC. We continue this line of investigation and extend the meta-theorem for reachability to distance and bipartite maximum matching with the same bounds. These are amongst the most general classes of graphs known where we can maintain these problems deterministically without using a majority quantifier and even maintain witnesses. For the bipartite matching result, modifying the approach from [Stephen A. Fenner et al., 2016], we convert the static non-zero circulation weights to dynamic matching-isolating weights. While reachability is in DynFOar under O(log(n)/log log(n)) changes, no such bound is known for either distance or matching in any non-trivial class of graphs under non-constant changes. We show that, in the same classes of graphs as before, bipartite maximum matching is in DynFOar under O(log(n)/log log(n)) changes per step. En route to showing this we prove that the rank of a matrix can be maintained in DynFOar, also under O(log(n)/log log(n)) entry changes, improving upon the previous O(1) bound [Samir Datta et al., 2018]. This implies a similar extension for the non-uniform DynFO bound for maximum matching in general graphs and an alternate algorithm for maintaining reachability under O(log(n)/log log(n)) changes [Samir Datta et al., 2018].

Cite as

Samir Datta, Chetan Gupta, Rahul Jain, Anish Mukherjee, Vimal Raj Sharma, and Raghunath Tewari. Dynamic Meta-Theorems for Distance and Matching. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 118:1-118:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{datta_et_al:LIPIcs.ICALP.2022.118,
  author =	{Datta, Samir and Gupta, Chetan and Jain, Rahul and Mukherjee, Anish and Sharma, Vimal Raj and Tewari, Raghunath},
  title =	{{Dynamic Meta-Theorems for Distance and Matching}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{118:1--118:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.118},
  URN =		{urn:nbn:de:0030-drops-164598},
  doi =		{10.4230/LIPIcs.ICALP.2022.118},
  annote =	{Keywords: Dynamic Complexity, Distance, Matching, Derandomization, Isolation, Matrix Rank}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.CPM.2022.3

Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk)

Authors: Sharma V. Thankachan

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)

Abstract

In the past two decades, we have witnessed the design of various compact data structures for pattern matching over an indexed text [Navarro, 2016]. Popular indexes like the FM-index [Paolo Ferragina and Giovanni Manzini, 2005], compressed suffix arrays/trees [Roberto Grossi and Jeffrey Scott Vitter, 2005; Kunihiko Sadakane, 2007], the recent r-index [Travis Gagie et al., 2020; Takaaki Nishimoto and Yasuo Tabei, 2021], etc., capture the key functionalities of classic suffix arrays/trees [Udi Manber and Eugene W. Myers, 1993; Peter Weiner, 1973] in compact space. Mostly, they rely on the Burrows-Wheeler Transform (BWT) and its associated operations [Burrows and Wheeler, 1994]. However, compactly encoding some advanced suffix tree (ST) variants, like parameterized ST [Brenda S. Baker, 1993; S. Rao Kosaraju, 1995; Juan Mendivelso et al., 2020], order-isomorphic/preserving ST [Maxime Crochemore et al., 2016], two-dimensional ST [Raffaele Giancarlo, 1995; Dong Kyue Kim et al., 1998], etc. [Sung Gwan Park et al., 2019; Tetsuo Shibuya, 2000]- collectively known as suffix trees with missing suffix links [Richard Cole and Ramesh Hariharan, 2003], has been challenging. The previous techniques are not easily extendable because these variants do not hold some structural properties of the standard ST that enable compression. However, some limited progress has been made in these directions recently [Arnab Ganguly et al., 2017; Travis Gagie et al., 2017; Gianni Decaroli et al., 2017; Dhrumil Patel and Rahul Shah, 2021; Arnab Ganguly et al., 2021; Sung{-}Hwan Kim and Hwan{-}Gue Cho, 2021; Sung{-}Hwan Kim and Hwan{-}Gue Cho, 2021; Arnab Ganguly et al., 2017; Arnab Ganguly et al., 2022; Arnab Ganguly et al., 2021]. This talk will briefly survey them and highlight some interesting open problems.

Cite as

Sharma V. Thankachan. Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc. (Invited Talk). In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 3:1-3:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{thankachan:LIPIcs.CPM.2022.3,
  author =	{Thankachan, Sharma V.},
  title =	{{Compact Text Indexing for Advanced Pattern Matching Problems: Parameterized, Order-Isomorphic, 2D, etc.}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{3:1--3:3},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.3},
  URN =		{urn:nbn:de:0030-drops-161300},
  doi =		{10.4230/LIPIcs.CPM.2022.3},
  annote =	{Keywords: Text Indexing, Suffix Trees, String Matching}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2021.16

Reachability and Matching in Single Crossing Minor Free Graphs

Authors: Samir Datta, Chetan Gupta, Rahul Jain, Anish Mukherjee, Vimal Raj Sharma, and Raghunath Tewari

Published in: LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)

Abstract

We show that for each single crossing graph H, a polynomially bounded weight function for all H-minor free graphs G can be constructed in logspace such that it gives nonzero weights to all the cycles in G. This class of graphs subsumes almost all classes of graphs for which such a weight function is known to be constructed in logspace. As a consequence, we obtain that for the class of H-minor free graphs where H is a single crossing graph, reachability can be solved in UL, and bipartite maximum matching can be solved in SPL, which are small subclasses of the parallel complexity class NC. In the restrictive case of bipartite graphs, our maximum matching result improves upon the recent result of Eppstein and Vazirani [David Eppstein and Vijay V. Vazirani, 2021], where they show an NC bound for constructing perfect matching in general single crossing minor free graphs.

Cite as

Samir Datta, Chetan Gupta, Rahul Jain, Anish Mukherjee, Vimal Raj Sharma, and Raghunath Tewari. Reachability and Matching in Single Crossing Minor Free Graphs. In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 16:1-16:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{datta_et_al:LIPIcs.FSTTCS.2021.16,
  author =	{Datta, Samir and Gupta, Chetan and Jain, Rahul and Mukherjee, Anish and Sharma, Vimal Raj and Tewari, Raghunath},
  title =	{{Reachability and Matching in Single Crossing Minor Free Graphs}},
  booktitle =	{41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)},
  pages =	{16:1--16:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-215-0},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{213},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.16},
  URN =		{urn:nbn:de:0030-drops-155277},
  doi =		{10.4230/LIPIcs.FSTTCS.2021.16},
  annote =	{Keywords: Reachability, Matching, Logspace, Single-crossing minor free graphs}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2021.71

LF Successor: Compact Space Indexing for Order-Isomorphic Pattern Matching

Authors: Arnab Ganguly, Dhrumil Patel, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

Abstract

Two strings are order isomorphic iff the relative ordering of their characters is the same at all positions. For a given text T[1,n] over an ordered alphabet of size σ, we can maintain an order-isomorphic suffix tree/array in O(nlog n) bits and support (order-isomorphic) pattern/substring matching queries efficiently. It is interesting to know if we can encode these structures in space close to the text’s size of nlogσ bits. We answer this question positively by presenting an O(nlog σ)-bit index that allows access to any entry in order-isomorphic suffix array (and its inverse array) in t_{SA} = {O}(log²n/logσ) time. For any pattern P given as a query, this index can count the number of substrings of T that are order-isomorphic to P (denoted by occ) in {O}((|P|logσ+t_{SA})log n) time using standard techniques. Also, it can report the locations of those substrings in additional O(occ ⋅ t_{SA}) time.

Cite as

Arnab Ganguly, Dhrumil Patel, Rahul Shah, and Sharma V. Thankachan. LF Successor: Compact Space Indexing for Order-Isomorphic Pattern Matching. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 71:1-71:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.ICALP.2021.71,
  author =	{Ganguly, Arnab and Patel, Dhrumil and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{LF Successor: Compact Space Indexing for Order-Isomorphic Pattern Matching}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{71:1--71:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.71},
  URN =		{urn:nbn:de:0030-drops-141400},
  doi =		{10.4230/LIPIcs.ICALP.2021.71},
  annote =	{Keywords: Succinct data structures, Pattern Matching}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2019.16

Unambiguous Catalytic Computation

Authors: Chetan Gupta, Rahul Jain, Vimal Raj Sharma, and Raghunath Tewari

Published in: LIPIcs, Volume 150, 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)

Abstract

The catalytic Turing machine is a model of computation defined by Buhrman, Cleve, Koucký, Loff, and Speelman (STOC 2014). Compared to the classical space-bounded Turing machine, this model has an extra space which is filled with arbitrary content in addition to the clean space. In such a model we study if this additional filled space can be used to increase the power of computation or not, with the condition that the initial content of this extra filled space must be restored at the end of the computation. In this paper, we define the notion of unambiguous catalytic Turing machine and prove that under a standard derandomization assumption, the class of problems solved by an unambiguous catalytic Turing machine is same as the class of problems solved by a general nondeterministic catalytic Turing machine in the logspace setting.

Cite as

Chetan Gupta, Rahul Jain, Vimal Raj Sharma, and Raghunath Tewari. Unambiguous Catalytic Computation. In 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 150, pp. 16:1-16:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{gupta_et_al:LIPIcs.FSTTCS.2019.16,
  author =	{Gupta, Chetan and Jain, Rahul and Sharma, Vimal Raj and Tewari, Raghunath},
  title =	{{Unambiguous Catalytic Computation}},
  booktitle =	{39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)},
  pages =	{16:1--16:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-131-3},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{150},
  editor =	{Chattopadhyay, Arkadev and Gastin, Paul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2019.16},
  URN =		{urn:nbn:de:0030-drops-115782},
  doi =		{10.4230/LIPIcs.FSTTCS.2019.16},
  annote =	{Keywords: Catalytic computation, Logspace, Reinhardt-Allender}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2019.11

Eventually Sound Points-To Analysis with Specifications

Authors: Osbert Bastani, Rahul Sharma, Lazaro Clapp, Saswat Anand, and Alex Aiken

Published in: LIPIcs, Volume 134, 33rd European Conference on Object-Oriented Programming (ECOOP 2019)

Abstract

Static analyses make the increasingly tenuous assumption that all source code is available for analysis; for example, large libraries often call into native code that cannot be analyzed. We propose a points-to analysis that initially makes optimistic assumptions about missing code, and then inserts runtime checks that report counterexamples to these assumptions that occur during execution. Our approach guarantees eventual soundness, which combines two guarantees: (i) the runtime checks are guaranteed to catch the first counterexample that occurs during any execution, in which case execution can be terminated to prevent harm, and (ii) only finitely many counterexamples ever occur, implying that the static analysis eventually becomes statically sound with respect to all remaining executions. We implement Optix, an eventually sound points-to analysis for Android apps, where the Android framework is missing. We show that the runtime checks added by Optix incur low overhead on real programs, and demonstrate how Optix improves a client information flow analysis for detecting Android malware.

Cite as

Osbert Bastani, Rahul Sharma, Lazaro Clapp, Saswat Anand, and Alex Aiken. Eventually Sound Points-To Analysis with Specifications. In 33rd European Conference on Object-Oriented Programming (ECOOP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 134, pp. 11:1-11:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{bastani_et_al:LIPIcs.ECOOP.2019.11,
  author =	{Bastani, Osbert and Sharma, Rahul and Clapp, Lazaro and Anand, Saswat and Aiken, Alex},
  title =	{{Eventually Sound Points-To Analysis with Specifications}},
  booktitle =	{33rd European Conference on Object-Oriented Programming (ECOOP 2019)},
  pages =	{11:1--11:28},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-111-5},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{134},
  editor =	{Donaldson, Alastair F.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2019.11},
  URN =		{urn:nbn:de:0030-drops-108038},
  doi =		{10.4230/LIPIcs.ECOOP.2019.11},
  annote =	{Keywords: specification inference, static points-to analysis, runtime monitoring}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2019.9

Categorical Range Reporting with Frequencies

Authors: Arnab Ganguly, J. Ian Munro, Yakov Nekrich, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)

Abstract

In this paper, we consider a variant of the color range reporting problem called color reporting with frequencies. Our goal is to pre-process a set of colored points into a data structure, so that given a query range Q, we can report all colors that appear in Q, along with their respective frequencies. In other words, for each reported color, we also output the number of times it occurs in Q. We describe an external-memory data structure that uses O(N(1+log^2D/log N)) words and answers one-dimensional queries in O(1 +K/B) I/Os, where N is the total number of points in the data structure, D is the total number of colors in the data structure, K is the number of reported colors, and B is the block size. Next we turn to an approximate version of this problem: report all colors sigma that appear in the query range; for every reported color, we provide a constant-factor approximation on its frequency. We consider color reporting with approximate frequencies in two dimensions. Our data structure uses O(N) space and answers two-dimensional queries in O(log_B N +log^*B + K/B) I/Os in the special case when the query range is bounded on two sides. As a corollary, we can also answer one-dimensional approximate queries within the same time and space bounds.

Cite as

Arnab Ganguly, J. Ian Munro, Yakov Nekrich, Rahul Shah, and Sharma V. Thankachan. Categorical Range Reporting with Frequencies. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 9:1-9:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.ICDT.2019.9,
  author =	{Ganguly, Arnab and Munro, J. Ian and Nekrich, Yakov and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Categorical Range Reporting with Frequencies}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{9:1--9:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.9},
  URN =		{urn:nbn:de:0030-drops-103115},
  doi =		{10.4230/LIPIcs.ICDT.2019.9},
  annote =	{Keywords: Data Structures, Range Reporting, Range Counting, Categorical Range Reporting, Orthogonal Range Query}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2017.35

Structural Pattern Matching - Succinctly

Authors: Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 92, 28th International Symposium on Algorithms and Computation (ISAAC 2017)

Abstract

Let T be a text of length n containing characters from an alphabet \Sigma, which is the union of two disjoint sets: \Sigma_s containing static characters (s-characters) and \Sigma_p containing parameterized characters (p-characters). Each character in \Sigma_p has an associated complementary character from \Sigma_p. A pattern P (also over \Sigma) matches an equal-length substring $S$ of T iff the s-characters match exactly, there exists a one-to-one function that renames the p-characters in S to the p-characters in P, and if a p-character x is renamed to another p-character y then the complement of x is renamed to the complement of y. The task is to find the starting positions (occurrences) of all such substrings S. Previous indexing solution [Shibuya, SWAT 2000], known as Structural Suffix Tree, requires \Theta(n\log n) bits of space, and can find all occ occurrences in time O(|P|\log \sigma+ occ), where \sigma = |\Sigma|. In this paper, we present the first succinct index for this problem, which occupies n \log \sigma + O(n) bits and offers O(|P|\log\sigma+ occ\cdot \log n \log\sigma) query time.

Cite as

Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Structural Pattern Matching - Succinctly. In 28th International Symposium on Algorithms and Computation (ISAAC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 92, pp. 35:1-35:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.ISAAC.2017.35,
  author =	{Ganguly, Arnab and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Structural Pattern Matching - Succinctly}},
  booktitle =	{28th International Symposium on Algorithms and Computation (ISAAC 2017)},
  pages =	{35:1--35:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-054-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{92},
  editor =	{Okamoto, Yoshio and Tokuyama, Takeshi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2017.35},
  URN =		{urn:nbn:de:0030-drops-82566},
  doi =		{10.4230/LIPIcs.ISAAC.2017.35},
  annote =	{Keywords: Parameterized Pattern Matching, Suffix tree, Burrows-Wheeler Transform, Wavelet Tree, Fully-functional succinct tree}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2016.34

Space-Time Trade-Offs for the Shortest Unique Substring Problem

Authors: Arnab Ganguly, Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 64, 27th International Symposium on Algorithms and Computation (ISAAC 2016)

Abstract

Given a string X[1, n] and a position k in [1, n], the Shortest Unique Substring of X covering k, denoted by S_k, is a substring X[i, j] of X which satisfies the following conditions: (i) i leq k leq j, (ii) i is the only position where there is an occurrence of X[i, j], and (iii) j - i is minimized. The best-known algorithm [Hon et al., ISAAC 2015] can find S k for all k in [1, n] in time O(n) using the string X and additional 2n words of working space. Let tau be a given parameter. We present the following new results. For any given k in [1, n], we can compute S_k via a deterministic algorithm in O(n tau^2 log n tau) time using X and additional O(n/tau) words of working space. For every k in [1, n], we can compute S_k via a deterministic algorithm in O(n tau^2 log n/tau) time using X and additional O(n/tau) words and 4n + o(n) bits of working space. For both problems above, we present an O(n tau log^{c+1} n)-time randomized algorithm that uses n/ log c n words in addition to that mentioned above, where c geq 0 is an arbitrary constant. In this case, the reported string is unique and covers k, but with probability at most n^{-O(1)} , may not be the shortest. As a consequence of our techniques, we also obtain similar space-and-time tradeoffs for a related problem of finding Maximal Unique Matches of two strings [Delcher et al., Nucleic Acids Res. 1999].

Cite as

Arnab Ganguly, Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan. Space-Time Trade-Offs for the Shortest Unique Substring Problem. In 27th International Symposium on Algorithms and Computation (ISAAC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 64, pp. 34:1-34:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.ISAAC.2016.34,
  author =	{Ganguly, Arnab and Hon, Wing-Kai and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Space-Time Trade-Offs for the Shortest Unique Substring Problem}},
  booktitle =	{27th International Symposium on Algorithms and Computation (ISAAC 2016)},
  pages =	{34:1--34:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-026-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{64},
  editor =	{Hong, Seok-Hee},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2016.34},
  URN =		{urn:nbn:de:0030-drops-68041},
  doi =		{10.4230/LIPIcs.ISAAC.2016.34},
  annote =	{Keywords: Suffix Tree, Sparsification, Rabin-Karp Fingerprint, Probabilistic z-Fast Trie, Succinct Data-Structures}
}

Document

DOI: 10.4230/LIPIcs.CPM.2016.2

Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching

Authors: Arnab Ganguly, Wing-Kai Hon, Kunihiko Sadakane, Rahul Shah, Sharma V. Thankachan, and Yilin Yang

Published in: LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

Abstract

Let S and S' be two strings of the same length.We consider the following two variants of string matching. * Parameterized Matching: The characters of S and S' are partitioned into static characters and parameterized characters. The strings are parameterized match iff the static characters match exactly and there exists a one-to-one function which renames the parameterized characters in S to those in S'. * Order-Preserving Matching: The strings are order-preserving match iff for any two integers i,j in [1,|S|], S[i] <= S[j] iff S'[i] <= S'[j]. Let P be a collection of d patterns {P_1, P_2, ..., P_d} of total length n characters, which are chosen from an alphabet Sigma. Given a text T, also over Sigma, we consider the dictionary indexing problem under the above definitions of string matching. Specifically, the task is to index P, such that we can report all positions j where at least one of the patterns P_i in P is a parameterized-match (resp. order-preserving match) with the same-length substring of $T$ starting at j. Previous best-known indexes occupy O(n * log(n)) bits and can report all occ positions in O(|T| * log(|Sigma|) + occ) time. We present space-efficient indexes that occupy O(n * log(|Sigma|+d) * log(n)) bits and reports all occ positions in O(|T| * (log(|Sigma|) + log_{|Sigma|}(n)) + occ) time for parameterized matching and in O(|T| * log(n) + occ) time for order-preserving matching.

Cite as

Arnab Ganguly, Wing-Kai Hon, Kunihiko Sadakane, Rahul Shah, Sharma V. Thankachan, and Yilin Yang. Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 2:1-2:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{ganguly_et_al:LIPIcs.CPM.2016.2,
  author =	{Ganguly, Arnab and Hon, Wing-Kai and Sadakane, Kunihiko and Shah, Rahul and Thankachan, Sharma V. and Yang, Yilin},
  title =	{{Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching}},
  booktitle =	{27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages =	{2:1--2:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-012-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{54},
  editor =	{Grossi, Roberto and Lewenstein, Moshe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.2},
  URN =		{urn:nbn:de:0030-drops-60736},
  doi =		{10.4230/LIPIcs.CPM.2016.2},
  annote =	{Keywords: Parameterized Matching, Order-preserving Matching, Dictionary Indexing, Aho-Corasick Automaton, Sparsification}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2015.320

Forbidden Extension Queries

Authors: Sudip Biswas, Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 45, 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015)

Abstract

Document retrieval is one of the most fundamental problem in information retrieval. The objective is to retrieve all documents from a document collection that are relevant to an input pattern. Several variations of this problem such as ranked document retrieval, document listing with two patterns and forbidden patterns have been studied. We introduce the problem of document retrieval with forbidden extensions. Let D={T_1,T_2,...,T_D} be a collection of D string documents of n characters in total, and P^+ and P^- be two query patterns, where P^+ is a proper prefix of P^-. We call P^- as the forbidden extension of the included pattern P^+. A forbidden extension query < P^+,P^- > asks to report all occ documents in D that contains P^+ as a substring, but does not contain P^- as one. A top-k forbidden extension query < P^+,P^-,k > asks to report those k documents among the occ documents that are most relevant to P^+. We present a linear index (in words) with an O(|P^-| + occ) query time for the document listing problem. For the top-k version of the problem, we achieve the following results, when the relevance of a document is based on PageRank: - an O(n) space (in words) index with O(|P^-|log sigma+ k) query time, where sigma is the size of the alphabet from which characters in D are chosen. For constant alphabets, this yields an optimal query time of O(|P^-|+ k). - for any constant epsilon > 0, a |CSA| + |CSA^*| + Dlog frac{n}{D} + O(n) bits index with O(search(P)+ k cdot tsa cdot log ^{2+epsilon} n) query time, where search(P) is the time to find the suffix range of a pattern P, tsa is the time to find suffix (or inverse suffix) array value, and |CSA^*| denotes the maximum of the space needed to store the compressed suffix array CSA of the concatenated text of all documents, or the total space needed to store the individual CSA of each document.

Cite as

Sudip Biswas, Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Forbidden Extension Queries. In 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 45, pp. 320-335, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.FSTTCS.2015.320,
  author =	{Biswas, Sudip and Ganguly, Arnab and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Forbidden Extension Queries}},
  booktitle =	{35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2015)},
  pages =	{320--335},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-97-2},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{45},
  editor =	{Harsha, Prahladh and Ramalingam, G.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2015.320},
  URN =		{urn:nbn:de:0030-drops-56522},
  doi =		{10.4230/LIPIcs.FSTTCS.2015.320},
  annote =	{Keywords: document retrieval, suffix trees, range queries, succinct data structure}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2015.277

Shared-Constraint Range Reporting

Authors: Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)

Abstract

Orthogonal range reporting is one of the classic and most fundamental data structure problems. (2,1,1) query is a 3 dimensional query with two-sided constraint on the first dimension and one sided constraint on each of the 2nd and 3rd dimension. Given a set of N points in three dimension, a particular formulation of such a (2,1,1) query (known as four-sided range reporting in three-dimension) asks to report all those K points within a query region [a, b]X(-infinity, c]X[d, infinity). These queries have overall 4 constraints. In Word-RAM model, the best known structure capable of answering such queries with optimal query time takes O(N log^{epsilon} N) space, where epsilon>0 is any positive constant. It has been shown that any external memory structure in optimal I/Os must use Omega(N log N/ log log_B N) space (in words), where B is the block size [Arge et al., PODS 1999]. In this paper, we study a special type of (2,1,1) queries, where the query parameters a and c are the same i.e., a=c. Even though the query is still four-sided, the number of independent constraints is only three. In other words, one constraint is shared. We call this as a Shared-Constraint Range Reporting (SCRR) problem. We study this problem in both internal as well as external memory models. In RAM model where coordinates can only be compared, we achieve linear-space and O(log N+K) query time solution, matching the best-known three dimensional dominance query bound. Whereas in external memory, we present a linear space structure with O(log_B N + log log N + K/B) query I/Os. We also present an I/O-optimal (i.e., O(log_B N+K/B) I/Os) data structure which occupies O(N log log N)-word space. We achieve these results by employing a novel divide and conquer approach. SCRR finds application in database queries containing sharing among the constraints. We also show that SCRR queries naturally arise in many well known problems such as top-k color reporting, range skyline reporting and ranked document retrieval.

Cite as

Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan. Shared-Constraint Range Reporting. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 277-290, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.ICDT.2015.277,
  author =	{Biswas, Sudip and Patil, Manish and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Shared-Constraint Range Reporting}},
  booktitle =	{18th International Conference on Database Theory (ICDT 2015)},
  pages =	{277--290},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-79-8},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{31},
  editor =	{Arenas, Marcelo and Ugarte, Mart{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2015.277},
  URN =		{urn:nbn:de:0030-drops-49900},
  doi =		{10.4230/LIPIcs.ICDT.2015.277},
  annote =	{Keywords: data structure, shared constraint, multi-slab, point partitioning}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.FSTTCS.2011.6

A Domain-Specific Language for Computing on Encrypted Data (Invited Talk)

Authors: Alex Bain, John Mitchell, Rahul Sharma, Deian Stefan, and Joe Zimmerman

Published in: LIPIcs, Volume 13, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)

Abstract

In cloud computing, a client may request computation on confidential data that is sent to untrusted servers. While homomorphic encryption and secure multiparty computation provide building blocks for secure computation, software must be properly structured to preserve confidentiality. Using a general definition of secure execution platform, we propose a single Haskell-based domain-specific language for cryptographic cloud computing and prove correctness and confidentiality for two representative and distinctly different implementations of the same programming language. The secret sharing execution platform provides information-theoretic security against colluding servers. The homomorphic encryption execution platform requires only one server, but has limited efficiency, and provides secrecy against a computationally-bounded adversary. Experiments with our implementation suggest promising computational feasibility, as cryptography improves, and show how code can be developed uniformly for a variety of secure cloud platforms, without explicitly programming separate clients and servers.

Cite as

Alex Bain, John Mitchell, Rahul Sharma, Deian Stefan, and Joe Zimmerman. A Domain-Specific Language for Computing on Encrypted Data (Invited Talk). In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011). Leibniz International Proceedings in Informatics (LIPIcs), Volume 13, pp. 6-24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)

Copy BibTex To Clipboard

@InProceedings{bain_et_al:LIPIcs.FSTTCS.2011.6,
  author =	{Bain, Alex and Mitchell, John and Sharma, Rahul and Stefan, Deian and Zimmerman, Joe},
  title =	{{A Domain-Specific Language for Computing on Encrypted Data}},
  booktitle =	{IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)},
  pages =	{6--24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-34-7},
  ISSN =	{1868-8969},
  year =	{2011},
  volume =	{13},
  editor =	{Chakraborty, Supratik and Kumar, Amit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2011.6},
  URN =		{urn:nbn:de:0030-drops-33604},
  doi =		{10.4230/LIPIcs.FSTTCS.2011.6},
  annote =	{Keywords: Domain-Specific Language, Secret Sharing, Homomorphic Encryption}
}

14 Search Results for "Sharma, Rahul"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message