1 Search Results for "Bertram, Noah"

AWLCO: All-Window Length Co-Occurrence

Authors: Joshua Sobel, Noah Bertram, Chen Ding, Fatemeh Nargesian, and Daniel Gildea

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)

Analyzing patterns in a sequence of events has applications in text analysis, computer programming, and genomics research. In this paper, we consider the all-window-length analysis model which analyzes a sequence of events with respect to windows of all lengths. We study the exact co-occurrence counting problem for the all-window-length analysis model. Our first algorithm is an offline algorithm that counts all-window-length co-occurrences by performing multiple passes over a sequence and computing single-window-length co-occurrences. This algorithm has the time complexity O(n) for each window length and thus a total complexity of O(n²) and the space complexity O(|I|) for a sequence of size n and an itemset of size |I|. We propose AWLCO, an online algorithm that computes all-window-length co-occurrences in a single pass with the time complexity of O(n) and space complexity of O(√{n|I|}), assuming perfect hashing. Following this, we generalize our use case to patterns in which we propose an algorithm that computes all-window-length co-occurrence with time complexity O(n|I|), assuming perfect hashing, with an additional pre-processing step and space complexity O(√{n|I|}+|I|), plus the overhead of the Aho-Corasick algorithm [Aho and Corasick, 1975].

Cite as

Joshua Sobel, Noah Bertram, Chen Ding, Fatemeh Nargesian, and Daniel Gildea. AWLCO: All-Window Length Co-Occurrence. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 24:1-24:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

  author =	{Sobel, Joshua and Bertram, Noah and Ding, Chen and Nargesian, Fatemeh and Gildea, Daniel},
  title =	{{AWLCO: All-Window Length Co-Occurrence}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{24:1--24:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.24},
  URN =		{urn:nbn:de:0030-drops-139759},
  doi =		{10.4230/LIPIcs.CPM.2021.24},
  annote =	{Keywords: Itemsets, Data Sequences, Co-occurrence}
  • Refine by Author
  • 1 Bertram, Noah
  • 1 Ding, Chen
  • 1 Gildea, Daniel
  • 1 Nargesian, Fatemeh
  • 1 Sobel, Joshua

  • Refine by Classification
  • 1 Theory of computation → Streaming, sublinear and near linear time algorithms

  • Refine by Keyword
  • 1 Co-occurrence
  • 1 Data Sequences
  • 1 Itemsets

  • Refine by Type
  • 1 document

  • Refine by Publication Year
  • 1 2021

Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail