19 Search Results for "Li, Yi"


Document
Position
Grounding Stream Reasoning Research

Authors: Pieter Bonte, Jean-Paul Calbimonte, Daniel de Leng, Daniele Dell'Aglio, Emanuele Della Valle, Thomas Eiter, Federico Giannini, Fredrik Heintz, Konstantin Schekotihin, Danh Le-Phuoc, Alessandra Mileo, Patrik Schneider, Riccardo Tommasini, Jacopo Urbani, and Giacomo Ziffer

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1


Abstract
In the last decade, there has been a growing interest in applying AI technologies to implement complex data analytics over data streams. To this end, researchers in various fields have been organising a yearly event called the "Stream Reasoning Workshop" to share perspectives, challenges, and experiences around this topic. In this paper, the previous organisers of the workshops and other community members provide a summary of the main research results that have been discussed during the first six editions of the event. These results can be categorised into four main research areas: The first is concerned with the technological challenges related to handling large data streams. The second area aims at adapting and extending existing semantic technologies to data streams. The third and fourth areas focus on how to implement reasoning techniques, either considering deductive or inductive techniques, to extract new and valuable knowledge from the data in the stream. This summary is written not only to provide a crystallisation of the field, but also to point out distinctive traits of the stream reasoning community. Moreover, it also provides a foundation for future research by enumerating a list of use cases and open challenges, to stimulate others to join this exciting research area.

Cite as

Pieter Bonte, Jean-Paul Calbimonte, Daniel de Leng, Daniele Dell'Aglio, Emanuele Della Valle, Thomas Eiter, Federico Giannini, Fredrik Heintz, Konstantin Schekotihin, Danh Le-Phuoc, Alessandra Mileo, Patrik Schneider, Riccardo Tommasini, Jacopo Urbani, and Giacomo Ziffer. Grounding Stream Reasoning Research. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 2:1-2:47, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{bonte_et_al:TGDK.2.1.2,
  author =	{Bonte, Pieter and Calbimonte, Jean-Paul and de Leng, Daniel and Dell'Aglio, Daniele and Della Valle, Emanuele and Eiter, Thomas and Giannini, Federico and Heintz, Fredrik and Schekotihin, Konstantin and Le-Phuoc, Danh and Mileo, Alessandra and Schneider, Patrik and Tommasini, Riccardo and Urbani, Jacopo and Ziffer, Giacomo},
  title =	{{Grounding Stream Reasoning Research}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:47},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.2},
  URN =		{urn:nbn:de:0030-drops-198597},
  doi =		{10.4230/TGDK.2.1.2},
  annote =	{Keywords: Stream Reasoning, Stream Processing, RDF streams, Streaming Linked Data, Continuous query processing, Temporal Logics, High-performance computing, Databases}
}
Document
Position
Standardizing Knowledge Engineering Practices with a Reference Architecture

Authors: Bradley P. Allen and Filip Ilievski

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1


Abstract
Knowledge engineering is the process of creating and maintaining knowledge-producing systems. Throughout the history of computer science and AI, knowledge engineering workflows have been widely used given the importance of high-quality knowledge for reliable intelligent agents. Meanwhile, the scope of knowledge engineering, as apparent from its target tasks and use cases, has been shifting, together with its paradigms such as expert systems, semantic web, and language modeling. The intended use cases and supported user requirements between these paradigms have not been analyzed globally, as new paradigms often satisfy prior pain points while possibly introducing new ones. The recent abstraction of systemic patterns into a boxology provides an opening for aligning the requirements and use cases of knowledge engineering with the systems, components, and software that can satisfy them best, however, this direction has not been explored to date. This paper proposes a vision of harmonizing the best practices in the field of knowledge engineering by leveraging the software engineering methodology of creating reference architectures. We describe how a reference architecture can be iteratively designed and implemented to associate user needs with recurring systemic patterns, building on top of existing knowledge engineering workflows and boxologies. We provide a six-step roadmap that can enable the development of such an architecture, consisting of scope definition, selection of information sources, architectural analysis, synthesis of an architecture based on the information source analysis, evaluation through instantiation, and, ultimately, instantiation into a concrete software architecture. We provide an initial design and outcome of the definition of architectural scope, selection of information sources, and analysis. As the remaining steps of design, evaluation, and instantiation of the architecture are largely use-case specific, we provide a detailed description of their procedures and point to relevant examples. We expect that following through on this vision will lead to well-grounded reference architectures for knowledge engineering, will advance the ongoing initiatives of organizing the neurosymbolic knowledge engineering space, and will build new links to the software architectures and data science communities.

Cite as

Bradley P. Allen and Filip Ilievski. Standardizing Knowledge Engineering Practices with a Reference Architecture. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 5:1-5:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{allen_et_al:TGDK.2.1.5,
  author =	{Allen, Bradley P. and Ilievski, Filip},
  title =	{{Standardizing Knowledge Engineering Practices with a Reference Architecture}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:23},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.5},
  URN =		{urn:nbn:de:0030-drops-198623},
  doi =		{10.4230/TGDK.2.1.5},
  annote =	{Keywords: knowledge engineering, knowledge graphs, quality attributes, software architectures, sociotechnical systems}
}
Document
RANDOM
Streaming Algorithms with Large Approximation Factors

Authors: Yi Li, Honghao Lin, David P. Woodruff, and Yuheng Zhang

Published in: LIPIcs, Volume 245, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)


Abstract
We initiate a broad study of classical problems in the streaming model with insertions and deletions in the setting where we allow the approximation factor α to be much larger than 1. Such algorithms can use significantly less memory than the usual setting for which α = 1+ε for an ε ∈ (0,1). We study large approximations for a number of problems in sketching and streaming, assuming that the underlying n-dimensional vector has all coordinates bounded by M throughout the data stream: 1) For the 𝓁_p norm/quasi-norm, 0 < p ≤ 2, we show that obtaining a poly(n)-approximation requires the same amount of memory as obtaining an O(1)-approximation for any M = n^Θ(1), which holds even for randomly ordered streams or for streams in the bounded deletion model. 2) For estimating the 𝓁_p norm, p > 2, we show an upper bound of O(n^{1-2/p} (log n log M)/α²) bits for an α-approximation, and give a matching lower bound for linear sketches. 3) For the 𝓁₂-heavy hitters problem, we show that the known lower bound of Ω(k log nlog M) bits for identifying (1/k)-heavy hitters holds even if we are allowed to output items that are 1/(α k)-heavy, provided the algorithm succeeds with probability 1-O(1/n). We also obtain a lower bound for linear sketches that is tight even for constant failure probability algorithms. 4) For estimating the number 𝓁₀ of distinct elements, we give an n^{1/t}-approximation algorithm using O(tlog log M) bits of space, as well as a lower bound of Ω(t) bits, both excluding the storage of random bits, where n is the dimension of the underlying frequency vector and M is an upper bound on the magnitude of its coordinates. 5) For α-approximation to the Schatten-p norm, we give near-optimal Õ(n^{2-4/p}/α⁴) sketching dimension for every even integer p and every α ≥ 1, while for p not an even integer we obtain near-optimal sketching dimension once α = Ω(n^{1/q-1/p}), where q is the largest even integer less than p. The latter is surprising as it is unknown what the complexity of Schatten-p norm estimation is for constant approximation; we show once the approximation factor is at least n^{1/q-1/p}, we can obtain near-optimal sketching bounds.

Cite as

Yi Li, Honghao Lin, David P. Woodruff, and Yuheng Zhang. Streaming Algorithms with Large Approximation Factors. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 13:1-13:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.APPROX/RANDOM.2022.13,
  author =	{Li, Yi and Lin, Honghao and Woodruff, David P. and Zhang, Yuheng},
  title =	{{Streaming Algorithms with Large Approximation Factors}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)},
  pages =	{13:1--13:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-249-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{245},
  editor =	{Chakrabarti, Amit and Swamy, Chaitanya},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2022.13},
  URN =		{urn:nbn:de:0030-drops-171354},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2022.13},
  annote =	{Keywords: streaming algorithms, 𝓁\underlinep norm, heavy hitters, distinct elements}
}
Document
RANDOM
The Product of Gaussian Matrices Is Close to Gaussian

Authors: Yi Li and David P. Woodruff

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
We study the distribution of the matrix product G₁ G₂ ⋯ G_r of r independent Gaussian matrices of various sizes, where G_i is d_{i-1} × d_i, and we denote p = d₀, q = d_r, and require d₁ = d_{r-1}. Here the entries in each G_i are standard normal random variables with mean 0 and variance 1. Such products arise in the study of wireless communication, dynamical systems, and quantum transport, among other places. We show that, provided each d_i, i = 1, …, r, satisfies d_i ≥ C p ⋅ q, where C ≥ C₀ for a constant C₀ > 0 depending on r, then the matrix product G₁ G₂ ⋯ G_r has variation distance at most δ to a p × q matrix G of i.i.d. standard normal random variables with mean 0 and variance ∏_{i = 1}^{r-1} d_i. Here δ → 0 as C → ∞. Moreover, we show a converse for constant r that if d_i < C' max{p,q}^{1/2}min{p,q}^{3/2} for some i, then this total variation distance is at least δ', for an absolute constant δ' > 0 depending on C' and r. This converse is best possible when p = Θ(q).

Cite as

Yi Li and David P. Woodruff. The Product of Gaussian Matrices Is Close to Gaussian. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 35:1-35:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.APPROX/RANDOM.2021.35,
  author =	{Li, Yi and Woodruff, David P.},
  title =	{{The Product of Gaussian Matrices Is Close to Gaussian}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{35:1--35:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.35},
  URN =		{urn:nbn:de:0030-drops-147281},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.35},
  annote =	{Keywords: random matrix theory, total variation distance, matrix product}
}
Document
Development and Validation of Energy Simulation for Additive Manufacturing

Authors: Li Yi and Jan C. Aurich

Published in: OASIcs, Volume 89, 2nd International Conference of the DFG International Research Training Group 2057 – Physical Modeling for Virtual Manufacturing (iPMVM 2020)


Abstract
Additive manufacturing (AM) is a promising manufacturing technology towards cleaner production systems. Nevertheless, recent studies state that environmental benefits of AM are case-specific and need to be evaluated and confirmed in the design phase. To enable the energy performance evaluation in the design phase, developing convenient tools for energy prediction of AM has been an important research task. Aiming at this problem, this paper presents the research for energy modeling, simulation implementation, and experimental validation of an energy simulation tool of two AM processes: Selective laser melting (SLM) and Fused deposition modeling (FDM). The developed simulation tool can be conveniently used for energy consumption quantification and evaluation during the product and process design for AM.

Cite as

Li Yi and Jan C. Aurich. Development and Validation of Energy Simulation for Additive Manufacturing. In 2nd International Conference of the DFG International Research Training Group 2057 – Physical Modeling for Virtual Manufacturing (iPMVM 2020). Open Access Series in Informatics (OASIcs), Volume 89, pp. 1:1-1:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{yi_et_al:OASIcs.iPMVM.2020.1,
  author =	{Yi, Li and Aurich, Jan C.},
  title =	{{Development and Validation of Energy Simulation for Additive Manufacturing}},
  booktitle =	{2nd International Conference of the DFG International Research Training Group 2057 – Physical Modeling for Virtual Manufacturing (iPMVM 2020)},
  pages =	{1:1--1:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-183-2},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{89},
  editor =	{Garth, Christoph and Aurich, Jan C. and Linke, Barbara and M\"{u}ller, Ralf and Ravani, Bahram and Weber, Gunther H. and Kirsch, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.iPMVM.2020.1},
  URN =		{urn:nbn:de:0030-drops-137500},
  doi =		{10.4230/OASIcs.iPMVM.2020.1},
  annote =	{Keywords: Additive manufacturing, fused deposition modeling, selective laser melting, energy simulation, eco-design for AM}
}
Document
Geometric Cover with Outliers Removal

Authors: Zhengyang Guo and Yi Li

Published in: LIPIcs, Volume 187, 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)


Abstract
We study the problem of partial geometric cover, which asks to find the minimum number of geometric objects (unit squares and unit disks in this work) that cover at least (n-t) of n given planar points, where 0 ≤ t ≤ n/2. When t = 0, the problem is the classical geometric cover problem, for which many existing works adopt a general framework called the shifting strategy. The shifting strategy is a divide and conquer paradigm which partitions the plane into equal-width strips, applies a local algorithm on each strip and then merges the local solutions with only a small loss on the overall approximation ratio. A challenge to extend the shifting strategy to the case of outliers is to determine the number of outliers in each strip. We develop a shifting strategy incorporating the outlier distribution, which runs in O(tn log n) time. We also develop local algorithms on strips for the outliers case, improving the running time over previous algorithms, and consequently obtain approximation algorithms to the partial geometric cover.

Cite as

Zhengyang Guo and Yi Li. Geometric Cover with Outliers Removal. In 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 187, pp. 39:1-39:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{guo_et_al:LIPIcs.STACS.2021.39,
  author =	{Guo, Zhengyang and Li, Yi},
  title =	{{Geometric Cover with Outliers Removal}},
  booktitle =	{38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021)},
  pages =	{39:1--39:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-180-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{187},
  editor =	{Bl\"{a}ser, Markus and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2021.39},
  URN =		{urn:nbn:de:0030-drops-136849},
  doi =		{10.4230/LIPIcs.STACS.2021.39},
  annote =	{Keywords: Geometric Cover, Unit Square Cover, Unit Disk Cover, Shifting Strategy, Outliers Detection, Computational Geometry}
}
Document
Invited Talk
Worst-Case Optimal Join Algorithms (Invited Talk)

Authors: Ke Yi

Published in: LIPIcs, Volume 181, 31st International Symposium on Algorithms and Computation (ISAAC 2020)


Abstract
Join is the most important operator in relational databases, and remains the most expensive one despite years of research and engineering efforts. Following the ground-breaking work of Atserias, Grohe, and Marx in 2008, worst-case optimal join algorithms have been discovered, which has led to not only a series of beautiful theoretical results, but also new database systems based on drastically different query evaluation techniques. In this talk, I will present an overview of this topic, including algorithms in various computation models (sequential and parallel), variants of the problem (full, Boolean, and counting), and approximate solutions.

Cite as

Ke Yi. Worst-Case Optimal Join Algorithms (Invited Talk). In 31st International Symposium on Algorithms and Computation (ISAAC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 181, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{yi:LIPIcs.ISAAC.2020.2,
  author =	{Yi, Ke},
  title =	{{Worst-Case Optimal Join Algorithms}},
  booktitle =	{31st International Symposium on Algorithms and Computation (ISAAC 2020)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-173-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{181},
  editor =	{Cao, Yixin and Cheng, Siu-Wing and Li, Minming},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2020.2},
  URN =		{urn:nbn:de:0030-drops-133462},
  doi =		{10.4230/LIPIcs.ISAAC.2020.2},
  annote =	{Keywords: query evaluation}
}
Document
APPROX
Streaming Complexity of SVMs

Authors: Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, and David P. Woodruff

Published in: LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)


Abstract
We study the space complexity of solving the bias-regularized SVM problem in the streaming model. In particular, given a data set (x_i,y_i) ∈ ℝ^d× {-1,+1}, the objective function is F_λ(θ,b) = λ/2‖(θ,b)‖₂² + 1/n∑_{i=1}ⁿ max{0,1-y_i(θ^Tx_i+b)} and the goal is to find the parameters that (approximately) minimize this objective. This is a classic supervised learning problem that has drawn lots of attention, including for developing fast algorithms for solving the problem approximately: i.e., for finding (θ,b) such that F_λ(θ,b) ≤ min_{(θ',b')} F_λ(θ',b')+ε. One of the most widely used algorithms for approximately optimizing the SVM objective is Stochastic Gradient Descent (SGD), which requires only O(1/λε) random samples, and which immediately yields a streaming algorithm that uses O(d/λε) space. For related problems, better streaming algorithms are only known for smooth functions, unlike the SVM objective that we focus on in this work. We initiate an investigation of the space complexity for both finding an approximate optimum of this objective, and for the related "point estimation" problem of sketching the data set to evaluate the function value F_λ on any query (θ, b). We show that, for both problems, for dimensions d = 1,2, one can obtain streaming algorithms with space polynomially smaller than 1/λε, which is the complexity of SGD for strongly convex functions like the bias-regularized SVM [Shalev-Shwartz et al., 2007], and which is known to be tight in general, even for d = 1 [Agarwal et al., 2009]. We also prove polynomial lower bounds for both point estimation and optimization. In particular, for point estimation we obtain a tight bound of Θ(1/√{ε}) for d = 1 and a nearly tight lower bound of Ω̃(d/{ε}²) for d = Ω(log(1/ε)). Finally, for optimization, we prove a Ω(1/√{ε}) lower bound for d = Ω(log(1/ε)), and show similar bounds when d is constant.

Cite as

Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, and David P. Woodruff. Streaming Complexity of SVMs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 50:1-50:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.APPROX/RANDOM.2020.50,
  author =	{Andoni, Alexandr and Burns, Collin and Li, Yi and Mahabadi, Sepideh and Woodruff, David P.},
  title =	{{Streaming Complexity of SVMs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)},
  pages =	{50:1--50:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-164-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{176},
  editor =	{Byrka, Jaros{\l}aw and Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.50},
  URN =		{urn:nbn:de:0030-drops-126532},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2020.50},
  annote =	{Keywords: support vector machine, streaming algorithm, space lower bound, sketching algorithm, point estimation}
}
Document
Track A: Algorithms, Complexity and Games
Deterministic Sparse Fourier Transform with an 𝓁_{∞} Guarantee

Authors: Yi Li and Vasileios Nakos

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)


Abstract
In this paper we revisit the deterministic version of the Sparse Fourier Transform problem, which asks to read only a few entries of x ∈ ℂⁿ and design a recovery algorithm such that the output of the algorithm approximates x̂, the Discrete Fourier Transform (DFT) of x. The randomized case has been well-understood, while the main work in the deterministic case is that of Merhi et al. (J Fourier Anal Appl 2018), which obtains O(k² log^(-1) k ⋅ log^5.5 n) samples and a similar runtime with the 𝓁₂/𝓁₁ guarantee. We focus on the stronger 𝓁_∞/𝓁₁ guarantee and the closely related problem of incoherent matrices. We list our contributions as follows. 1) We find a deterministic collection of O(k² log n) samples for the 𝓁_∞/𝓁₁ recovery in time O(nk log² n), and a deterministic collection of O(k² log² n) samples for the 𝓁_∞/𝓁₁ sparse recovery in time O(k² log³n). 2) We give new deterministic constructions of incoherent matrices that are row-sampled submatrices of the DFT matrix, via a derandomization of Bernstein’s inequality and bounds on exponential sums considered in analytic number theory. Our first construction matches a previous randomized construction of Nelson, Nguyen and Woodruff (RANDOM'12), where there was no constraint on the form of the incoherent matrix. Our algorithms are nearly sample-optimal, since a lower bound of Ω(k² + k log n) is known, even for the case where the sensing matrix can be arbitrarily designed. A similar lower bound of Ω(k² log n/ log k) is known for incoherent matrices.

Cite as

Yi Li and Vasileios Nakos. Deterministic Sparse Fourier Transform with an 𝓁_{∞} Guarantee. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 77:1-77:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.ICALP.2020.77,
  author =	{Li, Yi and Nakos, Vasileios},
  title =	{{Deterministic Sparse Fourier Transform with an 𝓁\underline\{∞\} Guarantee}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{77:1--77:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.77},
  URN =		{urn:nbn:de:0030-drops-124844},
  doi =		{10.4230/LIPIcs.ICALP.2020.77},
  annote =	{Keywords: Fourier sparse recovery, derandomization, incoherent matrices}
}
Document
Streaming Complexity of Spanning Tree Computation

Authors: Yi-Jun Chang, Martín Farach-Colton, Tsan-Sheng Hsu, and Meng-Tsung Tsai

Published in: LIPIcs, Volume 154, 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)


Abstract
The semi-streaming model is a variant of the streaming model frequently used for the computation of graph problems. It allows the edges of an n-node input graph to be read sequentially in p passes using Õ(n) space. If the list of edges includes deletions, then the model is called the turnstile model; otherwise it is called the insertion-only model. In both models, some graph problems, such as spanning trees, k-connectivity, densest subgraph, degeneracy, cut-sparsifier, and (Δ+1)-coloring, can be exactly solved or (1+ε)-approximated in a single pass; while other graph problems, such as triangle detection and unweighted all-pairs shortest paths, are known to require Ω̃(n) passes to compute. For many fundamental graph problems, the tractability in these models is open. In this paper, we study the tractability of computing some standard spanning trees, including BFS, DFS, and maximum-leaf spanning trees. Our results, in both the insertion-only and the turnstile models, are as follows. - Maximum-Leaf Spanning Trees: This problem is known to be APX-complete with inapproximability constant ρ ∈ [245/244, 2). By constructing an ε-MLST sparsifier, we show that for every constant ε > 0, MLST can be approximated in a single pass to within a factor of 1+ε w.h.p. (albeit in super-polynomial time for ε ≤ ρ-1 assuming P ≠ NP) and can be approximated in polynomial time in a single pass to within a factor of ρ_n+ε w.h.p., where ρ_n is the supremum constant that MLST cannot be approximated to within using polynomial time and Õ(n) space. In the insertion-only model, these algorithms can be deterministic. - BFS Trees: It is known that BFS trees require ω(1) passes to compute, but the naïve approach needs O(n) passes. We devise a new randomized algorithm that reduces the pass complexity to O(√n), and it offers a smooth tradeoff between pass complexity and space usage. This gives a polynomial separation between single-source and all-pairs shortest paths for unweighted graphs. - DFS Trees: It is unknown whether DFS trees require more than one pass. The current best algorithm by Khan and Mehta [STACS 2019] takes Õ(h) passes, where h is the height of computed DFS trees. Note that h can be as large as Ω(m/n) for n-node m-edge graphs. Our contribution is twofold. First, we provide a simple alternative proof of this result, via a new connection to sparse certificates for k-node-connectivity. Second, we present a randomized algorithm that reduces the pass complexity to O(√n), and it also offers a smooth tradeoff between pass complexity and space usage.

Cite as

Yi-Jun Chang, Martín Farach-Colton, Tsan-Sheng Hsu, and Meng-Tsung Tsai. Streaming Complexity of Spanning Tree Computation. In 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 154, pp. 34:1-34:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{chang_et_al:LIPIcs.STACS.2020.34,
  author =	{Chang, Yi-Jun and Farach-Colton, Mart{\'\i}n and Hsu, Tsan-Sheng and Tsai, Meng-Tsung},
  title =	{{Streaming Complexity of Spanning Tree Computation}},
  booktitle =	{37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)},
  pages =	{34:1--34:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-140-5},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{154},
  editor =	{Paul, Christophe and Bl\"{a}ser, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2020.34},
  URN =		{urn:nbn:de:0030-drops-118951},
  doi =		{10.4230/LIPIcs.STACS.2020.34},
  annote =	{Keywords: Max-Leaf Spanning Trees, BFS Trees, DFS Trees}
}
Document
Emerging Hardware Techniques and EDA Methodologies for Neuromorphic Computing (Dagstuhl Seminar 19152)

Authors: Krishnendu Chakrabarty, Tsung-Yi Ho, Hai Li, and Ulf Schlichtmann

Published in: Dagstuhl Reports, Volume 9, Issue 4 (2019)


Abstract
This report documents the program and the outcomes of Dagstuhl Seminar 19152 "Emerging Hardware Techniques and EDA Methodologies for Neuromorphic Computing," which was held during April 7–10, 2019 in Schloss Dagstuhl - Leibniz Center for Informatics. Though interdisciplinary considerations of issues from computer science in the domain of machine learning and large scale computing have already successfully been covered in a series of Dagstuhl seminars, this was the first time that Neuromorphic Computing was brought out as the focus. During the seminar, many of the participants presented their current research on the traditional and emerging hardware techniques, design methodologies, electronic design automation techniques, and application of neuromorphic computing, including ongoing work and open problems. This report documents the abstracts or extended abstracts of the talks presented during the seminar, as well as summaries of the discussion sessions.

Cite as

Krishnendu Chakrabarty, Tsung-Yi Ho, Hai Li, and Ulf Schlichtmann. Emerging Hardware Techniques and EDA Methodologies for Neuromorphic Computing (Dagstuhl Seminar 19152). In Dagstuhl Reports, Volume 9, Issue 4, pp. 43-58, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@Article{chakrabarty_et_al:DagRep.9.4.43,
  author =	{Chakrabarty, Krishnendu and Ho, Tsung-Yi and Li, Hai and Schlichtmann, Ulf},
  title =	{{Emerging Hardware Techniques and EDA Methodologies for Neuromorphic Computing (Dagstuhl Seminar 19152)}},
  pages =	{43--58},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2019},
  volume =	{9},
  number =	{4},
  editor =	{Chakrabarty, Krishnendu and Ho, Tsung-Yi and Li, Hai and Schlichtmann, Ulf},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.9.4.43},
  URN =		{urn:nbn:de:0030-drops-113034},
  doi =		{10.4230/DagRep.9.4.43},
  annote =	{Keywords: Neuromorphic computing; nanotechnology; hardware design; electronic design automation; reliability and robustness}
}
Document
Optimality of Linear Sketching Under Modular Updates

Authors: Kaave Hosseini, Shachar Lovett, and Grigory Yaroslavtsev

Published in: LIPIcs, Volume 137, 34th Computational Complexity Conference (CCC 2019)


Abstract
We study the relation between streaming algorithms and linear sketching algorithms, in the context of binary updates. We show that for inputs in n dimensions, the existence of efficient streaming algorithms which can process Omega(n^2) updates implies efficient linear sketching algorithms with comparable cost. This improves upon the previous work of Li, Nguyen and Woodruff [Yi Li et al., 2014] and Ai, Hu, Li and Woodruff [Yuqing Ai et al., 2016] which required a triple-exponential number of updates to achieve a similar result for updates over integers. We extend our results to updates modulo p for integers p >= 2, and to approximation instead of exact computation.

Cite as

Kaave Hosseini, Shachar Lovett, and Grigory Yaroslavtsev. Optimality of Linear Sketching Under Modular Updates. In 34th Computational Complexity Conference (CCC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 137, pp. 13:1-13:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{hosseini_et_al:LIPIcs.CCC.2019.13,
  author =	{Hosseini, Kaave and Lovett, Shachar and Yaroslavtsev, Grigory},
  title =	{{Optimality of Linear Sketching Under Modular Updates}},
  booktitle =	{34th Computational Complexity Conference (CCC 2019)},
  pages =	{13:1--13:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-116-0},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{137},
  editor =	{Shpilka, Amir},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2019.13},
  URN =		{urn:nbn:de:0030-drops-108355},
  doi =		{10.4230/LIPIcs.CCC.2019.13},
  annote =	{Keywords: communication complexity, linear sketching, streaming algorithm}
}
Document
Randomized (Delta+1)-Coloring in O(log* Delta) Congested Clique Rounds

Authors: Merav Parter and Hsin-Hao Su

Published in: LIPIcs, Volume 121, 32nd International Symposium on Distributed Computing (DISC 2018)


Abstract
(Delta+1)-vertex coloring is one of the most fundamental symmetry breaking graph problems, receiving tremendous amount of attention over the last decades. We consider the congested clique model where in each round, every pair of vertices can exchange O(log n) bits of information. In a recent breakthrough, Yi-Jun Chang, Wenzheng Li, and Seth Pettie [CLP-STOC'18] presented a randomized (Delta+1)-list coloring algorithm in the LOCAL model that works in O(log^*n+Det_{deg}(log log n)) rounds, where Det_{deg}(n') is the deterministic LOCAL complexity of (deg+1)-list coloring algorithm on n'-vertex graphs. Unfortunately, the CLP algorithm uses large messages and hence cannot be efficiently implemented in the congested clique model when the maximum degree Delta is large (in particular, when Delta=omega(sqrt{n})). Merav Parter [P-ICALP'18] recently provided a randomized (Delta+1)-coloring algorithm in O(log log Delta * log^* Delta) congested clique rounds based on a careful partitioning of the input graph into almost-independent subgraphs with maximum degree sqrt{n}. In this work, we significantly improve upon this result and present a randomized (Delta+1)-coloring algorithm with O(log^* Delta) rounds, with high probability. At the heart of our algorithm is an adaptation of the CLP algorithm for coloring a subgraph with o(n) vertices and maximum degree Omega(n^{5/8}) in O(log^* Delta) rounds. The approach is built upon a combination of techniques, this includes: the graph sparsification of [Parter-ICALP'18], and a palette sampling technique adopted to the CLP framework.

Cite as

Merav Parter and Hsin-Hao Su. Randomized (Delta+1)-Coloring in O(log* Delta) Congested Clique Rounds. In 32nd International Symposium on Distributed Computing (DISC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 121, pp. 39:1-39:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{parter_et_al:LIPIcs.DISC.2018.39,
  author =	{Parter, Merav and Su, Hsin-Hao},
  title =	{{Randomized (Delta+1)-Coloring in O(log* Delta) Congested Clique Rounds}},
  booktitle =	{32nd International Symposium on Distributed Computing (DISC 2018)},
  pages =	{39:1--39:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-092-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{121},
  editor =	{Schmid, Ulrich and Widder, Josef},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2018.39},
  URN =		{urn:nbn:de:0030-drops-98286},
  doi =		{10.4230/LIPIcs.DISC.2018.39},
  annote =	{Keywords: Distributed Graph Algorithms, Coloring, congested clique}
}
Document
Deterministic Heavy Hitters with Sublinear Query Time

Authors: Yi Li and Vasileios Nakos

Published in: LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)


Abstract
We study the classic problem of finding l_1 heavy hitters in the streaming model. In the general turnstile model, we give the first deterministic sublinear-time sketching algorithm which takes a linear sketch of length O(epsilon^{-2} log n * log^*(epsilon^{-1})), which is only a factor of log^*(epsilon^{-1}) more than the best existing polynomial-time sketching algorithm (Nelson et al., RANDOM '12). Our approach is based on an iterative procedure, where most unrecovered heavy hitters are identified in each iteration. Although this technique has been extensively employed in the related problem of sparse recovery, this is the first time, to the best of our knowledge, that it has been used in the context of heavy hitters. Along the way we also obtain a sublinear time algorithm for the closely related problem of the l_1/l_1 compressed sensing, matching the space usage of previous (super-)linear time algorithms. In the strict turnstile model, we show that the runtime can be improved and the sketching matrix can be made strongly explicit with O(epsilon^{-2}log^3 n/log^3(1/epsilon)) rows.

Cite as

Yi Li and Vasileios Nakos. Deterministic Heavy Hitters with Sublinear Query Time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.APPROX-RANDOM.2018.18,
  author =	{Li, Yi and Nakos, Vasileios},
  title =	{{Deterministic Heavy Hitters with Sublinear Query Time}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-085-9},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{116},
  editor =	{Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.18},
  URN =		{urn:nbn:de:0030-drops-94221},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2018.18},
  annote =	{Keywords: heavy hitters, turnstile model, sketching algorithm, strongly explicit}
}
Document
On Low-Risk Heavy Hitters and Sparse Recovery Schemes

Authors: Yi Li, Vasileios Nakos, and David P. Woodruff

Published in: LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)


Abstract
We study the heavy hitters and related sparse recovery problems in the low failure probability regime. This regime is not well-understood, and the main previous work on this is by Gilbert et al. (ICALP'13). We recognize an error in their analysis, improve their results, and contribute new sparse recovery algorithms, as well as provide upper and lower bounds for the heavy hitters problem with low failure probability. Our results are summarized as follows: 1) (Heavy Hitters) We study three natural variants for finding heavy hitters in the strict turnstile model, where the variant depends on the quality of the desired output. For the weakest variant, we give a randomized algorithm improving the failure probability analysis of the ubiquitous Count-Min data structure. We also give a new lower bound for deterministic schemes, resolving a question about this variant posed in Question 4 in the IITK Workshop on Algorithms for Data Streams (2006). Under the strongest and well-studied l_{infty}/ l_2 variant, we show that the classical Count-Sketch data structure is optimal for very low failure probabilities, which was previously unknown. 2) (Sparse Recovery Algorithms) For non-adaptive sparse-recovery, we give sublinear-time algorithms with low-failure probability, which improve upon Gilbert et al. (ICALP'13). In the adaptive case, we improve the failure probability from a constant by Indyk et al. (FOCS '11) to e^{-k^{0.99}}, where k is the sparsity parameter. 3) (Optimal Average-Case Sparse Recovery Bounds) We give matching upper and lower bounds in all parameters, including the failure probability, for the measurement complexity of the l_2/l_2 sparse recovery problem in the spiked-covariance model, completely settling its complexity in this model.

Cite as

Yi Li, Vasileios Nakos, and David P. Woodruff. On Low-Risk Heavy Hitters and Sparse Recovery Schemes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 19:1-19:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.APPROX-RANDOM.2018.19,
  author =	{Li, Yi and Nakos, Vasileios and Woodruff, David P.},
  title =	{{On Low-Risk Heavy Hitters and Sparse Recovery Schemes}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)},
  pages =	{19:1--19:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-085-9},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{116},
  editor =	{Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.19},
  URN =		{urn:nbn:de:0030-drops-94237},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2018.19},
  annote =	{Keywords: heavy hitters, sparse recovery, turnstile model, spike covariance model, lower bounds}
}
  • Refine by Author
  • 10 Li, Yi
  • 7 Woodruff, David P.
  • 3 Nakos, Vasileios
  • 1 Ai, Yuqing
  • 1 Allen, Bradley P.
  • Show More...

  • Refine by Classification
  • 6 Theory of computation → Streaming, sublinear and near linear time algorithms
  • 1 Applied computing → Computer-aided design
  • 1 Computing methodologies → Description logics
  • 1 Computing methodologies → Knowledge representation and reasoning
  • 1 Computing methodologies → Temporal reasoning
  • Show More...

  • Refine by Keyword
  • 3 heavy hitters
  • 2 communication complexity
  • 2 data streams
  • 2 matrix norms
  • 2 sketching
  • Show More...

  • Refine by Type
  • 19 document

  • Refine by Publication Year
  • 4 2020
  • 3 2018
  • 3 2021
  • 2 2016
  • 2 2019
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail