11 Search Results for "Roth, Tal"


Document
Testable Algorithms for Approximately Counting Edges and Triangles in Sublinear Time and Space

Authors: Talya Eden, Ronitt Rubinfeld, and Arsen Vasilyan

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
We consider the fundamental problems of approximately counting the numbers of edges and triangles in a graph in sublinear time. Previous algorithms for these tasks are significantly more efficient under a promise that the arboricity of the graph is bounded by some parameter ̅α. However, when this promise is violated, the estimates given by these algorithms are no longer guaranteed to be correct. For the triangle counting task, we give an algorithm that requires no promise on the input graph G, and computes a (1±ε)-approximation for the number of triangles t in G in time O^*((m⋅ α(G))/t + m/(t^{2/3)}), where α(G) is the arboricity of the graph. The algorithm can be used on any graph G (no prior knowledge of the arboricity α(G) is required), and the algorithm adapts its run-time on the fly based on the graph G. We accomplish this by trying a sequence of candidate values α̃ for α(G) and using a novel algorithm in the framework of testable algorithms. This ensures that wrong candidates α̃ cannot lead to wrong estimates: if the advice is incorrect, the algorithm either succeeds despite this or detects this and continues with a new candidate. Once the algorithm accepts the candidate, its output is guaranteed to be correct with high probability. We prove that this approach preserves - up to an additive overhead - the dramatic efficiency gains obtainable when good arboricity bounds are known in advance, while ensuring robustness against misleading advice. We further complement this result with a lower bound, showing that such an overhead is unavoidable whenever the advice may be faulty. We further demonstrate implications of our results for triangle counting in the streaming model.

Cite as

Talya Eden, Ronitt Rubinfeld, and Arsen Vasilyan. Testable Algorithms for Approximately Counting Edges and Triangles in Sublinear Time and Space. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 54:1-54:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ITCS.2026.54,
  author =	{Eden, Talya and Rubinfeld, Ronitt and Vasilyan, Arsen},
  title =	{{Testable Algorithms for Approximately Counting Edges and Triangles in Sublinear Time and Space}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{54:1--54:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.54},
  URN =		{urn:nbn:de:0030-drops-253417},
  doi =		{10.4230/LIPIcs.ITCS.2026.54},
  annote =	{Keywords: Sublinear Algorithms, Triangle Counting, Edge Counting, Arboricity}
}
Document
Communication Complexity of Equality and Error-Correcting Codes

Authors: Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
We study the public-coin randomized communication complexity of the equality function. The communication complexity of this function is known to be low when the error probability is constant and the players have access to many random bits. The complexity grows, however, if the allowed error probability and the amount of randomness are restricted. We show that public-coin randomized protocols for equality and error-correcting codes are essentially the same object. That is, given a protocol for equality, we can construct a code, and vice versa. We substantially extend the protocol-implies-code direction: any protocol computing a function with a large fooling set can be converted into an error-correcting code. As a corollary, we show that among functions with a fooling set of size s, equality on log s bits has the least randomized communication complexity, regardless of the restrictions on the error probability and the amount of randomness. Finally, we use the connection to error-correcting codes to analyze the randomized communication complexity of equality for varying restrictions on the error probability and the amount of randomness. In most cases, we provide tight bounds. We pinpoint the setting in which tight bounds are still unknown.

Cite as

Dale Jacobs, John Jeang, Vladimir Podolskii, Morgan Prior, and Ilya Volkovich. Communication Complexity of Equality and Error-Correcting Codes. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 37:1-37:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{jacobs_et_al:LIPIcs.FSTTCS.2025.37,
  author =	{Jacobs, Dale and Jeang, John and Podolskii, Vladimir and Prior, Morgan and Volkovich, Ilya},
  title =	{{Communication Complexity of Equality and Error-Correcting Codes}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{37:1--37:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.37},
  URN =		{urn:nbn:de:0030-drops-251175},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.37},
  annote =	{Keywords: communication complexity, randomized communication complexity, error-correcting codes}
}
Document
RANDOM
Sublinear Space Graph Algorithms in the Continual Release Model

Authors: Alessandro Epasto, Quanquan C. Liu, Tamalika Mukherjee, and Felix Zhou

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)


Abstract
The graph continual release model of differential privacy seeks to produce differentially private solutions to graph problems under a stream of edge updates where new private solutions are released after each update. Thus far, previously known edge-differentially private algorithms for most graph problems including densest subgraph and matchings in the continual release setting only output real-value estimates (not vertex subset solutions) and do not use sublinear space. Instead, they rely on computing exact graph statistics on the input [Hendrik Fichtenberger et al., 2021; Shuang Song et al., 2018]. In this paper, we leverage sparsification to address the above shortcomings for edge-insertion streams. Our edge-differentially private algorithms use sublinear space with respect to the number of edges in the graph while some also achieve sublinear space in the number of vertices in the graph. In addition, for the densest subgraph problem, we also output edge-differentially private vertex subset solutions; no previous graph algorithms in the continual release model output such subsets. We make novel use of assorted sparsification techniques from the non-private streaming and static graph algorithms literature to achieve new results in the sublinear space, continual release setting. This includes algorithms for densest subgraph, maximum matching, as well as the first continual release k-core decomposition algorithm. We also develop a novel sparse level data structure for k-core decomposition that may be of independent interest. To complement our insertion-only algorithms, we conclude with polynomial additive error lower bounds for edge-privacy in the fully dynamic setting, where only logarithmic lower bounds were previously known.

Cite as

Alessandro Epasto, Quanquan C. Liu, Tamalika Mukherjee, and Felix Zhou. Sublinear Space Graph Algorithms in the Continual Release Model. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 40:1-40:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{epasto_et_al:LIPIcs.APPROX/RANDOM.2025.40,
  author =	{Epasto, Alessandro and Liu, Quanquan C. and Mukherjee, Tamalika and Zhou, Felix},
  title =	{{Sublinear Space Graph Algorithms in the Continual Release Model}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{40:1--40:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  URN =		{urn:nbn:de:0030-drops-244064},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  annote =	{Keywords: Differential Privacy, Continual Release, Densest Subgraph, k-Core Decomposition, Maximum Matching}
}
Document
RANDOM
Consumable Data via Quantum Communication

Authors: Dar Gilboa, Siddhartha Jain, and Jarrod R. McClean

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)


Abstract
Classical data can be copied and re-used for computation, with adverse consequences economically and in terms of data privacy. Motivated by this, we formulate problems in one-way communication complexity where Alice holds some data x and Bob holds m inputs y_1, …, y_m. They want to compute m instances of a bipartite relation R(⋅,⋅) on every pair (x, y_1), …, (x, y_m). We call this the asymmetric direct sum question for one-way communication. We give examples where the quantum communication complexity of such problems scales polynomially with m, while the classical communication complexity depends at most logarithmically on m. Thus, for such problems, data behaves like a consumable resource that is effectively destroyed upon use when the owner stores and transmits it as quantum states, but not when transmitted classically. We show an application to a strategic data-selling game, and discuss other potential economic implications.

Cite as

Dar Gilboa, Siddhartha Jain, and Jarrod R. McClean. Consumable Data via Quantum Communication. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 39:1-39:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gilboa_et_al:LIPIcs.APPROX/RANDOM.2025.39,
  author =	{Gilboa, Dar and Jain, Siddhartha and McClean, Jarrod R.},
  title =	{{Consumable Data via Quantum Communication}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{39:1--39:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.39},
  URN =		{urn:nbn:de:0030-drops-244059},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.39},
  annote =	{Keywords: quantum communication, one-time programs, data markets}
}
Document
RANDOM
Fooling Near-Maximal Decision Trees

Authors: William M. Hoza and Zelin Lv

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)


Abstract
For any constant α > 0, we construct an explicit pseudorandom generator (PRG) that fools n-variate decision trees of size m with error ε and seed length (1 + α) ⋅ log₂ m + O(log(1/ε) + log log n). For context, one can achieve seed length (2 + o(1)) ⋅ log₂ m + O(log(1/ε) + log log n) using well-known constructions and analyses of small-bias distributions, but such a seed length is trivial when m ≥ 2^{n/2}. Our approach is to develop a new variant of the classic concept of almost k-wise independence, which might be of independent interest. We say that a distribution X over {0, 1}ⁿ is k-wise ε-probably uniform if every Boolean function f that depends on only k variables satisfies 𝔼[f(X)] ≥ (1 - ε) ⋅ 𝔼[f]. We show how to sample a k-wise ε-probably uniform distribution using a seed of length (1 + α) ⋅ k + O(log(1/ε) + log log n). Meanwhile, we also show how to construct a set H ⊆ 𝔽₂ⁿ such that every feasible system of k linear equations in n variables over 𝔽₂ has a solution in H. The cardinality of H and the time complexity of enumerating H are at most 2^{k + o(k) + polylog n}, whereas small-bias distributions would give a bound of 2^{2k + O(log(n/k))}. By combining our new constructions with work by Chen and Kabanets (TCS 2016), we obtain nontrivial PRGs and hitting sets for linear-size Boolean circuits. Specifically, we get an explicit PRG with seed length (1 - Ω(1)) ⋅ n that fools circuits of size 2.99 ⋅ n over the U₂ basis, and we get a hitting set with time complexity 2^{(1 - Ω(1)) ⋅ n} for circuits of size 2.49 ⋅ n over the B₂ basis.

Cite as

William M. Hoza and Zelin Lv. Fooling Near-Maximal Decision Trees. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 35:1-35:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{hoza_et_al:LIPIcs.APPROX/RANDOM.2025.35,
  author =	{Hoza, William M. and Lv, Zelin},
  title =	{{Fooling Near-Maximal Decision Trees}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{35:1--35:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.35},
  URN =		{urn:nbn:de:0030-drops-244019},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.35},
  annote =	{Keywords: almost k-wise independence, decision trees, pseudorandom generators}
}
Document
A Min-Entropy Approach to Multi-Party Communication Lower Bounds

Authors: Mi-Ying (Miryam) Huang, Xinyu Mao, Shuo Wang, Guangxu Yang, and Jiapeng Zhang

Published in: LIPIcs, Volume 339, 40th Computational Complexity Conference (CCC 2025)


Abstract
Information complexity is one of the most powerful techniques to prove information-theoretical lower bounds, in which Shannon entropy plays a central role. Though Shannon entropy has some convenient properties, such as the chain rule, it still has inherent limitations. One of the most notable barriers is the square-root loss, which appears in the square-root gap between entropy gaps and statistical distances, e.g., Pinsker’s inequality. To bypass this barrier, we introduce a new method based on min-entropy analysis. Building on this new method, we prove the following results. - An Ω(N^{∑_i α_i - max_i {α_i}}/k) randomized communication lower bound of the k-party set-intersection problem where the i-th party holds a random set of size ≈ N^{1-α_i}. - A tight Ω(n/k) randomized lower bound of the k-party Tree Pointer Jumping problems, improving an Ω(n/k²) lower bound by Chakrabarti, Cormode, and McGregor (STOC 08). - An Ω(n/k+√n) lower bound of the Chained Index problem, improving an Ω(n/k²) lower bound by Cormode, Dark, and Konrad (ICALP 19). Since these problems served as hard problems for numerous applications in streaming lower bounds and cryptography, our new lower bounds directly improve these streaming lower bounds and cryptography lower bounds. On the technical side, min-entropy does not have nice properties such as the chain rule. To address this issue, we enhance the structure-vs-pseudorandomness decomposition used by Göös, Pitassi, and Watson (FOCS 17) and Yang and Zhang (STOC 24); both papers used this decomposition to prove communication lower bounds. In this paper, we give a new breath to this method in the multi-party setting, presenting a new toolkit for proving multi-party communication lower bounds.

Cite as

Mi-Ying (Miryam) Huang, Xinyu Mao, Shuo Wang, Guangxu Yang, and Jiapeng Zhang. A Min-Entropy Approach to Multi-Party Communication Lower Bounds. In 40th Computational Complexity Conference (CCC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 339, pp. 33:1-33:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.CCC.2025.33,
  author =	{Huang, Mi-Ying (Miryam) and Mao, Xinyu and Wang, Shuo and Yang, Guangxu and Zhang, Jiapeng},
  title =	{{A Min-Entropy Approach to Multi-Party Communication Lower Bounds}},
  booktitle =	{40th Computational Complexity Conference (CCC 2025)},
  pages =	{33:1--33:29},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-379-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{339},
  editor =	{Srinivasan, Srikanth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2025.33},
  URN =		{urn:nbn:de:0030-drops-237273},
  doi =		{10.4230/LIPIcs.CCC.2025.33},
  annote =	{Keywords: communication complexity, lifting theorems, set intersection, chained index}
}
Document
Survey
Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1


Abstract
Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}
Document
Private Estimation When Data and Privacy Demands Are Correlated

Authors: Syomantak Chaudhuri and Thomas A. Courtade

Published in: LIPIcs, Volume 329, 6th Symposium on Foundations of Responsible Computing (FORC 2025)


Abstract
Differential Privacy (DP) is the current gold-standard for ensuring privacy for statistical queries. Estimation problems under DP constraints appearing in the literature have largely focused on providing equal privacy to all users. We consider the problems of empirical mean estimation for univariate data and frequency estimation for categorical data, both subject to heterogeneous privacy constraints. Each user, contributing a sample to the dataset, is allowed to have a different privacy demand. The dataset itself is assumed to be worst-case and we study both problems under two different formulations - first, where privacy demands and data may be correlated, and second, where correlations are weakened by random permutation of the dataset. We establish theoretical performance guarantees for our proposed algorithms, under both PAC error and mean-squared error. These performance guarantees translate to minimax optimality in several instances, and experiments confirm superior performance of our algorithms over other baseline techniques.

Cite as

Syomantak Chaudhuri and Thomas A. Courtade. Private Estimation When Data and Privacy Demands Are Correlated. In 6th Symposium on Foundations of Responsible Computing (FORC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 329, pp. 3:1-3:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{chaudhuri_et_al:LIPIcs.FORC.2025.3,
  author =	{Chaudhuri, Syomantak and Courtade, Thomas A.},
  title =	{{Private Estimation When Data and Privacy Demands Are Correlated}},
  booktitle =	{6th Symposium on Foundations of Responsible Computing (FORC 2025)},
  pages =	{3:1--3:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-367-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{329},
  editor =	{Bun, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2025.3},
  URN =		{urn:nbn:de:0030-drops-231305},
  doi =		{10.4230/LIPIcs.FORC.2025.3},
  annote =	{Keywords: Differential Privacy, Personalized Privacy, Heterogeneous Privacy, Correlations in Privacy}
}
Document
Vision
Towards Ordinal Data Science

Authors: Gerd Stumme, Dominik Dürrschnabel, and Tom Hanika

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Order is one of the main instruments to measure the relationship between objects in (empirical) data. However, compared to methods that use numerical properties of objects, the amount of ordinal methods developed is rather small. One reason for this is the limited availability of computational resources in the last century that would have been required for ordinal computations. Another reason - particularly important for this line of research - is that order-based methods are often seen as too mathematically rigorous for applying them to real-world data. In this paper, we will therefore discuss different means for measuring and ‘calculating’ with ordinal structures - a specific class of directed graphs - and show how to infer knowledge from them. Our aim is to establish Ordinal Data Science as a fundamentally new research agenda. Besides cross-fertilization with other cornerstone machine learning and knowledge representation methods, a broad range of disciplines will benefit from this endeavor, including, psychology, sociology, economics, web science, knowledge engineering, scientometrics.

Cite as

Gerd Stumme, Dominik Dürrschnabel, and Tom Hanika. Towards Ordinal Data Science. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 6:1-6:39, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{stumme_et_al:TGDK.1.1.6,
  author =	{Stumme, Gerd and D\"{u}rrschnabel, Dominik and Hanika, Tom},
  title =	{{Towards Ordinal Data Science}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{6:1--6:39},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.6},
  URN =		{urn:nbn:de:0030-drops-194801},
  doi =		{10.4230/TGDK.1.1.6},
  annote =	{Keywords: Order relation, data science, relational theory of measurement, metric learning, general algebra, lattices, factorization, approximations and heuristics, factor analysis, visualization, browsing, explainability}
}
Document
Track A: Algorithms, Complexity and Games
The Communication Complexity of Set Intersection Under Product Distributions

Authors: Rotem Oshman and Tal Roth

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)


Abstract
We consider a multiparty setting where k parties have private inputs X_1,…,X_k ⊆ [n] and wish to compute the intersection ⋂_{𝓁 =1}^k X_𝓁 of their sets, using as little communication as possible. This task generalizes the well-known problem of set disjointness, where the parties are required only to determine whether the intersection is empty or not. In the worst-case, it is known that the communication complexity of finding the intersection is the same as that of solving set disjointness, regardless of the size of the intersection: the cost of both problems is Ω(n log k + k) bits in the shared blackboard model, and Ω (nk) bits in the coordinator model. In this work we consider a realistic setting where the parties' inputs are independent of one another, that is, the input is drawn from a product distribution. We show that this makes finding the intersection significantly easier than in the worst-case: only Θ̃((n^{1-1/k} (H(S) + 1)^{1/k}) + k) bits of communication are required, where {H}(S) is the Shannon entropy of the intersection S. We also show that the parties do not need to know the exact underlying input distribution; if we are given in advance O(n^{1/k}) samples from the underlying distribution μ, we can learn enough about μ to allow us to compute the intersection of an input drawn from μ using expected communication Θ̃((n^{1-1/k}𝔼[|S|]^{1/k}) + k), where |S| is the size of the intersection.

Cite as

Rotem Oshman and Tal Roth. The Communication Complexity of Set Intersection Under Product Distributions. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 95:1-95:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{oshman_et_al:LIPIcs.ICALP.2023.95,
  author =	{Oshman, Rotem and Roth, Tal},
  title =	{{The Communication Complexity of Set Intersection Under Product Distributions}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{95:1--95:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.95},
  URN =		{urn:nbn:de:0030-drops-181472},
  doi =		{10.4230/LIPIcs.ICALP.2023.95},
  annote =	{Keywords: Communication complexity, intersection, set disjointness}
}
Document
RANDOM
Candidate Tree Codes via Pascal Determinant Cubes

Authors: Inbar Ben Yaacov, Gil Cohen, and Anand Kumar Narayanan

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)


Abstract
Tree codes are combinatorial structures introduced by Schulman [Schulman, 1993] as key ingredients in interactive coding schemes. Asymptotically-good tree codes are long known to exist, yet their explicit construction remains a notoriously hard open problem. Even proposing a plausible construction, without the burden of proof, is difficult and the defining tree code property requires structure that remains elusive. To the best of our knowledge, only one candidate appears in the literature, due to Moore and Schulman [Moore and Schulman, 2014]. We put forth a new candidate for an explicit asymptotically-good tree code. Our construction is an extension of the vanishing rate tree code by Cohen-Haeupler-Schulman [Cohen et al., 2018], and its correctness relies on a conjecture that we introduce on certain Pascal determinants indexed by the points of the Boolean hypercube. Furthermore, using the vanishing distance tree code by Gelles et al. [Gelles et al., 2016] enables us to present a construction that relies on an even weaker assumption. We furnish evidence supporting our conjecture through numerical computation, combinatorial arguments from planar path graphs and based on well-studied heuristics from arithmetic geometry.

Cite as

Inbar Ben Yaacov, Gil Cohen, and Anand Kumar Narayanan. Candidate Tree Codes via Pascal Determinant Cubes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 54:1-54:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{benyaacov_et_al:LIPIcs.APPROX/RANDOM.2021.54,
  author =	{Ben Yaacov, Inbar and Cohen, Gil and Narayanan, Anand Kumar},
  title =	{{Candidate Tree Codes via Pascal Determinant Cubes}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{54:1--54:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.54},
  URN =		{urn:nbn:de:0030-drops-147474},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.54},
  annote =	{Keywords: Tree codes, Sparse polynomials, Explicit constructions}
}
  • Refine by Type
  • 11 Document/PDF
  • 8 Document/HTML

  • Refine by Publication Year
  • 1 2026
  • 7 2025
  • 2 2023
  • 1 2021

  • Refine by Author
  • 1 Ben Yaacov, Inbar
  • 1 Chabot, Yoan
  • 1 Chaudhuri, Syomantak
  • 1 Cohen, Gil
  • 1 Couceiro, Miguel
  • Show More...

  • Refine by Series/Journal
  • 9 LIPIcs
  • 2 TGDK

  • Refine by Classification
  • 4 Theory of computation → Communication complexity
  • 2 Theory of computation → Error-correcting codes
  • 1 Computing methodologies → Algebraic algorithms
  • 1 Computing methodologies → Boolean algebra algorithms
  • 1 Computing methodologies → Inductive logic learning
  • Show More...

  • Refine by Keyword
  • 2 Differential Privacy
  • 2 communication complexity
  • 1 Arboricity
  • 1 Communication complexity
  • 1 Continual Release
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail