**Published in:** LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

The number of quantifiers needed to express first-order (FO) properties is captured by two-player combinatorial games called multi-structural games. We analyze these games on binary strings with an ordering relation, using a technique we call parallel play, which significantly reduces the number of quantifiers needed in many cases. Ordered structures such as strings have historically been notoriously difficult to analyze in the context of these and similar games. Nevertheless, in this paper, we provide essentially tight bounds on the number of quantifiers needed to characterize different-sized subsets of strings. The results immediately give bounds on the number of quantifiers necessary to define several different classes of Boolean functions. One of our results is analogous to Lupanov’s upper bounds on circuit size and formula size in propositional logic: we show that every Boolean function on n-bit inputs can be defined by an FO sentence having (1+ε)n/log(n) + O(1) quantifiers, and that this is essentially tight. We reduce this number to (1 + ε)log(n) + O(1) when the Boolean function in question is sparse.

Marco Carmosino, Ronald Fagin, Neil Immerman, Phokion G. Kolaitis, Jonathan Lenchner, and Rik Sengupta. On the Number of Quantifiers Needed to Define Boolean Functions. In 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 306, pp. 34:1-34:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

**Published in:** LIPIcs, Volume 290, 27th International Conference on Database Theory (ICDT 2024)

A query algorithm based on homomorphism counts is a procedure for determining whether a given instance satisfies a property by counting homomorphisms between the given instance and finitely many predetermined instances. In a left query algorithm, we count homomorphisms from the predetermined instances to the given instance, while in a right query algorithm we count homomorphisms from the given instance to the predetermined instances. Homomorphisms are usually counted over the semiring ℕ of non-negative integers; it is also meaningful, however, to count homomorphisms over the Boolean semiring 𝔹, in which case the homomorphism count indicates whether or not a homomorphism exists. We first characterize the properties that admit a left query algorithm over 𝔹 by showing that these are precisely the properties that are both first-order definable and closed under homomorphic equivalence. After this, we turn attention to a comparison between left query algorithms over 𝔹 and left query algorithms over ℕ. In general, there are properties that admit a left query algorithm over ℕ but not over 𝔹. The main result of this paper asserts that if a property is closed under homomorphic equivalence, then that property admits a left query algorithm over 𝔹 if and only if it admits a left query algorithm over ℕ. In other words and rather surprisingly, homomorphism counts over ℕ do not help as regards properties that are closed under homomorphic equivalence. Finally, we characterize the properties that admit both a left query algorithm over 𝔹 and a right query algorithm over 𝔹.
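
As a rough illustration of the definitions in this abstract (not code from the paper), the sketch below counts homomorphisms by brute force and uses a single predetermined instance, a triangle, as a left query. The graph encoding, the choice of the triangle, and all function names are assumptions made for this example.

```python
from itertools import product

def count_homs(src_nodes, src_edges, dst_nodes, dst_edges):
    """Count maps h: src_nodes -> dst_nodes that send every src edge to a dst edge."""
    dst_edge_set = set(dst_edges)
    src_nodes = list(src_nodes)
    count = 0
    # Brute force: try every assignment of source nodes to target nodes.
    for image in product(dst_nodes, repeat=len(src_nodes)):
        h = dict(zip(src_nodes, image))
        if all((h[u], h[v]) in dst_edge_set for u, v in src_edges):
            count += 1
    return count

def symmetric(edges):
    """Encode undirected edges as pairs of directed edges."""
    return edges + [(v, u) for u, v in edges]

# Hypothetical predetermined instance: an undirected triangle.
TRIANGLE_NODES = [0, 1, 2]
TRIANGLE_EDGES = symmetric([(0, 1), (1, 2), (2, 0)])

def left_count_N(nodes, edges):
    """Left query over the natural numbers: number of homomorphisms triangle -> instance."""
    return count_homs(TRIANGLE_NODES, TRIANGLE_EDGES, nodes, edges)

def left_count_B(nodes, edges):
    """Left query over the Boolean semiring: does any homomorphism triangle -> instance exist?"""
    return left_count_N(nodes, edges) > 0
```

On an undirected 4-cycle, `left_count_N` returns 0 and `left_count_B` returns `False`; on the complete graph with four vertices the count is 24. The property "admits a homomorphism from the triangle" is closed under homomorphic equivalence, so here the Boolean answer alone decides it.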

Balder ten Cate, Victor Dalmau, Phokion G. Kolaitis, and Wei-Lin Wu. When Do Homomorphism Counts Help in Query Algorithms?. In 27th International Conference on Database Theory (ICDT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 290, pp. 8:1-8:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

**Published in:** Dagstuhl Reports, Volume 12, Issue 7 (2023)

This report documents the program and the outcomes of Dagstuhl Seminar 22301 "Algorithmic Aspects of Information Theory".
Constraints on entropies constitute the "laws of information theory". These constraints go well beyond Shannon’s basic information inequalities, as they include not only information inequalities that cannot be derived from Shannon’s basic inequalities, but also conditional inequalities and disjunctive inequalities that are valid for all entropic functions. There is an extensive body of research on constraints on entropies and their applications to different areas of mathematics and computer science. So far, however, little progress has been made on the algorithmic aspects of information theory. In fact, even fundamental questions about the decidability of information inequalities and their variants have remained open to date.
Recently, research in different applications has demonstrated a clear need for algorithmic solutions to questions in information theory. These applications include: finding tight upper bounds on the answer to a query on a relational database, the homomorphism domination problem and its uses in query optimization, the conditional independence implication problem, soft constraints in databases, group-theoretic inequalities, and lower bounds on the information ratio in secret sharing. Thus far, the information-theory community has had little interaction with the communities where these applications have been studied or with the computational complexity community. The main goal of this Dagstuhl Seminar was to bring together researchers from the aforementioned communities and to develop an agenda for studying algorithmic aspects of information theory, motivated from a rich set of diverse applications. By using the algorithmic lens to examine the common problems and by transferring techniques from one community to the other, we expected that bridges would be created and some tangible progress on open questions could be made.

Phokion G. Kolaitis, Andrej E. Romashchenko, Milan Studený, Dan Suciu, and Tobias A. Boege. Algorithmic Aspects of Information Theory (Dagstuhl Seminar 22301). In Dagstuhl Reports, Volume 12, Issue 7, pp. 180-204, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

**Published in:** Dagstuhl Reports, Volume 12, Issue 2 (2022)

This report documents the program and the outcomes of Dagstuhl Seminar 22061 "Logic and Random Discrete Structures". The main topic of this seminar has been the analysis of large random discrete structures, such as trees, graphs, or permutations, from the perspective of mathematical logic. It has brought together both experts and junior researchers from a number of different areas where logic and random structures play a role, with the goal to establish new connections between such areas and to encourage interactions between foundational research and different application areas, including probabilistic databases.

Erich Grädel, Phokion G. Kolaitis, Marc Noy, and Matthias Naaf. Logic and Random Discrete Structures (Dagstuhl Seminar 22061). In Dagstuhl Reports, Volume 12, Issue 2, pp. 1-16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

**Published in:** LIPIcs, Volume 178, 27th International Symposium on Temporal Representation and Reasoning (TIME 2020)

During the past fifteen years, data exchange has been explored in depth and in a variety of different settings. Even though temporal databases constitute a mature area of research studied over several decades, the investigation of temporal data exchange was initiated only very recently. We analyze the properties of universal solutions in temporal data exchange with emphasis on the relationship between universal solutions in the context of concrete time and universal solutions in the context of abstract time. We show that challenges arise even in the setting in which the data exchange specifications involve a single temporal variable. After this, we identify settings, including data exchange settings that involve multiple temporal variables, in which these challenges can be overcome.

Zehui Cheng and Phokion G. Kolaitis. Universal Solutions in Temporal Data Exchange. In 27th International Symposium on Temporal Representation and Reasoning (TIME 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 178, pp. 8:1-8:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Track B: Automata, Logic, Semantics, and Theory of Programming

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Constraints on entropies are considered to be the laws of information theory. Even though the pursuit of their discovery has been a central theme of research in information theory, the algorithmic aspects of constraints on entropies remain largely unexplored. Here, we initiate an investigation of decision problems about constraints on entropies by placing several different such problems into levels of the arithmetical hierarchy. We establish the following results on checking the validity over all almost-entropic functions: first, validity of a Boolean information constraint arising from a monotone Boolean formula is co-recursively enumerable; second, validity of "tight" conditional information constraints is in Π⁰₃. Furthermore, under some restrictions, validity of conditional information constraints "with slack" is in Σ⁰₂, and validity of information inequality constraints involving max is Turing equivalent to validity of information inequality constraints (with no max involved). We also prove that the classical implication problem for conditional independence statements is co-recursively enumerable.
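
For readers unfamiliar with constraints on entropies, a standard example is Shannon's submodularity inequality, h(A ∪ B) + h(A ∩ B) ≤ h(A) + h(B). The sketch below is an illustration only and not taken from the paper: it computes the entropy of every marginal of a small joint distribution and checks that inequality numerically. The function names and the dictionary encoding of the distribution are assumptions of this example, and it works with an entropic function of a concrete distribution rather than with the almost-entropic functions the paper studies.

```python
from itertools import combinations
from math import log2

def marginal_entropy(joint, subset):
    """Shannon entropy (in bits) of the marginal on the coordinates in `subset`."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in subset)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def entropic_vector(joint, n):
    """Map every subset of {0, ..., n-1} to the entropy of the corresponding marginal."""
    return {
        frozenset(s): marginal_entropy(joint, sorted(s))
        for r in range(n + 1)
        for s in combinations(range(n), r)
    }

def satisfies_submodularity(h, tol=1e-9):
    """Check h(A | B) + h(A & B) <= h(A) + h(B) for all pairs of subsets in h."""
    subsets = list(h)
    return all(
        h[a | b] + h[a & b] <= h[a] + h[b] + tol
        for a in subsets
        for b in subsets
    )

# Example distribution: X and Y are independent fair bits and Z = X xor Y.
xor_joint = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}
h_xor = entropic_vector(xor_joint, 3)
```

Here `satisfies_submodularity(h_xor)` holds, as it must for any entropic function, whereas a hand-made function with h({0}) = h({1}) = 1 and h({0, 1}) = 3 fails the check.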

Mahmoud Abo Khamis, Phokion G. Kolaitis, Hung Q. Ngo, and Dan Suciu. Decision Problems in Information Theory. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 106:1-106:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

**Published in:** Dagstuhl Reports, Volume 9, Issue 9 (2020)

The goal of building truly intelligent systems has forever been a central problem in computer science. While logic-based approaches of yore have had their successes and failures, the era of machine learning, specifically deep learning, is also coming upon significant challenges. There is a growing consensus that the inductive reasoning and complex, high-dimensional pattern recognition capabilities of deep learning models need to be combined with symbolic (even programmatic), deductive capabilities traditionally developed in the logic and automated reasoning communities in order to achieve the next step towards building intelligent systems, including making progress at the frontier of hard problems such as explainable AI. However, these communities tend to be quite separate and interact only minimally, often at odds with each other on the subject of the "correct approach" to AI. This report documents the efforts of Dagstuhl Seminar 19361 on "Logic and Learning" to bring these communities together in order to: (i) bridge the research efforts between them and foster an exchange of ideas in order to create unified formalisms and approaches that bear the advantages of both research methodologies; (ii) review and analyse the progress made across both communities; (iii) understand the subtleties and difficulties involved in solving hard problems using both perspectives; (iv) make attempts towards a consensus on what the hard problems are and what the elements of good solutions to these problems would be.
The three focal points of the seminar were the strands of "Logic for Machine Learning", "Machine Learning for Logic", and "Logic vs. Machine Learning". The seminar format consisted of long and short talks, as well as breakout sessions. We summarise the motivations and proceedings of the seminar, and report on the abstracts of the talks and the results of the breakout sessions.

Michael Benedikt, Kristian Kersting, Phokion G. Kolaitis, and Daniel Neider. Logic and Learning (Dagstuhl Seminar 19361). In Dagstuhl Reports, Volume 9, Issue 9, pp. 1-22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

**Published in:** Dagstuhl Reports, Volume 9, Issue 1 (2019)

This report documents the programme and outcomes of Dagstuhl Seminar 19031 "Logics for Dependence and Independence". This seminar served as a follow-up seminar to the highly successful seminars "Dependence Logic: Theory and Applications" (13071) and "Logics for Dependence and Independence" (15261).
A key objective of the seminar was to bring together researchers working in dependence logic and in the application areas so that they can communicate state-of-the-art advances and embark on a systematic interaction. The goal was especially to reach those researchers who have recently started working in this thriving area as well as researchers working on several aspects of database theory, separation logic, and logics of uncertainty.

Erich Grädel, Phokion G. Kolaitis, Juha Kontinen, and Heribert Vollmer. Logics for Dependence and Independence (Dagstuhl Seminar 19031). In Dagstuhl Reports, Volume 9, Issue 1, pp. 28-46, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

**Published in:** Dagstuhl Reports, Volume 7, Issue 9 (2018)

This report documents the program and the outcomes of Dagstuhl Seminar 17361 "Finite and Algorithmic Model Theory".

Anuj Dawar, Erich Grädel, Phokion G. Kolaitis, and Thomas Schwentick. Finite and Algorithmic Model Theory (Dagstuhl Seminar 17361). In Dagstuhl Reports, Volume 7, Issue 9, pp. 1-25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Invited Talk

**Published in:** LIPIcs, Volume 82, 26th EACSL Annual Conference on Computer Science Logic (CSL 2017)

A schema mapping is a high-level specification of the relationship between two database schemas. For the past fifteen years, schema mappings have played an essential role in the modeling and analysis of important data inter-operability tasks, such as data exchange and data integration. Syntactically, schema mappings are expressed in some schema-mapping language, which, typically, is a fragment of first-order logic or second-order logic. In the first part of the talk, we will introduce the main schema-mapping languages, will discuss the fundamental structural properties of these languages, and will then use these structural properties to obtain characterizations of various schema-mapping languages in the spirit of abstract model theory. In the second part of the talk, we will examine schema mappings from a dynamic viewpoint by considering sequences of schema mappings and studying the convergence properties of such sequences. To this effect, we will introduce a metric space that is based on a natural notion of distance between sets of database instances and will investigate pointwise limits and uniform limits of sequences of schema mappings. Among other findings, it will turn out that the completion of this metric space can be described in terms of graph limits arising from converging sequences of homomorphism densities.

Phokion G. Kolaitis. Schema Mappings: Structural Properties and Limits (Invited Talk). In 26th EACSL Annual Conference on Computer Science Logic (CSL 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 82, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

**Published in:** LIPIcs, Volume 68, 20th International Conference on Database Theory (ICDT 2017)

We develop a unifying approach to declarative entity linking by introducing the notion of an entity linking framework and an accompanying notion of the certain links in such a framework. In an entity linking framework, logic-based constraints are used to express properties of the desired link relations in terms of source relations and, possibly, in terms of other link relations. The definition of the certain links in such a framework makes use of weighted repairs and consistent answers in inconsistent databases. We demonstrate the modeling capabilities of this approach by showing that numerous concrete entity linking scenarios can be cast as such entity linking frameworks for suitable choices of constraints and weights. By using the certain links as a measure of expressive power, we investigate the relative expressive power of several entity linking frameworks and obtain sharp comparisons. In particular, we show that we gain expressive power if we allow constraints that capture non-recursive collective entity resolution, where link relations may depend on other link relations (and not just on source relations). Moreover, we show that an increase in expressive power also takes place when we allow constraints that incorporate preferences as an additional mechanism for expressing "goodness" of links.

Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan. Expressive Power of Entity-Linking Frameworks. In 20th International Conference on Database Theory (ICDT 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 68, pp. 10:1-10:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

**Published in:** LIPIcs, Volume 62, 25th EACSL Annual Conference on Computer Science Logic (CSL 2016)

During the past decade, dependence logic has emerged as a formalism suitable for expressing and analyzing notions of dependence and independence that arise in different scientific areas. The sentences of dependence logic have the same expressive power as those of existential second-order logic, hence dependence logic captures NP on the class of all finite structures. In this paper, we identify a natural fragment of universal dependence logic and show that, in a precise sense, it captures constraint satisfaction. This tight connection between dependence logic and constraint satisfaction contributes to the descriptive complexity of constraint satisfaction and elucidates the expressive power of universal dependence logic.

Lauri Hella and Phokion G. Kolaitis. Dependence Logic vs. Constraint Satisfaction. In 25th EACSL Annual Conference on Computer Science Logic (CSL 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 62, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

**Published in:** LIPIcs, Volume 48, 19th International Conference on Database Theory (ICDT 2016)

Schema mappings have been extensively studied in the context of data exchange and data integration, where they have turned out to be the right level of abstraction for formalizing data inter-operability tasks. Up to now and for the most part, schema mappings have been studied as static objects, in the sense that each time the focus has been on a single schema mapping of interest or, in the case of composition, on a pair of schema mappings of interest.
In this paper, we adopt a dynamic viewpoint and embark on a study of sequences of schema mappings and of the limiting behavior of such sequences. To this effect, we first introduce a natural notion of distance on sets of finite target instances that expresses how "close" two sets of target instances are as regards the certain answers of conjunctive queries on these sets. Using this notion of distance, we investigate pointwise limits and uniform limits of sequences of schema mappings, as well as the companion notions of pointwise Cauchy and uniformly Cauchy sequences of schema mappings. We obtain a number of results about the limits of sequences of GAV schema mappings and the limits of sequences of LAV schema mappings that reveal striking differences between these two classes of schema mappings. We also consider the completion of the metric space of sets of target instances and obtain concrete representations of limits of sequences of schema mappings in terms of generalized schema mappings, i.e., schema mappings with infinite target instances as solutions to (finite) source instances.

Phokion G. Kolaitis, Reinhard Pichler, Emanuel Sallinger, and Vadim Savenkov. Limits of Schema Mappings. In 19th International Conference on Database Theory (ICDT 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 48, pp. 19:1-19:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

**Published in:** LIPIcs, Volume 31, 18th International Conference on Database Theory (ICDT 2015)

The aim of this paper is to introduce and develop a truly declarative framework for entity linking and, in particular, for entity resolution. As in some earlier approaches, our framework is based on the systematic use of constraints. However, the constraints we adopt are link-to-source constraints, unlike in earlier approaches where source-to-link constraints were used to dictate how to generate links. Our approach makes it possible to focus entirely on the intended properties of the outcome of entity linking, thus separating the constraints from any procedure of how to achieve that outcome. The core language consists of link-to-source constraints that specify the desired properties of a link relation in terms of source relations and built-in predicates such as similarity measures. A key feature of the link-to-source constraints is that they employ disjunction, which enables the declarative listing of all the reasons as to why two entities should be linked. We also consider extensions of the core language that capture collective entity resolution, by allowing inter-dependence between links.
We identify a class of "good" solutions for entity linking specifications, which we call maximum-value solutions and which capture the strength of a link by counting the reasons that justify it. We study natural algorithmic problems associated with these solutions, including the problem of enumerating the "good" solutions, and the problem of finding the certain links, which are the links that appear in every "good" solution. We show that these problems are tractable for the core language, but may become intractable once we allow inter-dependence between link relations. We also make some surprising connections between our declarative framework, which is deterministic, and probabilistic approaches such as ones based on Markov Logic Networks.

Douglas Burdick, Ronald Fagin, Phokion G. Kolaitis, Lucian Popa, and Wang-Chiew Tan. A Declarative Framework for Linking Entities. In 18th International Conference on Database Theory (ICDT 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 31, pp. 25-43, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Complete Volume

**Published in:** Dagstuhl Follow-Ups, Volume 5, Data Exchange, Integration, and Streams (2013)

DFU, Volume 5, Data Exchange, Integration, and Streams, Complete Volume

Data Exchange, Integration, and Streams. Dagstuhl Follow-Ups, Volume 5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)

**Published in:** Dagstuhl Follow-Ups, Volume 5, Data Exchange, Integration, and Streams (2013)

Frontmatter, Table of Contents, Preface

Data Exchange, Integration, and Streams. Dagstuhl Follow-Ups, Volume 5, pp. 0:i-0:x, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)

Invited Talk

**Published in:** LIPIcs, Volume 13, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011)

Schema mappings are high-level specifications that describe the relationship between two database schemas. Schema mappings are considered to be the essential building blocks in such critical data interoperability tasks as data exchange and data integration. For this reason, they have been the focus of extensive research investigations over the past several years. Since in real-life applications schema mappings can be quite complex, it is important to develop methods and tools for illustrating, explaining, and deriving schema mappings. A promising approach to this effect is to use “good” data examples that illustrate the schema mapping at hand.
In this talk, we present an overview of recent work on characterizing and deriving schema mappings via a finite set of data examples. We show that every LAV schema mapping (i.e., a schema mapping specified by a finite set of local-as-view tuple-generating dependencies) is uniquely characterized by a finite set of universal data examples with respect to the class of all LAV schema mappings. We also show that this type of result does not hold for arbitrary GAV schema mappings (i.e., schema mappings specified by a finite set of global-as-view tuple-generating dependencies). After this, we give a necessary and sufficient algorithmic condition for a GAV schema mapping to be uniquely characterizable by a finite set of universal examples with respect to the class of all GAV schema mappings. Along the way, we establish tight connections between unique characterizability of schema mappings and homomorphism dualities.
This is joint work with Bogdan Alexe (IBM Research - Almaden), Balder ten Cate (UC Santa Cruz), and Wang-Chiew Tan (UC Santa Cruz and IBM Research - Almaden) based on [1, 2, 3].

Phokion G. Kolaitis. Schema Mappings and Data Examples: Deriving Syntax from Semantics (Invited Talk). In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011). Leibniz International Proceedings in Informatics (LIPIcs), Volume 13, p. 25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)

**Published in:** Dagstuhl Seminar Proceedings, Volume 9441, The Constraint Satisfaction Problem: Complexity and Approximability (2010)

From 25th to 30th October 2009, the Dagstuhl Seminar 09441 "The Constraint Satisfaction Problem: Complexity and Approximability" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Andrei A. Bulatov, Martin Grohe, Phokion G. Kolaitis, and Andrei Krokhin. 09441 Abstracts Collection – The Constraint Satisfaction Problem: Complexity and Approximability. In The Constraint Satisfaction Problem: Complexity and Approximability. Dagstuhl Seminar Proceedings, Volume 9441, pp. 1-14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2010)

**Published in:** Dagstuhl Seminar Proceedings, Volume 9441, The Constraint Satisfaction Problem: Complexity and Approximability (2010)

The seminar brought together forty researchers from different highly advanced areas of constraint satisfaction and with complementary expertise (logical, algebraic, combinatorial, probabilistic aspects). The list of participants contained both senior and junior researchers and a small number of advanced graduate students.

Andrei A. Bulatov, Martin Grohe, Phokion G. Kolaitis, and Andrei Krokhin. 09441 Executive Summary – The Constraint Satisfaction Problem: Complexity and Approximability. In The Constraint Satisfaction Problem: Complexity and Approximability. Dagstuhl Seminar Proceedings, Volume 9441, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2010)

**Published in:** Dagstuhl Seminar Proceedings, Volume 6401, Complexity of Constraints (2006)

From 01.10.06 to 06.10.06, the Dagstuhl Seminar 06401 "Complexity of Constraints" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Nadia Creignou, Phokion Kolaitis, and Heribert Vollmer. 06401 Abstracts Collection – Complexity of Constraints. In Complexity of Constraints. Dagstuhl Seminar Proceedings, Volume 6401, pp. 1-14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

**Published in:** Dagstuhl Seminar Proceedings, Volume 6401, Complexity of Constraints (2006)

In this document we describe the original motivation and goals of the seminar as well as the sequence of talks given during the seminar.

Nadia Creignou, Phokion Kolaitis, and Heribert Vollmer. 06401 Executive Summary – Complexity of Constraints. In Complexity of Constraints. Dagstuhl Seminar Proceedings, Volume 6401, pp. 1-6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

