Dagstuhl Seminar Proceedings, Volume 7161
Dagstuhl Seminar Proceedings
DagSemProc
https://www.dagstuhl.de/dagpub/1862-4405
https://dblp.org/db/series/dagstuhl
1862-4405
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
7161
2008
https://drops.dagstuhl.de/entities/volume/DagSemProc-volume-7161
07161 Abstracts Collection – Probabilistic, Logical and Relational Learning - A Further Synthesis
From April 14–20, 2007, the Dagstuhl Seminar 07161 "Probabilistic, Logical and Relational Learning - A Further Synthesis" was held
in the International Conference and Research Center (IBFI),
Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Artificial Intelligence
Uncertainty in AI
Probabilistic Reasoning
Knowledge Representation
Logic Programming
Relational Learning
Inductive Logic Programming
Graphical Models
Statistical Relational Learning
First-Order Logical and Relational Probabilistic Languages
1-21
Regular Paper
Luc
De Raedt
Luc De Raedt
Thomas
Dietterich
Thomas Dietterich
Lise
Getoor
Lise Getoor
Kristian
Kersting
Kristian Kersting
Stephen H.
Muggleton
Stephen H. Muggleton
10.4230/DagSemProc.07161.1
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
A general framework for unsupervised processing of structured data
We propose a general framework for unsupervised recurrent and recursive networks. This proposal covers various popular approaches such as standard self-organizing maps (SOM), temporal Kohonen maps, recursive SOM, and SOM for structured data. We define Hebbian learning within this general framework. We show how approaches based on an energy function, like neural gas, can be transferred to this abstract framework, so that proposals for new learning algorithms emerge.
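As a concrete point of reference, here is a minimal Python sketch of the standard (non-recursive) SOM special case covered by the framework; the grid size, decay schedules, and Gaussian neighborhood function are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid_w=4, grid_h=4, epochs=20, lr=0.5, sigma=1.5):
    """Train a standard Kohonen SOM, one instance of the general framework."""
    n_units = grid_w * grid_h
    dim = data.shape[1]
    # grid coordinates of each unit, used by the neighborhood function
    coords = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)], float)
    weights = rng.normal(size=(n_units, dim))
    for t in range(epochs):
        # linearly decay learning rate and neighborhood width over time
        a = lr * (1 - t / epochs)
        s = sigma * (1 - t / epochs) + 0.1
        for x in data:
            # best-matching unit: the prototype closest to the input
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood on the grid around the winner
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2 * s * s))
            # Hebbian (winner-take-most) update toward the input
            weights += a * h[:, None] * (x - weights)
    return weights

data = rng.normal(size=(100, 2))
w = train_som(data)
```

The recursive variants discussed in the abstract extend this update with a context term; the Hebbian step shown here (move the winner and its grid neighbors toward the input) is the common core.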
Relational clustering
median clustering
recursive SOM models
kernel SOM
1-6
Regular Paper
Barbara
Hammer
Barbara Hammer
Alessio
Micheli
Alessio Micheli
Alessandro
Sperduti
Alessandro Sperduti
10.4230/DagSemProc.07161.2
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Exploiting prior knowledge in Intelligent Assistants - Combining relational models with hierarchies
Statistical relational models have been successfully used to model
static probabilistic relationships between the entities of the domain.
In this talk, we illustrate their use in a dynamic decision-theoretic
setting where the task is to assist a user by inferring his intentional
structure and taking appropriate assistive actions. We show that the
statistical relational models can be used to succinctly express the
system's prior knowledge about the user's goal-subgoal structure and
tune it with experience. As the system is better able to predict the
user's goals, it improves the effectiveness of its assistance. We show
through experiments that both the hierarchical structure of the goals
and the parameter sharing facilitated by relational models significantly
improve the learning speed.
Statistical Relational Learning
Intelligent Assistants
1-2
Regular Paper
Sriraam
Natarajan
Sriraam Natarajan
Prasad
Tadepalli
Prasad Tadepalli
Alan
Fern
Alan Fern
10.4230/DagSemProc.07161.3
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Learning Probabilistic Relational Dynamics for Multiple Tasks
The ways in which an agent's actions affect the world can often be modeled compactly using a set of relational probabilistic planning rules. This extended abstract addresses the problem of learning such rule sets for multiple related tasks. We take a hierarchical Bayesian approach, in which the system learns a prior distribution over rule sets. We present a class of prior distributions parameterized by a rule set prototype that is stochastically modified to produce a task-specific rule set. We also describe a coordinate ascent algorithm that iteratively optimizes the task-specific rule sets and the prior distribution. Experiments using this algorithm show that transferring information from related tasks significantly reduces the amount of training data required to predict action effects in blocks-world domains.
Hierarchical Bayesian models
transfer learning
multi-task learning
probabilistic planning rules
1-10
Regular Paper
Ashwin
Deshpande
Ashwin Deshpande
Brian
Milch
Brian Milch
Luke S.
Zettlemoyer
Luke S. Zettlemoyer
Leslie Pack
Kaelbling
Leslie Pack Kaelbling
10.4230/DagSemProc.07161.4
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Logical Particle Filtering
In this paper, we consider the problem of
filtering in relational hidden Markov models.
We present a compact representation for such models
and an associated logical particle filtering
algorithm. Each particle contains a logical
formula that describes a set of states.
The algorithm updates the formulae as new
observations are received.
Since a single particle tracks many states, this filter
can be more accurate than a traditional particle filter
in high-dimensional state spaces, as we demonstrate
in experiments.
Particle filter
logical hidden Markov model
1-14
Regular Paper
Luke S.
Zettlemoyer
Luke S. Zettlemoyer
Hanna M.
Pasula
Hanna M. Pasula
Leslie
Pack Kaelbling
Leslie Pack Kaelbling
10.4230/DagSemProc.07161.5
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Markov Logic in Infinite Domains
Markov logic combines logic and probability by attaching weights to
first-order formulas, and viewing them as templates for features of Markov
networks. Unfortunately, in its original formulation it does not have the
full power of first-order logic, because it applies only to finite domains.
Recently, we have extended Markov logic to infinite domains, by casting it
in the framework of Gibbs measures. In this talk I will summarize our main
results to date, including sufficient conditions for the existence and
uniqueness of a Gibbs measure consistent with an infinite MLN, and
properties of the set of consistent measures in the non-unique case.
(Many important phenomena, like phase transitions, are modeled by
non-unique MLNs.) Under the conditions for existence, we have extended
to infinite domains the result in Richardson and Domingos (2006) that
first-order logic is the limiting case of Markov logic when all weights
tend to infinity. I will also discuss some fundamental limitations of
Herbrand interpretations (and representations based on them) for
probabilistic modeling of infinite domains, and how to get around them.
Finally, I will discuss some of the surprising insights for learning
and inference in large finite domains that result from considering the
infinite limit.
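For readers unfamiliar with the finite formulation that this talk generalizes, here is a toy Python sketch of how a Markov logic network assigns probabilities to possible worlds, P(x) ∝ exp(Σ_i w_i n_i(x)); the formula, constants, and weight are invented for illustration.

```python
import itertools
import math

# Tiny finite MLN: one formula, Smokes(x) -> Cancer(x), with weight w.
people = ["anna", "bob"]
w = 1.5

def n_true_groundings(world):
    """Count the groundings of Smokes(x) -> Cancer(x) satisfied in a world,
    where a world assigns True/False to each ground atom."""
    return sum(1 for p in people
               if (not world[("Smokes", p)]) or world[("Cancer", p)])

# Enumerate every possible world over the four ground atoms.
atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in people]
worlds = [dict(zip(atoms, vals))
          for vals in itertools.product([False, True], repeat=len(atoms))]

# Unnormalized weight exp(w * n_i) per world, normalized over all worlds.
weights = [math.exp(w * n_true_groundings(wld)) for wld in worlds]
Z = sum(weights)
probs = [wt / Z for wt in weights]
```

Raising w concentrates probability mass on the worlds that satisfy every grounding, which illustrates the limiting-case result mentioned above: first-order logic is recovered as all weights tend to infinity.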
Markov logic networks
Gibbs measures
first-order logic
infinite probabilistic models
Markov networks
1-16
Regular Paper
Pedro
Domingos
Pedro Domingos
Parag
Singla
Parag Singla
10.4230/DagSemProc.07161.6
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Model equivalence of PRISM programs
The problem of deciding the probability model equivalence of two
PRISM programs is addressed. In the finite case this problem can be
solved (albeit slowly) using techniques from algebraic
statistics, specifically the computation of elimination ideals
and Gröbner bases. A very brief introduction to algebraic
statistics is given. Consideration is given to cases where shortcuts
to proving/disproving model equivalence are available.
PRISM programs
model equivalence
model inclusion
algebraic statistics
algebraic geometry
ideals
varieties
Gröbner bases
polynomials
1-21
Regular Paper
James
Cussens
James Cussens
10.4230/DagSemProc.07161.7
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
On classification, ranking, and probability estimation
Given a binary classification task, a ranker is an algorithm that can sort a set of instances from highest to lowest expectation that the instance is positive. In contrast to a classifier, a ranker does not output class predictions – although it can be turned into a classifier with the help of an additional procedure that splits the ranked list into two. A straightforward way to compute rankings is to train a scoring classifier to assign numerical scores to instances, for example the predicted odds that an instance is positive. However, rankings can be computed without scores, as we demonstrate in this paper. We propose a lexicographic ranker, LexRank, whose rankings are derived not from scores but from a simple ranking of attribute values obtained from the training data. Although various metrics can be used, we show that by using the odds ratio to rank the attribute values we obtain a ranker that is conceptually close to the naive Bayes classifier, in the sense that for every instance of LexRank there exists an instance of naive Bayes that achieves the same ranking. However, the reverse is not true, which means that LexRank is more biased than naive Bayes. We systematically develop the relationships and differences between classification, ranking, and probability estimation, which leads to a novel connection between the Brier score and ROC curves. Combining LexRank with isotonic regression, which derives probability estimates from the ROC convex hull, results in the lexicographic probability estimator LexProb.
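As an illustration of the idea, here is a rough Python sketch of a lexicographic ranker in the spirit of LexRank; the Laplace smoothing and the attribute-ordering criterion (best odds ratio among an attribute's values) are our assumptions, not necessarily the exact choices made in the paper.

```python
from collections import defaultdict

def fit_lexrank(X, y, alpha=1.0):
    """Score each (attribute, value) pair by a smoothed odds ratio with the
    positive class, then order attributes by the strongest score they contain."""
    pos = sum(y)
    neg = len(y) - pos
    n_attrs = len(X[0])
    scores = {}
    for a in range(n_attrs):
        counts = defaultdict(lambda: [alpha, alpha])  # value -> [pos, neg], smoothed
        for xi, yi in zip(X, y):
            counts[xi[a]][0 if yi else 1] += 1
        for v, (p, n) in counts.items():
            scores[(a, v)] = (p / (pos + 2 * alpha)) / (n / (neg + 2 * alpha))
    # attribute importance: best odds ratio among its values (an assumption here)
    attr_order = sorted(range(n_attrs),
                        key=lambda a: max(s for (b, v), s in scores.items() if b == a),
                        reverse=True)
    return attr_order, scores

def rank(X, attr_order, scores):
    """Sort instances lexicographically: most important attribute first,
    comparing the odds-ratio scores of each instance's attribute values."""
    key = lambda x: tuple(scores.get((a, x[a]), 1.0) for a in attr_order)
    return sorted(X, key=key, reverse=True)

# Tiny example: attribute 0 is strongly predictive of the positive class.
X = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0)]
y = [1, 1, 1, 0, 0, 0]
attr_order, scores = fit_lexrank(X, y)
ranked = rank([(1, 1), (0, 0), (1, 0), (0, 1)], attr_order, scores)
```

Because the final ordering compares the per-attribute odds-ratio scores lexicographically rather than multiplying them, any such ranking can be reproduced by a naive Bayes model with sufficiently separated weights, but not conversely, in line with the bias relationship described above.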
Ranking
probability estimation
ROC analysis
calibration
1-10
Regular Paper
Peter
Flach
Peter Flach
Edson
Matsubara
Edson Matsubara
10.4230/DagSemProc.07161.8
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Structural Sampling for Statistical Software Testing
Structural Statistical Software Testing exploits the control flow
graph of the program being tested to construct test cases.
While test cases can easily be extracted from feasible paths in the control
flow graph, that is, paths which are actually exercised for some
values of the program input, the feasible path region is a tiny fraction
of the graph paths (less than $10^{-5}$ for medium-size programs).
The S4T algorithm presented in this paper aims to address this limitation;
as an Active Relational Learning Algorithm, it uses the few feasible paths
initially available to sample new feasible paths. The difficulty comes
from the non-Markovian nature of the feasible path concept, due to the
long-range dependencies between the nodes in the control flow graph.
Experimental validation on real-world and artificial problems
demonstrates significant improvements compared to the state of the art.
Active Relational Learning
Software Testing
Autonomic Computing
Parikh Maps
1-13
Regular Paper
Nicolas
Baskiotis
Nicolas Baskiotis
Michele
Sebag
Michele Sebag
10.4230/DagSemProc.07161.9
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode
Variational Bayes via Propositionalization
We propose a unified approach to VB (variational Bayes) in
symbolic-statistical modeling via propositionalization.
By propositionalization we mean, broadly, expressing and
computing probabilistic models such as BNs (Bayesian
networks) and PCFGs (probabilistic context free grammars)
in terms of propositional logic, treating
propositional variables as binary random variables.
Our proposal is motivated by three observations. The
first one is that PPC (propositionalized probability
computation), i.e. probability computation formalized in
a propositional setting, has turned out to be general and
efficient when variable values are sparsely
interdependent. Examples include (discrete) BNs, PCFGs
and, more generally, PRISM, a Turing-complete logic
programming language with EM learning ability that we have
been developing, which computes probabilities using
graphically represented AND/OR Boolean formulas. The
efficiency of PPC is classically demonstrated by the
Inside-Outside algorithm in
the case of PCFGs and by recent PPC approaches in the case
of BNs such as the one by Darwiche et al. that exploits
$0$ probability and CSI (context specific independence).
Dechter et al. also showed that PPC is a general
computation scheme for BNs by their formulation of AND/OR
search spaces.
Second, while VB has been around for some time as a
practically effective approach to Bayesian modeling, its
use is still largely restricted to simple models such as
BNs and HMMs (hidden Markov models) though its usefulness
is established through a variety of applications from
model selection to prediction. On the other hand, it has
already been proved that VB can be extended to PCFGs and
implemented efficiently using dynamic programming. Note
that PCFGs are just one class of PPC, and much more general
PPC is realized by PRISM. Accordingly, if VB is extended to
PRISM's PPC, we will obtain VB for a class of probabilistic
models far wider than BNs and PCFGs.
The last observation is that once VB becomes available in
PRISM, it will save a great deal of time and effort. First, we do
not have to derive a new VB algorithm from scratch for
each model and implement it. All we have to do is
write a probabilistic model at the predicate level; the rest
of the work is carried out automatically in a unified
manner by the PRISM system, as already happens in the case of EM
learning. Deriving and implementing a VB algorithm is a
tedious, error-prone process, and ensuring its correctness
would be difficult beyond PCFGs without formal semantics.
PRISM augmented with VB will completely eliminate such
needs and make it easy to explore and test new Bayesian
models by helping the user cope with data sparseness and
avoid over-fitting.
Variational Bayes
propositionalized probability computation
PRISM
1-8
Regular Paper
Taisuke
Sato
Taisuke Sato
Yoshitaka
Kameya
Yoshitaka Kameya
Kenichi
Kurihara
Kenichi Kurihara
10.4230/DagSemProc.07161.10
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode