Structured Set Variable Domains in Bayesian Network Structure Learning

Trösser, Fulya; de Givry, Simon; Katsirelos, George

doi:10.4230/LIPIcs.CP.2022.37

Abstract

Constraint programming is a state of the art technique for learning the structure of Bayesian Networks from data (Bayesian Network Structure Learning - BNSL). However, scalability both for CP and other combinatorial optimization techniques for this problem is limited by the fact that the basic decision variables are set variables with domain sizes that may grow super polynomially with the number of random variables. Usual techniques for handling set variables in CP are not useful, as they lead to poor bounds. In this paper, we propose using decision trees as a data structure for storing sets of sets to represent set variable domains. We show that relatively simple operations are sufficient to implement all propagation and bounding algorithms, and that the use of these data structures improves scalability of a state of the art CP-based solver for BNSL.

Cite As Get BibTex

Fulya Trösser, Simon de Givry, and George Katsirelos. Structured Set Variable Domains in Bayesian Network Structure Learning. In 28th International Conference on Principles and Practice of Constraint Programming (CP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 235, pp. 37:1-37:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.CP.2022.37

Author Details

Fulya Trösser

Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France

Simon de Givry

Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France

George Katsirelos

Université Fédérale de Toulouse, ANITI, INRAE, MIA Paris, AgroParisTech, 75231 Paris, France

Funding

This work has been partly funded by the "Agence nationale de la Recherche" (ANR-16-CE40-0028 Demograph project and ANR-19-PIA3-0004 ANITI-DIL chair of Thomas Schiex).

Acknowledgements

Thanks to the GenoToul (Toulouse, France) Bioinformatics platform for computational support.

Supplementary Materials

Software (Source Code) https://gkatsi.github.io/elsa-cp22.tar.gz

References

Mark Bartlett and James Cussens. Integer linear programming for the bayesian network structure learning problem. Artificial Intelligence, pages 258-271, 2017.
Jeremias Berg, Matti Järvisalo, and Brandon Malone. Learning optimal bounded treewidth bayesian networks via maximum satisfiability. In Artificial Intelligence and Statistics, pages 86-95. PMLR, 2014.
Wray Buntine. Theory refinement on bayesian networks. In Proc. of UAI, pages 52-60. Elsevier, 1991.
David Maxwell Chickering. Learning bayesian networks is NP-Complete. In Proc. of Fifth Int. Workshop on Artificial Intelligence and Statistics (AISTATS), pages 121-130, Key West, Florida, USA, 1995. URL: https://doi.org/10.1007/978-1-4612-2404-4_12.
James Cussens, Matti Järvisalo, Janne H Korhonen, and Mark Bartlett. Bayesian network structure learning with integer programming: Polytopes, facets and complexity. Journal of Artificial Intelligence Research, 58:185-229, 2017.
Cassio P de Campos, Mauro Scanagatta, Giorgio Corani, and Marco Zaffalon. Entropy-based pruning for learning bayesian networks using BIC. Artificial Intelligence, 260:42-50, 2018.
Cassio Polpo de Campos and Qiang Ji. Properties of bayesian dirichlet scores to learn bayesian network structures. In Proc. of AAAI-10, Atlanta, Georgia, USA, 2010.
Xiannian Fan and Changhe Yuan. An improved lower bound for bayesian network structure learning. In Proc. of AAAI-15, Austin, Texas, 2015.
Carmen Gervet. Conjunto: Constraint logic programming with finite set domains. In Maurice Bruynooghe, editor, Logic Programming, Proceedings of the 1994 International Symposium, Ithaca, New York, USA, November 13-17, 1994, pages 339-358. MIT Press, 1994.
Carmen Gervet and Pascal Van Hentenryck. Length-lex ordering for set CSPs. In Proceedings of AAAI, 2006.
Peter Hawkins, Vitaly L. Lagoon, and Peter J. Stuckey. Solving set constraint satisfaction problems using ROBDDs. Journal of Artificial Intelligence Research, 24:109-156, 2005.
David Heckerman, Dan Geiger, and David M Chickering. Learning bayesian networks: The combination of knowledge and statistical data. Machine learning, 20(3):197-243, 1995.
Y. Lai, D. Liu, and S. Wang. Reduced ordered binary decision diagram with implied literals: A new knowledge compilation approach. Knowledge and Information Systems, 35(3):665-712, 2013.
Wai Lam and Fahiem Bacchus. Using new data to refine a bayesian network. In Proc. of UAI, pages 383-390, 1994.
Hyafil Laurent and Ronald L Rivest. Constructing optimal binary decision trees is NP-complete. Information processing letters, 5(1):15-17, 1976.
Colin Lee and Peter van Beek. An experimental analysis of anytime algorithms for bayesian network structure learning. In Advanced Methodologies for Bayesian Networks, pages 69-80, 2017.
J.R. Quinlan. Induction of decision trees. Machine Learning, 1:81-107, 1986.
Mauro Scanagatta, Cassio P de Campos, Giorgio Corani, and Marco Zaffalon. Learning bayesian networks with thousands of variables. Proc. of NeurIPS, 28:1864-1872, 2015.
Gideon Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461-464, 1978.
Tomi Silander and Petri Myllymäki. A simple approach for finding the globally optimal bayesian network structure. In Proc. of UAI'06, Cambridge, MA, USA, 2006.
Fulya Trösser, Simon de Givry, and George Katsirelos. Improved acyclicity reasoning for bayesian network structure learning with constraint programming. In Proceedings of IJCAI, pages 4250-4257, 2021.
Peter van Beek and Hella-Franziska Hoffmann. Machine learning of bayesian networks using constraint programming. In Proc. of International Conference on Principles and Practice of Constraint Programming, pages 429-445, Cork, Ireland, 2015.
Peter van Emde Boas. Preserving order in a forest in less than logarithmic time. In Proceedings of the 16th Annual Symposium on Foundations of Computer Science, pages 75-84, 1975.
Changhe Yuan and Brandon Malone. Learning optimal bayesian networks: A shortest path perspective. J. of Artificial Intelligence Research, 48:23-65, 2013.

Structured Set Variable Domains in Bayesian Network Structure Learning

Authors Fulya Trösser, Simon de Givry , George Katsirelos

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Acknowledgements

References

Thanks for your feedback!

Could not send message