Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling (Dagstuhl Seminar 22382)

Authors: Philipp Berens, Kyle Cranmer, Neil D. Lawrence, Ulrike von Luxburg, Jessica Montgomery, and all authors of the abstracts in this report

  • 50 pages

Author Details

Philipp Berens
  • Universität Tübingen, DE
Kyle Cranmer
  • University of Wisconsin - Madison, US
Neil D. Lawrence
  • University of Cambridge, GB
Ulrike von Luxburg
  • Universität Tübingen, DE
Jessica Montgomery
  • University of Cambridge, GB
and all authors of the abstracts in this report

Cite As

Philipp Berens, Kyle Cranmer, Neil D. Lawrence, Ulrike von Luxburg, and Jessica Montgomery. Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling (Dagstuhl Seminar 22382). In Dagstuhl Reports, Volume 12, Issue 9, pp. 150-199, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


This report documents the programme and the outcomes of Dagstuhl Seminar 22382 "Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling". Today’s scientific challenges are characterised by complexity. Interconnected natural, technological, and human systems are influenced by forces acting across temporal and spatial scales, resulting in complex interactions and emergent behaviours. Understanding these phenomena, and leveraging scientific advances to deliver innovative solutions that improve society’s health, wealth, and well-being, requires new ways of analysing complex systems. The transformative potential of AI stems from its widespread applicability across disciplines, and will only be achieved through integration across research domains. AI for science is a rendezvous point. It brings together expertise from AI and application domains; combines modelling knowledge with engineering know-how; and relies on collaboration across disciplines and between humans and machines. Alongside technical advances, the next wave of progress in the field will come from building a community of machine learning researchers, domain experts, citizen scientists, and engineers working together to design and deploy effective AI tools. This report summarises the discussions from the seminar and provides a roadmap suggesting how different communities can collaborate to deliver a new wave of progress in AI and its application to scientific discovery.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Machine learning
  • Computing methodologies → Artificial intelligence
Keywords
  • machine learning
  • artificial intelligence
  • life sciences
  • physical sciences
  • environmental sciences
  • simulation
  • causality
  • modelling



