Implementing FAIR Data Infrastructures (Dagstuhl Perspectives Workshop 18472)

Authors Natalia Manola, Peter Mutschke, Guido Scherp, Klaus Tochtermann, Peter Wittenburg, Kathleen Gregory, Wilhelm Hasselbring, Kees den Heijer, Paolo Manghi, Dieter Van Uytvanck

Author Details

Natalia Manola
  • University of Athens, GR
Peter Mutschke
  • GESIS – Leibniz Institute for the Social Sciences - Cologne, DE
Guido Scherp
  • ZBW - Leibniz Information Centre for Economics - Kiel, DE
Klaus Tochtermann
  • ZBW - Leibniz Information Centre for Economics - Kiel, DE
Peter Wittenburg
  • Max Planck Computing and Data Facility - Garching, DE
Kathleen Gregory
  • Data Archiving and Networked Services, Royal Netherlands Academy of Arts and Sciences, NL
Wilhelm Hasselbring
  • Universität Kiel, DE
Kees den Heijer
  • TU Delft, NL
Paolo Manghi
  • ISTI-CNR - Pisa, IT
Dieter Van Uytvanck
  • CLARIN ERIC - Utrecht, NL

Cite As

Natalia Manola, Peter Mutschke, Guido Scherp, Klaus Tochtermann, Peter Wittenburg, Kathleen Gregory, Wilhelm Hasselbring, Kees den Heijer, Paolo Manghi, and Dieter Van Uytvanck. Implementing FAIR Data Infrastructures (Dagstuhl Perspectives Workshop 18472). In Dagstuhl Manifestos, Volume 8, Issue 1, pp. 1-34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


The open science movement is gaining strength and momentum worldwide, signalling a fundamental shift in how scientific research is made accessible and reusable. In order to fulfill the promises of open science, reliable and sustainable research data infrastructures must be developed. While the FAIR data principles provide a promising conceptual basis for developing such data infrastructures, they do not provide technological guidance on how to do so. Computer science is uniquely situated to fill this gap by researching and developing tools and technical specifications which can help to realize the creation of FAIR data infrastructures. To this end, this Dagstuhl Perspectives Workshop brought together computer scientists and digital infrastructure experts from across disciplinary domains to discuss key challenges and technical solutions to implementing and promoting the establishment of FAIR-compliant infrastructures for research data. This manifesto reports the findings from the workshop and provides recommendations along two lines: (1) how computer science can contribute to implementing FAIR data infrastructures and (2) how to make computer science research itself more FAIR.

Subject Classification

ACM Subject Classification
  • Information systems

Keywords
  • FAIR principles
  • open data
  • open science
  • research data infrastructures



