Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup

Authors: Laura Sinikallio, Senka Drobac, Minna Tamper, Rafael Leal, Mikko Koho, Jouni Tuominen, Matti La Mela, Eero Hyvönen




File

OASIcs.LDK.2021.8.pdf
  • Filesize: 1.37 MB
  • 17 pages

Author Details

Laura Sinikallio
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland
Senka Drobac
  • Department of Computer Science, SeCo Research Group, Aalto University, Finland
Minna Tamper
  • Department of Computer Science, SeCo Research Group, Aalto University, Finland
Rafael Leal
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland
Mikko Koho
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland
Jouni Tuominen
  • Aalto University, Finland
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland
Matti La Mela
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland
Eero Hyvönen
  • Aalto University, Finland
  • HELDIG Centre for Digital Humanities, SeCo Research Group, University of Helsinki, Finland

Acknowledgements

Thanks to Ari Apilo, Sari Wilenius, and Päivikki Karhula of the Parliament of Finland (PoF) for providing material for the project. Our work was funded by the Academy of Finland as part of the Semantic Parliament project, the EU project InTaVia: In/Tangible European Heritage, and is related to the COST action NexusLinguarum on linguistic data science. CSC - IT Center for Science, Finland, provided computational resources for the work.

Cite As

Laura Sinikallio, Senka Drobac, Minna Tamper, Rafael Leal, Mikko Koho, Jouni Tuominen, Matti La Mela, and Eero Hyvönen. Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 8:1-8:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/OASIcs.LDK.2021.8

Abstract

This paper presents a knowledge graph created by transforming the plenary debates of the Parliament of Finland (1907-) into Linked Open Data (LOD). The data, totaling over 900,000 speeches with automatically created semantic annotations and rich ontology-based metadata, are published in a Linked Open Data service and can be used via a SPARQL API and as data dumps. The speech data are part of a larger LOD publication, FinnParla, which also includes prosopographical data about the politicians. The data are being used for studying parliamentary language and culture in Digital Humanities at several universities. To serve a wider variety of users, all of the data were also produced in Parla-CLARIN markup. We present the first publication of all Finnish parliamentary debates as data. Technical novelties in our approach include the use of both Parla-CLARIN and an RDF schema developed for representing the speeches, integration of the data with a new Parliament of Finland Ontology for deeper data analyses, and enrichment of the data with a variety of external national and international data sources.
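
As a rough illustration of what access via a SPARQL API can look like, the sketch below builds a standard SPARQL 1.1 Protocol GET request and flattens a JSON result set. It makes several assumptions: the endpoint URL is a placeholder rather than the service's real address, the helper names `build_request` and `parse_bindings` are hypothetical, and the query is deliberately schema-agnostic (it only counts instances per class) because the abstract does not describe the RDF schema.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder (assumption): substitute the service's actual SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# Schema-agnostic query: count instances per class, most frequent first.
QUERY = """
SELECT ?class (COUNT(?s) AS ?n)
WHERE { ?s a ?class }
GROUP BY ?class
ORDER BY DESC(?n)
LIMIT 10
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(result: dict) -> list[dict]:
    """Flatten SPARQL JSON results into plain {variable: value} rows."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in result["results"]["bindings"]
    ]
```

Against a live endpoint, the request could be sent with `urllib.request.urlopen(build_request(ENDPOINT, QUERY))` and the response body decoded with `json.load` before passing it to `parse_bindings`. The class counts returned this way are one simple way to get a first overview of an unfamiliar knowledge graph before writing schema-specific queries.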

Subject Classification

ACM Subject Classification
  • Information systems → Ontologies
  • Information systems → Resource Description Framework (RDF)
  • Computing methodologies → Information extraction
Keywords
  • Plenary debates
  • parliamentary data
  • Parla-CLARIN
  • Linked Open Data
  • Digital Humanities

