Search Results

Documents authored by Koho, Mikko


Document
Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup

Authors: Laura Sinikallio, Senka Drobac, Minna Tamper, Rafael Leal, Mikko Koho, Jouni Tuominen, Matti La Mela, and Eero Hyvönen

Published in: OASIcs, Volume 93, 3rd Conference on Language, Data and Knowledge (LDK 2021)


Abstract
This paper presents a knowledge graph created by transforming the plenary debates of the Parliament of Finland (1907-) into Linked Open Data (LOD). The data, totaling over νm{900 000} speeches, with automatically created semantic annotations and rich ontology-based metadata, are published in a Linked Open Data Service and are used via a SPARQL API and as data dumps. The speech data is part of larger LOD publication FinnParla that also includes prosopographical data about the politicians. The data is being used for studying parliamentary language and culture in Digital Humanities in several universities. To serve a wider variety of users, the entirety of this data was also produced using Parla-CLARIN markup. We present the first publication of all Finnish parliamentary debates as data. Technical novelties in our approach include the use of both Parla-CLARIN and an RDF schema developed for representing the speeches, integration of the data to a new Parliament of Finland Ontology for deeper data analyses, and enriching the data with a variety of external national and international data sources.

Cite as

Laura Sinikallio, Senka Drobac, Minna Tamper, Rafael Leal, Mikko Koho, Jouni Tuominen, Matti La Mela, and Eero Hyvönen. Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 8:1-8:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{sinikallio_et_al:OASIcs.LDK.2021.8,
  author =	{Sinikallio, Laura and Drobac, Senka and Tamper, Minna and Leal, Rafael and Koho, Mikko and Tuominen, Jouni and La Mela, Matti and Hyv\"{o}nen, Eero},
  title =	{{Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{8:1--8:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2021.8},
  URN =		{urn:nbn:de:0030-drops-145444},
  doi =		{10.4230/OASIcs.LDK.2021.8},
  annote =	{Keywords: Plenary debates, parliamentary data, Parla-CLARIN, Linked Open Data, Digital Humanities}
}
Document
Relevance Feedback Search Based on Automatic Annotation and Classification of Texts

Authors: Rafael Leal, Joonas Kesäniemi, Mikko Koho, and Eero Hyvönen

Published in: OASIcs, Volume 93, 3rd Conference on Language, Data and Knowledge (LDK 2021)


Abstract
The idea behind Relevance Feedback Search (RFBS) is to build search queries as an iterative and interactive process in which they are gradually refined based on the results of the previous search round. This can be helpful in situations where the end user cannot easily formulate their information needs at the outset as a well-focused query, or more generally as a way to filter and focus search results. This paper concerns (1) a framework that integrates keyword extraction and unsupervised classification into the RFBS paradigm and (2) the application of this framework to the legal domain as a use case. We focus on the Natural Language Processing (NLP) methods underlying the framework and application, where an automatic annotation tool is used for extracting document keywords as ontology concepts, which are then transformed into word embeddings to form vectorial representations of the texts. An unsupervised classification system that employs similar techniques is also used in order to classify the documents into broad thematic classes. This classification functionality is evaluated using two different datasets. As the use case, we describe an application perspective in the semantic portal LawSampo - Finnish Legislation and Case Law on the Semantic Web. This online demonstrator uses a dataset of 82145 sections in 3725 statutes of Finnish legislation and another dataset that comprises 13470 court decisions.

Cite as

Rafael Leal, Joonas Kesäniemi, Mikko Koho, and Eero Hyvönen. Relevance Feedback Search Based on Automatic Annotation and Classification of Texts. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 18:1-18:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{leal_et_al:OASIcs.LDK.2021.18,
  author =	{Leal, Rafael and Kes\"{a}niemi, Joonas and Koho, Mikko and Hyv\"{o}nen, Eero},
  title =	{{Relevance Feedback Search Based on Automatic Annotation and Classification of Texts}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{18:1--18:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2021.18},
  URN =		{urn:nbn:de:0030-drops-145543},
  doi =		{10.4230/OASIcs.LDK.2021.18},
  annote =	{Keywords: relevance feedback, keyword extraction, zero-shot text classification, word embeddings, LawSampo}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail