Contributions to Legal Document Summarization: Judgments from the Portuguese Supreme Court of Justice

Authors: Margarida Rebelo Dias, Ricardo Ribeiro, H. Sofia Pinto



File

  • OASIcs.SLATE.2024.2.pdf (1.05 MB, 14 pages)
Document Identifiers

  • DOI: 10.4230/OASIcs.SLATE.2024.2

Author Details

Margarida Rebelo Dias
  • Iscte - Instituto Universitário de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal
Ricardo Ribeiro
  • Iscte - Instituto Universitário de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal
H. Sofia Pinto
  • Instituto Superior Técnico - Universidade de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal

Acknowledgements

This project is a collaboration involving the Portuguese Supreme Court of Justice and INESC-ID.

Cite As

Margarida Rebelo Dias, Ricardo Ribeiro, and H. Sofia Pinto. Contributions to Legal Document Summarization: Judgments from the Portuguese Supreme Court of Justice. In 13th Symposium on Languages, Applications and Technologies (SLATE 2024). Open Access Series in Informatics (OASIcs), Volume 120, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/OASIcs.SLATE.2024.2

Abstract

Legal documents are commonly known for being lengthy and having a specific vocabulary. For professionals and non-jurists alike, having a summary of each document is crucial, so that they can use it as a reference for other cases without spending too much time reading the entire document. In the Portuguese Supreme Court of Justice, summaries are written manually by its Judges, which is very time-consuming because of the length of the legal documents. Aiming to support the Judges in this task, the goal of this work is to investigate how different automatic text summarization techniques and methods can achieve good performance on Portuguese legal documents.
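
As a rough illustration of the kind of approach the abstract refers to, the sketch below performs abstractive summarization of a judgment with a generic pretrained sequence-to-sequence transformer via the Hugging Face transformers library. It is not the system described in the paper: the checkpoint name is a placeholder assumption, and long judgments would in practice require chunking or a long-input model rather than simple truncation.

    # Illustrative sketch only: abstractive summarization of a judgment with a
    # generic seq2seq transformer. The checkpoint name is a placeholder assumption,
    # not the model evaluated in the paper.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    MODEL_NAME = "some-portuguese-seq2seq-summarization-checkpoint"  # placeholder

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def summarize(judgment_text: str, max_new_tokens: int = 256) -> str:
        # Court judgments are long, so the input is truncated to the encoder limit
        # here; a real pipeline would need chunking or a long-context model instead.
        inputs = tokenizer(judgment_text, truncation=True, max_length=1024,
                           return_tensors="pt")
        summary_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                     num_beams=4, no_repeat_ngram_size=3)
        return tokenizer.decode(summary_ids[0], skip_special_tokens=True)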

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing
  • Applied computing → Law
Keywords
  • automatic text summarization
  • legal document summarization
  • abstractive summarization
  • transformers
  • European Portuguese

