Contributions to Legal Document Summarization: Judgments from the Portuguese Supreme Court of Justice

Authors: Margarida Rebelo Dias, Ricardo Ribeiro, H. Sofia Pinto



File

  • OASIcs.SLATE.2024.2.pdf (1.05 MB, 14 pages)
Document Identifiers

  • DOI: 10.4230/OASIcs.SLATE.2024.2

Author Details

Margarida Rebelo Dias
  • Iscte - Instituto Universitário de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal
Ricardo Ribeiro
  • Iscte - Instituto Universitário de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal
H. Sofia Pinto
  • Instituto Superior Técnico - Universidade de Lisboa, Lisbon, Portugal
  • INESC-ID Lisboa, Lisbon, Portugal

Acknowledgements

This project is a collaboration involving the Portuguese Supreme Court of Justice and INESC-ID.

Cite As

Margarida Rebelo Dias, Ricardo Ribeiro, and H. Sofia Pinto. Contributions to Legal Document Summarization: Judgments from the Portuguese Supreme Court of Justice. In 13th Symposium on Languages, Applications and Technologies (SLATE 2024). Open Access Series in Informatics (OASIcs), Volume 120, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/OASIcs.SLATE.2024.2

Abstract

Legal documents are commonly known for being lengthy and having a specific vocabulary. For professionals and non-jurists alike, having a summary of each document is crucial, so that they can use it as a reference for other cases without spending too much time reading the entire document. In the Portuguese Supreme Court of Justice, summaries are written manually by its Judges, which is very time-consuming because of the length of the legal documents. Aiming to support the Judges in this task, the goal of this work is to investigate how different automatic text summarization techniques and methods can achieve good performance on Portuguese legal documents.
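
As a rough illustration of the kind of approach the abstract refers to, the sketch below performs abstractive summarization of a judgment with a generic pretrained sequence-to-sequence transformer via the Hugging Face transformers library. It is not the system described in the paper: the checkpoint name is a placeholder assumption, and long judgments would in practice require chunking or a long-input model rather than simple truncation.

    # Illustrative sketch only: abstractive summarization of a judgment with a
    # generic seq2seq transformer. The checkpoint name is a placeholder assumption,
    # not the model evaluated in the paper.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    MODEL_NAME = "some-portuguese-seq2seq-summarization-checkpoint"  # placeholder

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def summarize(judgment_text: str, max_new_tokens: int = 256) -> str:
        # Court judgments are long, so the input is truncated to the encoder limit
        # here; a real pipeline would need chunking or a long-context model instead.
        inputs = tokenizer(judgment_text, truncation=True, max_length=1024,
                           return_tensors="pt")
        summary_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                     num_beams=4, no_repeat_ngram_size=3)
        return tokenizer.decode(summary_ids[0], skip_special_tokens=True)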

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing
  • Applied computing → Law
Keywords
  • automatic text summarization
  • legal document summarization
  • abstractive summarization
  • transformers
  • European Portuguese

