Published in: Artificial Intelligence and Law 1/2022

17.04.2021 | Review Article

Clustering of Brazilian legal judgments about failures in air transport service: an evaluation of different approaches

Authors: Isabela Cristina Sabo, Thiago Raulino Dal Pont, Pablo Ernesto Vigneaux Wilton, Aires José Rover, Jomi Fred Hübner


Abstract

This paper evaluates different clustering approaches applied to legal judgments from the Special Civil Court located at the Federal University of Santa Catarina (JEC/UFSC). The subject is Consumer Law, specifically cases in which consumers claim compensation for moral and material damages from airlines for service failures. To identify patterns in the dataset, we apply four clustering algorithms: Hierarchical and Lingo (soft clustering), and K-means and Affinity Propagation (hard clustering). We evaluate the results on four criteria: (1) entropy and purity; (2) the algorithm's ability to provide labels; (3) a legal expert's evaluation; and (4) experimental complexity. The results indicate that Hierarchical Clustering is the most advantageous approach: it achieves the best entropy and purity values, poses the least difficulty for the expert analyzing the clusters, and has the lowest experimental complexity. The main contribution of the paper is to show the advantages and disadvantages of each approach, especially for identifying labels in unstructured and non-indexed legal texts.
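To make criterion (1) concrete, the following is a minimal sketch of how entropy and purity are commonly computed for a clustering when reference labels are available. It is not taken from the paper or its repository: the function name `entropy_and_purity`, the base-2 logarithm, the weighting of clusters by size, and the example category names are all assumptions made for illustration. It also assumes each judgment belongs to exactly one cluster, so overlapping assignments would need a different treatment.

```python
from collections import Counter, defaultdict
from math import log2


def entropy_and_purity(cluster_ids, class_labels):
    """Return (weighted entropy, weighted purity) for a one-cluster-per-item assignment.

    Hypothetical helper: the names and the choice of log base 2 are assumptions.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one document is required")

    # Group the reference labels by the cluster each document was assigned to.
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster].append(label)

    total = len(cluster_ids)
    entropy = 0.0
    purity = 0.0
    for labels in members.values():
        size = len(labels)
        counts = Counter(labels)
        # Entropy of one cluster: -sum(p * log2(p)) over the classes present in it.
        cluster_entropy = -sum((c / size) * log2(c / size) for c in counts.values())
        # Purity of one cluster: share held by its most frequent class.
        cluster_purity = max(counts.values()) / size
        # Weight each cluster by its share of all documents.
        entropy += (size / total) * cluster_entropy
        purity += (size / total) * cluster_purity
    return entropy, purity


# Hypothetical example: six judgments, two clusters, made-up category names.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["delay", "delay", "cancellation", "baggage", "baggage", "baggage"]
entropy, purity = entropy_and_purity(clusters, labels)
print(f"entropy={entropy:.4f} purity={purity:.4f}")
```

Lower entropy and higher purity both indicate clusters dominated by a single reference category. A clustering in which every cluster holds one category has entropy 0 and purity 1.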

Footnotes
1. In August 2020, the Brazilian Judiciary instituted "Datajud", a unified national database, and the courts are required to submit data on all proceedings following pre-established standards (CNJ 2020b).
2. Data available on request. Code is available at https://github.com/thiagordp/clustering_jec
References
Alpaydin E (2009) Introduction to machine learning, 2nd edn. MIT Press, Cambridge
Benjamin AHV (2012) Práticas abusivas. In: Benjamin AHV, Marques CL, Bessa LR (eds) Manual de direito do consumidor. Revista dos Tribunais, São Paulo, pp 265–266
Brown TB, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler D, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. In: Proceedings of 34th conference on neural information processing systems (NeurIPS 2020). Vancouver, Canada
Campbell JC, Hindle A, Stroulia E (2015) Latent Dirichlet allocation. In: The art and science of analyzing software data. Elsevier, pp 139–159
Cavalieri Filho S (2015) Programa de responsabilidade civil, 12th edn. Atlas, São Paulo
Cichosz P (2015) Data mining algorithms: explained using R. Wiley, Chichester
Conrad JG, Al-Kofahi K, Zhao Y, Karypis G (2005) Effective document clustering for large heterogeneous law firm collections. In: Proceedings of 10th international conference on artificial intelligence and law (ICAIL 2005). Bologna, pp 177–187
Dal Pont TR, Sabo IC, Hübner JF, Rover AJ (2020) Impact of text specificity and size on word embeddings performance: an empirical evaluation in Brazilian legal domain. Lecture Notes in Computer Science, vol 12319, Springer International Publishing, Cham, pp 521–535. https://doi.org/10.1007/978-3-030-61377-8_36
Denari Z (2011) Da qualidade de produtos e serviços, da prevenção e da reparação de danos. In: Grinover AP et al (eds) Código brasileiro de defesa do consumidor: comentado pelos autores do anteprojeto, 10th edn. Forense Universitária, Rio de Janeiro, pp 179–258
Demšar J, Curk T, Erjavec A, Gorup Č, Hočevar T, Milutinovič M, Možina M, Polajnar M, Toplak M, Starič A, Štajdohar M, Umek L, Žagar L, Žbontar J, Žitnik M, Zupan B (2013) Orange: data mining toolbox in Python. J Mach Learn Res 14(35):2349–2353
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL HLT 2019 annual conference of the North American chapter of the Association for ... for Computational Linguistics: Human Language Technologies, vol 1, pp 4171–4186
García S, Luengo J, Herrera F (2015) Data preprocessing in data mining. Springer, Cham
Hair JF Jr, Black WC, Babin BJ, Anderson RE (2019) Multivariate data analysis, 8th edn. Cengage Learning, Andover
Hu X, Liu H (2012) Text analytics in social media. In: Aggarwal CC, Zhai CX (eds) Mining text data. Springer, New York, pp 385–414
Jivani AG (2011) A comparative study of stemming algorithms. Int J Comp Tech Appl 2(6):1930–1938
Jolliffe IT (2010) Principal component analysis, 2nd edn. Springer Series in Statistics, New York
Jurafsky D, Martin JH (2019) Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 3rd edn. Draft, Stanford University
Kotu V, Deshpande B (2019) Data science: concepts and practice, 2nd edn. Morgan Kaufmann, Cambridge
Lee ML, Lu H, Ling TW, Ko YT (1999) Cleansing data for mining and warehousing. In: Bench-Capon TJ, Soda G, Tjoa AM (eds) Database and Expert Systems Applications (DEXA 1999). Lecture Notes in Computer Science 1677. Springer, Berlin
Manning C, Raghavan P, Schütze H (2010) An introduction to information retrieval. Cambridge University Press, Cambridge
Mathai S, Gupta D, Radhakrishnan G (2018) Iterative concept-based clustering of Indian Court Judgments. In: Bhateja V et al (eds) Proceedings of the second international conference on computational intelligence and informatics (ICCII 2017). Springer Nature, Singapore, pp 91–103
Matos C (1995) O ônus da prova no Código de Defesa do Consumidor. Justitia 57(170):94–102
Mohri M, Rostamizadeh A, Talwalkar A (2018) Foundations of machine learning, 2nd edn. MIT Press, Cambridge
Nery N Jr (2011) Da proteção contratual. In: Grinover AP et al (eds) Código brasileiro de defesa do consumidor: comentado pelos autores do anteprojeto, 10th edn. Forense Universitária, Rio de Janeiro, pp 511–656
Osiński S, Stefanowski J, Weiss D (2004) Lingo: search results clustering algorithm based on singular value decomposition. In: Kłopotek MA, Wierzchoń ST, Trojanowski K (eds) Intelligent information processing and web mining. Advances in soft computing 25. Springer, Berlin, pp 359–368
Raghav K, Reddy PB, Reddy VB, Reddy PK (2015) Text and citations based cluster analysis of legal judgments. In: Prasath R, Vuppala A, Kathirvalavakumar T (eds) Mining intelligence and knowledge exploration (MIKE 2015) 9468. Springer, Cham, pp 449–459
Tan P, Steinbach M, Kumar V (2014) Introduction to data mining. Pearson Education Ltd, London
Theodoridis S, Koutroumbas K (2009) Pattern recognition, 4th edn. Academic Press, Burlington
Watanabe K (1985) Filosofia e características básicas do Juizado Especial de Pequenas Causas. In: Watanabe K (coord) Juizado Especial de Pequenas Causas: lei n. 7.244/1984. Revista dos Tribunais, São Paulo, pp 1–7
Weiss D (2001) A clustering interface for web search results in Polish and English. Dissertation, Poznan University of Technology
Zaidi F, Archambault D, Melançon G (2010) Evaluating the quality of clustering algorithms using cluster path lengths. In: Perner P (ed) Advances in data mining. Applications and theoretical aspects (ICDM 2010). Lecture Notes in Computer Science (LNAI 6171). Springer, Berlin, pp 42–56. https://doi.org/10.1007/978-3-642-14400-4_4
Metadata
Title
Clustering of Brazilian legal judgments about failures in air transport service: an evaluation of different approaches
Authors
Isabela Cristina Sabo
Thiago Raulino Dal Pont
Pablo Ernesto Vigneaux Wilton
Aires José Rover
Jomi Fred Hübner
Publication date
17.04.2021
Publisher
Springer Netherlands
Published in
Artificial Intelligence and Law / Issue 1/2022
Print ISSN: 0924-8463
Electronic ISSN: 1572-8382
DOI
https://doi.org/10.1007/s10506-021-09287-3
