Published in: Artificial Intelligence and Law 2/2019

13.12.2018

Semi-automatic knowledge population in a legal document management system

Written by: Guido Boella, Luigi Di Caro, Valentina Leone

Abstract

Every organization has to deal with operational risks arising from the execution of its primary business functions. In this paper, we describe a legal knowledge management system which helps users understand the meaning of legislative text and the relationships between norms. While much of the knowledge requires the input of legal experts, we focus in this article on NLP applications that semi-automate essential but time-consuming and lower-skill tasks: classifying legal documents, identifying cross-references and legislative amendments, linking legal terms to the most relevant definitions, and extracting key elements of legal provisions to improve clarity and enable advanced search options. Using Natural Language Processing tools to semi-automate such tasks makes the proposal a realistic commercial prospect, as it helps keep costs down while allowing greater coverage.

Footnotes
4
The Arianna portal already exports documents to NIR XML format.
 
6
We specified a maximum distance of 2 words in order to encompass both sentences of the form ‘Il rif1 è soppresso’ (The rif1 is suppressed) and sentences of the form ‘Il rif1 è stato soppresso’ (The rif1 has been suppressed). In Italian, the lemma of both words ‘è’ and ‘stato’ is ‘essere’.
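
The distance constraint in footnote 6 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lemma table, the `rif` prefix test for reference placeholders, and the function names are all assumptions made for illustration, and the sketch reads "maximum distance of 2 words" as allowing one or two intervening forms of 'essere' between the reference and the participle.

```python
# Toy lemma table (assumption: a real system would call a lemmatizer instead).
LEMMAS = {"è": "essere", "stato": "essere", "soppresso": "sopprimere"}


def lemma(token):
    """Return the lemma of a token, falling back to its lowercase form."""
    return LEMMAS.get(token.lower(), token.lower())


def is_reference(token):
    """Hypothetical test for reference placeholders such as 'rif1'."""
    return token.lower().startswith("rif")


def find_suppressions(sentence, max_distance=2):
    """Return references followed by a form of 'sopprimere', separated only
    by forms of 'essere' and by at most max_distance words."""
    tokens = sentence.split()
    hits = []
    for i, token in enumerate(tokens):
        if not is_reference(token):
            continue
        for gap in range(1, max_distance + 1):
            end = i + 1 + gap  # index of the candidate participle
            if end >= len(tokens):
                break
            between = tokens[i + 1:end]
            if (all(lemma(t) == "essere" for t in between)
                    and lemma(tokens[end]) == "sopprimere"):
                hits.append(token)
                break
    return hits
```

With `max_distance=2`, both 'Il rif1 è soppresso' and 'Il rif1 è stato soppresso' yield the reference `rif1`; lowering the limit to 1 matches only the first form.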
 
Metadata
Title
Semi-automatic knowledge population in a legal document management system
Written by
Guido Boella
Luigi Di Caro
Valentina Leone
Publication date
13.12.2018
Publisher
Springer Netherlands
Published in
Artificial Intelligence and Law / Issue 2/2019
Print ISSN: 0924-8463
Electronic ISSN: 1572-8382
DOI
https://doi.org/10.1007/s10506-018-9239-8
