1 Introduction
- We push forward research on law document analysis for civil law systems, focusing on the modeling, learning, and understanding of logically coherent corpora of law articles, using the Italian Civil Code (ICC) as a case in point.
- We study the law article retrieval task as a prediction problem based on the deep machine learning paradigm. More specifically, following the latest advances in research on deep neural network models for text data, we propose a deep pre-trained contextualized language model framework, named LamBERTa (Law article mining based on BERT architecture). LamBERTa is designed to fine-tune an Italian pre-trained BERT on the ICC corpus for law article retrieval as prediction, i.e., given a natural language query, predict the most relevant ICC article(s).
- Notably, we deal with a very challenging prediction task, characterized not only by a high number of classes (i.e., hundreds, as many as the number of articles), but also by the issues that arise from the need to build suitable training sets, given the lack of test query benchmarks for Italian legal article retrieval/prediction tasks. This also leads to coping with few-shot learning issues (i.e., learning to predict the correct class of instances when only a small number of examples is available in the training dataset), which has been recognized as one of the so-called extreme classification scenarios (Bengio et al. 2019; Chalkidis et al. 2019b). We design our LamBERTa framework to solve such issues based on different schemes of unsupervised training-instance labeling that we originally define for the ICC corpus, although they can easily be generalized to other law code systems.
- We address one crucial aspect that typically arises in deep/machine learning models, namely explainability, which is clearly of interest also in artificial intelligence and law (e.g., Branting et al. 2019; Hacker et al. 2020). In this regard, we investigate the explainability of our LamBERTa models, focusing on understanding how they form complex relationships between textual tokens. We further provide insights into the patterns generated by LamBERTa models through a visual exploratory analysis of the learned representation embeddings.
- We present an extensive, quantitative experimental analysis of LamBERTa models by considering:
  - six different types of test queries, which vary by originating source, length, and lexical characteristics, and include comments about the ICC articles as well as case law decisions from the civil section of the Italian Court of Cassation that contain significant jurisprudential sentences associated with the ICC articles;
  - single-label as well as multi-label evaluation tasks;
  - different sets of assessment criteria.

  The obtained results show the effectiveness of LamBERTa and its superiority over (i) widely used deep-learning text classifiers that have been tested on our different query sets for the article prediction tasks, and (ii) a few-shot learner conceived for an attribute-aware prediction task that we have newly designed based on the availability of ICC metadata.
-
2 Related work
3 Data
- Book-1, on Persons and the Family (articles 1–455): contains the discipline of the juridical capacity of persons, of the rights of the personality, of collective organizations, and of the family;
- Book-2, on Successions (articles 456–809): contains the discipline of succession due to death and the donation contract;
- Book-3, on Property (articles 810–1172): contains the discipline of ownership and other real rights;
- Book-4, on Obligations (articles 1173–2059): contains the discipline of obligations and their sources, that is, mainly contracts and illicit facts (the so-called civil liability);
- Book-5, on Labor (articles 2060–2642): contains the discipline of the company in general, of subordinate and self-employed work, of profit-making companies, and of competition;
- Book-6, on the Protection of Rights (articles 2643–2969): contains the discipline of the transcription, of the proofs, of the debtor's financial liability, and of the causes of pre-emption and prescription.
| ICC portion | #Arts. | #Sent. tot. | #Sent. min | #Sent. max | #Sent. mean (std) | #Words tot. | #Words min | #Words max | #Words mean (std) |
|---|---|---|---|---|---|---|---|---|---|
| Book-1 | 395 | 1979 | 3 | 21 | 5.010 (2.323) | 32,354 | 11 | 569 | 81.909 (71.952) |
| Book-2 | 345 | 1561 | 3 | 13 | 4.525 (1.675) | 24,520 | 9 | 354 | 71.072 (51.366) |
| Book-3 | 364 | 1619 | 3 | 24 | 4.448 (1.816) | 25,971 | 6 | 893 | 71.349 (65.836) |
| Book-4 | 891 | 3595 | 3 | 12 | 4.035 (1.338) | 50,509 | 7 | 365 | 56.688 (38.837) |
| Book-5 | 713 | 3937 | 3 | 37 | 5.522 (3.191) | 75,764 | 8 | 1465 | 106.261 (117.393) |
| Book-6 | 331 | 1453 | 3 | 17 | 4.390 (1.895) | 25,937 | 12 | 654 | 78.360 (76.954) |
| All | 3039 | 14,131 | 3 | 37 | 4.650 (2.243) | 234,945 | 6 | 1465 | 77.310 (78.373) |
4 The proposed LamBERTa framework
4.1 Problem setting
4.1.1 Motivations for BERT-based approach
4.1.2 Challenges
- The first challenge refers to the high number (i.e., hundreds) of classes, which corresponds to the number of articles in the ICC corpus, or the portion of it, that is used to train a LamBERTa model;
- The second challenge corresponds to the so-called few-shot learning problem, i.e., dealing with a small number of per-class examples to train a machine learning model, which Bengio et al. recognize as one of the "extreme classification" scenarios (Bengio et al. 2019);
- The third challenge derives from the unavailability of test query benchmarks for Italian legal article retrieval/prediction tasks. This has prompted us to define appropriate methods for data annotation, and hence for building up training sets for the LamBERTa framework. To address this problem, we originally define different schemes of unsupervised training-instance labeling; notably, these are not defined ad hoc for the ICC corpus, but rather can be adapted to any other law code corpus.
4.2 Overview of the LamBERTa framework
4.3 Global and local learning approaches
- On the one hand, local models are designed to embed the logical coherence of the articles within a particular book; although limited to its corresponding topical boundaries, they are expected to leverage the multi-faceted semantics underlying a specific civil law theme (e.g., inheritance).
- On the other hand, books are themselves part of the same law code, and hence a global model might be useful to capture possible interrelations between the single books; however, by embedding different topic signals from different books (e.g., inheritance in Book-2 vs. labor law in Book-5), it could incur the risk of topical dilution over the whole ICC.
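To make the local vs. global distinction concrete, the two label-space organizations can be sketched as follows. This is an illustrative sketch, not the LamBERTa implementation; the names `Article` and `build_label_maps` are ours.

```python
# Sketch: local (per-book) vs. global label spaces for article prediction.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Article:
    book: int    # ICC book id (1..6)
    art_id: int  # article number
    text: str

def build_label_maps(articles, scope="local"):
    """Return {model_key: {art_id: class_index}}.

    scope="local"  -> one classifier per book, labels local to that book
    scope="global" -> a single classifier over all articles
    """
    groups = defaultdict(list)
    for a in articles:
        key = a.book if scope == "local" else "all"
        groups[key].append(a.art_id)
    return {k: {art: i for i, art in enumerate(sorted(v))}
            for k, v in groups.items()}

articles = [Article(1, 1, "..."), Article(1, 2, "..."),
            Article(2, 456, "..."), Article(2, 457, "...")]

local_maps = build_label_maps(articles, "local")
global_map = build_label_maps(articles, "global")
print(len(local_maps))          # 2 local models (one per book)
print(len(global_map["all"]))   # 4 classes in the single global model
```

A local model thus predicts over a few hundred classes at most (the articles of one book), whereas the global model predicts over all 3039 ICC articles at once.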
4.4 Data preparation
4.4.1 Domain-specific terms injection and tokenization
| ICC portion | #Added tokens | Vocab. size |
|---|---|---|
Book-1 | 833 | 31,935 |
Book-2 | 698 | 31,800 |
Book-3 | 1072 | 32,174 |
Book-4 | 1383 | 32,485 |
Book-5 | 2048 | 33,150 |
Book-6 | 829 | 31,931 |
All | 3993 | 35,095 |
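The effect summarized in the table above (domain-specific legal terms enlarging the tokenizer vocabulary) can be sketched as below. This is an illustrative simulation on a plain vocabulary dict, not the authors' code; with a HuggingFace-style tokenizer, the analogous step would be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(...)`.

```python
# Sketch: injecting domain-specific terms into a tokenizer vocabulary.
def inject_terms(vocab, domain_terms):
    """Append unseen domain terms to the vocabulary; return the new size."""
    for term in domain_terms:
        if term not in vocab:
            vocab[term] = len(vocab)  # new id appended at the end
    return len(vocab)

base_vocab = {tok: i for i, tok in enumerate(["[CLS]", "[SEP]", "la", "legge"])}
new_size = inject_terms(base_vocab, ["usufrutto", "enfiteusi", "legge"])
print(new_size)  # 6: two genuinely new legal terms were added
```

Only terms absent from the pre-trained vocabulary are added, which is why the number of added tokens varies across books in the table.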
4.4.2 Text encoding
4.5 Methods for unsupervised training-instance labeling
- Title-only [T]. This is the simplest yet lossy scheme, which keeps an article's title while discarding its content; the round-robin block is just the title of an article.6
- n-gram. Each training unit corresponds to n consecutive sentences of an article; the round-robin block starts with the n-gram containing the title and ends with the n-gram containing the last sentence of the article. We set \(n \in \{1,2,3\}\), i.e., we consider a unigram [UniRR], a bigram [BiRR], and a trigram [TriRR] model, respectively.
- Cascade [CasRR]. The article's sentences are cumulatively selected to form the training units; the round-robin block starts with the first sentence (i.e., the title), then the first two sentences, and so on until all the article's sentences are considered to form a single training unit.
- Triangle [TglRR]. Each training unit is either a unigram, a bigram, or a trigram, i.e., the round-robin block contains all n-grams, with \(n \in \{1,2,3\}\), that can be extracted from the article's title and description.
- Unigram with parameterized emphasis on the title [UniRR.T\(^+\)]. The set of training units is comprised of one subset containing the article's sentences with round-robin selection, and another subset containing only replicas of the article's title. More specifically, the two subsets are formed as follows:
  - The first subset is of size equal to the maximum between the number of the article's sentences and the quantity \(m \times mean\_s\), where m is a multiplier (set to 4 as default) and \(mean\_s\) expresses the average number of sentences per article, excluding the title. As reported in Table 1 (sixth column), this mean value lies between 3 and 4 (recall that the title is excluded from the count); therefore we set \(mean\_s \in \{3,4\}\).
  - The second subset contains \(minTU - m \times mean\_s\) replicas of the title, where \(minTU\) denotes the minimum number of training units per article.
- Cascade with parameterized emphasis on the title [CasRR.T\(^+\)] and Triangle with parameterized emphasis on the title [TglRR.T\(^+\)]. These two schemes follow the same approach as UniRR.T\(^+\) except for the composition of the round-robin block, which corresponds to CasRR and TglRR, respectively, with the title left out from this block and replicated in the second block, for each article.
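A minimal sketch of three of the above schemes, treating an article as a list of sentences whose first element is the title (function names are ours; the title-replication step of the T\(^+\) variants is omitted for brevity):

```python
# Sketch of the UniRR, CasRR, and TglRR training-unit generation schemes.
def uni_rr(sentences):
    """Unigram round-robin block: one training unit per sentence."""
    return [[s] for s in sentences]

def cas_rr(sentences):
    """Cascade: cumulative prefixes, from the title alone to the full article."""
    return [sentences[:i] for i in range(1, len(sentences) + 1)]

def tgl_rr(sentences):
    """Triangle: all n-grams of sentences with n in {1, 2, 3}."""
    units = []
    for n in (1, 2, 3):
        units += [sentences[i:i + n] for i in range(len(sentences) - n + 1)]
    return units

art = ["title", "s1", "s2", "s3"]  # toy article: title + 3 sentences
print(len(uni_rr(art)))   # 4 units
print(len(cas_rr(art)))   # 4 cumulative units
print(len(tgl_rr(art)))   # 4 + 3 + 2 = 9 units
```

Each generated unit is labeled with the identifier of the article it was drawn from, which is what makes the labeling unsupervised: no manually annotated query-article pairs are needed.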
4.6 Learning configuration
5 Explainability of LamBERTa models based on attention patterns
Using the bertviz visualization tool,8 we show how LamBERTa forms its distinctive attention patterns. For this purpose, here we present a selection of examples built upon sentences from Book-2 of the ICC (i.e., relevant to key concepts in inheritance law), which are reported below both in Italian and in English translation. bertviz visualizes attention patterns as lines connecting the word being updated (left) with the word(s) being attended to (right), for any given input sequence, where color intensity reflects the attention weight.

Figure 2a–c shows noteworthy examples focused on the word "successione" ("succession"), from the following sentences:

- from Art. 456: "la successione si apre al momento della morte nel luogo dell'ultimo domicilio del defunto" (i.e., "the succession opens at the moment of death in the place of the last domicile of the deceased person");
- from Art. 737: "i figli e i loro discendenti ed il coniuge che concorrono alla successione devono conferire ai coeredi tutto ciò che hanno ricevuto dal defunto per donazione" (i.e., "the children and their descendants and the spouse who contribute to the succession must give to the co-heirs everything they have received from the deceased person as a donation").

In Fig. 2a, we observe how the source word is connected to a meaningful, non-contiguous set of words, particularly "apre" ("opens"), "morte" ("death"), and "defunto" ("deceased person"). In addition, in Fig. 2b, we observe how "successione" is related to "coniuge" ("spouse") and "donazione" ("donation"); this is further enriched in the two-head attention patterns shown in Fig. 2c with "coeredi" ("co-heirs"), "conferire" ("give"), and "concorrono" ("contribute"); moreover, "successione" is still connected to "defunto". Remarkably, these patterns highlight the model's ability not only to mine semantically meaningful patterns that are more complex than next-word or delimiter-focused patterns, but also to build patterns that consistently hold across various sentences sharing words. Note that, as shown in our examples, these sentences can belong to different contexts (i.e., different articles) and can vary significantly in length. The latter point is particularly evident, for instance, in the following example sentences:

- from Art. 457: "l'eredità si devolve per legge o per testamento. Non si fa luogo alla successione legittima se non quando manca, in tutto o in parte, quella testamentaria" (i.e., "the inheritance is devolved by law or by will. There is no place for legitimate succession except when the testamentary succession is missing, in whole or in part");
- from Art. 683: "la revocazione fatta con un testamento posteriore conserva la sua efficacia anche quando questo rimane senza effetto perché l'erede istituito o il legatario è premorto al testatore, o è incapace o indegno, ovvero ha rinunziato all'eredità o al legato" (i.e., "the revocation made with a later will retains its effectiveness even when this remains without effect because the established heir or legatee has predeceased the testator, or is incapable or unworthy, or has renounced the inheritance or the legacy").
6 Visualization of ICC LamBERTa embeddings
7 Experimental evaluation
7.1 Evaluation goals
- To validate and measure the effectiveness of LamBERTa models for law article retrieval tasks: how do local and global models perform in different evaluation contexts, i.e., against queries of different type, different length, and different lexicon? (Sect. 8.1)
- To evaluate LamBERTa models in single-label as well as multi-label classification tasks: how do they perform w.r.t. different assumptions on the article relevance to a query, particularly depending on whether a query is originally associated with or derived from a particular article, or by definition associated with a group of articles? (Sect. 8.1)
- To understand how a LamBERTa model's behavior is affected by varying its constituents in terms of training-instance labeling schemes and learning parameters (Sect. 8.2).
- To demonstrate the superiority of our classification-based approach to law article retrieval by comparing LamBERTa to other deep-learning-based text classifiers (Sect. 8.3.1) and to a few-shot learner conceived for an attribute-aware prediction task that we have newly designed based on the ICC heading metadata (Sect. 8.3.2).
7.2 Query sets
- QType-1 (book-sentence-queries): a set of queries that correspond to randomly selected sentences from the articles of a book. Each query is derived from a single article, and multiple queries are from the same article.
- QType-2 (paraphrased-sentence-queries): share the same composition as QType-1 queries but differ in that the sentences of a book's articles are paraphrased. To this purpose, we adopt a simple approach based on backtranslation from English (i.e., an original sentence in Italian is first translated to English, then the obtained English sentence is translated back to Italian).11
- QType-3 (comment-queries): defined to leverage the publicly available comments on the ICC articles provided by legal experts through the platform "Law for Everyone".12 Such comments provide annotations about the interpretation of the meanings and law implications associated with an article, or with particular terms occurring in an article. Each query corresponds to a comment available about one article, i.e., a paragraph of about 5 sentences on average.
- QType-4 (comment-sentence-queries): refer to the same source as QType-3, but the comments are split into sentences, so that each query contains a single sentence of a comment. Therefore, each query is associated with a single article, and multiple queries refer to the same article.
- QType-5 (case-queries): refer to a collection of case law decisions from the civil section of the Italian Court of Cassation, the highest court in the Italian judicial system. These case law decisions are selected from publicly available corpora of the most significant jurisprudential sentences associated with the ICC articles, spanning the period 1977–2015.
- QType-6 (ICC-heading-queries): defined by extracting the headings of chapters, subchapters, and sections of each ICC book. Such headings are very short, ranging from one to a few keywords describing the topic of a particular division of a book.
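The backtranslation step used to build QType-2 queries can be sketched as follows. The `translate` function is a hypothetical stand-in for whatever machine-translation backend one chooses (none is prescribed here); we stub it with a tiny lookup table so the sketch is runnable.

```python
# Sketch: paraphrasing via backtranslation (Italian -> English -> Italian).
STUB = {
    ("it", "en"): {"la successione si apre": "the succession opens"},
    ("en", "it"): {"the succession opens": "la successione ha inizio"},
}

def translate(text, src, dst):
    """Hypothetical MT call; replace with a real translation backend."""
    return STUB[(src, dst)].get(text, text)

def backtranslate(sentence_it):
    """Round-trip translation yields a paraphrase of the input sentence."""
    return translate(translate(sentence_it, "it", "en"), "en", "it")

print(backtranslate("la successione si apre"))  # "la successione ha inizio"
```

The round trip tends to preserve meaning while altering surface wording, which is exactly what makes QType-2 queries lexically harder than QType-1.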
ICC portion | Query type | #Queries | #Words | Query Type | #Queries | #Words | #Sentences |
---|---|---|---|---|---|---|---|
Book-1 | QType-1 | 790 | 13,436 | QType-3 | 331 | 38,383 | 1262 |
Book-2 | QType-2 | 690 | 10,998 | QType-4 | 322 | 30,226 | 1000 |
Book-3 | 728 | 11,942 | 286 | 23,815 | 781 | ||
Book-4 | 1774 | 27,352 | 764 | 85,360 | 2640 | ||
Book-5 | 1426 | 22,754 | 570 | 87,094 | 2417 | ||
Book-6 | 662 | 11,993 | 294 | 31,898 | 1040 |
| ICC portion | Query type | #Queries | #Words | Year range |
|---|---|---|---|---|
| Book-1 | QType-5 | 333 | 39,464 | 1978–2014 |
| Book-2 | | 347 | 36,856 | 1979–2014 |
| Book-3 | | 371 | 39,643 | 1979–2014 |
| Book-4 | | 975 | 111,234 | 1978–2015 |
| Book-5 | | 1037 | 114,287 | 1977–2015 |
| Book-6 | | 720 | 81,186 | 1978–2015 |
| ICC portion | Query type | #Chapter queries | #Subchapter queries | #Section queries | #Total words |
|---|---|---|---|---|---|
| Book-1 | QType-6 | 14 | 25 | 22 | 317 |
| Book-2 | | 5 | 30 | 14 | 160 |
| Book-3 | | 9 | 21 | 29 | 230 |
| Book-4 | | 9 | 51 | 57 | 366 |
| Book-5 | | 11 | 11 | 51 | 397 |
| Book-6 | | 5 | 18 | 38 | 251 |
7.2.1 Characteristics and differences of the query-sets
7.3 Evaluation methodology and assessment criteria
7.3.1 Single-label evaluation context
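As a reference for the ranking-based criteria reported in the result tables, recall-at-k (R@k) and mean reciprocal rank (MRR) can be computed as in the following sketch (our own code, following the standard definitions, for the single-label setting where each query has one relevant article):

```python
# Sketch: ranking-based assessment criteria over a set of queries.
def recall_at_k(rankings, relevant, k):
    """Fraction of queries whose relevant article appears in the top-k."""
    hits = sum(1 for r, gold in zip(rankings, relevant) if gold in r[:k])
    return hits / len(rankings)

def mrr(rankings, relevant):
    """Mean reciprocal rank of the relevant article (0 if not ranked)."""
    total = 0.0
    for r, gold in zip(rankings, relevant):
        if gold in r:
            total += 1.0 / (r.index(gold) + 1)
    return total / len(rankings)

# Two toy queries over article ids, each with one relevant article:
rankings = [[456, 457, 737], [683, 456, 457]]
relevant = [456, 457]
print(recall_at_k(rankings, relevant, 1))  # 0.5
print(mrr(rankings, relevant))             # (1/1 + 1/3) / 2 ≈ 0.667
```

Precision, recall, and the micro-/macro-averaged F-measures (\(F^{\mu }\), \(F^{M}\)) used alongside these criteria follow the usual multi-class classification definitions.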
7.3.2 Multi-label evaluation context
8 Results
8.1 Global vs. local models
8.1.1 Single-label evaluation
The results below refer to LamBERTa local models (\(L_i\), one per Book-i) and the global model (G), all trained with the UniRR.T\(^+\) labeling scheme; as we shall discuss in Sect. 8.2, this choice of training-instance labeling scheme is justified as being in general the best-performing scheme for the query sets; nonetheless, analogous findings were drawn by using other types of queries and labeling schemes. The five blocks of rows correspond, in order, to the QType-1 through QType-5 query sets.

| i | R \(L_i\) | R G | P \(L_i\) | P G | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | \(F^{M}\) \(L_i\) | \(F^{M}\) G | R@3 \(L_i\) | R@3 G | R@10 \(L_i\) | R@10 G | MRR \(L_i\) | MRR G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
\(Q_1\) | 0.962 | 0.949 | 0.975 | 0.979 | 0.961 | 0.955 | 0.969 | 0.964 | 0.989 | 0.992 | 0.999 | 0.997 | 0.976 | 0.970 |
\(Q_2\) | 0.972 | 0.954 | 0.981 | 0.985 | 0.971 | 0.961 | 0.977 | 0.969 | 0.994 | 0.981 | 0.999 | 0.986 | 0.983 | 0.968 |
\(Q_3\) | 0.990 | 0.978 | 0.994 | 0.994 | 0.990 | 0.981 | 0.992 | 0.986 | 1.000 | 0.995 | 1.000 | 0.997 | 0.995 | 0.987 |
\(Q_4\) | 0.961 | 0.948 | 0.983 | 0.985 | 0.964 | 0.958 | 0.972 | 0.966 | 0.984 | 0.977 | 0.990 | 0.986 | 0.975 | 0.966 |
\(Q_5\) | 0.910 | 0.905 | 0.944 | 0.947 | 0.909 | 0.908 | 0.927 | 0.925 | 0.984 | 0.974 | 0.998 | 0.988 | 0.947 | 0.940 |
\(Q_6\) | 0.979 | 0.961 | 0.986 | 0.986 | 0.978 | 0.966 | 0.982 | 0.973 | 1.000 | 0.985 | 1.000 | 0.992 | 0.989 | 0.974 |
\(Q_1\) | 0.841 | 0.730 | 0.866 | 0.823 | 0.828 | 0.745 | 0.853 | 0.774 | 0.905 | 0.829 | 0.949 | 0.881 | 0.881 | 0.784 |
\(Q_2\) | 0.828 | 0.639 | 0.856 | 0.766 | 0.814 | 0.669 | 0.841 | 0.697 | 0.992 | 0.742 | 0.941 | 0.812 | 0.871 | 0.701 |
\(Q_3\) | 0.861 | 0.728 | 0.886 | 0.843 | 0.851 | 0.756 | 0.873 | 0.781 | 0.922 | 0.816 | 0.942 | 0.842 | 0.896 | 0.778 |
\(Q_4\) | 0.736 | 0.635 | 0.756 | 0.706 | 0.713 | 0.639 | 0.746 | 0.668 | 0.806 | 0.716 | 0.861 | 0.783 | 0.779 | 0.685 |
\(Q_5\) | 0.718 | 0.684 | 0.759 | 0.742 | 0.710 | 0.686 | 0.738 | 0.712 | 0.843 | 0.808 | 0.908 | 0.867 | 0.790 | 0.755 |
\(Q_6\) | 0.841 | 0.704 | 0.874 | 0.817 | 0.833 | 0.730 | 0.857 | 0.756 | 0.914 | 0.783 | 0.941 | 0.845 | 0.882 | 0.756 |
\(Q_1\) | 0.349 | 0.25 | 0.248 | 0.196 | 0.274 | 0.211 | 0.290 | 0.220 | 0.494 | 0.422 | 0.675 | 0.530 | 0.455 | 0.355 |
\(Q_2\) | 0.313 | 0.177 | 0.213 | 0.145 | 0.239 | 0.155 | 0.253 | 0.159 | 0.494 | 0.307 | 0.655 | 0.441 | 0.445 | 0.265 |
\(Q_3\) | 0.396 | 0.260 | 0.298 | 0.216 | 0.325 | 0.228 | 0.340 | 0.236 | 0.577 | 0.426 | 0.704 | 0.574 | 0.507 | 0.371 |
\(Q_4\) | 0.336 | 0.247 | 0.239 | 0.190 | 0.264 | 0.206 | 0.279 | 0.215 | 0.487 | 0.387 | 0.622 | 0.522 | 0.438 | 0.343 |
\(Q_5\) | 0.171 | 0.252 | 0.128 | 0.190 | 0.137 | 0.205 | 0.147 | 0.216 | 0.364 | 0.433 | 0.562 | 0.590 | 0.302 | 0.375 |
\(Q_6\) | 0.387 | 0.235 | 0.291 | 0.178 | 0.317 | 0.193 | 0.332 | 0.203 | 0.588 | 0.388 | 0.745 | 0.551 | 0.515 | 0.340 |
\(Q_1\) | 0.190 | 0.136 | 0.197 | 0.183 | 0.170 | 0.135 | 0.194 | 0.156 | 0.363 | 0.265 | 0.502 | 0.376 | 0.333 | 0.239 |
\(Q_2\) | 0.216 | 0.100 | 0.191 | 0.148 | 0.173 | 0.106 | 0.203 | 0.119 | 0.343 | 0.172 | 0.473 | 0.257 | 0.315 | 0.157 |
\(Q_3\) | 0.241 | 0.153 | 0.223 | 0.188 | 0.204 | 0.145 | 0.231 | 0.169 | 0.433 | 0.268 | 0.569 | 0.405 | 0.395 | 0.250 |
\(Q_4\) | 0.189 | 0.144 | 0.176 | 0.173 | 0.156 | 0.137 | 0.182 | 0.157 | 0.324 | 0.253 | 0.450 | 0.345 | 0.293 | 0.228 |
\(Q_5\) | 0.092 | 0.119 | 0.132 | 0.148 | 0.090 | 0.113 | 0.108 | 0.132 | 0.256 | 0.278 | 0.414 | 0.426 | 0.222 | 0.248 |
\(Q_6\) | 0.224 | 0.144 | 0.211 | 0.209 | 0.192 | 0.147 | 0.218 | 0.171 | 0.372 | 0.237 | 0.508 | 0.336 | 0.333 | 0.210 |
\(Q_1\) | 0.228 | 0.118 | 0.233 | 0.121 | 0.210 | 0.101 | 0.230 | 0.120 | 0.494 | 0.304 | 0.813 | 0.488 | 0.430 | 0.263 |
\(Q_2\) | 0.292 | 0.162 | 0.316 | 0.232 | 0.274 | 0.172 | 0.303 | 0.191 | 0.501 | 0.280 | 0.726 | 0.444 | 0.465 | 0.270 |
\(Q_3\) | 0.284 | 0.190 | 0.323 | 0.217 | 0.273 | 0.180 | 0.302 | 0.203 | 0.566 | 0.330 | 0.856 | 0.474 | 0.489 | 0.286 |
\(Q_4\) | 0.259 | 0.159 | 0.299 | 0.187 | 0.250 | 0.150 | 0.278 | 0.172 | 0.528 | 0.314 | 0.803 | 0.493 | 0.458 | 0.279 |
\(Q_5\) | 0.354 | 0.256 | 0.401 | 0.307 | 0.342 | 0.252 | 0.376 | 0.279 | 0.641 | 0.484 | 0.884 | 0.647 | 0.564 | 0.426 |
\(Q_6\) | 0.392 | 0.241 | 0.445 | 0.313 | 0.372 | 0.247 | 0.417 | 0.273 | 0.665 | 0.376 | 0.885 | 0.520 | 0.590 | 0.347 |
Results of LamBERTa local (\(L_i\)) and global (G) models trained with UniRR.T\(^+\), for all sets of book-sentence-queries (QType-1), paraphrased-sentence-queries (QType-2), comment-queries (QType-3), comment-sentence-queries (QType-4), and case-queries (QType-5). (Bold values correspond to the best model for each query-set and evaluation criterion)

| Query-set | i | R \(L_i\) | R G | P \(L_i\) | P G | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | \(F^{M}\) \(L_i\) | \(F^{M}\) G |
|---|---|---|---|---|---|---|---|---|---|
QType-1 | \(Q_1\) | 0.615 | 0.592 | 0.617 | 0.591 | 0.615 | 0.591 | 0.616 | 0.591 |
\(Q_2\) | 0.674 | 0.636 | 0.684 | 0.640 | 0.676 | 0.637 | 0.679 | 0.638 | |
\(Q_3\) | 0.623 | 0.604 | 0.626 | 0.603 | 0.624 | 0.603 | 0.625 | 0.603 | |
\(Q_4\) | 0.331 | 0.313 | 0.326 | 0.312 | 0.324 | 0.312 | 0.328 | 0.312 | |
\(Q_5\) | 0.631 | 0.632 | 0.633 | 0.634 | 0.628 | 0.631 | 0.632 | 0.633 | |
\(Q_6\) | 0.711 | 0.675 | 0.720 | 0.677 | 0.710 | 0.675 | 0.715 | 0.676 | |
QType-2 | \(Q_1\) | 0.557 | 0.473 | 0.559 | 0.474 | 0.557 | 0.472 | 0.558 | 0.473
\(Q_2\) | 0.595 | 0.442 | 0.604 | 0.445 | 0.598 | 0.443 | 0.599 | 0.444 | |
\(Q_3\) | 0.563 | 0.477 | 0.566 | 0.479 | 0.563 | 0.477 | 0.564 | 0.478 | |
\(Q_4\) | 0.172 | 0.236 | 0.213 | 0.237 | 0.186 | 0.236 | 0.190 | 0.237 | |
\(Q_5\) | 0.522 | 0.498 | 0.526 | 0.499 | 0.522 | 0.497 | 0.524 | 0.498 | |
\(Q_6\) | 0.624 | 0.511 | 0.628 | 0.512 | 0.625 | 0.511 | 0.626 | 0.513 | |
QType-3 | \(Q_1\) | 0.312 | 0.241 | 0.315 | 0.242 | 0.312 | 0.241 | 0.313 | 0.242 |
\(Q_2\) | 0.296 | 0.164 | 0.300 | 0.165 | 0.297 | 0.164 | 0.298 | 0.164 | |
\(Q_3\) | 0.332 | 0.226 | 0.337 | 0.228 | 0.333 | 0.226 | 0.334 | 0.227 | |
\(Q_4\) | 0.234 | 0.174 | 0.241 | 0.175 | 0.236 | 0.174 | 0.237 | 0.174 | |
\(Q_5\) | 0.209 | 0.249 | 0.214 | 0.252 | 0.209 | 0.250 | 0.211 | 0.251 | |
\(Q_6\) | 0.361 | 0.234 | 0.365 | 0.236 | 0.361 | 0.234 | 0.363 | 0.235 | |
QType-4 | \(Q_1\) | 0.227 | 0.154 | 0.227 | 0.155 | 0.226 | 0.154 | 0.227 | 0.154 |
\(Q_2\) | 0.205 | 0.098 | 0.207 | 0.103 | 0.205 | 0.096 | 0.206 | 0.100 | |
\(Q_3\) | 0.259 | 0.153 | 0.261 | 0.154 | 0.258 | 0.153 | 0.260 | 0.153 | |
\(Q_4\) | 0.163 | 0.123 | 0.164 | 0.127 | 0.164 | 0.123 | 0.163 | 0.125 | |
\(Q_5\) | 0.151 | 0.172 | 0.152 | 0.179 | 0.150 | 0.174 | 0.151 | 0.175 | |
\(Q_6\) | 0.236 | 0.147 | 0.238 | 0.149 | 0.235 | 0.147 | 0.237 | 0.148 | |
QType-5 | \(Q_1\) | 0.283 | 0.179 | 0.289 | 0.183 | 0.284 | 0.180 | 0.286 | 0.181 |
\(Q_2\) | 0.294 | 0.170 | 0.304 | 0.175 | 0.298 | 0.172 | 0.299 | 0.173 | |
\(Q_3\) | 0.298 | 0.171 | 0.302 | 0.174 | 0.299 | 0.171 | 0.300 | 0.172 | |
\(Q_4\) | 0.274 | 0.192 | 0.276 | 0.193 | 0.275 | 0.191 | 0.275 | 0.192 | |
\(Q_5\) | 0.378 | 0.296 | 0.382 | 0.298 | 0.378 | 0.295 | 0.380 | 0.297 | |
\(Q_6\) | 0.404 | 0.226 | 0.408 | 0.229 | 0.405 | 0.226 | 0.406 | 0.227 |
Results of LamBERTa local (\(L_i\)) and global (G) models trained with UniRR.T\(^+\), for all sets of book-sentence-queries (QType-1), paraphrased-sentence-queries (QType-2), comment-queries (QType-3), comment-sentence-queries (QType-4), and case-queries (QType-5).

| Query-set | i | R \(L_i\) | R G | P \(L_i\) | P G | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | \(F^{M}\) \(L_i\) | \(F^{M}\) G |
|---|---|---|---|---|---|---|---|---|---|
QType-1 | \(Q_1\) | 0.247 | 0.220 | 0.312 | 0.269 | 0.268 | 0.236 | 0.276 | 0.242 |
\(Q_2\) | 0.198 | 0.187 | 0.256 | 0.237 | 0.216 | 0.203 | 0.223 | 0.209 | |
\(Q_3\) | 0.217 | 0.192 | 0.287 | 0.250 | 0.238 | 0.209 | 0.247 | 0.217 | |
\(Q_4\) | 0.191 | 0.179 | 0.234 | 0.217 | 0.205 | 0.191 | 0.210 | 0.196 | |
\(Q_5\) | 0.201 | 0.191 | 0.253 | 0.240 | 0.218 | 0.207 | 0.224 | 0.213 | |
\(Q_6\) | 0.268 | 0.243 | 0.306 | 0.283 | 0.279 | 0.254 | 0.286 | 0.262 | |
QType-2 | \(Q_1\) | 0.244 | 0.194 | 0.310 | 0.239 | 0.265 | 0.209 | 0.273 | 0.214 |
\(Q_2\) | 0.215 | 0.160 | 0.278 | 0.200 | 0.235 | 0.173 | 0.246 | 0.178 | |
\(Q_3\) | 0.206 | 0.164 | 0.280 | 0.224 | 0.229 | 0.182 | 0.238 | 0.189 | |
\(Q_4\) | 0.172 | 0.144 | 0.213 | 0.178 | 0.186 | 0.156 | 0.190 | 0.159 | |
\(Q_5\) | 0.186 | 0.168 | 0.235 | 0.213 | 0.202 | 0.183 | 0.208 | 0.188 | |
\(Q_6\) | 0.257 | 0.212 | 0.297 | 0.248 | 0.268 | 0.222 | 0.276 | 0.228 | |
QType-3 | \(Q_1\) | 0.245 | 0.167 | 0.319 | 0.215 | 0.269 | 0.183 | 0.277 | 0.188 |
\(Q_2\) | 0.170 | 0.096 | 0.239 | 0.134 | 0.192 | 0.108 | 0.199 | 0.112 | |
\(Q_3\) | 0.189 | 0.133 | 0.265 | 0.186 | 0.212 | 0.149 | 0.220 | 0.155 | |
\(Q_4\) | 0.169 | 0.126 | 0.215 | 0.161 | 0.184 | 0.138 | 0.189 | 0.141 | |
\(Q_5\) | 0.154 | 0.146 | 0.204 | 0.200 | 0.169 | 0.165 | 0.176 | 0.169 | |
\(Q_6\) | 0.240 | 0.167 | 0.285 | 0.211 | 0.252 | 0.178 | 0.261 | 0.186 | |
QType-4 | \(Q_1\) | 0.188 | 0.122 | 0.241 | 0.153 | 0.205 | 0.132 | 0.211 | 0.136 |
\(Q_2\) | 0.123 | 0.065 | 0.174 | 0.090 | 0.139 | 0.073 | 0.144 | 0.076 | |
\(Q_3\) | 0.156 | 0.094 | 0.230 | 0.133 | 0.176 | 0.105 | 0.184 | 0.110 | |
\(Q_4\) | 0.126 | 0.094 | 0.164 | 0.122 | 0.139 | 0.103 | 0.142 | 0.106 | |
\(Q_5\) | 0.118 | 0.115 | 0.188 | 0.176 | 0.140 | 0.134 | 0.145 | 0.139 | |
\(Q_6\) | 0.173 | 0.116 | 0.204 | 0.143 | 0.182 | 0.123 | 0.187 | 0.129 | |
QType-5 | \(Q_1\) | 0.309 | 0.199 | 0.381 | 0.237 | 0.334 | 0.212 | 0.341 | 0.216 |
\(Q_2\) | 0.193 | 0.110 | 0.265 | 0.151 | 0.215 | 0.122 | 0.223 | 0.127 | |
\(Q_3\) | 0.244 | 0.167 | 0.274 | 0.179 | 0.255 | 0.172 | 0.258 | 0.173 | |
\(Q_4\) | 0.233 | 0.158 | 0.296 | 0.199 | 0.254 | 0.172 | 0.261 | 0.176 | |
\(Q_5\) | 0.214 | 0.174 | 0.272 | 0.224 | 0.234 | 0.191 | 0.240 | 0.196 | |
\(Q_6\) | 0.245 | 0.159 | 0.276 | 0.185 | 0.253 | 0.165 | 0.259 | 0.171 |
8.1.2 Multi-label evaluation
Results of LamBERTa local (\(L_i\)) and global (G) models trained with UniRR.T\(^+\), for all sets of ICC-heading-queries (i.e., QType-6 query sets). Column 'a_cs' stands for average class size, i.e., the average no. of articles belonging to each query label.

| i | a_cs (Chapter) | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | P@3 \(L_i\) | P@3 G | a_cs (Subchapter) | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | P@3 \(L_i\) | P@3 G | a_cs (Section) | \(F^{\mu }\) \(L_i\) | \(F^{\mu }\) G | P@3 \(L_i\) | P@3 G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
\(Q_1\) | 26.1 | 0.314 | 0.251 | 0.867 | 0.733 | 14.4 | 0.307 | 0.253 | 0.875 | 0.792 | 7.5 | 0.348 | 0.205 | 0.722 | 0.556 |
\(Q_2\) | 69.0 | 0.256 | 0.247 | 0.800 | 0.800 | 11.6 | 0.167 | 0.145 | 0.519 | 0.370 | 11.8 | 0.234 | 0.089 | 0.500 | 0.333 |
\(Q_3\) | 40.3 | 0.287 | 0.225 | 0.667 | 0.556 | 17.6 | 0.185 | 0.145 | 0.778 | 0.389 | 8.5 | 0.270 | 0.219 | 0.577 | 0.500 |
\(Q_4\) | 98.4 | 0.238 | 0.083 | 0.444 | 0.333 | 17.1 | 0.216 | 0.144 | 0.673 | 0.531 | 8.3 | 0.218 | 0.149 | 0.653 | 0.531 |
\(Q_5\) | 64.8 | 0.174 | 0.163 | 0.545 | 0.727 | 22.2 | 0.263 | 0.237 | 0.733 | 0.733 | 10.9 | 0.215 | 0.219 | 0.558 | 0.512 |
\(Q_6\) | 66.2 | 0.402 | 0.198 | 1.000 | 0.600 | 21.4 | 0.319 | 0.202 | 0.667 | 0.467 | 6.5 | 0.290 | 0.248 | 0.571 | 0.486 |
8.2 Ablation study
| query-set | method | R | P | \(F^{\mu }\) | \(F^{M}\) | R@10 | MRR | E | E@10 |
|---|---|---|---|---|---|---|---|---|---|
| QType-1 | T | 0.530 | 0.855 | 0.623 | 0.655 | 0.619 | 0.564 | 3.023 | 1.335 |
| | UniRR | 0.675 | 0.928 | 0.755 | 0.782 | 0.851 | 0.744 | 0.343 | 0.167 |
| | BiRR | 0.552 | 0.775 | 0.621 | 0.645 | 0.745 | 0.635 | 0.790 | 0.342 |
| | TriRR | 0.584 | 0.763 | 0.637 | 0.661 | 0.801 | 0.674 | 1.149 | 0.455 |
| | CasRR | 0.555 | 0.901 | 0.653 | 0.687 | 0.712 | 0.608 | 1.422 | 0.808 |
| | TglRR | 0.900 | 0.955 | 0.910 | 0.927 | 0.980 | 0.929 | 0.459 | 0.281 |
| | UniRR.T\(^+\) | **0.972** | **0.981** | **0.971** | **0.977** | **0.999** | **0.983** | **0.149** | **0.089** |
| | CasRR.T\(^+\) | 0.823 | 0.929 | 0.843 | 0.873 | 0.903 | 0.853 | 0.863 | 0.457 |
| | TglRR.T\(^+\) | 0.919 | 0.972 | 0.927 | 0.945 | 0.977 | 0.940 | 0.492 | 0.298 |
| QType-2 | T | 0.410 | 0.598 | 0.456 | 0.487 | 0.551 | 0.463 | 3.892 | 1.706 |
| | UniRR | 0.549 | 0.744 | 0.605 | 0.632 | 0.754 | 0.624 | **1.310** | **0.585** |
| | BiRR | 0.342 | 0.470 | 0.374 | 0.396 | 0.581 | 0.436 | 2.084 | 0.899 |
| | TriRR | 0.442 | 0.620 | 0.494 | 0.516 | 0.699 | 0.544 | 2.253 | 0.881 |
| | CasRR | 0.383 | 0.620 | 0.444 | 0.473 | 0.597 | 0.455 | 2.289 | 1.232 |
| | TglRR | 0.612 | 0.722 | 0.626 | 0.662 | 0.823 | 0.684 | 2.101 | 1.107 |
| | UniRR.T\(^+\) | **0.828** | **0.856** | **0.814** | **0.841** | **0.941** | **0.871** | 1.620 | 0.732 |
| | CasRR.T\(^+\) | 0.625 | 0.740 | 0.639 | 0.677 | 0.784 | 0.685 | 2.221 | 1.074 |
| | TglRR.T\(^+\) | 0.632 | 0.729 | 0.642 | 0.677 | 0.854 | 0.705 | 2.311 | 1.174 |
| QType-3 | T | 0.023 | 0.008 | 0.010 | 0.012 | 0.171 | 0.073 | 6.312 | 2.816 |
| | UniRR | 0.296 | 0.208 | 0.230 | 0.244 | 0.618 | 0.425 | 4.368 | 2.051 |
| | BiRR | 0.212 | 0.140 | 0.156 | 0.169 | 0.494 | 0.321 | 5.185 | 2.349 |
| | TriRR | 0.194 | 0.113 | 0.133 | 0.143 | 0.478 | 0.295 | 5.885 | 2.523 |
| | CasRR | 0.110 | 0.075 | 0.082 | 0.089 | 0.363 | 0.199 | **4.034** | **1.800** |
| | TglRR | 0.261 | 0.186 | 0.203 | 0.217 | 0.624 | 0.394 | 4.048 | 1.939 |
| | UniRR.T\(^+\) | **0.313** | **0.213** | **0.239** | **0.253** | **0.655** | **0.445** | 4.938 | 2.266 |
| | CasRR.T\(^+\) | 0.232 | 0.157 | 0.176 | 0.187 | 0.525 | 0.339 | 4.717 | 2.172 |
| | TglRR.T\(^+\) | 0.217 | 0.162 | 0.173 | 0.185 | 0.593 | 0.347 | 4.254 | 2.100 |
| QType-4 | T | 0.037 | 0.042 | 0.028 | 0.039 | 0.137 | 0.076 | 6.379 | 2.800 |
| | UniRR | 0.175 | 0.199 | 0.160 | 0.186 | 0.439 | 0.271 | 4.000 | 1.834 |
| | BiRR | 0.104 | 0.125 | 0.094 | 0.114 | 0.309 | 0.185 | 4.592 | 2.065 |
| | TriRR | 0.090 | 0.101 | 0.077 | 0.096 | 0.275 | 0.163 | 5.357 | 2.241 |
| | CasRR | 0.056 | 0.068 | 0.049 | 0.061 | 0.219 | 0.115 | **3.268** | **1.615** |
| | TglRR | 0.154 | 0.178 | 0.141 | 0.165 | 0.416 | 0.253 | 4.174 | 1.991 |
| | UniRR.T\(^+\) | **0.216** | 0.191 | **0.173** | **0.203** | **0.473** | **0.315** | 5.077 | 2.311 |
| | CasRR.T\(^+\) | 0.135 | 0.161 | 0.119 | 0.147 | 0.347 | 0.211 | 4.586 | 2.130 |
| | TglRR.T\(^+\) | 0.144 | **0.202** | 0.143 | 0.168 | 0.398 | 0.226 | 4.148 | 2.027 |
| QType-5 | T | 0.036 | 0.025 | 0.022 | 0.029 | 0.127 | 0.081 | 6.570 | 2.968 |
| | UniRR | 0.260 | 0.288 | 0.232 | 0.273 | 0.666 | 0.436 | **4.446** | **2.074** |
| | BiRR | 0.232 | 0.236 | 0.197 | 0.234 | 0.548 | 0.357 | 4.873 | 2.198 |
| | TriRR | 0.225 | 0.248 | 0.204 | 0.236 | 0.568 | 0.380 | 5.498 | 2.408 |
| | CasRR | 0.082 | 0.089 | 0.068 | 0.086 | 0.334 | 0.187 | 5.429 | 2.400 |
| | TglRR | 0.278 | **0.318** | 0.258 | 0.297 | 0.686 | 0.434 | 4.560 | 2.150 |
| | UniRR.T\(^+\) | **0.292** | 0.316 | **0.274** | **0.303** | **0.726** | **0.465** | 4.972 | 2.299 |
| | CasRR.T\(^+\) | 0.238 | 0.295 | 0.233 | 0.263 | 0.579 | 0.380 | 4.822 | 2.158 |
| | TglRR.T\(^+\) | 0.283 | 0.316 | 0.273 | 0.299 | 0.700 | 0.463 | 4.638 | 2.137 |
Results of LamBERTa models trained with UniRR.T\(^+\) with \(minTU=64\), for each type of query-set. The second row of each pair reports the percentage variation w.r.t. the corresponding UniRR.T\(^+\) model with default \(minTU\).

| query-set | R | P | \(F^{\mu }\) | \(F^{M}\) | R@10 | MRR | E | E@10 |
|---|---|---|---|---|---|---|---|---|
| QType-1 | 0.971 | 0.982 | 0.971 | 0.977 | 0.996 | 0.982 | 0.090 | 0.072 |
| | − 0.10% | + 0.13% | − 0.04% | − 0.04% | − 0.34% | − 0.13% | − 39.40% | − 19.41% |
| QType-2 | 0.797 | 0.825 | 0.781 | 0.811 | 0.928 | 0.845 | 1.266 | 0.657 |
| | − 3.73% | − 3.67% | − 4.09% | − 3.62% | − 1.43% | − 2.97% | − 21.85% | − 10.30% |
| QType-3 | 0.351 | 0.254 | 0.280 | 0.295 | 0.666 | 0.481 | 3.661 | 1.814 |
| | + 12.14% | + 19.25% | + 17.15% | + 16.60% | + 1.68% | + 8.09% | − 25.86% | − 19.95% |
| QType-4 | 0.221 | 0.204 | 0.185 | 0.212 | 0.482 | 0.308 | 3.889 | 1.928 |
| | + 2.31% | + 7.37% | + 6.94% | + 4.95% | + 2.55% | − 0.96% | − 23.40% | − 16.57% |
| QType-5 | 0.342 | 0.352 | 0.320 | 0.347 | 0.740 | 0.460 | 3.615 | 1.872 |
| | + 17.13% | + 11.33% | + 16.67% | + 14.53% | + 1.93% | − 1.08% | − 27.27% | − 18.58% |
UniRR.T\(^+\) with \(minTU=64\), for each type of query-set

| Query-set | Clustering-based | | | | ICC-Classification-based | | | |
---|---|---|---|---|---|---|---|---|
| | R | P | \(F^{\mu }\) | \(F^{M}\) | R | P | \(F^{\mu }\) | \(F^{M}\) |
| QType-1 | 0.658 | 0.668 | 0.662 | 0.663 | 0.197 | 0.252 | 0.215 | 0.221 |
| | − 2.08% | − 2.20% | − 2.07% | − 2.21% | − 0.51% | − 1.56% | − 0.46% | − 0.90% |
| QType-2 | 0.556 | 0.565 | 0.559 | 0.560 | 0.198 | 0.252 | 0.215 | 0.222 |
| | − 6.55% | − 6.46% | − 6.52% | − 6.51% | − 8.04% | − 9.34% | − 8.45% | − 8.61% |
| QType-3 | 0.322 | 0.327 | 0.324 | 0.324 | 0.166 | 0.228 | 0.185 | 0.192 |
| | + 8.78% | + 9.00% | + 9.09% | + 8.72% | − 2.35% | − 4.60% | − 3.65% | − 3.52% |
| QType-4 | 0.202 | 0.206 | 0.203 | 0.204 | 0.131 | 0.183 | 0.148 | 0.153 |
| | − 0.98% | + 0.00% | − 0.98% | − 0.49% | + 6.50% | + 5.17% | + 6.47% | + 6.25% |
| QType-5 | 0.312 | 0.327 | 0.319 | 0.320 | 0.181 | 0.247 | 0.201 | 0.209 |
| | + 6.04% | + 7.72% | + 7.12% | + 7.09% | − 6.22% | − 6.79% | − 6.51% | − 6.28% |
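For reference, the rank-based criteria reported in the tables above, i.e., recall-at-k (R@10) and mean reciprocal rank (MRR), can be computed from each query's ranked list of predicted articles as in the following generic sketch (function and variable names are ours, not the paper's implementation):

```python
def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of a query's relevant articles found among the top-k predictions."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mean_reciprocal_rank(results):
    """Average over queries of 1/rank of the first relevant article (0 if none found).

    `results` is a list of (ranked_ids, relevant_ids) pairs, one per query.
    """
    total = 0.0
    for ranked_ids, relevant_ids in results:
        for rank, article in enumerate(ranked_ids, start=1):
            if article in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(results)
```

Both criteria are computed per query and then averaged over a query-set, which is how the R@10 and MRR columns of the tables should be read.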
8.2.1 Training-instance labeling schemes
T is slightly better than CasRR). More interestingly, a lower n-gram size appears to be beneficial to the effectiveness of the model, especially on QType-3, QType-4, and QType-5 queries: indeed, UniRR always outperforms both BiRR and TriRR, and it also behaves better than the cascade scheme (CasRR), with a gap that is again more evident on the comment-based queries. The combination of n-grams of varying size reflected by the TglRR scheme leads to a significant increase in performance over all previously mentioned schemes for QType-1, QType-2, and QType-5 queries. This suggests that more sophisticated labeling schemes can lead to higher effectiveness in the learned model. Nonetheless, superior performance is obtained by the schemes that emphasize the article title, each of which improves upon the corresponding scheme without title emphasis, with UniRR.T\(^+\) being the best-performing method by far according to all criteria.

8.2.2 Training units per article
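The number of training units generated per article depends directly on the labeling scheme. To make the n-gram intuition concrete, a scheme like UniRR can be read as labeling each individual sentence of an article with that article's id, while BiRR and TriRR group 2 and 3 consecutive sentences per instance; the sketch below follows that reading (our assumption, the paper's exact construction may differ), with the cascade (CasRR) and triangle (TglRR) schemes combining n-grams of several sizes:

```python
def sentence_ngrams(sentences, n):
    """Consecutive groups of n sentences (sliding window over the article)."""
    return [" ".join(sentences[i:i + n]) for i in range(len(sentences) - n + 1)]

def label_article(article_id, sentences, n=1):
    """Label every sentence n-gram of an article with the article's id.

    n = 1, 2, 3 corresponds to the UniRR/BiRR/TriRR reading sketched above;
    cascade (CasRR) and triangle (TglRR) schemes would combine the output of
    several n values.  Hedged illustration, not the authors' code.
    """
    return [(text, article_id) for text in sentence_ngrams(sentences, n)]
```

Under this reading, longer articles naturally yield more training units, which is why a minimum threshold on training units per article (the \(minTU\) parameter) matters for the sparsest classes.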
8.3 Comparative analysis
8.3.1 Text-based law article prediction
-
BiLSTM (Liu et al. 2016; Zhou et al. 2016), a bidirectional LSTM model used as a sequence encoder. LSTM models have been widely used in text classification as they can capture contextual information while representing the sentence by a fixed-size vector. The model exploited in this evaluation uses 2 BiLSTM layers, with 32 hidden units per layer.
-
TextCNN (Kim 2014), a convolutional-neural-network-based model with multiple filter widths for text encoding and classification. Every sentence is represented as a two-dimensional tensor of shape (n, d), where n is the sentence length and d is the dimensionality of the word-embedding vectors. The TextCNN model utilizes three different filter windows of sizes \(\{3,4,5\}\), 100 feature maps per window size, a ReLU activation function, and max-pooling.
-
TextRCNN (Lai et al. 2015), a bidirectional LSTM with a pooling layer on the last sequence output. TextRCNN therefore combines the recurrent neural network and convolutional network to leverage the advantages of the individual models in capturing the text semantics. The model first exploits a recurrent structure to learn word representations for every word in the text, thus capturing the contextual information; afterwards, max-pooling is applied to determine which features are important for the classification task.
-
Seq2Seq-A (Du and Huang 2018; Bahdanau et al. 2015), a Seq2Seq model with attention mechanism. Seq2Seq models have been widely used in machine translation and document summarization due to their capability to generate new sequences based on observed text data. For text classification, the Seq2Seq-A model here utilizes a single-layer BiLSTM encoder with 32 hidden units. The encoder learns a hidden representation for every word in an input sentence, and its final state is used to learn attention scores for each word in the sentence. The weighted sum of the encoder hidden states (i.e., the word hidden states) gives the attention output vector, which is then concatenated with the hidden representation and passed to a linear layer to produce the final classification.
-
Transformer model for text classification, adapted from the model originally proposed for the task of machine translation in Vaswani et al. (2017). The key aspect of this model is the use of an attention mechanism to deal with long-range dependencies without resorting to RNNs. The encoder part of the original Transformer model is used for classification; it is composed of 6 layers, each having two sub-layers, namely a multi-head attention layer and a 2-layer feed-forward network. Compared to BERT, residual connections, layer normalization, and masking are discarded.
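The core TextCNN operation described above, i.e., convolving filters of several window sizes over the sentence's embedding matrix followed by max-over-time pooling, can be sketched as follows (a simplified pure-Python illustration with a single filter per window size instead of 100 feature maps; names are ours):

```python
def relu(x):
    return x if x > 0.0 else 0.0

def conv_max_pool(embeddings, filt):
    """Convolve one filter over the (n, d) sentence matrix, apply ReLU,
    then max-over-time pooling, yielding a single feature value."""
    h = len(filt)           # filter window size (e.g., 3, 4, or 5)
    d = len(embeddings[0])  # word-embedding dimensionality
    feats = []
    for i in range(len(embeddings) - h + 1):
        s = sum(filt[j][k] * embeddings[i + j][k]
                for j in range(h) for k in range(d))
        feats.append(relu(s))
    return max(feats)

def textcnn_features(embeddings, filters):
    """One pooled feature per filter; TextCNN concatenates these features
    and feeds them to a final linear classification layer."""
    return [conv_max_pool(embeddings, f) for f in filters]
```

Max-over-time pooling is what makes the representation independent of sentence length: however many windows a filter produces, only the strongest activation survives.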
UniRR.T\(^+\) labeling scheme, and tested over all sets of book-sentence-queries (QType-1, upper subtable), paraphrased-sentence-queries (QType-2, second upper subtable), comment-sentence-queries (QType-4, third upper subtable), and case-queries (QType-5, bottom subtable): best-performing values of precision, recall, and micro-averaged F-measure. (Bold values correspond to the best performance obtained by a competing method, for each query-set and evaluation criterion; LamBERTa performance values, formatted in italic, are also reported from Table 4 to ease the comparison with the competing methods)

| | LamBERTa | | | TextCNN | | | BiLSTM | | | TextRCNN | | | Seq2Seq-A | | | Transformer | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| i | P | R | \(F^{\mu }\) | P | R | \(F^{\mu }\) | P | R | \(F^{\mu }\) | P | R | \(F^{\mu }\) | P | R | \(F^{\mu }\) | P | R | \(F^{\mu }\) |
\(Q_1\) | 0.975 | 0.962 | 0.961 | 0.942 | 0.911 | 0.910 | 0.752 | 0.712 | 0.680 | 0.894 | 0.838 | 0.835 | 0.620 | 0.629 | 0.564 | 0.959 | 0.950 | 0.948 |
\(Q_2\) | 0.981 | 0.972 | 0.971 | 0.959 | 0.940 | 0.936 | 0.735 | 0.730 | 0.681 | 0.894 | 0.852 | 0.845 | 0.630 | 0.670 | 0.585 | 0.965 | 0.955 | 0.954 |
\(Q_3\) | 0.994 | 0.990 | 0.990 | 0.962 | 0.951 | 0.951 | 0.727 | 0.724 | 0.674 | 0.905 | 0.865 | 0.861 | 0.631 | 0.641 | 0.573 | 0.970 | 0.965 | 0.965 |
\(Q_4\) | 0.983 | 0.961 | 0.964 | 0.971 | 0.933 | 0.938 | 0.789 | 0.772 | 0.736 | 0.919 | 0.875 | 0.872 | 0.727 | 0.703 | 0.665 | 0.969 | 0.947 | 0.951 |
\(Q_5\) | 0.944 | 0.910 | 0.909 | 0.895 | 0.857 | 0.848 | 0.689 | 0.670 | 0.626 | 0.817 | 0.763 | 0.750 | 0.669 | 0.646 | 0.602 | 0.936 | 0.903 | 0.903 |
\(Q_6\) | 0.986 | 0.979 | 0.978 | 0.949 | 0.938 | 0.935 | 0.745 | 0.723 | 0.677 | 0.903 | 0.866 | 0.858 | 0.757 | 0.718 | 0.685 | 0.955 | 0.957 | 0.956 |
\(Q_1\) | 0.866 | 0.841 | 0.828 | 0.800 | 0.769 | 0.750 | 0.360 | 0.412 | 0.349 | 0.674 | 0.646 | 0.619 | 0.345 | 0.381 | 0.324 | 0.801 | 0.774 | 0.754 |
\(Q_2\) | 0.856 | 0.828 | 0.814 | 0.760 | 0.726 | 0.706 | 0.409 | 0.442 | 0.382 | 0.646 | 0.643 | 0.601 | 0.353 | 0.384 | 0.327 | 0.813 | 0.787 | 0.762 |
\(Q_3\) | 0.886 | 0.861 | 0.851 | 0.817 | 0.765 | 0.758 | 0.464 | 0.493 | 0.435 | 0.707 | 0.672 | 0.648 | 0.361 | 0.409 | 0.342 | 0.852 | 0.804 | 0.795 |
\(Q_4\) | 0.756 | 0.736 | 0.713 | 0.729 | 0.643 | 0.669 | 0.310 | 0.330 | 0.292 | 0.643 | 0.607 | 0.582 | 0.278 | 0.297 | 0.258 | 0.733 | 0.651 | 0.674 |
\(Q_5\) | 0.759 | 0.718 | 0.710 | 0.714 | 0.666 | 0.649 | 0.381 | 0.399 | 0.353 | 0.637 | 0.610 | 0.580 | 0.377 | 0.393 | 0.345 | 0.734 | 0.715 | 0.697 |
\(Q_6\) | 0.874 | 0.841 | 0.833 | 0.788 | 0.737 | 0.725 | 0.451 | 0.451 | 0.405 | 0.662 | 0.625 | 0.603 | 0.378 | 0.438 | 0.364 | 0.795 | 0.756 | 0.742 |
\(Q_1\) | 0.197 | 0.190 | 0.170 | 0.158 | 0.152 | 0.124 | 0.013 | 0.018 | 0.013 | 0.146 | 0.147 | 0.120 | 0.034 | 0.019 | 0.018 | 0.114 | 0.100 | 0.089 |
\(Q_2\) | 0.191 | 0.216 | 0.173 | 0.170 | 0.155 | 0.136 | 0.015 | 0.037 | 0.009 | 0.166 | 0.167 | 0.133 | 0.011 | 0.034 | 0.012 | 0.121 | 0.093 | 0.092 |
\(Q_3\) | 0.223 | 0.241 | 0.204 | 0.201 | 0.208 | 0.173 | 0.006 | 0.021 | 0.007 | 0.205 | 0.204 | 0.176 | 0.019 | 0.029 | 0.019 | 0.118 | 0.112 | 0.102 |
\(Q_4\) | 0.176 | 0.189 | 0.156 | 0.159 | 0.149 | 0.126 | 0.006 | 0.007 | 0.005 | 0.174 | 0.173 | 0.144 | 0.006 | 0.016 | 0.004 | 0.073 | 0.071 | 0.061 |
\(Q_5\) | 0.132 | 0.092 | 0.090 | 0.109 | 0.080 | 0.090 | 0.013 | 0.017 | 0.008 | 0.104 | 0.092 | 0.085 | 0.004 | 0.087 | 0.006 | 0.089 | 0.075 | 0.070 |
\(Q_6\) | 0.211 | 0.224 | 0.192 | 0.183 | 0.166 | 0.155 | 0.014 | 0.024 | 0.014 | 0.186 | 0.185 | 0.157 | 0.017 | 0.065 | 0.020 | 0.120 | 0.103 | 0.090 |
\(Q_1\) | 0.233 | 0.228 | 0.210 | 0.147 | 0.137 | 0.123 | 0.009 | 0.009 | 0.008 | 0.155 | 0.144 | 0.123 | 0.001 | 0.005 | 0.002 | 0.102 | 0.089 | 0.086 |
\(Q_2\) | 0.316 | 0.292 | 0.274 | 0.186 | 0.163 | 0.132 | 0.009 | 0.023 | 0.009 | 0.172 | 0.169 | 0.143 | 0.023 | 0.012 | 0.016 | 0.118 | 0.092 | 0.104 |
\(Q_3\) | 0.323 | 0.284 | 0.273 | 0.221 | 0.267 | 0.219 | 0.006 | 0.013 | 0.008 | 0.178 | 0.173 | 0.151 | 0.010 | 0.019 | 0.012 | 0.193 | 0.115 | 0.124 |
\(Q_4\) | 0.299 | 0.259 | 0.250 | 0.216 | 0.187 | 0.170 | 0.007 | 0.009 | 0.004 | 0.115 | 0.112 | 0.095 | 0.021 | 0.011 | 0.011 | 0.106 | 0.088 | 0.097 |
\(Q_5\) | 0.401 | 0.354 | 0.342 | 0.271 | 0.251 | 0.222 | 0.016 | 0.016 | 0.009 | 0.288 | 0.232 | 0.226 | 0.005 | 0.020 | 0.006 | 0.186 | 0.144 | 0.152 |
\(Q_6\) | 0.445 | 0.392 | 0.372 | 0.357 | 0.322 | 0.294 | 0.045 | 0.050 | 0.033 | 0.301 | 0.303 | 0.259 | 0.022 | 0.029 | 0.016 | 0.201 | 0.130 | 0.137 |
UniRR.T\(^+\) labeled data for training. We carried out several runs by varying the main parameters as previously discussed, and eventually selected the best-performing results for each competitor and query-set, which are shown in Table 11.

8.3.2 Attribute-aware law article prediction
UniRR.T\(^+\) labeling scheme, and tested over all sets of paraphrased-sentence-queries (QType-2, upper subtable), comment-sentence-queries (QType-4, second upper subtable), and case-queries (QType-5, bottom subtable). The table shows best-performing values of precision, recall, and macro-averaged F-measure. (LamBERTa performance values, formatted in italic, are also reported from Table 4 to ease the comparison with the competing method)

| | LamBERTa | | | A-FewShotAP (Hu et al. 2018) | | |
---|---|---|---|---|---|---|
| i | P | R | \(F^{M}\) | P | R | \(F^{M}\) |
\(Q_1\) | 0.866 | 0.841 | 0.853 | 0.355 | 0.410 | 0.381 |
\(Q_2\) | 0.856 | 0.828 | 0.841 | 0.328 | 0.378 | 0.351 |
\(Q_3\) | 0.886 | 0.861 | 0.873 | 0.450 | 0.514 | 0.480 |
\(Q_4\) | 0.756 | 0.736 | 0.746 | 0.319 | 0.384 | 0.348 |
\(Q_5\) | 0.759 | 0.718 | 0.738 | 0.382 | 0.444 | 0.411 |
\(Q_6\) | 0.874 | 0.841 | 0.857 | 0.412 | 0.474 | 0.441 |
\(Q_1\) | 0.197 | 0.190 | 0.194 | 0.018 | 0.031 | 0.023 |
\(Q_2\) | 0.191 | 0.216 | 0.203 | 0.023 | 0.043 | 0.030 |
\(Q_3\) | 0.223 | 0.241 | 0.231 | 0.027 | 0.038 | 0.032 |
\(Q_4\) | 0.176 | 0.189 | 0.182 | 0.029 | 0.047 | 0.036 |
\(Q_5\) | 0.132 | 0.092 | 0.108 | 0.035 | 0.042 | 0.038 |
\(Q_6\) | 0.211 | 0.224 | 0.218 | 0.031 | 0.037 | 0.034 |
\(Q_1\) | 0.233 | 0.228 | 0.230 | 0.006 | 0.013 | 0.009 |
\(Q_2\) | 0.316 | 0.292 | 0.303 | 0.016 | 0.021 | 0.018 |
\(Q_3\) | 0.323 | 0.284 | 0.302 | 0.016 | 0.018 | 0.017 |
\(Q_4\) | 0.299 | 0.259 | 0.278 | 0.007 | 0.010 | 0.008 |
\(Q_5\) | 0.401 | 0.354 | 0.376 | 0.018 | 0.027 | 0.022 |
\(Q_6\) | 0.445 | 0.392 | 0.417 | 0.025 | 0.037 | 0.030 |
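Throughout these tables, \(F^{\mu }\) and \(F^{M}\) denote the micro- and macro-averaged F-measure. The distinction matters in an extreme multi-class setting such as ICC article prediction: micro-averaging pools correct predictions and errors over all article classes, whereas macro-averaging computes a per-class F1 and averages it, so that rarely queried articles weigh as much as frequent ones. A generic sketch (not the paper's evaluation code):

```python
from collections import Counter

def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

def micro_macro_f(true_labels, pred_labels):
    """Micro- and macro-averaged F-measure for single-label multi-class data."""
    classes = set(true_labels) | set(pred_labels)
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Micro: pool the counts over classes (equals accuracy in the
    # single-label case, since total FP == total FN).
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = f1(TP / (TP + FP), TP / (TP + FN)) if TP else 0.0
    # Macro: average the per-class F1, so rare classes weigh equally.
    per_class = []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class.append(f1(p, r))
    macro = sum(per_class) / len(per_class)
    return micro, macro
```

Because macro-averaging penalizes models that only do well on a handful of popular articles, the gap between \(F^{\mu }\) and \(F^{M}\) in the tables is a rough indicator of how evenly a model's quality is spread across the code's articles.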