



Threshold-Based Retrieval and Textual Entailment Detection on Legal Bar Exam Questions

Sabine Wehnert, Sayed Anisul Hoque, Wolfram Fenske, Gunter Saake
Otto von Guericke University Magdeburg, Germany

arXiv:1905.13350v1 [cs.IR] 30 May 2019

ABSTRACT
Getting an overview of the legal domain has become challenging, especially in a broad, international context. Legal question answering systems have the potential to alleviate this task by automatically retrieving relevant legal texts for a specific statement and checking whether the meaning of the statement can be inferred from the found documents. We investigate a combination of the BM25 scoring method of Elasticsearch with word embeddings trained on English translations of the German and Japanese civil law. For this, we define criteria which select a dynamic number of relevant documents according to threshold scores. Exploiting two deep learning classifiers and their respective prediction bias with a threshold-based answer inclusion criterion has proven beneficial for the textual entailment task when compared to the baseline.

CCS CONCEPTS
• Information systems → Question answering; Similarity measures; Relevance assessment; • Computing methodologies → Neural networks.

KEYWORDS
legal text retrieval, textual entailment, stacked encoder, explainable artificial intelligence, threshold-based relevance scoring

ACM Reference Format:
Sabine Wehnert, Sayed Anisul Hoque, Wolfram Fenske, and Gunter Saake. 2019. Threshold-Based Retrieval and Textual Entailment Detection on Legal Bar Exam Questions. In Proceedings of COLIEE 2019 workshop: Competition on Legal Information Extraction/Entailment (COLIEE 2019). ACM, New York, NY, USA, 9 pages.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
COLIEE 2019, June 21, 2019, Montreal, Quebec
© 2019 Copyright held by the owner/author(s).

1 INTRODUCTION
Nowadays, globalization poses a challenge for many international organizations, since they need to ensure compliance with the laws of all jurisdictions falling under the scope of their activities. Tracking changes in law is a challenging task, especially in statutory law, where a single modification may affect the applicability of several legal articles due to implicit co-dependencies between these documents. While domain experts are mostly required to ensure a reliable assessment of relationships among laws and their implications, the number of legal documents is hard for a single person to keep track of. Therefore, a decision support system can help in finding relevant laws and applying them to a specific question or statement.¹ Finding out whether a statement is true, given a corpus of legal text, falls under the task of legal question answering. A legal question answering system consists of two major parts: document retrieval and textual entailment recognition. In the retrieval phase, relevant law articles are selected for a query, which has the form of a statement that shall be supported or contradicted by the law articles from the document collection. During the textual entailment phase, the query and the accordingly retrieved legal documents are processed by a classification algorithm which returns “yes” in case of positive textual entailment or “no” otherwise. This work is a contribution to the Competition on Legal Information Extraction/Entailment (COLIEE), which provides a dataset from Japanese bar exam questions (translated to English) for evaluating the system performance on both tasks, retrieval and entailment classification. Our contribution involves the following methods:

• We combine results from BM25 scoring with word embedding-based retrieval.
• We develop a stacked encoder ensemble for entailment detection.
• We use thresholding for both approaches.

The remainder of this work is structured as follows: Section 2 outlines related work for both tasks with respect to their achievements using methods similar to our approach. In Section 3, we describe basic concepts for string representation in machine learning models, scoring methods and stacked encoders. We explain our approach in detail in Section 4 and show evaluation results in Section 5. After discussing those results, we conclude our findings and mention our future work considerations.

¹ The work is supported by Legal Horizon AG, Grant No.:1704/00082
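The lexical half of the retrieval phase named above is BM25 scoring. As a rough illustration only, the following Python sketch scores tokenized articles against a tokenized query with the standard Okapi BM25 formula (using a non-negative idf variant). It is not the authors' Elasticsearch setup: the function name `bm25_scores`, the toy articles, the whitespace tokenization and the parameter values `k1=1.2` and `b=0.75` are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    k1 and b are assumed, commonly used parameter values.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Non-negative idf variant: log(1 + (N - n + 0.5) / (n + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy "articles" and query, tokenized by whitespace.
articles = [
    "a contract is formed by offer and acceptance".split(),
    "a minor may rescind a contract made without consent".split(),
    "ownership of land is transferred by registration".split(),
]
query = "minor rescind contract".split()
scores = bm25_scores(query, articles)
```

In this toy setup the second article matches all three query terms and therefore ranks first, while the third article shares no term with the query and receives a score of zero.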



2 RELATED WORK
The related work for our approach is divided into two parts: the legal information retrieval task and the entailment detection task. The first part consists of approaches using BM25 scoring or word embeddings, as well as similarity thresholding for a retrieval task. We further present deep learning methods, followed by approaches using thresholds for a textual entailment task.

2.1 Legal Information Retrieval
2.1.1 BM25-Based Solutions. In the COLIEE ’16 competition, Onodera and Yoshioka apply BM25 scoring for information retrieval with several extensions using query keyword expansion. Their best result was an F-measure of 54.5% [11]. Arora et al. observe the best score with the BM25 scoring method on a different task of legal document retrieval [1], compared to language models and term frequency-inverse document frequency (TF-IDF) weighting. This finding contradicts the previous observations from COLIEE competitions and the FIRE 2017 IRLeD Track, where ranking SVMs [12] or language models [17, 32] performed better than mere BM25 scoring. Despite those observations, BM25 has been shown to provide at least competitive results in many cases, so we consider it as part of our approach.

2.1.2 Word Embeddings. Word embeddings have proven to be useful in many natural language processing contexts. We outline several works which have used this document feature representation for legal information retrieval. During the COLIEE ’18 competition, the SPABS team was able to overcome vocabulary mismatch in some cases using an RNN-based solution with Word2Vec embeddings trained on English legal documents [34]. Team UB used word embeddings with PL2 term weighting [34]. Yoshioka et al. suggest using semantic matching techniques for hard questions involving vocabulary mismatch, combined with more reliable lexical methods for easy questions [34]. This is the main motivation for our retrieval system, which incorporates BM25 scoring as a lexical method and word embeddings as a semantic representation.

2.1.3 Thresholding. Thresholding based on similarity values can improve retrieval results by filtering out low-scoring matches. Islam and Inkpen use similarity thresholds to increase the precision of text matching [10]. Stein et al. also employ thresholds for plagiarized document retrieval [29]. In the COLIEE ’18 competition, team UBIRLED uses a similarity threshold for filtering out irrelevant case judgments [13]. Nanda et al. select the top-5 matching documents from a topic clustering approach [20]. Given the document with the highest similarity score to the query, they apply thresholding, such that any further document will be incorporated into the result set if the distance to the topmost document is less than 15%. Our approach uses a similar criterion for document inclusion.

2.2 Legal Textual Entailment
2.2.1 Deep Learning Approaches. Deep learning approaches have been used by several authors for entailment detection, starting with an application of a single-layered long short-term memory network (LSTM) for input encoding by Bowman et al. [3]. The encoded features from both texts are concatenated and passed through three 200-dimensional tanh layers to a softmax classifier for predicting the entailment relationship. The task is performed on the SNLI² dataset, which is based on image captioning. Rocktäschel et al. apply neural attention [2] for entailment recognition on the same SNLI corpus [26]. Two LSTM networks are employed for encoding the query and the document, whereby the output vectors from the document are used by an attention mechanism for each word in the respective query. Their method achieves 83.5% accuracy, which is an improvement of 3.3 percentage points over the results by Bowman et al. Liu et al. use a bidirectional LSTM with an attention mechanism [16] and obtain 85% accuracy on the SNLI dataset. A stacked encoder architecture developed by Nie and Bansal achieved 86.1% accuracy on the SNLI dataset. Considering that result as the state of the art, we adapt the main idea to our task in the legal domain and further explain this architecture in Section 3.3. Do et al. use a convolutional neural network (CNN) with word embeddings [6]. They incorporate additional features from a TF-IDF and latent semantic indexing (LSI) representation of the sentences. Finally, they feed these features in conjunction with the output of the CNN model into a multi-layer perceptron (MLP) network to predict the answer.

We are inspired by the work of Chen et al., which focuses on a factoid question answering system [5]. Their goal is to predict a sequence in the document to answer the query, as opposed to our task of detecting an entailment relationship. They trained two multi-layer bidirectional LSTMs to encode the articles and the query. For encoding the article, they extract multiple features from the query and document pairs: word embeddings of the document (300-dimensional GloVe embeddings), an exact matching flag, token features (part-of-speech tags, named entity tags, normalized term frequencies) and attention scores for the similarity of a document and the aligned query. These features are concatenated to form the input vector for the LSTM that encodes the article. The question is encoded without extracting any features. Their evaluation is based on the top five pages returned by the algorithm, and results in 77.8% of correct answers on the SQuAD [23] dataset.

Nanda et al. apply a hybrid network of LSTM networks coupled with a CNN, with the final prediction based on a softmax classifier [20]. They use pre-trained general-purpose word embeddings from the Google news corpus, consisting of 3 billion words. Their accuracy for the COLIEE ’17 competition was 53.8%, which they attribute to the general-purpose embeddings, which may not capture important semantic relationships needed for the legal domain.

From these works, we conclude that LSTM architectures are suitable for entailment detection for open-domain tasks. However, the COLIEE dataset poses a challenge for deep learning models due to the specific meaning of terms in the legal domain and the rather small size of the dataset. Therefore, we refrain from training word embeddings on the statute law competition corpus only, but consider using other general-purpose word embeddings and a slightly different architecture compared to the previous work. We also find in the related work that extracting additional features from the documents can improve the classifier performance.

2.2.2 Thresholding. Thresholding for the entailment task is applied in two cases: First, the entailment detection can be done by using a similarity threshold. This works similarly to an attention layer

² nlp.stanford.edu/projects/snli/
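The inclusion criterion described in Section 2.1.3 (keep any further document whose distance to the topmost document is less than 15%) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of Nanda et al. or of this paper: the function name `select_by_relative_threshold`, the article identifiers and the scores are hypothetical, and the "distance" is interpreted here as the score gap relative to the top score, which the text does not spell out.

```python
def select_by_relative_threshold(scored_docs, max_gap=0.15):
    """Return a dynamic number of document ids.

    Keeps the top-scoring document plus every document whose score gap to
    the top score, relative to the top score, is below max_gap.
    The relative-gap reading of "distance" is an assumption.
    """
    if not scored_docs:
        return []
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    top_score = ranked[0][1]
    if top_score <= 0:
        # No meaningful relative gap can be computed; keep only the top hit.
        return [ranked[0][0]]
    return [
        doc_id
        for doc_id, score in ranked
        if (top_score - score) / top_score < max_gap
    ]


# Hypothetical similarity scores for four candidate articles.
ranked_articles = [
    ("Art. 5", 0.92),
    ("Art. 96", 0.81),
    ("Art. 120", 0.60),
    ("Art. 4", 0.79),
]
selected = select_by_relative_threshold(ranked_articles)
```

Unlike a fixed top-k cutoff, the size of the result set here depends on how close the runner-up scores are to the best match, so a query with one clearly dominant article returns a single document.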
