EUCLCORP: Multilingual Legal Corpus for EU Case Law Analysis

The European Union case law corpus (EUCLCORP)
Aleksandar Trklja, University of Birmingham

What is EUCLCORP?
• The European Union case law corpus (EUCLCORP) is a standardised, multidimensional and multilingual corpus of the case law of the Court of Justice of the European Union (CJEU) and of eight EU member states’ constitutional/supreme courts.

Project development
• The project has been developed in the following phases:
  • Phase one: project application
  • Phase two: data compilation
  • Phase three: data annotation
  • Phase four: web interface
• Supported by a European Research Council (ERC) Proof of Concept grant
• Based at the University of Birmingham (July 2016 - December 2017).

Not just another legal database
• Unlike conventional legal databases, EUCLCORP contains the following corpus tools:
  • monolingual concordance lines
  • parallel concordance lines
  • collocations
  • frequency lists
  • n-grams
  • simple search
  • CQP-based search
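To illustrate what two of these tools compute, here is a minimal Python sketch of a frequency list and of monolingual concordance (keyword-in-context) lines. It is not EUCLCORP's implementation: the function names `frequency_list` and `concordance` and the sample sentence are assumptions made for illustration only.

```python
from collections import Counter

def frequency_list(tokens):
    """Return (token, count) pairs, most frequent first (case-insensitive)."""
    return Counter(t.lower() for t in tokens).most_common()

def concordance(tokens, node, width=3):
    """Return keyword-in-context lines as (left context, node, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Invented sample sentence, not taken from the corpus.
sample = "the court held that the measure was capable of restricting trade".split()

top_item = frequency_list(sample)[0]
kwic = concordance(sample, "capable", width=2)
```

A real corpus tool would run these over indexed, annotated text rather than a Python list, but the output has the same shape: ranked counts and context windows around a node word.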

Annotation
• The corpus has been annotated with linguistic information and external metadata.
• Linguistic information: tokenization, lemmatization, part-of-speech tags, sentence and paragraph boundaries, and numbering of sentences and paragraphs.

Annotation
• Non-linguistic metadata for the CJEU subcorpus: text sections (Summary, Parties, Grounds, Costs, Operative Part and Subject), language of the case, case name, case number, date and Cellar number.
• Non-linguistic metadata for national judgments: language of the case, name of the court, date, case name and names of judges.
• Sentences from ECJ judgments are aligned at the sentence level to enable searches over parallel concordance lines.
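Sentence-level alignment is what makes parallel concordance lines possible: a hit in one language version can be shown next to its aligned counterpart. The sketch below assumes the alignment is available as a list of sentence pairs. The function name `parallel_concordance` and the English/French pairs are invented for illustration and do not come from the corpus.

```python
import re

def parallel_concordance(aligned, term):
    """Return the aligned (source, target) pairs whose source contains `term`.

    `aligned` is a list of (source_sentence, target_sentence) tuples.
    Matching is on whole word forms, case-insensitive.
    """
    term = term.lower()
    hits = []
    for source, target in aligned:
        words = re.findall(r"\w+", source.lower())
        if term in words:
            hits.append((source, target))
    return hits

# Invented sentence pairs standing in for aligned judgment text.
aligned = [
    ("The Court dismisses the action.", "La Cour rejette le recours."),
    ("The applicant shall bear the costs.", "La partie requérante supporte les dépens."),
]

pairs = parallel_concordance(aligned, "costs")
```

Searching the source side and returning the whole pair keeps the example simple. Locating the equivalent term inside the target sentence would need word-level alignment, which the slides do not describe.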

ECJ judgments

National judgments

Web interface and corpus tools
User-friendly interface for the search query:
[lemma="increase" & tag="V.*"] []{0,2} [tag="N.*"] :: match.meta_date="1980.*" within grounds
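The query reads as: a verb form of the lemma "increase", followed by at most two arbitrary tokens, followed by a noun, restricted to texts whose date starts with 1980 and to matches inside a Grounds section. The Python sketch below emulates that logic on a tiny invented data set. It is only an illustration of the query's meaning, not how CQP evaluates it: the `regions` structure, the `search` function and the Penn-style tags are assumptions, and stopping at the first noun is an assumed matching strategy.

```python
import re

# Invented mini-corpus: each region has metadata plus (word, lemma, tag) tokens.
regions = [
    {"date": "1980-03-12", "section": "grounds",
     "tokens": [("increased", "increase", "VBD"), ("the", "the", "DT"),
                ("import", "import", "NN"), ("duties", "duty", "NNS")]},
    {"date": "1995-06-01", "section": "grounds",
     "tokens": [("increases", "increase", "VBZ"), ("costs", "cost", "NNS")]},
]

def search(regions, max_gap=2):
    """Find verb 'increase' + up to `max_gap` tokens + noun, in 1980 grounds."""
    hits = []
    for region in regions:
        # Metadata constraint: date must match the pattern 1980.*
        if not re.fullmatch(r"1980.*", region["date"]):
            continue
        # Structural constraint: only look inside Grounds sections.
        if region["section"] != "grounds":
            continue
        toks = region["tokens"]
        for i, (_, lemma, tag) in enumerate(toks):
            if lemma != "increase" or not re.fullmatch(r"V.*", tag):
                continue
            # Allow 0..max_gap intervening tokens, stop at the first noun.
            for gap in range(max_gap + 1):
                j = i + 1 + gap
                if j < len(toks) and re.fullmatch(r"N.*", toks[j][2]):
                    hits.append(" ".join(w for w, _, _ in toks[i:j + 1]))
                    break
    return hits

hits = search(regions)
```

The second region is skipped because its date does not start with 1980, even though its tokens would otherwise match.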

Web interface and corpus tools
N-grams associated with the token ‘capable’
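As a rough idea of what such a listing contains, the sketch below counts every n-gram that includes a given token. The function name `ngrams_with` and the sample text are invented for illustration, and the tool itself may define or rank n-grams differently.

```python
from collections import Counter

def ngrams_with(tokens, node, n=3):
    """Count all n-grams of length `n` that contain `node`, most frequent first."""
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if node in gram:
            counts[gram] += 1
    return counts.most_common()

# Invented sample text, not taken from the corpus.
sample = "is capable of affecting trade and is capable of restricting imports".split()

ranked = ngrams_with(sample, "capable", n=3)
```

With this sample, the trigram "is capable of" occurs twice and is ranked first. The remaining trigrams occur once each.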

Contribution
• EUCLCORP has been created with the aim of fostering the development of empirical legal linguistics.

Contribution
• EUCLCORP allows users to investigate systematically:
  • the history of the meaning(s) of a particular legal term;
  • features that distinguish legal language from language used in other registers;
  • for ambiguous terms, the senses in which they are most frequently and most typically used;
  • the influence of national legal languages on EU case law (and vice versa);
  • the impact of translation on the development of EU case law;
  • discourse relations and argumentation patterns.

Thank you for your attention!
