
2 Modern Approaches to Corpus Linguistics

Dominique Longrée, LASLA – Université de Liège and FUSL (Bruxelles)



  1. 2 Modern Approaches to Corpus Linguistics
     Dominique Longrée, LASLA – Université de Liège and FUSL (Bruxelles)
     • automatic taggers as heuristic tools
     • multilevel approaches: the motives
     • what do they have in common?

  2. 1. Automatic taggers as heuristic tools
     • A LASLA research project: testing various automatic recognition software, known as taggers
     • Biber, 1993, Illouz, 1999, etc.: the quality of the output can vary significantly
       - from one type of text to another
       - from one tagger to another
     • Questions:
       - are the results better when a tagger trained on one author or on a given text is applied to another text by the same author, or within the same type of discourse?
       - what can we deduce from those results regarding the tagger or the homogeneity of corpora?

  3. 1. Automatic taggers as heuristic tools
     • The test texts:
       - book 3 of The Gallic Wars by Caesar, BGall3 (3673 tokens)
       - The Conspiracy of Catilina by Sallust, SalCat (10688 tokens)
       - book 3 of The History of Alexander the Great by Quintus Curtius, QC3 (7261 tokens)
       - The First Oration Against Catilina by Cicero, CicCat1 (3333 tokens)
       - poem 66 of Catullus, Catu66 (586 tokens)
     • Varying the nature of the training and evaluation corpora, in order to identify and measure the factors of variation:
       - style of the work
       - style of the author
       - diachrony
       - literary genre
       - type of discourse
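
The protocol of varying the training and evaluation corpora can be pictured as a matrix of accuracies, one cell per (training text, test text) pair. The sketch below is only an illustration of that idea: it uses a most-frequent-tag baseline as a stand-in, not the taggers actually tested in the LASLA project, and the function names (`train_unigram_tagger`, `accuracy`, `cross_evaluate`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_tokens):
    """Map each word form to its most frequent tag in the training corpus.

    This baseline is an assumption made for illustration; it is not one of
    the taggers evaluated in the presentation.
    """
    counts = defaultdict(Counter)
    for form, tag in tagged_tokens:
        counts[form][tag] += 1
    return {form: tags.most_common(1)[0][0] for form, tags in counts.items()}


def accuracy(model, tagged_tokens, default_tag="UNK"):
    """Share of tokens whose predicted tag equals the reference tag."""
    if not tagged_tokens:
        return 0.0
    correct = sum(
        1 for form, tag in tagged_tokens if model.get(form, default_tag) == tag
    )
    return correct / len(tagged_tokens)


def cross_evaluate(corpora):
    """Train on every corpus and evaluate on every corpus.

    `corpora` maps a corpus name to a list of (form, tag) pairs.
    Returns a dict keyed by (training corpus, test corpus).
    """
    models = {name: train_unigram_tagger(toks) for name, toks in corpora.items()}
    return {
        (train, test): accuracy(models[train], corpora[test])
        for train in corpora
        for test in corpora
    }


if __name__ == "__main__":
    # Invented toy data, NOT taken from the LASLA files.
    toy = {
        "TextA": [("arma", "NOUN"), ("cano", "VERB")],
        "TextB": [("arma", "NOUN"), ("uirum", "NOUN")],
    }
    for pair, score in sorted(cross_evaluate(toy).items()):
        print(pair, round(score, 2))
```

Read this way, a low off-diagonal score between two texts would be a hint of stylistic or generic distance, which is the sense in which a tagger can serve as a heuristic instrument.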

  4. 1. Automatic taggers as heuristic tools
     • In theoretical terms: taggers appear to have some value as heuristic instruments
     • For instance, they highlight
       - the homogeneity of the historical style, over and above diachronic development
       - the gap between narration and discourse (speeches)
       - the gap between the styles of Caesar and Cicero
       - a smaller gap between Catullus and Cicero, or between Catullus and Quintus Curtius/Tacitus, than the gap between Catullus and Caesar
       - etc.

  5. 2. Multilevel approaches: the “motives”
     • Some indicators intuitively catalogued in Latin narrative prose
       - sequences of verb tenses
       - lexical elements: repente, subito ‘suddenly’, ‘abruptly’
       - syntactical structures / ‘linking clichés’:
         Quibus rebus cognitis ‘Those things being known’
         Quod ubi animaduertit ‘When he had noticed that’
     • Limits
       - no real analysis of them as indicators of text structure
       - no study of their interaction
       - little use made of them for characterising text genre and style

  6. 2. Multilevel approaches: the “motives”
     • The Discourse Modes and Bases Approach
       - Kroon, 2007, 2009; Adema, 2007, 2008, 2009
       - a priori definition of typical features for each discourse mode
       - in order to evaluate text homogeneity
     • LASLA and BCL approach
       - to develop endogenous exploratory methods
       - to take text linearity into account
       - to specify functional convergences between several indicators
     • Methods
       - calling upon mathematical models (neighborhoods, bursts)
       - combining a small-scale qualitative approach with a large-scope quantitative analysis
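
The slide names neighborhoods and bursts without defining them, so the following is only a rough sketch under simple assumptions of our own: a neighborhood is taken to be a fixed window of tokens around an occurrence, and a burst a run of occurrences separated by at most `max_gap` tokens. These definitions, the function names and the parameters are hypothetical and are not necessarily the models used by LASLA and BCL.

```python
def occurrence_positions(tokens, targets):
    """Positions in the linear token sequence where a target form occurs."""
    wanted = {t.lower() for t in targets}
    return [i for i, tok in enumerate(tokens) if tok.lower() in wanted]


def neighborhood(tokens, position, radius):
    """Tokens within `radius` positions of `position` (assumed window model)."""
    start = max(0, position - radius)
    return tokens[start:position + radius + 1]


def find_bursts(positions, max_gap, min_size=2):
    """Group sorted positions into bursts.

    Assumption: consecutive occurrences belong to the same burst when they
    are at most `max_gap` tokens apart; groups smaller than `min_size`
    are discarded.
    """
    bursts = []
    current = []
    for pos in positions:
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_size:
                bursts.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        bursts.append(current)
    return bursts


if __name__ == "__main__":
    # Invented token sequence, for illustration only.
    tokens = ["Repente", "hostes", "subito", "uenerunt"]
    hits = occurrence_positions(tokens, ["repente", "subito"])
    print(hits)
    print(neighborhood(tokens, hits[0], radius=1))
    print(find_bursts(hits, max_gap=5))
```

Because positions are kept in text order, the same position lists can be computed for several indicators (tenses, lexical items, linking clichés) and compared, which is one simple way to look for convergences along the linear course of the text.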

  7. 3. What do these approaches have in common?
     • they take texts and discourses into account in both their dimensions
       - the multilevel nature of texts and of languages, from phonetics to pragmatics
       - the fact that texts and discourses are organized according to linearity and can be considered as topological entities
