This report presents statistical methods used to analyze syntactic variables in L1 writing from an ongoing study by PhD student Bård Uri Jensen. The focus is on differences in grammatical choices between handwriting and keyboarding, with hypotheses on production speed, functionality utilization, and psychological factors influencing choices.
Some statistical methods on syntactic variables in L1 writing. Report from an ongoing study. Bård Uri Jensen, PhD student, UiB / Hedmark University College (Hamar). Solstrand, 2010-03-26
Contents • Introducing the project • The ELEV corpus vs the ASK corpus • Extracting data • Analysing data
My doctoral project • Research question • Do people tend to make different grammatical choices when they type on a keyboard rather than write by hand? • Hypotheses • Higher production speed affects the choices in a ”spontaneous” direction • Skilled writers may utilise the enhanced functionality and shift features in the opposite direction • Other psychological factors may affect the choices • motivational factors • social media norms
The ELEV corpus • A ”parallel” corpus of hand-written and keyboarded texts • Two texts by each pupil • The ASK corpus system • Manual syntactic segmentation • t-units • clauses • fragments • No error tags
<t-unit> All humans are different, </t-unit> <t-unit> Women use computers </t-unit> <t-unit> and boys read books </t-unit> <t-unit> I like cross-country skiing. Because it gives me better stamina. </t-unit> <t-unit> Alle mennesker er forskjellige, </t-unit> <t-unit> Kvinnfolk driver på data </t-unit> <t-unit> og gutter leser bøker </t-unit> <t-unit> Jeg liker å få på ski. Fordi det gir meg bedre kondisjon. </t-unit>
<t-unit type="imp"> get (yourself) drunk. </t-unit> <t-unit type="spm"> Is this a healthy development? </t-unit> <t-unit type="imp"> drikk deg full. </t-unit> <t-unit type="spm"> Er dette en sunn utvikling? </t-unit>
<t-unit> The police know <clause type="nominal"> there are people under 18 <clause type="relativ"> who drink there, </clause> </clause> </t-unit> <t-unit> Politiet vet <clause type="nominal"> det er folk under 18 <clause type="relativ"> som drikker der, </clause> </clause> </t-unit>
<frag> But what about other books? </frag> <t-unit type="frag"> but [I] know about several girls <clause type="relativ"> who don’t do it also! </clause> </t-unit> <frag> Men hva med andre bøker? </frag> <t-unit type="frag"> men veit da om flere jenter <clause type="relativ"> som ikke gjør det også! </clause> </t-unit>
<t-unit type="spm"> Is this a <corr sic="helthy"> healthy </corr> development? </t-unit> <t-unit type="spm"> Er dette en <corr sic="sund"> sunn </corr> utvikling? </t-unit>
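The segmentation shown on these slides can be turned into per-t-unit counts with a few lines of Python. This is a minimal sketch and not the ASK system's own tooling: it assumes the annotated text is well-formed XML once wrapped in a root element (the `<text>` wrapper is hypothetical), and the names `summarise`, `clause_depth` and the label `"unmarked"` for t-units without a `type` attribute are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample built from the slide examples; the <text> root is an assumption.
SAMPLE = """
<text>
<t-unit> Politiet vet <clause type="nominal"> det er folk under 18
<clause type="relativ"> som drikker der, </clause> </clause> </t-unit>
<t-unit type="spm"> Er dette en <corr sic="sund"> sunn </corr> utvikling? </t-unit>
<frag> Men hva med andre bøker? </frag>
</text>
"""


def clause_depth(elem, depth=0):
    """Return the deepest level of <clause> nesting found below elem."""
    deepest = depth
    for child in elem:
        next_depth = depth + 1 if child.tag == "clause" else depth
        deepest = max(deepest, clause_depth(child, next_depth))
    return deepest


def summarise(xml_text):
    """Return one dict per <t-unit>: type, token count, clause count, depth."""
    root = ET.fromstring(xml_text)
    rows = []
    for unit in root.iter("t-unit"):
        # Whitespace tokens, so punctuation stays attached to its word.
        words = "".join(unit.itertext()).split()
        rows.append({
            "type": unit.get("type", "unmarked"),
            "tokens": len(words),
            "clauses": len(list(unit.iter("clause"))),
            "max_depth": clause_depth(unit),
        })
    return rows


for row in summarise(SAMPLE):
    print(row)
```

Standalone `<frag>` elements are skipped here because only `<t-unit>` elements are iterated; whether fragments should count towards the totals is a design decision the slides leave open.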
Corpus searches [features='.* subst .*']; <t-unit>[]*</t-unit>; <t-unit_type="imp">[]*</t-unit>; <t-unit>[]{5,10}</t-unit>; <t-unit>([lemma='\$.']*[!lemma='\$.']){5,10}[lemma='\$.']*</t-unit>;
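The last two searches differ only in what they count: the first takes t-units of 5 to 10 tokens of any kind, the second, as read here, t-units of 5 to 10 tokens that are not punctuation. A small sketch of that second filter outside the corpus system, assuming that punctuation tokens carry lemmas matching `\$.` as in the query. The function names and the lemma lists are made up for illustration.

```python
import re

# Same pattern as in the query: "$" followed by exactly one character.
PUNCT = re.compile(r"\$.")


def word_count(lemmas):
    """Count the tokens whose lemma is not a punctuation lemma."""
    return sum(1 for lemma in lemmas if not PUNCT.fullmatch(lemma))


def in_length_range(lemmas, lo=5, hi=10):
    """True if the t-unit has between lo and hi non-punctuation tokens."""
    return lo <= word_count(lemmas) <= hi


# Hypothetical lemma sequences for two t-units.
short_unit = ["alle", "menneske", "være", "forskjellig", "$,"]
long_unit = ["være", "dette", "en", "sunn", "utvikling", "$?"]

print(word_count(short_unit), in_length_range(short_unit))
print(word_count(long_unit), in_length_range(long_unit))
```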
Corpus searches : frontal subclauses <t-unit> [features='.* konj .*']?(<clause_type="nominal"> | <clause_type="relativ"> | <clause_type="adverbial">) [];
Corpus searches : embedding <t-unit>[!clause]+<clause>[]*</clause>[!clause]+</t-unit>; <t-unit>[!clause]+<clause_type!="relativ">[]*</clause>[!clause]+</t-unit>;
Corpus searches : lexical distribution [lemma!='\$.']; [features=".* verb .*"];
Statistics : Three examples • Some simple analyses • differences of mean • correlations • Classification analysis • Clustering
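The two simple analyses can be sketched with the Python standard library. Since each pupil wrote two texts, a paired comparison is assumed here for the difference of means; the slides do not say which test is actually used. The function names and the numbers are invented for illustration and are not results from the study.

```python
import math
import statistics


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up subclause frequencies per pupil: keyboarded text vs hand-written text.
keyboard = [2.0, 4.0, 6.0]
hand = [1.0, 2.0, 3.0]

t, df = paired_t(keyboard, hand)
print(f"t = {t:.3f}, df = {df}")
print(f"r = {pearson_r(keyboard, hand):.3f}")
```

The t statistic still has to be compared against a t distribution with `df` degrees of freedom to obtain a p-value; that step is left out to keep the sketch dependency-free.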
Classification analysis • Independent variables (parameters) • writing mode • hand ~ keyboard • writing skills • medium ~ high • gender • essay question • Dependent variable • frequency of attributive adjectives • subclause frequency
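The slide does not name the classification method. One possible reading is a tree-based analysis, whose first step asks which independent variable best separates the values of the dependent variable. The sketch below shows only that first step, under that assumption. The column names, the function names and the data are all invented for illustration.

```python
from collections import defaultdict
import statistics


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = statistics.mean(values)
    return sum((v - m) ** 2 for v in values)


def variance_reduction(rows, factor, outcome):
    """Total sum of squares minus the within-group sum of squares
    after grouping the rows by one categorical factor."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row[outcome])
    total = sum_sq([row[outcome] for row in rows])
    within = sum(sum_sq(g) for g in groups.values())
    return total - within


def best_split(rows, factors, outcome):
    """Return the factor that removes the most variance in the outcome."""
    return max(factors, key=lambda f: variance_reduction(rows, f, outcome))


# Made-up texts: writing mode, gender, frequency of attributive adjectives.
texts = [
    {"mode": "hand", "gender": "f", "adj_freq": 1.0},
    {"mode": "hand", "gender": "m", "adj_freq": 1.0},
    {"mode": "keyboard", "gender": "f", "adj_freq": 3.0},
    {"mode": "keyboard", "gender": "m", "adj_freq": 3.0},
]

for factor in ("mode", "gender"):
    print(factor, variance_reduction(texts, factor, "adj_freq"))
print("best first split:", best_split(texts, ["mode", "gender"], "adj_freq"))
```

A full tree would repeat this step within each resulting group and apply a stopping rule; dedicated statistical software handles both, along with significance testing of the splits.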