
Peter Grzybek

Peter Grzybek. Estonian Proverbs: Searching for regularities. www.peter-grzybek.eu. How long is a proverb? How long are words in proverbs? Does word length depend on proverb length? Is word length independent of within-text position?


Presentation Transcript


  1. Peter Grzybek. Estonian Proverbs: Searching for regularities. www.peter-grzybek.eu. Kriq 75, August 18/19, 2014

  2. How long is a proverb? • How long are words in proverbs? • Does word length depend on proverb length? • Is word length independent of within-text position?

  3. How to measure the length of linguistic units and entities? • Memo: "There are no positive facts in language." (Saussure) • There is always more than one definition. • Define the entity you want to measure. • If you want to measure sentence length, define 'sentence'. • If you want to measure word length, define 'word'. • Determine the units in which you want to measure. • E.g., sentence length: number of clauses, phrases, words, syllables, morphemes, …? • E.g., word length: number of syllables, morphemes, letters, graphemes, phonemes, …? • Define the measuring units. • Define 'clause', 'phrase', 'syllable', 'morpheme', 'phoneme', 'grapheme', 'letter', … Rule in Quantitative Linguistics: Take direct constituents as measuring units.

  4. How long are proverbs? Sentence length: one proverb, one sentence

  5. Orthographic problems: Mother-in-law. Isn't that a problem? В этом доме. в кратцу - вкратце. Phonological word (tact group): Námostu. In agglutinative languages … … stems do not change, … affixes do not fuse with other affixes, … affixes do not change form conditioned by other affixes.

  6. How long are words?

  7. Estonian phonemes: Three degrees of phonemic length (consonants and vowels) [o] (short o) koli = "rubbish" [oˈ] (long o) kooli = "school" [oː] (extra long o) kooli = "to school"

  8. Decisions / Definitions (in accordance with Kriq 1967)

  9. Üks riisub rihaga, teine pühib luuaga. (EV 15016) [One rakes with the rake, the other sweeps with the broom.] Wo:6 – St:6 – Sy:13 Üks rii-sub ri-ha-ga, tei-ne pü-hib luu-a-ga. Isi puu, isi puuke. (EV 2245) [One is the tree, the other is the little tree.] Wo:4 – St:4 – Sy:7 I-si puu, i-si puu-ke.
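The word and syllable counts on this slide can be reproduced mechanically from the hyphen-syllabified forms. The following is a minimal sketch, assuming whitespace separates words and each hyphen marks a syllable boundary; the function name `measure` is hypothetical, and stem counts are not derived because they require a morphological decision the transcript does not spell out.

```python
def measure(syllabified: str) -> tuple[int, int]:
    """Return (word count, syllable count) for a hyphen-syllabified proverb."""
    # Words are whitespace-separated; strip punctuation attached to them.
    words = [w.strip(".,;:!?") for w in syllabified.split()]
    words = [w for w in words if w]
    # A word with n hyphens has n + 1 syllables.
    syllables = sum(w.count("-") + 1 for w in words)
    return len(words), syllables


# Syllabified forms as given on the slide (EV 15016 and EV 2245).
ev_15016 = "Üks rii-sub ri-ha-ga, tei-ne pü-hib luu-a-ga."
ev_2245 = "I-si puu, i-si puu-ke."

print(measure(ev_15016))  # (6, 13)
print(measure(ev_2245))   # (4, 7)
```

Both results agree with the Wo and Sy values stated on the slide.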

  10. Erna Normann (1955) Valimik eesti vanasõnu. 3576 proverbs. Ca. end of 19th, early 20th century

  11. Comparisons: Old (17th/18th century) and Contemporary. Bimodal distributions: additional peaks (6 / 8). Question: Does the word-stem distinction explain the bi-modality?

  12. Eesti vanasõnad (12921 proverbs) Words per proverb. Stems per proverb. Words and stems: linear relation! ⇒ Concentration on words

  13. Some interim conclusions. Bi-modality seems to originate in the characteristics of the proverb material; this phenomenon needs more detailed study. It seems reasonable to assume that the overall picture results from differences between syntactically different proverbs: e.g., "simple" (uni-partite proverbs without hypotaxis) vs. "complex" (n-partite proverbs with hypotaxis). As long as the relevant data are not available, data pooling seems to be an appropriate procedure to make the forest visible before the trees. Pooling data: intervals 2-3, 4-5, 6-7, …
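The pooling step named on this slide (intervals 2-3, 4-5, 6-7, …) can be sketched as follows. This is an illustration only: the function name `pool` and the toy frequencies are hypothetical and are not the proverb data from the talk.

```python
from collections import Counter


def pool(freqs: dict[int, int], start: int = 2, width: int = 2) -> dict[tuple[int, int], int]:
    """Pool a length-frequency table into intervals (start, start+width-1), ..."""
    pooled: Counter = Counter()
    for length, count in freqs.items():
        if length < start:
            raise ValueError(f"length {length} lies below the first interval")
        # Lower bound of the interval this length falls into.
        lower = start + ((length - start) // width) * width
        pooled[(lower, lower + width - 1)] += count
    return dict(sorted(pooled.items()))


# Hypothetical toy frequencies: proverb length -> number of proverbs.
toy = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6}
print(pool(toy))  # {(2, 3): 3, (4, 5): 7, (6, 7): 11}
```

Pooling neighbouring classes smooths out the alternating peaks, at the cost of halving the resolution of the length scale.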

  14. Is there a way to find a theoretical model for sentence length frequencies? Assumptions: The distribution of length is organized in a law-like manner. It is sufficient to make assumptions about the difference D of two neighboring frequencies (probabilities). Which factors influence D? a: language-specific factors; b: production-specific factors; c: norming forces; d: level-specific factors (words vs. phrases). Hyperpascal distribution (Beta-binomial d.)
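The idea of modelling the relation between neighbouring probabilities can be illustrated with the hyperpascal distribution in its usual recurrence form, P(x) = P(x-1) · q · (k + x - 1) / (m + x - 1). The sketch below assumes a support starting at x = 0 and normalizes numerically over a truncated range; the displacement actually used for proverb length in the talk is not stated in the transcript, and the function name is hypothetical.

```python
def hyperpascal_pmf(k: float, m: float, q: float, n_terms: int = 200) -> list[float]:
    """Hyperpascal probabilities P(0), P(1), ... built from the recurrence.

    Assumes k > 0, m > 0 and 0 < q < 1, so that the terms decay geometrically
    and truncation after n_terms is harmless.
    """
    weights = [1.0]  # unnormalized P(0)
    for x in range(1, n_terms):
        # Ratio of neighbouring probabilities: q * (k + x - 1) / (m + x - 1).
        weights.append(weights[-1] * q * (k + x - 1) / (m + x - 1))
    total = sum(weights)
    return [w / total for w in weights]


# Parameter values k, m, q as reported in the talk for Eesti vanasõnad.
pmf = hyperpascal_pmf(k=1.21, m=0.07, q=0.39)
print([round(p, 4) for p in pmf[:6]])
```

Normalizing numerically avoids evaluating the hypergeometric function that gives P(0) in closed form.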

  15. Eesti vanasõnad: Testing the hyperpascal distribution. k = 1.21, m = 0.07, q = 0.39, C = X²/N = 0.0193. The length of Estonian proverbs is regularly organized. The well-known hyperpascal distribution is a good model.
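The discrepancy coefficient C = X²/N used here divides the chi-square statistic by the sample size, which keeps very large samples from rejecting every model. A minimal sketch follows; the observed and expected values are hypothetical toy numbers, not the proverb data.

```python
def discrepancy_coefficient(observed: list[float], expected: list[float]) -> float:
    """Return C = X^2 / N for observed and expected absolute frequencies."""
    n = sum(observed)
    # Pearson chi-square statistic summed over all classes.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2 / n


# Hypothetical toy frequencies for three classes.
observed = [10, 20, 30]
expected = [12.0, 18.0, 30.0]
c = discrepancy_coefficient(observed, expected)
print(round(c, 4))
```

A common rule of thumb in quantitative linguistics treats C below roughly 0.02 as a good fit, which is consistent with how the values 0.0193 and 0.08 are judged in this talk.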

  16. Is there a regularity of word length in Estonian proverbs? Normann (21038 words)

  17. In search of a word length model. Poisson distribution. 1-displaced Poisson distribution ("Fucks distribution"). C = X²/N = 0.08 ⇒ No good model!
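The 1-displaced Poisson model shifts the ordinary Poisson distribution so that the shortest possible word has length 1: P(x) = e^(-a) · a^(x-1) / (x-1)! for x = 1, 2, …. The sketch below uses a simple moment estimate (a = mean length minus 1); the frequency table is a hypothetical toy example, not the Normann data, and the estimation method used in the talk is not stated.

```python
import math


def displaced_poisson_pmf(x: int, a: float) -> float:
    """P(X = x) for the 1-displaced Poisson distribution, x = 1, 2, ..."""
    if x < 1:
        return 0.0
    return math.exp(-a) * a ** (x - 1) / math.factorial(x - 1)


def estimate_a(freqs: dict[int, int]) -> float:
    """Moment estimate: mean length minus the displacement of 1."""
    n = sum(freqs.values())
    mean = sum(length * count for length, count in freqs.items()) / n
    return mean - 1.0


# Hypothetical toy table: word length in syllables -> number of words.
toy = {1: 30, 2: 45, 3: 20, 4: 5}
a_hat = estimate_a(toy)
expected = {x: sum(toy.values()) * displaced_poisson_pmf(x, a_hat) for x in toy}
print(round(a_hat, 3), {x: round(e, 1) for x, e in expected.items()})
```

The expected frequencies can then be compared with the observed ones via C = X²/N.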

  18. An alternative model for word length in Estonian (proverbs). Geometric distribution. 1-displaced geometric distribution. 1-displaced Shenton-Skees geometric distribution. Word stems / Orthographic words: p = 0.85, a = 3.49, C = 0.0062 / p = 0.88, a = 4.71, C = 0.0023
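For orientation, the sketch below evaluates the 1-displaced Shenton-Skees geometric distribution in the parametrization commonly found in fitting software, P(x) = p · q^(x-1) · [1 + a · (x - 1/p)] with q = 1 - p and x = 1, 2, …. That this is exactly the form used in the talk is an assumption; the function name is hypothetical. With a = 0 the model reduces to the ordinary 1-displaced geometric distribution.

```python
def shenton_skees_pmf(x: int, p: float, a: float) -> float:
    """P(X = x) for the 1-displaced Shenton-Skees geometric distribution.

    Assumed parametrization: p * q**(x-1) * (1 + a * (x - 1/p)), q = 1 - p.
    The probabilities are non-negative only if p + a * (p - 1) >= 0.
    """
    if x < 1:
        return 0.0
    q = 1.0 - p
    return p * q ** (x - 1) * (1.0 + a * (x - 1.0 / p))


# First parameter pair listed on the slide (p = 0.85, a = 3.49).
probs = [shenton_skees_pmf(x, p=0.85, a=3.49) for x in range(1, 7)]
print([round(pr, 4) for pr in probs])
```

The correction term a · (x - 1/p) has expectation zero under the geometric distribution, so the probabilities still sum to 1 while the mode can move away from x = 1.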

  19. Word length in Eesti vanasõnad (88296 words): p = 0.84, a = 3.30, C = 0.0074

  20. Proverb Length ↔ Word Length (Normann)

  21. Menzerath-Altmann law (Altmann 1980): "The longer (more complex) a linguistic construct, the shorter (less complex) its constituents." Example: The longer a sentence, the shorter the clauses constituting the sentence. NB: Direct relations (in the classical structuralist paradigm) only, i.e., the relation of a construct to its immediate constituents; the relation between entities from indirectly related levels (e.g., between sentences and words, leapfrogging the intermediate level of sub-sentential constructs like clauses or phrases) is expected to show different (more complex) tendencies. Basic form: y: construct = dependent variable, x: constituent = independent variable, K: integration constant, a: parameter determining the steepness of the decrease (for a < 0). Full form. Extended form (Wimmer-Altmann law).

  22. Proverb Length ↔ Word Length. Normann: K = 1.68, c = -0.84, R² = 0.90. Eesti vanasõnad: K = 1.71, a = 0.18, c = -1.05, R² = 0.98

  23. Word Length ↔ Syllable Length. Eesti vanasõnad: K = 2.02, c = 0.42, R² = 0.96

  24. Positional aspects of word length. Fourier series: R² = 0.99

  25. In the two approaches discussed above, analyses concerned: • the dependence of word length on sentence length ⇒ no attention to within-sentence position, • the dependence of word length on within-proverb position, ignoring the specific proverb length. Unipartite proverbs with length T3–T5: decrease, then increase; minimum at 2nd position; maximum at last position. Bipartite proverbs with length T6–T10: Cycle I: ≈ unipartite proverbs (T6). Cycle II: T7, T9, and T10. T6, T8 ≈ unipartite proverbs = monotonic increase

  26. What causes proverbs to be long(er) or short(er)? From internal synergetic to external factors

  27. ... Thank you for your patience and attention ...

  28. FamiliarityFrequency • German data • American data SentenceLengthandFamiliarity (German data: N= 11.355; excluding zero-familiarity, f >100) SeL= 8.40 Frq-0.09 R² = 0.89 Kriq 75, August 18/19, 2014

  29. Desiderata for Estonian Paremiology • Variants vs. Types • Frequency • Familiarity • Linguistic forms of variants • Frequency • of variants • of types • Familiarity • of variants • of types. "It seems preposterous even to ask where the 'variants of one proverb' end and the 'variants of another proverb' begin, or how many 'different proverbs' could be found within such a thicket."

  30. Frequency distribution of 'variants' (unreliable data for f > 10). Zipf distribution: a = 2.08, C = X²/N = 0.06. Right-truncated Zipf distribution: a = 1.91, R = 9, C = X²/N = 0.0032
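The right-truncated Zipf (zeta) distribution restricts the support to x = 1, …, R and renormalizes: P(x) = x^(-a) / Σ j^(-a), with the sum running over j = 1, …, R. A minimal sketch, using the parameter values reported on the slide; the function name is hypothetical.

```python
def right_truncated_zipf_pmf(a: float, r: int) -> list[float]:
    """Probabilities P(1), ..., P(r) of the right-truncated Zipf distribution."""
    weights = [x ** -a for x in range(1, r + 1)]
    total = sum(weights)  # finite normalizing sum replaces the zeta function
    return [w / total for w in weights]


# Parameter values as reported in the talk: a = 1.91, R = 9.
pmf = right_truncated_zipf_pmf(a=1.91, r=9)
print([round(p, 4) for p in pmf])
```

Truncating at R confines the model to the range where the data are considered reliable, which fits the remark that frequencies above 10 are unreliable.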

  31. K = 6.52, c = 0.07, R² = 0.96


  33. July 21, 1939: Arvo Arnol'dovič Krikmann. Belgian National Day. Village Pudivere (German: Poidifer). Estonian writer Eduard Vilde (1865-1933). Simuna Parish. Important point in F.G.W. Struve's Geodetic Arc, a chain of triangulations (1827). July 21, 1940: President Konstantin Päts affirmed the government of Johannes Vares (appointed by Andrej Ždanov), accompanied by the arrival of Soviet demonstrators and Red Army troops, replacement of the flag of Estonia by the red flag on Pikk Hermann, and the meeting of the newly elected parliament (Riigikogu) on July 21. July 21, 1944: Graf Claus von Stauffenberg and his fellow conspirators were executed in Berlin for the plot to assassinate Adolf Hitler. July 21, 1949: The United States Senate ratifies the North Atlantic Treaty.
