Romanian Online Dialect Atlas


    1. Romanian Online Dialect Atlas
       © 2003 Embleton, Uritescu, Wheeler
       An exploration into the management of high volumes of complex knowledge in the social sciences and humanities.
       Sheila M. Embleton, Dorin Uritescu, Eric S. Wheeler

    2. Romanian Online Dialect Atlas
       - Sheila M. Embleton: Department of Languages, Literatures and Linguistics, York University
       - Dorin Uritescu: co-editor of the source atlas, Noul Atlas lingvistic român. Crisana; Department of French, Glendon College, York University
       - Eric S. Wheeler: ITEC program, York University; Managing partner, Wheeler and Young Inc.

    3. Romanian Online Dialect Atlas
       Supported (2003-2006) by a grant from the Social Sciences and Humanities Research Council (Canada)

    4. Agenda
       - The problem of high-volume, complex data in social sciences and humanities
       - Predecessor projects: English, Finnish dialect data
       - Use of Multidimensional Scaling (MDS) to consolidate data
       - Interactive, media-rich presentation

    5. Problem
       In social sciences/humanities, data is often characterized by:
       - high volume
       - multiple variables or dimensions
       - no a priori model
       Dialectology provides a good exemplar.

    6. Dialectology
       - Explain the variations in linguistic usage across geography
       - Simple example: “church” vs. “kirk” (< OE cirice)
       - More realistic problem: 169 features in 313 locations (SED); 213 features in 400+ locations (Finnish)

    7. Dialect atlases
       - Record the details in maps
       - Many maps needed to make an atlas
       - Recovery of individual facts is possible, but...
       - Global understanding of the situation is lost in the volume of details

    8. English
       - Survey of English Dialects (SED): 169 features at 313 locations
       - Computer Developed Linguistic Atlas of England
       - Applied MDS to already computerized data

    9. English: results
       - 2-D map of dialect locations
       - No geographic information used
       - Close correspondence to geography (as expected)
       - Highlighted further problems of handling and understanding high volumes of data

    11. Finnish

    12. Kettunen (1940): The Dialect Atlas of Finland
        - 213 maps x 530 locations
        - Up to 16 features per map
        - Typically 1-3 features per location
        - ~120,000 data items
        - Project: data computerization (largely done)
        - Stage II: application of MDS (not yet done)

    13. Map 1 (parts)

    15. Ambiguity

    16. Resolution
        - Make editorial decision: “X, not Y”
        - Mark as “AMBIGUOUS”: “X or Y”
        - Get more input: “X (says expert)”
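
One way the three resolution options could be recorded in a computerized atlas is sketched below. This is an illustration only, not the project's actual data model: the names `Status`, `Reading`, and `usable_value`, and the fields they carry, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Status(Enum):
    """How a reading taken from a printed map was settled (hypothetical labels)."""
    EDITORIAL = "editorial decision"   # "X, not Y"
    AMBIGUOUS = "ambiguous"            # "X or Y"
    EXPERT = "expert input"            # "X (says expert)"


@dataclass(frozen=True)
class Reading:
    """One data item: the feature value(s) recorded at a location on a map."""
    map_no: int
    location: int
    candidates: Tuple[str, ...]   # one value if settled, several if ambiguous
    status: Status
    note: str = ""


def usable_value(reading: Reading) -> Optional[str]:
    """Return the single value to analyze, or None if the item is still ambiguous."""
    if reading.status is Status.AMBIGUOUS:
        return None
    return reading.candidates[0]


# Toy items with invented values, one per resolution option.
decided = Reading(1, 42, ("X",), Status.EDITORIAL, "X, not Y")
unclear = Reading(1, 43, ("X", "Y"), Status.AMBIGUOUS)
advised = Reading(1, 44, ("X",), Status.EXPERT, "says expert")
```

Keeping the status next to the value means a later analysis can choose to exclude ambiguous items or to treat them separately, rather than losing the fact that a decision was made.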

    17. Lesson
        In transforming data from one medium to another, even well-structured data will have unexpected pitfalls:
        - Design the data transformation carefully
        - Prototype your system; find the problems early
        - Plan to work iteratively

    18. Romanian Online Dialect Atlas: Crisana
        Apply innovative contemporary methods in dialect geography to an online set of Romanian dialect data.

    19. Romanian language
        - Key to understanding the evolution of all Romance languages
        - Early branch, distinct from French-Spanish-Italian line
        - Exemplar of non-hierarchical dialect variation and linguistic continua
        - Transition areas contain mixtures of dialect features and specific features

    20. RODA: Part 1
        - Create online version of The New Romanian Linguistic Atlas. Crisana (Stan & Uritescu 1996)
        - Available on internet and CD
        - Default interpretations
        - Interactive interface to data: custom select data for a map
        - Add audio clips to illustrate data

    21. RODA Prototype 1

    22. RODA: Part 2
        Allow plug-in applications and other analyses of data, e.g. apply Multidimensional Scaling to dialect data:
        - Statistical technique
        - Consolidate large amounts of data
        - Complement to traditional analyses of small amounts of data

    23. Multidimensional Scaling

    24. Multidimensional Scaling
        - Statistical technique (Torgerson 1952)
        - Used in sociology, psychology, marketing
        - Reveals the scales along which data varies; gives a data-space
        - Uses distances [(dis)similarities] among responses of subjects
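
As a rough illustration of how distances among responses might be obtained from dialect data, the sketch below scores two locations by the share of jointly recorded features on which they differ. The slides do not say which dissimilarity measure the project uses, so this measure is an assumption, and the function names and the toy data are invented for the example.

```python
def dissimilarity(a, b):
    """Share of jointly recorded features on which two locations differ (0.0 to 1.0)."""
    shared = [feature for feature in a if feature in b]
    if not shared:
        raise ValueError("locations have no features in common")
    differing = sum(1 for feature in shared if a[feature] != b[feature])
    return differing / len(shared)


def distance_matrix(locations):
    """Return (names, matrix) with matrix[i][j] the dissimilarity of names[i] and names[j]."""
    names = sorted(locations)
    matrix = [[dissimilarity(locations[p], locations[q]) for q in names]
              for p in names]
    return names, matrix


# Toy data: three invented locations, three invented features.
locations = {
    "loc1": {"f1": "church", "f2": "a", "f3": "x"},
    "loc2": {"f1": "kirk", "f2": "a", "f3": "x"},
    "loc3": {"f1": "kirk", "f2": "b", "f3": "y"},
}
names, matrix = distance_matrix(locations)
```

The resulting square matrix is the kind of input an MDS procedure expects. Two distinct locations with identical responses get distance 0 under this measure, which is worth keeping in mind when checking the matrix against the axioms of a metric.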

    25. MDS
        Axioms of a metric:
        - d(X,X) = 0
        - d(X,Y) = d(Y,X)
        - d(X,Y) > 0 if X ≠ Y
        - d(X,Y) ≤ d(X,C) + d(C,Y) for all points C
        Matrix reflects these rules
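
The four axioms can be checked mechanically on a distance matrix. The following is a minimal sketch: `is_metric` is a name invented here, rows and columns with different indices are treated as distinct points, and a small tolerance is assumed for floating-point values.

```python
def is_metric(d, tol=1e-9):
    """Check the four metric axioms on a square matrix d (list of lists)."""
    n = len(d)
    for i in range(n):
        # d(X,X) = 0
        if abs(d[i][i]) > tol:
            return False
        for j in range(n):
            # d(X,Y) = d(Y,X)
            if abs(d[i][j] - d[j][i]) > tol:
                return False
            # d(X,Y) > 0 if X != Y
            if i != j and d[i][j] <= 0:
                return False
            # d(X,Y) <= d(X,C) + d(C,Y) for all points C
            for c in range(n):
                if d[i][j] > d[i][c] + d[c][j] + tol:
                    return False
    return True


# Distances among the corners of a 3-4-5 right triangle: a valid metric.
triangle = [[0, 3, 4],
            [3, 0, 5],
            [4, 5, 0]]

# Here d(0,2) = 5 exceeds d(0,1) + d(1,2) = 2: the triangle inequality fails.
broken = [[0, 1, 5],
          [1, 0, 1],
          [5, 1, 0]]
```

Calling `is_metric(triangle)` returns `True` and `is_metric(broken)` returns `False`.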

    26. MDS
        - n+1 points generate an n-dimensional space
        - MDS can reduce that high-dimensional space to 2 (or 3) dimensions
        - Result: complex data can be viewed as a “map”
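
The reduction to a 2-D "map" can be sketched with classical (Torgerson) scaling: square the distances, double-center them, and use the leading eigenvectors as coordinates. The code below is an illustrative pure-Python version, not the authors' implementation. It finds eigenvectors by power iteration with deflation, which is an assumption made to keep the example dependency-free, and the function name and parameters are invented here.

```python
import math
import random


def classical_mds(dist, dims=2, iters=500, seed=0):
    """Return n points in `dims` dimensions from an n x n symmetric distance matrix."""
    n = len(dist)
    # Double-center the squared distances: b = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration converges to the dominant remaining eigenvector of b.
        v = [rng.random() for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:      # nothing left to extract
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * w[i] for i in range(n))
        scale = math.sqrt(eigval) if eigval > 0 else 0.0
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy check: distances among the corners of a 4 x 3 rectangle.
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
d = [[math.dist(p, q) for q in corners] for p in corners]
layout = classical_mds(d, dims=2)
```

For this toy input the recovered layout reproduces the original distances, up to rotation and reflection, because the points really are two-dimensional. With real dialect data the 2-D layout is an approximation, and a production analysis would more likely rely on a tested numerical library than on hand-written power iteration.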

    27. MDS
        Can use MDS to consolidate data:
        - English: 312 dimensions reduced to 2
        - All 169 features included (and taken in relevant subsets)
        - Finnish, Romanian provide large data sets that can do the same

    28. Interactive, media-rich presentation
        Objectives:
        - Make data accessible, useful to a wide research audience
        Methods:
        - Interactive selection of data
        - Constructive presentation of data
        - Addition of audio and other media
        Online is much more than a book!

    29. Framework and Applications
        - Online atlas provides a framework for accessing and presenting data
        - Other applications can work within the framework to transform or process the data, such as:
          - MDS data consolidation
          - Tools to analyze dialect variants of phonemes (proposed)
          - Others

    30. Summary
        - Humanities and Social Sciences deal with large, complex data sets
        - Explore methods to access, process, present this kind of data
        - Solutions include: MDS-type processing; online, interactive, rich presentation
        - Example: Romanian Online Dialect Atlas

    31. References
        - Embleton, Sheila M. and Eric S. Wheeler (2000). Computerized Dialect Atlas of Finnish: Dealing with Ambiguity. J. of Quantitative Linguistics 7.3, pp. 227-231.
        - Embleton, Sheila M. and Eric S. Wheeler (1997a). Multidimensional Scaling and the SED Data. In Wolfgang Viereck and Heinrich Ramisch, The Computer Developed Linguistic Atlas of England 2. Tuebingen: Max Niemeyer Verlag.
        - Embleton, Sheila M. and Eric S. Wheeler (1997b). Finnish Dialect Atlas for Quantitative Studies. J. of Quantitative Linguistics 4.1-3, pp. 99-102.
        - Schiffman, Susan S., M. Lance Reynolds and Forrest W. Young (1981). Introduction to Multidimensional Scaling: Theory, Methods, and Applications. New York: Academic Press. 411 pp.
        - Torgerson, W. S. (1952). Multidimensional scaling: 1. Theory and method. Psychometrika 17, pp. 401-419.
        - Stan, Ionel and Dorin Uritescu (1996). Noul Atlas lingvistic român. Crisana. Vol. I. Bucharest: Romanian Academy Press. (Vol. II, 2003. Bucharest: Romanian Academy Press.)
        - Uritescu, Dorin (1983). “Asupra repartitiei dialectale a graiurilor dacoromâne. Graiul din Oas” [“On the Dialect Structure of Daco-Romanian. The Dialect of Oas”]. In Materiale si cercetari dialectale II. Cluj-Napoca: The University of Cluj-Napoca, pp. 231-246.
        - Uritescu, Dorin (1984a). “Subdialectul crisean.” In V. Rusu (ed.), Tratat de dialectologie româneasca. Craiova: Scrisul românesc, pp. 284-320, 916-930.
        - Uritescu, Dorin (1984b). “Graiul din Tara Oasului.” In V. Rusu (ed.), Tratat de dialectologie româneasca. Craiova: Scrisul românesc, pp. 390-399, 964-967.
        - Wheeler, Eric S. (2002). Zipf's Law and Why It Works Everywhere. Glottometrica 4, pp. 45-48.
        - Wheeler, Eric S. (2003). Multidimensional Scaling to Visualize Text Separation. Glottometrica 6, forthcoming.
        - Wheeler, Eric S. (n.d.). Multidimensional scaling. Chapter in Reinhard Koehler (ed.), Handbook in Quantitative Linguistics, forthcoming.
