LANGUAGE DOCUMENTATION and DESCRIPTION
Bahar Kocaman, Sezer Yurt, Gülden Berber
By documenting languages we engage in amassing data for preservation. This will allow future generations to access data for languages even after they are gone.
the descriptions may be empowering for endangered
data from language consultants, preferably in
environments familiar to them, such as their homes or
spending an extended amount of time with a
community in an exotic place, documenting and
recording a little-known language of a community
with the help of local informants.
systematically analysing parts of a language,
usually within a community of speakers of that
language’ (Sakel & Everett, 2012:5).
comprehensive as possible, and so on.
raw data that may then be used for further
give linguists raw data to work with and to
preserve a cultural heritage of the community.
of language death, it will preserve records of the
program is The Rosetta Project, where information
for over 1500 languages is currently stored, much of
essentials of a language, based on available material.
It provides analyses of a variety of areas of the
language, such as its phonological, morphological,
grammatical and syntactic systems, as well as, ideally,
presenting a lexicon of the language.
comparable with other descriptions, but specific
enough to capture the uniqueness of the language
documentations may lead to a higher recognition,
even on a political level, of the language. Reference
materials, such as educational material, produced in
a higher awareness, and might even lead to the
language becoming recognized enough to be taught
are always based on a sample of
1. Probability Sample: In order to check for statistical tendencies and correlations of
to presence or absence of those variables.
reduplication by choosing a set of
variables, such as
2. Variety Sample: is ‘mainly used for
explorative research: when little is known about the form or construction
under investigation it is important that
the sample offers a maximum degree of
the linguistic parameters [i.e. variables]
involved’ (Rijkhoff & Bakker 1998:265).
3. Convenience Sample: is a sample
based on what kind of data one has
1. Bibliographical Bias: Small or
isolates or languages of unknown
published his description of the
languages were not to be found in any
surveys on types of word order.
2. Genetic (Genealogical) Bias: Some language
underrepresented in the sample.
is biased towards one family over others, a feature
might look more or less common than it actually is,
simply because of how it appears in the dominating
Indo-European languages, but it is quite
common in Niger-Congo languages. If a
sample has a higher proportion of Indo-
pattern that is likely to emerge is that tone
been in sustained contact and have influenced
found in the languages outside the area.
Balkan area, which belong to different
genera of Indo-European, have
postposed articles as opposed to the
linguistic area, and as opposed to
other languages of the same genera.
4. Typological Bias: One linguistic type is over- or
underrepresented in a sample.
5. Cultural Bias: We have an over- or underrepresentation
of the different cultures of the world in the sample.
The Yucatec speakers;
obligatory numeral classifier system.
When asked to sort pictures of objects,
objects by shape,
objects by material composition.
from the cultural outlook (i.e. how one
views and categorizes objects)
derives from the linguistic structure, is
probably impossible to establish.
genetically, they are likely to have
inherited common linguistic types from
their ancestor language, to be spoken in
the same area and by people sharing the
same culture” (Cristofaro 2005:91).
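The genealogical bias described above can be checked mechanically once each language in a sample is tagged with its family. The following is a minimal sketch, not a standard procedure: the function names, the tolerance value, the toy sample and the reference shares are all hypothetical and chosen only for illustration; the reference shares in particular are made-up numbers, not real counts of the world's languages.

```python
from collections import Counter


def family_shares(sample):
    """Return each family's share of the sample as a fraction between 0 and 1."""
    counts = Counter(family for _, family in sample)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


def overrepresented(sample, reference_shares, tolerance=0.10):
    """List families whose sample share exceeds the reference share by more than `tolerance`."""
    shares = family_shares(sample)
    return sorted(
        family
        for family, share in shares.items()
        if share - reference_shares.get(family, 0.0) > tolerance
    )


# Hypothetical toy sample of (language, family) pairs.
toy_sample = [
    ("English", "Indo-European"),
    ("German", "Indo-European"),
    ("Hindi", "Indo-European"),
    ("Greek", "Indo-European"),
    ("Yoruba", "Niger-Congo"),
    ("Swahili", "Niger-Congo"),
]

# Hypothetical reference shares (illustrative numbers only).
toy_reference = {"Indo-European": 0.3, "Niger-Congo": 0.5}

shares = family_shares(toy_sample)
flagged = overrepresented(toy_sample, toy_reference)
print(flagged)  # ['Indo-European']
```

In this toy sample four of six languages are Indo-European, so any feature that is rare in that family (tone, in the example above) would look rarer than it is. The same counting approach could be applied to areas or cultures instead of families.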
from typology, but is actually pretty
essential, since what we are dealing with
is sets of data, samples aimed at
representing the whole, and drawing
conclusions from these sampled data.
selection of languages and in the
where the data provided for each language is restricted.
each language, but the number of languages is smaller.
domain while other databases code a host of features and
1. World Atlas of Language Structures
covering a great part of abstract linguistic system including
phonology, morphology, syntax, grammar, and lexical
specifically the location of the language and its genealogical
features, their chapters may contain a large amount of
languages though these may not necessarily overlap
2. Atlas of Pidgin and Creole Language
between language is absolute.
language can also be found for every language in the
questionnaire of features for the language of their
a summary of the sociohistorical background and a broad
structural outline of the language.
languages, that is, selected languages that may or may
not be of a specific typological sort. So a complete
cross-comparison between APiCS and WALS is not
compare the rates of changes within a set
of words in different languages in order
to try to establish in how far they are
words using a fixed algorithm.
lexical items for as many languages as possible.
as genealogical affiliation, location and number of
to submit a large amount of languages.
comparison, the words have to be transcribed
in a machine-readable format.
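To make the idea of comparing transcribed words with a fixed algorithm concrete, here is a minimal sketch. The text above does not say which algorithm is meant, so this example assumes a standard edit distance (Levenshtein) divided by the length of the longer word; this is an illustrative stand-in, not necessarily the procedure the program itself uses. The function names and the simplified word forms are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-symbol insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        current = [i]
        for j, symbol_b in enumerate(b, start=1):
            cost = 0 if symbol_a == symbol_b else 1
            current.append(min(
                previous[j] + 1,         # delete symbol_a
                current[j - 1] + 1,      # insert symbol_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance scaled to the range 0..1 by the length of the longer word."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical simplified transcriptions, for illustration only.
d_same = normalized_distance("hand", "hand")
d_close = normalized_distance("hand", "hant")
d_far = normalized_distance("hand", "mano")
print(d_same, d_close, d_far)  # 0.0 0.25 0.5
```

Because the comparison works symbol by symbol, it only makes sense when every word list uses the same transcription scheme, which is why a uniform machine-readable format is a precondition.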
approximation of the original, it is not possible
to simply convert it back to a more detailed
format in order to make it more accessible to
same. However, there is a difference for
them are called convenience
signed language in documentation
and description is that signed
visual/gestural languages but spoken