
Pacific And Regional Archive for Digital Sources in Endangered Cultures

Large-scale digital archives of endangered Asia-Pacific languages. Pacific And Regional Archive for Digital Sources in Endangered Cultures. Linda Barwick, University of Sydney. Presentation to APAN E-science workshop, Honolulu, 28 Jan 2004.


Presentation Transcript


  1. Large-scale digital archives of endangered Asia-Pacific languages. Pacific And Regional Archive for Digital Sources in Endangered Cultures. Linda Barwick, University of Sydney. Presentation to APAN E-science workshop, Honolulu, 28 Jan 2004.

  2. Endangered regional languages • Approx. 2500 of the world’s 6000 languages in Australia’s region (Oceania, E and SE Asia) • Majority of these 2500 are endangered - number of languages likely to fall to a few hundred by 2100 (UNESCO) • Loss of language -> loss of cultural knowledge (e.g. ecological knowledge) and expressions (e.g. songs) -> loss of human diversity

  3. Why digital archives? • Salvaging materials recorded in endangered analogue formats • Only means of ensuring long-term preservation and access to audio • Optimal format for transcription and analysis • Distributed management & access (including authentication) via broadband R&E networks • Participation in international consortia for resource discovery and advice • Quality-controlled citeable primary data resource to support research results

  4. The coming revolution … • A quality-controlled citeable primary data resource to support research results requires: • Authenticated resource creation path • Fine-grained description of resource: metadata, transcript, timecoding (translation…) • Sustainability, security, discoverability and accessibility of resource (i.e. needs to be online) • Instantiation of links between research results and primary data (e.g. via electronic publication)

  5. Other regional digital language and music archives • Archive of Maori and Pacific Music, U. Auckland • Tjibaou Cultural Centre, New Caledonia • Vanuatu Cultural Centre • Institute of Papua New Guinea Studies Music Archive, Port Moresby • Australian Institute of Aboriginal and Torres Strait Islander Studies audiovisual archive • Alaska Native Language Center • Archive of Indigenous Languages of Latin America • Formosan Language Archive • Others … e.g. Malaysia …

  6. Some European archives hosting Asia-Pacific region material • DoBeS (Documentation of Endangered Languages) Archive, Max Planck Institute, Nijmegen, Holland • Endangered Languages Programme Archive, SOAS, UK • Vienna Phonogrammarchiv • Berlin Phonogrammarchiv • LACITO, France • Musée de l’Homme, France • British National Sound Archive …

  7. About PARADISEC • Established 2003 to preserve and make accessible Australian researchers’ field recordings of endangered languages and musics from the Asia-Pacific region • Collaborative project funded by the Australian Research Council; participants are the Universities of Sydney and Melbourne and ANU • Does not include Australian languages, which are managed via AIATSIS • Present focus on audio recordings, with plans to include and integrate other digital resources

  8. Collection status Jan 2004 • 1324 assessed records, covering approx. 150 regional languages from 14 countries (Australia, Burma, Fiji, Indonesia, Japan, Laos, Malaysia, Micronesia, New Zealand, Papua New Guinea, Singapore, Taiwan, Vanuatu, Vietnam) • 392 hours ingested and online via password, APAC store account - on target for 500 hours (1 terabyte) in first year • Metadata quality control via registration with Open Language Archives Community (6/03) and OAI • First collections digitised and returned to depositors

  9. Metadata - shared online database • For description, assessment, rights, access • Filemaker Pro while in development • Currently moving to MySQL/PHP • Created & managed online in shared server space • Public access to catalogue planned for 2004 • Will link to collection (for authorised users) (Nick Thieberger, Melbourne unit, PARADISEC project manager)

  10. PARADISEC audio standards • 24-bit 96 kHz Broadcast Wave Format (uncompressed PCM audio with encapsulated metadata), approx. 2 GB/h • Ingestion managed via Quadriga system (also used by National Library of Australia, ScreenSound, etc.) • CD-audio and MP3 browser copies via batch processing (Frank Davey, audio engineer, Sydney unit)
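The storage figures quoted on the slides (about 2 GB per hour, and about 1 terabyte for 500 hours) can be checked with a short calculation. The sketch below is illustrative only: it assumes two-channel (stereo) recordings, which the slides do not state, and it ignores the small overhead of the Broadcast Wave header and embedded metadata. The function name is hypothetical.

```python
def pcm_bytes_per_hour(bit_depth, sample_rate_hz, channels):
    """Size in bytes of one hour of uncompressed PCM audio."""
    bytes_per_sample = bit_depth // 8
    return bytes_per_sample * sample_rate_hz * channels * 3600


# Assumption: stereo (2 channels); the slides give only bit depth and sample rate.
per_hour = pcm_bytes_per_hour(bit_depth=24, sample_rate_hz=96_000, channels=2)
first_year = 500 * per_hour  # the 500-hour first-year target

print(f"{per_hour / 1e9:.2f} GB per hour")
print(f"{first_year / 1e12:.2f} TB for 500 hours")
```

Under the stereo assumption the result is just over 2 GB per hour and just over 1 TB for 500 hours (decimal units), consistent with the figures on the slides. A mono recording at the same settings would need half as much.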

  11. Depositor and user liaison • PARADISEC is a digital archive only - provides temporary storage while objects are digitised • Originals returned to originating institution/depositor with CD-audio copy • Depositors have online password-protected access to full-resolution digital files • We provide advice on archiving of originals if requested • Born-digital originals will revolutionise work practices (Amanda Harris, project administration, Sydney unit)

  12. PARADISEC structure [diagram; labels: digitisation (Sydney); metadata/database design (Melbourne); data entry/administration; working space; Usyd MSS; archive space “Azoulay”; APAC national facility (Canberra); password authentication; depositor; owner; cultural centre; authorised general user]

  13. Rights • Depositor and user agreement forms online • Rights information embedded in the processing system for eventual automated access or restriction of access • Trial password access currently implemented on APAC store and shared database

  14. Access (audio online) • Download whole files from data store (e.g. for authorised community use) • Streaming MP3 (browsing) • Audition section of file (in development 2004) • Transcript, dictionaries, maps, images etc as point of entry to collection (in development 2004) • Effective access depends on transcripts with translations and timecoding • Need ‘timecoding for dummies’ tools • Encouragement for users to add value to repository by lodging transcripts, indexes etc.

  15. Training & Resources • Demand for practical workshops for researchers and communities • Researcher training to archive in everyday practice not just as end point • Website as gateway for online resources • Potential for online collaboration with users and stakeholder communities in adding value to collection through timecoding and metadata

  16. PARADISEC’s communities [diagram; labels: PARADISEC; International discipline-related digital entities; National media archives; Regional stakeholders and cultural centres; Australian Higher Education Sector]

  17. Regional stakeholders: Regional community • Speakers/performers and their inheritors • Local and national cultural centres (Vanuatu Kaljoral Senta, Institute of PNG Studies, etc.) • Must be involved for ethical and rights reasons • Significant user community

  18. Regional stakeholders: Issues • Differentials in infrastructure • Differentials in funding • Training and career structures • Technical support • Local language access interface

  19. Regional stakeholders: PARADISEC wishlist • Effective international networking links to stakeholder communities • User-friendly, cost-effective and open-source database, indexation and annotation software • More opportunities for user workshops and skillsharing within the Asia-Pacific region • Greater awareness of potential for cultural heritage applications in the planning/feasibility study stages of regional infrastructure projects

  20. International discipline-related digital entities: OLAC (http://www.language-archives.org) • Sub-community of Open Archives Initiative • Worldwide virtual library of language resources • PARADISEC one of 27 participating archives • Aims: develop consensus on best current practice for digital archiving of language resources; develop network of interoperating repositories & services for housing & accessing such resources
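Interoperating repositories in the Open Archives Initiative expose their catalogue records to harvesters through the OAI Protocol for Metadata Harvesting (OAI-PMH), in which every request is an HTTP query carrying a `verb` argument. The following is a minimal sketch of how such request URLs are formed. The base URL and the helper name are hypothetical and do not refer to PARADISEC's actual endpoint; `oai_dc` (unqualified Dublin Core) is the metadata format that every OAI-PMH repository is required to support.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for illustration; not an actual archive address.
BASE_URL = "https://archive.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(arguments)
    return f"{base_url}?{urlencode(query)}"


# "Identify" asks the repository to describe itself.
identify_url = build_oai_request(BASE_URL, "Identify")
# "ListRecords" harvests metadata records in the requested format.
list_records_url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_records_url)
```

The sketch only constructs the URLs; a harvester would then fetch them and parse the XML responses.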

  21. International discipline-related digital entities: DELAMAN (http://www.delaman.org) • Other participants include: • Alaska Native Language Center Archives (University of Alaska Fairbanks, USA) • Archive of Indigenous Languages of Latin America (University of Texas, USA) • Archive of Maori and Pacific Music (University of Auckland, New Zealand) • DoBeS archive (Max Planck Institut für Psycholinguistik, Holland) • ELAR archive (School of Oriental and African Studies, UK)

  22. International discipline-related digital entities: Issues • Differentials in scope and mission of participants • Differential IP and rights protocols across international boundaries • Differentials in data structures, standards and system architectures

  23. International discipline-related digital entities: Wishlist • Networking, ethical agreements and standards to allow mirroring of data between participating archives to provide secure backup and efficiencies in data provision to global user communities

  24. Linkages • Support and advice from: ANU Internet Futures, APAC, GrangeNet; ScreenSound; National Library; AIATSIS • Collaborations: EMELD (Electronic Metastructure for Endangered Languages Data); DELAMAN and OLAC; regional cultural organisations • Strategic partnerships with other digital archives

  25. Contacts • Please visit our website: http://www.paradisec.org.au • Director (Sydney unit): lb@paradisec.org.au • Project manager (Melbourne): nickt@paradisec.org.au
