
What does it mean to “support TEI” for manuscript transcription?

Ben Brumfield, TEI 2012. References at http://manuscripttranscription.blogspot.com


Presentation Transcript


  1. What does it mean to “support TEI” for manuscript transcription? Ben Brumfield TEI 2012

  2. References at http://manuscripttranscription.blogspot.com • Transcript and slides to be posted.

  3. C{r|l}o{u|w}ds • Clouds • Crowds

  4. Crowdsourced Transcription • Genealogy • Natural Sciences • Open Source/Creative Commons • Libraries/Archives/Museums

  5. Record-based Transcription • Millions of entries transcribed. • Tabular document inputs. • Database output format. • Uses understood in advance • Mark-up…

  6. Free-form Transcription • Scripto • NARA initiatives • Wikisource (especially DE) • FromThePage

  7. Why is TEI important? • TEI’s influence extends beyond its use. • How does the public encounter edition? • How do developers learn about edition?

  8. Who is doing TEI, then? • Transcription Tool Directory • 27 Tools • 23 Editors • 7 “support TEI” • But what does that mean?

  9. TEI Support Interviews • Transcribe Bentham • T-PEN • Carolingian Canon Law (CCL) • Papyrological Editor (Papyri.info) • Monasterium.net (MOM-CA) • Virtuelles deutsches Urkundennetzwerk (VdU) • ItineraNova.be

  10. Emphasis on Transcribe Bentham • Tried TEI support as an experiment • “It was part of the research, really: can online volunteers do this?” –Melissa Terras • Publications on results • “Transcription Maximised, Expense Minimised?” (LLC) • “Building a Volunteer Community” (DHQ)

  11. Variation: Commitment to TEI • “It was untenable to me that we could just ask for ascii transcriptions of text, given the years and years knowledge of best practice in marking up texts.” –Melissa Terras (TB) • “I personally am a TEI sceptic.” –James Ginther (T-PEN)

  12. Variation: Commitment to TEI • “I am no evangelist[…] but am more inclined to see it as a standard that is probably necessary and in many respects useful[....] On balance, it has allowed us to develop valuable intellectual perspectives on our texts[….] It is important that we are doing so in the context of a community of practice” –Abigail Firey (CCL)

  13. Variation: Location of TEI • Transcribe Bentham • Pure TEI Interface • Pages stored in MediaWiki • Itinera Nova • Pure TEI Transcripts in XRX system • “The volunteers have no idea what XML or TEI is. The archivists know that the syntax is internally transformed into TEI.” –Jochen Graf

  14. Encoding • Common assumption: encoding is hard! • Several respondents were not so sure: one believed there was "[t]oo much markup expected", another that encoding was "unnecessarily complicated", and one – who, unsurprisingly, is not a regular transcriber – found encoding "a hopeless nightmare" and the transcription process "a horror". (Causer and Wallace, “Building a Volunteer Community”)

  15. Encoding • But lots of things are hard. • Non-TEI mark-up is a challenge. • [O]ver half of respondents found that deciphering Bentham’s hand took longer than encoding (Causer and Wallace, “Building a Volunteer Community”)

  16. Making Encoding Easier • Tag Buttons • T-PEN/CCL • TEI Toolbar (TB) • Tag Menus • VdU • Papyrological Editor
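A tag button, as listed on the slide above, can be thought of as a function that wraps the transcriber's current selection in an element. The following is only a minimal sketch of that idea, not the code of any of the tools named: the function name `wrap_selection` and the choice of `del` as the example element are assumptions made for illustration.

```python
def wrap_selection(text: str, start: int, end: int, tag: str) -> str:
    """Wrap text[start:end] in <tag>...</tag>, as a toolbar button might.

    Hypothetical helper for illustration; real editors operate on a
    browser text area rather than a plain Python string.
    """
    if not (0 <= start <= end <= len(text)):
        raise ValueError("selection out of range")
    # Rebuild the string with the opening and closing tags placed
    # around the selected span.
    return f"{text[:start]}<{tag}>{text[start:end]}</{tag}>{text[end:]}"


if __name__ == "__main__":
    # Mark the word "quick" as a deletion.
    print(wrap_selection("the quick fox", 4, 9, "del"))
```

The transcriber never types an angle bracket here, which is the appeal of buttons; the limitation discussed on the next slide is that a long palette of such buttons becomes slow to search.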

  17. Button Limitations • Users outgrow buttons • “I believe one or two transcribers now add tags manually rather than use the toolbar, which says something about the improvement in their IT skills.” –Tim Causer (TB) • Users ignore buttons • “One editor for exampled prefered to put || for <lb> as he was used from the preparation of a printed edition.” –Georg Vogeler (VdU)

  18. Problem: Print Notations • Users ignore XML annotation, revert to conventions from printed editions. • Opportunity: transform print notation to TEI • Papyri.info (Leiden+) • Itinera Nova
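The "transform print notation to TEI" opportunity can be sketched as a small text-to-XML conversion. This is an illustrative sketch only and does not reproduce Leiden+ or the Itinera Nova syntax: the two rules below (`||` for a line break, taken from the Vogeler quote on the previous slide, and square brackets for editorially supplied text) and the function name `print_notation_to_tei` are assumptions chosen for the example.

```python
import re
from xml.sax.saxutils import escape


def print_notation_to_tei(line: str) -> str:
    """Convert a line of print-style notation into a TEI-like fragment.

    Hypothetical rules for illustration:
      ||      becomes <lb/>
      [text]  becomes <supplied>text</supplied>
    """
    # Escape &, < and > first so the transcriber's own characters
    # cannot be mistaken for markup.
    xml = escape(line)
    # Line-break marker used in some printed editions.
    xml = xml.replace("||", "<lb/>")
    # Bracketed text, non-nested only.
    xml = re.sub(r"\[([^\[\]]+)\]", r"<supplied>\1</supplied>", xml)
    return xml


if __name__ == "__main__":
    print(print_notation_to_tei("in nomine || domini [amen]"))
```

The design point is that the editor keeps typing the notation they already know, and the system stores TEI; a production converter would need a real grammar rather than two substitutions.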

  19. Leiden+ in Papyri.info

  20. Leuven Archives at Itinera Nova

  21. Leuven Archives at Itinera Nova

  22. Non-Problem: Print Notations • “I don't think we have seen many (if any) examples of this - transcribers do seem to stick to the tag set.” • “It's now very rare for any transcribers to submit a plain text transcription” –Tim Causer (TB)

  23. Why? • Material: • Established conventions for papyri, charters. • No single convention for modern print editions. • Users: • Public has no “bad habits” to break

  24. Tag Selection • “Several problems caused us to modify that button set. First, there were something like 67 necessary buttons, and it was maddening to fish around for the desired button. And the research assistants, who had been encoding in oXygen, just typed in angle brackets and memorized tags, instead of using the buttons. As a result, we made it possible to move the buttons to one's liking, so we could put them in an order in the button palette that made sense in terms of finding them. Second, keeping track of several branches of an XML tree line by line, even with the prompting closing buttons, was a pain. Therefore, fairly soon thereafter we omitted the "purely structural" tags in the button set.” –Abigail Firey (CCL)

  25. Tag Selection • “I have a literary background, and saw further opportunities for encoding Bentham's authorial operations (refinements to the encoding of interventions in the text--things like that). But I recognised that what the project called for was light structural markup that did not overcomplicate the users' job.” –Justin Tonra (TB)

  26. Tag Selection • “The job of application developers is to find the most effective subset for a particular project, and devise ways for 'general' users to enter it, without distracting them unduly from the content they are inputting” –Richard Davis
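One way to make a project's chosen subset concrete is to check submitted transcripts against it. The sketch below assumes a hypothetical subset of five elements (`lb`, `del`, `add`, `unclear`, `gap` are real TEI element names, but this particular selection is not taken from any project in the talk) and a hypothetical helper name, `tags_outside_subset`.

```python
import xml.etree.ElementTree as ET

# Hypothetical project subset; a real project would define its own.
ALLOWED = {"lb", "del", "add", "unclear", "gap"}


def tags_outside_subset(fragment: str, allowed=ALLOWED) -> list:
    """Return the sorted element names in a fragment that are not allowed.

    The fragment is wrapped in a dummy root so that mixed text and
    elements parse as one well-formed document. Malformed XML raises
    xml.etree.ElementTree.ParseError.
    """
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    found = {el.tag for el in root.iter() if el is not root}
    return sorted(found - set(allowed))


if __name__ == "__main__":
    sample = "a <del>b</del> <persName>c</persName><lb/>"
    print(tags_outside_subset(sample))
```

A check like this could flag unexpected tags for an editor's review rather than rejecting a volunteer's submission outright.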

  27. The Future: Training Games • “Currently we are going another way: teaching the users in "game play"-like method, i.e. building a demonstrator in which the novice can learn the use of the editor (and the whole system) by accomplishing some given tasks” –Georg Vogeler (VdU)

  28. The Future: WYSIWYG • “One solution is to introduce, as an alternative, a What-You-See-Is-What-You-Get interface, so that transcribing will be like typing in a word-processor. In this scenario, the transcription toolbar would be done away with, and transcribers would not have to concern themselves with visible markup at all.” –Tim Causer, Transcribe Bentham Blog

  29. The Future: Combinations?

  30. Interviewees • Transcribe Bentham • Melissa M. Terras • Justin Tonra • Tim Causer • Richard M. Davis

  31. Interviewees • T-PEN • James Ginther • Abigail Firey (CCL) • Papyri.info • Hugh Cayless • Tom Elliott • MOM-CA • Georg Vogeler (VdU) • Jochen Graf (Itinera Nova)

  32. Questions • Ben Brumfield • Independent software engineer • @benwbrum • benwbrum@gmail.com • FromThePage.com • http://manuscripttranscription.blogspot.com/
