TPEN: Transcription for Paleographic and Editorial Notation

Publishing transcriptions as annotations of manuscript images. Funded by the Andrew W. Mellon Foundation and the National Endowment for the Humanities. Initial beta release: October 2011. http://www.digital-editor.blogspot.com/


Presentation Transcript


  1. Publishing transcriptions as annotations of manuscript images. TPEN: Transcription for Paleographic and Editorial Notation. Funded by the Andrew W. Mellon Foundation and the National Endowment for the Humanities. Initial beta release: October 2011. http://www.digital-editor.blogspot.com/ http://t-pen.org Jonathan Deering, Saint Louis University, jdeerin1@slu.edu

  2. A bit of history: CCCC 415, the Norman Anonymous • Latin text written around 1106 in Normandy • Repositories that provide digital images of manuscripts offer viewing environments that are fine for inspecting images, but not for transcribing them • Connecting the text with the image at the line level has a number of benefits for both transcribing and viewing • Automatic line segmentation identifies the lines quite well

  3. Connect a line of transcribed text with a line from the image

  4. Adding a repository • TPEN runs a discovery process on a new repository, using a customized spider or parsing a manifest to note every MS available and which image URLs make up each MS • Metadata about each MS is stored, as is image metadata • That is all! Currently supported: CEEC, e-codices, Houghton Library (Harvard University), La Biblioteca del Sacro Convento di Assisi, and Parker on the Web.
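The manifest route of the discovery step can be sketched as follows. The slides do not show a manifest format, so the JSON layout here (`manuscripts`, `id`, `pages`, `image`) and the function name `index_manifest` are hypothetical, chosen only to illustrate mapping each MS to its ordered image URLs.

```python
import json
from typing import Dict, List


def index_manifest(manifest_json: str) -> Dict[str, List[str]]:
    """Map each manuscript id to its ordered list of image URLs.

    The manifest layout is an assumption for illustration, not a format
    used by TPEN or by any of the repositories named in the slides.
    """
    manifest = json.loads(manifest_json)
    index: Dict[str, List[str]] = {}
    for ms in manifest["manuscripts"]:
        index[ms["id"]] = [page["image"] for page in ms["pages"]]
    return index


# Hypothetical manifest with placeholder URLs.
sample = json.dumps({
    "manuscripts": [
        {
            "id": "ms-001",
            "pages": [
                {"image": "http://example.org/ms-001/001.jpg"},
                {"image": "http://example.org/ms-001/002.jpg"},
            ],
        }
    ]
})

index = index_manifest(sample)
print(index["ms-001"])
```

Only URLs and metadata are recorded; the images themselves stay with the repository.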

  5. Choosing a manuscript

  6. The transcription environment • The user requests to transcribe a manuscript. • They may forgo modifying the list of images included and the image order, and begin transcribing the first page. • TPEN downloads the first image, parses the lines, and uses that information to draw the transcription environment. • The UI drawn for the user requests the image from the repository, not from TPEN.
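The slides do not say how TPEN parses the lines. As a rough illustration of what line parsing produces, here is a minimal sketch of a horizontal projection profile, a common textbook technique that is swapped in here and is not claimed to be TPEN's algorithm. It assumes the page has already been binarized (1 = ink, 0 = background); the name `segment_lines` and the `min_ink` threshold are made up for this example.

```python
from typing import List, Tuple


def segment_lines(rows: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, height) for each run of consecutive pixel rows with ink.

    `rows` is a binarized page image: 1 = ink pixel, 0 = background.
    A row counts as part of a text line when it has at least `min_ink`
    ink pixels.
    """
    lines: List[Tuple[int, int]] = []
    start = None  # top row of the line currently being scanned, if any
    for y, row in enumerate(rows):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                        # a new line begins
        elif not has_ink and start is not None:
            lines.append((start, y - start))  # the line just ended
            start = None
    if start is not None:                     # line runs to the bottom edge
        lines.append((start, len(rows) - start))
    return lines


# Tiny synthetic "page": two lines of ink separated by blank rows.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
]
print(segment_lines(page))  # [(1, 2), (4, 1)]
```

Each (top, height) pair, combined with the left edge and width of the text block, gives the kind of bounding box that a transcription line can be attached to.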

  7. The transcription UI

  8. Anatomy of a transcription • Transcribed text • Optional additional comment as annotation on the transcription • Image URL + xyhw (the line's x, y, height, and width on the image) • Creator: useful when choosing among multiple transcriptions • Date
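The parts listed above can be pictured as a single record per transcribed line. This is a minimal sketch only: the class name, field names, and sample creator and date are illustrative assumptions, not TPEN's actual schema, and the image URL is a placeholder.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TranscriptionLine:
    """One transcribed line. Field names are illustrative, not TPEN's schema."""
    text: str
    image_url: str
    x: int
    y: int
    h: int
    w: int
    creator: str
    created: date
    comment: Optional[str] = None  # optional annotation on the transcription

    def target(self) -> str:
        """Image URL plus an xyhw fragment selecting the line's region."""
        return f"{self.image_url}#xyhw={self.x},{self.y},{self.h},{self.w}"


line = TranscriptionLine(
    text="tationis exercere conveniat.",
    image_url="http://example.org/ms/page_001.jpg",  # placeholder URL
    x=0, y=770, h=35, w=486,
    creator="example-user",        # hypothetical creator
    created=date(2011, 10, 1),     # hypothetical date
)
print(line.target())
```

Keeping the creator and date on every line is what makes it possible to choose among several transcriptions of the same image region.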

  9. The life of a transcription • The user creates and saves their transcription. It is not made public unless they have given permission. • Exporting the transcription lets the user transform any XML tagging they have included and output the transcription as PDF, RTF, or XML. • They may also make it available as a set of OAC annotations, which TPEN will host.

  10. Common editing processes 1. Transcribe (months) 2. Edit (years) 3. Publish (???)

  11. Why transcriptions as annotations? • Created content is based on original content, but separation is maintained • Creation requires some editorial decision making • Multiple annotations and transcriptions can exist for the same original content

  12. Publishing the transcription as an OAC annotation • OAC annotations have 3 parts • Body: the content of the annotation • Target: the item that is being annotated • Relationship: the fact that the relationship is annotation • [Diagram: RDF graph in which the Annotation points to the transcription via hasBody and to the image via hasTarget]

  13. An actual annotation • oac:hasBody: http://t-pen.org/transcription/197262/11 • oac:hasTarget: http://www.ceec.uni-koeln.de/projekte/CEEC/manuscripts/indiv/kn28-0122/kn28-0122_001.jpg#xyhw=0,770,35,486 • Content of the body at http://t-pen.org/transcription/197262/11: "tationis exercere conveniat."
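To make the body/target structure concrete, here is a minimal sketch that writes the annotation above as N-Triples lines and reads the region back out of the target URL. Several things are assumptions: the `oac:` namespace URI, the annotation's own URI (the slide does not give one, so a placeholder is used), and the reading of `xyhw` as x, y, height, width in that order. The helper names are made up for this example.

```python
from urllib.parse import urldefrag

OAC = "http://www.openannotation.org/ns/"  # assumed namespace for the oac: prefix
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(annotation_uri: str, body_uri: str, target_uri: str) -> list:
    """Return the three statements of an annotation as N-Triples lines."""
    return [
        f"<{annotation_uri}> <{RDF_TYPE}> <{OAC}Annotation> .",
        f"<{annotation_uri}> <{OAC}hasBody> <{body_uri}> .",
        f"<{annotation_uri}> <{OAC}hasTarget> <{target_uri}> .",
    ]


def parse_xyhw(target_uri: str):
    """Split a target into the image URL and its x, y, h, w region."""
    image_url, fragment = urldefrag(target_uri)
    key, _, value = fragment.partition("=")
    if key != "xyhw":
        raise ValueError(f"expected an xyhw fragment, got {fragment!r}")
    x, y, h, w = (int(n) for n in value.split(","))
    return image_url, {"x": x, "y": y, "h": h, "w": w}


body = "http://t-pen.org/transcription/197262/11"
target = ("http://www.ceec.uni-koeln.de/projekte/CEEC/manuscripts/indiv/"
          "kn28-0122/kn28-0122_001.jpg#xyhw=0,770,35,486")
anno = "http://example.org/annotation/1"  # placeholder annotation URI

for triple in annotation_triples(anno, body, target):
    print(triple)

image_url, region = parse_xyhw(target)
print(image_url, region)
```

Because the target is just the repository's image URL plus a fragment, a consumer can fetch the image from the repository and the text from TPEN without either side rehosting the other's content.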

  14. Why OAC? • The semantic web approach fits well with the model of connecting text we have with images someone else has • Fits with the SharedCanvas mechanism we use to publish the image order of virtual manuscripts • Allows the transcription to be used in mashups without rehosting the transcription • Goal is machine readability!

  15. Treating transcriptions as a separate outcome Transcriptions can have great value aside from being an intermediate step in the production of an edition. Here are a few of the ways they can be used...

  16. Search can use the full text of the manuscript

  17. Text and image can be displayed side by side or overlaid to enable easier inspection

  18. Complex edited editions • Allows easy reference back to source transcriptions from within the edition • Allows reference to the source images without any additional effort by the editor • Allows interoperability of tools

  19. Acknowledgements • Stanford University Libraries • Houghton Library, Harvard University • Johns Hopkins University • University of Kentucky • University of Fribourg • University of Cologne • Biblioteca del Sacro Convento di Assisi (Italy)
