
Processing PDF: How to Go from PDF to E-text to Audio

Gaeir Dietrich, Director, High Tech Center Training Unit of the California Community Colleges, Foothill Community College District


Presentation Transcript


  1. Processing PDF: How to Go from PDF to E-text to Audio • Gaeir Dietrich, Director, High Tech Center Training Unit of the California Community Colleges, Foothill Community College District

  2. PDF from Publishers • Portable document format (PDF) • Reads the same on any computer • Looks like the book • Smaller than TIFFs • Contains all the text • Always check to make sure the book is the right one! • Easy for publishers

  3. Requesting through ATN • AccessText Network (ATN) • Now free for requesting files from ATN-member publishers • Paid membership to exchange files • www.accesstext.org • Not all publishers are members • But ATN does have the largest ones

  4. Other Resources at ATN • Accessible Textbook Finder • http://www.accesstext.org/atf.php • Link to Publisher Lookup • http://www.publisherlookup.org/ • Will have to contact non-ATN member publishers directly

  5. Using Publisher PDFs • Sometimes students can use files directly • Often files will need further processing for student use • At the very least, large files may need to be broken into chapters

  6. PDF Strengths • Good format for large print • Cropping • Fit to page on large pages • Print sections on large pages (tiling) • Adobe Reader has some nice features • Change colors • Reflow • Limited voicing • Works on both Mac and PC • Easy for most publishers to create

  7. PDF Weaknesses • Not always fully accessible • Screen readers do not always like them, even when they are text-based • Reading order can be problematic • May be graphics (pictures of text) • May have too much security

  8. As an Aside… • When faculty create PDFs… • The PDF always started as something else…usually a Word file • Try to get the starting document if the student prefers audio • Security concerns? • Word files can be password protected • Button > Prepare > Encrypt

  9. Types of PDF Documents • Text-based • Text can be selected • Graphical • Picture of text (i.e., a graphic) • Text cannot be selected • Use text-select tool to tell the difference • Files may be “locked”

  10. Processing PDFs • Adobe Acrobat Professional • Check on College Buys for discount • Good OCR program • ABBYY FineReader • Nuance OmniPage • If you are a Kurzweil campus, you will also need Kurzweil

  11. Adobe Tools • Adobe Reader • Free • Useful for students who need minimal accessibility features • http://www.adobe.com/products/reader/ • Adobe Acrobat Professional • Essential for alt media specialists • Extract text, create accessible PDFs, enable Adobe Reader features • www.uscollegebuy.com • Discounted price

  12. Acrobat Reader • Reads aloud • But does not highlight or track • Enlarges text • Nice reflow feature • Changes text/background colors • Text highlighting, sticky notes, and comments • Access for text-based PDFs

  13. Production Features in Reader • Really designed for reading, not reformatting • Export PDF • Subscription service (about $20/year) • Upload PDF file, service auto-converts to Word, download

  14. Process with Acrobat Pro • Cropping • Enlargement for printing • Tiling • Extracting/deleting pages • Combining/inserting pages • Text extraction • Works best with text-based PDF • Does have built-in OCR capability

  15. Customize Quick Tools • Click on the “gear” • View > Show/hide > Toolbar Items > Quick Tools

  16. Quick Tools Menu

  17. Customize

  18. Please Note • To enable single-key shortcuts • Open the Preferences dialog box (Ctrl + K) • Under General > select Use Single-Key Accelerators To Access Tools (first checkbox under Basic Tools)

  19. Cropping • Tools > Pages > Crop • Shortcut: C • (Please note: This shortcut brings up the mouse-driven cropping tool—must double click to open the dialog box!)

  20. Crop Tool

  21. Crop Toolbox

  22. Enlarging • Choose paper size/printer • File > Print > Size…to Fit • Shortcut: Ctrl + P (tab through) • Tip: Crop document before enlarging
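
The tip to crop before enlarging comes down to simple arithmetic: the less white margin there is on the page, the larger the text can be scaled before it runs off the paper. The sketch below is a rough illustration only. The function name and the page sizes are hypothetical, and it ignores the printer's unprintable margins, so it will not match Acrobat's numbers exactly.

```python
def fit_scale(content_w, content_h, paper_w, paper_h):
    """Largest uniform scale at which the content still fits on the paper."""
    return min(paper_w / content_w, paper_h / content_h)

# Hypothetical sizes in inches: a letter-size page printed on 11 x 17 paper.
full_page = fit_scale(8.5, 11.0, 11.0, 17.0)

# The same page after cropping one inch of white space from every edge.
cropped = fit_scale(6.5, 9.0, 11.0, 17.0)

print(f"uncropped: {full_page:.0%}, cropped: {cropped:.0%}")
```

With these example numbers the cropped page can be enlarged to roughly 169% instead of 129%.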

  23. Print to Fit

  24. Tiling • Choose paper size/printer • File > Print > Poster > Tile Scale and Overlap • Shortcut: Ctrl + P (tab through) • Tip: Crop document before tiling
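
To get a feel for how scale and overlap affect the number of sheets a tiled page needs, here is a minimal sketch of the arithmetic. It is not Acrobat's actual algorithm: the function names are hypothetical, sizes are in inches, and printer margins are ignored.

```python
import math

def tiles_needed(length, scale, sheet, overlap):
    """Sheets needed along one dimension when neighbouring tiles share `overlap`."""
    if overlap >= sheet:
        raise ValueError("overlap must be smaller than the sheet")
    scaled = length * scale
    if scaled <= sheet:
        return 1
    # The first sheet covers `sheet`; every further sheet adds (sheet - overlap).
    return 1 + math.ceil((scaled - sheet) / (sheet - overlap))

def tile_grid(page_w, page_h, scale, sheet_w, sheet_h, overlap):
    """Return (columns, rows) of sheets for one enlarged page."""
    return (tiles_needed(page_w, scale, sheet_w, overlap),
            tiles_needed(page_h, scale, sheet_h, overlap))

# Letter page at 200% on letter paper with a half-inch overlap.
print(tile_grid(8.5, 11.0, 2.0, 8.5, 11.0, 0.5))   # (3, 3)

# The same page cropped to 6.5 x 9 first.
print(tile_grid(6.5, 9.0, 2.0, 8.5, 11.0, 0.5))    # (2, 2)
```

In this example, cropping first cuts the job from nine sheets per page to four, which is the reason behind the tip to crop before tiling.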

  25. Enlarge with Tiling

  26. Extracting Pages • Tools > Pages > Extract • Delete Shortcut: Ctrl + Shift + D • Extract Pages Shortcut: Alt V + T + P (opens Pages pane; F6 focuses in pane and can arrow down)

  27. Extraction Tool

  28. Tips for Extracting Chapters • Crop on complete file before extracting • Work on a copy!!!!! • Extract from end toward front! • Use table of contents to help • Place focus on first page of chapter to extract (beginning with last)
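
Working from the end toward the front means the page numbers of the earlier chapters never shift, assuming the extracted pages are deleted from the working copy as you go. The sketch below only plans the page ranges from a table of contents. It does not touch a PDF, and the function name, chapter names, and page numbers are made up for illustration.

```python
def extraction_plan(chapter_starts, total_pages):
    """Return (chapter, first_page, last_page) tuples, last chapter first.

    chapter_starts maps a chapter name to the PDF page it starts on.
    """
    ordered = sorted(chapter_starts.items(), key=lambda item: item[1])
    plan = []
    end = total_pages
    for name, start in reversed(ordered):
        plan.append((name, start, end))
        end = start - 1  # after deleting, the file now ends just before this chapter
    return plan

starts = {"Chapter 1": 1, "Chapter 2": 25, "Chapter 3": 61}
for name, first, last in extraction_plan(starts, total_pages=90):
    print(f"{name}: extract pages {first}-{last}")
```

Each range runs from the first page of the chapter to the current last page of the working copy, which matches placing the focus on the first page of the chapter and extracting through the end.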

  29. Starting from the Back

  30. Combining • Tools > Pages > Insert • OR • Create > Combine files

  31. Inserting Pages

  32. Combining Pages

  33. Auto Extracting Text • File > Save As > MS Word • Retains styles and paragraphs • File > Save As > More options… • Text (Accessible) • Lose styles, places hard returns at end of line • Text (Plain) • Lose styles, keeps paragraphs • Shortcut: Alt F + A
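
If a file has already been saved with hard returns at the end of every line, the lines can be rejoined into paragraphs afterward. This is a minimal sketch that assumes paragraphs are separated by a blank line, which may not hold for every export. The function name is hypothetical, and it does not try to repair words hyphenated across lines.

```python
def unwrap(text):
    """Join hard-wrapped lines into paragraphs; blank lines mark paragraph breaks."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)

sample = "PDF reads the same\non any computer.\n\nIt looks like\nthe book.\n"
print(unwrap(sample))
```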

  34. Save As Options

  35. Better Text Extraction • OCR programs analyze text and structure • Acrobat Pro has built-in OCR, but other programs provide more control • Can control which text to include

  36. More Control over Text • For graphical PDFs • Or • To maintain more control over extracting text from text-based PDFs • Use an OCR program!

  37. Processing Graphical PDFs • Must run optical character recognition (OCR) • Computers cannot read pictures • OCR programs recognize the “characters” in the picture • How you process the file depends on the end format the student wants!

  38. Want to Stay in PDF? • Sometimes students do want a text-based PDF • Can OCR in Acrobat Pro • Tools > Recognize Text

  39. Under Tools

  40. Want Text Out • OmniPage or FineReader • FineReader generally easier to learn • Save to Word or HTML or Text based on student preference • Use virtual printer with Kurzweil • Create KESI files • R&W • Save as Word

  41. Which One When? • Want a Word file? • Best choice is OmniPage or FineReader • Want a Kurzweil document? • Use Kurzweil to process the PDF • For students to do themselves? • Whichever program they prefer

  42. Why? • OCR programs are designed to make extraction and editing easy • Document readers (R&W, Kurzweil, etc.) are designed to make reading easy…NOT editing.

  43. NEVER!!! • Do NOT run OCR with FineReader or OmniPage…save to PDF…and then take into Kurzweil, R&W, etc. • Kurzweil, R&W, WYNN will run their own OCR on the PDF! • Doing OCR twice wastes time and adds errors

  44. OCR Programs • Treat PDFs the same as a TIFF • If you OCR scanned documents, use the same process • Load image file • Select zones • Create templates as needed

  45. OCR Process Details • Crop before loading into OCR engine • Turn on multiple languages as needed • If doing math, turn on Greek • Only turn on the languages you need • Edit in the OCR program • Some OCR programs have font matching features • Save to Word

  46. Captions and Such • For students who want audio or who are using screen readers • Separate the main body of the text and the “ancillary text” (captions, sidebars, footnotes) • Create two documents: 00 Chapter and 00A Chapter • Allows the student to hear main text uninterrupted

  47. Two Doc Workflow • Open PDF in OCR Program • Analyze layout for entire document • Save a copy • On one copy…delete all ancillary text • Save to Word as 00 Chapter • On other copy…delete all main body text • Save as 00A Chapter • Keep page numbers in both documents!
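
The logic of the two-document workflow can be sketched in a few lines. This is an illustration only: in practice the split is done by deleting zones in the OCR program, and the block labels ("page", "body", "ancillary") and the function name are hypothetical. It also assumes the "00" in the file names stands for the chapter number.

```python
def split_chapter(blocks, chapter_no):
    """Split labelled text blocks into a main document and an ancillary document.

    blocks is a list of (kind, text) pairs, where kind is "page" (a page
    number), "body" (main text), or "ancillary" (caption, sidebar, footnote).
    """
    main, extra = [], []
    for kind, text in blocks:
        if kind == "page":       # page numbers are kept in both documents
            main.append(text)
            extra.append(text)
        elif kind == "body":
            main.append(text)
        else:
            extra.append(text)
    return {
        f"{chapter_no:02d} Chapter": main,
        f"{chapter_no:02d}A Chapter": extra,
    }

blocks = [
    ("page", "Page 12"),
    ("body", "Main text of the chapter."),
    ("ancillary", "Figure 1: caption text."),
    ("page", "Page 13"),
    ("body", "More main text."),
]
docs = split_chapter(blocks, chapter_no=3)
for name, content in docs.items():
    print(name, content)
```

Keeping the page numbers in both outputs lets the student match a caption in the "A" document to the right spot in the main text.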

  48. Once in Word • Learn to use “show hidden” • Ctrl + Shift + 8 • Beware of the optional hyphen • Search and replace to delete • Search for ^- replace with nothing • Run spell check • Use styles to structure files for braille program
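
When the text has already left Word, for example as a plain-text file, the same cleanup can be done in a script. This minimal sketch assumes the optional hyphens arrive as the Unicode soft hyphen character (U+00AD), which depends on how the file was saved. The function name is hypothetical.

```python
SOFT_HYPHEN = "\u00ad"  # assumption: optional hyphens were exported as U+00AD

def strip_optional_hyphens(text):
    """Delete soft hyphens while leaving ordinary hyphens untouched."""
    return text.replace(SOFT_HYPHEN, "")

sample = "acces\u00adsible text-based PDF"
print(strip_optional_hyphens(sample))  # accessible text-based PDF
```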

  49. Converting Files

  50. Mobile Readers? • Check formats that device can handle • Some handle PDF and DOC, some do not • All readers handle TXT • Also called text, ASCII • Can save from Word as plain text
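
For a device that accepts only plain ASCII, curly quotes, dashes, and accented letters from a Word file may need to be simplified first. The sketch below is one possible approach, not part of the original workflow. The replacement table and function name are hypothetical, and the conversion is lossy: any character with no ASCII equivalent is dropped.

```python
import unicodedata

# Hypothetical replacement table for common "smart" punctuation from Word.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
}

def to_ascii(text):
    """Return an ASCII-only approximation of the text."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    # Split accented letters into base letter plus accent, then drop the accents.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(to_ascii("\u201cR\u00e9sum\u00e9\u201d \u2013 done\u2026"))  # "Resume" - done...
```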
