
Analyzing Text at the Middle Distance between the Close Read and Culturomics

This presentation examines how text can be analyzed at the middle distance between close reading and culturomics. It defines both terms and presents two case studies that use WordSeer: one on the North American slave narratives and one on Shakespeare's works.


Presentation Transcript


  1. Analyzing Text at the Middle Distance between the Close Read and Culturomics • Marti A. Hearst • UC Berkeley • Joint work with Aditi Muralidharan

  2. Background: Culturomics (Text Mining) • Middle Distance: Sensemaking • Foreground: The Close Read

  3. Definition: “Close Read” “Close reading describes, in literary criticism, the careful, sustained interpretation of a brief passage of text. Such a reading places great emphasis on the particular over the general, paying close attention to individual words, syntax, and the order in which sentences and ideas unfold as they are read.” - English Wikipedia, 6/4/2012

  4. “Power and Passion in Shakespeare’s Pronouns: Interrogating ‘you’ and ‘thou’” Penelope Freedman, 2007, MPG Books, 280 pp. Scene from “As You Like It” by Daniel Maclise (1806-70)

  5. Conclusions (“Power and Passion in Shakespeare’s Pronouns”) “The subtleties of the use of ‘you’ and ‘thou’ that have emerged … can seem, at worst, random or, at best, unfathomable. … A set of oppositions has been revealed here: … These oppositions are complex and slippery: they may operate in parallel, may converge or diverge. Each pronoun choice has to be seen in a highly specific context.”

  6. Definition: “Culturomics” Narrower than “digital humanities” and broader than “corpus linguistics”. (Loose interpretation of definitions at culturomics.org)

  7. “Culturomics” example: middle distance vs. middle ground

  8. As an NLP Researcher, where do your ideas come from? Can HCI improve your work?

  9. Sensemaking (Pirolli and Card 2005, Pirolli and Russell 2011) • A vague information need • Iteratively refine it by searching, reading, and analyzing • Reach understanding

  10. Sensemaking for Literature Study

  11. WordSeer (version 1) The North American Pre-Civil-War Slave Narratives

  12. Do the North American slave narratives all conform to the same stereotypes? The North American Slave Narratives • Stories of the lives of former slaves • Published by white abolitionist sponsors • About 3000 narratives survive • ~300 in prototype

  13. A “Master Plan” for the slave narratives “... conventions so early and firmly established that one can imagine a sort of master outline drawn from the great narratives and guiding the lesser ones” -- Olney, J. “I was born: Slave Narratives and their Status as Autobiography”, Callaloo, 1984

  14. Our approach • Phase 1: Support searching for instances of conventions • Phase 2: Support visualizing their occurrence in the collection

  15. Searching for stereotypes • Keyword search is not enough • Search words: “cruel”, “harsh”, “overseer”, “master”, “mistress” • Instead: “overseer”, “master”, “mistress” described as “cruel”, “harsh” • Also want the entire picture, for comparison • “overseer”, “master”, “mistress” described as ____?____ • ____?____ described as “cruel”

  16. Natural language processing • Example sentence: “The cruel overseer beat us severely.” • Diagram labels: modifier, subject, object (automatically extracted structure)
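
The "described as" queries from slide 15 can be sketched as a lookup over grammatical relations with wildcard slots. This is only a minimal illustration, not WordSeer's implementation: the `Relation` type, the `grammatical_search` function, and the hand-written relations for the example sentence are all assumptions. A real system would obtain the relations from a dependency parser.

```python
from typing import NamedTuple

class Relation(NamedTuple):
    """One grammatical relation (hypothetical representation)."""
    kind: str       # e.g. "modifier", "subject", "object"
    head: str       # the word being described or governing
    dependent: str  # the describing or dependent word

# Hand-written relations for "The cruel overseer beat us severely."
# These stand in for parser output and are an assumption of this sketch.
RELATIONS = [
    Relation("modifier", "overseer", "cruel"),
    Relation("subject", "beat", "overseer"),
    Relation("object", "beat", "us"),
    Relation("modifier", "beat", "severely"),
]

def grammatical_search(relations, kind, head=None, dependent=None):
    """Return relations of the given kind; a slot left as None is a wildcard."""
    return [
        r for r in relations
        if r.kind == kind
        and (head is None or r.head == head)
        and (dependent is None or r.dependent == dependent)
    ]

# "overseer" described as ____?____
described_as = [
    r.dependent
    for r in grammatical_search(RELATIONS, "modifier", head="overseer")
]

# ____?____ described as "cruel"
who_is_cruel = [
    r.head
    for r in grammatical_search(RELATIONS, "modifier", dependent="cruel")
]
```

Leaving one slot open is what separates this from keyword search: the query constrains the relation between two words and returns whatever fills the gap.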

  17. Grammatical search

  18. Phase 2: Visualizing stereotypes • Prevalence • Position of occurrence within a document • Across the entire collection
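
One way to quantify "position of occurrence within a document" is to record where each match falls as a fraction of the document's length, then bin those fractions. The sketch below makes that concrete under stated assumptions: the function names and the toy document are invented for illustration and do not come from the slave narratives collection or from WordSeer.

```python
def relative_positions(text, phrase):
    """Start of each occurrence of phrase as a fraction of text length (0.0 = start)."""
    lowered, target = text.lower(), phrase.lower()
    if not lowered or not target:
        return []
    positions = []
    start = lowered.find(target)
    while start != -1:
        positions.append(start / len(lowered))
        start = lowered.find(target, start + 1)
    return positions

def position_histogram(positions, bins=10):
    """Count how many positions fall into each of `bins` equal slices of the document."""
    counts = [0] * bins
    for p in positions:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts

# Toy document, invented for illustration only.
doc = "I was born in Maryland. " + "Years passed. " * 8 + "At last I made my escape."

born = relative_positions(doc, "I was born")
escape = relative_positions(doc, "escape")
born_bins = position_histogram(born)
escape_bins = position_histogram(escape)
```

Summing such histograms over every document would give a collection-wide picture of where a convention tends to appear.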

  19. “I was born”

  20. Results (presented at MLA 2012) • Prevalent stereotypes • “I was born” • Separation from parents • Cruel treatment • Escape • A ‘missed’ stereotype • Parents’ death • Not as strictly ordered as implied by Olney’s master plan.

  21. Problems • Vocabulary • Same concept expressed with many different wordings • Needed to see synonyms, nearby words, suggestions on searches • Comparison and curation • Couldn’t isolate and compare results on sub-collections of documents
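
The "nearby words" need named above can be illustrated by counting the words that appear within a small window around a query term. This is a rough sketch under assumptions: `nearby_words`, the simple regex tokenizer, and the sample text are invented here, and nothing in the slides says WordSeer computes suggestions this way.

```python
import re
from collections import Counter

def nearby_words(text, query, window=3):
    """Count words appearing within `window` tokens of each occurrence of query."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == query:
            before = tokens[max(0, i - window):i]
            after = tokens[i + 1:i + 1 + window]
            counts.update(before + after)
    return counts

# Sample text invented for illustration only.
sample = "The cruel overseer beat us. The harsh overseer whipped us."
neighbours = nearby_words(sample, "overseer", window=1)
```

A reader searching for "cruel" could then be shown "harsh" as a related wording, since both occur next to the same noun.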

  22. WordSeer (version 1.5) wordseer.berkeley.edu

  23. Analyze Hamlet. How does the portrayal of men and women in Shakespeare change in different circumstances? The complete works of Shakespeare • 42 documents: plays and sonnet collections • 1589-1612 • English 203: Hamlet in the Humanities Lab, Spring 2012, University of Calgary (CHI ’12 works in progress)

  24. The Vocabulary Problem Which words embody the concept of female beauty? 261 results

  25. Collection and Comparison Does the treatment of love vary between the comedies and tragedies?

  26. Collection and Comparison Step 2. Compare word usage

  27. comedies tragedies

  28. “in love” comedies tragedies
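
Comparing word usage between two sub-collections can be sketched as a side-by-side of relative frequencies. The code below is an assumption-laden illustration: the function names and the two tiny stand-in texts are invented, it handles single words only (not phrases such as “in love”), and it is not a description of how WordSeer performs the comparison.

```python
import re
from collections import Counter

def relative_freq(texts):
    """Map each word to its share of all tokens across the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}

def compare_usage(group_a, group_b, words):
    """For each word, return (relative frequency in group_a, in group_b)."""
    freq_a, freq_b = relative_freq(group_a), relative_freq(group_b)
    return {w: (freq_a.get(w, 0.0), freq_b.get(w, 0.0)) for w in words}

# Stand-in texts invented for illustration; not Shakespeare's words.
comedies = ["love love jest"]
tragedies = ["love death death"]

usage = compare_usage(comedies, tragedies, ["love", "death"])
```

Relative rather than raw counts are used so that sub-collections of different sizes remain comparable.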

  29. Results • WordSeer 1.5 is being used successfully (so far) in the Hamlet class • How does the relationship between Hamlet and his mother change over the course of the play? • How does Act 1 portray the character of Horatio? • Investigated changing language use around men and women • Unknowingly replicated and extended previous findings by other Shakespeare scholars

  30. How Does This Apply to Social Media Language?

  31. As an NLP Researcher, where do your ideas come from? Can HCI improve your work?

  32. Sentiment Analysis? Sarcasm?
