
Kalev Leetaru, Eric Shook, and Shaowen Wang

A CyberGIS Approach to Digital Humanities and Social Sciences: The World of Textual Geography and a Case Study of Wikipedia’s History of the World. Kalev Leetaru, Eric Shook, and Shaowen Wang. CyberInfrastructure and Geospatial Information Laboratory (CIGI)

Presentation Transcript


  1. A CyberGIS Approach to Digital Humanities and Social Sciences: The World of Textual Geography and a Case Study of Wikipedia’s History of the World. Kalev Leetaru, Eric Shook, and Shaowen Wang. CyberInfrastructure and Geospatial Information Laboratory (CIGI); Department of Geography and Geographic Information Science; School of Earth, Society, and Environment; National Center for Supercomputing Applications (NCSA); University of Illinois at Urbana-Champaign. CyberGIS ’12, Urbana, IL, August 8, 2012.

  2. http://www.sgi.com/go/wikipedia

  3. Workflow Fulltext Geocoding Sentiment Mining CyberGIS

  4. Inside the CyberGIS “black box” (diagram labels): Open Service API; Workflow Management Services; GISolve Middleware; Security; Data & Viz; Resource Selection; Domain Decomposition; Task Scheduling; CI; Clouds; XSEDE; OSG; Emotional Heatmap.

  5. Data Input for a Topic. A set of locations (latitude, longitude points), each with 3 attributes: 1. Number of articles mentioning this location. 2. Number of articles mentioning both this location and the topic. 3. Average tone of articles mentioning both this location and the topic.

  6. Data Input for a Topic. A set of locations (latitude, longitude points), each with 3 attributes: 1. Number of articles mentioning this location. 2. Number of articles mentioning both this location and the topic. 3. Average tone of articles mentioning both this location and the topic. ?
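As an illustration of the input described on the two slides above, one location record could be represented as follows. This is a minimal sketch: the class name `LocationRecord`, its field names, and the sample values are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationRecord:
    """One input point for a topic (hypothetical field names)."""
    lat: float           # latitude of the point location
    lon: float           # longitude of the point location
    n_articles: int      # attribute 1: articles mentioning this location
    n_topic: int         # attribute 2: articles mentioning location and topic
    avg_tone: float      # attribute 3: average tone of the topic articles

    def n_non_topic(self) -> int:
        # Articles that mention the location but not the topic.
        return self.n_articles - self.n_topic


# Made-up example values, for illustration only.
record = LocationRecord(lat=33.3, lon=44.4, n_articles=120, n_topic=45, avg_tone=-12.5)
```

The derived count `n_non_topic` is a convenience for the topic intensity ratio that appears later in the deck.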

  7. Spatializing Emotion. 3 important elements: 1. Importance of location. 2. Prevalence of topic. 3. Emotion toward topic. Goal: capture all 3 elements on a single map.

  8. 1) Importance of Location. Every mention of a location increases its importance. Generate a density map of the number of times a location is mentioned in text, using Kernel Density Estimation (KDE) based on k-nearest neighbor search.
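The slide does not say which kernel is used or exactly how the k-nearest neighbor search enters the estimate. The sketch below assumes one common reading: at each query point only the k nearest locations contribute, and the bandwidth is the distance to the k-th nearest one. The Gaussian kernel, the planar treatment of coordinates, and the names `knn_kde`, `k`, and `min_h` are all assumptions made for illustration.

```python
import math


def knn_kde(points, weights, query, k=3, min_h=1e-9):
    """Weighted kernel density at `query` from its k nearest points.

    points  : list of (x, y) coordinates, treated as planar (an assumption)
    weights : mention counts, one per point
    The bandwidth h is the distance to the k-th nearest point.
    """
    # Distance from the query to every point, paired with that point's weight.
    dists = sorted(
        (math.hypot(px - query[0], py - query[1]), w)
        for (px, py), w in zip(points, weights)
    )
    nearest = dists[:k]
    # Guard against a zero bandwidth when all neighbors coincide with the query.
    h = max(nearest[-1][0], min_h)
    total = 0.0
    for d, w in nearest:
        u = d / h
        # 2-D Gaussian kernel, scaled by the mention count.
        total += w * math.exp(-0.5 * u * u) / (2.0 * math.pi)
    return total / (h * h)


# Made-up data: three heavily mentioned points close together, one far away.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
counts = [10, 5, 5, 1]
dense = knn_kde(pts, counts, (0.0, 0.0))
sparse = knn_kde(pts, counts, (5.0, 5.0))
```

Evaluating `knn_kde` over a regular grid of query points would give a density map; the cluster of frequently mentioned points produces a much higher value than the isolated one.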

  9. 1) Importance of Location

  10. 2) Prevalence of Topic. We use the term topic intensity for the prevalence of a topic relative to other topics, and estimate it with a method commonly used in epidemiological studies. Relative risk is the ratio of the KDE of disease case locations to the KDE of control locations.

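  11. Topic Intensity. $\text{Topic Intensity} = \dfrac{\mathrm{KDE}(\text{articles that mention the topic})}{\mathrm{KDE}(\text{articles that do not mention the topic})}$, by analogy with $\text{Relative Risk} = \dfrac{\mathrm{KDE}(\text{points with disease})}{\mathrm{KDE}(\text{points without disease})}$.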
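The ratio above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the authors' implementation: it uses a fixed-bandwidth Gaussian kernel for both numerator and denominator, takes the non-topic count as total articles minus topic articles, and adds a small `eps` to avoid division by zero. The names `topic_intensity`, `bandwidth`, and `eps` are hypothetical.

```python
import math


def topic_intensity(locations, query, bandwidth=1.0, eps=1e-12):
    """Ratio of topic to non-topic kernel-weighted article counts at `query`.

    locations : list of (x, y, n_articles, n_topic) tuples
    The kernel's normalizing constant cancels in the ratio because both
    estimates share the same bandwidth, so it is omitted.
    """
    qx, qy = query
    num = 0.0  # kernel-weighted count of articles that mention the topic
    den = 0.0  # kernel-weighted count of articles that do not
    for x, y, n_articles, n_topic in locations:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        k = math.exp(-0.5 * d2 / bandwidth ** 2)
        num += n_topic * k
        den += (n_articles - n_topic) * k
    return num / (den + eps)


# Made-up data: half of the articles at the first location mention the topic,
# one tenth of the articles at the second location do.
locs = [(0.0, 0.0, 100, 50), (10.0, 10.0, 100, 10)]
near_first = topic_intensity(locs, (0.0, 0.0))
near_second = topic_intensity(locs, (10.0, 10.0))
```

With the two locations far apart relative to the bandwidth, each query point is dominated by its own location, so the ratio approaches 50/50 at the first point and 10/90 at the second.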

  12. Topic Intensity

  13. 3) Emotion Toward a Topic. Challenging question: is the emotional measure, tone, discrete or continuous? Is tone "countable" like trees, or does it exist as a continuum like air temperature? Tone is a continuum: there cannot be a "number of tones".

  14. 3) Emotion Toward a Topic. A different method is used, because tone is continuous and not discrete. Inverse distance weighted (IDW) interpolation is used to estimate tone across space, creating a tone map. The tone map captures positive and negative tone toward a particular topic across space.
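A minimal sketch of inverse distance weighted interpolation follows, to show how tone at an unsampled point is estimated from surrounding locations. The slide does not give the distance exponent or the distance metric; the exponent of 2, the planar distance, and the name `idw_tone` are assumptions.

```python
import math


def idw_tone(samples, query, power=2.0):
    """Estimate tone at `query` as a distance-weighted mean of sample tones.

    samples : list of (x, y, tone) tuples
    power   : distance exponent (2.0 is a common default, assumed here)
    """
    num = 0.0
    den = 0.0
    for x, y, tone in samples:
        d = math.hypot(x - query[0], y - query[1])
        if d == 0.0:
            # The query coincides with a sample: return its tone directly.
            return tone
        w = 1.0 / d ** power
        num += w * tone
        den += w
    return num / den


# Made-up data: a negative-tone location and a positive-tone location.
tones = [(0.0, 0.0, -50.0), (2.0, 0.0, 50.0)]
midpoint = idw_tone(tones, (1.0, 0.0))       # equally far from both samples
closer_to_neg = idw_tone(tones, (0.5, 0.0))  # nearer the negative sample
```

Because the weights fall off with distance, the estimate at the midpoint averages the two tones, while a point nearer the negative sample is pulled toward it.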

  15. 3) Emotion Toward a Topic

  16. Overview: 3 layers. Article density (proxy: importance of location). Topic intensity (proxy: prevalence of topic relative to other topics). Tone (proxy: emotion toward a topic).

  17. Overview: 3 layers. Article density (proxy: importance of location; value range: 0 to 1). Topic intensity (proxy: prevalence of topic relative to other topics; value range: 0 to 100). Tone (proxy: emotion toward a topic; value range: -100 to 100). The first two layers represent scaling factors for tone.

  18. Emotional Heatmap. Article Density * Topic Intensity * Tone = Emotional Heatmap
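The product above is a cell-by-cell multiplication of three raster layers. The sketch below assumes the three layers have already been computed on the same grid and are stored as nested lists; the function name `emotional_heatmap` and the sample grids are hypothetical.

```python
def emotional_heatmap(density, intensity, tone):
    """Multiply three equally shaped grids cell by cell."""
    if not (len(density) == len(intensity) == len(tone)):
        raise ValueError("layers must have the same number of rows")
    result = []
    for d_row, i_row, t_row in zip(density, intensity, tone):
        if not (len(d_row) == len(i_row) == len(t_row)):
            raise ValueError("layers must have the same number of columns")
        # Density and intensity scale the signed tone value in each cell.
        result.append([d * i * t for d, i, t in zip(d_row, i_row, t_row)])
    return result


# Made-up 2 x 2 layers, for illustration only.
density = [[1.0, 0.5], [0.0, 0.2]]
intensity = [[100.0, 10.0], [50.0, 0.0]]
tone = [[-100.0, 20.0], [80.0, 60.0]]
heatmap = emotional_heatmap(density, intensity, tone)
```

A cell with zero article density or zero topic intensity yields zero regardless of tone, which is the sense in which the first two layers act as scaling factors.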

  19. Emotional Heatmap of Armed Conflict in 2003 (Wikipedia)

  20. Summary. First steps, but started the dialogue. Balance. Managing the complexity of cyberinfrastructure access. Simplifying the workflow of chaining spatial analytics. Making sense of what’s involved. Scientific rigor.

  21. Ongoing Work. Translate spatial knowledge to domain knowledge by answering a basic question: why is this here and not there? Tackle spatial aggregation issues. Represent locations as areas, not points. Areal interpolation.

  22. Acknowledgments • Guofeng Cao, Anand Padmanabhan • National Science Foundation • BCS-0846655 • OCI-1047916 • Open Science Grid • XSEDE SES070004N

  23. Thanks!
