Video from the Web and its Context: Defining Boundaries, Developing Technology
NDIIPP Partner Meeting, Arlington, VA, July 9, 2008
Gary Marchionini & Helen Tibbo, School of Information & Library Science, UNC-Chapel Hill
Context Emerges Through Use
• A primary concern is capturing dynamics associated with video content.
• Given a video, what associated context will help people in the future understand the video and its role in human history?
Usage Characteristics
Beyond the usual metadata (Title, Description, Username, Time when video added, Duration in seconds, Category, Keywords):
• Number of times viewed
• Number of times annotated
  • Text
  • Video
• Rank in Results List
• Number of times favorited
• Number of times linked to
  • From the web
  • From (specific) blogs
• Allusions and Mashups
• References, reviews in other venues (print, e-media)
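One way to picture the distinction between fixed metadata and usage context is as a record that carries a list of dated observations. The sketch below is illustrative only: the class and field names (`VideoRecord`, `UsageSnapshot`, `view_growth`) are assumptions made for this example and are not the project's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class UsageSnapshot:
    """Usage counts observed for one video at one harvest time (hypothetical layout)."""
    harvested_at: datetime
    views: int = 0
    text_annotations: int = 0
    video_annotations: int = 0
    rank_in_results: Optional[int] = None  # None when the video was not in the result list
    favorites: int = 0
    links_from_web: int = 0
    links_from_blogs: int = 0


@dataclass
class VideoRecord:
    """The usual metadata, plus a history of usage snapshots."""
    title: str
    description: str
    username: str
    added_at: datetime
    duration_seconds: int
    category: str
    keywords: List[str] = field(default_factory=list)
    snapshots: List[UsageSnapshot] = field(default_factory=list)

    def view_growth(self) -> int:
        """Views gained between the earliest and latest snapshot (0 with fewer than two)."""
        if len(self.snapshots) < 2:
            return 0
        ordered = sorted(self.snapshots, key=lambda s: s.harvested_at)
        return ordered[-1].views - ordered[0].views
```

Keeping the snapshots as a time-ordered history, rather than overwriting a single count, is what lets the dynamics of use be reconstructed later.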
What to Harvest? A Collection Development Issue
• Topic
  • US Presidential Election 2008
  • Epidemics and pandemics
  • Energy
  • Medical issues
  • Natural disasters
  • Truth commissions
• Source (where to harvest)
  • YouTube (Blinkx, thenewsroom, etc.)
  • Blogosphere
  • Specialized collections (e.g., NYT, CNN, CSPAN, specialized archives, open video, public.tv, etc.)
How to Harvest
• Crawl (follow links) vs. query (use API)
• Metadata and context plus video files
• Storage (DB, SRB)
• Parameters
  • How often? (daily)
  • How many results? (100 hits)
  • How many hops? (0)
• Use the YouTube API to execute 57 queries each day; store results in a MySQL database and Flash files in the SRB storehouse at the Odum Institute
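The query-based approach with these parameters can be sketched as a small daily job. This is a minimal sketch under stated assumptions, not the project's code: the `search` callable is a hypothetical stand-in for the real YouTube API call, SQLite (Python's built-in `sqlite3`) is swapped in for the MySQL database so the example is self-contained, and the table layout is invented for illustration. Video file storage in SRB is not shown.

```python
import sqlite3
from datetime import date
from typing import Callable, Dict, Iterable, List

# Hypothetical signature: search(query, max_results) -> list of hits,
# each a dict with at least a "video_id" key.
SearchFn = Callable[[str, int], List[Dict]]


def init_store(conn: sqlite3.Connection) -> None:
    """Create the results table if needed (illustrative schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               harvest_date TEXT NOT NULL,
               query        TEXT NOT NULL,
               result_rank  INTEGER NOT NULL,
               video_id     TEXT NOT NULL,
               title        TEXT,
               view_count   INTEGER,
               PRIMARY KEY (harvest_date, query, result_rank)
           )"""
    )


def harvest_once(conn: sqlite3.Connection, queries: Iterable[str],
                 search: SearchFn, day: date, max_results: int = 100) -> int:
    """Run every query once for the given day and store the ranked hits.

    Zero hops: only the hits themselves are recorded, no links are followed.
    Re-running for the same day overwrites that day's rows.
    """
    stored = 0
    for query in queries:
        hits = search(query, max_results)[:max_results]
        for result_rank, hit in enumerate(hits, start=1):
            conn.execute(
                "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?, ?, ?)",
                (day.isoformat(), query, result_rank, hit["video_id"],
                 hit.get("title"), hit.get("view_count")),
            )
            stored += 1
    conn.commit()
    return stored
```

Keying each row on date, query, and rank keeps one result list per query per day, so changes in rank and view count can be compared across days.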
Progress (adapted from JCDL 08 paper) ~21000 videos today