
Digital Libraries

Nick Narcise, April 4th 2006



Presentation Transcript

  1. Digital Libraries Nick Narcise April 4th 2006

  2. What is a Digital Library?

  3. What is a Digital Library? Definition from Wikipedia: A digital library is a library in which a significant proportion of the resources are available in machine-readable format (as opposed to print or microform), accessible by means of computers. The digital content may be locally held or accessed remotely via computer networks.

  4. D-Lib Magazine

  5. What Do You Do with a Million Books? • Gregory Crane, Tufts University • D-Lib Magazine, March 2006, Volume 12, Number 3, ISSN 1082-9873 • http://www.dlib.org/dlib/march06/crane/03crane.html

  6. Main Focus: The ability to extract from the stored record of humanity useful information in an actionable format for any given human being of any culture at any time and in any place.

  7. Reduce the tangle of text mining, analysis, and searching technologies • Converting analog sources to text • Translating one language to another • Transforming raw text into data

  8. How is a Library digitized? The process of digitizing a library began with the catalog, moved to periodical indexes and abstracting services, next to periodicals and large reference works and finally book publishing. Some of the largest and most successful digital libraries are Project Gutenberg, ibiblio and the Internet Archive.

  9. Optical Character Recognition (from Wikipedia, the free encyclopedia): Optical character recognition, usually abbreviated to OCR, involves computer software designed to translate images of typewritten text (usually captured by a scanner) into machine-editable text, or to translate pictures of characters into a standard encoding scheme that represents them (such as ASCII or Unicode).

  10. Problems with OCR • May have errors • Useless as a knowledge base • Human beings are still much better at reading and interpreting the contents of page images than machines.
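The "may have errors" point can be made measurable. The following is a minimal sketch, assuming a hand-corrected reference transcription is available to compare against; the character error rate metric and the function names are illustrative assumptions, not something the slides or the cited articles prescribe.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, ocr_output):
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)
```

For example, a hypothetical misreading of "tumulus" as "turnulus" (an "m" read as "rn") is two edits away from the reference, so `edit_distance("tumulus", "turnulus")` returns 2.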

  11. Text, Information, Knowledge and the Evolving Record of Humanity • Gregory Crane and Alison Jones, Tufts University • D-Lib Magazine, March 2006, Volume 12, Number 3, ISSN 1082-9873 • http://www.dlib.org/dlib/march06/jones/03jones.html

  12. C. Montgomery Burns: "I'd like to send this letter to the Prussian consulate in Siam by aeromail. Am I too late for the 4:30 autogyro?" Clerk: "Uhhh, I better look in the manual ..." Burns: "The ignorance! ..." Clerk: "This book must be out of date – I don't see 'Prussia,' 'Siam' or 'autogyro.'" From "Mother Simpson," The Simpsons Television Show, Episode 3F06

  13. Digital Reference Materials • Thesaurus of Geographic Names (TGN): includes names and other information about places such as cities, counties, and nations, and their associated physical features such as mountains, coasts, and rivers. Other information related to history, population, culture, art, and architecture is included. TGN can associate the obsolete name Siam with the nation of Thailand (tgn,1000142), but also with towns named Siam in Iowa (tgn,2035651), Tennessee (tgn,2101519), and Ohio (tgn,2662003). Prussia appears, but only as a general region (tgn,7016786), with no indication of when or if it was a sovereign nation. • Alexandria Digital Library (ADL): a sophisticated framework with which to create such resources; places can be associated with temporal information about their foundation (e.g., Washington, DC, founded on 16 July 1790),
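The ambiguity of "Siam" can be sketched as a lookup that returns several candidates. This is a toy illustration only: the in-memory dictionary, the field names, and both functions are hypothetical and are not the TGN's actual data model or interface. Only the identifiers and place descriptions come from the slide.

```python
# Hypothetical in-memory gazetteer. The tgn_id values are the identifiers
# quoted on the slide; the structure and field names are assumptions.
GAZETTEER = {
    "Siam": [
        {"tgn_id": "1000142", "kind": "nation", "label": "Thailand"},
        {"tgn_id": "2035651", "kind": "town", "label": "Siam, Iowa"},
        {"tgn_id": "2101519", "kind": "town", "label": "Siam, Tennessee"},
        {"tgn_id": "2662003", "kind": "town", "label": "Siam, Ohio"},
    ],
    "Prussia": [
        {"tgn_id": "7016786", "kind": "general region", "label": "Prussia"},
    ],
}


def lookup(name):
    """Return every candidate place for a name (empty list if unknown)."""
    return GAZETTEER.get(name, [])


def prefer(candidates, kind):
    """Keep only candidates of the requested kind (a deliberately naive heuristic)."""
    return [c for c in candidates if c["kind"] == kind]
```

Here `lookup("Siam")` yields four candidates, and `prefer(lookup("Siam"), "nation")` narrows them to Thailand. Nothing in this structure records when a name was in use, which is the gap the slide notes for Prussia.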

  14. Consider the sentence “The current price of tea in China is 35 cents per pound."
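As a rough illustration of turning such a sentence into data, here is a minimal sketch that uses a hand-written regular expression. The pattern, the field names, and the function are assumptions made for this example; the approach works only for sentences of exactly this shape and is not a general text-mining method.

```python
import re

# Hypothetical pattern for sentences shaped like
# "... price of <commodity> in <Place> is <amount> <currency> per <unit>".
PRICE_PATTERN = re.compile(
    r"price of (?P<commodity>\w+) in (?P<place>[A-Z]\w+) is "
    r"(?P<amount>\d+(?:\.\d+)?) (?P<currency>\w+) per (?P<unit>\w+)"
)


def extract_price(sentence):
    """Return a dict of the matched fields, or None if the sentence does not fit."""
    match = PRICE_PATTERN.search(sentence)
    if match is None:
        return None
    record = match.groupdict()
    record["amount"] = float(record["amount"])  # numeric value for later plotting
    return record
```

Applied to the slide's sentence, `extract_price` returns the commodity `tea`, the place `China`, the amount `35.0`, the currency `cents`, and the unit `pound`. The word "current" is not resolved to a date, so the record still lacks the temporal information a price series would need.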

  15. A digital library would be very useful if it could • plot the prices of various commodities in different markets over time, • plot the lifetimes of individuals, or • extract and classify many events

  16. Digital Reference Materials • Carefully transcribed primary sources <l n="22">Forte fuit iuxta tumulus, quo cornea summo</l> • Gazetteers and semi-structured text sources <div2 type=entry><head>AARONSBURG</head><p>P v., Hains t., Centre co., Pa. It is at the eastern extremity of Penn's valley, near Penn's creek, 32 m. Bellefonte, 89 N.W. Harrisburg. 181 W. It contains a lutheran church, two stores, and 450 inhab • Citation-based authority lists <div1 type="entry" id="abdera"><head>Abdera</head><div2 type="subentry" id="abdera-1"><head>Abdera, city of Thrace</head><div3 type="index"><list type="index"><item><bibl n="Paus. 6.5.4">Paus. 6.5.4</bibl>, <bibl n="Paus. 6.14.12">Paus. 6.14.12</bibl></item><item>a town of Thrace on the Nestus: <bibl n="Hdt. 1.168">Hdt. 1.168</bibl>, <bibl n="Hdt. 6.46">Hdt. 6.46</bibl>, <bibl n="Hdt. 7.109">Hdt. 7.109</bibl>, <bibl n="Hdt. 7.120">Hdt. 7.120</bibl>, <bibl n="Hdt. 7.126">Hdt. 7.126</bibl></item><item>founded at grave of Abderus: <bibl n="Apollod. 2.5.7">Apollod. 2.5.7</bibl></item><item>Xerxes' first halt in his flight: <bibl n="Hdt. 8.120">Hdt. 8.120</bibl></item></list></div3></div2></div1>
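Because the citation-based authority list is well-formed XML, its citations can be pulled out mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened excerpt of the Abdera entry shown on the slide; the function name and the choice to key citations by each item's leading text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the Abdera entry from the slide (two of its four items).
ENTRY = (
    '<div1 type="entry" id="abdera"><head>Abdera</head>'
    '<div2 type="subentry" id="abdera-1"><head>Abdera, city of Thrace</head>'
    '<div3 type="index"><list type="index">'
    '<item>founded at grave of Abderus: '
    '<bibl n="Apollod. 2.5.7">Apollod. 2.5.7</bibl></item>'
    "<item>Xerxes' first halt in his flight: "
    '<bibl n="Hdt. 8.120">Hdt. 8.120</bibl></item>'
    '</list></div3></div2></div1>'
)


def citations_by_topic(xml_text):
    """Map each item's leading description to the citations listed under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for item in root.iter("item"):
        # item.text is the text before the first child element, if any.
        topic = (item.text or "").strip().rstrip(":")
        result[topic] = [bibl.get("n") for bibl in item.findall("bibl")]
    return result
```

Calling `citations_by_topic(ENTRY)` links "founded at grave of Abderus" to `Apollod. 2.5.7` and "Xerxes' first halt in his flight" to `Hdt. 8.120`. An item with no leading description, such as the first item in the full entry, would be stored under an empty key in this sketch.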

  17. Digital Reference Materials • Machine-readable dictionaries • <entryFree id="n3709" key="a)krwth/rion" type="main"><orth extent="full" lang="greek">a)krwth/rion</orth>, <gen lang="greek">to/</gen>, (<etym lang="greek">a)/kros</etym>)<sense id="n3709.0" n="A" level="1"><tr>topmost</tr> or <tr>prominent part</tr>, <foreign lang="greek">a). tou= ou)/reos</foreign> mountain <tr>peak</tr>, <bibl n="Perseus:abo:tlg,0016,001:7:217"><author>Hdt.</author><biblScope>7.217</biblScope></bibl> • General Encyclopedias

  18. A Research Library Based on the Historical Collections of the Internet Archive • William Y. Arms, Selcuk Aya, Pavel Dmitriev, Computer Science Department, Cornell University • Blazej Kot, Information Science, Cornell University • Ruth Mitchell, Lucia Walle, Cornell Theory Center, Cornell University • D-Lib Magazine, February 2006, Volume 12, Number 2, ISSN 1082-9873 • http://www.dlib.org/dlib/february06/arms/02arms.html

  19. Main Idea of Article: Academic researchers have to comb through collections of libraries, museums, and archives to analyze and synthesize the information buried within them.

  20. A Web Library for Social Science Research: The idea is to replace much of the tedious manual effort with computer programs that act as the researchers' agents. The challenge was to organize the materials and provide powerful, intuitive tools that make a huge collection of semi-structured data accessible to researchers without demanding high levels of computing expertise.

  21. Questions?

  22. Thank You
