
Edward A. Fox (with Hussein Suleman, Ming Luo) fox@vt fox.cs.vt

Building Digital Libraries Made Easy: Toward Open Digital Libraries ICADL 2002 – Singapore – Dec. 2002. Edward A. Fox (with Hussein Suleman, Ming Luo) fox@vt.edu http://fox.cs.vt.edu CS DLRL Internet TIC NDLTD CITIDEL NSDL …


Presentation Transcript


  1. Building Digital Libraries Made Easy: Toward Open Digital Libraries ICADL 2002 – Singapore – Dec. 2002 Edward A. Fox (with Hussein Suleman, Ming Luo) fox@vt.edu http://fox.cs.vt.edu CS DLRL Internet TIC NDLTD CITIDEL NSDL … Virginia Tech, Blacksburg, VA, USA

  2. Acknowledgements (Selected) • Sponsors: ACM, Adobe, DLF, IBM, Mellon Foundation, Microsoft, NSF (Grants CDA-9312611; DUE-0121741, 0136690, 0121679; IIS-0080748, 0086227, 0002935, and 9986089), OCLC, SOLINET, UNESCO, US Dept. Ed. (FIPSE), VTLS, … • Faculty/Staff (now): Boots Cassel, Su-Shing Chen, Debra Dudley, Jeremy Frumkin, Joe Futrelle, Lee Giles, Martin Halbert, Rex Hartson, John Impagliazzo, Deborah Knox, JAN Lee, Kurt Maly, Gail McMillan, Eric Morgan, Manuel Perez, Muhammad Zubair, … • Students: Fernando Das Neves, Marcos Goncalves, Rohit Kelapure, Aaron Krowne, Paul Mather, Ryan Richardson, Priya Shivakumar, Wensi Xi, Liang Xu, Baoping Zhang, …

  3. Outline • Overview, Problem • Experience: Case Study Projects • Open Archives Initiative • Hussein Suleman Dissertation • DL in a Box, OCKHAM • Summary and Conclusion

  4. Overview We • address the problem of how to develop DLs; • build on experience in building many DLs; • strive for simplicity as per OCKHAM initiative; • build upon the Open Archives Initiative; • demonstrate our approach in diverse situations; • and invite all to • use DL-in-a-box and • help build Open Digital Libraries.

  5. Problem Why do DL developers continue to “reinvent the wheel”? The top 10 reasons are: • The library budget won’t allow purchase of a commercial DL system. • Unless the development effort is local, there won’t be any control. • DLs are extensions of DBMSs, so they are simple applications to develop. • Since DLs operate on the Web, one must adopt the newest W3C proposal.

  6. Problem – cont’d • Since technology moves so quickly, it is essential to follow the latest fad. • CS students always develop from scratch. • This team knows it can do it better. • This system must have more capabilities than any other system. • This DL has to be more flexible and extensible. • This is the right system architecture – at last!

  7. Outline • Overview, Problem • Experience: Case Study Projects • Open Archives Initiative • Hussein Suleman Dissertation • DL in a Box, OCKHAM • Summary and Conclusion

  8. Experience: Case Study Projects • AmericanSouth.org • NDLTD • CSTC • JERIC • CITIDEL • NSDL • Digital Library in a Box

  9. AmericanSouth.org • Domain: culture and history of the southern region of America (USA) • Genre: diverse distributed collections at a dozen universities • Submission & Collection: local sites → Emory University (for SOLINET)

  10. Networked Digital Library of Theses and Dissertations (NDLTD) • Domain: graduate education and research • Genre: electronic theses and dissertations (ETDs) • Submission & Collection: local sites → www.ndltd.org, www.theses.org

  11. Computer Science Teaching Center (CSTC) • Domain: teaching computer science • Genre: courseware • Submission & Collection: www.cstc.org

  12. CS Teaching Center (CSTC): Lessons Learned • Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units. • Learners benefit from having well-crafted modules that have been reviewed and tested. • Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.

  13. Browsing (2)

  14. ACM Journal of Educational Resources in Computing (JERIC) • Domain: teaching computer science • Genre: courseware, scholarly articles • Submission & Collection: CSTC, ACM Digital Library

  15. JERIC • Journal of Educational Resources in Computing • Accessible from www.cstc.org and www.acm.org and www.citidel.org • ACM and SIGCSE support • Refereed and interactive • Part of ACM Digital Library

  16. Computing and Information Technology Interactive Digital Educational Library (CITIDEL) • Domain: computing / information technology • Genre: one-stop-shopping for teachers & learners: courseware (CSTC, JERIC), leading DLs (ACM, IEEE-CS, DB&LP, CiteSeer), PlanetMath.org, technical reports, … • Submission & Collection: sub/partner collections → www.citidel.org

  17. CITIDEL Team • An NSDL Collection Track project • Led by Virginia Tech, with co-PIs: • Fox (director, DL systems) • Lee (history) • Perez (user interface, Spanish support) • Partners • College of New Jersey (Knox) • Hofstra (Impagliazzo) • Villanova (Cassel) • Penn State (Giles)

  18. Multi-dimensional Categorization

  19. CITIDEL Collection [Diagram. Sources include: ACM, CSTC, Research Index, IEEE-CS, …, experts’ finding aids, NCSTRL. Other labels: metadata, fulltext, NEC’s data, data processed w. R.I., Borner’s info viz software, repository, ACM DL, SIGCSE proceedings, JERIC.]

  20. CITIDEL Collection Building [Diagram. Activities: Nominating, Submitting, Creating, Searching, Browsing, Crawling, Composing, Classifying. Tools: GetSmart, Crawlifier, VIADUCT. Edge labels: thru, include, after, or, aided by, using.]

  21. Overview of CITIDEL architecture

  22. Distributed repository structure

  23. Digital library architecture for local and interoperable CITIDEL services

  24. National Science Digital Library (NSDL) • Domain: undergraduate and K-12 education, etc. • Genre: educational resources • Submission & Collection: sites of 90 projects → www.nsdl.org

  25. NSDL Information Architecture, developed by the Technical Infrastructure Workgroup [Diagram. Layers: User Interfaces (Portals & Clients); Usage Enhancement (NSDL Services, Other NSDL Services, Special Databases); Core NSDL “Bus”; Collection Building (NSDL Collections, referenced items & collections). Core Services: information retrieval, metadata gathering. Core Collection-Building Services: protocols, harvesting. CI Services: browsing, authentication, personalization, discussion, annotation.]

  26. Digital Library in a Box • Domain: helping DL projects • Genre: any domain, but especially those involved in NSDL (since funding is in part through NSDL, with U. FL, NCSA) • Software and Documentation: http://dlbox.nudl.org

  27. Outline • Overview, Problem • Experience: Case Study Projects • Open Archives Initiative • Hussein Suleman Dissertation • DL in a Box, OCKHAM • Summary and Conclusion

  28. Open Archives Initiative OAI www.openarchives.org openarchives@openarchives.org

  29. The World According to OAI [Diagram labels: Service Providers (Discovery, Current Awareness, Preservation); Metadata harvesting; Data Providers]

  30. Technical Umbrella for Practical Interoperability… Metadata Harvesting Reference Libraries Museums Publishers E-Print Archives …that can be exploited by different communities

  31. Tiered Model of Interoperability • Mediator services • Metadata harvesting • Document models

  32. OAI – Black Box Perspective [Diagram labels: open archives OA 1 through OA 7; Docs: digital objects (DO); Metadata; Services: Search, Browse, Summarize, Visualize]

  33. Archive Lite Sites NCSTRL Eprints Own: History, ResearchIndex, CSTC, … CITIDEL Active Aggregation through OAI Harvesting IEEE-CS, ACM, …

  34. Protocol for Metadata Harvesting • Service Requests • Identify • ListMetadataFormats • ListSets • GetRecord • ListIdentifiers • ListRecords • Metadata Multiplicity • Date/Time Ranges • Sets (with semantics depending on local data providers) • Resumption Tokens
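The service requests and resumption tokens listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the presentation: it assumes the OAI-PMH 2.0 XML namespace, the base URL `http://example.org/oai` is hypothetical, and `fake_fetch` with its two canned pages stands in for a real HTTP call so the example runs offline.

```python
from urllib.parse import urlencode, urlparse, parse_qs
import xml.etree.ElementTree as ET

# Assumption: responses use the OAI-PMH 2.0 namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_request(base_url, verb, **args):
    """Return a GET URL for one OAI-PMH service request."""
    params = {"verb": verb}
    for key, value in args.items():
        if value is not None:
            # 'from' is a Python keyword, so callers pass from_ instead.
            params[key.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


def parse_list_identifiers(xml_text):
    """Return (identifiers, resumption_token) from a ListIdentifiers response."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    # An absent or empty token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


def harvest(base_url, fetch, **args):
    """Yield identifiers, following resumption tokens until the list ends."""
    url = build_request(base_url, "ListIdentifiers", **args)
    while True:
        ids, token = parse_list_identifiers(fetch(url))
        yield from ids
        if token is None:
            break
        # A resumed request carries only the verb and the token.
        url = build_request(base_url, "ListIdentifiers", resumptionToken=token)


# Hypothetical canned responses standing in for a real data provider.
PAGE1 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example:1</identifier></header>
    <header><identifier>oai:example:2</identifier></header>
    <resumptionToken>page2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

PAGE2 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example:3</identifier></header>
    <resumptionToken/>
  </ListIdentifiers>
</OAI-PMH>"""


def fake_fetch(url):
    """Stand-in for an HTTP GET: serve page 2 only when the token asks for it."""
    query = parse_qs(urlparse(url).query)
    return PAGE2 if query.get("resumptionToken") == ["page2"] else PAGE1
```

Calling `list(harvest("http://example.org/oai", fake_fetch, metadataPrefix="oai_dc"))` walks both pages. Date ranges and sets are passed the same way, for example `build_request(base, "ListRecords", metadataPrefix="oai_dc", from_="2002-01-01", set="theses")`; what a set means depends on the local data provider.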

  35. NDLTD OAI Example

  36. Outline • Overview, Problem • Experience: Case Study Projects • Open Archives Initiative • Hussein Suleman Dissertation • DL in a Box, OCKHAM • Summary and Conclusion

  37. Open Digital Library (ODL) Hypothesis (Hussein Suleman) • Can we leverage the successful model of the OAI Protocol for Metadata Harvesting to alleviate our architectural problems? Maybe … if Digital Libraries can be modeled as • networks of extended Open Archives, where • each extended Open Archive is a • source of data and/or a provider of services.

  38. Example Architecture (NDLTD) [Diagram. Archives: Virginia Tech, PhysNet, Humboldt, Duisburg, CalTech, Dresden, MIT. Components: Union Catalog, Filter, Search, Browse, Recent, User Interface. Legend: OAI/ODL archive, OAI/ODL protocol.]
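The idea of a network in which every node is a source of data and/or a provider of services can be sketched with a toy in-memory model. Everything here is hypothetical and for illustration only: the class names, the `list_records` method, and the sample records are assumptions rather than the ODL component interfaces, and the wiring (archives into a union catalog, then a filter, then search) is one plausible reading of the slide, not a confirmed topology.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Record:
    identifier: str
    datestamp: str  # ISO date, so string comparison orders correctly
    title: str


@dataclass
class Archive:
    """Leaf data provider, standing in for one site's archive."""
    records: List[Record]

    def list_records(self, from_=""):
        return [r for r in self.records if r.datestamp >= from_]


@dataclass
class UnionCatalog:
    """Harvests several sources and is itself harvestable."""
    sources: list

    def list_records(self, from_=""):
        merged = {}
        for source in self.sources:
            for record in source.list_records(from_):
                # Keep the first copy of each identifier.
                merged.setdefault(record.identifier, record)
        return list(merged.values())


@dataclass
class Filter:
    """Passes through only the records that satisfy a predicate."""
    source: object
    keep: Callable[[Record], bool]

    def list_records(self, from_=""):
        return [r for r in self.source.list_records(from_) if self.keep(r)]


@dataclass
class Search:
    """A service provider layered on any harvestable source."""
    source: object

    def query(self, term):
        term = term.lower()
        return [r for r in self.source.list_records() if term in r.title.lower()]
```

Because every component exposes the same `list_records` call, components can be chained in any order: `Search(Filter(UnionCatalog([a, b]), keep))` treats the filtered union exactly as it would treat a single archive.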

  39. ODL Demonstration - FrontPage

  40. ODL Demonstration - Search

  41. ODL Demonstration - Browse

  42. Hussein Suleman’s Thesis Summary • Open Digital Libraries (DLs) • Open Archives Initiative (OAI) • Protocol for Metadata Harvesting (PMH) • Extending OAI-PMH provides the glue for building componentized DLs. • Lightweight protocols connect the components to support modular systems with good efficiency.

  43. Research in a Nutshell • We build extensible modular systems with customizable services. • This supports interoperability and allows distributed development. • This is in use in www.cstc.org, AmericanSouth.org, www.citidel.org, … • Components include search, browse, annotate, editorial support, union, filter, whats-new, submit, rate, recommend, …

  44. [Diagram: users facing a mass of undifferentiated digital objects (images, videos, programs, documents), marked with a “?”]

  45. [Diagram: componentized digital library, with “?” components placed between the users and the digital objects (images, videos, programs, documents)]
