
JSTOR Advanced Technology Research Group

JSTOR Advanced Technology Research Group. Working in collaboration with other researchers to provide access to advanced technologies within the same workspace as the literature and primary source material.


Presentation Transcript


  1. JSTOR Advanced Technology Research Group • Working in collaboration with other researchers to provide access to advanced technologies within the same workspace as the literature and primary source material. This will enhance discovery and further encourage the use of the materials, enabling new scholarship and the creation of new knowledge.

  2. JSTOR • As a digital library we’ve done quite well.

  3. Research Support • JSTOR has always worked with researchers, providing datasets and supporting them in analyzing our data for their research. • We continue to do that and will rarely refuse a reasonable request (we handle 1-2 dozen requests a year for usage, citation and content data). • Recently we decided to become a more active participant in the use of JSTOR data for research.

  4. Evolving JSTOR’s Technology • JSTOR is, for many scholars, their digital ‘bookshelf’ (or part of it). • The ‘real’ work takes place at the workbench, not the bookshelf. • Workbenches are ‘trade’ specific (though tools need not be). • We need to recognize the diversity of practice; as yet, neither we nor the practitioners really understand what is needed for each digital practice. • JSTOR’s Showcase is where we bring technology and scholarship together and develop digital workbenches for our constituents. • It is an open, ongoing digital workshop with shared tools and materials, where we try to build workbenches with and for other scholars. • It will be extended to interwork with other facilities and provide APIs to our functionality and resources.

  5. The plan • We will host tools and technologies from the community (including JSTOR), quickly and openly, working on JSTOR content (and other content where available). • Showcase will be a step toward ‘real’ offerings, but our betas will be as useful and usable as we can manage. • We will actively solicit and respond to feedback, so that our workbenches will evolve. • We will provide a place where researchers can expose their work to users, to the benefit of both.

  6. Active projects • DfR – Simple text mining and corpus exploration • Visualizations of JSTOR usage, participants • Topic mapping (Blei / Princeton) • Document remastering from camera images, aka “Decapod” (Breuel / Kaiserslautern, Treviranus / Toronto) • Open Annotation Collaboration (Cole, Van de Sompel, Cohen, Sanderson et al.) • …

  7. Data for Research: Examples • The long ‘s’ • The British Empire • The golden age of social sciences

  8. Foresite • A collaborative program with University of Liverpool and HP Labs, Bristol • Build a relationship graph of the entire JSTOR corpus. • Explore using an ‘acetate overlay’ model.

  9. Foresite OAI-ORE Explorer. With U. Liverpool & HP Labs

  10. Decapod • A collaborative program with University of Kaiserslautern & University of Toronto. Funded by the Andrew W. Mellon Foundation. • Building a small, inexpensive, easy-to-operate, ‘1-click’ paper-to-document digitization rig. • Apply state-of-the-art document understanding and usability to allow small institutions to digitize their collections.

  11. Decapod • “1-Click”, paper to remastered document. • Open-source software. • State-of-the-art document understanding. • Mobile friendly (reflow). • Operator friendly. • Budget friendly. • Based on OCRopus & Fluid. • Partners: DFKI/Kaiserslautern, ATRC/Toronto, JSTOR

  12. Open Annotation Collaboration • A Mellon-funded project, starting in May 2009, with the overarching goals: • To facilitate the emergence of a Web- and Resource-centric interoperable annotation environment that allows leveraging annotations across the boundaries of annotation clients, annotation servers, and content collections. Interoperability specifications will be devised. • To demonstrate through implementations an interoperable annotation environment enabled by the interoperability specifications in settings characterized by a variety of annotation client/server environments, content collections, and scholarly use cases. • To seed widespread adoption by deploying robust, production-quality applications conformant with the interoperable annotation environment in ubiquitous and specialized services and tools used by scholars (e.g. JSTOR, Zotero, and MONK). • The partners in this project are: • University Library and Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign • Center for History and New Media, George Mason University • Maryland Institute for Technology in the Humanities, University of Maryland • eResearch Laboratory, School of Information Technology & Electrical Engineering, The University of Queensland • Research Library, Los Alamos National Laboratory • JSTOR • Contacts: • Tim Cole, UIUC • Clare Llewellyn, JSTOR

  13. What we are investigating • Corpus Analyst’s Workbench • Topic Mapping with LDA (Blei / Princeton) • Extract topic “signatures” and trace them through time. • Evolution of Ideas (Blei & Gerrish / Princeton) • Identify ‘ideas’. • Concept extraction using Association Rule Mining (Sanderson / Liverpool) • Citation strategies and the impact of those strategies (Adamic / UM) • Citation and Similarity structures in Corpora (Bergstrom & West / UW) • Eigenfactor and similar techniques.
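
The "Eigenfactor and similar techniques" item above refers to ranking journals by the structure of their citation graph. As a rough illustration only, the sketch below runs a PageRank-style power iteration, which belongs to the same family of eigenvector-based measures but is not the Eigenfactor algorithm itself. The function name `pagerank`, the damping value, and the toy citation data are all assumptions made for this example, not anything taken from the JSTOR projects.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """PageRank-style scores for a citation graph (illustrative sketch).

    `citations` maps each journal to the list of journals it cites.
    """
    nodes = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "teleport" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            cited = citations.get(node, [])
            if cited:
                # Split this node's weight evenly among the journals it cites.
                share = damping * rank[node] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A node that cites nothing spreads its weight over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: journal -> journals it cites.
toy = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy)
for journal in sorted(scores, key=scores.get, reverse=True):
    print(journal, round(scores[journal], 3))
```

In this toy graph the most-cited journal ("C") ends up with the highest score and the never-cited journal ("D") with the lowest, which is the basic behaviour such measures are meant to capture.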

  14. What we are investigating (cont.) • Linguist’s Workbench • Tools for Linguists using SEASR (Llora / UIUC, O’Donnell / U. Mich.) • Association Rule Mining – from the journal “Evolution” (Sanderson / Liverpool). • Oldest English Words analysis (Pagel / Reading)
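
Association rule mining, named in the slide above, looks for items that tend to occur together and scores each candidate rule by support (how often the items co-occur) and confidence (how often the consequent appears given the antecedent). The following is a minimal sketch restricted to pairs of terms. The function name `pair_rules`, the thresholds, and the toy term lists are hypothetical and do not reflect the actual analysis of the journal "Evolution".

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for term pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for terms in transactions:
        items = sorted(set(terms))  # ignore repeats within one document
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Hypothetical toy data: key terms per article.
docs = [
    ["selection", "fitness", "drift"],
    ["selection", "fitness"],
    ["selection", "drift"],
    ["fitness", "mutation"],
]
rules = pair_rules(docs)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Real corpora would call for a full frequent-itemset algorithm such as Apriori over larger item sets; the pair-only version is kept here for brevity.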

  15. What we are investigating • Computer-assisted analysis and mineable knowledge bases • Art Auction Catalogues • Using accredited “crowd-sourcing” to review and correct a machine-generated corpus of documents and ‘lot’ records. • Computer-Aided Transcription of manuscripts. • Using modern image-matching techniques to scattershot transcribe documents _and_ build a database of script signatures for mining. • Digital Staining of Plant Specimens. • Use feature extraction, analysis and false-coloring to highlight morphological structures and create a database of signatures for mining.

  16. Global Plants Initiative Base Image

  17. Analysis from previous illustration showing reticulated venation.

  18. What we are investigating • Community Specific / Miscellaneous • Tools for Secondary Schools (SEASR) (Llora / UIUC) • Reading level & synopsis generation. • NEH/NSF/SSHRC/JISC Digging into Data program (community support & participant in one proposal)
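
The slide above mentions reading-level estimation without saying which measure is used. As one possible illustration, the sketch below computes the widely used Flesch-Kincaid grade level, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The choice of this formula, the function names, and the crude vowel-group syllable counter are assumptions for the example, not a description of the SEASR tools.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Estimate the Flesch-Kincaid grade level of a passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical sample passages.
simple = "The cat sat on the mat. The dog ran."
dense = "Phylogenetic reconstruction necessitates comprehensive morphological characterization."
print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(dense), 2))
```

The syllable heuristic miscounts some words (silent "e", for instance), so scores are only approximate; the point is that short sentences of short words score lower than long sentences of long words.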

  19. Summary • Thank you • John Burns – john.burns@jstor.org • Main site: http://www.jstor.org • Showcase: http://showcase.jstor.org • Data for Research: http://DfR.jstor.org
