Developing Reliable Automatic Metadata Generation: Feedback from MatDL Pathway

Developing Reliable Automatic Metadata Generation: Feedback from MatDL Pathway. NSDL Annual Meeting, Washington, DC, November 6-8, 2007: Advancing NSDL Networks. Cathy S. Lowe, Laura M. Bartolo, Kent State University.


Presentation Transcript


  1. Developing Reliable Automatic Metadata Generation: Feedback from MatDL Pathway
     NSDL Annual Meeting, Washington, DC, November 6-8, 2007
     Advancing NSDL Networks
     Cathy S. Lowe, Laura M. Bartolo, Kent State University

  2. Outline
     • MatDL Pathway
     • iVia metadata generation for PDFs & test set
     • Evolution of result set
       • description
       • title
       • keywords
       • author
     • Next Steps

  3. NSDL Materials Digital Library Pathway
     NSF MS Initiatives (NIRTs, MRSECs, IMIs)
     • Soft Matter Wiki
     Virtual Labs
     • Intro to Solid State Chem
     • Intro to Bio Physics
     • Modern Chemistry
     Code Development
     • Matforge
     • NIST FiPy
     • CMU
     • DOE CMSN
     Teaching Resource Development
     • MS Teaching Archive
     Stewardship
     • MatDL Repository
     URLs shown on the slide: http://matdl.org/matdlwiki, http://matdl.org/virtuallabs, http://matdlforge.org, http://teaching.matdl.org, http://matdl.org

  4. iVia metadata generation & original test set
     • Worked with iVia metadata generation only
     • Test set
       • PDF format
       • 83 undergraduate research papers from the Cornell Center for Materials Research (CCMR) REU program

  7. Evolution of result set
     • Metadata generation for PDFs not available (2005)
     • Metadata generation for PDFs available (2006), improving over time
       • description
       • title
       • keyword
       • author (recently available)

  8. Description generation
     Good accuracy for explicit “Abstract”
     • Correct: ~38%
     • Partially correct: ~33%
     • Incorrect/not generated: ~29%

  9. Title generation
     Very good accuracy
     • Precision: 91.09%
     • Recall: 89.30%
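     The slide does not say how precision and recall were computed for titles. One plausible reading, offered here only as an assumption, is a token-level comparison between each generated title and a hand-assigned reference title. The sketch below illustrates that reading; the function names and the example titles are hypothetical and are not taken from the MatDL test set or from iVia.

```python
from collections import Counter

def tokenize(title):
    # Assumption: case-insensitive, whitespace-delimited tokens.
    return title.lower().split()

def token_precision_recall(generated, reference):
    """Token-level precision and recall of a generated title
    against a reference title (hypothetical scoring scheme)."""
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(reference))
    # Multiset intersection counts each shared token at most as
    # often as it appears in both titles.
    overlap = sum((gen & ref).values())
    precision = overlap / sum(gen.values()) if gen else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall

# Made-up titles, for illustration only.
p, r = token_precision_recall(
    "Growth of Thin Films",
    "Growth of Zinc Oxide Thin Films",
)
print(f"precision={p:.2%} recall={r:.2%}")
```

     Under this scheme a generated title that drops words from the reference keeps high precision but loses recall, while one that picks up stray text (for example an author line) loses precision.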

  10. Keyword generation
      Manually rated 5 keyphrases per document; good accuracy
      • Highly descriptive: 39%
      • Acceptable: 41%
      • Unacceptable: 20%
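      Percentages like those above come from tallying one manual rating per keyphrase. As a minimal sketch, assuming the ratings are recorded as plain labels, the tally could look like the following; the function name and the sample ratings are hypothetical and do not reproduce the MatDL data.

```python
from collections import Counter

def rating_shares(ratings):
    """Return the percentage share of each rating label."""
    counts = Counter(ratings)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Made-up ratings for the five keyphrases of a single document.
sample = [
    "highly descriptive",
    "acceptable",
    "acceptable",
    "unacceptable",
    "highly descriptive",
]
print(rating_shares(sample))
```

      Pooling the ratings for all documents into one list before calling the function gives collection-wide shares rather than per-document ones.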

  11. Author generation (new functionality)
      Applied to original sample:
      • Correct: 45%
      • Partially correct: 27%
      • Incorrect/not generated: 28%

  12. Next Steps
      • Collaboration mutually beneficial for tool developers & NSDL community-based repositories
      • Continue to work with tool as it improves
      • Continue/expand working with MRSECs REU resources

  13. Thank you & Questions?
      clowe@kent.edu
      The NSDL Materials Digital Library Pathway is supported by the National Science Foundation (DUE-0532831). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF.
