

State Department Cables Information Retrieval System

Fall 2007 LBSC 796

Erica Cooper, Linda Melchor

Chris Reed, Jo-Han Rong

Dave Rouff, Jess Snyder

Overview
  • About the collection
  • Nature of the expected users
  • About the search tool
  • Batch Evaluation Results
  • User Study Results
  • Next Steps (Features to add, given time to do it)
About the Collection
  • The U.S. State Department
    • Branch of the Federal Government responsible for U.S. foreign relations, diplomatic policy, and protecting U.S. citizens abroad.
  • 1973 to 1975 diplomatic communications
    • Behind-the-scenes look at international relations.
    • World events of the time: end of the Vietnam War, Watergate, George H. W. Bush becomes chief of the U.S. Liaison Office in China, then DCI
Anticipated User Base
  • Given the nature of the collection, the IR system is expected to be used by researchers who want to see how U.S. opinion changes as events unfold over time.
  • Users will be looking for all messages on a topic, not just one.
  • Users may not know the telegram format for addresses and TAGS.
About the Search Tool
  • State Cables IR system was developed in Java using the following resources:
    • NetBeans IDE
    • Apache
      • Lucene toolkit: information retrieval tools
      • Digester: XML import
  • Two major components
    • XML importer, which reads XML-formatted messages and builds the index
    • User GUI and index searcher
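
The two components above can be pictured roughly as follows. This is only a minimal sketch and not the project's code: it uses the JDK's built-in DOM parser and a plain in-memory inverted index as stand-ins for Apache Digester and Lucene, and the `cable`, `id`, and `body` element names are hypothetical because the slides do not show the XML schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class CableIndexSketch {
    // term -> ids of the messages whose body contains that term
    private final Map<String, Set<String>> index = new HashMap<>();

    /** Component 1: import XML-formatted messages and build the index. */
    void importXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList cables = doc.getElementsByTagName("cable"); // hypothetical element name
            for (int i = 0; i < cables.getLength(); i++) {
                Element cable = (Element) cables.item(i);
                String id = cable.getElementsByTagName("id").item(0).getTextContent().trim();
                String body = cable.getElementsByTagName("body").item(0).getTextContent();
                // Crude tokenizer: lower-case, split on anything that is not a letter or digit.
                for (String term : body.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                    if (!term.isEmpty()) {
                        index.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not import messages", e);
        }
    }

    /** Component 2: the searcher a GUI would call; returns ids of messages matching one term. */
    Set<String> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}
```

Keeping the importer and the searcher behind separate methods mirrors the split described on the slide: the index is built once, then queried many times from the GUI.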
Benefits: Geographic and TAGS abstraction
  • In the early 1970s, telegram authors were encouraged to be brief, so they left out key terms assumed to be known to the recipient.
  • [Figure: abstraction examples: Japan and Tokyo; Africa and XZ; Iraq and IZ]
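
One way to picture the abstraction step is as query expansion: each search term is joined with its related geographic names and TAGS codes. The sketch below is an assumption about how that might look, not the project's implementation. The three mappings reuse the labels from this slide's figure (Japan and Tokyo, Iraq and IZ, Africa and XZ); the class name, the lookup tables, and the direction of the mapping are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class AbstractionSketch {
    // Illustrative entries only, taken from the labels on the slide's figure.
    static final Map<String, List<String>> GEOGRAPHIC = new HashMap<>();
    static final Map<String, List<String>> TAGS = new HashMap<>();
    static {
        GEOGRAPHIC.put("japan", Arrays.asList("tokyo"));
        TAGS.put("iraq", Arrays.asList("IZ"));
        TAGS.put("africa", Arrays.asList("XZ"));
    }

    /** Expands each query term with its abstraction terms and joins everything with OR. */
    static String expand(String query) {
        List<String> clauses = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            Set<String> alternatives = new LinkedHashSet<>();
            alternatives.add(term);
            String key = term.toLowerCase(Locale.ROOT);
            alternatives.addAll(GEOGRAPHIC.getOrDefault(key, Collections.emptyList()));
            alternatives.addAll(TAGS.getOrDefault(key, Collections.emptyList()));
            clauses.add(alternatives.size() == 1
                    ? term
                    : "(" + String.join(" OR ", alternatives) + ")");
        }
        return String.join(" OR ", clauses);
    }
}
```

With these tables, `AbstractionSketch.expand("Iraq oil")` returns `(Iraq OR IZ) OR oil`, so a telegram that only carries the TAGS code can still be found by a user who typed the country name.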
Batch Evaluation Results
  • Search terms and abstraction terms are implicitly combined with OR, which increases recall but lowers precision.
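
The trade-off can be made concrete with the usual definitions: precision is the share of retrieved messages that are relevant, and recall is the share of relevant messages that were retrieved. The helper below is a small sketch of those two measures; the class name is hypothetical and it is not the project's evaluation code.

```java
import java.util.HashSet;
import java.util.Set;

final class PrecisionRecallSketch {
    /** Fraction of retrieved messages that are relevant (0 if nothing was retrieved). */
    static double precision(Set<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(retrieved, relevant) / retrieved.size();
    }

    /** Fraction of relevant messages that were retrieved (0 if nothing is relevant). */
    static double recall(Set<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(retrieved, relevant) / relevant.size();
    }

    // Number of messages that appear in both sets.
    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }
}
```

As a made-up illustration (these are not the study's numbers): if messages m1 to m4 are relevant and a plain query retrieves {m1, m2}, precision is 2/2 and recall is 2/4. If the OR-expanded query retrieves {m1, m2, m3, m5, m6, m7}, recall rises to 3/4 while precision drops to 3/6.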
User Study Results
  • Do novice users find the system easy to learn?
    • All the volunteers considered the system easy enough for a novice to use. However, one volunteer stated, “A person used to Google would expect more.”
  • Can users easily learn to formulate effective queries using our system?
    • Respondents answered yes.
    • However, observations showed that initial result sets were very large because queries were vague, so we should emphasize that Boolean queries can be used.
User Study Results
  • Are there common mistakes or misunderstandings that can be addressed for a better design?
    • “AND” should be automatically capitalized so that it is understood as a Boolean operator, not a query keyword.
    • One volunteer said, “It would be nice to be able to see the whole Subject/Title,” which would make it easier to select which messages to read.
  • What are their expectations for the system?
    • Users found that the system met their expectations.
    • One volunteer stated that the system was more “user friendly compared to NARA's current system.”
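
The suggestion about capitalizing “AND” could be handled by normalizing the query before it reaches the parser, since Lucene's classic query syntax only treats upper-case AND, OR, and NOT as operators. The following is a minimal sketch under that assumption. The class and method names are hypothetical, and extending the rule from AND to OR and NOT goes beyond what the volunteers asked for.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class BooleanOperatorSketch {
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("and", "or", "not"));

    /** Upper-cases standalone and/or/not outside quoted phrases; other tokens are unchanged. */
    static String normalize(String query) {
        StringBuilder out = new StringBuilder();
        boolean inPhrase = false;
        for (String token : query.trim().split("\\s+")) {
            boolean operator = !inPhrase && OPERATORS.contains(token.toLowerCase(Locale.ROOT));
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(operator ? token.toUpperCase(Locale.ROOT) : token);
            // An odd number of double quotes in a token opens or closes a phrase.
            long quotes = token.chars().filter(c -> c == '"').count();
            if (quotes % 2 == 1) {
                inPhrase = !inPhrase;
            }
        }
        return out.toString();
    }
}
```

For example, `normalize("vietnam and ceasefire")` yields `vietnam AND ceasefire`, while the word "and" inside a quoted phrase is left alone. A real deployment would need to decide what to do when a user genuinely wants to search for the word "not".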
User Study Results
  • What would they like to see in the design of the system?
    • Being able to hit the “Enter” key instead of having to click on the “Perform Search” button
    • Limit the number of hits per search
    • Make the interface wider
    • Provide summaries of the articles in the results
    • Provide feedback that tells the user that their request is being processed
    • A layout that separates the results to make them easier to read
User Study Results
  • Were the added features, geographic abstraction and TAGS abstraction, used? If so, were they useful?
    • One volunteer used the added features, but could not tell if they worked.
  • Any suggestions or comments?
    • Highlight search terms
    • Jazz it up more visually
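
The “highlight search terms” suggestion could be approached by marking each occurrence of a query term in the displayed message text. The sketch below wraps matches in `**` markers purely for illustration; the class name and the marker are assumptions, and a GUI would more likely apply a text style than insert characters.

```java
import java.util.regex.Pattern;

final class HighlightSketch {
    /** Wraps each case-insensitive whole-word occurrence of term in ** markers. */
    static String highlight(String text, String term) {
        // Pattern.quote keeps characters in the term from being read as regex syntax.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("**$0**");
    }
}
```

Calling `highlight("Embassy Tokyo reports tokyo talks", "tokyo")` marks both spellings and keeps their original capitalization.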
Next Steps
  • Bugs to fix and features to add given additional time
    • option to sort results by date rather than score
      • The telegram DTG (date-time group) is currently interpreted as a string, resulting in string-based sorting.
    • Warning for large result set and option to cancel search before committing to wait
    • pull search function out of button click/hit list click so it is persistent past the click event
      • currently the system requires the search to be repeated when a message is selected from the hit list
    • option to export results to a file
    • web GUI
    • option to accept or reject proposed abstraction tools
    • ability to recognize multi-word search terms
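
The date-sorting item above comes down to parsing the DTG into a real date before comparing. The sketch below assumes the common `DDHHMMZ MON YY` layout (for example `151230Z JAN 74`) and assumes all years fall in the 1900s, which fits a 1973 to 1975 collection. The actual field format in the cables is not shown on the slides, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class DtgSortSketch {
    private static final List<String> MONTHS = Arrays.asList(
            "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
            "JUL", "AUG", "SEP", "OCT", "NOV", "DEC");

    /** Parses an assumed "DDHHMMZ MON YY" date-time group into a LocalDateTime. */
    static LocalDateTime parse(String dtg) {
        String[] parts = dtg.trim().split("\\s+");
        if (parts.length != 3 || parts[0].length() < 6) {
            throw new IllegalArgumentException("Unexpected DTG: " + dtg);
        }
        int day = Integer.parseInt(parts[0].substring(0, 2));
        int hour = Integer.parseInt(parts[0].substring(2, 4));
        int minute = Integer.parseInt(parts[0].substring(4, 6));
        int month = MONTHS.indexOf(parts[1].toUpperCase(Locale.ROOT)) + 1;
        if (month == 0) {
            throw new IllegalArgumentException("Unknown month in DTG: " + dtg);
        }
        int year = 1900 + Integer.parseInt(parts[2]); // assumption: two-digit years are 19xx
        return LocalDateTime.of(year, month, day, hour, minute);
    }

    /** Returns a copy of the DTG strings ordered chronologically rather than alphabetically. */
    static List<String> sortByDate(List<String> dtgs) {
        List<String> sorted = new ArrayList<>(dtgs);
        sorted.sort(Comparator.comparing(DtgSortSketch::parse));
        return sorted;
    }
}
```

Given `010900Z FEB 75`, `151230Z JAN 74`, and `021500Z DEC 73`, a plain string sort puts the February 1975 message first because it starts with `01`, whereas `sortByDate` returns them in the order December 1973, January 1974, February 1975.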