Automatic indexing and retrieval of crime-scene photographs


Presentation Transcript

  1. Automatic indexing and retrieval of crime-scene photographs: Scene of Crime Information System (SOCIS). Katerina Pastra, Horacio Saggion, Yorick Wilks. NLP group, University of Sheffield

  2. Outline • Application Scenario • Project Overview • SOCIS features • Text-based approaches • Using NLP: • The Indexing mechanism • The Retrieval mechanism • Preliminary system evaluation • Links Cambridge 2002

  3. Crime Scene Documentation: Current Practices • Scene of Crime Officers: • attend crime scene • photograph the scene • collect evidence (package and label items) • write reports and create indexed photo-album(s) • case-files piled in storage rooms

  4. Examples

  5. IT support for CSI • Crime Investigation requires: • Fast and accurate retrieval of case-related info (and therefore efficient classification of this info) • Identification of “patterns” among cases • IT support for Crime Investigation: • Governmental agencies’ systems (HOLMES) • Commercial systems (LOCARD, SOCRATES), i.e. Crime Management and Administration Systems • Needed: “Intelligent” support for Crime Investigation

  6. Project Overview (2000-2003) • Domain: Scene of Crime Investigation (SOC) • Scenario: Use of digital photography and speech to populate a central police database with case-related information • Objective: Creation of a prototype system that allows for intelligent indexing and retrieval of crime photographs

  7. SOCIS features • Access through the web (JSP application) • Storage of case documentation & meta-information in central database • Automatic indexing of photographs • Automatic retrieval of photographs • Automatic population of official forms

  11. “view of deceased with computer cable removed”

  12. Text-based image indexing & retrieval: approaches • Manual assignment of keywords • Automatic extraction of keywords (statistics +/ semantic expansion) [Smeaton’96, Sable’99, Rose’00] • Extraction of logical form representations (syntactic relations and concept classification) [Rowe’99] • Precision and recall increase as indexing terms go beyond keywords and capture relational information

  13. Text-based image indexing & retrieval: problems • keyword barrier • syntactic relations need to be complemented with semantic information • Consider: • “view to the loft” vs. “view into loft” • “position of baby with no bedding” • “position of baby with bedding removed”
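The keyword barrier on slide 13 can be illustrated with a minimal sketch. It assumes a simple bag-of-words indexer with a small hand-picked stop list; both `STOPWORDS` and `keywords` are hypothetical names used only for illustration and are not part of SOCIS. Under these assumptions, the two "loft" captions collapse to the same index terms even though the prepositions carry different spatial information.

```python
# Hypothetical stop list; a real keyword indexer would use a larger one.
STOPWORDS = {"to", "the", "into", "of", "with", "a", "an"}

def keywords(caption):
    """Return the set of content words left after stop-word removal."""
    return frozenset(w for w in caption.lower().split() if w not in STOPWORDS)

loft_a = keywords("view to the loft")
loft_b = keywords("view into loft")

# Both captions reduce to {"view", "loft"}: the relation expressed by
# "to" vs. "into" is lost once only keywords are indexed.
print(loft_a == loft_b)
```

A keyword-only index cannot tell these two photographs apart, which is the motivation for indexing relational information instead.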

  14. Indexing-Retrieval Mechanism • Pipeline of processing resources: tokeniser → sentence splitter → POS tagger → lemmatizer → NE recognizer → parser → discourse interpreter (+ triple extraction layer) • Captions → indexing terms (ARG1 REL ARG2) • Free text query → query triples (ARG1 REL ARG2) • Matching of query triples against indexing terms • OntoCrime + KB

  15. Corpus and Domain Model • 1200 captions from 350 different crime cases dealt with by South Yorkshire Police (text files) • 65 captions (transcribed speech experiment) • Different lengths but same characteristics: phrasal constructions, named entities, meta-info, “what” and “where” references • Domain model = OntoCrime and knowledge base • Role = selection restrictions for triples’ arguments and semantic expansion for retrieval

  16. Triple Extraction • 17 relations: AND, AROUND, MADE-OF, OF, ON, WITHOUT etc. • Form of triples: ARG1 REL ARG2 • Restrictions and filters for arguments • Rules for captions with multiple relations • Inferences restricted to certain cases

  17. Triple Extraction examples • “body on floor surrounded by blood”: body ON floor; blood AROUND floor; blood AROUND body • “shot of footprint on top of bar” • “photograph from behind bar of body on floor” • “bottle, gun and ashtray on table” • “footprint with zigzag and target on chair”
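The first example on slide 17 can be reproduced with a toy sketch. It swaps the parser and discourse interpreter used in SOCIS for a single regular expression, so it handles only captions of the shape "X on Y" optionally followed by "surrounded by Z". The function name `extract_triples` and the pattern are assumptions made for illustration, not the system's actual rules.

```python
import re

# Toy pattern: "<arg1> on <arg2>" with an optional "surrounded by <arg3>".
_PATTERN = re.compile(r"(\w+) on (\w+)(?: surrounded by (\w+))?")

def extract_triples(caption):
    """Return (ARG1, REL, ARG2) triples for captions matching the toy pattern."""
    match = _PATTERN.fullmatch(caption.strip().lower())
    if match is None:
        return []  # caption shape not covered by this sketch
    arg1, arg2, surrounding = match.groups()
    triples = [(arg1, "ON", arg2)]
    if surrounding:
        # Explicit relation, plus the inferred one listed on the slide.
        triples.append((surrounding, "AROUND", arg2))
        triples.append((surrounding, "AROUND", arg1))
    return triples

print(extract_triples("body on floor surrounded by blood"))
```

The other captions on the slide (coordination, "on top of", "from behind") are not matched by this pattern and return an empty list; they are the cases for which the slides mention dedicated rules for multiple relations.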

  18. Retrieval Mechanism • Allow for free text query • Extract relational facts from the query • Match the query triples with the indexing triples of each captioned photograph • Allow for exact match of arguments or class info • If no triples can be extracted, keyword matching takes place with semantic expansion if needed [Diagram labels: ARG1, RELATION, ARG2; Class, Class]
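The matching steps on slide 18 can be sketched as follows. The class table `CLASS_OF` is a hypothetical stand-in for OntoCrime, and the matching policy (an argument matches if it is identical or if the query names the class of the indexed term) is one possible reading of "exact match of arguments or class info", not the documented SOCIS behaviour.

```python
# Hypothetical mini-ontology: term -> class (stand-in for OntoCrime).
CLASS_OF = {"gun": "weapon", "knife": "weapon", "table": "furniture", "chair": "furniture"}

def arg_matches(query_arg, index_arg):
    """Exact match, or the query argument names the indexed term's class."""
    return query_arg == index_arg or CLASS_OF.get(index_arg) == query_arg

def triple_matches(query, indexed):
    """Relations must be equal; both arguments must match exactly or by class."""
    return (query[1] == indexed[1]
            and arg_matches(query[0], indexed[0])
            and arg_matches(query[2], indexed[2]))

def expand(keyword):
    """Semantic expansion: a class keyword also covers its member terms."""
    return {keyword} | {term for term, cls in CLASS_OF.items() if cls == keyword}

def retrieve(query_triples, query_keywords, index):
    """index maps photo id -> (indexing triples, caption text)."""
    hits = []
    for photo_id, (triples, caption) in index.items():
        if query_triples:
            if any(triple_matches(q, t) for q in query_triples for t in triples):
                hits.append(photo_id)
        else:
            # Fallback when no triples could be extracted from the query.
            expanded = set().union(*(expand(k) for k in query_keywords))
            if expanded & set(caption.lower().split()):
                hits.append(photo_id)
    return hits

index = {
    "photo1": ([("gun", "ON", "table")], "gun on table"),
    "photo2": ([("body", "ON", "floor")], "body on floor"),
}
print(retrieve([("weapon", "ON", "furniture")], set(), index))
print(retrieve([], {"weapon"}, index))
```

In this sketch a class-level query such as "weapon ON furniture" retrieves the photograph indexed with "gun ON table", and the keyword fallback reaches the same photograph through expansion of "weapon".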

  19. Preliminary Evaluation • Indexing mechanism evaluation run on 600 captions indicated that the rules need refinement (80% accuracy in extracting and inferring triples) • Preliminary usability evaluation with real users: relational information considered to be an intuitive way of forming queries for image retrieval • Future work: overall evaluation of free text query for image retrieval

  20. Conclusions • Could the SOCIS approach be ported to other domains? • Thorough testing and experimentation needed • However, it is a corpus-driven approach: not just an alternative image indexing/retrieval approach, but the one dictated by a real application • For more information on SOCIS: http://www.dcs.shef.ac.uk/nlp/socis
