Yosi Mass, IBM Haifa Research Lab
5-minute self-introduction
XML Ranking Querying, Dagstuhl, 9-13 Mar, 2008
Rank XML Query – past & present
• XML Query
    • XML fragments (SIGIR’03, RIAO’04)
    • XML fragments for DB (RIAO’07)
    • XML fragments for UIMA (Unstructured Information Management Architecture)
    • XML fragments for MM (SAPIR, SIGIR’07 MIR workshop)
• Rank XML
    • Extended vector space model for XML ranking (SIGIR’03)
    • XML component ranking (INEX’02-INEX’06)
    • Relevance feedback for XML ranking (INEX’04-INEX’05)
• Applications
    • Desktop search using underlying UIMA annotations and XML fragments
Motivation (cont.)
• The world of search

                 IR             XML                  Database
    Data format  Unstructured   Semi-structured      Structured
    Query        Free text      XQuery/XPath -> ?    SQL
    Results      Ranked         Binary -> Ranked     Binary (Match/NoMatch)

• Goal: Apply IR methods to XML retrieval
    • IR-like (fuzzy) query format
    • IR-like ranked results
IR-like query format for XML
Compare apples with apples
• Full-text documents <-> Free-text query
    • Free text is of the same nature as the documents
• XML documents <-> ? = XML fragment query
    • XML fragments are of the same nature as XML documents
    • A good representation for IR ranking algorithms
Expressing information needs
• Find books with title containing "search"
    • In XQuery/XPath: //book/title[contains(., "search")]
    • As fragment: <book><title>search</title></book>
• Find books with title containing "search" or "algorithms"
    • In XQuery/XPath: //book/title[contains(., "search") or contains(., "algorithms")]
    • As fragment: <book><title>search algorithms</title></book>
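One way to see why a fragment query lends itself to ranked results: both the fragment and the documents can be broken into (element path, term) pairs, and a document can be scored by how many of the query's pairs it contains. The sketch below is illustrative only. The overlap score and the names `path_terms`, `score`, and `docs` are assumptions made for this example; it is not the extended vector space model from the SIGIR’03 work.

```python
import xml.etree.ElementTree as ET


def path_terms(xml_text):
    """Decompose XML into a set of (element path, lowercase term) pairs.

    Simplification for this sketch: only text directly inside an element
    is used; attributes and tail text are ignored.
    """
    root = ET.fromstring(xml_text)
    pairs = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        for word in (elem.text or "").lower().split():
            pairs.add((path, word))
        for child in elem:
            walk(child, path)

    walk(root, "")
    return pairs


def score(fragment, document):
    """Fraction of the fragment's (path, term) pairs found in the document.

    Hypothetical scoring rule, chosen only to show graded (non-binary) results.
    """
    query_pairs = path_terms(fragment)
    if not query_pairs:
        return 0.0
    return len(query_pairs & path_terms(document)) / len(query_pairs)


# Toy collection, invented for illustration.
docs = {
    "d1": "<book><title>Search algorithms</title></book>",
    "d2": "<book><title>Web search</title></book>",
    "d3": "<book><title>Cooking basics</title></book>",
}

query = "<book><title>search algorithms</title></book>"

# Rank documents by descending score instead of a match/no-match filter.
ranking = sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)

for name in ranking:
    print(name, score(query, docs[name]))
```

The XPath form with `or` would return both d1 and d2 as equal matches; under this toy score the document containing both terms is placed ahead of the one containing only one.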
Interest
• Top-k for large-scale MM and text (XML) in P2P (SAPIR EU project)
• XML search in P2P