
Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Query Representation and Understanding Workshop 2011 (QRU '11), ACM SIGIR 2011, Beijing, China.


Presentation Transcript


  1. Query Representation and Understanding Workshop 2011 (QRU '11), ACM SIGIR 2011, Beijing, China. Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

  2. Language of Queries
     • Interaction between users and search engines over the years has resulted in the evolution of a distinct language for Web search queries
     • Example queries: "gprs config samsung focus at&t", "samsung focus at&t gprs config", "focus config at&t gprs samsung"
     Query Representation and Understanding 2011 (QRU '11)

  3. Language of Queries
     How can we begin to analyze this new language?

  4. Complex Networks
     • Real-life networks are not easily explained by standard topologies
     • Applications to linguistics: word co-occurrences, consonant inventories, syntactic and semantic features, language dynamics

  5. Complex Networks
     Word co-occurrence networks: an interesting tool to discover fundamental properties of a language

  6. Data
     16.7 million entries sampled from Bing query logs from Australia (February to May 2009)
     Courtesy: Microsoft India Development Center

  7. Network Models for Queries
     • Example query as units: “gprs” “config” “samsung focus” “at&t”
     • Example query as units: “dell laptop” “extreme” “gaming” “config”
     • Model variants named on the slide: global co-occurrence, local co-occurrence, edge restriction
     [Figure: example network with nodes config, extreme, samsung focus, gprs, dell laptop, gaming, at&t]
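The two example queries on this slide can be turned into a small co-occurrence network. The sketch below is only illustrative: the slides do not define the model variants, so it assumes that "global" co-occurrence links every pair of units in the same query and "local" co-occurrence links only adjacent units. The function names `cooccurrence_edges` and `degrees` are hypothetical, and edge restriction is left out because the slides do not state its criterion.

```python
from collections import Counter
from itertools import combinations

# The two example queries from the slide, already split into units.
queries = [
    ["gprs", "config", "samsung focus", "at&t"],
    ["dell laptop", "extreme", "gaming", "config"],
]

def cooccurrence_edges(queries, local=False):
    """Count undirected co-occurrence edges between units.

    Assumption (not stated in the slides): "global" links every pair of
    units in the same query; "local" links only adjacent units.
    """
    edges = Counter()
    for units in queries:
        pairs = zip(units, units[1:]) if local else combinations(units, 2)
        for a, b in pairs:
            if a != b:
                # frozenset makes the edge direction-independent.
                edges[frozenset((a, b))] += 1
    return edges

def degrees(edges):
    """Number of distinct neighbours of each unit."""
    deg = Counter()
    for edge in edges:
        for unit in edge:
            deg[unit] += 1
    return deg

global_net = cooccurrence_edges(queries)
local_net = cooccurrence_edges(queries, local=True)
```

Under these assumptions, "config" is the only unit shared by both queries, so it ends up with the highest degree in either variant.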

  8. Two-regime Power Law
     • Two-regime power law in degree distribution
     • Similar coefficients for queries and English
     • Kernel (K-Lex) and peripheral (P-Lex) lexicon distinction
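A two-regime power law appears as two straight segments with different slopes on a log-log plot of the degree distribution. The sketch below shows one simple way to look for such a break: a brute-force least-squares search over candidate breakpoints. It is not the authors' fitting procedure, which the slides do not describe. The use of a cumulative distribution, the helper names, and the synthetic exponents (-1 and -2) are all assumptions made for illustration.

```python
import math
from collections import Counter

def cumulative_degree_distribution(degree_values):
    """Return sorted distinct degrees k and the fraction P(K >= k)."""
    counts = Counter(degree_values)
    total = len(degree_values)
    ks = sorted(counts)
    ps = []
    remaining = total
    for k in ks:
        ps.append(remaining / total)
        remaining -= counts[k]
    return ks, ps

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, squared_error)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    err = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, err

def fit_two_regimes(ks, ps, min_points=3):
    """Fit two lines to (log k, log p), trying every breakpoint."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in ps]
    best = None
    for cut in range(min_points, len(xs) - min_points + 1):
        s1, _, e1 = fit_line(xs[:cut], ys[:cut])
        s2, _, e2 = fit_line(xs[cut:], ys[cut:])
        if best is None or e1 + e2 < best[0]:
            best = (e1 + e2, ks[cut], s1, s2)
    _, k_break, slope_low, slope_high = best
    return k_break, slope_low, slope_high

# Synthetic check with made-up exponents: -1 up to k = 16, -2 above.
ks = [2 ** i for i in range(10)]
ps = [k ** -1.0 if k <= 16 else 16 * k ** -2.0 for k in ks]
k_break, slope_low, slope_high = fit_two_regimes(ks, ps)
```

On real data the degrees would come from the query network, and units above the estimated break would be candidates for the kernel lexicon, with the rest forming the periphery.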

  9. K-Lex and P-Lex: Insights (1)
     Outline: higher mean shortest paths; less tight kernel; more k-p edges; socio-cultural effects
     • Differences in the compositions of K-Lex and P-Lex
     • Heads and modifiers

  10. K-Lex and P-Lex: Insights (2)
     • Higher mean shortest path in query networks
     • Peripheral units can independently form queries
     • More difficult to understand the context of a previously unseen unit
     • High surprise factor
     [Figure: example units airedale terrier, tumor, where, download, prison break]
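The mean shortest path mentioned here is the average number of edges on the shortest route between two units. A minimal sketch using breadth-first search on an unweighted, undirected graph is given below. The function name `mean_shortest_path` and both toy graphs are hypothetical and are not taken from the query data.

```python
from collections import deque

def mean_shortest_path(adjacency):
    """Average shortest-path length over all reachable ordered pairs."""
    total = 0
    pairs = 0
    for source in adjacency:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0

# Toy graphs (made up): a three-node chain and a star around one hub.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}

chain_mean = mean_shortest_path(chain)
star_mean = mean_shortest_path(star)
```

For a network built from millions of queries, an exact all-pairs computation like this would be slow, so sampling source nodes is a common shortcut.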

  11. K-Lex and P-Lex: Insights (3)
     • Kernel is less tightly coupled
     • 98% of edges in the query network run between kernel and periphery, while intra-kernel edges dominate in English
     • Socio-cultural factors govern the kernel-periphery distinction (lyrics, movies, adelaide in K-Lex; code, accessories, delhi in P-Lex)
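The share of edges running between kernel and periphery can be computed once every unit has been assigned to K-Lex or P-Lex. The sketch below uses the example units named on this slide, but the edges are invented for illustration and do not reproduce the 98% figure. The function name `edge_mix` is hypothetical.

```python
def edge_mix(edges, kernel):
    """Fraction of edges that are kernel-kernel, kernel-periphery,
    or periphery-periphery, given the set of kernel units."""
    labels = ("periphery-periphery", "kernel-periphery", "kernel-kernel")
    counts = dict.fromkeys(labels, 0)
    for a, b in edges:
        # 0, 1 or 2 endpoints of the edge lie in the kernel.
        in_kernel = (a in kernel) + (b in kernel)
        counts[labels[in_kernel]] += 1
    total = len(edges)
    return {label: count / total for label, count in counts.items()}

# Units from the slide; the edges themselves are made up.
kernel = {"lyrics", "movies", "adelaide"}
edges = [
    ("lyrics", "code"),
    ("movies", "accessories"),
    ("adelaide", "delhi"),
    ("lyrics", "movies"),
]

mix = edge_mix(edges, kernel)
```

Comparing this mix between a query network and an English word network is one way to express the "less tightly coupled kernel" observation numerically.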

