
Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop




Presentation Transcript


  1. Exploiting Implicit Feedback to Identify Usage Patterns on the Desktop. Bachelor Thesis, Leibniz University of Hanover. Michał Kopycki

  2. Bestseller: How to write SPYWARE for “research purposes” and get paid for it

  3. Personalization Research Issues (from Eelco’s presentation): Data Acquisition, Knowledge Inference, User Model, Adaptation Decision Making, Adaptation Mechanism

  4. Outline: Motivation, Logging Framework, User study, Conclusion and future work

  5. Related work: Letizia ’95, LifeStreams ’96, Haystack ’97, JIRIT ’00, Stuff I’ve Seen ’03, Beagle++ ’05; user context work [RM00, BM02, WJR02, TDH05, CGNP05, CN06, Her06, CSC+07, CDH+08]; recommender systems: MovieLens, Amazon, LastFM, StumbleUpon, Libra, Del.icio.us

  6. User context ... in our context. Resource as context: TFxIDF, sender, genre, time windows, GPS location, web address. Interaction with resource as context: sequence of access, reading time, reference, bookmarking, printing a document
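The interaction features listed above can be derived from a time-stamped event log. The sketch below is illustrative only: the event tuples, action names, and feature functions are invented for this example and are not the thesis’s actual log format.

```python
from collections import defaultdict

# Illustrative event log: (timestamp in seconds, action, resource)
events = [
    (0,   "open",  "report.pdf"),
    (120, "close", "report.pdf"),   # read for 2 minutes
    (130, "open",  "mail:1234"),
    (135, "close", "mail:1234"),    # skimmed for 5 seconds
    (140, "open",  "report.pdf"),
    (500, "close", "report.pdf"),
]

def reading_times(events):
    """Total open-to-close time per resource (an implicit-interest signal)."""
    opened, totals = {}, defaultdict(int)
    for ts, action, res in events:
        if action == "open":
            opened[res] = ts
        elif action == "close" and res in opened:
            totals[res] += ts - opened.pop(res)
    return dict(totals)

def access_sequence(events):
    """Order in which resources were opened (used to relate resources)."""
    return [res for ts, action, res in events if action == "open"]

print(reading_times(events))    # {'report.pdf': 480, 'mail:1234': 5}
print(access_sequence(events))  # ['report.pdf', 'mail:1234', 'report.pdf']
```

Reading time and access sequence are exactly the kind of implicit feedback that needs no extra effort from the user, which is why the slide contrasts them with resource-content features such as TFxIDF.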

  7. What is user context good for? Relationships between resources; elicitation of user interests; activity-based computing. (Sergey Chernov, Task Detection for Activity-Based Desktop Search, L3S Research Seminar)

  8. Thesis goals: user context recognition support (“…exploiting usage analysis information about sequences of accesses to local resources…”, L3S 2006) and an alternative to a public Desktop dataset. Motivation for the latter: “…The absence of shared information makes it difficult to focus research problems, and to compare research results…” (Newman 1997); “…an appropriate common test collection that is accepted by the community is required…” (Voorhees 2001); “…Building a Desktop IR testbed seems to be more challenging…” (L3S 2007); “…Desktop datasets within different research groups using a single methodology and a common set of tools…” (L3S 2008)

  9. Outline: Motivation, Logging Framework, User study, Conclusion and future work

  10. Requirements: automatic, cross-application, implicit feedback, privacy preserving, extensible (a new Web browser or email client only requires a new plug-in). Observed domains: Web, Email, IM, File System

  11. Our approach: resources and applications (diagram)

  12. Component view. The User Activity Logger combines: file system drivers and window hooks in C/C++ (window events and undocumented Windows API) covering the Desktop, the File System, Internet Explorer and Outlook Express; XUL/JavaScript plug-ins for Firefox and Thunderbird; and VSTO add-ins in C#/.NET for Outlook 2003 and Outlook 2007
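One way to picture this component view is a central logger that per-application observers (browser plug-in, mail add-in, file-system driver) feed with uniform event records. A minimal sketch of that pattern follows; all class and field names are invented for illustration, and the real framework is written in C/C++ and C#, not Python.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    source: str     # e.g. "firefox", "outlook", "filesystem"
    action: str     # e.g. "page_visit", "mail_read", "file_open"
    resource: str   # URL, message id, or file path

@dataclass
class UserActivityLogger:
    """Central sink: every observer pushes uniform Event records here."""
    log: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> None:
        self.log.append(event)

# Observers translate application-specific notifications into Events.
class FirefoxObserver:
    def __init__(self, logger):
        self.logger = logger
    def on_page_visit(self, url):
        self.logger.record(Event("firefox", "page_visit", url))

class FileSystemObserver:
    def __init__(self, logger):
        self.logger = logger
    def on_file_open(self, path):
        self.logger.record(Event("filesystem", "file_open", path))

logger = UserActivityLogger()
FirefoxObserver(logger).on_page_visit("http://www.l3s.de/")
FileSystemObserver(logger).on_file_open("C:/thesis/draft.doc")
print([(e.source, e.resource) for e in logger.log])
```

The point of the uniform record is the cross-application requirement from slide 10: once browser, mail, and file-system events share one schema, sequences of accesses can be analyzed across domains.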

  13. Logging Framework (architecture diagram)

  14. Supported notifications (table)

  15. Nepomuk adaptation: the Logging Framework integrated with the User Observation Hub

  16. Outline: Motivation, Logging Framework, User study, Conclusion and future work

  17. User study • 21 participants • Average of 170 active logging days • 2,828,706 events • Average of 2,815 distinct emails per user • Average of 9,337 distinct URLs per user • Average of 902 events per user per day • Average of 5 hours of active interaction per user per day

  18. Dataset activity coverage

  19. Data collection: methodology and encryption schemas (example terms: www l3s de, google)
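A common way to make such logs privacy preserving while keeping them analyzable is to replace each term (URL component, search word, file name) with a keyed one-way hash: equal terms still match across the dataset, but the plain text cannot be read. The sketch below illustrates that general idea only; the key, token length, and function names are assumptions, not the thesis’s actual encryption schemas.

```python
import hashlib
import hmac

SECRET_KEY = b"per-user secret"  # hypothetical per-participant key

def encrypt_term(term: str) -> str:
    """Keyed one-way hash: irreversible, but identical terms stay linkable."""
    digest = hmac.new(SECRET_KEY, term.lower().encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

def encrypt_url(url: str) -> str:
    """Hash a URL term by term so its structure (number of components) survives."""
    return ".".join(encrypt_term(t) for t in url.split("."))

# Equal terms map to equal tokens, so frequency and co-occurrence
# analysis still works on the anonymized log ...
assert encrypt_term("google") == encrypt_term("google")
# ... but the original term is not recoverable from the token.
print(encrypt_url("www.l3s.de"))
```

Keeping term boundaries while hiding term content is what lets a dataset like this be shared for usage-pattern research without exposing the participants’ actual emails, files, and searches.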

  20. A glimpse into user behavior: instant reader vs. moderate reader

  21. Outline: Motivation, Logging Framework, User study, Conclusion and future work

  22. Conclusion • Logging Framework (http://pas.kbs.uni-hannover.de/, http://sourceforge.net/projects/activity-logger) • User study • Desktop Dataset • Nepomuk integration • PIM’08 Workshop paper

  23. Future work • Logging Framework: centralized architecture, ontology-based RDF output format, support for new applications and notifications, Vista support • Exploratory analysis of the Desktop dataset: email interaction, Web search interaction, application interaction

  24. References • [BM02] Peter Brusilovsky and Mark T. Maybury. From adaptive hypermedia to the adaptive web. Communications of the ACM, volume 45, pages 30–33, 2002. • [CDH+08] Sergey Chernov, Gianluca Demartini, Eelco Herder, Michał Kopycki, and Wolfgang Nejdl. Evaluating Personal Information Management using an activity logs enriched Desktop dataset. In (to appear) PIM ’08: Proceedings of the Workshop on Personal Information Management, 2008. • [CSC+07] Sergey Chernov, Pavel Serdyukov, Paul-Alexandru Chirita, Gianluca Demartini, and Wolfgang Nejdl. Building a desktop search test-bed. In ECIR ’07: Proceedings of the 29th European Conference on IR Research, Advances in Information Retrieval, pages 686–690. Springer, 2007. • [Her06] E. Herder. Forward, Back and Home Again: Analyzing User Behavior on the Web. PhD thesis, University of Twente, Enschede, 2006. • [RM00] B. J. Rhodes and P. Maes. Just-in-time information retrieval agents. IBM Systems Journal, volume 39, pages 685–704, 2000. • [TDH05] Jaime Teevan, Susan T. Dumais, and Eric Horvitz. Personalizing search via automated analysis of interests and activities. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 449–456. ACM, 2005. • [WJR02] R. W. White, J. M. Jose, and I. Ruthven. Comparing explicit and implicit feedback techniques for web retrieval: TREC-10 interactive track report. In TREC ’02: Proceedings of the Tenth Text REtrieval Conference, 2002. • [CN06] Paul-Alexandru Chirita and Wolfgang Nejdl. Analyzing User Behavior to Rank Desktop Items. In SPIRE ’06: String Processing and Information Retrieval, 13th International Conference, Proceedings, pages 86–97, 2006. • [CGNP05] Paul-Alexandru Chirita, Stefania Costache, Wolfgang Nejdl, and Raluca Paiu. Beagle++: Semantically Enhanced Searching and Ranking on the Desktop. In ESWC 2006: The Semantic Web: Research and Applications, 3rd European Semantic Web Conference, Proceedings, pages 348–362, 2006. • [WTN00] Steve Whittaker, Loren Terveen, and Bonnie A. Nardi. Let’s stop pushing the envelope and start addressing it: a Reference Task Agenda for HCI. Human-Computer Interaction, volume 15, pages 75–106, 2000. • [McG95] Joseph E. McGrath. Methodology matters: doing research in the behavioral and social sciences. Human-computer interaction: toward the year 2000, pages 152–169, 1995. • [CLWB01] Mark Claypool, Phong Le, Makoto Wased, and David Brown. Implicit interest indicators. In IUI ’01: Proceedings of the 6th international conference on Intelligent user interfaces, pages 33–40. ACM, 2001. • [TAAK04] Jaime Teevan, Christine Alvarado, Mark S. Ackerman, and David R. Karger. The perfect search engine is not enough: a study of orienteering behavior in directed search. In CHI ’04: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 415–422. ACM, 2004. • [WRJ02] Ryen W. White, Ian Ruthven, and Joemon M. Jose. Finding relevant documents using top ranking sentences: an evaluation of two alternative schemes. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pages 57–64. ACM, 2002. • [Voo02] Ellen M. Voorhees. The philosophy of information retrieval evaluation. In CLEF ’01: Revised Papers from the Second Workshop of the Cross-Language Evaluation Forum on Evaluation of Cross-Language Information Retrieval Systems, pages 355–370, London, 2002.

  25. Many thanks to Sergey and Eelco, the study participants, and YOU!

  26. Related work, classified along two axes (single domain, i.e. Web or Email, vs. cross domain; implicit vs. explicit feedback): Dragontalk [TAAK04], Connections [WRJ02], Haystack, LifeStreams, MyLifeBits, Stuff I’ve Seen, Beagle++

  27. Collected data

  28. A glimpse into user behavior • File access over folder hierarchy

  29. A glimpse into user behavior • Web page visit length

  30. Alternative to the public Desktop dataset: each group produces its own dataset (Dataset 1, 2, 3, ...) from a Desktop dump using the Logging Framework; the common structure and common output make the datasets comparable and the collection soft-repeatable

  31. Seems hard, but… “It is possible” [BLA06], [APRILFOOL08], [HAHA07] DEADLINE
