Web Personalization and Recommender Systems

Presentation Transcript


    1. Web Personalization and Recommender Systems

    2. What is Web Personalization
        Web Personalization: “personalizing the browsing experience of a user by dynamically tailoring the look, feel, and content of a Web site to the user’s needs and interests.”
        Related Phrases
        - mass customization, one-to-one marketing, site customization, target marketing
        Why Personalize?
        - broaden and deepen customer relationships
        - provide continuous relationship marketing to build customer loyalty
        - help automate the process of proactively marketing products to customers
        - lights-out marketing
        - cross-sell/up-sell products
        - provide the ability to measure customer behavior and track how well customers are responding to marketing efforts

    3. Personalization v. Customization
        It’s a question of who controls the user’s browsing experience
        Customization
        - user controls and customizes the site or the product based on his/her preferences
        - usually manual, but sometimes semi-automatic based on a given user profile
        Personalization
        - done automatically based on the user’s actions, the user’s profile, and (possibly) the profiles of others with “similar” profiles

    7. Challenges and Pitfalls
        Technical Challenges
        - data collection and data preprocessing
        - discovering actionable knowledge from the data
        - which personalization algorithms to use
        Implementation/Deployment Challenges
        - what to personalize
        - when to personalize
        - degree of personalization or customization
        - how to target information without being intrusive

    8. Web Personalization
        The Problem
        - serve dynamic content to users based on their profiles or preferences
        Current Approaches
        - Rule-based Filtering: store a profile for users based on explicit registration information; use prespecified rules to generate recommendations
        - Collaborative Filtering: requires explicit ratings from users to find profiles; e.g., GroupLens, Firefly, PHOAKS, Syskill & Webert
        - Content-Based Filtering: learn/store personal profiles locally or on server-side; recommendations are based on content similarity; e.g., WebWatcher, Letizia
        Limitations of Current Technologies
        - content-based recommendations may be too narrow
        - user input is subjective and prone to bias
        - profiles may be static and can become outdated quickly
        - problems with scalability and accuracy

    9. Examples: FireFly Network (Shardanand & Maes 95), Net Perceptions
        Users rate musical artists from like to dislike
        - 1 = detest; 7 = can’t live without; 4 = ambivalent
        - There is a normal distribution around 4
        - However, what matters are the extremes
        Nearest Neighbors Strategy: find similar users and predict a (weighted) average of their ratings
        Pearson r algorithm: weight by degree of correlation between user U and user J
        - 1 means very similar, 0 means no correlation, -1 means dissimilar
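
The nearest-neighbors strategy on this slide can be sketched roughly as follows. This is a minimal illustration using the textbook Pearson r over co-rated artists, not necessarily the exact variant Firefly used; the function names, the sample users, and the choice to ignore neighbors with non-positive correlation are assumptions made for the example.

```python
from math import sqrt

def pearson_r(ratings_u, ratings_j):
    """Pearson correlation over the artists that both users have rated."""
    common = sorted(set(ratings_u) & set(ratings_j))
    if len(common) < 2:
        return 0.0
    xs = [ratings_u[a] for a in common]
    ys = [ratings_j[a] for a in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def predict_rating(user, artist, all_ratings):
    """Correlation-weighted average of the neighbors' ratings for `artist`."""
    num = 0.0
    den = 0.0
    for other, ratings in all_ratings.items():
        if other == user or artist not in ratings:
            continue
        w = pearson_r(all_ratings[user], ratings)
        if w <= 0:  # assumption: only positively correlated users count
            continue
        num += w * ratings[artist]
        den += w
    return num / den if den else None

# Hypothetical ratings on the 1..7 scale from the slide.
ratings = {
    "U": {"a": 7, "b": 1, "c": 4},
    "J": {"a": 6, "b": 2, "c": 4, "d": 7},
    "K": {"a": 1, "b": 7, "c": 4, "d": 1},
}
print(predict_rating("U", "d", ratings))
```

Here user J agrees with U on every co-rated artist and K disagrees, so only J's rating of artist "d" contributes to the prediction.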

    11. Learning Interface Agents
        Add agents to the user interface and delegate tasks to them
        Use machine learning to improve performance
        - learn user behavior, preferences
        Useful when:
        - 1) past behavior is a useful predictor of future behavior
        - 2) there is a wide variety of behaviors amongst users
        Examples:
        - mail clerk: sort incoming messages into the right mailboxes
        - calendar manager: automatically schedule meeting times?
        - personal news agents
        - portfolio manager agents
        Advantages:
        - less work for user and application writer
        - adaptive behavior
        - user and agent build trust relationship gradually

    12. Letizia: Autonomous Interface Agent (Lieberman 96)
        - Recommends web pages during browsing based on user profile
        - Learns user profile using simple heuristics
        - Passive observation, recommend on request
        - Provides relative ordering of link interestingness
        - Assumes recommendations “near” current page are more valuable than others

    13. Consequences of passiveness
        Weak heuristics
        - example: click through multiple uninteresting pages en route to interestingness
        - example: user browses to uninteresting page, then goes for a coffee
        - example: hierarchies tend to get more hits near root
        Cold start
        No ability to fine-tune profile or express interest without visiting “appropriate” pages
        Some possible alternatives/extensions to internally maintained profiles:
        - expose to the user (e.g., fine-tune profile)?
        - expose to other users/agents (e.g., collaborative filtering)?
        - expose to web server (e.g., cnn.com custom news)?

    14. WebWatcher
        Dayne Freitag, Thorsten Joachims, Tom Mitchell (CMU)
        A "tour guide" agent for the WWW
        - user tells agent what kind of information he/she is seeking (e.g., set of keywords)
        - WebWatcher then accompanies user while browsing the web
        - highlights hyperlinks that it believes will be of interest
        - its strategy for giving advice is learned from feedback in earlier tours

    15. Syskill & Webert (Pazzani et al 96)
        User defines topic page for each topic
        User rates pages (cold or hot)
        Syskill & Webert creates profile with Bayesian classifier
        - accurate
        - incremental
        - probabilities can be used for ranking of documents
        - operates on same data structure as picking informative features
        Only top k (=100) “informative” words are used as features
        - presence or absence of words provides information on classification of pages
        - word occurs in a higher percentage of hot pages than cold pages
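
A Bayesian classifier over word presence/absence, as described on this slide, might look roughly like the sketch below. It is a generic naive Bayes with Boolean features, not Syskill & Webert's actual code; the Laplace smoothing, the function names, and the tiny vocabulary are assumptions, and the selection of the top k informative words is taken as already done.

```python
from math import exp, log

def train(pages, vocabulary):
    """pages: list of (set_of_words, label), label being 'hot' or 'cold'."""
    counts = {"hot": 0, "cold": 0}
    present = {label: {w: 0 for w in vocabulary} for label in counts}
    for words, label in pages:
        counts[label] += 1
        for w in vocabulary:
            if w in words:
                present[label][w] += 1
    total = counts["hot"] + counts["cold"]
    model = {}
    for label in counts:
        # Laplace smoothing (an assumption) keeps probabilities away from 0 and 1.
        prior = (counts[label] + 1) / (total + 2)
        probs = {w: (present[label][w] + 1) / (counts[label] + 2)
                 for w in vocabulary}
        model[label] = (prior, probs)
    return model

def prob_hot(model, words):
    """Posterior probability that a page with these words is 'hot'."""
    log_scores = {}
    for label, (prior, probs) in model.items():
        score = log(prior)
        for w, p in probs.items():
            # Both presence and absence of a feature word carry evidence.
            score += log(p) if w in words else log(1.0 - p)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {label: exp(s - top) for label, s in log_scores.items()}
    return weights["hot"] / (weights["hot"] + weights["cold"])

# Hypothetical rated pages and feature words.
rated = [({"jazz", "guitar"}, "hot"), ({"jazz"}, "hot"),
         ({"tax", "form"}, "cold"), ({"tax"}, "cold")]
model = train(rated, ["jazz", "tax"])
unseen = [{"tax", "jazz"}, {"jazz", "club"}, {"tax"}]
ranked = sorted(unseen, key=lambda page: prob_hot(model, page), reverse=True)
```

Because the classifier outputs a probability rather than a hard label, unseen pages can be ranked by it, which is the use the slide mentions.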

    16. Syskill & Webert Rating Pages

    17. Syskill & Webert Rating Pages

    18. Usage-Based Web Personalization
        Basic Idea
        - find aggregate user profiles by automatically discovering user access patterns through Web usage mining (offline process)
        - match a user’s active session against the discovered profiles to provide dynamic content (online process)
        Advantages / Goals
        - profiles are based on objective information (how users actually traverse the site)
        - no explicit user ratings or interaction with users (to enter a profile, etc.)
        - can preserve user privacy (mining from anonymous data)
        - usage data captures relationships missed by content-based approaches
        Applications
        - provide a customized navigational experience for users based on their interests
        - targeted electronic advertising / personalized e-coupons / customer support

    19. Clustering and User Profiles
        Collaborative Filtering and Clustering
        - CF techniques attempt to match a set of user ratings against previous user ratings and find “nearest neighbors”
        - Clustering can be used to pre-calculate typical user profiles
        Transaction Clustering
        - Pageviews used as features: dimensionality problems arise for large sites
        - Each cluster contains many transactions; problem is how to “derive” useful aggregate profiles from large transaction clusters
        Pageview Clustering
        - Find overlapping clusters of pageviews directly; clusters serve as aggregate profiles
        - Can capture overlapping interests of different types of users (even those with potentially dissimilar transactions)
        - Traditional clustering techniques fail due to very high dimensionality
        - Related work: “Adaptive Web Sites” by Perkowitz and Etzioni

    20. Automatic Web Personalization: Offline Process

    21. Automatic Web Personalization: Online Process

    22. Real-Time Recommendation Engine
        Keep track of users’ navigational history through the site
        - a fixed-size sliding window over the active session captures the current user’s “short-term” history depth
        Match current user’s activity against the discovered profiles
        - profiles can either be based on aggregate usage profiles, or be obtained directly from association rules or sequential patterns
        Dynamically generated recommendations are added to the returned page
        - each pageview can be assigned a recommendation score based on
          - matching score to user profiles (e.g., aggregate usage profiles)
          - “information value” of the pageview based on domain knowledge (e.g., link distance of the candidate recommendation to the active session)
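
The fixed-size sliding window over the active session could be kept with a bounded queue, as in this small sketch. The class name, the window size, and the decision to move a revisited page to the most recent position are assumptions for illustration only.

```python
from collections import deque

class SessionWindow:
    """Keeps only the most recent `size` distinct pageviews of a session."""

    def __init__(self, size=3):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.pages = deque(maxlen=size)

    def visit(self, pageview):
        if pageview in self.pages:
            # Assumption: a revisit refreshes the page's position.
            self.pages.remove(pageview)
        self.pages.append(pageview)

    def as_vector(self, all_pageviews):
        """Binary pageview vector, suitable for matching against profiles."""
        return [1 if p in self.pages else 0 for p in all_pageviews]

window = SessionWindow(size=2)
for page in ["A", "B", "C"]:
    window.visit(page)
print(list(window.pages))                        # ['B', 'C']
print(window.as_vector(["A", "B", "C", "D"]))    # [0, 1, 1, 0]
```

A small window keeps recommendations tied to what the user is doing now rather than to pages visited much earlier in the session.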

    23. Recommendations Based on Association Rules

    24. Discovering Aggregate Usage Profiles
        Characteristics of Aggregate Profiles
        - the goal is to effectively capture common usage patterns from potentially anonymous click-stream data
        - profiles are represented as weighted collections of pageviews
        - weights represent the significance of pageviews within each profile
        - profiles are overlapping in order to capture common interests among different groups/types of users (e.g., customer segments)
        - multiple profiles may contribute to the recommendation set for a given user
        Example Profiles from the ACR (Assoc. for Consumer Research) Site:

    25. Methodologies for the Discovery of Aggregate Profiles
        Discovery of Profiles Based on Transaction Clusters
        - cluster user transactions; features are significant pageviews identified in the preprocessing stage
        - derive usage profiles (set of pageview-weight pairs) based on characteristics of each transaction cluster
        Cluster Pageviews
        - directly compute overlapping clusters of pageviews based on co-occurrence patterns across transactions
        - features are user transactions, so dimensionality poses a problem for traditional clustering algorithms
        - we use Association-Rule Hypergraph Partitioning with an overlap factor

    26. Profile Aggregation Based on Clustering Transactions (PACT)
        Input
        - set of relevant pageviews in preprocessed log
        - set of user transactions
        - each transaction is a pageview vector
        Transaction Clusters
        - each cluster contains a set of transaction vectors
        - for each cluster compute centroid as cluster representative
        Aggregate Usage Profiles
        - a set of pageview-weight pairs: for transaction cluster C, select each pageview pi such that its weight (in the cluster centroid) is greater than a pre-specified threshold
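
The profile derivation step can be sketched in a few lines, assuming the clustering has already been done and that the centroid is the plain mean of the transaction vectors in the cluster. The function name, the threshold value, and the sample data are illustrative assumptions.

```python
def pact_profile(cluster, pageviews, threshold=0.5):
    """Derive a pageview-weight profile from one transaction cluster.

    cluster:   list of transaction vectors (one weight per pageview)
    pageviews: pageview identifiers, in the same order as the vector entries
    """
    n = len(cluster)
    # Centroid = mean vector of the cluster (assumed definition).
    centroid = [sum(t[i] for t in cluster) / n for i in range(len(pageviews))]
    # Keep only pageviews whose centroid weight exceeds the threshold.
    return {p: w for p, w in zip(pageviews, centroid) if w > threshold}

# Hypothetical cluster of four binary transactions over three pageviews.
cluster = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
print(pact_profile(cluster, ["A", "B", "C"]))   # {'A': 1.0, 'B': 0.75}
```

With binary transaction vectors, each centroid weight is simply the fraction of transactions in the cluster that contain the pageview.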

    27. Recommendations Based on Aggregate Profiles
        Matching score computed using cosine similarity
        - User’s active session (pageviews in the current window) is compared to each aggregate profile (both are viewed as pageview vectors)
        - Weight of items in the profile vector is the significance weight of the item for that profile
        - Weight of items in the session vector can be all 1’s, or based on some method for determining their significance in the current session
        Generating recommendations based on matching profiles
        - from each matching profile recommend the items not already in the user session window, and not directly linked from the pages in the current session window
        - the recommendation score for an item is based on a combination of profile matching score (similarity to session window) and the weight of the item in that profile
        - additionally, we can weight items farther away from the current location of user higher (i.e., consider them better recommendations)
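
A rough sketch of this matching and scoring step follows. The slide does not say how the matching score and the item weight are combined, so the product used here is an assumption, as are the function names and sample profiles; the "not directly linked" filter and the link-distance weighting are left out because they need site structure the transcript does not give.

```python
from math import sqrt

def cosine(session, profile):
    """Cosine similarity of two sparse vectors (dicts: pageview -> weight)."""
    dot = sum(w * profile.get(p, 0.0) for p, w in session.items())
    norm_s = sqrt(sum(w * w for w in session.values()))
    norm_p = sqrt(sum(w * w for w in profile.values()))
    if norm_s == 0 or norm_p == 0:
        return 0.0
    return dot / (norm_s * norm_p)

def recommend(session, profiles, min_score=0.0):
    """Score candidate pageviews from every profile that matches the session."""
    scores = {}
    for profile in profiles:
        match = cosine(session, profile)
        for page, weight in profile.items():
            if page in session:
                continue  # never recommend what is already in the window
            score = match * weight  # assumed combination rule
            if score > min_score and score > scores.get(page, 0.0):
                scores[page] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical session window (all weights 1) and two aggregate profiles.
session = {"A": 1.0, "B": 1.0}
profiles = [{"A": 1.0, "B": 1.0, "C": 0.5}, {"D": 1.0}]
print(recommend(session, profiles))
```

Only page "C" is returned here: the second profile shares no pageview with the session, so its matching score is zero and "D" never clears the minimum score.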

    28. PACT - An Example

    29. Recommendations Based on PACT

    30. Integrating Content and Usage For Personalization

    31. Integration of Content Profiles
        Content Profile Representation
        - content profiles are also represented as overlapping collections of pageview-weight pairs
        - cluster features over the n-dimensional space of pageviews
        - for each feature cluster derive a content profile by collecting pageviews in which these features appear as significant
        Integration with Recommendation Engine
        - Usage and content profiles have similar representation, so they can be used by the recommendation engine in the same way
        - Item weights within profiles must be normalized, so that content and usage profiles can be compared on the same scale
        - One approach: match active user session with all profiles (both content and usage); then use the maximal recommendation score for candidate recommendations
        - Another approach: use content profiles for generating recommendations only if no matching usage profile (with sufficient confidence) is found
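
The normalization requirement and the second (fallback) approach can be illustrated as below. Scaling by the largest weight is just one possible normalization, and the function names, the `min_match` cutoff, and the sample profiles are assumptions rather than anything stated on the slide.

```python
def normalize(profile):
    """Scale item weights so the largest weight in the profile becomes 1.0."""
    top = max(profile.values(), default=0.0)
    if top == 0:
        return dict(profile)
    return {page: weight / top for page, weight in profile.items()}

def select_profiles(scored_usage, content_profiles, min_match=0.5):
    """Use usage profiles when any matches confidently, else fall back.

    scored_usage: list of (matching_score, usage_profile) pairs that were
    computed beforehand for the active session.
    """
    confident = [profile for score, profile in scored_usage
                 if score >= min_match]
    return confident if confident else list(content_profiles)

# Hypothetical profiles on different raw scales.
usage = normalize({"A": 2.0, "B": 1.0})
content = normalize({"C": 40.0, "D": 10.0})
print(select_profiles([(0.2, usage)], [content]))   # falls back to content
print(select_profiles([(0.8, usage)], [content]))   # keeps the usage profile
```

Once both kinds of profile are on the same scale, the first approach on the slide (taking the maximal score over all profiles) needs no special handling of where a profile came from.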

    32. How Content Profiles Are Generated

    33. How Content Profiles Were Generated

    34. How Content Profiles Were Generated

    35. Comparison of Recommendations (Example Based on ACR Site)

    36. Comparison of Recommendations (Example Based on ACR Site)

    37. Prediction Accuracy - Precision (Example Based on ACR Site)
        - 18342 transactions, 62 pageview URLs (after filtering)
        - Data set divided into training and evaluation sets
        - Portion of each transaction in evaluation set used to generate a recommendation set (based on a given recommendation threshold)
        - Precision = percentage of recommendations actually visited in the transaction

    38. Prediction Accuracy - Coverage (Example Based on ACR Site)
        - Coverage = percentage of visited pageviews recommended by the personalization engine
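
Both measures reduce to simple set arithmetic for a single evaluation transaction, as in this sketch. It assumes that "visited" means the pageviews of the transaction that were held back from the recommender; the function name and the sample pages are hypothetical.

```python
def precision_coverage(recommended, visited):
    """Return (precision, coverage) as fractions for one transaction.

    recommended: pageviews produced by the recommendation engine
    visited:     pageviews the user actually visited (assumed held out)
    """
    recommended, visited = set(recommended), set(visited)
    hits = recommended & visited
    precision = len(hits) / len(recommended) if recommended else 0.0
    coverage = len(hits) / len(visited) if visited else 0.0
    return precision, coverage

# Hypothetical example: 4 recommendations, 3 visited pages, 2 in common.
p, c = precision_coverage(["C", "D", "E", "F"], ["C", "D", "G"])
print(p, c)
```

Raising the recommendation threshold typically shrinks the recommendation set, which tends to push precision up and coverage down, so the two are usually reported together.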

    39. Example - ACR Demo Site (http://aztec.cs.depaul.edu/scripts/acr2)

    40. Automatic Web Personalization Example - ACR Demo Site

    41. Automatic Web Personalization Example - ACR Demo Site

    42. Automatic Web Personalization Example - ACR Demo Site
