Recommender Systems Revisited – From Items to Transactions Laks V.S. Lakshmanan University of British Columbia Vancouver, Canada http://www.cs.ubc.ca/~laks Joint work with Zeinab Abbassi. PersDB'09, Lyon.
Why Recommendations? • The Recommendation Paradigm • Suggest content (in most cases, items) to a user based on her profile and past activities. • Why Recommendation? • Search queries can be generic: e.g., >90% of Yahoo! Travel queries are general descriptions like "family trip". • More so for Social Content Sites ... PersDB'09, Lyon, August 2009.
Why Recommendations? • Social Content Sites • Sites where users make friends and share content • E.g., Facebook, del.icio.us, Flickr, etc. • Content sites letting you share info with your social buddies • E.g., nytimes.com, indiatimes.com, youtube.com • Recommendation is an indispensable information exploration paradigm on social content sites. • The rich activities and user connections provide lots of opportunities for generating recommendations.
Overview of RecSys • Item-Based Strategies • Estimate the rating of an unrated item (i) by the user (u) based on its similarity to items already rated and how u rated those items. • Collaborative Filtering Strategies • Estimate the rating of i by u based on how u’s similarity network (either explicit or implicit) rated i.
Overview of RecSys • Item-based. [Figure: users U1, U2, …, Ui, … and items I1, I2, …, Ij, …]
Overview of RecSys • User-based: Collaborative Filtering. [Figure: users U1, U2, …, Ui, … and items I1, I2, …, Ij, …]
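The item-based and user-based strategies in this overview can be sketched as follows. This is a minimal illustration, not the method of any particular system: the toy `ratings` table, the cosine similarity restricted to shared keys, and the function names `predict_item_based` and `predict_user_based` are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 5, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}

def cosine(x, y):
    """Cosine similarity of two sparse vectors, computed over shared keys only."""
    shared = set(x) & set(y)
    if not shared:
        return 0.0
    dot = sum(x[k] * y[k] for k in shared)
    nx = sqrt(sum(x[k] ** 2 for k in shared))
    ny = sqrt(sum(y[k] ** 2 for k in shared))
    return dot / (nx * ny)

def item_vector(item):
    """Ratings of one item, indexed by the users who rated it."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def predict_item_based(user, item):
    """Average the user's own ratings, weighted by item-item similarity."""
    target = item_vector(item)
    num = den = 0.0
    for other, r in ratings[user].items():
        s = cosine(target, item_vector(other))
        num += s * r
        den += abs(s)
    return num / den if den else None

def predict_user_based(user, item):
    """Average other users' ratings of the item, weighted by user-user similarity."""
    num = den = 0.0
    for v, r in ratings.items():
        if v == user or item not in r:
            continue
        s = cosine(ratings[user], r)
        num += s * r[item]
        den += abs(s)
    return num / den if den else None

p_item = predict_item_based("u1", "d")
p_user = predict_user_based("u1", "d")
```

Both predictors are weighted averages, so each estimate stays within the range of the ratings it averages over.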
Overview of RecSys • Fusion Strategies • Model/Machine Learning-based approaches. • What’s common among all RecSys algorithms? • Recommend items to users. • What if we want to recommend transactions instead? • Motivating Apps • Users exchanging items • Offline (one-shot) exchanges • Asynchronous exchanges • Users buying/selling items for a price
Talk Outline • Motivation • Problem Definition • Related Work • Our Approach • Experimental Results • Summary & Future Work.
Motivation • Online social networks: emergence and rapid growth. • Users spend more time on online social networks. • MySpace and Facebook are among the top 10 websites.
Motivation – Exchange Markets • Exchange markets already exist on the Web today: • OddShoe.org • Peerflix.com (movie exchange) • ReadItSwapIt.co.uk • JoeBarter.com • Intervac
Motivation • Need “Matching” Algorithms • Enhance quality of user experience. • Engage more people in the system, more of the time. • Monetization. • Lack of comprehensive study of “matching”. • Need efficient recommendation algorithms.
Some Related Problems • Chinese Postman Problem: • Collect mail from postal station (s). • Deliver mail on all streets (edges). • Minimize distance covered (fuel consumed).
Some Related Problems • Cycle Covers: [Figure: edge-weighted graph.] • Can we cover all vertices with edge-disjoint cycles? • What is the minimum weight of such a cover? • Variants: vertex-disjoint, edge-disjoint, vertex/edge cover, bounded length, etc.
Related Work • Graphs – Cycle Cover Problem: • Polytime algorithm for Chinese Postman problem on undirected graphs [Edmonds & Johnson 73]. • CPP is NP-hard for mixed graphs [Papadimitriou 76] but admits a 3/2-approx. [Raghavachari & Veerasamy 99]. • Cycle Cover -- cover given set of nodes/edges with set of min. length cycles. • Min. Weight Cycle Covers – a variant of CPP; NP-hard in general [Thomassen 97]. • CC w/ bounds on cycle length (heuristic) [Hochbaum and Olinick 01]. • Approximation algorithms when length is bounded [Immorlica+ 05].
Related Work • Recommender Systems: • Management science perspective [Murthy & Sarkar 03]. • Collaborative filtering [Resnick+ 94, Shani+ 02]. • Survey of item-based, collaborative filtering, fusion-based, and model-based [Adomavicius & Tuzhilin 05].
Related Work • Kidney Exchange problem: 4,000 deaths/yr in US. 70,000 waiting for a cadaver kidney.
Related Work • Kidney Exchange problem: • In kidney transplants, the donor’s kidney is frequently incompatible with the patient. • Example: A’ is willing to donate her kidney to A, and B’ to B, but each pair is incompatible. However, the kidney of B’ is compatible with A, and that of A’ with B. • Motivation: Find feasible exchanges and save more lives. • Medical constraint: no cycle longer than 3!
Related Work • Bi-cycles (2-cycles) only: reduces to perfect matching, therefore polynomial! • Cycles of length 3 or more: NP-complete. • [Abraham+ 07] solve the United States kidney exchange problem on real data using Integer Linear Programming. • We will look at a more general problem than KE. • Incentivizing exchanges in P2P file-sharing systems [Anagnostakis & Greenwald 04].
A Model • Set of users U and a set of items I. • Two lists for each user u in U: item list S_u, the items u is willing to give away; wish list W_u, the items u is looking for. • Network: nodes = users; edge u → v iff there is a feasible transaction from u to v, i.e., some item in S_u is also in W_v. • Edges are labeled with the item.
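A minimal sketch of the model above, assuming item lists and wish lists are stored as Python sets keyed by user. The function name `build_exchange_graph` and the toy users and items are hypothetical, chosen only for illustration.

```python
def build_exchange_graph(give, want):
    """Return {(u, v): items} where items = S_u & W_v is non-empty and u != v.

    give[u] plays the role of the item list S_u, want[u] the wish list W_u.
    An entry (u, v) means there is a feasible transaction from u to v,
    labeled with the items u could give to v.
    """
    edges = {}
    for u, s_u in give.items():
        for v, w_v in want.items():
            if u == v:
                continue
            items = s_u & w_v
            if items:
                edges[(u, v)] = items
    return edges

# Hypothetical toy market with three users.
give = {"u": {"i"}, "v": {"j"}, "w": {"k"}}
want = {"u": {"j"}, "v": {"i", "k"}, "w": {"i"}}
edges = build_exchange_graph(give, want)
```

Here `u` can give item `i` to either `v` or `w`, but owns only one copy, which is exactly the kind of conflict the later slides have to resolve.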
Example
Different Models • One-shot exchange Markets: • Simple exchange markets (swaps). • Exchange markets through short cycles. • Probabilistic exchange markets. • Wish List as Query List. • Exchange markets over time.
Simple exchange markets • Only one-by-one transactions. • The problem is to find a set of pairs [(u,i), (v,j)], where i ∈ S_u, j ∈ W_u, i ∈ W_v, and j ∈ S_v, • which form 2-cycles (swaps). • Typically each user has one instance of any item and also wants one instance of an item in his wish list, so [(u,i), (*,*)] should not appear more than once for each user u; i.e., we are looking for a set of conflict-free cycles.
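The swap condition and the conflict-free requirement can be illustrated with a short sketch. It assumes set-valued item and wish lists; `find_swaps` and `conflict_free` are hypothetical names, and the first-come selection is a simple heuristic for illustration, not an optimal solver.

```python
def find_swaps(give, want):
    """List all pairs ((u, i), (v, j)) with i in S_u & W_v and j in S_v & W_u."""
    users = sorted(give)
    swaps = []
    for pos, u in enumerate(users):
        for v in users[pos + 1:]:
            for i in sorted(give[u] & want[v]):
                for j in sorted(give[v] & want[u]):
                    swaps.append(((u, i), (v, j)))
    return swaps

def conflict_free(swaps):
    """Keep swaps in order, skipping any that reuses a (giver, item)
    or (receiver, item) pair already committed to an earlier swap."""
    used_give, used_get, chosen = set(), set(), []
    for (u, i), (v, j) in swaps:
        gives = {(u, i), (v, j)}
        gets = {(v, i), (u, j)}
        if gives & used_give or gets & used_get:
            continue
        used_give |= gives
        used_get |= gets
        chosen.append(((u, i), (v, j)))
    return chosen

# Hypothetical market: both v and w could swap item j for u's item i.
give = {"u": {"i"}, "v": {"j"}, "w": {"j"}}
want = {"u": {"j"}, "v": {"i"}, "w": {"i"}}
swaps = find_swaps(give, want)
chosen = conflict_free(swaps)
```

Two swaps are feasible here, but `u` has a single copy of `i`, so only one of them survives the conflict check.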
Exchange markets through cycles • We look for cycles of length more than 2 in the system. • The goal is to find cycles: [ (u_1,i_1) , (u_2,i_2) , (u_3,i_3) , …, (u_k,i_k) ] where i_1 in S_u1, i_1 in W_u2, i_2 in S_u2, i_2 in W_u3, ….
Exchange markets through short cycles • Note: A cycle can happen if and only if all the participating edges are realized. • Discover short cycles and solve the short cycle cover problem for cycles of length <= k, where k = 3, 4, 5, …
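Discovering short cycles can be sketched as a bounded depth-first enumeration. This is one straightforward way to do it, assuming the network is given as an adjacency dictionary of comparable node names; the name `short_cycles` and the toy graph are hypothetical, and edge labels (items) are omitted for brevity.

```python
def short_cycles(adj, k):
    """Enumerate simple directed cycles with 2..k nodes.

    Each cycle is reported once, as a tuple starting at its smallest node,
    by only extending paths through nodes larger than the start node.
    """
    cycles = []

    def extend(path):
        last = path[-1]
        for nxt in sorted(adj.get(last, ())):
            if nxt == path[0] and len(path) >= 2:
                cycles.append(tuple(path))        # edge back to the start closes a cycle
            elif nxt > path[0] and nxt not in path and len(path) < k:
                extend(path + [nxt])

    for start in sorted(adj):
        extend([start])
    return cycles

# Hypothetical network: a swap between u and v, and a 3-cycle u -> v -> w -> u.
adj = {"u": {"v"}, "v": {"u", "w"}, "w": {"u"}}
```

The number of candidate cycles grows quickly with k, which is one reason to keep the length bound small.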
Probabilistic Exchange Markets • Each edge in the graph has a probability indicating the likelihood of it being realized. • The probability of realizing each edge is independent of the other edges. • Two kinds of probabilities are of interest: • Pu(v): what’s the probability u is willing to perform a transaction with v? • Pu(i,j): what’s the probability u is willing to exchange item i for j?
Query List as Wish List • Wish list only contains “predicates” instead of items. • E.g., horror movie, science fiction, eastern philosophy, home hardware, … • Item list as before. • Users may rate/review items. • Matchmaking has to factor in ratings, i.e., matching has to use RecSys technology. -- Will focus on simple exchange, short cycles, and prob. markets in this talk.
Goal • Generate recommendations that maximize the (expected) number of items exchanged through the network. • Each user u gets a reco.: • Gives = {(give i to v), …} • Gets = {(get j from w), …} • Set of reco’s together constitute a set of conflict-free cycles that maximize above metric.
SimpleMarket Problem • Theorem: Even the SimpleMarket problem is NP-complete. • (Contrast with one-by-one kidney exchange.) • Reduction from four-cycle partitioning of 4-partite graphs to our problem. • Reduction from three-cycle partitioning of 3-partite graphs to 4-cycle partitioning of 4-partite graphs. • 3-cycle partitioning of 3-partite graphs is NP-complete [Holyer 81, Abraham+ 06].
ProbMarket • Lemma: The kidney exchange version of ProbMarket can be solved in polynomial time. • Idea: Maximum weighted perfect matching continues to work.
Algorithms • Maximal set of Cycles • Greedy. • Local Search. • Greedy/Local Search.
Maximal Algorithm • Initialize the set of cycles CFSC=empty. • At each step, • Find an exchange cycle C. • Add C to the set of cycles CFSC. • Remove all edges in G in conflict with this cycle. • Terminate if there is no remaining cycle. • Find an exchange cycle C: • Run a DFS or BFS algorithm until you find a backward edge. • BFS tends to find short cycles.
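The steps of the Maximal algorithm can be sketched as below. This is a hedged illustration rather than the authors' implementation (which the talk says was written in MATLAB): edges are assumed to be `(giver, receiver, item)` triples with no self-loops, an edge is assumed to conflict with a chosen cycle when it reuses a (giver, item) or (receiver, item) pair, and the names `find_cycle_bfs`, `conflicts`, and `maximal_cycles` are hypothetical.

```python
from collections import deque

def find_cycle_bfs(edges):
    """Return one directed cycle as a list of (u, v, item) edges, or None.
    BFS from a start node stops at the first edge leading back to it."""
    out = {}
    for u, v, item in sorted(edges):
        out.setdefault(u, []).append((u, v, item))
    for start in sorted(out):
        parent = {}                       # node -> edge used to reach it
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for edge in out.get(node, []):
                nxt = edge[1]
                if nxt == start:          # backward edge closes the cycle
                    cycle = [edge]
                    while node != start:  # walk the BFS tree back to start
                        cycle.append(parent[node])
                        node = parent[node][0]
                    return cycle[::-1]
                if nxt not in parent:
                    parent[nxt] = edge
                    queue.append(nxt)
    return None

def conflicts(edge, cycle):
    """True if edge reuses a (giver, item) or (receiver, item) pair of the cycle."""
    u, v, i = edge
    return any((u, i) == (a, x) or (v, i) == (b, x) for a, b, x in cycle)

def maximal_cycles(edges):
    """Repeat: find any cycle, keep it, drop every edge in conflict with it."""
    remaining, chosen = set(edges), []
    while True:
        cycle = find_cycle_bfs(remaining)
        if cycle is None:
            return chosen
        chosen.append(cycle)
        remaining = {e for e in remaining if not conflicts(e, cycle)}

# Hypothetical network: u can give i to v or w, but has only one copy.
edges = {("u", "v", "i"), ("v", "u", "j"), ("u", "w", "i"), ("w", "u", "k")}
result = maximal_cycles(edges)
```

Every chosen cycle's own edges count as conflicting and are removed, so each round shrinks the graph and the loop terminates.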
Greedy Algorithm • Initialize the set of cycles CFSC=empty. • At each step, • Find the best exchange cycle C. • Add C to the set of cycles CFSC. • Remove all edges in conflict with this cycle. • Terminate if there is no remaining cycle. • Find the best exchange cycle C: • Try all short cycles and find the cycle with maximum weight.
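A minimal sketch of the greedy step, assuming the short cycles have already been enumerated as tuples of `(giver, receiver, item)` edges. Picking candidates in decreasing weight order and skipping conflicting ones is used here as an equivalent of "pick the best cycle, remove conflicting edges, repeat"; the names `elements` and `greedy_cycles` and the toy cycles are hypothetical.

```python
def elements(cycle):
    """Map a cycle to the give/get facts it commits to, one pair per edge."""
    elems = set()
    for u, v, i in cycle:
        elems.add(("give", u, i))
        elems.add(("get", v, i))
    return elems

def greedy_cycles(candidates, weight):
    """Scan cycles by decreasing weight; keep each one that is
    disjoint from everything kept so far."""
    chosen, used = [], set()
    for cycle in sorted(candidates, key=weight, reverse=True):
        e = elements(cycle)
        if used.isdisjoint(e):
            chosen.append(cycle)
            used |= e
    return chosen

# Hypothetical candidates: a swap and a 3-cycle that both need u's copy of i.
swap = (("u", "v", "i"), ("v", "u", "j"))
triangle = (("u", "w", "i"), ("w", "x", "k"), ("x", "u", "m"))
picked = greedy_cycles([swap, triangle], weight=lambda c: 2 * len(c))
```

With weight 2k for a k-item cycle, the 3-cycle (weight 6) beats the swap (weight 4), and the swap is then dropped as conflicting.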
Intermediate Maximal/Greedy • Improve Running Time. • Find the best exchange cycle C: • Run BFS from each node v and find a cycle Cv. • Find the cycle Cv with the maximum weight and add it.
Local search algorithm • Initialize the set of cycles CFSC=empty. • At each step, • Let the current set of cycles be CFSC. • For any exchange cycle C that is not already picked, • Try to add C, and remove all cycles in CFSC in conflict with C • If the total weight of CFSC increases, add C to CFSC and remove all conflicting cycles from CFSC. • If no local improvement is possible, output CFSC and terminate.
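The local search loop can be sketched as follows, again assuming pre-enumerated candidate cycles given as tuples of `(giver, receiver, item)` edges. The names `elements` and `local_search`, the optional `start` argument, and the toy cycles are hypothetical.

```python
def elements(cycle):
    """Map a cycle to the give/get facts it commits to, one pair per edge."""
    elems = set()
    for u, v, i in cycle:
        elems.add(("give", u, i))
        elems.add(("get", v, i))
    return elems

def local_search(candidates, weight, start=()):
    """Try adding each unpicked cycle while evicting the cycles it conflicts
    with; accept the move only if total weight strictly increases."""
    current = list(start)
    improved = True
    while improved:
        improved = False
        for c in candidates:
            if c in current:
                continue
            e = elements(c)
            kept = [d for d in current if elements(d).isdisjoint(e)]
            if sum(map(weight, kept)) + weight(c) > sum(map(weight, current)):
                current = kept + [c]
                improved = True
    return current

# Hypothetical candidates: swap and triangle conflict on (u gives i); other is disjoint.
swap = (("u", "v", "i"), ("v", "u", "j"))
triangle = (("u", "w", "i"), ("w", "x", "k"), ("x", "u", "m"))
other = (("y", "z", "p"), ("z", "y", "q"))
w = lambda c: 2 * len(c)
result = local_search([swap, triangle, other], w)
```

Because every accepted move strictly increases the total weight, the loop cannot cycle. Passing a greedy solution as `start` gives the Greedy/Local Search combination.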
Greedy/Local Search • First, Run the greedy algorithm to find a set of cycles CFSC. • Then, Run the local search algorithm starting from the set CFSC. • How good are these algorithms?
Set Packing • Our problem is a special case of the weighted k-set packing problem: • Given a collection of sets, each of which has an associated real weight and contains at most k elements drawn from a finite base set, find a collection of disjoint sets of maximum total weight. [Figure: set packing input and output; source: http://www.cs.sunysb.edu/~algorith/files/set-packing.shtml]
Relation to set packing • Elements: (user u gives item i), (user v gets item j). • Sets: cycles of exchanges. • Weights of sets: • Short cycle case: weight is 2k for k item exchanges. • Probabilistic: weight is [\prod_{e \in C} p(e)] * 2k. • Main difference: Sets are not given explicitly. • Sets are cycles (given implicitly) and we have to discover them.
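The two weight definitions above can be written as one small function. This is a sketch under the slide's independence assumption for edge probabilities; the name `cycle_weight`, the edge representation, and the probability table `p` are hypothetical.

```python
from math import prod

def cycle_weight(cycle, p=None):
    """Weight of a cycle with k edges (k item exchanges).

    Short-cycle case (p is None): 2k.
    Probabilistic case: (product of p[e] over edges e in the cycle) * 2k,
    where p maps each edge to its realization probability.
    """
    k = len(cycle)
    if p is None:
        return 2 * k
    return prod(p[e] for e in cycle) * 2 * k

# Hypothetical swap whose two edges are realized with probability 0.5 and 0.8.
swap = (("u", "v", "i"), ("v", "u", "j"))
p = {swap[0]: 0.5, swap[1]: 0.8}
```

For this swap the deterministic weight is 4, while the probabilistic weight is 0.5 * 0.8 * 4 = 1.6.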
Quality of Algorithms • Maximal: no guaranteed quality; O((|V| + |E|)|B|) time. • Greedy: 2k-approximation [Chandra & Halldorsson 99]; O(|V|^2k |B|) time. • Local Search: (2k-1)-approximation [Arkin & Hassin 97]; O(|V|^2k|E|log OPT) time. • Greedy/Local Search (local greedy): 2(2k+1)/3-approximation [Arkin & Hassin 97]. • More details in [Abbassi & L 09].
Experiments • Algorithms implemented in MATLAB and run on a 2.16 GHz Intel Core 2 Duo CPU with 1 GB of RAM under Windows XP. • Goals: • Extent to which allowing cycles of length > 2 increases coverage of users/items. • Quality of results of algorithms (Recall: Maximal has no theoretical guarantees). • Scalability. • Synthetic data: Structure as well as user activities follow power law [Newman 03].
Takeaways: Maximal vs Approximation Algorithms • Skew factor = 1.0, cycle length bound = 4, #users = 25K. • Skew factor = 1.5, cycle length bound = 4, #users = 25K.
Summary & Future Work • Market exchanges over online social nets – simple, short cycles, probabilistic. • Related kidney exchange problem – polytime for swaps and NP-complete for k > 2. • Even swaps are NP-complete for market exchange. • Reduction to weighted k-set packing yields approximation algorithms; Maximal is a heuristic. • Experiments: “Diminishing returns” as k goes up. • Maximal – more than one order of magnitude more efficient and comparable quality! • More empirical analysis needed.
Summary & Future Work • Experiments on Real data sets. • More efficient approximation algorithms? • Randomization? • Exchange Markets over time? • Many different objectives: e.g., #items exchanged, fairness, average waiting time, … • Market Price, Buy/Sell. • Connection with game theory. • Query List as Wish List (think movie in place of kidney!)
Other Projects in Social Networks and A Shameless Ad • Mining/Analysis of Social Networks (e.g., for viral marketing). • Network Evolution. • Diversification in RecSys. • Network-aware Search. • Social Search – SocialScope. • Opportunities for grad students and postdocs. • See http://www.cs.ubc.ca/~laks and UBC CS Grad Programs