1 / 37

Team #6 Bill Cheng Sabina Del Rosso Stephen Hom

Social Networking Analytics for Calbee (SNAC). Team #6 Bill Cheng Sabina Del Rosso Stephen Hom Omede Firouz Stacy Hsueh Wei Jiang Thoranis Karnasuta. DATABASE. CLIENT. EER/SCHEMA. NORMALIZATION. QUERIES. Client Background: Calbee San Francisco.

kaya
Download Presentation

Team #6 Bill Cheng Sabina Del Rosso Stephen Hom

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Social Networking Analytics for Calbee (SNAC) Team #6 Bill Cheng Sabina Del Rosso Stephen Hom OmedeFirouz Stacy Hsueh Wei Jiang ThoranisKarnasuta DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES Professor Ken Goldberg. IEOR 115. December 9, 2011.

  2. Client Background: Calbee San Francisco • CALBEE, Inc. is one of the largest snack companies in Japan • Company based on the premise of good health • Calbee, San Francisco is the company’s first US-based flagship store • Founded in early 2011 • Located in Westfield Mall • Active in social media • Website, Facebook, Twitter DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES Image from calbeeshop.com

  3. Current Infrastructure • Currently do not keep track of social media hits on any site • Use Point of Sale for sales data and employee clock-ins DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES Image from http://www.unrealstudio.com

  4. Database Objectives • Handle future expansion into e-commerce • Increase social media marketing in targeted demographics • View effect of promotions on sales and social media to help better cater future promotions • Provide a foundation to maximize profits • Logistic management using integer programming • Data mining and machine learning to predict sales DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  5. EER Diagram DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  6. Relational Design Schema Relational Design Schema (46 relations) Promotion/Sales/Retail: Relations Numbered 0-9 1. PRODUCT(ProdID, Name, IsSour, IsSweet, IsSalty, IsSavory, ManufCost, RetailPrice) 2. PURCHASE(PurchaseID, ProdID1, PromoID3a, CustID6a, StoreID4a, EmpID5a, Timestamp, ipAddress) 3a. PROMOTION(PromoID, PromoCode, StoreID, StartDate, EndDate, Discount) 3b. PROMOTION_SPREAD_VIA_TWITTER(PromoID3a, TweetID10c) 3c. PROMOTION_SPREAD_VIA_F(PromoID3a, F_CID11c) 3d. PROMOTION_SPREAD_VIA_G+(PromoID3a, G_CID12c) 3e. PROMOTION_SPREAD_VIA_S(PromoID3a, S_DID13c) 3f. PROMOTION_SPREAD_VIA_B(PromoID3a, BPost_ID14a) 3g. PROMOTION_INFO_VIA_W(PromoID3a, url15) 4a. STORE(StoreID,AddressNo,StreetName, City, Country, ZipCode, PhoneNo) 4b. STORE_CARRIES(StoreID4a, ProdID1, Stock) 5a. EMPLOYEE(EmpID, LName, FName, Position, FavProdID1, StoreID4, AddressNo,StreetName, City, State, Country, ZipCode, SSN) 5b. EMPLOYEE_IS_FRIEND(EmpID5a, T_UID10a, F_UID11a, G_UID12a, S_UID13a) 5c. EMPLOYEE_IS_CUSTOMER(EmpID5a, CustID6a) 6a. CUSTOMER(CustID, LName, FName, AddressNo, StreetName, City, State, Country, ZipCode, FavProd1, BirthDate) 6b. CUSTOMER_IS_FRIEND(CustID6a, T_UID10a, F_UID11a, G_UID12a, S_UID13a) 8a. PRODUCT_AD(P_Ad_ID, ProductID1, DateBeginAd, DateEndAd, F_or_G_Ad)8b. STORE_AD(S_Ad_ID, Store_ID, DateBeginAd, DateEndAd, F_or_G_Ad) 8c. F_P_AD_CLICKED(P_Ad, F_UID, Timestamp, ipAddress) 8d. G_P_AD_CLICKED(P_Ad_ID, G_UID, Timestamp, ipAddress) 8e. F_S_AD_CLICKED(S_Ad_ID, F_UID, Timestamp, ipAddress) 8f. G_S_AD_CLICKED(S_Ad_ID, G_UID, Timestamp, ipAddress) DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  7. Relational Design Schema Cont. Social Media: Relations Numbered 10-19 10a. T_USER(T_UID, T_Username, Fname, Lname, City, State, BirthDate, Email) 10b. T_FOLLOWING(T_UID10a, Follower_T_UID10a, DateBeganFollowing) 10c. TWEET(TweetID, T_UID10a, Auth_T_UID10a, TextStr, Timestamp) 11a. F_USER(F_UID, Fname, Lname, City, State, BirthDate, Email) 11b. F_FRIENDS(F_UID11a, Friend_F_UID11a, DateBecameFriends) 11c. F_COMMENT(F_CID, Auth_F_UID11a, On_F_CID11c, TextStr, Timestamp) 11d. F_LIKE(F_CID11c, F_UID11a, Timestamp) 12a. G_USER(G_UID, Fname, Lname, City, State, BirthDate, Email) 12b. G_FRIENDS(G_UID12a, Friend_G_UID12a, DateBecameFriends) 12c. G_COMMENT(G_CID, Auth_G_UID12a, On_G_CID12c, TextStr, Timestamp) 12d. G_LIKE(G_CID12c, G_UID12a, Timestamp) 13a. S_USER(S_UID, Fname, Lname, City, State, BirthDate, Email) 13b. S_FOLLOWING(S_UID13a, Follower_S_UID13a, DateBeganFollowing) 13c. S_DISCOVERY(S_DID, S_UID13a, url, Timestamp) 13d. S_REVIEW(S_DID13c, S_UID13a, TextStr, Like/Dislike, Timestamp) 14a. BLOG_POST(url, BPost_ID, Author_Emp_ID5a, TextStr, Timestamp) 14b. BLOG_COMMENT(BComment_ID, url, BPost_ID14a, TextStr, Timestamp, ipAddress) 14c. ASSOCIATE_IP_T(T_UID10a, Timestamp, ipAddress) 14d. ASSOCIATE_IP_FB(F_UID11a, Timestamp, ipAddress) 14e. ASSOCIATE_IP_G(G_UID12a, Timestamp, ipAddress) 14f. ASSOCIATE_IP_S(S_UID13a, Timestamp, ipAddress) 15. MAIN_WEBSITE(url, link_to_html_file, Timestamp) DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  8. Relational Design Schema Cont. Other Data: Relations Numbered 20-29 20a. GOOGLE_TREND(GT_ID, word, city, country, day, hits) 20b. RELATED_TREND(word, Related_Prod_ID1) DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  9. Access Table Relationships DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  10. Access Table Relationships Cont. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  11. Normalization Analysis: 1NF Removal of a multi-valued attribute (flavor): DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  12. Normalization Analysis: 2NF Removal of a partial FD: {PromoID}  {PromoCode, StoreID, StartDate, EndDate, Discount} DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  13. Normalization Analysis: 3NF Removal of a transitive FD: {T_UID}  {T_Username, Fname, Lname, City, State, BirthDate, Email} DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  14. Normalization Analysis: BCNF Removal of a FD with a non-superkey attribute on the LHS: {PromoCode}  {StartDate, EndDate, Discount} DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  15. Query 1: Popular Product Stock Find out the most talked about products in a city and their quantities (stock). This will help us determine which products to move around to balance inventories in expectation of sale increases. Data can be exported to a solver to do a shipment problem. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  16. Query 1: SQL SELECT Product.ProdID, Product.ProdName, (SELECT COUNT(F_Comment.F_CID) FROM F_Comment WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*') AS Hits, Store.City, Store_Carries.StoreID AS Store, Store_Carries.Stock AS Stock FROM Product, Store, Store_Carries WHERE (((Product.ProdID)=[Store_Carries].[ProdID]) AND ((Store.StoreID)=[Store_Carries].[StoreID])) ORDER BY Product.ProdName; DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  17. Query 1: Output DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  18. Query 1: Data Analysis • We have a list of stores and their stock of different products • Transportation problem to encourage similar levels of stock • Minimize shipments, shipping costs, etc. • Subject to: No outliers (stores with low stock) Possible shipment constraints Possible traffic constraints Etc. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  19. Query 1: Data Analysis (AMPL) DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  20. Query 1: Data Analysis DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  21. Query 2: Promo Social Networking Consider a promotion. Compare product social network comments in a given city two weeks before, during, and two week after a promotion to judge its effectiveness.   Order by the return on the investment. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  22. Query 2: SQL SELECT Promotion.PromoID, (SELECT COUNT(*) FROM F_Comment, Product WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*' AND Product.ProdID = Promotion.ProdID AND F_Comment.Timestamp < Promotion.StartDate AND F_Comment.Timestamp > Promotion.StartDate - 14) AS HitsBefore, (SELECT COUNT(*) FROM F_Comment, Product WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*' AND Product.ProdID = Promotion.ProdID AND F_Comment.Timestamp < Promotion.EndDate AND F_Comment.Timestamp > Promotion.StartDate) AS HitsDuring, (SELECT COUNT(*) FROM F_Comment, Product WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*' AND Product.ProdID = Promotion.ProdID AND F_Comment.Timestamp < Promotion.EndDate + 14 AND F_Comment.Timestamp > Promotion.EndDate) AS HitsAfter, (SELECT SUM(Promotion.Discount*Product.RetailPrice) FROM Product) AS PromoCost FROM Promotion ORDER BY PromoCost; DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  23. Query 3: Priority Customers Use friendship data to rate friends by how many recommendations they have made. Determine how many of a person's friends became friends with us after they became friends with us. In this way, we identify possible priority customers of Calbee to target for special advertisements and promotions. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  24. Query 3: SQL SELECT F.F_UID, ( SELECT COUNT(*) FROM F_Friends AS F2 WHERE F2.F_UID = F.F_UID AND EXISTS( SELECT F3.DateBecameFriends FROM F_FRIENDS F3 WHERE F3.Friend_F_UID = 1 AND F3.F_UID = F2.Friend_F_UID AND F3.DateBecameFriends > F.DateBecameFriends)) AS friendCount FROM F_Friends AS F WHERE F.Friend_F_UID = 1; DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  25. Query 3: Data Analysis DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  26. Query 4: Google Trends and Stocks Determine priority stores which don’t stock products that they should, as determined by google trend word popularity. For a given google trend word, find the top 5 cities in which the word is most searched in year 2011. Then, find stores in those cities and which related products they do not stock. This will help us identify how to improve inventory. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  27. Query 4: SQL SELECT Store.StoreID AS Store, Store.City AS City, Product.ProdID AS Prod FROM Store, Store_Carries, Product WHERE Store.City IN (SELECT TOP 5 Google_Trend.City FROM Google_Trend WHERE Google_Trend.Word = 'test') AND Store_Carries.StoreID = Store.StoreID AND Store_Carries.ProdID = Product.ProdID AND Store_Carries.Stock = 0 AND Product.ProdID IN (SELECT Related_Trend.Related_Prod_ID FROM Related_Trend WHERE Related_Trend.Word = 'test'); DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  28. Query 5: Social Network and Purchases Gather Social Networking, Google Trend, and Purchase data over time to formulate predictive models. For a given product, find the number of social network hits of a product, the related trend word hits, and the number of purchases in that product for a given city on a given day. In this way, we can use social network 'buzz' and trend data to predict purchases as a function of time and city. Order by product then timestamp. DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  29. Query 5: SQL SELECT Product.ProdID, Product.ProdName, Purchase.Timestamp, (SELECT COUNT(F_Comment.F_CID) FROM F_Comment WHERE F_Comment.TextStr LIKE '*' + Product.ProdName + '*' AND F_Comment.Timestamp = Purchase.Timestamp) AS SocialNetworkHits, (SELECT SUM(Google_Trend.hits) FROM Google_Trend WHERE Google_Trend.word = Product.ProdName AND Google_Trend.Timestamp = Purchase.Timestamp) AS TrendHits FROM Product, Purchase WHERE Purchase.ProdID = Product.ProdID ORDER BY Purchase.ProdID, Timestamp; DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  30. Query 5: Output DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  31. Query 5: Data Analysis DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  32. Query 5: Data Analysis • Social media networking, Google Trends, and Purchases data used predictively • Group into weekly vectors • Extract significant data using Principle Component Analysis to project onto 2 dimensions. • Cluster data using K-Means  See if we can predict future sales using machine learning DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  33. Query 5: Data Analysis DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  34. Login Interface Employees login here: DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  35. Switchboard Allows employees to insert data in forms or run selected query DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  36. Forms: New Employee Enter information on new Calbee employees DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

  37. Questions? Thank you! DATABASE CLIENT EER/SCHEMA NORMALIZATION QUERIES

More Related