
Consumer Behavior Prediction using Parametric and Nonparametric Methods



  1. Consumer Behavior Prediction using Parametric and Nonparametric Methods Elena Eneva CALD Masters Presentation 19 August 2002 Advisors: Alan Montgomery, Rich Caruana, Christos Faloutsos

  2. Outline • Introduction • Data • Economics Overview • Baseline Models • New Hybrid Models • Results • Conclusions and Future Work

  3. Background • Retail chains aim to customize prices at individual stores • Pricing strategies should adapt to neighborhood demand • Stores can increase operating profit margins by 33% to 83%

  4. Price Elasticity • the consumer's response to a price change • E = (ΔQ/Q) / (ΔP/P), where Q is the quantity purchased and P is the price of the product • |E| < 1: inelastic demand; |E| > 1: elastic demand
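To make the definition concrete, here is a small worked example of arc (midpoint) elasticity in Python; the prices and quantities are invented for illustration and are not from the talk's data.

```python
# Hypothetical illustration of price elasticity: E = (dQ/Q) / (dP/P).
# |E| > 1 means elastic demand, |E| < 1 means inelastic demand.

def arc_elasticity(p0: float, p1: float, q0: float, q1: float) -> float:
    """Midpoint price elasticity of demand between two (price, quantity) points."""
    rel_dq = (q1 - q0) / ((q1 + q0) / 2)  # relative change in quantity
    rel_dp = (p1 - p0) / ((p1 + p0) / 2)  # relative change in price
    return rel_dq / rel_dp

# Price rises from $2.00 to $2.20 and weekly sales fall from 100 to 80 units:
print(arc_elasticity(2.00, 2.20, 100, 80))  # about -2.3, i.e. elastic demand
```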

  5. Data Example

  6. Data Example – Log Space

  7. Assumptions • Independence: substitutes (fresh fruit, other juices), other stores • Stationarity: change over time, holidays

  8. "The" Model [diagram] Prices of products 1…N → convert to ln space → category predictor ("I know your customers") → convert to original space → quantities bought of products 1…N. This model must be multiplied across many stores and many categories.

  9. Converting to Original Space
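Slides 8 and 9 amount to a wrapper that trains and predicts in ln space and converts the result back. A minimal sketch under that reading, assuming any scikit-learn style regressor; the class name LogSpaceModel is ours, not the authors':

```python
import numpy as np
from sklearn.linear_model import LinearRegression

class LogSpaceModel:
    """Wrap any regressor so it fits and predicts in ln space."""

    def __init__(self, base_model):
        self.base_model = base_model  # anything with fit/predict

    def fit(self, prices, quantities):
        # slide 8: convert to ln space before fitting
        self.base_model.fit(np.log(prices), np.log(quantities))
        return self

    def predict(self, prices):
        # slide 9: convert predictions back to the original space; note that
        # a plain exp() understates the conditional mean when the log-space
        # errors are Gaussian (a lognormal correction is the usual fix, but
        # the exact treatment used in the talk is not shown here)
        return np.exp(self.base_model.predict(np.log(prices)))

# one such model would be needed per category and per store
model = LogSpaceModel(LinearRegression())
```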

  10. Existing Methods • Traditionally: parametric models (linear regression) • Recently: nonparametric models (neural networks)

  11. Our Goal • Advantage of LR: known functional form (linear in log space), ability to extrapolate • Advantage of NN: flexibility, accuracy • Take advantage of both: use the known functional form to bias the NN • Build hybrid models from the baseline models, aiming to combine the robustness of LR with the accuracy of NN

  12. Datasets • weekly store-level cash register data at the product level • Chilled Orange Juice category • 2 years • 12 products • 10 randomly selected stores

  13. Evaluation Measure • Root Mean Squared Error (RMS) • the square root of the average squared deviation between the predicted quantity and the true quantity
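A minimal sketch of the measure, assuming NumPy arrays of true and predicted quantities:

```python
import numpy as np

def rms_error(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error between true and predicted quantities."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

print(rms_error(np.array([100.0, 80.0]), np.array([90.0, 85.0])))  # ~7.9
```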

  14. Models • Hybrids • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections • Baselines • Linear Regression • Neural Networks

  15. Baselines • Linear Regression • Neural Networks

  16. Linear Regression • ln q = a + b_1 ln p_1 + … + b_K ln p_K • q is the quantity demanded • p_i is the price of the ith product • K products overall • The coefficients a and b_i are chosen so that the sum of squared residuals is as small as possible.
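A sketch of this baseline with ordinary least squares in log space; the data shapes (104 weeks, 12 products) mirror the datasets slide, but the numbers themselves are random stand-ins:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
prices = rng.uniform(1.0, 5.0, size=(104, 12))  # 2 years of weeks x 12 products
quantity = rng.uniform(50.0, 200.0, size=104)   # demand for one product

lr = LinearRegression()
lr.fit(np.log(prices), np.log(quantity))        # ln q = a + sum_i b_i ln p_i
a, b = lr.intercept_, lr.coef_                  # the fitted a and b_i

pred_quantity = np.exp(lr.predict(np.log(prices)))  # back to original units
```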

  17. Linear Regression

  18. Results (RMS)

  19. Neural Networks • generic nonlinear function approximators • a collection of basic units (neurons), each computing a (non)linear function of its input • trained by backpropagation

  20. Neural Networks • 1 hidden layer, 100 units, sigmoid activation function
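A sketch of the baseline network as slide 20 describes it (one hidden layer, 100 sigmoid units, trained by backpropagation); every other hyperparameter here is an illustrative default, and the data is a random stand-in:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = np.log(rng.uniform(1.0, 5.0, size=(104, 12)))  # ln prices (stand-in data)
y = np.log(rng.uniform(50.0, 200.0, size=104))     # ln quantities

nn = MLPRegressor(hidden_layer_sizes=(100,),       # 1 hidden layer, 100 units
                  activation="logistic",           # sigmoid activation
                  max_iter=2000, random_state=0)
nn.fit(X, y)                                       # backpropagation under the hood
```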

  21. Results (RMS)

  22. Hybrids • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections

  23. Smart Prior Idea: start the NN at a "good" set of weights, a "smart" prior • Take this prior from the known linearity • NN first trained on synthetic data generated by the LR model • NN then trained on the real data
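One way to realize the Smart Prior with scikit-learn, using warm_start so the second fit() resumes from the pretrained weights; the synthetic-sample count is our choice, not the talk's:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_real = np.log(rng.uniform(1.0, 5.0, size=(104, 12)))  # ln prices (stand-in)
y_real = np.log(rng.uniform(50.0, 200.0, size=104))     # ln quantities

lr = LinearRegression().fit(X_real, y_real)

# synthetic data generated by the LR model over the input region
X_synth = np.log(rng.uniform(1.0, 5.0, size=(1000, 12)))
y_synth = lr.predict(X_synth)

nn = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic",
                  warm_start=True, max_iter=500, random_state=0)
nn.fit(X_synth, y_synth)  # step 1: absorb the linear prior
nn.fit(X_real, y_real)    # step 2: continue training on the real data
```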

  24. Smart Prior

  25. Results (RMS)

  26. Multitask Learning Idea: learning an additional related task in parallel, using a shared representation • Adding the output of the LR model (built over the same inputs) as an extra output to the NN • Make the net share its hidden nodes between both tasks • Custom halting function • Custom RMS function
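A rough sketch of the multitask idea using scikit-learn's multi-output regression: the LR prediction becomes a second output sharing the hidden layer. The custom halting and RMS functions from the slide are omitted here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = np.log(rng.uniform(1.0, 5.0, size=(104, 12)))  # ln prices (stand-in)
y = np.log(rng.uniform(50.0, 200.0, size=104))     # ln quantities

y_lr = LinearRegression().fit(X, y).predict(X)     # auxiliary task: LR output
Y = np.column_stack([y, y_lr])                     # [main task, related task]

mtl = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic",
                   max_iter=2000, random_state=0).fit(X, Y)

main_pred = mtl.predict(X)[:, 0]  # only the main-task output is used at test time
```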

  27. MultiTask Learning

  28. Results (RMS)

  29. Jumping Connections Idea: fusing LR and NN • change the architecture: add connections which "jump" over the hidden layer • gives the effect of simulating an LR and an NN together
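A sketch of the architecture in PyTorch (a library choice of ours; the talk predates it): the jump layer is a direct linear path from inputs to output, summed with the hidden-layer path, so the net behaves like an LR and an NN fused together:

```python
import torch
import torch.nn as nn

class JumpingNet(nn.Module):
    """Hidden-layer path plus a linear path that jumps straight to the output."""

    def __init__(self, n_inputs: int, n_hidden: int = 100):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(n_inputs, n_hidden),
                                    nn.Sigmoid(),
                                    nn.Linear(n_hidden, 1))
        self.jump = nn.Linear(n_inputs, 1)  # connections jumping the hidden layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.hidden(x) + self.jump(x)

net = JumpingNet(n_inputs=12)  # 12 products, as in the datasets slide
```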

  30. Jumping Connections

  31. Results (RMS)

  32. Frozen Jumping Connections Idea: you have the linearity, now use it! • same architecture as Jumping Connections, plus really emphasizing the linearity • freeze the weights of the jumping layer, so the network can’t “forget” about the linearity
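The frozen variant, again sketched in PyTorch: the jump layer is initialized from a previously fitted log-space LR model and excluded from gradient updates, so training cannot erase the linear part. The coefficients passed in at the bottom are placeholders:

```python
import torch
import torch.nn as nn

class FrozenJumpingNet(nn.Module):
    """Jumping connections fixed to the LR solution; only the hidden path trains."""

    def __init__(self, lr_coef, lr_intercept: float, n_hidden: int = 100):
        super().__init__()
        n_inputs = len(lr_coef)
        self.hidden = nn.Sequential(nn.Linear(n_inputs, n_hidden),
                                    nn.Sigmoid(),
                                    nn.Linear(n_hidden, 1))
        self.jump = nn.Linear(n_inputs, 1)
        with torch.no_grad():  # copy the LR coefficients into the jump layer
            self.jump.weight.copy_(
                torch.as_tensor(lr_coef, dtype=torch.float32).view(1, -1))
            self.jump.bias.fill_(float(lr_intercept))
        for p in self.jump.parameters():
            p.requires_grad = False  # frozen: the net can't "forget" the linearity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.hidden(x) + self.jump(x)

net = FrozenJumpingNet(lr_coef=[-2.0] + [0.1] * 11, lr_intercept=4.5)
```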

  33. Frozen Jumping Connections

  34. Frozen Jumping Connections

  35. Frozen Jumping Connections

  36. Results (RMS)

  37. Models • Baselines: • Linear Regression • Neural Networks • Hybrids • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections • Combinations • Voting • Weighted Average

  38. Combining Models Idea: Ensemble Learning • Committee Voting: equal weights for each model's prediction • Weighted Average: optimal weights determined by a linear regression model • Combines the 2 baseline and 3 hybrid models (Smart Prior, MultiTask Learning, Frozen Jumping Connections)

  39. Committee Voting • Average the predictions of the models

  40. Results (RMS)

  41. Weighted Average – Model Regression • Linear regression on baselines and hybrid models to determine vote weights
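A compact sketch of both combination schemes, assuming a matrix of per-model predictions; the weighted average amounts to linear stacking. The data here is invented:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def committee_voting(preds: np.ndarray) -> np.ndarray:
    """Slide 39: equal weight for each model's prediction."""
    return preds.mean(axis=1)

def weighted_average(preds: np.ndarray, y_true: np.ndarray) -> np.ndarray:
    """Slide 41: a linear regression over the models' predictions learns the weights."""
    stacker = LinearRegression().fit(preds, y_true)
    return stacker.predict(preds)

# preds: one column per model (2 baselines + 3 hybrids)
preds = np.random.default_rng(0).uniform(80.0, 120.0, size=(104, 5))
y_true = preds.mean(axis=1) + 1.0
print(committee_voting(preds)[:3], weighted_average(preds, y_true)[:3])
```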

  42. Results (RMS)

  43. Normalized RMS Error • Compare model performance across stores • Stores differ in size, age, location, etc. • Need to normalize • Compare to baselines: take the error of the LR benchmark as unit error
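Since the normalization is just a ratio against the LR benchmark, the benchmark itself scores exactly 1.0 by construction; a one-line sketch:

```python
def normalized_rms(rms_model: float, rms_lr_benchmark: float) -> float:
    """Error relative to the LR benchmark, comparable across stores."""
    return rms_model / rms_lr_benchmark
```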

  44. Normalized RMS Error

  45. Conclusions • Clearly improved models for customer choice prediction • Will allow stores to price the products more strategically and optimize profits • Maintain better inventories • Understand product interaction

  46. Future Work Ideas • analyze Weighted Average model • compare extrapolation ability of new models • use other domain knowledge • shrinkage model – a “super” store model with data pooled across all stores

  47. Acknowledgements I would like to thank my advisors and my CALDling friends and colleagues

  48. The Most Important Slide for this presentation and the paper: www.cs.cmu.edu/~eneva/research.htm eneva@cs.cmu.edu

  49. References • Montgomery, A. (1997). Creating Micro-Marketing Pricing Strategies Using Supermarket Scanner Data. • West, P., Brockett, P., and Golden, L. (1997). A Comparative Analysis of Neural Networks and Statistical Methods for Predicting Consumer Choice. • Guadagni, P. and Little, J. (1983). A Logit Model of Brand Choice Calibrated on Scanner Data. • Rossi, P. and Allenby, G. (1993). A Bayesian Approach to Estimating Household Parameters.
