
Consumer Behavior Prediction using Parametric and Nonparametric Methods

Presentation Transcript


  1. Consumer Behavior Prediction using Parametric and Nonparametric Methods Elena Eneva Carnegie Mellon University 25 November 2002 eneva@cs.cmu.edu

  2. Recent Research Projects • Dimensionality Reduction Methods and Fractal Dimension (with Christos Faloutsos) • Learning to Change Taxonomies (with Valery Petrushin, Accenture Technology Labs) • Text Re-Classification Using Existing Schemas (with Yiming Yang) • Learning Within-Sentence Semantic Coherence (with Roni Rosenfeld) • Automatic Document Summarization (with John Lafferty) • Consumer Behavior Prediction (with Alan Montgomery [Business School] and Rich Caruana [SCS])

  3. Outline • Introduction & Motivation • Dataset • Baseline Models • New Hybrid Models • Results • Summary & Work in Progress

  4. How to increase profits? • Without raising the overall price level? • Without more advertising? • Without attracting new customers?

  5. A: Better Pricing Strategies • Encourage the demand for the products which are most profitable for the store • There is a recent trend to consolidate independent stores into chains • Chain pricing doesn't take into account the variability of demand due to neighborhood differences

  6. A: Micro-Marketing • Pricing strategies should adapt to the neighborhood demand • The basis: differences in interbrand competition across stores • Stores can increase operating profit margins by 33% to 83% [Montgomery 1997]

  7. Understanding Demand • Need to understand the relationship between the prices of products in a category and the demand for these products • Price Elasticity of Demand

  8. Price Elasticity • The consumer's response to a price change: E = (ΔQ/Q) / (ΔP/P), where Q is the quantity purchased and P is the price of the product • |E| > 1: elastic demand; |E| < 1: inelastic demand
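A minimal sketch of the elasticity calculation, with hypothetical numbers (the midpoint/arc formula is one standard way to compute it, not necessarily the one used in the talk):

```python
def arc_elasticity(p0, q0, p1, q1):
    """Arc price elasticity of demand between two (price, quantity) points."""
    pct_dq = (q1 - q0) / ((q0 + q1) / 2)  # percent change in quantity (midpoint rule)
    pct_dp = (p1 - p0) / ((p0 + p1) / 2)  # percent change in price (midpoint rule)
    return pct_dq / pct_dp

# Hypothetical example: at $2.00 a product sells 100 units, at $2.20 it sells 80.
print(arc_elasticity(2.00, 100, 2.20, 80))  # about -2.3, i.e. elastic demand
```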

  9. Prices and Quantities • Q demanded of a specific product is a function of the prices of all the products in that category • This function is different for every store, for every category

  10. The Function • [Diagram: a per-store, per-category predictor ("I know your customers") maps the prices of products 1…N in a category to the quantities bought of products 1…N] • Need to multiply this across many stores, many categories

  11. How to find this function? • Traditionally – using parametric models (linear regression)

  12. Data Example

  13. Data Example – Log Space

  14. The Function • [Diagram: as in slide 10, with the prices converted to ln space before the predictor and its outputs converted back to the original space] • Need to multiply this across many stores, many categories

  15. How to find this function? • Traditionally – using parametric models (linear regression) • Recently – using non-parametric models (neural networks)

  16. Our Goal • Advantage of LR: known functional form (linear in log space), extrapolation ability • Advantage of NN: flexibility, accuracy • Take advantage: use the known functional form to bias the NN • Build hybrid models from the baseline models • [Diagram: accuracy vs. robustness plane; LR scores high on robustness, NN high on accuracy, and the new hybrids aim for both]

  17. Evaluation Measure • Root Mean Squared Error (RMS) • the average deviation between the true quantity and the predicted quantity
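A minimal sketch of the metric (standard RMS over true vs. predicted quantities):

```python
import numpy as np

def rms(q_true, q_pred):
    """Root mean squared error between true and predicted quantities."""
    return np.sqrt(np.mean((np.asarray(q_true) - np.asarray(q_pred)) ** 2))
```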

  18. Error Measure – Unbiased • The models predict ln(q), which is an unbiased estimator for ln q • But the naive back-transform exp(ln q̂) is a biased estimator for q: computing the integral over the (lognormal) error distribution gives E[q] = exp(E[ln q] + σ²/2) • We correct the bias by using q̂ = exp(ln q̂ + σ²/2), where σ² is the residual variance in log space
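A sketch of the correction, assuming σ² is estimated from the training residuals in log space:

```python
import numpy as np

def debiased_quantity(log_pred, log_residuals):
    """Back-transform log-space predictions to quantities, correcting the
    lognormal bias with the exp(sigma^2 / 2) factor."""
    sigma2 = np.var(log_residuals)  # estimated residual variance in log space
    return np.exp(log_pred + sigma2 / 2.0)
```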

  19. Dataset • Store-level cash register data at the product level for 100 stores • Store prices updated every week • Two Years of transactions • Chilled Orange Juice category (12 Products)

  20. Models • Hybrids • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections • Baselines • Linear Regression • Neural Networks

  21. Baselines • Linear Regression • Neural Networks

  22. Linear Regression • ln(q) = a + Σ bi ln(pi) • q is the quantity demanded • pi is the price of the i-th product • K products overall • The coefficients a and bi are determined by the condition that the sum of the squared residuals is as small as possible
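A minimal sketch of this baseline (scikit-learn is my choice of library; `prices` is a hypothetical weeks × K array, `quantity` the units sold of one product):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_log_log_regression(prices, quantity):
    """Fit ln(q) = a + sum_i b_i * ln(p_i) by ordinary least squares."""
    model = LinearRegression()
    model.fit(np.log(prices), np.log(quantity))
    return model  # model.intercept_ is a; model.coef_ holds the b_i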

  23. Linear Regression

  24. Results – RMS Error • [Chart: RMS error of the Linear Regression baseline]

  25. Neural Networks • Generic nonlinear function approximators • Collection of basic units (neurons), each computing a (non)linear function of its input • Random initialization • Backpropagation • Early stopping to prevent overfitting

  26. Neural Networks • 1 hidden layer, 100 units, sigmoid activation function
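A sketch of this configuration, again with scikit-learn (the talk predates the library, so this only approximates the original setup):

```python
from sklearn.neural_network import MLPRegressor

# 1 hidden layer of 100 sigmoid ("logistic") units, random initialization,
# backpropagation, early stopping to prevent overfitting
nn = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic",
                  early_stopping=True, random_state=0)
# trained on log prices -> log quantity, like the regression baseline:
# nn.fit(np.log(prices), np.log(quantity))
```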

  27. Results • [Chart: RMS error of LR and NN]

  28. Hybrid Models • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections

  29. Smart Prior Idea: Initialize the NN with a “good” set of weights; help it start from a “smart” prior. • Start the search in a state which already gives a linear approximation • NN training in 2 stages • First, on synthetic data (generated by the LR model) • Second, on the real data

  30. Smart Prior • [Diagram: the LR model generates synthetic data used to pretrain the NN before it sees the real data]
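A sketch of the two-stage training, reusing fit_log_log_regression from above; warm_start makes the second fit continue from the pretrained weights, and sampling synthetic prices uniformly over the observed log-price range is my assumption:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def smart_prior_nn(prices, quantity, n_synth=5000, seed=0):
    rng = np.random.default_rng(seed)
    logp = np.log(prices)
    lr = fit_log_log_regression(prices, quantity)
    # Stage 1: pretrain on synthetic data labeled by the LR model
    synth = rng.uniform(logp.min(axis=0), logp.max(axis=0),
                        size=(n_synth, prices.shape[1]))
    nn = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic",
                      warm_start=True, random_state=seed)
    nn.fit(synth, lr.predict(synth))
    # Stage 2: continue from those weights on the real data
    nn.fit(logp, np.log(quantity))
    return nn
```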

  31. Results • [Chart: RMS error of LR, NN, and Smart Prior]

  32. Multitask Learning [Caruana 1997] Idea: learning an additional related task in parallel, using a shared representation • Adding the output of the LR model (built over the same inputs) as an extra output to the NN • Make the NN share its hidden nodes between both tasks

  33. MultiTask Learning • Custom halting function • Custom RMS function
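A sketch of the shared-representation idea: the network learns two outputs through one hidden layer, the true log quantity and the LR model's prediction. Plain multi-output regression stands in here for the custom halting and RMS functions mentioned on the slide:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def multitask_nn(prices, quantity):
    logp, logq = np.log(prices), np.log(quantity)
    lr = fit_log_log_regression(prices, quantity)
    targets = np.column_stack([logq, lr.predict(logp)])  # main task + LR task
    nn = MLPRegressor(hidden_layer_sizes=(100,), activation="logistic",
                      random_state=0)
    nn.fit(logp, targets)
    return nn  # predict with nn.predict(X)[:, 0] -- only the main output
```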

  34. Results • [Chart: RMS error with MultiTask Learning added]

  35. Jumping Connections • Idea: fusing LR and NN • Modify the architecture of the NN • Add connections which "jump" over the hidden layer • Gives the effect of simulating an LR and a NN together

  36. Jumping Connections
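A minimal PyTorch sketch of the architecture (the framework choice is mine): the "jump" is a linear map from the inputs straight to the output, added to the usual hidden-layer path.

```python
import torch
import torch.nn as nn

class JumpingConnectionsNet(nn.Module):
    """Hidden-layer NN plus a linear connection that jumps over the
    hidden layer, simulating an LR and a NN together."""
    def __init__(self, n_inputs, n_hidden=100):
        super().__init__()
        self.hidden = nn.Linear(n_inputs, n_hidden)
        self.out = nn.Linear(n_hidden, 1)
        self.jump = nn.Linear(n_inputs, 1, bias=False)  # the jumping connections

    def forward(self, log_prices):
        return self.out(torch.sigmoid(self.hidden(log_prices))) + self.jump(log_prices)
```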

  37. Results • [Chart: RMS error with Jumping Connections added]

  38. Frozen Jumping Connections Idea: show the model what the “jump” is for • Same architecture as Jumping Connections, but two training stages • Freeze the weights of the jumping layer, so the network can’t “forget” about the linearity

  39–41. Frozen Jumping Connections • [Diagrams: the two training stages, with the jumping layer's weights frozen so the rest of the network trains around the fixed linear part]
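A sketch of the freezing step, assuming (my reading of the two stages) that the jumping layer is first set from the LR fit and then excluded from further training:

```python
import torch

def freeze_jump(model, lr_coef, lr_intercept):
    """Copy the LR solution into the jumping layer, then freeze it so the
    network can't "forget" the linearity."""
    with torch.no_grad():
        model.jump.weight.copy_(
            torch.as_tensor(lr_coef, dtype=torch.float32).view(1, -1))
        model.out.bias.fill_(float(lr_intercept))  # absorb the LR intercept
    model.jump.weight.requires_grad = False
```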

  42. Results • [Chart: RMS error with Frozen Jumping Connections added]

  43. Models • Baselines: • Linear Regression • Neural Networks • Hybrids • Smart Prior • MultiTask Learning • Jumping Connections • Frozen Jumping Connections • Combinations • Voting • Weighted Average

  44. Combining Models • Idea: Ensemble Learning – use all models and then combine their predictions • Committee Voting • Weighted Average • 2 baseline and 3 hybrid models (Smart Prior, MultiTask Learning, Frozen Jumping Connections)

  45. Committee Voting • Average the predictions of the models
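A one-line sketch, assuming preds is a list of per-model prediction arrays over the same test points:

```python
import numpy as np

def committee_vote(preds):
    """Equal-weight average of all models' predictions."""
    return np.mean(preds, axis=0)
```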

  46. Results • [Chart: RMS error with Committee Voting added]

  47. Weighted Average – Model Regression • Optimal weights determined by a linear regression model over the predictions
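A sketch of the weighted average as a stacked regression, assuming held-out predictions from each model arranged as columns:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_combiner(preds, q_true):
    """Learn the combination weights by regressing the true quantities
    on the models' predictions (one column per model)."""
    return LinearRegression().fit(np.column_stack(preds), q_true)
```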

  48. Results • [Chart: RMS error with the Weighted Average combiner added]

  49. Normalized RMS Error • Compare model performance across stores with different: • Sizes • Ages • Locations • Need to normalize • Compare to baselines • Take the error of the LR benchmark as unit error
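A sketch of the normalization, assuming per-store RMS values for a model and for the LR benchmark:

```python
def normalized_rms(rms_model, rms_lr_benchmark):
    """Express a model's error in units of the LR benchmark's error,
    making stores of different sizes, ages, and locations comparable."""
    return rms_model / rms_lr_benchmark
```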

  50. Normalized RMS Error
