MODELS & DATA
- A Four-Box Model of a DSS / BI System
- Implicit vs Explicit Models
- Typologies of Models
- Types of Data
- The Model-Data Interdependency
- Is Quality Data Worth It?
- A Predictive Model for Evaluating Pricing Policies
USER INTERFACE
DECISION MODELS
A FOUR-BOX MODEL OF A DSS/BI SYSTEM
Time series analysis
Regression analysis, etc.
ANALYSIS OF DATA
Most frequently used operations are simple:
Segregating data into groups
Picking out exceptions
Ranking, Plotting, Making tables, etc.
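The simple operations listed above can be sketched in a few lines of Python. This is only an illustration: the records, the field layout, and the threshold of 50 units are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical weekly sales records: (region, product, units). Illustrative only.
records = [
    ("East", "A", 120),
    ("East", "B", 45),
    ("West", "A", 200),
    ("West", "B", 15),
    ("South", "A", 90),
]


def group_by_region(rows):
    """Segregate records into groups keyed by region."""
    groups = defaultdict(list)
    for region, product, units in rows:
        groups[region].append((product, units))
    return dict(groups)


def exceptions(rows, threshold):
    """Pick out records whose units fall below a chosen threshold."""
    return [row for row in rows if row[2] < threshold]


def ranked(rows):
    """Rank records from highest to lowest units."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


groups = group_by_region(records)
low_sellers = exceptions(records, threshold=50)  # threshold is an assumption
ranking = ranked(records)
```

Even here a model is at work: choosing to group by region and to flag sales below 50 units reflects a prior idea of what is interesting in the data.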
Whenever a manager (or anybody else) looks at data, he or she has a preconceived idea of how the world works and therefore of what is interesting or worthwhile in the data. We shall call such ideas models.
John D. C. Little
Models provide the means for converting data into actionable information...
- Models carried in people's heads
- Prose Models
- Flow Models
- Mathematical Models
IMPLICIT vs EXPLICIT MODELS
Why do managers use implicit models?
What are the benefits of explicating an implicit model?
What problems are encountered when explicating an implicit model?
Linear vs. Non-linear Models
How time is handled?
Static vs. Dynamic Models
How risk is handled?
Deterministic vs. Stochastic Models
At what level of detail?
Micro vs. Macro Models
A Typology of Models - How is the Real World Represented?
Problems: Model may not fit the problem
More data needed
More time and cost
Higher intellectual cost
"SATISFICING" vs. "OPTIMIZING" IN DECISION-MAKING
Choose a solution that is good enough using manager's rules of thumb or heuristics.
Benefits: Saves time and cost
Easy to implement
1. Define the Problem to be Addressed by the Model
2. List Relevant Factors - Do not worry about Data
3. Select the Most Critical Factors
4. Link the Selected Factors
5. Obtain the Required Data
6. Develop the System
7. Validate the Output from the System
8. Sensitivity Analysis of the Output from the System
Oakwood Medical Labs, Detroit
- Arranges the 800 stops of 26 drivers each day to pick up blood samples from, and drop off time-sensitive results to, 1000 clinics and hospitals
Sleepy’s - A Mattress Chain in Bethpage, N.Y.
- Promises quicker home delivery than its competition
Homemakers, a Furniture Superstore in Des Moines, Iowa
- Offers a two-hour window on next-day home delivery
- Previously, “it would take two days to prepare the schedules and, even though we used to give a 4-hour delivery window, maybe we made it on time or maybe not.”
Source: Wall Street Journal, Apr 2, 1998
RIMMS: A Model-Based System For Efficient Routing & Scheduling
Users add data on scheduled stops, pickups and individual customer time-demands
Model calculates the best way to manage a day’s deliveries and pick-ups
Users can incorporate soft data on other relevant factors, for example:
- courier pick-ups take several minutes longer than drop-offs, a devilish problem that can throw off schedules
- how a storm the previous night can slow driving speeds
Biggest Strength: Good Data
- e.g. linear model of sales to advertising
Models that are too big.
- require too much data
- "larger" is not always "better"
What is a "good" model ?
easy to understand
complete on important issues
just enough detail for operational accuracy
judicious use of all types of data
"BAD" vs "GOOD" MODELS
Data readily available
BUT ... Ignore what causes sales
Better because they link sales to “explanatory” variables
However ...
... Which variables? Cost of data?
... What type of relationship?
... Accuracy of projections of the explanatory variables?
An Example: Forecasting Sales
Rx Sales = 527 + 0.13*Symptoms + 74*(Our Prom / Comp Prom)
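A minimal sketch of evaluating the forecasting equation above. The coefficients are taken from the slide; the function name, argument names, and input values are hypothetical, and the units of each variable are not stated in the source.

```python
def rx_sales(symptoms: float, our_promotion: float, competitor_promotion: float) -> float:
    """Evaluate Rx Sales = 527 + 0.13*Symptoms + 74*(Our Prom / Comp Prom)."""
    if competitor_promotion <= 0:
        # The promotion ratio is undefined without competitor spending.
        raise ValueError("competitor_promotion must be positive")
    return 527 + 0.13 * symptoms + 74 * (our_promotion / competitor_promotion)


# Hypothetical inputs, chosen only to show how the promotion ratio moves the forecast.
baseline = rx_sales(symptoms=1000, our_promotion=50, competitor_promotion=50)
doubled = rx_sales(symptoms=1000, our_promotion=100, competitor_promotion=50)
```

Because promotion enters as a ratio, the forecast depends on the projected competitor spending as well as our own. This is the "accuracy of projections of the explanatory variables" concern raised above.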
Sometimes retrieval questions come up of course, but most often the answers to important questions require non-trivial manipulation of stored data. Knowing this tells us much about the kind of software required. For example, a database management system is not enough.
- John Little (1979)
“Data” has to be converted into “Information” that
triggers managerial action.
The conversion process is critical to get value from the
Avoids the “completeness” trap in building a data warehouse
A “good” model...
complete on important issues
just enough detail for operational accuracy
judicious use of hard and soft data
Models Help in Data Conversion
. . . More Time to Develop
. . . And Cost More: not just $ but the intellectual cost
People tend to reject what they do not understand. The manager carries responsibility for outcomes. We should not be surprised if he prefers a simple analysis that he can grasp, even though it may have qualitative structure, broad assumptions, and only a little relevant data, to a complex model whose assumptions may be partially hidden or couched in jargon and whose parameters may be the result of obscure statistical manipulations.
- John Little (1970)
Better Models Require . . .
Design a Prototype scaled to the barest minimum
Collect data for the Prototype
- Lowest data cost
Develop Prototype using real data
Users evaluate benefits of system
Data Problem: Serious gaps in operational data
Available data on promotions: How much was spent
When the bills were paid
Missing key data: When were the promotions run
...to correlate with sales data
Issue: Data problem is solvable in principle
But... Is it worth the effort and cost?
Case Example: A Consumer Packaged Goods Company
sales, promotion expenditures and dates, margins
Detailed data needed for useful information
by packs for each brand and by markets
weekly data for capturing sales fluctuations
two years of data to compare pre- with post-deal sales levels
Cost of data
Manual effort to extract dates of promotions from logbooks
2 brands, a major brand and a new brand
8 markets (out of 50), 3 large, 3 medium and 2 small
Demonstrated the value of collecting the missing data and building an integrated database
Led to the development of a promotion-event calendar system
The Low-Cost Prototype - To Assess Value of Data
Because of the narrow focus of operational systems
Operational systems are an important source of data for decision support
Design of operational systems must incorporate data requirements of management support systems
When implementing new Human Resource Information Systems (e.g., PeopleSoft), are the data requirements of human resource management considered? For evaluating hiring sources? Career development? Etc.
Private Sector and Public Sector
Predicting Customer Response is Difficult
Past behavior is of limited value
Competitors’ reactions to “our” price are unpredictable
Even More Difficult in the Public Sector
Bottom-line impact is not enough
Must consider: Who is affected? How?
The Product Pricing Problem
Exhibit “threshold effects”
Price is only one factor -- other decision variables (e.g., distribution, promotion) interact with price to affect demand
External factors, about which we have imperfect information, impact pricing decisions
Price and Demand Relationships Are Complex
Essentially a “flat” fare
Insensitive to distance traveled
Inequities of Present Fare Structure
Favors long trips at the expense of short ones
Long-distance riders -- mostly suburban commuters with relatively high incomes.
Short-distance riders -- mostly urban residents traveling off-peak for discretionary purposes
Thus, distance inequities often imply social inequities
The Transit Pricing Problem
e.g., with a 25 cent Flat Fare:
Rider #1 travels 1 mile and pays 25 cents per mile
Rider #2 travels 5 miles and pays 5 cents per mile
Drawback of Flat Fares: Long-distance riders are subsidized by short-distance riders
Potential of Distance-Based Fares to:
Reduce inequities in fare per mile
Increase revenue
Why Consider Distance-Based Fares?
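The fare-per-mile comparison above can be checked with a short sketch. The 25 cent flat fare and the two trip lengths come from the slide. The distance-based policy (10 cent base fare plus 5 cents per mile) is the one used in the deck's worked example. The function and variable names are illustrative.

```python
def fare_per_mile(fare_cents: float, miles: float) -> float:
    """Cents paid per mile traveled for a single trip."""
    return fare_cents / miles


def distance_fare(miles: float, base_cents: float = 10.0, per_mile_cents: float = 5.0) -> float:
    """Distance-based fare: a base fare plus an increment for each mile."""
    return base_cents + per_mile_cents * miles


FLAT_FARE_CENTS = 25.0
trip_lengths = (1, 5)  # Rider #1 and Rider #2 from the slide

flat = {m: fare_per_mile(FLAT_FARE_CENTS, m) for m in trip_lengths}
distance_based = {m: fare_per_mile(distance_fare(m), m) for m in trip_lengths}
```

Under the flat fare the short-trip rider pays five times as much per mile as the long-trip rider (25 cents versus 5 cents). Under this distance-based policy the gap narrows to 15 cents versus 7 cents per mile.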
Relate a measure of travel demand to a set of explanatory (“independent”) variables
Measures of travel demand:
# of passengers or # of trips
Demographic variables (e.g., median income), trip characteristics (e.g., peak/off-peak), and decision variables (e.g., fares)
Macro Models for Demand Forecasting - The Conventional Tool
Would a price increase hurt inner city residents more or less than suburban commuters?
Would loss in patronage be greater off-peak than peak?
Would a lower fare benefit work trips? Shopping trips?
A Micro Model at the level of the individual rider is needed to handle the variety of ridership characteristics such as age, income, place of residence, time and purpose of travel, etc.
Macro Models versus Micro Models
The “what if” forecasts for the individual riders are then aggregated by age, income, purpose of trip, etc. to show what groups of riders would be affected by the fare change.
Micro - Simulation Model
Forecast transit usage and revenue for individual riders in the sample survey.
Weight the individual rider’s figures by an expansion factor to project the results to the population.
Aggregate the weighted figures by the desired ridership categories to assess the revenue and equity effects.
Gist of the Micro Model for Transit Pricing
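The weighting and aggregation steps above might look like the following sketch. The survey records, field names, and expansion factors are hypothetical. The per-rider forecasts are taken as given here so that only the expansion and aggregation logic is shown.

```python
from collections import defaultdict

# Hypothetical sample-survey records: a forecast of weekly trips and revenue (cents)
# for each surveyed rider, plus the number of riders in the population that the
# surveyed rider represents (the expansion factor).
riders = [
    {"income": "low", "trips": 10.0, "revenue": 250.0, "expansion": 100},
    {"income": "low", "trips": 4.0, "revenue": 100.0, "expansion": 150},
    {"income": "high", "trips": 8.0, "revenue": 280.0, "expansion": 200},
]


def aggregate(sample, category):
    """Weight each rider's figures by the expansion factor and total them by category."""
    totals = defaultdict(lambda: {"trips": 0.0, "revenue": 0.0})
    for rider in sample:
        key = rider[category]
        totals[key]["trips"] += rider["trips"] * rider["expansion"]
        totals[key]["revenue"] += rider["revenue"] * rider["expansion"]
    return dict(totals)


by_income = aggregate(riders, "income")
```

Running the same aggregation on current and forecast figures, for any category recorded in the survey, gives the group-level comparison needed to judge equity effects.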
The “what if” demand for a new fare policy is determined through the fare elasticity appropriate for that rider -- the “Simulation” approach
Merits of the Micro-Simulation Approach
Model is easy to understand -- critical, since otherwise the user will not risk using it for pricing decisions; even more so when a multiplicity of parties is involved, as in transit pricing
Problem: The more elaborate the model, the more data needed to set up the model
For the model to be useful, it should be:
Simple enough for transit managers to readily understand but not simplistic
Complete on important issues for a valid assessment of the impact of new fare policies
A model that does not rely on historical data for calibration
Generating outputs that the user finds easy to interpret
Design of the Transit Pricing Model
Above equation adjusts the current demand through a ratio based on the fare elasticity that is appropriate for that rider
Micro-simulation is better than a macro regression model in an important way -- the model is robust because reasonable values for the elasticity will not yield unreasonable values for forecast demand
What is the Model?
Proposed distance-based fare policy: a base fare of 10¢ and a 5¢ increment per mile
New fare for this rider is 35¢ per trip (consistent with a 5-mile trip: 10¢ + 5 x 5¢), up from the current 25¢ flat fare
% change in fare paid by this rider = (10¢/25¢) x 100 = 40%
% change in frequency of ridership = (% change in fare paid) x E
E = “fare elasticity of demand” = % change in demand for a 1% change in fare
e.g., an E value of -.25 implies that a 1% increase in fare will reduce demand by .25%
Hence, for the 40% increase in fare paid by this rider under the new policy, the percent reduction in demand is predicted to be 10%
An Example
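The arithmetic of this example can be reproduced in a short sketch. The fare parameters and the elasticity of -0.25 come from the slide. The 5-mile trip length is an assumption inferred from the 35 cent fare, and the function names are illustrative.

```python
def distance_fare(miles: float, base_cents: float = 10.0, per_mile_cents: float = 5.0) -> float:
    """Proposed policy: a base fare plus an increment for each mile."""
    return base_cents + per_mile_cents * miles


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) * 100.0 / old


old_fare = 25.0               # current flat fare, in cents
new_fare = distance_fare(5)   # assumed 5-mile trip
fare_change_pct = pct_change(old_fare, new_fare)
elasticity = -0.25            # E: % change in demand for a 1% change in fare
demand_change_pct = fare_change_pct * elasticity
```

Here `fare_change_pct` is 40 and `demand_change_pct` is -10, matching the 40% fare increase and 10% reduction in demand stated above.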
Calibration of the model involves the estimation of only one parameter - fare elasticity
To simplify the calibration, segment the sample of riders into groups that are expected to have the same elasticity
Since fare elasticity has a clear operational meaning, it is feasible for the transit managers to judgmentally segment the market and estimate fare elasticities for each segment
Key Features of the Model
Enables more policy alternatives to be examined than if the manager relied on judgment alone
Uses sensitivity analysis to test the robustness of the conclusions with regard to the soft data inputs used in the analysis
Key element of this concept is its approach to calibration: Use the manager’s judgment, especially when available data are either inadequate or dirty
Decision Calculus Concept
The model calculates the % change in frequency of ridership for the proposed fare change based on the elasticity appropriate for that rider
The model applies this % change to current weekly frequency of ridership to obtain predicted new frequency with the proposed policy
The model calculates the fare paid per trip under the new policy and the predicted weekly revenue for the individual rider
How the Model Works
The expanded ridership and revenue figures are then aggregated according to income, age, etc.
Computer output includes % changes in ridership and revenue to facilitate “before” and “after” comparisons
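A minimal sketch of the per-rider steps described above, assuming the linear percentage form used in the deck's worked example (% change in ridership = % change in fare x E). The model's exact equation is not reproduced in the slide text, so this form is an assumption. The function name, field names, and the rider's 10 weekly trips are hypothetical.

```python
def forecast_rider(weekly_trips: float, old_fare: float, new_fare: float, elasticity: float) -> dict:
    """Forecast one rider's weekly trips and revenue (cents) under a new fare."""
    fare_change_pct = (new_fare - old_fare) * 100.0 / old_fare
    ridership_change_pct = fare_change_pct * elasticity
    # Apply the % change to current frequency; never forecast negative trips.
    new_trips = max(0.0, weekly_trips * (1 + ridership_change_pct / 100.0))
    old_revenue = weekly_trips * old_fare
    new_revenue = new_trips * new_fare
    return {
        "new_trips": new_trips,
        "old_revenue": old_revenue,
        "new_revenue": new_revenue,
        "ridership_change_pct": ridership_change_pct,
        "revenue_change_pct": (new_revenue - old_revenue) * 100.0 / old_revenue,
    }


# Hypothetical rider: 10 trips a week, 25 cent flat fare rising to 35 cents.
result = forecast_rider(weekly_trips=10, old_fare=25.0, new_fare=35.0, elasticity=-0.25)
```

For this rider the sketch forecasts 9 trips a week and weekly revenue rising from 250 to 315 cents, the kind of "before" and "after" comparison the output is meant to support.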
Since all riders in the population do not react in the same way to fare changes, the population should be first subdivided into segments whose members are expected to be fairly similar in terms of their responses to fare changes
Since elasticity estimates are soft, sensitivity analysis has to be done using multiple elasticity values to select a fare policy that performs in a satisficing manner with the range of estimates used
Why the Model Works
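The sensitivity analysis described above can be sketched as a loop over a range of elasticity estimates. The elasticity values, the single hypothetical rider, and the acceptance rule (revenue must not fall below its current level for any estimate) are all assumptions made for illustration. The linear percentage form follows the deck's worked example.

```python
def weekly_revenue(trips: float, old_fare: float, new_fare: float, elasticity: float) -> float:
    """Forecast weekly revenue (cents) for one rider under the new fare."""
    ridership_change = (new_fare - old_fare) / old_fare * elasticity
    return max(0.0, trips * (1 + ridership_change)) * new_fare


# Soft, judgmental elasticity estimates: low, middle, and high (all hypothetical).
elasticities = [-0.1, -0.25, -0.5]

current_revenue = 10 * 25.0  # 10 weekly trips at the 25 cent flat fare
forecasts = {e: weekly_revenue(10, 25.0, 35.0, e) for e in elasticities}

# Satisficing check: the policy is acceptable if it is good enough for every estimate.
is_robust = all(revenue >= current_revenue for revenue in forecasts.values())
```

If a policy passes only for some of the elasticity values, the conclusion depends on the soft inputs, and either the policy or the estimates deserve a closer look.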