Ameliorating Mental Mistakes in Tradeoff Studies Terry Bahill Systems and Industrial Engineering University of Arizona email@example.com ©, 1993-2010, Bahill This file is located at http://www.sie.arizona.edu/sysengr/slides/
Acknowledgement This research was supported by AFOSR/MURI F49620-03-1-0377.
Reference Smith, E. D., Son, Y. J., Piattelli-Palmarini, M. and Bahill, A. T., Ameliorating mental mistakes in tradeoff studies, Systems Engineering, 10:3, 222-240, 2007. All of the material in this presentation is based on peer-reviewed journal papers. None of it comes from the Internet.
Present situation • Tradeoff studies are broadly recognized by CMMI and recommended as a Decision Analysis and Resolution (DAR) method for simultaneously considering multiple alternatives with many criteria. • Tradeoff studies, which involve human • calibration • data updating • numerical judgment, • are often muddled by analysts • are often distrusted by decision makers.
Resolution • The decision-making fields of • Judgment and Decision Making • Cognitive Science • Experimental Economics have a large body of research on human biases and errors in considering numerical judgments and criteria-based choices. • Similarities between their experiments and the elements of tradeoff studies show that tradeoff studies are susceptible to human biases.
Nobel Prize Daniel Kahneman won the Nobel Prize in Economics in 2002 "for having integrated insights from psychological research into economic science, especially concerning human judgment and decision-making under uncertainty."
Judgment and decision making experiments • Allais paradox • Thaler paradox • Ellsberg paradox • Reflection effect • Certainty effect • Law of small numbers • Ranking in subjective probability • Strength and weight • Value versus Utility • Probabilities • Risks and uncertainties • Prospects • Time discounting • Elimination by aspects
Our goal • We want to help people create tradeoff studies to choose among alternatives. • We want people to have confidence that they made the right decision. • We recommend actions that will help people avoid making specific mental mistakes in doing tradeoff studies. • These recommendations are the prime deliverable of this research effort.
Eric Smith studied hundreds of experimental papers and isolated seven dozen biases that could affect the components of tradeoff studies. His results are summarized in this Excel spreadsheet.
Components of a tradeoff study • Problem statement • Evaluation criteria • Weights of importance • Alternative solutions • Evaluation data • Scoring functions • Normalized scores • Combining functions • Preferred alternatives • Sensitivity analysis
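As a rough illustration of how several of these components fit together, the sketch below combines weights of importance and normalized scores with a weighted-sum combining function and ranks the alternatives. The alternatives, criteria, weights, scores, and function names are all hypothetical, and the weighted sum is only one of several possible combining functions.

```python
# Hypothetical tradeoff matrix: all names and numbers are illustrative only.
weights = {"performance": 0.5, "cost": 0.3, "schedule": 0.2}

# Normalized scores in [0, 1] for each alternative on each criterion.
scores = {
    "status quo": {"performance": 0.2, "cost": 0.4, "schedule": 0.5},
    "alternative A": {"performance": 0.8, "cost": 0.6, "schedule": 0.5},
    "alternative B": {"performance": 0.6, "cost": 0.9, "schedule": 0.7},
}


def normalize_weights(raw):
    """Scale the weights of importance so that they sum to 1."""
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def weighted_sum(alternative_scores, norm_weights):
    """Combining function: sum of weight times normalized score."""
    return sum(norm_weights[c] * alternative_scores[c] for c in norm_weights)


def rank_alternatives(all_scores, raw_weights):
    """Return (name, combined score) pairs, best alternative first."""
    norm = normalize_weights(raw_weights)
    combined = {name: weighted_sum(s, norm) for name, s in all_scores.items()}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranking = rank_alternatives(scores, weights)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

Rerunning rank_alternatives with slightly perturbed weights is one simple way to start a sensitivity analysis: if the preferred alternative changes under small perturbations, the result deserves a closer look.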
Mental mistakes • Emotions, cognitive illusions, conscious and unconscious biases, fallacies, fear of regret and the use of heuristics can cause mistakes in tradeoff studies. • We will group all these terms under the phrase mental mistakes. • The following four dozen slides list specific mental mistakes and state how they can affect particular components of tradeoff studies.
Problem Statement Mistakes* • Bad problem stating • Incorrect phrasing • Ambiguous problem stating • Substituting a related attribute • Feeling invincible
Bad problem stating^ • “The problem of the design of a system must be stated strictly in terms of its requirements, not in terms of a solution or a class of solutions.” Wayne Wymore • It is a mistake to state the problem in terms of a solution instead of the customer needs and expectations. • Recommendation: Communicate with and question the customer in order to determine his or her values and needs.
Incorrect phrasing • Phrasing of the question affects the answer. • Problem-M: Several Australian mammal species are nearly wiped out by hunters. Intervention: Contribute to a fund to provide a safe breeding area for these species. • Problem-W: Skin cancer from sun exposure is common among farm workers. Intervention: Support free medical checkups for threatened groups. • When asked about giving money, subjects said they would contribute more money to provide a safe breeding area than for free medical checkups. • However, when asked which intervention they would support, they said they would rather support free medical checkups. • Recommendation: Questions designed to get a value for a criterion should be tightly coupled to the criterion.
Phrasing* • The way you phrase the question will determine the answer you get. • When asked whether they would approve surgery in a hypothetical medical emergency, many more people accepted surgery when the chance of survival was given as 99 percent than when the chance of death was given as 1 percent.
Preference reversals* • $ bet: has the higher dollar value • P bet: has the higher probability • Although the expected values are the same, most people preferred to play the P bet; however, most people wanted a higher selling price for the $ bet.
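A small worked example may make the preference reversal concrete. The two bets below are hypothetical (the slide does not give the actual probabilities or payoffs); they are chosen only so that both bets have exactly the same expected value.

```python
from fractions import Fraction


def expected_value(probability, payoff):
    """Expected value of a bet that pays `payoff` with `probability`, else nothing."""
    return probability * payoff


# Hypothetical bets, chosen so that the expected values match exactly.
p_bet = {"probability": Fraction(9, 10), "payoff": 4}        # high chance, small prize
dollar_bet = {"probability": Fraction(1, 10), "payoff": 36}  # low chance, large prize

ev_p = expected_value(**p_bet)
ev_dollar = expected_value(**dollar_bet)

print(f"P bet expected value: ${float(ev_p):.2f}")
print(f"$ bet expected value: ${float(ev_dollar):.2f}")
```

Exact fractions are used so that the equality of the two expected values does not depend on floating-point rounding.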
Ambiguous problem stating^ • If a problem statement is vague (such as “work for the public good”) then proposed solutions can vary greatly, and derive support for very different reasons and in different ways. • Recommendation: State the problem without ambiguity; which is more ambiguous (1) to allocate physical resources or (2) to influence perceptions through psychology?
Substituting a simpler process • Sometimes a person substitutes a related entity that comes to mind more readily. In effect, “people who are confronted with a difficult question sometimes answer an easier one instead.” • When making a decision that should be decided by a tradeoff study, people sometimes substitute a simpler decision process. • Recommendation: Decision makers should realize that a premature reduction of a tradeoff study to a simpler decision process is a common heuristic that prevents thorough consideration of the original decision.
Feeling invincible* • Teen-age boys are notorious for thinking • I won’t get caught • I can’t get hurt • I will avoid car accidents • I won’t cause an unwanted pregnancy • I won’t get a sexually transmitted disease • I don’t have to back up my hard drive, my computer won’t crash • They can’t do that to me • In 1912, the White Star line said that the Titanic was ‘unsinkable.’ • Recommendation: Decision makers must learn and have the freedom to question statements that are obviously true and other sacred cows.
Feeling invincible2 • Codebreakers have been routinely breaking codes for over 600 years. • During WWII the American generals often had copies of Hitler’s battle orders before the German generals. • Yet the Americans did not think that the Germans were breaking the American codes. (I do not know for a fact that they were.)
Evaluation Criteria Mistakes • Dependent criteria • Relying on personal experience • Forer Effect
Dependent criteria* • Evaluation criteria should be independent.^ • For evaluating humans, Height and Weight are not independent, whereas Sex (male versus female) and Intelligence Quotient are independent. • Recommendation: Dependent criteria should be grouped together as subcriteria.
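One rough way to screen a pair of criteria for dependence is to compute the correlation of their evaluation data across alternatives. The sketch below does this with the Pearson correlation coefficient; the data, the threshold, and the function name are hypothetical, and the slides do not prescribe this particular check.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical evaluation data for five people (illustrative numbers only).
height_cm = [155, 162, 170, 178, 186]
weight_kg = [52, 60, 68, 77, 88]
iq = [110, 95, 120, 98, 105]

THRESHOLD = 0.7  # assumed cutoff for "dependent"; choose a value per project

r_hw = pearson_r(height_cm, weight_kg)
r_hi = pearson_r(height_cm, iq)
print(f"height vs weight: r = {r_hw:.2f}, dependent: {abs(r_hw) > THRESHOLD}")
print(f"height vs IQ:     r = {r_hi:.2f}, dependent: {abs(r_hi) > THRESHOLD}")
```

Correlation only detects linear dependence, so a low value does not prove that two criteria are independent; it is a screening aid, not a substitute for judgment.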
Relying on personal experience • “We are all prisoners of our own experience.” • Criteria may be chosen by the analyst's experience, with insufficient customer input and environmental confirmation. • Recommendation: It is imperative to conduct thorough searches for objective knowledge. Talk to your customer and other stakeholders.
Forer effect • The analyst might fail to question or re-write criteria from a legacy tradeoff study that originated from a perceived authority and is now seemingly adaptable to the tradeoff at hand. • Recommendations: • Give some time to considering and formulating criteria from scratch, before consulting and possibly reusing previously written criteria. • Generic criteria taken from the company process assets library must be tailored for the project at hand.
Weight of Importance Mistakes • Choice versus calculation • Ignoring severity amplifiers
Choice versus calculation • Choice: 67% chose Program X • Calculation: 4% calculated $55M or more
Ignoring severity amplifiers* • Different people will give different weights of importance because of their perceptions of severity amplifiers. • Recommendation: Intersubject variability can be reduced with education, peer review of the assigned weights and group discussions. Keep a broad view of the whole organization, so that criteria in one area are considered in light of all others.
Alternative Solution Mistakes • Serial consideration of alternatives • Isolated or juxtaposed alternatives • Conflicting criteria • Adding alternatives • Maintaining the status quo • Uneven level of detail
Serial consideration of alternatives • When solving a problem, people seize on a hypothesis and hold on to it until it is disproved. • Once the hypothesis is disproved, they will progress to the next hypothesis and hold on to it until it is disproved. • This bias can persist throughout a tradeoff study, as an analyst uses the whole study to try to prove that a currently favored alternative is the best. • Recommendation: Alternative solutions should be evaluated in parallel from the beginning of the tradeoff study, so that a collective and impartial consideration will permit the selection of the best alternative from a complete solution space.
Isolated or juxtaposed alternatives • Two dictionaries were evaluated in isolation and juxtaposed. • When evaluated in isolation, subjects were willing to pay more for dictionary A than for B. However, when evaluated at the same time, subjects were willing to pay more for dictionary B. • Recommendations: • New alternative solutions should be subject to elimination only after comparison to all alternative solutions. • Group alternatives by affinities.
Conflicting criteria “You can either select one of these gambles or you can pay $1 to add one more gamble to the choice set. The added gamble will be selected at random from the list you reviewed.”
Adding alternatives • Patient M. S. is a 52-year-old journalist with a mini-stroke. She had a similar episode ten days ago that lasted about 12 hours. Angiography shows a 70% constriction of the left carotid artery. Past medical history is noteworthy for past alcoholism (no liver cirrhosis) and mild diabetes (diet controlled). • Patient A. R. is a 72-year-old retired police officer with a mini-stroke. He had two similar episodes in the last three months with the last occurring one month ago. Angiography shows a 90% constriction of the right carotid artery. He has no concurrent medical problems and is in generally good health. • On which patient would you operate first? 38% of the physicians chose Patient A. R.
The additional alternative • Patient P. K. is a 55-year-old bartender with a mini-stroke. She had one similar episode a week ago that lasted about 6 hours. Angiography shows a 75% constriction of the ipsilateral carotid artery. Past medical history is noteworthy for ongoing cigarette smoking (since age 15 at a rate of one pack per day). • In the group of deciders that was given all three patients, 58% of the physicians now chose Patient A. R., a big increase. • Recommendation: • All of the alternative solutions should be evaluated in parallel from the beginning of the tradeoff study. • If an alternative must be added in the middle of a study, then the most similar alternative will lose support.
Maintaining the status quo • Students were paid $1.50. • Then they were asked to trade their $1.50 for a metal Zebra pen: 25% kept the $1.50. • Then they were asked to trade their $1.50 for either a metal Zebra pen or two plastic Pilot pens: 53% kept the $1.50. • An increase in the conflict of the choice increased their decision to stay with the status quo. • Recommendation: Do not needlessly increase the number of alternatives.
Uneven level of detail • Uneven level of detail in the description of the alternatives might confuse a naive reader. • If alternatives are abstracted at different levels of detail, it will be difficult to assign scores to the alternatives.
Evaluation Data Mistakes • Relying on personal experience • Magnitude and reliability • Judging probabilities poorly
Relying on personal experience • Estimates for evaluation data may faultily come from personal experiences. • People may be completely oblivious to things they have not experienced, or they may think that their limited experience is complete. • What people think they know may be different from what they actually know. • Recommendations: • The source of evaluation data must be subject to peer and public review. • Decision analysts must be willing to yield absolute control over evaluation data.
Magnitude and reliability • People tend to judge the validity of data first on its magnitude (‘strength’), and then according to its reliability (‘weight’). • Therefore, data with outstanding magnitudes but poor reliability are likely to be chosen and used. • Recommendation: Either data with uniform reliability should be used, or the speciousness of data should be taken into account in the Risk portion of a tradeoff study.
Judging probabilities poorly • Gambler’s fallacy • Over-Alternation fallacy • Conjunction fallacy • Disjunction fallacy • Law of small numbers • Extensionality fallacies • Mis-Estimation of probabilities • Ease of Representation: Typicality • Sub-Additivity • Super-Additivity • Confirmation bias • Certainty effect • Ambiguity aversion • Aversion to sequences of chance events • Delay-Speedup asymmetry • Loss/Gain discounting • Frequency Illusions • Base-Rate Neglect • Probabilistic illusions
Ignoring the first measurement1 • Often when a measurement (test) reveals an unexpected result, the physician and/or the patient will ask for a second measurement. • If the second measurement is pleasing, then the first measurement is discarded and only the result of the last measurement is recorded. • Recommendation: If there is no evidence showing why the first measurement was in error, then it should not be discarded.
Ignoring the first measurement2 • A reasonable strategy would be to record the average of the two measurements. • For example, if you take your blood pressure and the result is abnormally high, then you might measure it again. • If the second measurement indicates that blood pressure is in the normal range, and you do not have proof that the first reading was a mistake, then do not record only the second reading. Instead, record either both measurements or the average of the two readings.
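The averaging strategy can be sketched in a few lines. The function name, the known_bad parameter, and the blood pressure readings below are hypothetical; the point is only that a reading is dropped when there is documented evidence of error, and never merely because it was unwelcome.

```python
from statistics import mean


def value_to_record(readings, known_bad=()):
    """Average all readings except those with documented evidence of error.

    `readings` is a list of measurements in the order they were taken.
    `known_bad` holds the indexes of readings that were shown to be faulty.
    """
    kept = [r for i, r in enumerate(readings) if i not in known_bad]
    if not kept:
        raise ValueError("no valid readings to record")
    return mean(kept)


# Hypothetical systolic blood pressure readings in mmHg.
readings = [150, 128]

# No evidence that the first reading was wrong: record the average.
recorded = value_to_record(readings)

# Evidence exists (say, the cuff slipped on the first try): drop reading 0.
recorded_with_evidence = value_to_record(readings, known_bad={0})

print(recorded, recorded_with_evidence)
```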
Scoring Function Mistakes • Mixing gains and losses • Not using scoring functions • Anchoring
Scoring functions • Objective value is translated to subjective worth • Input values become normalized output scores • Scoring functions must be elicited from the customer
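To illustrate how an input value can become a normalized output score, the sketch below uses a plain logistic curve as a stand-in scoring function. It is not the Wymorian standard scoring function; the function name, its parameters (baseline, slope), and the example criteria are assumptions made only for illustration. In practice the shape would be elicited from the customer.

```python
import math


def logistic_score(value, baseline, slope):
    """Map an input value to a normalized score in the open interval (0, 1).

    `baseline` is the input value that receives a score of 0.5.
    `slope` sets the steepness; a negative slope means that less is better.
    """
    return 1.0 / (1.0 + math.exp(-slope * (value - baseline)))


# Hypothetical criterion: top speed in km/h, where more is better.
speed_score = logistic_score(value=120, baseline=100, slope=0.1)

# Hypothetical criterion: cost in thousands of dollars, where less is better.
cost_score = logistic_score(value=80, baseline=50, slope=-0.1)

print(f"speed score: {speed_score:.2f}")
print(f"cost score:  {cost_score:.2f}")
```

Because every criterion is mapped onto the same 0-to-1 scale, scores for criteria with different units can then be combined.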
Percent happy scouts mistake • The Pinewood Derby tradeoff study had these criteria • Percent Happy Scouts • Number of Irate Parents • Because people evaluate losses and gains differently, the Preferred alternatives might have been different if they had used • Percent Unhappy Scouts • Number of Ecstatic Parents
Recommendation: Scoring functions in a tradeoff study should express gains rather than losses.
Not using scoring functions • Most tradeoff studies that we have observed in industry did not use scoring functions. • In some cases, scoring functions were explained in the company’s engineering process, but they were not convenient, hence they were not used. • Recommendation: The Wymorian standard scoring functions should be used in tradeoff studies. The functions located at http://www.sie.arizona.edu/sysengr/slides/ should be referenced in company engineering processes.
Anchoring • A person’s first impression dominates all further thought. • People were shown a wheel of fortune with numbers from one to one hundred. • The wheel was spun and the subjects were asked to estimate the number of African nations in the United Nations. • If the wheel showed a small number, like 12, the subjects underestimated the correct number. • If the wheel showed a large number, like 92, the subjects overestimated the correct number. • Recommendation: When estimating values for parameters of scoring functions, think about the whole range of expected values for the parameters.
Anchoring2 • You should fill out a tradeoff study matrix row by row with the status quo as the first alternative. Therefore, the values of the status quo are the anchors for estimating the other data. Unfortunately, the status quo is likely to have extremely low values for performance and extremely high values for cost, schedule and risk. But at least the anchoring alternative is known, consistent and you have control over it. • Recommendations: • Make the status quo the first alternative. • In one iteration examine the scores left to right and in the next iteration examine them right to left.