Flight Test and Statistics


PRESENTED BY

Richard Duprey

Director, FAA Certification Programs

National Test Pilot School

Mojave, California

### Flight Test and Statistics

“If you want to be absolutely certain you are right, you can’t say you know anything.”

Flight Test and Statistics Overview
• Background on National Test Pilot School
• Coverage of Statistics
• Scope - six hours of academics
• Detail
• Use of statistics in flight test
• Types of questions we try to answer
NTPS Background
• Private non-profit
• Grants a Master of Science degree
• Only civilian school of its kind
• SETP equivalent to USAF and Navy Test Pilot Schools
• Offers variety of courses (Fixed Wing and Helicopters)
• Professional - 1 year
• Introductory
• Performance and Flying Qualities Testing
• Systems Testing
• Operational Test and Evaluation
• NVG
• FAA Test Pilot / FTE initial and recurrent training
Data Analysis - Hour 1
• Types of Errors
• Types of Data
• Elementary Probability
• Classical Probability
• Experimental Probability
• Axioms
• Examples
Introduction
• Flight testing involves data collection
• time to climb
• fuel flow for range estimates
• qualitative flying qualities ratings
• INS drift rate
• Landing and Take-off data
• Weapon effectiveness
• All of these experimental observations have inaccuracies
• Understanding these errors, their sources, and developing methods to minimize their effect is crucial to good flight testing
Types of Errors
• There are two very different types of errors
• systematic errors and random errors
• Systematic errors
• repeatable errors
• caused by a flawed measuring process
• ex: measuring with an 11-inch ruler, or airspeed indicator corrections
• Random errors
• not repeatable and usually small
• caused by unobserved changes in the experimental situation
• errors by observer - reading airspeed indicator
• unpredictable variations - small voltage fluctuations causing fuel counter errors
• can’t be eliminated, but typically follow a well-defined distribution
Types of Data
• There are four types of numerical data:
• NOMINAL DATA
• numerical in name only - say an aircraft configuration
• 1 = gear down, 2 = gear up, 3 = slats extended
• normal arithmetic processes not applicable
• 3 >1 or 3-1=2 are not valid relationships
• ORDINAL DATA
• contains information about rank order only
• #1 = C-150, #2 = B-1, #3 = F-15
• in terms of max speed: 3>1 is valid, but not 3-1=2
Types of Data
• There are four types of numerical data (continued)
• INTERVAL DATA
• contains rank and difference information - ex: temperature in degrees Fahrenheit
• 30, 45, 60 at different times, 15 deg. difference
• zero point arbitrary, so 60°F is not twice 30°F
• RATIO DATA
• all arithmetic processes apply
• most flight test data falls into this category
• Can say that a 1000 pound per hour fuel flow is 4 times greater than 250 PPH
Probability and Flight Test
• Quantitative analysis of random errors of measurement in flight testing must rely on probability theory
• Goal
• Student to understand what technique is appropriate and limitations on the results
Elementary Probability
• The probability of event A occurring is the fraction of the total times that we expect A to occur: $P(A) = \dfrac{n_a}{N}$
• Where: - P(A) is the probability of A occurring
• - $n_a$ is the number of times we expect A to occur
• - N is the total number of attempts or trials
Elementary Probability
• From this definition, P(A) must always be between 0 and 1
• if A always happens, $n_a$ = N and P(A) = 1
• if A never happens, $n_a$ = 0 and P(A) = 0
• In order to determine P(A) we can take two different approaches
• make predictions based on foreknowledge (“a priori”)
• conduct experiments (“a posteriori”)
Classical (‘a priori’) Probability
• If it is true that
• every single trial leads to one of a finite number of outcomes
• and, every possible outcome is equally likely
• Then,
• $n_a$ is the number of ways that A can happen
• N is the total number of possible outcomes
• For example:
• six-sided die implies six possible outcomes: N = 6
• if A is getting a 6 on one roll, $n_a$ = 1
• P(A) = 1/6 = 0.1667
Second Example
• What is the probability of getting two heads when we toss two fair coins?
• There are four possible outcomes (N = 4)
• (H,H) (H,T) (T,H) (T,T)
• $n_a$ = 1 since only one of the possible outcomes results in two heads (H,H)
• Thus P(A) = 1/4 = 0.25
Classical (‘a priori’) Probability
• Approach instructive
• Generally not applicable to flight test where:
• Possible outcomes infinite
• Each possible outcome not equally likely
• Leads us to second approach
Experimental (‘a posteriori’) Probability
• Experimental probability is defined as $P(A) = \lim_{N_{obs} \to \infty} \dfrac{n_{A,obs}}{N_{obs}}$
• Where

- $n_{A,obs}$ is the number of times we observe A (versus the number of times we expect A to occur)

- $N_{obs}$ is the number of trials

Experimental Example
• If the probability of getting heads on a single toss of a coin is determined experimentally, we might get

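[Figure: observed probability P_obs (axis from 0 to 1.0, with 0.5 marked) plotted against the number of tosses n_obs (1, 10, 100, 1000)]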
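The coin-toss experiment can be sketched as a short simulation. This is an illustrative sketch only: the function name `observed_probability`, the fixed random seed, and the chosen trial counts are assumptions, not part of the original slides.

```python
import random

def observed_probability(n_obs, p_heads=0.5, seed=0):
    """Return n_A,obs / N_obs for n_obs simulated tosses of a coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_obs))
    return heads / n_obs

# The observed fraction wanders for small n_obs and settles as n_obs grows.
for n_obs in (1, 10, 100, 1000, 100_000):
    print(n_obs, observed_probability(n_obs))
```

With few tosses the observed fraction can be far from 0.5; only a large number of trials pins it down, which is why experimental probabilities need enough data.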

Probability Axioms
• Probability Theory can be used to describe relationships between events
Probability Axioms
• Three probability axioms are easily justified as opposed to proven
• P(not A) = 1 - P(A)
• Probability of something happening has to be one
• P(A or B) = P(A) + P(B)
• P(H or T) = 0.5 + 0.5 =1 for a single coin
• P(A and B) = P(A) x P(B)
• P(T and T) = 0.5 x 0.5 = 0.25 for two coins
• same answer we got when examining all possible outcomes
• The last two axioms carry conditions
• the product rule P(A and B) = P(A) x P(B) requires independent events
• A occurring doesn’t affect the probability of B occurring
• the sum rule P(A or B) = P(A) + P(B) requires mutually exclusive events
• Only one can occur in a single trial
Example
• Problem:
• Based on test data, 95% of the time an F-4 will successfully make an approach-end barrier engagement on an icy runway
• what is the probability that at least one of a flight of four F-4’s will miss?
• Solution:
• P (1 or more miss) = 1 - P(all engage)
• Probability that at least one will miss is the complement of the probability that all will engage
• P (all engage) = P(1st success) × P(2nd ) × P(3rd) × P(4th)

= 0.95 × 0.95 × 0.95 × 0.95 = $0.95^4$ = 0.81

Thus,

• P (1 or more miss) = 1 - 0.81 = 0.19

Example

• Problem:
• What is the probability of getting 7 or 11 on a single roll of a pair of dice?
• Solution:
• Since getting 7 and getting 11 are mutually exclusive events, we can say
• P (7 or 11) = P (7) + P (11)
• N = $6^2$ = 36
• $n_7$ = 6
• (6, 1) (1, 6) (5, 2) (2, 5) (4, 3) (3, 4)
• $n_{11}$ = 2
• (6, 5) (5, 6)
• Thus,
• P (7) = 6/36, P (11) = 2/36
• P (7 or 11) = 6/36 + 2/36 = 0.222
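The classical counting approach used in the dice and coin examples can be checked by brute-force enumeration. This is a minimal sketch; the helper name `classical_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def classical_probability(outcomes, event):
    """P(A) = n_a / N for equally likely outcomes."""
    outcomes = list(outcomes)
    n_a = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(n_a, len(outcomes))

# All 36 equally likely results of rolling a pair of dice.
dice = list(product(range(1, 7), repeat=2))
p7 = classical_probability(dice, lambda roll: sum(roll) == 7)
p11 = classical_probability(dice, lambda roll: sum(roll) == 11)

# 7 and 11 cannot both occur on one roll, so the probabilities add.
print(p7, p11, float(p7 + p11))
```

The same helper reproduces the two-coin result when given `product("HT", repeat=2)` and an event that tests for `("H", "H")`.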
Data Analysis - Hour 2
• Populations and Samples
• Measures of Central Tendency
• Dispersion
• Probability Distributions
• Discrete
• Continuous
• Cumulative
Population & Samples
• A population is all possible observations
• Many populations are infinite
• A pair of dice can be rolled indefinitely
• Population of F-117 weapons deliveries is all the possible drops it could make in its lifetime
• Some populations are limited
• A sample is any subset of a population
• For example
• 100 rolls of a pair of dice
• Bomb scores for 100 weapon delivery sorties
Population Constructs
• Constructing a population
• Must impose assumptions
• Homogeneous
• Independent
• Random
Sample Requirements
• Homogeneous
• the data must come from one population only
• DC-10 take-off data shouldn’t be used with MD-11
• Independent
• selecting one data point must not affect subsequent probabilities
• selecting and removing a heart from a deck of cards changes the probability of drawing another heart
• DC-10 landing 75 feet past touchdown aim point on one landing doesn’t change probability that next landing will miss by same distance (or any distance)
• Random
• equal probability of selecting any member of population
• using a member of a population with a bias would be non-random
• F-16 with boresight error would cause a bias in downrange miss distance
Measures of Central Tendency
• Given a homogeneous, independent, random sample, we need to describe the contents of that sample
• Measure steel rod diameter with a micrometer - would get several different answers
• Tighten the micrometer
• Dust particles on the rod
• What to do with answers that are different?
Measures of Central Tendency
• There are three common measures of central tendency:
• Mean (arithmetic average) - most commonly used
• Mode
• most common value in the sample
• there may be more than one mode
• Median
• middle value
• for an even-numbered sample, average the two middle values
• Dangers ........
Dispersion
• Just reporting the mean as the answer can be very misleading
• Consider the following two samples, both with a mean of 100 (and same median as well)
• Sample 1: 99.9, 100, 100.1
• Sample 2: 0.1, 100, 199.9
• We also need to report how much the data generally differs from the mean value
Deviation
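• We define deviation as the difference between the ith data point and the mean: $d_i = x_i - \bar{x}$
• Averaging the deviations does not help, since they always sum to zero: $\dfrac{1}{N}\sum_{i=1}^{N} d_i = 0$
Mean Deviation
• Since positive and negative deviations cancel, we could average the absolute values of the deviations: $\text{mean deviation} = \dfrac{1}{N}\sum_{i=1}^{N} |d_i|$
Standard Deviation
• While the mean deviation can be used, the standard deviation σ is a more common measure of dispersion: $\sigma = \sqrt{\dfrac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$
• versus the mean deviation, which averages absolute deviations rather than squared ones
• The square of the standard deviation, σ², is called the variance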
Notation
• Normally, we use Greek letters to denote statistics for populations:

μ for population mean

σ² for population variance

• And we use Roman letters for sample statistics:

x̄ for sample mean

s² for sample variance

Sample Standard Deviation
• One other difference exists between σ and s
• The sample standard deviation has the sum of the squares divided by N - 1 versus N: $s = \sqrt{\dfrac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$
• Mathematically, this is due to a loss of one degree of freedom
• The effect is to increase the standard deviation slightly
• Difference decreases as sample gets larger
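The two samples from the Dispersion slide make the point numerically. The sketch below uses Python's `statistics` module; the helper `mean_deviation` is a hypothetical name introduced for illustration.

```python
import statistics

sample_1 = [99.9, 100, 100.1]
sample_2 = [0.1, 100, 199.9]

def mean_deviation(data):
    """Average absolute deviation from the sample mean."""
    mean = statistics.mean(data)
    return sum(abs(x - mean) for x in data) / len(data)

for data in (sample_1, sample_2):
    print(
        "mean", round(statistics.mean(data), 3),
        "median", statistics.median(data),
        "mean dev", round(mean_deviation(data), 3),
        "sigma (divide by N)", round(statistics.pstdev(data), 3),
        "s (divide by N - 1)", round(statistics.stdev(data), 3),
    )
```

Both samples share the same mean and median, but every dispersion measure differs by about three orders of magnitude, and `stdev` (N - 1) is always slightly larger than `pstdev` (N).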
Flight Test Example - PA28 Takeoff Distance
• Two data points eliminated - wrong configuration, improper technique
• Data adjusted for standard weight (2150 lbs.), runway slope (GPS), temperature, pressure, airspeed/altimeter corrections
• Technique, rotate at 65, liftoff at 70, maintain 75 until 50 feet AGL
Probability Distributions
• Statistical applications require an understanding of the characteristics of the data obtained
• Probability distributions give us such understanding
Probability Distributions
• To understand probability distributions, consider the problem of tossing 2 coins
• Let n represent the number of heads for a single toss of both coins
• Then the probabilities of getting n = 0, 1, or 2 can be calculated:
• for n = 0, P(0) = 0.25
• for n = 1, P(1) = 0.5
• for n = 2, P(2) = 0.25
Discrete Distributions
• We can present the data as a bar graph
Empirical Distributions
• In flight test, we are concerned with empirical distributions versus theoretical in the coin example
• If we collect data on landing errors:
Continuous Distributions
• If we get more and more data, and make the intervals smaller, our histogram approaches a continuous curve:

Continuous Probability Distribution of Touchdown Miss Distance

• Can’t be interpreted same way as the previous discrete distribution
Continuous Distributions
• Height of curve above a point is not the probability of “x” having that point value
• Any one point on the x-axis corresponds to a non-zero height on the curve
• But the probability associated with that single point must be zero, since there are an infinite number of points on the x-axis
• We can meaningfully talk only about the probability of being between two points a and b on the x-axis
Probability as Area Under Curve
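• The probability of getting a result between a and b is represented by the area under the probability distribution curve between a and b: $P(a \le x \le b) = \int_a^b f(x)\,dx$

[Figure: probability density f(x) versus x, with the area between a and b shaded as P(a ≤ x ≤ b)]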

Cumulative Probability Distribution
• A cumulative probability distribution gives the probability that x is less than or equal to some value, a
• Relative probability of aircraft landing miss distances could be displayed in the following cumulative distribution

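[Figure: cumulative distribution versus miss distance x, rising to 1.0, with the 0.5 and 0.95 levels marked on the vertical axis and x_T marked on the x-axis]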

Data Analysis - Hour 3
• Special Probability Distributions:
• Binomial
• Normal
• Student’s t
• Chi squared
Binomial Distribution
• The binomial is a discrete distribution
• It tells us the probability of getting n successes in N trials given the probability (p) of a single success
• Limiting cases
• if n = N, then obviously P(N) = $p^N$
• if n = 0, then P(0) = $(1 - p)^N$
• or, letting q = 1 - p, P(0) = $q^N$
• For 0 < n < N, the possible number of combinations of success and failure gives $P(n) = \dfrac{N!}{n!\,(N-n)!}\,p^n q^{N-n}$
Binomial Distribution - Flight Test Example
• Two flight control systems are equally desirable
• What is probability that 6 out of 8 pilots would prefer system A over B?
• If A and B are truly equally good, probability of pilot picking A over B is 0.5 (p = q = 0.5)
• Probability of 6 pilots picking A over B is:

$P(6) = \dfrac{8!}{6!\,2!}\,(0.5)^6 (0.5)^2 = 0.109$

• There is only an 11% probability that this would happen. If it did, it would suggest that your initial assumption about the two flight control systems was in error
Binomial Flt. Test Example
• If p = q = 0.5, then for N = 8 the binomial distribution is symmetric about n = 4 (figure not reproduced), and P(6), like P(2), is about 11%
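The pilot-preference numbers can be reproduced with a few lines. This is a sketch under the slide's assumption p = q = 0.5; the function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N independent trials."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)

# Full distribution for 8 pilots with no real preference (p = 0.5).
for n in range(9):
    print(n, round(binomial_pmf(n, 8, 0.5), 4))

p_six = binomial_pmf(6, 8, 0.5)  # 28 / 256
```

Printing the whole distribution shows the symmetry about n = 4, so P(2) and P(6) are equal.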
Normal Distribution
• The normal distribution is a continuous probability distribution based on the binomial
• SINGLE MOST IMPORTANT DISTRIBUTION IN FLIGHT TEST ANALYSIS
• Any deviation from a mean value is assumed to be composed of multiples of elemental errors evenly distributed
• The mathematical derivation is left as an exercise
Normal Distribution
• Graphically, it can be seen that x = μ gives the maximum value and x = μ ± σ are the two points of inflection on the curve

[Figure: normal curve f(x) versus x, with μ - σ, μ, and μ + σ marked on the x-axis]

Normal Distribution
• Thus the probability that x lies between some value a and b is given by: $P(a \le x \le b) = \dfrac{1}{\sigma\sqrt{2\pi}}\int_a^b e^{-(x-\mu)^2/(2\sigma^2)}\,dx$
• Major problem - cannot be solved explicitly
• numerical techniques are required
• tables could be used, but different tables would be required for each μ and σ
Standard Normal Distribution
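• By using a substitution of variables, $z = \dfrac{x - \mu}{\sigma}$
• We can use tables for a normal distribution where the mean is zero and the standard deviation is one
• Thus $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}$
• Becomes $f(z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$
• Mean of zero and a standard deviation of one

[Figure: standard normal curve with z-axis ticks from -3 to 3]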

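Standardized Normal Distribution

[Figure: standard normal curve f(z) versus z. About 68% of the area lies within ±1, 95% within ±2, and 99.7% within ±3; the successive bands hold roughly 2.5%, 13.5%, 34%, 34%, 13.5%, and 2.5% of the area]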

Examples - cruise performance
• Cruise performance test flown 40 times
• Mean fuel used was 8,000 pounds
• Standard deviation was found to be 500 pounds
• Find probability that on the next sortie, we will use between 7000 and 8200 pounds
• Given μ = 8000, σ = 500
• find the probability that 7000 < x < 8200
• Convert to z: z = (7000 - 8000)/500 = -2.0 and z = (8200 - 8000)/500 = 0.4
• From table: 0.6554 - 0.0228 = 0.6326
• 63% probability that fuel used would be within the specified range
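The table lookup in the cruise example can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

# Fuel used, modeled as normal with the sample mean and standard deviation.
fuel = NormalDist(mu=8000, sigma=500)

# P(7000 < x < 8200) is the difference of two cumulative probabilities.
p_in_range = fuel.cdf(8200) - fuel.cdf(7000)
print(round(p_in_range, 4))
```

`NormalDist.cdf` performs the standardization to z internally, so no separate table for each μ and σ is needed.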
Student’s t Distribution
• Problem: to use the normal distribution we had to know the population mean and standard deviation
• Flight Test - don’t normally know the population - just have sample
• The difference between sample and population mean is described by the statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
Student’s t vs n
• Different t distributions must be tabulated for each value of n
• For large n, the t-distribution approaches the standard normal distribution - use normal distribution when n ≥ 30

[Figure: t-distribution curves versus t for n = 2 and n = 10]

t - Flight Test Examples
• B-33 landing distance example
Chi-Squared (χ²) Distribution
• Just as the sample mean may differ from the population mean, we should expect a difference in the variances
• The difference is distributed according to: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$

χ² vs Sample Size

[Figure: f(χ²) versus χ² over roughly 1 to 10, with curves for n = 1, n = 4, and n = 10]

χ² Examples
• Find χ² for 95th percentile (11.1)
• one-tailed
• 5 degrees of freedom
• Find χ² limits for a 95% interval (0.831, 12.80)
• two-tailed
• 5 degrees of freedom
• Find the median value of χ² (27.3)
• 28 degrees of freedom
Data Analysis - Hour 4
• Confidence Limits
• Intervals for mean and variance
• Hypothesis Testing
• Null and alternate hypotheses
• Tests on mean and variance
Confidence Limits
• In practice, we take a sample from a population such as Take-off distance
• Report it as if it were the true answer
• Subsequent tests will differ - sample mean/variance will differ from true population
• Can be considered sufficiently accurate if we
• Standardize test method and conditions
• Take sufficient samples
• Quantitative methods (confidence intervals) exist to determine how certain we are that we have the correct answer
Central Limit Theorem

Given a population with mean μ and variance σ², the distribution of successive sample means, from samples of n observations, approaches a normal distribution with mean μ and variance σ²/n

Central Limit Theorem
• Regardless of the original distribution of A, the distribution of the means will be approximately normal - the approximation improves as n increases
• Mean of the means will be the same as the mean of A
• Variance of the means = variance of A divided by n
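A small simulation illustrates the theorem with a population that is clearly not normal. This is a sketch under assumed choices: a uniform(0, 1) population (mean 0.5, variance 1/12), samples of n = 30, 5000 repetitions, and a fixed seed; none of these come from the slides.

```python
import random
import statistics

def sample_means(n, n_samples=5000, seed=1):
    """Means of n_samples samples, each of n draws from uniform(0, 1)."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(n_samples)]

n = 30
means = sample_means(n)

print("mean of the means:", round(statistics.mean(means), 4))       # near 0.5
print("variance of the means:", round(statistics.pvariance(means), 6))
print("population variance / n:", round((1 / 12) / n, 6))
```

The two variance figures should agree closely, and a histogram of `means` would look bell-shaped even though the underlying population is flat.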

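[Figure: distribution of x ⇒ (sample size n) ⇒ distribution of the sample means x̄]

[Figure: standard normal curve f(z) versus z, with area α/2 in each tail]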

Confidence Interval for Mean
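• If we take samples of size n, the means of multiple tests (okay, samples) will be normally distributed
• Thus $P(-z_{1-\alpha/2} \le z \le z_{1-\alpha/2}) = 1 - \alpha$
Confidence Interval - Means
• If z comes from one of our samples, $z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$

or, using the central limit theorem, $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

• Thus $P\left(\bar{x} - z_{1-\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{1-\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
Confidence Interval - Means
• Thus 100(1 - α) percent of the time, the true population mean μ will be within a certain range about the sample mean
• The range of values is the confidence interval
• And (1 - α) is the confidence level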
Example - flight test
• Find 95% confidence interval for F-100 engine thrust given:

n = 50 engines tested

mean thrust = 22,700 lbs

s = 500 lbs

• At 95%, α = 0.05, $z_{1-\alpha/2}$ = 1.96

μ = 22,700 ± 1.96 (500/√50)

22,561 < μ < 22,839

• At 99%, α = 0.01, $z_{1-\alpha/2}$ = 2.58

μ = 22,700 ± 2.58 (500/√50)

22,518 < μ < 22,882

• Observations
• Interval widens for increased certainty
• Had to use “s” as an estimate for σ, legitimate for n > 30
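The engine thrust intervals can be reproduced with a short function. This is a sketch for the large-sample (z) case only; the function name `mean_confidence_interval` is hypothetical, and `NormalDist.inv_cdf` stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Large-sample interval: x_bar +/- z * s / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for 95%, 2.58 for 99%
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

for confidence in (0.95, 0.99):
    low, high = mean_confidence_interval(22_700, 500, 50, confidence)
    print(confidence, round(low), round(high))
```

For samples smaller than about 30, the z value would be replaced by the tabulated t value for n - 1 degrees of freedom.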
Small Sample Confidence Intervals
• Some flight tests involve numerous repeated test points; most do not
• But when n < 30, we must substitute t for z
• For example, if our earlier problem were based on only a sample of 5, what would the 95% confidence interval be?
Example - flight test
• Find 95% confidence interval for F-100 engine thrust given:

n = 5 engines tested

mean thrust = 22,700 lbs

s = 500 lbs

• At 95%, α/2 = 0.025, ν = 4, $t_{4,\,0.975}$ = 2.78

μ = 22,700 ± 2.78 (500/√5)

22,078 < μ < 23,321

vs. 22,561 < μ < 22,839 for 95% with n = 50

vs. 22,518 < μ < 22,882 for 99% with n = 50

• Had to use “s” as an estimate for σ; because n < 30 here, t replaces z
Confidence Interval for Variance
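• Similar to intervals for means, the confidence interval for variance is based on the χ² statistic: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$, giving $\dfrac{(n-1)s^2}{\chi^2_{\nu,\,1-\alpha/2}} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_{\nu,\,\alpha/2}}$
• For example, find the 95% confidence interval where n = 6, s = 2
Confidence Interval for Variance
• At 95%, α/2 = 0.025, 1 - α/2 = 0.975, ν = 5, s = 2

$\dfrac{5(2)^2}{12.8} \le \sigma^2 \le \dfrac{5(2)^2}{0.831}$ >>> $1.56 \le \sigma^2 \le 24.1$

• Large band due to small sample size; if n = 18, interval would be smaller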
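The variance interval for n = 6, s = 2 can be sketched as below. The Python standard library has no χ² quantile function, so the two critical values are passed in from a table; 0.831 and 12.80 are the two-tailed 95% values for 5 degrees of freedom quoted in the χ² examples earlier. The function name is hypothetical.

```python
def variance_confidence_interval(s, n, chi2_lower, chi2_upper):
    """Interval for sigma^2: (n-1)s^2/chi2_upper to (n-1)s^2/chi2_lower.

    chi2_lower and chi2_upper are tabulated chi-squared values at
    alpha/2 and 1 - alpha/2 for n - 1 degrees of freedom.
    """
    nu = n - 1
    return nu * s**2 / chi2_upper, nu * s**2 / chi2_lower

low, high = variance_confidence_interval(s=2, n=6, chi2_lower=0.831, chi2_upper=12.80)
print(round(low, 2), round(high, 2))               # limits on the variance
print(round(low**0.5, 2), round(high**0.5, 2))     # limits on the standard deviation
```

Note that the larger χ² value produces the lower limit, because it appears in the denominator.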
Hypothesis Testing
• Instead of just using data to estimate some parameter, we hypothesize an answer and then use data to judge its reasonableness
• Truth can be known with certainty only if we examine the entire population
• Example
• assume a coin is fair (hypothesis)
• toss the coin 100 times
• if results are
• 48 heads, conclude coin is fair
• 35 heads, conclude coin is not fair
Null Hypothesis
• Acceptance of a statistical hypothesis
• result of insufficient evidence to reject it
• doesn’t necessarily mean that it is true
• Thus, it is important to carefully select initial hypothesis (the null hypothesis - H0 )
• selected for purposes of rejecting it – called the null hypothesis
• if we don’t gather enough data we must accept the null hypothesis
• Formulated so that in case of insufficient data, we return to the status quo or safe conclusion
• Examples of null hypothesis
• the defendant is innocent
• the new RADAR is no better than the old
• the MTBF of a new part is no better than the old
Alternate Hypothesis
• Since we are trying to negate the null hypothesis (H0) with data, the alternate hypothesis (H1) must be defined -- H0 must be “opposite” of H1
• Examples:
• 1. H0: μ = 15 H1: μ ≠ 15
• 2. H0: p ≥ 0.9 H1: p < 0.9
• 3. Lock-on range of new radar is better than old
Types of Errors
• A Type I error
• rejecting null hypothesis when it is true
• chance variation of fair coin gives 35/100 heads
• probability is denoted as α (the level of significance)
• A Type II error
• accepting null hypothesis when it is false
• 43/100 concluded as fair when P(A) = 0.4
• probability is denoted as β (the power of the test is 1 - β)
• We want small α
• as α decreases, β increases (fixed sample size)
• Large β implies we stay with the status quo, H0, more frequently than we should - a more “acceptable error”
• to decrease both, increase sample size
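The trade-off between α and β can be made concrete with the 100-toss coin test. In this sketch the acceptance regions (40 to 60 heads, and 37 to 63 heads) and the "true" biased probability of 0.4 are assumptions chosen for illustration; the slides do not specify a decision rule.

```python
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def error_rates(n, low, high, p_null=0.5, p_true=0.4):
    """Accept H0 (fair coin) when low <= heads <= high.

    alpha: chance of rejecting H0 when the coin really is fair.
    beta:  chance of accepting H0 when P(heads) is really p_true.
    """
    accept = range(low, high + 1)
    alpha = 1 - sum(binomial_pmf(k, n, p_null) for k in accept)
    beta = sum(binomial_pmf(k, n, p_true) for k in accept)
    return alpha, beta

for low, high in ((40, 60), (37, 63)):
    alpha, beta = error_rates(100, low, high)
    print(f"accept {low}-{high}: alpha={alpha:.3f}, beta={beta:.3f}")
```

Widening the acceptance region lowers α but raises β for the same 100 tosses, which is the fixed-sample-size trade-off described above.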
Hypothesis Testing
• Step One
• Form null and alternate hypothesis
• Step Two
• Choose level of significance (a)
• Define areas of acceptance and rejection (one or two tailed)
• Step Three
• Collect data and compare to expectations
• Step Four
• Accept or reject the null hypothesis
Hypothesis Testing - “Two Tailed”
• Some tests - interested in extremes in either direction
• Two Tailed
• Example: Burn times on an ejection seat rocket motor
• Too short - don’t clear aircraft
• Too long - impose too many g’s on pilot
• Form hypothesis of the form
• H0: μ = μ0 H1: μ ≠ μ0
• Reject H0 whenever the sample produces results that are too low or too high
• Not the usual for flight test - usually deal with “One Tailed”
Hypothesis Flight Test Examples Two Tailed
• Early Testing of F-19 bombing system for 30º dive angles gave
• Cross range errors were normally distributed
• Mean error of 20 ft and a standard deviation of 3 feet.
• After a flight control modification to solve a high AOA flying qualities problem, it was found
• Sample mean cross range error for nine bombs was 22 feet.
• Has the mean changed at the 0.05 level of significance?
Hypothesis Testing - Two Tailed
• Step One
• Form null and alternate hypothesis
• H0: μ = 20 (status quo) H1: μ ≠ 20
• Step Two
• Choose level of significance: α = 0.05 (given)
• Define areas of acceptance and rejection (one or two tailed)
• α = 0.05 would be divided into two tails - hi/lo
• extreme values in either direction would indicate change in μ
• Assume σ has not changed significantly from the unmodified system
Hypothesis Testing - Two Tailed
• Step Three
• Collect data and compare to expectations: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{22 - 20}{3/\sqrt{9}} = 2.0$
• Step Four
• Accept or reject the null hypothesis
Step 4 - accept or reject
• Since z = 2 which is > 1.96
• Conclude with 95% confidence to reject null hypothesis
• Mean cross range bombing error has changed due to flight control modification

[Figure: standard normal curve versus z, with a central "Accept" region and a "Reject" region of area α/2 = 0.025 in each tail]
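The four steps for the bombing example can be written as a short two-tailed z test. This is a sketch that assumes, as the example does, that the population standard deviation (3 ft) is known and unchanged; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Return (z, critical z, reject H0?) for H0: mu = mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # alpha split over two tails
    return z, z_critical, abs(z) > z_critical

z, z_critical, reject_h0 = two_tailed_z_test(x_bar=22, mu_0=20, sigma=3, n=9)
print(round(z, 2), round(z_critical, 2), reject_h0)
```

The margin is thin (2.0 against 1.96); at α = 0.01 the same data would not reject H0.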

Hypothesis Testing - “One Tailed”
• Most flight tests - interested in extremes in only one direction
• One Tailed - small sample, σ unknown
• Example: Does aircraft satisfy contractual range requirements
• Only care if distance is shorter than specified
• Form hypothesis of the form
• H0: μ ≥ μ0 H1: μ < μ0

Or

• H0: μ ≤ μ0 H1: μ > μ0
• Reject H0 whenever the sample produces results extreme in one direction
Hypothesis Flight Test Examples One Tail
• Contract fuel climb requirements
• Use less than 1500 pounds in climb from Sea Level to 20,000 feet
• Test results
• Nine climbs average of 1600 lbs
• Sample standard deviation of 200lbs.
• Do we penalize the contractor?
Hypothesis Testing - One Tailed
• Step One
• Form null and alternate hypothesis
• H0: μ ≤ 1500 (until proven guilty) H1: μ > 1500
• Step Two
• Choose α = 0.05 for level of significance
• α = 0.01 reserved for safety of flight questions
• Define areas of acceptance and rejection (one or two tailed)
• one tailed - contract not met only if fuel used was on the high side
Hypothesis Testing - One Tailed
• Step Three
• Collect data and compare to expectations: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{1600 - 1500}{200/\sqrt{9}} = 1.5$
• Step Four
• Accept or reject the null hypothesis
Step 4 - accept or reject
• Since t = 1.5 which is < 1.860 (critical t for ν = 8, α = 0.05)
• Conclude with 95% confidence to accept null hypothesis
• Contractor has met climb fuel requirements
• Put another way
• Don’t have data at the 95% confidence level to show contractor failed to meet specs

[Figure: t-distribution with an "Accept" region and a single "Reject" tail of area α = 0.05]
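The climb fuel decision can be sketched as a one-tailed t test. The Python standard library has no t quantile function, so the critical value is an input; 1.860 is the value from a standard t table for 8 degrees of freedom at the 0.05 level and is supplied here as an assumption. The function name is hypothetical.

```python
from math import sqrt

def t_statistic(x_bar, mu_0, s, n):
    """t = (x_bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu_0) / (s / sqrt(n))

t = t_statistic(x_bar=1600, mu_0=1500, s=200, n=9)
t_critical = 1.860        # tabulated: one-tailed, alpha = 0.05, nu = 8
reject_h0 = t > t_critical  # reject only if fuel used is significantly HIGH
print(t, reject_h0)
```

Because t does not exceed the critical value, the data do not show at the 95% level that the contractor missed the requirement.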

Hypothesis Test Examples - Variance
• “Four” steps still valid here
• Substitute chi-squared for z or t
• Example on variance
• The contract states the standard deviation of miss distances for particular weapon system delivery mode must not exceed 10 meters at 90 % confidence.
• In ten test runs we get s = 12 meters.
• Is the contractor in compliance?
Hypothesis Testing - One Tailed Variance
• Step One
• Form null and alternate hypothesis
• H0: σ ≤ 10 H1: σ > 10
• Step Two
• α = 0.10 was specified
• smaller σ’s good >>> implies one sided test
• Extremely large σ’s will nullify H0
Hypothesis Testing - One Tailed Variance
• Step Three
• Collect data and compare to expectations: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{9(12)^2}{(10)^2} = 12.96 \approx 13$
• Step Four
• Accept or reject the null hypothesis
• Since 13 < 14.7, accept H0 that σ ≤ 10 meters
• Can’t conclude contractor has failed to meet spec
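The miss-distance check follows the same pattern with the χ² statistic. In this sketch the critical value 14.7 is the tabulated number quoted in the example (9 degrees of freedom, α = 0.10) and is passed in rather than computed; the function name is hypothetical.

```python
def chi2_variance_statistic(s, sigma_0, n):
    """chi^2 = (n - 1) s^2 / sigma_0^2, with n - 1 degrees of freedom."""
    return (n - 1) * s**2 / sigma_0**2

chi2 = chi2_variance_statistic(s=12, sigma_0=10, n=10)
chi2_critical = 14.7           # tabulated value quoted in the example
reject_h0 = chi2 > chi2_critical  # only large sample variances count against H0
print(round(chi2, 2), reject_h0)
```

A sample standard deviation of 12 m from only ten runs is not enough evidence that the true value exceeds 10 m.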

### Data Analysis - Hour 5

• Tests for non-normal distributions
• Sample size
• Error Analysis
Parametric vs. Nonparametric
• Non-parametric tests make no assumption about population distribution
• Everything so far --- assumed normal
• These tests are less efficient when used on normal distributions - they require a larger sample size to give us the same info from the test
• Use “goodness of fit tests” to determine distribution type
• Normal - use methods already described
• Otherwise, use non-parametric
• Three non-parametric tests useful in flight test
Nonparametric Tests
• Three nonparametric tests we’ll use are
• Rank Sum Test
• also called the U test, Wilcoxon test, or Mann-Whitney test
• Sign Test
• can be applied to ordinal data
• Signed Rank Test
• combination of sign and rank sum tests
• All test the null hypothesis that two different samples come from the same population - assumes both are equivalent
• Calculates statistics from the two samples
• Determines probability --- decide if original assumption correct
Rank Sum Test - “U Test or Mann Whitney”

The method (based on the binomial distribution) consists of:

• Rank order all data from each sample
• Assign rank values to each data point
• average rank for repeated data values
• Compute the sum of the ranks for each sample (R1, R2)
• Calculate the U statistic for each sample (n = sample size): $U_1 = n_1 n_2 + \dfrac{n_1(n_1+1)}{2} - R_1$, $U_2 = n_1 n_2 + \dfrac{n_2(n_2+1)}{2} - R_2$
• Compare the smaller U to the critical value in reference
• If U < critical value, reject H0 (i.e. μ1 = μ2)
Rank Sum Example Radar Flight Test
• The target detection range (nm) of two radars was
• System 1: 9, 10, 11, 14, 15, 16, 20
• System 2: 4, 5, 5, 6, 7, 8, 12, 13, 17
• Is there a difference between the two systems at 90% confidence?

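| Score | System | Rank |
|-------|--------|------|
| 4 | 2 | 1 |
| 5 | 2 | 2.5 |
| 5 | 2 | 2.5 |
| 6 | 2 | 4 |
| 7 | 2 | 5 |
| 8 | 2 | 6 |
| 9 | 1 | 7 |
| 10 | 1 | 8 |
| 11 | 1 | 9 |
| 12 | 2 | 10 |
| 13 | 2 | 11 |
| 14 | 1 | 12 |
| 15 | 1 | 13 |
| 16 | 1 | 14 |
| 17 | 2 | 15 |
| 20 | 1 | 16 |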

Rank Sum Example
• Rank order all scores and assign rank values

R1 = 7+8+9+12+13+14+16 = 79

R2 = 1+2.5+2.5+4+5+6+10+11+15 = 57

Calculate U1, U2

U1 = (7)(9) + 7(8)/2 - 79 = 12

U2 = (7)(9) + 9(10)/2 - 57 = 51

Rank Sum Flight Test Example
• Compare smaller U (12 in this case) with critical values for
• α = 0.10, n1 = 7, n2 = 9: $U_{cr}$ = 15
• Since U < $U_{cr}$

Reject null hypothesis that the two radars have the same performance with 90% confidence
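The ranking and U computation for the radar example can be sketched as follows. The helper names `average_ranks` and `rank_sum_u` are hypothetical; the critical value 15 is the tabulated value quoted in the example.

```python
def average_ranks(values):
    """Map each distinct value to its 1-based rank, averaging tied ranks."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return ranks

def rank_sum_u(sample_1, sample_2):
    """Return (R1, R2, U1, U2) for two independent samples."""
    ranks = average_ranks(sample_1 + sample_2)
    r1 = sum(ranks[x] for x in sample_1)
    r2 = sum(ranks[x] for x in sample_2)
    n1, n2 = len(sample_1), len(sample_2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

system_1 = [9, 10, 11, 14, 15, 16, 20]
system_2 = [4, 5, 5, 6, 7, 8, 12, 13, 17]
r1, r2, u1, u2 = rank_sum_u(system_1, system_2)
u_critical = 15  # tabulated value quoted in the example
print(r1, r2, u1, u2, min(u1, u2) < u_critical)
```

A useful sanity check is that U1 + U2 always equals n1 × n2 (here 63).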

Sign Test
• Requires paired observations of two samples with a “better than” evaluation
• Can be used on ordinal data, such as pilots preferring system A or B
• Under H0, a pilot preferring system A over B is just as likely as one preferring B over A
• The probability of system A being preferred over system B, x times in N tests is just $f(x) = \dfrac{N!}{x!\,(N-x)!}\,p^x q^{N-x}$
• But if H0 is A=B, then p = q = .5, and $f(x) = \dfrac{N!}{x!\,(N-x)!}\,(0.5)^N$
Sign Test
• But f(x) is just the probability for one discrete point, such as 3 of 8 pilots preferring A over B, and we need the whole tail
• Thus (i.e. sum) $P(X \le x) = \sum_{k=0}^{x} f(k)$
Sign Test Example Modified Flight Control System
• Suppose 10 pilots evaluate handling qualities of two different sets of control laws during powered lift approaches
• The results are
• 7 prefer system B
• 2 prefer system A
• the remaining pilot is not counted, so N = 9
• Should we switch to the new control laws?
• Null hypothesis is that both systems (old and new) are equally desirable
• Choose 0.05 level of significance since safety of flight (SOF) is not an issue
• Calculate probability of 0, 1 or 2 pilots choosing system A if there were really no difference: P = (1 + 9 + 36)/$2^9$ = 0.09
• If probability is less than level of significance, reject H0
• and conclude B is better than A
• Here 0.09 > 0.05: can only be 91% sure that B is really better than A
• Not enough - need 95% to justify added expense of System B
• Thus, accept H0 - no significant difference between A and B
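The tail probability for the control-law example can be computed directly. This sketch assumes N = 9 counted preferences (7 for B, 2 for A), which is what makes the result match the 91% figure; the function name is hypothetical.

```python
from math import comb

def sign_test_tail(x, N):
    """P(X <= x) when each of N preferences is a fair coin flip (p = q = 0.5)."""
    return sum(comb(N, k) for k in range(x + 1)) / 2**N

p_tail = sign_test_tail(x=2, N=9)   # 0, 1 or 2 pilots choosing system A
alpha = 0.05
print(round(p_tail, 3), "reject H0" if p_tail < alpha else "accept H0")
```

The tail sums three terms, (1 + 9 + 36)/512, so a 7-to-2 split among nine pilots falls just short of significance at the 0.05 level.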
Signed Rank Test
• Combines elements of both the Sign Test and the Rank Sum Test
• That is, the Sign Test can be made more powerful if there is some indication of how much one system was preferred over another
• Method:
• Rank differences by absolute magnitude
• Sum the positive and negative ranks (W+, W-)
• Compare the smaller W with critical values in reference
• Reject H0 if W < Wcr
Signed Rank Example
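• If ten pilots who evaluated two competing systems gave them a Cooper Harper rating on a scale of 1 to 10:

| Pilot | System A | System B | Difference |
|-------|----------|----------|------------|
| 1 | 3 | 1 | 2 |
| 2 | 5 | 2 | 3 |
| 3 | 3 | 4 | -1 |
| 4 | 4 | 3 | 1 |
| 5 | 3 | 3 | 0 |
| 6 | 4 | 2 | 2 |
| 7 | 4 | 1 | 3 |
| 8 | 2 | 1 | 1 |
| 9 | 3 | 1 | 2 |
| 10 | 1 | 2 | -1 |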

Signed Rank Example
• Ranking differences by absolute magnitude, ignoring zero difference:

| Difference | Rank |
|------------|------|
| -1 | 2.5 |
| 1 | 2.5 |
| 1 | 2.5 |
| -1 | 2.5 |
| 2 | 6 |
| 2 | 6 |
| 2 | 6 |
| 3 | 8.5 |
| 3 | 8.5 |
Signed Rank Example
• Summing positive and negative ranks:

W+ = 2.5 + 2.5 + 6 + 6 + 6 + 8.5 + 8.5 = 40.0

W- = 2.5 + 2.5 = 5.0

• Using α = 0.05, $W_{cr}$ = 8 (one-tailed criterion)
• Since 5 < 8 ($W_{cr}$), can reject H0
• There is a difference between A and B with 95% confidence
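The signed rank sums for the ten Cooper Harper ratings can be reproduced as below. The function name `signed_rank_sums` is hypothetical, and the critical value 8 is the tabulated number quoted in the example.

```python
def signed_rank_sums(ratings_a, ratings_b):
    """Return (W+, W-) for paired ratings, dropping zero differences."""
    diffs = [a - b for a, b in zip(ratings_a, ratings_b) if a != b]
    magnitudes = sorted(abs(d) for d in diffs)

    def average_rank(magnitude):
        first = magnitudes.index(magnitude) + 1   # lowest rank in the tie group
        count = magnitudes.count(magnitude)       # size of the tie group
        return first + (count - 1) / 2

    w_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return w_plus, w_minus

system_a = [3, 5, 3, 4, 3, 4, 4, 2, 3, 1]
system_b = [1, 2, 4, 3, 3, 2, 1, 1, 1, 2]
w_plus, w_minus = signed_rank_sums(system_a, system_b)
w_critical = 8  # tabulated value quoted in the example
print(w_plus, w_minus, min(w_plus, w_minus) < w_critical)
```

With nine non-zero differences the two sums must total 9 × 10 / 2 = 45, which is a quick check on the ranking.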
Sample Size
• One of the most significant aspects of statistics for flight testing is to determine how much you need to test
• Too few data points will result in poor conclusions or recommendations
• Too many data points will waste limited resources
• Two approaches for determining sample size
• Sample size when accuracy is the driving factor
• An approach for determining significant differences between means
Accuracy Driven
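• Required to determine a population statistic such as takeoff distance within some accuracy, e.g. 10%
• Concept of confidence interval can be used to determine required number of sample points
• Remember the confidence interval of the mean: $\bar{x} \pm z_{1-\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
• But $z_{1-\alpha/2}\,\sigma/\sqrt{n}$ is the error E, thus $n = \left(\dfrac{z_{1-\alpha/2}\,\sigma}{E}\right)^2$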
• System Program Office wants us to determine Takeoff distance within 10% during the test program
• Historically we find the standard deviation for similar aircraft to be about 20% of the mean
• We need to be 95% confident of our answer
• How many data points should we plan?
How Many Sorties Required to Determine T/O Distance?
• $z_{0.975}$ = 1.96 for 95% confidence
• σ = 0.2 μ (historical σ is 20% of the mean)
• Error = ± 0.1 μ (10% error)
• Tests required (n) = $\left(\dfrac{1.96 \times 0.2\mu}{0.1\mu}\right)^2 = 15.4$

16 Takeoffs would be required

• Check to see if assumption about standard deviation remains reasonable (test hypothesis on variance) during testing
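The takeoff-distance planning number can be sketched in a few lines. Both σ and the allowed error are expressed as fractions of the mean, so the mean itself cancels; the function name `sample_size_for_accuracy` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_accuracy(sigma_fraction, error_fraction, confidence=0.95):
    """n = (z * sigma / E)^2, rounded up to a whole number of test points."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma_fraction / error_fraction) ** 2)

# sigma is 20% of the mean, allowed error is 10% of the mean, 95% confidence.
print(sample_size_for_accuracy(0.20, 0.10))
```

Halving the allowed error to 5% would quadruple the requirement, since n scales with the inverse square of E.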
Determining Significant Differences Between Means

• For the general problem of whether or not a system meets a specification or if there is a significant difference between two systems, the approach is more complex
• The difference between paired samples (d) from two populations will have some distribution
• If the two populations are the same, the mean of the d’s will be zero
• If they are not the same, the mean will be non-zero

• If the difference between the population means is δ1, then test results above and below a critical value xc of d will give:
• Test result giving a mean difference above xc
• Conclude the populations differ in their means, with level of significance α
• Test result giving a mean difference below xc
• Conclude there is no difference when in fact there is one, with probability β

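[Figure: distribution f(d) of the mean difference d, showing the critical value xc, the tail areas α and β, and δ1 = minimum significant difference]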

Determining Significant Differences Between Means
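• Moving xc to the right reduces α but increases β; moving it to the left does the opposite
• The only way to reduce both is to increase the sample size
• The sample size needed to determine the difference between two populations is a function of α, β, δ1, σ1, and σ2:
$$ n = \frac{(z_{1-\alpha} + z_{1-\beta})^2\,(\sigma_1^2 + \sigma_2^2)}{\delta_1^2} $$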
General Approach Weapons System Delivery Accuracy - example
• How many data points are required to determine if a system meets the specification for a weapon delivery accuracy of 5 mils?
• We need
• α; normally set at 0.10, 0.05, or 0.01 (0.01 is usually reserved for critical safety-of-flight issues) - use 0.05 here
• β; set this larger than α, typically 0.1 or 0.2 - use 0.1 here
• δ1; the least difference considered significant - use 1 mil here
• σ1 and σ2; these come from testing (initially from historical data)
• note that σ for a specification is zero
• assume 3 mils for σ1 here (i.e., results from a previous test)
General Approach Weapons System Delivery Accuracy - example
• How many data points are required to determine if a system meets the specification for a weapon delivery accuracy of 5 mils?
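• With α = 0.05, β = 0.1, δ1 = 1 mil, σ1 = 3 mils, and σ2 = 0: n = (1.645 + 1.282)² × (3² + 0²) / 1² ≈ 77
• 77 test points required - probably not feasible - must look at trade-offs
• How significant is it if we change β from 0.10 to 0.20 or change δ1 from 1 to 1.5?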
• The general approach has several choices
• Analyzing these options can lead to logical choices
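The trade-offs asked about above can be explored numerically. The sketch below assumes the usual normal-approximation sample-size formula with one-tailed quantiles, n = (z(1-α) + z(1-β))² (σ1² + σ2²) / δ1², which reproduces the 77 test points quoted for this example; the function name and argument names are illustrative, not part of the original material.

```python
from statistics import NormalDist


def sample_size(alpha, beta, delta, sigma1, sigma2=0.0):
    """Approximate n needed to detect a mean difference of delta (one-tailed test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # e.g. about 1.645 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(1 - beta)    # e.g. about 1.282 for beta = 0.10
    return (z_alpha + z_beta) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / delta ** 2


# Weapon delivery example: sigma2 = 0 because a specification has no scatter.
baseline = sample_size(alpha=0.05, beta=0.10, delta=1.0, sigma1=3.0)
relaxed_beta = sample_size(alpha=0.05, beta=0.20, delta=1.0, sigma1=3.0)
relaxed_delta = sample_size(alpha=0.05, beta=0.10, delta=1.5, sigma1=3.0)

for label, n in [
    ("baseline", baseline),
    ("beta = 0.20", relaxed_beta),
    ("delta1 = 1.5 mils", relaxed_delta),
]:
    print(f"{label:>18}: n = {n:.1f}")
```

Since n scales with 1/δ1², relaxing the minimum significant difference from 1 to 1.5 mils cuts the requirement by a factor of 2.25, a larger saving than doubling β.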

[Figure: required sample size n versus δ1 for α = 0.1, with curves for β = 0.1 and β = 0.2]

Sample Size Non-parametric Tests
• Sample size cannot be determined with accuracy
• The signed rank test is about 90% as efficient as the test on means using the z statistic
• Calculate n as just described and divide by 0.90
• How many pilots do we need to evaluate new flight control system laws and be 90% certain that there is a significant improvement (as defined by the Cooper-Harper scale)?

α = 0.10, β = 0.20 (arbitrary), δ1 = 1

σ1, σ2 - review of similar tests shows σ ≈ 1

Sample Size Non-parametric Tests
• Yields
$$ n = \frac{(1.282 + 0.842)^2\,(1^2 + 1^2)}{1^2 \times 0.90} \approx 10 $$
• Thus, 10 evaluation pilots would be needed
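A short sketch of the pilot-count calculation follows. It assumes the normal-approximation sample size for a difference between means, then divides by the 0.90 efficiency quoted for the signed rank test; the function name and the constant name are illustrative.

```python
from statistics import NormalDist

SIGNED_RANK_EFFICIENCY = 0.90  # efficiency relative to the z test, as quoted above


def pilots_required(alpha, beta, delta, sigma1, sigma2):
    """Return (n for the z test, n inflated for the signed rank test)."""
    z_sum = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(1 - beta)
    n_parametric = z_sum ** 2 * (sigma1 ** 2 + sigma2 ** 2) / delta ** 2
    return n_parametric, n_parametric / SIGNED_RANK_EFFICIENCY


# alpha = 0.10, beta = 0.20, delta1 = 1 rating point, sigma about 1 for both systems
n_parametric, n_signed_rank = pilots_required(0.10, 0.20, 1.0, 1.0, 1.0)
print(f"z-test n = {n_parametric:.2f}, signed rank n = {n_signed_rank:.2f}")
```

The result lands almost exactly on 10, so any growth in σ beyond the assumed value of 1 would push the requirement to 11 or more pilots.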
Error Analysis
• Thus far we have discussed errors of directly measured parameters
• In flight test we normally combine observations into calculated values
• fuel used = fuel flow x time
• specific range = velocity / fuel flow
• The propagation or combination of errors can thus be significantly larger than any one individual piece would imply
Significant Figures
• The number of significant figures in a result implies a level of precision
• Definition
• the left most nonzero digit is the most significant figure
• the least significant figure is
• right most nonzero digit (no decimal point)
• right most digit (with a decimal point)
• all digits between least and most significant are significant digits
• Rules
• addition/subtraction: keep one more decimal digit than in least accurate number
• other: use one more digit than in least accurate, then round result to least accurate
• Ex.: timing an event with a watch that has tenth-of-a-second divisions
• shouldn’t record more than two decimal places, e.g. 10.24 seconds
Error Propagation
• Precision of computed value is dependent on the precision of each directly measured value
• Example

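• In a computed value (say Q) it can be shown that the error in Q (ΔQ), where Q = f(a, b, c, ...), is given by the partial derivative form:
$$ \Delta Q = \frac{\partial Q}{\partial a}\,\Delta a + \frac{\partial Q}{\partial b}\,\Delta b + \frac{\partial Q}{\partial c}\,\Delta c + \cdots $$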

Error Propagation
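• But in this course, we have seen that individual errors are stochastic (randomly variable), so for independent errors
$$ \sigma_Q^2 = \left(\frac{\partial Q}{\partial a}\right)^2 \sigma_a^2 + \left(\frac{\partial Q}{\partial b}\right)^2 \sigma_b^2 + \left(\frac{\partial Q}{\partial c}\right)^2 \sigma_c^2 + \cdots $$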
• Example
• Find the standard deviation of CL (lift coefficient) given a 1% standard deviation each for n, W and Ve
Error Propagation
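• With $C_L = nW / (\tfrac{1}{2}\rho_0 V_e^2 S)$, the relative standard deviation is
$$ \frac{\sigma_{C_L}}{C_L} = \sqrt{\left(\frac{\sigma_n}{n}\right)^2 + \left(\frac{\sigma_W}{W}\right)^2 + \left(2\,\frac{\sigma_{V_e}}{V_e}\right)^2} = \sqrt{0.01^2 + 0.01^2 + 0.02^2} \approx 0.024 $$
• Where: σn/n = σW/W = σVe/Ve = 0.01
• A 1% error in each term gives a 2.4% error in the final result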
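The 2.4% figure can be checked numerically. The sketch below assumes the standard lift-coefficient relation CL = nW / (½ ρ0 Ve² S) and independent errors, and estimates the partial derivatives by central differences. The nominal values (weight, airspeed, wing area) are purely illustrative; the relative result does not depend on them.

```python
import math


def lift_coefficient(n, weight, v_e, rho0=0.0023769, wing_area=300.0):
    """C_L = n W / (0.5 rho0 Ve^2 S); rho0 and wing_area are illustrative constants."""
    return n * weight / (0.5 * rho0 * v_e ** 2 * wing_area)


def relative_sigma(func, nominal, rel_sigmas, step=1e-6):
    """Root-sum-square error propagation with central-difference partial derivatives."""
    q0 = func(*nominal)
    variance = 0.0
    for i, (value, rel_sigma) in enumerate(zip(nominal, rel_sigmas)):
        h = value * step
        upper = list(nominal)
        lower = list(nominal)
        upper[i] = value + h
        lower[i] = value - h
        partial = (func(*upper) - func(*lower)) / (2 * h)
        # Each term is (dQ/dx * sigma_x)^2, with sigma_x = rel_sigma * x.
        variance += (partial * rel_sigma * value) ** 2
    return math.sqrt(variance) / q0


# Hypothetical nominal point: 1 g, 20,000 lb, 300 ft/s; 1% sigma on each input.
rel = relative_sigma(lift_coefficient, [1.0, 20000.0, 300.0], [0.01, 0.01, 0.01])
print(f"relative sigma of C_L = {100 * rel:.2f}%")
```

Airspeed dominates because it enters squared: its 1% scatter contributes 2% on its own, while load factor and weight contribute 1% each.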
Questions?
