Flight Test and Statistics


### Flight Test and Statistics

“If you want to be absolutely certain you are right, you can’t say you know anything.”

### Data Analysis - Hour 5

Flight Test and Statistics

PRESENTED BY

Richard Duprey

Director, FAA Certification Programs

National Test Pilot School

Mojave, California

Flight Test and Statistics Overview

- Background on National Test Pilot School
- Coverage of Statistics
- Scope - six hours of academics
- Detail

- Use of statistics in flight test
- Types of questions we try to answer

NTPS Background

- Private non-profit
- Grants a Master of Science degree

- Only civilian school of its kind
- SETP equivalent to USAF and Navy Test Pilot Schools

- Offers variety of courses (Fixed Wing and Helicopters)
- Professional - 1 year
- Introductory
- Performance and Flying Qualities Testing
- Systems Testing
- Operational Test and Evaluation
- NVG

- FAA Test Pilot / FTE initial and recurrent training

Data Analysis - Hour 1

- Types of Errors
- Types of Data
- Elementary Probability
- Classical Probability
- Experimental Probability
- Axioms

- Examples

Introduction

- Flight testing involves data collection
- time to climb
- fuel flow for range estimates
- qualitative flying qualities ratings
- INS drift rate
- Landing and Take-off data
- Weapon effectiveness

- All of these experimental observations have inaccuracies
- Understanding these errors, their sources, and developing methods to minimize their effect is crucial to good flight testing

Types of Errors

- There are two very different types of errors
- systematic errors and random errors

- Systematic errors
- repeatable errors
- caused by flawed measuring process
- ex: measuring with an 11 inch ruler or airspeed indicator corrections

- Random errors
- not repeatable and usually small
- caused by unobserved changes in the experimental situation
- errors by observer - reading airspeed indicator
- unpredictable variations - small voltage fluctuations causing fuel counter errors

- can’t be eliminated, but typically follow a well-defined distribution

Types of Data

- There are four types of numerical data:
- NOMINAL DATA
- numerical in name only - say an aircraft configuration
- 1 = gear down, 2 = gear up, 3 = slats extended
- normal arithmetic processes not applicable
- 3 > 1 or 3 - 1 = 2 are not valid relationships

- ORDINAL DATA
- contains information about rank order only
- #1 = C-150, #2 = B-1, #3 = F-15
- in terms of max speed: 3 > 1 is valid, but not 3 - 1 = 2

Types of Data

- There are four types of numerical data (continued)
- INTERVAL DATA
- contains rank and difference information - ex: temperature in degrees Fahrenheit
- 30, 45, 60 at different times, 15 deg. difference
- zero point arbitrary, so 60 °F is not twice 30 °F

- RATIO DATA
- all arithmetic processes apply
- most flight test data falls into this category
- Can say that a 1000 pound per hour fuel flow is 4 times greater than 250 PPH

Probability and Flight Test

- Quantitative analysis of random errors of measurement in flight testing must rely on probability theory
- Goal
- Student to understand what technique is appropriate and limitations on the results

Elementary Probability

- The probability of event A occurring is the fraction of the total times that we expect A to occur:

$$P(A) = \frac{n_a}{N}$$

- Where: P(A) is the probability of A occurring
- na is the number of times we expect A to occur
- N is the total number of attempts or trials

Elementary Probability

- From this definition, P(A) must always be between 0 and 1
- if A always happens, na = N and P(A) = 1
- if A never happens, na= 0 and P(A) = 0

- In order to determine P(A) we can take two different approaches
- make predictions based on foreknowledge (“a priori”)
- conduct experiments (“a posteriori”)

Classical (‘a priori’) Probability

- If it is true that
- every single trial leads to one of a finite number of outcomes
- and, every possible outcome is equally likely

- Then,
- na is the number of ways that A can happen
- N is the total number of possible outcomes

- For example:
- six-sided die implies six possible outcomes: N = 6
- if A is getting a 6 on one roll, na = 1
- P(A) = 1/6 = 0.1667

Second Example

- What is the probability of getting two heads when we toss two fair coins?
- There are four possible outcomes (N = 4)
- (H,H) (H,T) (T,H) (T,T)

- na = 1 since only one of the possible outcomes results in two heads (H,H)
- Thus P(A) = 1/4 = 0.25

Classical (‘a priori’) Probability

- Approach instructive
- Generally not applicable to flight test where:
- Possible outcomes infinite
- Each possible outcome not equally likely
- Leads us to second approach

Experimental (‘a posteriori’) Probability

- Experimental probability is defined as

$$P(A) = \lim_{N_{obs} \to \infty} \frac{n_{A\,obs}}{N_{obs}}$$

- Where
- nA obs is the number of times we observe A (versus the number of times we expect A to occur)
- Nobs is the number of trials

Experimental Example

- If the probability of getting heads on a single toss of a coin is determined experimentally, we might get

[Figure: Pobs(heads), on a vertical axis from 0 to 1.0, plotted against nobs = 1, 10, 100, 1000 on the horizontal axis]
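The experimental approach can be illustrated with a small simulation. This is only a sketch: the function name `observed_probability` and the fixed seed are illustrative choices, not part of the course material.

```python
import random

def observed_probability(n_obs, seed=0):
    """Fraction of heads observed in n_obs simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed (arbitrary choice) so runs repeat
    heads = sum(rng.random() < 0.5 for _ in range(n_obs))
    return heads / n_obs

# The observed fraction wanders for small samples and settles near 0.5
# as the number of trials grows.
for n_obs in (1, 10, 100, 1000, 100_000):
    print(n_obs, observed_probability(n_obs))
```

With only a handful of tosses the estimate can be far from 0.5, which is why small flight test samples need the confidence-interval methods covered later.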

Probability Axioms

- Probability Theory can be used to describe relationships between events

Probability Axioms

- Three probability axioms are easily justified as opposed to proven
- P(not A) = 1 - P(A)
- Probability of something happening has to be one

- P(A or B) = P(A) + P(B)
- P(H or T) = 0.5 + 0.5 = 1 for a single coin

- P(A and B) = P(A) x P(B)
- P(T and T) = 0.5 x 0.5 = 0.25 for two coins
- same answer we got when examining all possible outcomes

- The last two axioms require, respectively, that
- the outcomes are mutually exclusive (for "or")
- only one can occur in a single trial

- the outcomes are independent (for "and")
- A occurring doesn’t affect the probability of B occurring

Example

- Problem:
- Based on test data, 95% of the time an F-4 will successfully make an approach-end barrier engagement on an icy runway
- What is the probability that at least one of a flight of four F-4’s will miss?

- Solution:
- P(1 or more miss) = 1 - P(all engage)
- Probability that at least one will miss is the complement of the probability that all will engage

- P(all engage) = P(1st success) × P(2nd) × P(3rd) × P(4th)
= 0.95 × 0.95 × 0.95 × 0.95 = 0.95^4 = 0.81

Thus,

- P(1 or more miss) = 1 - 0.81 = 0.19

- Problem:
- What is the probability of getting 7 or 11 on a single roll of a pair of dice?

- Solution:
- Since getting 7 and getting 11 are mutually exclusive events, we can say
- P(7 or 11) = P(7) + P(11)

- N = 6^2 = 36
- n7 = 6
- (6, 1) (1, 6) (5, 2) (2, 5) (4, 3) (3, 4)

- n11 = 2
- (6, 5) (5, 6)

- Thus,
- P(7) = 6/36, P(11) = 2/36
- P(7 or 11) = 6/36 + 2/36 = 0.222
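Both worked examples can be checked numerically. The sketch below uses illustrative variable names and enumerates the 36 dice outcomes rather than counting them by hand.

```python
from fractions import Fraction
from itertools import product

# Barrier engagement: complement of "all four aircraft engage".
p_engage = 0.95
p_at_least_one_miss = 1 - p_engage ** 4

# Dice: enumerate every outcome of two six-sided dice (N = 36).
rolls = list(product(range(1, 7), repeat=2))
n_7 = sum(1 for a, b in rolls if a + b == 7)
n_11 = sum(1 for a, b in rolls if a + b == 11)

# 7 and 11 cannot occur on the same roll, so the probabilities add.
p_7_or_11 = Fraction(n_7 + n_11, len(rolls))

print(round(p_at_least_one_miss, 2))
print(n_7, n_11, float(p_7_or_11))
```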

Data Analysis - Hour 2

- Populations and Samples
- Measures of Central Tendency
- Dispersion

- Probability Distributions
- Discrete
- Continuous
- Cumulative

Population & Samples

- A population is all possible observations
- Many populations are infinite
- A pair of dice can be rolled indefinitely
- Population of F-117 weapons deliveries is all the possible drops it could make in its lifetime

- Some populations are limited
- Votes by registered Republicans

- A sample is any subset of a population
- For example
- 100 rolls of a pair of dice
- Bomb scores for 100 weapon delivery sorties

Population Constructs

- Constructing a population
- Must impose assumptions
- Homogeneous
- Independent
- Random

Sample Requirements

- Homogeneous
- the data must come from one population only
- DC-10 take-off data shouldn’t be used with MD-11

- Independent
- selecting one data point must not affect subsequent probabilities
- selecting and removing a heart from a deck of cards changes the probability of drawing another heart
- DC-10 landing 75 feet past touchdown aim point on one landing doesn’t change probability that next landing will miss by same distance (or any distance)

- Random
- equal probability of selecting any member of population
- using a member of a population with a bias would be non-random
- F-16 with boresight error would cause a bias in downrange miss distance

Measures of Central Tendency

- Given a homogeneous, independent, random sample, we need to describe the contents of that sample
- Measure steel rod diameter with a micrometer - would get several different answers
- Tighten the micrometer
- Dust particles on the rod
- Reading scale on micrometer

- What to do with answers that are different?

Measures of Central Tendency

- There are three common measures of central tendency:
- Mean (arithmetic average) - most commonly used
- Mode
- most common value in the sample
- there may be more than one mode

- Median
- middle value
- for an even-numbered sample, average the two middle values

- Dangers ........

Dispersion

- Just reporting the mean as the answer can be very misleading
- Consider the following two samples, both with a mean of 100 (and same median as well)
- Sample 1: 99.9, 100, 100.1
- Sample 2: 0.1, 100, 199.9

- We also need to report how much the data generally differs from the mean value

Deviation

- We define deviation as the difference between the ith data point and the mean:

$$d_i = x_i - \bar{x}$$

- Averaging the deviations does not help, because they always sum to zero:

$$\frac{1}{N}\sum_{i=1}^{N} d_i = 0$$

Mean Deviation

- Since the deviations above and below the mean cancel, we could average the absolute values of deviations:

$$\text{mean deviation} = \frac{1}{N}\sum_{i=1}^{N} \left| x_i - \bar{x} \right|$$

Standard Deviation

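- While the mean deviation can be used, the standard deviation σ is a more common measure of dispersion:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$$

- versus

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}$$

- The square of the standard deviation, σ², is called the variance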

Notation

- Normally, we use Greek letters to denote statistics for populations:
μ for population mean

σ² for population variance

- And we use Roman letters for sample statistics:
x̄ for sample mean

s² for sample variance

Sample Standard Deviation

- One other difference exists between σ and s
- The sample standard deviation has the sum of the squares divided by N - 1 versus N
- Mathematically, this is due to a loss of one degree of freedom
- The effect is to increase the standard deviation slightly
- Difference decreases as sample gets larger
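The two samples from the Dispersion slide make the point concrete. The sketch below uses Python's standard `statistics` module; the helper name `describe` is illustrative.

```python
import statistics

def describe(sample):
    """Mean, mean deviation, and both standard deviations of a sample."""
    mean = statistics.fmean(sample)
    mean_deviation = sum(abs(x - mean) for x in sample) / len(sample)
    return {
        "mean": mean,
        "mean_deviation": mean_deviation,
        "sigma_divide_by_N": statistics.pstdev(sample),        # divides by N
        "s_divide_by_N_minus_1": statistics.stdev(sample),     # divides by N - 1
    }

sample_1 = [99.9, 100, 100.1]
sample_2 = [0.1, 100, 199.9]

# Same mean and median, very different dispersion.
print(describe(sample_1))
print(describe(sample_2))
```

For each sample the N - 1 form is slightly larger than the N form, and the gap shrinks as the sample grows.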

Flight Test Example - PA28 Takeoff Distance

- Two data points eliminated - wrong configuration, improper technique
- Data adjusted for standard weight (2150 lbs.), runway slope (GPS), temperature, pressure, airspeed/altimeter corrections
- Technique, rotate at 65, liftoff at 70, maintain 75 until 50 feet AGL

Probability Distributions

- Statistical applications require an understanding of the characteristics of the data obtained
- Probability distributions give us such understanding

Probability Distributions

- To understand probability distributions, consider the problem of tossing 2 coins
- Let n represent the number of heads for a single toss of both coins
- Then the probabilities of getting n = 0, 1, or 2 can be calculated:
- for n = 0, P(0) = 0.25
- for n = 1, P(1) = 0.5
- for n = 2, P(2) = 0.25

Discrete Distributions

- We can present the data as a bar graph

Empirical Distributions

- In flight test, we are concerned with empirical distributions, versus the theoretical distribution in the coin example
- If we collect data on landing errors:

Continuous Distributions

- If we get more and more data, and make the intervals smaller, our histogram approaches a continuous curve:
[Figure: Continuous Probability Distribution of Touchdown Miss Distance]

- Can’t be interpreted same way as the previous discrete distribution

Continuous Distributions

- Height of curve above a point is not the probability of “x” having that point value
- Any one point on the x-axis represents a non-zero point on the curve
- But the probability associated with that single point must be zero, since there are an infinite number of points on the x-axis
- We can meaningfully talk only about the probability of being between two points a and b on the x-axis

Probability as Area Under Curve

- The probability of getting a result between a and b is represented by the area under the probability distribution curve between a and b

$$P(a \le x \le b) = \int_a^b f(x)\, dx$$

[Figure: density f(x) versus x, with the area P(a ≤ x ≤ b) marked]

Cumulative Probability Distribution

- A cumulative probability distribution gives the probability that x is less than or equal to some value, a
- Relative probability of aircraft landing miss distances could be displayed in the following cumulative distribution

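[Figure: cumulative distribution of landing miss distance x, rising from 0 to 1.0, with the 0.5 and 0.95 levels and a miss distance xT marked]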

Data Analysis - Hour 3

- Special Probability Distributions:
- Binomial
- Normal
- Student’s t
- Chi squared

Binomial Distribution

- The binomial is a discrete distribution
- It tells us the probability of getting n successes in N trials given the probability (p) of a single success
- Limiting cases
- if n = N, then obviously P(N) = p^N
- if n = 0, then P(0) = (1 - p)^N
- or, letting q = 1 - p, P(0) = q^N

- For 0 < n < N, the possible number of combinations of success and failure gives

$$P(n) = \frac{N!}{n!\,(N-n)!}\, p^n q^{N-n}$$

Binomial Distribution - flight test ex.

- Two flight control systems are equally desirable
- What is the probability that 6 out of 8 pilots would prefer system A over B?
- If A and B are truly equally good, the probability of a pilot picking A over B is 0.5 (p = q = 0.5)
- Probability of 6 pilots picking A over B is:

$$P(6) = \frac{8!}{6!\,2!}\,(0.5)^6 (0.5)^2 = \frac{28}{256} = 0.109$$

- There is only an 11% probability that this would happen. If it did, it would suggest that your initial assumption about the two flight control systems was in error

Binomial Flt. Test Example

- If p = q = 0.5, then for N = 8 the binomial distribution is as plotted in the figure (not reproduced here); from the figure, P(2) is about 11%, which by symmetry equals P(6)
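The full distribution for N = 8 and p = 0.5 can be tabulated in a few lines. This is a minimal sketch; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N trials, single-trial probability p."""
    q = 1 - p
    return comb(N, n) * p ** n * q ** (N - n)

# Probability that exactly n of 8 pilots prefer system A when A and B are equal.
for n in range(9):
    print(n, round(binomial_pmf(n, 8, 0.5), 3))
```

The table is symmetric about n = 4, so the values for n = 2 and n = 6 match.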

Normal Distribution

- The normal distribution is a continuous probability distribution based on the binomial
- SINGLE MOST IMPORTANT DISTRIBUTION IN FLIGHT TEST ANALYSIS

- Any deviation from a mean value is assumed to be composed of multiples of elemental errors evenly distributed
- The mathematical derivation is left as an exercise

Normal Distribution

- Graphically, it can be seen that x = μ gives the maximum value and x = μ ± σ are the two points of inflection on the curve

[Figure: normal density f(x) versus x, with μ - σ, μ, and μ + σ marked]

Normal Distribution

- Thus the probability that x lies between some value a and b is given by:

$$P(a \le x \le b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx$$

- Major problem - cannot be solved explicitly
- numerical techniques are required
- tables could be used, but different tables would be required for each μ and σ.

Standard Normal Distribution

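- By using a substitution of variables

$$z = \frac{x - \mu}{\sigma}$$

- We can use tables for a normal distribution where the mean is zero and the standard deviation is one
- Thus

$$P(a \le x \le b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx$$

- Becomes

$$P(z_a \le z \le z_b) = \int_{z_a}^{z_b} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$

- Mean of zero and a standard deviation of one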

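Standardized Normal Distribution

[Figure: standard normal density f(z) versus z, with ticks from -2 to 3. Successive bands hold 2.5%, 13.5%, 34%, 34%, 13.5%, and 2.5% of the area, so about 68% lies within ±1, 95% within ±2, and 99.7% within ±3 standard deviations]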

Examples - cruise performance

- Cruise performance test flown 40 times
- Mean fuel used was 8,000 pounds
- Standard deviation was found to be 500 pounds

- Find probability that on the next sortie, we will use between 7000 and 8200 pounds
- Given μ = 8000, σ = 500
- find the probability that 7000 < x < 8200
- Standardizing: z = (7000 - 8000)/500 = -2.0 and z = (8200 - 8000)/500 = 0.4

- From table: 0.6554 - 0.0228 = 0.6326
- 63% probability that fuel used would be within the specified range
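The table lookup can be reproduced with the standard library's `statistics.NormalDist`. The sketch treats the sample mean and standard deviation as the population values, as the slide does; the variable names are illustrative.

```python
from statistics import NormalDist

# Fuel used per sortie, modeled as normal with mean 8000 lb and sigma 500 lb.
fuel = NormalDist(mu=8000, sigma=500)

# Area under the curve between the two limits.
p_in_range = fuel.cdf(8200) - fuel.cdf(7000)

# The same answer via the standardized variable z.
standard = NormalDist()
z_low = (7000 - 8000) / 500
z_high = (8200 - 8000) / 500
p_from_z = standard.cdf(z_high) - standard.cdf(z_low)

print(round(p_in_range, 4), round(p_from_z, 4))
```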

Student’s t Distribution

- Problem : To use the normal distribution we had to know the population mean and standard deviation
- Flight Test - don’t normally know the population - just have sample
- The difference between sample and population mean is described by the statistic:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

Student’s t vs n

- Different t distributions must be tabulated for each value of n
- For large n, the t-distribution approaches the standard normal distribution - use normal distribution when n ≥ 30

[Figure: t distributions for n = 2 and n = 10, plotted against t]

t - Flight Test Examples

- B-33 landing distance example

Chi-Squared (χ²) Distribution

- Just as the sample mean may differ from the population mean, we should expect a difference in the variances
- The difference is distributed according to:

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

χ² Examples

- Find χ² for 95th percentile (11.1)
- one-tailed
- 5 degrees of freedom

- Find χ² limits for 95% (0.831, 12.80)
- two-tailed
- 5 degrees of freedom

- Find the median value of χ² (27.3)
- 28 degrees of freedom

Data Analysis - Hour 4

- Confidence Limits
- Intervals for mean and variance

- Hypothesis Testing
- Null and alternate hypotheses
- Tests on mean and variance

Confidence Limits

- In practice, we take a sample from a population such as Take-off distance
- Report it as if it were the true answer
- Subsequent tests will differ - sample mean/variance will differ from true population

- Can be considered sufficiently accurate if we
- Standardize test method and conditions
- Take sufficient samples

- Quantitative methods (confidence intervals) exist to determine how certain we are that we have the correct answer

Central Limit Theorem

Given a population with mean μ and variance σ², the distribution of successive sample means, from samples of n observations, approaches a normal distribution with mean μ and variance σ²/n

Central Limit Theorem

- Regardless of original Distribution of A, the distribution of the means will be approximately normal - gets better as n increased
- Mean of the means will be the same as the mean of A
- Variance of means = function of variance of A divided by n

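[Figure: distribution of x, and the resulting distribution of sample means x̄ for samples of size n]

Confidence Interval for Mean

- If we take samples of size n, the means of multiple tests (okay, samples) will be normally distributed
- Thus

$$P\left(-z_{1-\alpha/2} \le z \le z_{1-\alpha/2}\right) = 1 - \alpha$$

[Figure: standard normal density f(z) versus z, with an area of α/2 in each tail]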

Confidence Interval - Means

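- If z comes from one of our samples

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$

or, using the central limit theorem

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

- Thus

$$P\left(\bar{x} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$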

Confidence Interval - Means

- Thus 100(1 - α) percent of the time, the true population mean μ will be within a certain range about the sample mean
- The range of values is the confidence interval
- And (1 - α) is the confidence level

Example - flight test

- Find 95% confidence interval for F-100 engine thrust given:
n = 50 engines tested

mean thrust = 22,700 lbs

s = 500 lbs

- At 95%, α = 0.05, z(1 - α/2) = 1.96
μ = 22,700 +/- 1.96 (500/√50)

22,561 < μ < 22,839

- At 99%, α = 0.01, z(1 - α/2) = 2.58
μ = 22,700 +/- 2.58 (500/√50)

22,518 < μ < 22,882

- Observations
- Interval widens for increased certainty
- Had to use “s” as an estimate for σ, legitimate for n > 30
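The large-sample interval can be computed directly. In this sketch the function name `mean_confidence_interval` is illustrative, and s stands in for σ as the slide allows for n > 30.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence):
    """Large-sample (z-based) confidence interval for the population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for 95%
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

for confidence in (0.95, 0.99):
    low, high = mean_confidence_interval(22_700, 500, 50, confidence)
    print(confidence, round(low), round(high))
```

Raising the confidence level from 95% to 99% widens the interval, as noted in the observations above.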

Small Sample Confidence Intervals

- Some flight tests involve numerous repeated test points, most do not
- But when n <30, we must substitute t for z
- For example, if our earlier problem were based on only a sample of 5, what would the 95% confidence interval be?

Example - flight test

- Find 95% confidence interval for F-100 engine thrust given:
n = 5 engines tested

mean thrust = 22,700 lbs

s = 500 lbs

- At 95%, α/2 = 0.025, ν = 4, t(4, 0.975) = 2.78
μ = 22,700 +/- 2.78 (500/√5)

22,078 < μ < 23,321

vs. 22,561 < μ < 22,839 for 95% with n = 50

vs. 22,518 < μ < 22,882 for 99% with n = 50

- Again used “s” as an estimate for σ; because n < 30 here, t replaces z

Confidence Interval for Variance

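- Similar to intervals for means, the confidence interval for variance is based on the χ² statistic:

$$\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,\nu}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,\nu}}, \qquad \nu = n - 1$$

- For example, find the 95% confidence interval where n = 6, s = 2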

Confidence Interval for Variance

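- At 95%, α/2 = 0.025, 1 - α/2 = 0.975, ν = 5, s = 2

$$\frac{5(2)^2}{12.8} \le \sigma^2 \le \frac{5(2)^2}{0.831} \;\Rightarrow\; 1.56 \le \sigma^2 \le 24.1$$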

- Large band due to small sample size, if n = 18, interval would be smaller

Hypothesis Testing

- Instead of just using data to estimate some parameter, we hypothesize an answer and then use data to judge its reasonableness
- Truth can be known with certainty only if we examine the entire population
- Example
- assume a coin is fair (hypothesis)
- toss the coin 100 times
- if results are
- 48 heads, conclude coin is fair
- 35 heads, conclude coin is not fair

Null Hypothesis

- Acceptance of a statistical hypothesis
- result of insufficient evidence to reject it
- doesn’t necessarily mean that it is true

- Thus, it is important to carefully select initial hypothesis (the null hypothesis - H0 )
- selected for purposes of rejecting it – called the null hypothesis
- if we don’t gather enough data we must accept the null hypothesis
- Formulated so that in case of insufficient data, we return to the status quo or safe conclusion

- Examples of null hypothesis
- the defendant is innocent
- the new RADAR is no better than the old
- the MTBF of a new part is no better than the old

Alternate Hypothesis

- Since we are trying to negate the null hypothesis (H0) with data, the alternate hypothesis (H1) must be defined -- H0 must be “opposite” of H1
- Examples:
- 1. H0: μ = 15 H1: μ ≠ 15
- 2. H0: p ≥ 0.9 H1: p < 0.9
- 3. Lock-on range of new radar is better than old

Types of Errors

- A Type I error
- rejecting null hypothesis when it is true
- chance variation of fair coin gives 35/100 heads
- probability is denoted as α (the level of significance)

- A Type II error
- accepting null hypothesis when it is false
- 43/100 concluded as fair when P(A) = 0.4
- probability is denoted as β (the power of the test is 1 - β)

- We want small α
- as α decreases, β increases (fixed sample size)
- Large β implies we stay with the status quo, H0, more frequently than we should - a more “acceptable error”
- to decrease both, increase sample size

Hypothesis Testing

- Step One
- Form null and alternate hypothesis

- Step Two
- Choose level of significance (α)
- Define areas of acceptance and rejection (one or two tailed)

- Step Three
- Collect data and compare to expectations

- Step Four
- Accept or reject the null hypothesis

Hypothesis Testing “Two Tailed”

- Some tests - interested in extremes in either direction
- Two Tailed

- Example: Burn times on an ejection seat rocket motor
- Too short - don’t clear aircraft
- Too long - impose too many g’s on pilot

- Form hypothesis of the form
- H0: μ = μ0 H1: μ ≠ μ0
- Reject H0 whenever the sample produces results too low or too high

- Not the usual for flight test - usually deal with “One Tailed”

Hypothesis Flight Test Examples Two Tailed

- Early Testing of F-19 bombing system for 30º dive angles gave
- Cross range errors were normally distributed
- Mean error of 20 ft and a standard deviation of 3 feet.

- After a flight control modification to solve a high AOA flying qualities problem, it was found
- Sample mean cross range error for nine bombs was 22 feet.
- Has the mean changed at the 0.05 level of significance?

Hypothesis Testing Two Tailed

- Step One
- Form null and alternate hypothesis
- H0: μ = 20 (status quo) H1: μ ≠ 20

- Step Two
- Choose level of significance: α = 0.05 (given)
- Define areas of acceptance and rejection (one or two tailed)
- α = 0.05 would be divided into two tails - hi/lo
- extreme values in either direction would indicate change in μ
- otherwise, μ has not changed significantly from the unmodified system

Hypothesis Testing Two Tailed

- Step Three
- Collect data and compare to expectations

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{22 - 20}{3/\sqrt{9}} = 2.0$$

- Step Four
- Accept or reject the null hypothesis
- Since z = 2 which is > 1.96
- Reject the null hypothesis at the 0.05 level of significance (95% confidence)
- Mean cross range bombing error has changed due to flight control modification

[Figure: standard normal density versus z, with a rejection region of α/2 = 0.025 in each tail and the acceptance region between them]
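The four steps for the bombing example can be sketched as a short function. The name `z_test_two_tailed` is illustrative, and the test assumes the population standard deviation (3 ft) is known from the early testing.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu_0, sigma, n, alpha):
    """Return (z, critical z, reject?) for H0: mu = mu_0 against H1: mu != mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # alpha split over two tails
    return z, z_critical, abs(z) > z_critical

# Nine bombs, sample mean 22 ft, against the earlier mean of 20 ft, sigma 3 ft.
z, z_critical, reject = z_test_two_tailed(22, 20, 3, 9, 0.05)
print(z, round(z_critical, 2), reject)
```

The margin is small here: z = 2.0 only just exceeds the critical value of about 1.96, so at α = 0.01 the same data would not reject H0.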

Hypothesis Testing “One Tailed”

- Most flight tests - interested in extremes in only one direction
- One Tailed - small sample, unknown σ

- Example: Does aircraft satisfy contractual range requirements
- Only care if distance is shorter than specified

- Form hypothesis of the form
- H0: μ ≥ μ0 H1: μ < μ0
Or

- H0: μ ≤ μ0 H1: μ > μ0
- Reject H0 whenever the sample produces results extreme in one direction

Hypothesis Flight Test Examples One Tail

- Contract fuel climb requirements
- Use less than 1500 pounds in climb from Sea Level to 20,000 feet

- Test results
- Nine climbs average of 1600 lbs
- Sample standard deviation of 200lbs.

- Do we penalize the contractor?

Hypothesis Testing One Tailed

- Step One
- Form null and alternate hypothesis
- H0: μ ≤ 1500 (innocent until proven guilty) H1: μ > 1500

- Step Two
- Choose α = 0.05 for level of significance
- α = 0.01 reserved for safety of flight questions

- Define areas of acceptance and rejection (one or two tailed)
- one tailed - contract not met only if fuel used was on the high side

Hypothesis Testing One Tailed

- Step Three
- Collect data and compare to expectations

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1600 - 1500}{200/\sqrt{9}} = 1.5$$

- Step Four
- Accept or reject the null hypothesis
- Since t = 1.5 which is < 1.860 (critical t for α = 0.05 with 8 degrees of freedom)
- At the 95% confidence level, accept the null hypothesis
- Contractor has met climb fuel requirements

- Put another way
- Don’t have data at the 95% confidence level to show contractor failed to meet specs

[Figure: one-tailed test, with the rejection region of α = 0.05 in the upper tail and the acceptance region below it]

Hypothesis Test Examples - Variance

- “Four” steps still valid here
- Substitute chi-squared for z or t
- Example on variance
- The contract states the standard deviation of miss distances for particular weapon system delivery mode must not exceed 10 meters at 90 % confidence.
- In ten test runs we get s = 12 meters.
- Is the contractor in compliance?

Hypothesis Testing One Tailed Variance

- Step One
- Form null and alternate hypothesis
- H0: σ ≤ 10 H1: σ > 10

- Step Two
- α = 0.10 was specified
- smaller σ’s are good >>> implies one sided test
- Extremely large values of s will nullify H0

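Hypothesis Testing One Tailed Variance

- Step Three
- Collect data and compare to expectations

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9(12)^2}{10^2} = 12.96 \approx 13$$

- Step Four
- Accept or reject the null hypothesis
- Since 13 < 14.7 (critical χ² for α = 0.10 with 9 degrees of freedom), accept H0 that σ ≤ 10 meters
- Can’t conclude contractor has failed to meet spec

Data Analysis - Hour 5

- Tests for non-normal distributions
- Sample size
- Error Analysis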

Parametric vs. Nonparametric

- Non-parametric tests make no assumption about population distribution
- Everything so far --- assumed normal
- These tests less useful when used on normal distributions – require a larger sample size to give us same info from the test

- Use “goodness of fit tests” to determine distribution type
- Normal – use methods already described
- Otherwise, use non-parametric

- Three non-parametric tests useful in flight test

Nonparametric Tests

- Three nonparametric tests we’ll use are
- Rank Sum Test
- also called the U test, Wilcoxon test, and Mann-Whitney test

- Sign Test
- can be applied to ordinal data

- Signed Rank Test
- combination of sign and rank sum tests

- All test the null hypothesis that two different samples come from the same population - assumes both are equivalent
- Calculates statistics from the two samples
- Determines probability --- decide if original assumption correct

Rank Sum Test “U Test or Mann Whitney”

The method (based on the binomial distribution) consists of:

- Rank order all data from each sample
- Assign rank values to each data point
- average rank for repeated data values

- Compute the sum of the ranks for each sample (R1, R2)
- Calculate the U statistic for each sample (n = sample size)

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$

- Compare the smaller U to the critical value in reference
- If U < critical value, reject H0 (i.e., that the two samples come from the same population)

Rank Sum Example Radar Flight Test

- The target detection range (nm) of two radars was
- System 1: 9, 10, 11, 14, 15, 16, 20
- System 2: 4, 5, 5, 6, 7, 8, 12, 13, 17

- Is there a difference between the two systems at 90% confidence?

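Rank Sum Example

- Rank order all scores and assign rank values

| Range (nm) | 4 | 5 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| System | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 2 | 1 |
| Rank | 1 | 2.5 | 2.5 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |

R1 = 7+8+9+12+13+14+16 = 79

R2 = 1+2.5+2.5+4+5+6+10+11+15 = 57

Calculate U1, U2

U1 = (7)(9) + 7(8)/2 - 79 = 12

U2 = (7)(9) + 9(10)/2 - 57 = 51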

Rank Sum Flight Test Ex.

- Compare smaller U (12 in this case) with critical values for
- α = 0.10, n1 = 7, n2 = 9: Ucr = 15

- Since U < Ucr
Reject null hypothesis that two radars have the same performance with 90% confidence
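The ranking and U calculation for the radar example can be sketched as follows. `rank_sum_u` is an illustrative helper, and the critical value 15 is taken from the slide rather than computed.

```python
def rank_sum_u(sample_1, sample_2):
    """Rank sums (R1, R2) and U statistics (U1, U2) for two samples."""
    pooled = sorted(sample_1 + sample_2)
    # Tied values share the average of the rank positions they occupy.
    rank_of = {}
    for value in set(pooled):
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        rank_of[value] = sum(positions) / len(positions)
    r1 = sum(rank_of[v] for v in sample_1)
    r2 = sum(rank_of[v] for v in sample_2)
    n1, n2 = len(sample_1), len(sample_2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

system_1 = [9, 10, 11, 14, 15, 16, 20]
system_2 = [4, 5, 5, 6, 7, 8, 12, 13, 17]
r1, r2, u1, u2 = rank_sum_u(system_1, system_2)

u_critical = 15  # tabulated value quoted on the slide for alpha = 0.10, n1 = 7, n2 = 9
print(r1, r2, u1, u2, min(u1, u2) < u_critical)
```

A useful check is that U1 + U2 always equals n1 × n2, here 63.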

Sign Test

- Requires paired observations of two samples with a “better than” evaluation
- Can be used on ordinal data, such as pilots preferring system A or B
- Pilot preferring system A over B is same as B over A

- The probability of system A being preferred over system B, x times in N tests is just

$$f(x) = \frac{N!}{x!\,(N-x)!}\, p^x q^{N-x}$$

- But if H0 is A = B, then p = q = 0.5, and

$$f(x) = \frac{N!}{x!\,(N-x)!}\,(0.5)^N$$

Sign Test

- But f(x) is just the probability for one discrete point, such as 3 of 8 pilots preferring A over B, and we need the whole tail
- Thus (i.e. sum)

$$P(X \le x) = \sum_{k=0}^{x} \frac{N!}{k!\,(N-k)!}\,(0.5)^N$$

Sign Test Example Modified Flight Control System

- Suppose 10 pilots evaluate handling qualities of two different sets of control laws during powered lift approaches
- The results are
- 7 prefer system B
- 2 prefer system A
- 1 had no preference

- Should we switch to the new control laws?

Sign Test Example Evaluation of new flight control system laws

- Null hypothesis is that both systems (old and new) are equally desirable
- Choose 0.05 level of significance since safety of flight (SOF) is not an issue
- Calculate probability of 0, 1 or 2 pilots choosing system A if there were really no difference
- If probability is less than level of significance, reject H0
- Conclude B is better than A

Sign Test Example Evaluation of new flight control system laws

- With the no-preference pilot excluded (N = 9), P(0, 1 or 2 choosing A) = (1 + 9 + 36)/512 = 0.09
- Can only be 91% sure that B is really better than A
- Not enough – need 95% to justify added expense of System B
- Thus, accept H0 – no significant difference between A and B
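The tail probability behind the 91% figure can be reproduced as below. This sketch assumes the pilot with no preference is dropped, leaving N = 9, which is what makes the numbers on the slide work out; `sign_test_tail` is an illustrative name.

```python
from math import comb

def sign_test_tail(x, N):
    """P(x or fewer preferences for A in N trials) when p = q = 0.5."""
    return sum(comb(N, k) for k in range(x + 1)) / 2 ** N

# 2 of 9 pilots with a preference chose A; one pilot had no preference.
p_tail = sign_test_tail(2, 9)
alpha = 0.05

print(round(p_tail, 3), p_tail < alpha)
```

Because the tail probability of about 0.09 is larger than 0.05, the result is not significant at the chosen level and H0 stands.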

Signed Rank Test

- Combines elements of both the Sign Test and the Rank Sum Test
- That is, the Sign Test can be made more powerful if there is some indication of how much one system was preferred over another
- Method:
- Rank differences by absolute magnitude
- Sum the positive and negative ranks (W+, W-)
- Compare the smaller W with critical values in reference
- Reject H0 if W < Wcr

Signed Rank Example

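- If ten pilots who evaluated two competing systems gave them a Cooper Harper rating on a scale of 1 to 10:

| Pilot | System A | System B | Difference |
|---|---|---|---|
| 1 | 3 | 1 | 2 |
| 2 | 5 | 2 | 3 |
| 3 | 3 | 4 | -1 |
| 4 | 4 | 3 | 1 |
| 5 | 3 | 3 | 0 |
| 6 | 4 | 2 | 2 |
| 7 | 4 | 1 | 3 |
| 8 | 2 | 1 | 1 |
| 9 | 3 | 1 | 2 |
| 10 | 1 | 2 | -1 |

Signed Rank Example

- Ranking differences by absolute magnitude, ignoring zero difference:

| Difference | -1 | 1 | 1 | -1 | 2 | 2 | 2 | 3 | 3 |
|---|---|---|---|---|---|---|---|---|---|
| Rank | 2.5 | 2.5 | 2.5 | 2.5 | 6 | 6 | 6 | 8.5 | 8.5 |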

Signed Rank Example

- Summing positive and negative ranks:
W+ = 2.5 + 2.5 + 6 + 6 + 6 + 8.5 + 8.5 = 40.0

W- = 2.5 + 2.5 = 5.0

- Using α = 0.05, WCR = 8 (one tailed criteria)
- Since 5 < 8 (WCR), can reject H0
- There is a difference between A and B with 95% confidence
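The rank sums for the Cooper Harper example can be recomputed from the raw ratings. `signed_rank` is an illustrative helper, and the critical value 8 is the tabulated figure quoted on the slide.

```python
def signed_rank(differences):
    """Return (W+, W-) for paired differences, ignoring zero differences."""
    nonzero = [d for d in differences if d != 0]
    magnitudes = sorted(abs(d) for d in nonzero)
    # Tied magnitudes share the average of the rank positions they occupy.
    rank_of = {}
    for magnitude in set(magnitudes):
        positions = [i + 1 for i, m in enumerate(magnitudes) if m == magnitude]
        rank_of[magnitude] = sum(positions) / len(positions)
    w_plus = sum(rank_of[abs(d)] for d in nonzero if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in nonzero if d < 0)
    return w_plus, w_minus

system_a = [3, 5, 3, 4, 3, 4, 4, 2, 3, 1]
system_b = [1, 2, 4, 3, 3, 2, 1, 1, 1, 2]
differences = [a - b for a, b in zip(system_a, system_b)]

w_plus, w_minus = signed_rank(differences)
w_critical = 8  # tabulated value quoted on the slide (alpha = 0.05, one tailed)
print(w_plus, w_minus, min(w_plus, w_minus) < w_critical)
```

With nine nonzero differences the two sums must total 9 × 10 / 2 = 45, which is a quick check on the ranking.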

Sample Size

- One of the most significant aspects of statistics for flight testing is to determine how much you need to test
- Too few data points will result in poor conclusions or recommendations
- Too many data points will waste limited resources

- Two approaches for determining sample size
- Sample size when accuracy is the driving factor
- An approach for determining significant differences between means

- Tradeoffs

Accuracy Driven

- Required to determine a population statistic such as takeoff distance within some accuracy ~ 10%
- Concept of confidence interval can be used to determine required number of sample points

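- Remember the confidence interval of the mean:
$$\bar{x} - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}$$
- But $z_{1-\alpha/2}\,\sigma/\sqrt{n}$ is the error $E$, thus
$$n = \left(\frac{z_{1-\alpha/2}\,\sigma}{E}\right)^2$$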

Accuracy Example How Many Sorties Required to Determine T/O Distance?

- System Program Office wants us to determine Takeoff distance within 10% during the test program
- Historically we find the standard deviation for similar aircraft to be about 20% of the mean
- We need to be 95% confident of our answer
- How many data points should we plan?

How Many Sorties Required to Determine T/O Distance?

- z0.975 = 1.96 for 95% confidence
- σ = 0.2 (historical σ is 20% of the mean)
- Error = +/- 0.1 (10% error)
- Tests required (n) = (1.96 × 0.2 / 0.1)² ≈ 15.4
- Rounding up, 16 takeoffs would be required

- Check to see if assumption about standard deviation remains reasonable (test hypothesis on variance) during testing
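
The takeoff-distance calculation can be reproduced with a short Python sketch. It assumes the usual normal-approximation relation n = (z σ / E)² and rounds up to a whole number of tests; the function name `samples_for_accuracy` is hypothetical.

```python
import math
from statistics import NormalDist


def samples_for_accuracy(sigma, error, confidence=0.95):
    """Number of tests so the mean is known within +/- error.

    sigma and error must be in the same units (here, fractions of the mean).
    """
    # Two-sided interval: use the 1 - alpha/2 quantile (1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / error) ** 2)


# Standard deviation 20% of the mean, desired accuracy 10%, 95% confidence.
takeoffs = samples_for_accuracy(sigma=0.2, error=0.1)
print(takeoffs)  # 16
```

Because n grows with the square of σ/E, halving the allowed error roughly quadruples the number of sorties.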

General Approach for Determining Significant Differences Between Means

- For the general problem of whether or not a system meets a specification, or whether there is a significant difference between two systems, the approach is more complex
- The difference between paired samples (d) from two populations will have some distribution
- If the two populations are the same, the mean of the d’s will be zero
- If they are not the same, the mean will be non-zero

Determining Significant Differences Between Means

- If the difference between the population means is δ1, then test results above and below a mean difference d of xc will give
- Test result giving mean difference above xc
- Populations differ in their means with level of significance α

- Test result below xc
- No difference is declared when in fact there was one, with probability β

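[Figure: distribution f(d) of the mean difference d, marking the critical value xc, the minimum significant difference δ1, and the error probabilities α and β]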

Determining Significant Differences Between Means

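- Move xc to the right: α is reduced but β increases, etc.
- The only way to reduce both is to increase the sample size
- The sample size needed to determine the difference between two populations is a function of α, β, δ1, σ1, and σ2:
$$n = \frac{(z_{1-\alpha} + z_{1-\beta})^2\,(\sigma_1^2 + \sigma_2^2)}{\delta_1^2}$$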

General Approach Weapons System Delivery Accuracy - example

- How many data points are required to determine if a system meets the specification for a weapon delivery accuracy of 5 mils?
- We need
- α; normally set at 0.10, 0.05, or 0.01 (0.01 is usually reserved for critical safety-of-flight issues) - use 0.05 here
- β; set this larger than α, typically 0.1 or 0.2 - use 0.1 here
- δ1; the least difference considered significant - use 1 mil here
- σ1 and σ2; these come from testing (initially from historical data)
- note that σ for a specification is zero
- assume 3 mils for σ1 here (i.e., results from a previous test)

General Approach Weapons System Delivery Accuracy - example

- How many data points are required to determine if a system meets the specification for a weapon delivery accuracy of 5 mils?
$$n = \frac{(1.645 + 1.282)^2\,(3^2 + 0^2)}{1^2} \approx 77$$
- 77 test points required - probably not feasible - must look at trade-offs
- How significant is it if we change β from 0.10 to 0.20 or change δ1 from 1 to 1.5?
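
The trade-off question can be explored numerically. The sketch below assumes the common normal-approximation form n = (z(1-α) + z(1-β))² (σ1² + σ2²) / δ1² with a one-tailed α, which reproduces the 77 points quoted above; the function name `samples_for_difference` is hypothetical.

```python
from statistics import NormalDist


def samples_for_difference(alpha, beta, delta, sigma1, sigma2=0.0):
    """Approximate sample size to detect a mean difference of delta.

    alpha is one-tailed; sigma2 = 0 when comparing against a specification.
    Returns the unrounded value so the trade-offs are easy to compare.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return (z_alpha + z_beta) ** 2 * (sigma1**2 + sigma2**2) / delta**2


# Weapon delivery example: alpha = 0.05, beta = 0.1, delta = 1 mil, sigma1 = 3 mils.
baseline = samples_for_difference(0.05, 0.10, delta=1.0, sigma1=3.0)
relaxed_beta = samples_for_difference(0.05, 0.20, delta=1.0, sigma1=3.0)
larger_delta = samples_for_difference(0.05, 0.10, delta=1.5, sigma1=3.0)

print(round(baseline, 1))      # about 77.1
print(round(relaxed_beta, 1))  # fewer points when beta is relaxed to 0.20
print(round(larger_delta, 1))  # fewer still when delta is widened to 1.5 mils
```

Under these assumptions, widening δ1 is the stronger lever because n scales with 1/δ1².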

Tradeoffs

- The general approach
- can lead to unacceptable answers
- has several choices

- Analyzing these options can lead to logical choices

[Figure: required sample size n versus δ1 at α = 0.1, with curves for β = 0.1 and β = 0.2]

Sample Size Non-parametric Tests

- Sample size cannot be determined with accuracy
- Signed rank test is about 90% as efficient as a test on means using the z statistic
- Calculate n as just described and divide by 0.90

- How many pilots do we need to evaluate new flight control system laws and be 90% certain that there is a significant improvement (defined by the Cooper-Harper scale)?
α = 0.10, β = 0.20 (arbitrary), δ1 = 1

σ1, σ2 - review of similar tests shows σ ≈ 1

Sample Size Non-parametric Tests

- Yields
$$n = \frac{(1.282 + 0.842)^2\,(1^2 + 1^2)}{1^2} \approx 9, \qquad \frac{9}{0.90} = 10$$
- Thus, 10 evaluation pilots would be needed
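
As a rough check of the pilot count, the parametric estimate can be divided by the 0.90 efficiency quoted on the previous slide. This sketch assumes the same normal-approximation sample-size formula as the general approach (one-tailed α); the names `parametric_n` and `SIGNED_RANK_EFFICIENCY` are hypothetical.

```python
from statistics import NormalDist

SIGNED_RANK_EFFICIENCY = 0.90  # relative to the z test on means, per the slide


def parametric_n(alpha, beta, delta, sigma1, sigma2):
    """Unrounded normal-approximation sample size for a test on means."""
    z_sum = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(1 - beta)
    return z_sum**2 * (sigma1**2 + sigma2**2) / delta**2


# alpha = 0.10, beta = 0.20, delta = 1 rating point, sigma about 1 for each system.
n_means = parametric_n(0.10, 0.20, delta=1.0, sigma1=1.0, sigma2=1.0)
n_pilots = n_means / SIGNED_RANK_EFFICIENCY
print(round(n_means), round(n_pilots))  # 9 10
```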

Error Analysis

- Thus far we have discussed errors of directly measured parameters
- In flight test we normally combine observations into calculated values
- fuel used = fuel flow x time
- specific range = velocity / fuel flow

- The propagated or combined error can thus be significantly larger than any one individual error would imply

Significant Figures

- The number of significant figures in a result implies a level of precision
- Definition
- the leftmost nonzero digit is the most significant figure
- the least significant figure is
- rightmost nonzero digit (no decimal point)
- rightmost digit (with a decimal point)

- all digits between least and most significant are significant digits

- Rules
- addition/subtraction: keep one more decimal digit than in least accurate number
- other: use one more digit than in least accurate, then round result to least accurate

- Ex. Timing an event with a watch with tenth-of-a-second divisions
- shouldn’t record more than two decimal places, e.g. 10.24 seconds
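
The definition of significant figures above can be expressed as a small Python helper. This is an illustrative sketch only: the function name `count_sig_figs` is hypothetical, and it takes the number as text in plain decimal notation because trailing zeros are lost once a value is stored as a float.

```python
def count_sig_figs(text):
    """Count significant figures in a plain decimal string such as '10.24'."""
    s = text.strip().lstrip("+-")
    has_point = "." in s
    digits = s.replace(".", "")
    # The most significant figure is the leftmost nonzero digit.
    digits = digits.lstrip("0")
    if not has_point:
        # Without a decimal point, the least significant figure is the
        # rightmost nonzero digit, so trailing zeros do not count.
        digits = digits.rstrip("0")
    return len(digits)


print(count_sig_figs("10.24"))   # 4
print(count_sig_figs("1200"))    # 2
print(count_sig_figs("1200."))   # 4
print(count_sig_figs("0.0120"))  # 3
```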

Error Propagation

- Precision of computed value is dependent on the precision of each directly measured value
- Example
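In a computed value (say Q) it can be shown that the error in Q (ΔQ), where Q = f(a, b, c, ...), is (partial derivative form):
$$\Delta Q = \frac{\partial Q}{\partial a}\,\Delta a + \frac{\partial Q}{\partial b}\,\Delta b + \frac{\partial Q}{\partial c}\,\Delta c + \cdots$$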

Error Propagation

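- But in this course, we have seen that individual errors are stochastic (randomly variable), so
$$\sigma_Q^2 = \left(\frac{\partial Q}{\partial a}\right)^2\sigma_a^2 + \left(\frac{\partial Q}{\partial b}\right)^2\sigma_b^2 + \left(\frac{\partial Q}{\partial c}\right)^2\sigma_c^2 + \cdots$$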
- Example
- Find the standard deviation of CL (lift coefficient) given a 1% standard deviation each for n, W and Ve

Error Propagation

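$$\frac{\sigma_{C_L}}{C_L} = \sqrt{\left(\frac{\sigma_n}{n}\right)^2 + \left(\frac{\sigma_W}{W}\right)^2 + \left(2\,\frac{\sigma_{V_e}}{V_e}\right)^2}$$
- Where:
$$C_L = \frac{nW}{\tfrac{1}{2}\rho_0 V_e^2 S}$$
with sea-level density $\rho_0$ and wing area $S$ taken as exact
- A 1% error in each term gives a $\sqrt{1^2 + 1^2 + 2^2}\,\% \approx 2.4\%$ error in the final result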
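
The 2.4% figure can be checked numerically with a root-sum-square propagation that estimates each partial derivative by central differences. This is a sketch under stated assumptions: the helper `propagate_std` is hypothetical, the lift coefficient is taken as C_L = 2nW / (ρ0 Ve² S), and the flight condition and wing area are made-up values (the relative result does not depend on them).

```python
import math


def propagate_std(f, values, stds, rel_step=1e-6):
    """Standard deviation of f(*values) from independent input errors.

    Uses sigma_Q^2 = sum((dQ/dx_i)^2 * sigma_i^2), with each partial
    derivative estimated by a central difference. Inputs must be nonzero.
    """
    total = 0.0
    for i, (value, std) in enumerate(zip(values, stds)):
        h = abs(value) * rel_step
        up = list(values)
        down = list(values)
        up[i] = value + h
        down[i] = value - h
        partial = (f(*up) - f(*down)) / (2 * h)
        total += (partial * std) ** 2
    return math.sqrt(total)


RHO0 = 1.225      # kg/m^3, sea-level standard density
WING_AREA = 20.0  # m^2, hypothetical


def lift_coefficient(n, weight, v_e):
    return 2 * n * weight / (RHO0 * v_e**2 * WING_AREA)


values = [1.0, 50000.0, 100.0]       # hypothetical n, W (N), Ve (m/s)
stds = [0.01 * v for v in values]    # 1% standard deviation on each input
c_l = lift_coefficient(*values)
rel_percent = 100 * propagate_std(lift_coefficient, values, stds) / c_l
print(round(rel_percent, 1))  # 2.4
```

The airspeed term dominates because Ve enters squared, so its 1% error contributes 2% on its own.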

Questions?
