INFO 630 Evaluation of Information Systems Prof. Glenn Booker

### INFO 630 Evaluation of Information Systems Prof. Glenn Booker

Week 1 – Statistics foundation

INFO630 Week 1

Syllabus

- This class focuses on understanding the types of measurements which can support a software development or maintenance project, including many key business metrics
- We will use the statistics program PASW (was SPSS) to manipulate data and generate graphs early in the course

Course overview

- The course has three main objectives
- How measurements are made (the statistical foundations); week 1
- How to choose what to measure (GQ(I)M approach and Ishikawa’s tools); week 4
- The rest of the course is devoted to understanding different types of measurements used for software development and maintenance

My background

- DOD and FAA contractor for 18 years
- Use a Systems Engineering approach - because software doesn’t live in a vacuum!
- Mostly work with long-lived systems, so maintenance issues get lots of attention
- Metrics focus on supporting decision making during a project

Week 1 overview

- Identify the need for better measurement in software projects
- Discuss software life cycles
- Discuss process and quality models (ISO 9000, CMMI, 6σ, etc.)
- Define measurement scales and basic measures
- Introduce key statistical concepts: R², 95% confidence interval, t-statistic

Software Crisis

- For every six new large-scale software systems put into operation, two others are canceled
- Average software development project overshoots its schedule by 50%
- Three quarters of all large scale systems are operating failures that either do not function as intended or are not used at all

Software Crisis

- Most computer code is handcrafted from raw programming languages by artisans using techniques they neither measure nor are able to repeat consistently
- There is a desperate need to evaluate software product and process through measurement and analysis
- That’s why we have required this course!

Software Life Cycles

- Some measurements are based on traditional software development life cycles, such as the waterfall life cycle
- They can be adapted to other life cycles

[Figure: Waterfall Life Cycle Model. Requirements Analysis → Architectural (High Level) Design → Detailed (Low Level) Design → Coding & Unit Test → System Testing]

Waterfall Model

- Conceptual Development includes defining the overall purpose of the product, who would use it, and how it relates to other products
- Requirements Analysis includes definition of WHAT the product must do, such as performance goals, types of functionality, etc.

Waterfall Model

- Architectural Design, or high level design, determines the internal and external interfaces, component boundaries and structures, and data structures
- Detailed Design, or low level design, breaks the high level design down into detailed requirements for every module

Waterfall Model

- Coding is the actual writing of source code, scripts, macros, and other artifacts
- Unit Testing covers testing the functionality of each module against its requirements
- System Testing can include “string” or component tests of several related modules, integration testing of several major components, and full scale system testing

Waterfall Model

- After system testing, there may be early release options, such as alpha and beta testing, before official release of the product
- Early releases test the ability of your organization to deliver and support the product, respond to customer inquiries, and fix problems

Prototyping Life Cycle

- When requirements are very unclear, an iterative prototyping approach can be used to resolve interface and feature requirements before the rest of development is done
- Do preliminary requirements analysis
- Iterate Quick Design, Build Prototype, Refine Design until customer is happy

Prototyping Life Cycle

- Then resume full scale development of the system using some other life cycle model
- It’s critical to do quick development cycles during prototyping, or else you’re just redeveloping the whole system over and over

Spiral Life Cycle

- Used for resolving severe risks before development begins, the spiral life cycle uses more types of techniques than just prototyping to resolve each big risk
- Then another life cycle is used to develop the system
- That’s a key point – the spiral life cycle isn’t used by itself!

Iterative Life Cycle

- Many modern techniques, such as the Rational Unified Process (RUP), advocate an iterative life cycle
- RUP has four major phases, defined by the maturity of the system rather than traditional life cycle activities
- Inception, Elaboration, Construction, and Transition

Iterative Life Cycle

- Like the spiral, iterative life cycles are driven by the need to resolve key risks, but here they are resolved all the way to implementation
- Much more focus on early implementation of the core system, then building on it with each iteration

Cleanroom Methodology

- The Cleanroom methodology is a mathematically rigorous approach to software development
- Uses formal design specification, statistical testing, and no unit testing
- Produces software with certifiable levels of reliability
- Very rarely used in the US

Life Cycle Standards

- The IEEE Software Engineering Standards are one source of information on many aspects of software development and maintenance
- The standard ISO/IEC 12207, “Software Life Cycle Processes” has collected all major life cycle activities into one overall guidance document

You can download ISO/IEC 12207 – see IEEE instructions on my web site

Who cares…

- …about statistics and measuring software activities?
- The main models for guiding a software project, ISO 9000 and the Capability Maturity Model Integration (CMMI), both recommend use of statistical process control (SPC) techniques to help predict future performance by an organization
- Six Sigma is all about SPC

Process Maturity Models

- Quality standards and goals are often embodied in process maturity standards, to guide organizations’ process improvement efforts
- The primary software standard is the Software Engineering Institute’s (SEI’s) Capability Maturity Model Integration (CMMI)

CMMI

- Describes five maturity levels:
- 1. Initial; all processes are ad hoc, chaotic, not well defined. Do your own thing.
- 2. Managed (called “Repeatable” in the original SW-CMM); a project follows a set of defined processes for management and conduct of software development

CMMI

- 3. Defined; every project within the organization follows processes tailored from a common set of templates
- 4. Quantitatively Managed (called “Managed” in the original SW-CMM); statistical control over processes has been achieved
- 5. Optimizing; defect prevention and application of innovative new process methods are used

Other CMMs

- CMMI is based on the original CMM for Software (SW-CMM)
- The latter led to many other variations before the models were “integrated” in 2000
- The index of all SEI reports was ftp://ftp.sei.cmu.edu/public/documents/sei.documents.pdf
- (File missing as of Sept. 2010)

Malcolm Baldrige

- The Malcolm Baldrige National Quality Award (MBNQA) is a US-based quality award created in 1988 by the Department of Commerce
- Includes a broader scope, such as customer satisfaction, strategic planning, and human resource management

ISO 9000

- The international standard for quality management of an organization is ISO 9000
- Now applies to almost every type of business, but was first used for manufacturing
- Hence it includes activities like ‘calibration of tools’

ISO 9000

- ISO 9000 is facility-based, whereas CMMI is organization-based
- A building gets ISO 9000 certification, but a project or organization gets CMMI
- Was revised and republished in 2008
- Previous editions were 1987, 1994, and 2000
- “ISO 9001:2008 and ISO 14001:2004 are implemented by over a million organizations in 175 countries.”

Six Sigma

- Six Sigma originated at Motorola
- It’s best known for its Black Belt and Green Belt certifications
- It focuses on process improvements needed to consistently achieve extremely high levels of quality

Enter Measurement

- Measurement is critical to all process and quality models (CMMI, ISO 9000, MBNQA, etc.)
- Need to define basic concepts of measurement so we can speak the same language

Engineering in a Nutshell

- So in order to create any Product, we need Resources to use Tools in accordance with some Processes
- Each of these major areas (Product, Resources, Tools, and Processes) can be a focus of measurement

Measurement Needs

- Statistical meaning - need a long series of measurements for one project, and/or measurements from many projects
- Could use measurement to test specific hypotheses
- Industry uses of measurement are to help make decisions and track progress
- Need scales to make measurements!

Measurement Scales

- The initials of the four measurement scales spell the French word for black, noir (as in “film noir”)
- Nominal (least useful)
- Ordinal
- Interval
- Ratio (most useful)
- ‘NOIR’ is just a mnemonic to remember their sequence

Nominal Scale

- A nominal (“name”) scale groups or classifies things into categories, which:
- Must be jointly exhaustive (cover everything)
- Must be mutually exclusive (can’t be in two categories at once)
- May be listed in any sequence (none is better or worse)

Nominal Scale

- Common examples include
- Gender, e.g. “This room contains 19 people, of whom 10 are female, and 9 male”
- Portions of a system, e.g. suspension, drivetrain, body, etc.
- Colors (5 blue shirts and 3 red shirts)
- Job titles (though you could argue they’re hierarchical)

Ordinal Scale

- This measurement ranks things in order
- Sequence is important, but the intervals between ranks are not defined numerically
- Rank is relative, such as “greater than” or “less than”
- E.g. grades, CMM Maturity levels, inspection effectiveness ratings

Interval Scale

- An interval scale measures quantitative differences, not just relative
- Addition and subtraction are allowed
- Only examples: common temperature scales (°F or °C), or a single date (Feb 15, 1962)
- Interval scale measurements are very rare!!
- A zero point, if any, may be arbitrary (90 °F is *not* six times hotter than 15 °F!)

Ratio Scale

- A ratio scale is an interval scale with a non-arbitrary zero point
- Allows division and multiplication
- E.g. defect rates (defects/KSLOC), test scores, absolute temperature (K or R), anything you can count, lengths, weights, etc.
- The “best” type of scale to use, whenever feasible
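The difference between an interval scale and a ratio scale can be checked numerically. The sketch below is only an illustration: the helper name `fahrenheit_to_kelvin` is made up here, and the two readings are the 90 °F and 15 °F values from the interval-scale slide.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert an interval-scale reading (deg F) to a ratio-scale one (kelvin)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


hot_f, cold_f = 90.0, 15.0

# Subtraction is meaningful on an interval scale: the readings are 75 deg F apart.
difference_f = hot_f - cold_f

# Dividing the raw readings gives 6.0, but that number has no physical meaning
# because 0 deg F is an arbitrary zero point.
naive_ratio = hot_f / cold_f

# Kelvin has a non-arbitrary zero, so this ratio is meaningful.
kelvin_ratio = fahrenheit_to_kelvin(hot_f) / fahrenheit_to_kelvin(cold_f)

print(difference_f, naive_ratio, round(kelvin_ratio, 2))
```

On the absolute scale the warmer reading is only about 16% higher, not six times higher.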

Scale Hierarchy

- Measurement scales are hierarchical: ratio (best) / interval / ordinal / nominal
- Lower level scales can always be derived if data is from a higher scale
- E.g. defect rates (a ratio scale) could be converted to {High, Medium, Low} or {Acceptable, Not Acceptable}, which are ordinal scales
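Deriving a lower-level scale from a higher one might look like the following sketch. The cutoff values (1.0 and 5.0 defects/KSLOC), the module names, and the function `to_ordinal` are all hypothetical and chosen only for illustration.

```python
def to_ordinal(defects_per_ksloc: float,
               low_cutoff: float = 1.0,
               high_cutoff: float = 5.0) -> str:
    """Map a ratio-scale defect rate onto an ordinal {Low, Medium, High} scale.

    The cutoffs are illustrative assumptions, not industry thresholds.
    """
    if defects_per_ksloc < low_cutoff:
        return "Low"
    if defects_per_ksloc < high_cutoff:
        return "Medium"
    return "High"


# Hypothetical ratio-scale measurements (defects per KSLOC).
rates = {"module_a": 0.4, "module_b": 2.7, "module_c": 8.1}

ratings = {name: to_ordinal(rate) for name, rate in rates.items()}
print(ratings)
```

The conversion only goes one way: the original rates cannot be recovered from the ratings.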

Why Are Scales Important?

- The types of statistical analyses which are possible depend on the type of scale used for the measurements
- In statistics, this is roughly broken into parametric tests (for interval or ratio scaled data) or non-parametric tests (for nominal or ordinal scaled data)
- Some tests are more specific about the data scale(s) needed, e.g. a test might require ordinal data

Basic Measures - Ratio

- Used for two exclusive populations
- Everything fits into one category or the other, never both
- Ratio = (# of testers) : (# of developers)
- E.g. tester to developer ratio is 1:4

Proportions and Fractions

- Used for multiple (> 2) populations
- Proportion = (Number of this population) / (Total number of population)
- Sum of all proportions equals unity
- E.g. survey results
- Proportions are based on integer units, whereas fractions are based on real-numbered units
- E.g. what proportion of students answered A, B, C, or D on an exam question?

Percentage

- A proportion or fraction multiplied by 100 becomes a percentage
- Only cite percentages when N (total population measured) is above ~30 to 50; always provide N for completeness
- Why? Statistical methods are meaningless for very small populations
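The ratio, proportion, and percentage measures can be sketched in a few lines of Python. The head counts and exam answers below are invented for illustration; N is reported alongside the percentages, as the slide recommends.

```python
from collections import Counter
from math import gcd

# Ratio: two mutually exclusive populations (made-up head counts).
testers, developers = 5, 20
divisor = gcd(testers, developers)
ratio = f"{testers // divisor}:{developers // divisor}"

# Proportion: more than two populations (made-up answers to one exam question).
answers = ["A"] * 12 + ["B"] * 18 + ["C"] * 6 + ["D"] * 4
counts = Counter(answers)
n = len(answers)
proportions = {choice: count / n for choice, count in counts.items()}

# Percentage: a proportion multiplied by 100.
percentages = {choice: 100 * p for choice, p in proportions.items()}

print("tester:developer ratio =", ratio)
print("N =", n)
for choice in sorted(percentages):
    print(choice, f"{percentages[choice]:.1f}%")
```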

Rate

- Rate conveys the change in a measurement, such as over time, dx/dt.
- Rate = (# observed events / # of opportunities)*constant
- Rate requires exposure to the risk being measured
- E.g. defects per kSLOC = (# defects)/(# of SLOC)*1000
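As a rough sketch of the rate formula, assume size is counted in raw source lines (SLOC), so that the constant 1000 expresses the result per thousand lines. The function name and the sample numbers are hypothetical.

```python
def defects_per_ksloc(defects: int, sloc: int) -> float:
    """Rate = (observed events / opportunities) * constant, with constant = 1000."""
    if sloc <= 0:
        raise ValueError("sloc must be positive: a rate needs exposure")
    return defects / sloc * 1000


# Hypothetical project: 42 defects found in 28,000 source lines.
rate = defects_per_ksloc(defects=42, sloc=28_000)
print(f"{rate:.2f} defects per kSLOC")
```

The guard on `sloc` reflects the exposure requirement: with no lines of code there is no opportunity for a defect, so no rate can be computed.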

Data Analysis

- Raw data is collected, such as the date a particular problem was reported
- Refined data is extracted from one or more raw data, e.g. the time it took that problem to be resolved
- Refined data is analyzed to produce derived data, such as the average time to resolve problems
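The raw → refined → derived chain might be sketched as follows. The problem reports and their dates are made up, and the field names are assumptions for this example only.

```python
from datetime import date
from statistics import mean

# Raw data: dates recorded for each problem report (invented values).
raw_reports = [
    {"id": 1, "reported": date(2010, 9, 1), "resolved": date(2010, 9, 4)},
    {"id": 2, "reported": date(2010, 9, 2), "resolved": date(2010, 9, 12)},
    {"id": 3, "reported": date(2010, 9, 7), "resolved": date(2010, 9, 9)},
]

# Refined data: time to resolve each problem, extracted from two raw dates.
days_to_resolve = [(r["resolved"] - r["reported"]).days for r in raw_reports]

# Derived data: a summary computed from the refined data.
average_days = mean(days_to_resolve)

print(days_to_resolve, average_days)
```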

Models

- Focus on selected elements of the problem at hand and ignore irrelevant ones
- May show how parts of the problem relate to each other
- May be expressed as equations, mappings, or diagrams, such as
- Effort = a + b*SLOC
- where SLOC = source lines of code
- Effort = a*(SLOC)^b
- May be derived before or after measurement (theory vs. empirical)
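The two effort models can be compared with a small sketch, reading the second as a power law with b as the exponent. The coefficient values below are placeholders invented for illustration, not calibrated estimates.

```python
def effort_linear(sloc: float, a: float, b: float) -> float:
    """Linear model: Effort = a + b * SLOC."""
    return a + b * sloc


def effort_power(sloc: float, a: float, b: float) -> float:
    """Power-law model: Effort = a * SLOC ** b."""
    return a * sloc ** b


# Placeholder coefficients (assumptions, not fitted to any real data).
LIN_A, LIN_B = 2.0, 0.01
POW_A, POW_B = 0.05, 1.2

for sloc in (1_000, 2_000, 4_000):
    print(sloc,
          round(effort_linear(sloc, LIN_A, LIN_B), 1),
          round(effort_power(sloc, POW_A, POW_B), 1))

# With an exponent above 1, doubling the size more than doubles the effort.
growth = effort_power(2_000, POW_A, POW_B) / effort_power(1_000, POW_A, POW_B)
```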

Exponential Notation

- You might see output of the form +2.78E-12
- This example means +2.78 * 10^-12
- A negative exponent, e.g. –12, makes it a small number; 10^-12 is 0.000000000001
- The leading number, here +2.78, controls whether it is a positive or negative number
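Python reads and writes the same E-notation, so it can be used to check how such output should be interpreted. This is just an illustrative sketch using the value from the slide.

```python
import math

# Parse the E-notation string exactly as a statistics package might print it.
value = float("+2.78E-12")

# It is the same quantity as 2.78 times ten to the power -12.
matches_power_of_ten = math.isclose(value, 2.78 * 10 ** -12)

# The exponent controls the size; the sign of the leading number controls
# whether the value is positive or negative.
negative_value = float("-2.78E-12")

# Format back to E-notation with an explicit sign and two decimals.
formatted = f"{value:+.2E}"

print(matches_power_of_ten, negative_value < 0, formatted)
```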

Precision

- Keep your final output to a consistent level of precision, e.g. don’t report one number as “12” and another as “11.862512598235678563256091”
- Pick a reasonable level of precision (“significant digits”) similar to the accuracy of your inputs
- Wait until the final answer to round off
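A small sketch of why rounding should wait until the final answer; the three readings are made up for the example.

```python
# Three hypothetical readings, each known to two decimal places.
readings = [2.44, 2.44, 2.44]

# Rounding each reading first throws away information before the sum.
rounded_early = round(sum(round(r, 1) for r in readings), 1)

# Carrying full precision and rounding once at the end keeps it.
rounded_late = round(sum(readings), 1)

print(rounded_early, rounded_late)
```

The two totals differ in the reported digit, even though they come from the same data.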

Graphing

- A typical graph shows Y on the vertical axis, and X on the horizontal axis
- Y is the dependent variable, and X is the independent variable, since you can pick any value of X and determine its matching value of Y iff (if and only if) Y is a function of X
- PASW/SPSS will sometimes ask for X and Y, other times independent and dependent variables

Linear Regression

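[Figure: data points plotted as y against x, with the fitted line ŷ = a + b*x; each observed value is y = ŷ + e, where e is the residual]

Choose the “best” line by minimizing the sum of the squares of the vertical distances between the empirical points (your data) and the line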
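A minimal sketch of an ordinary least-squares fit, assuming the usual convention that distances are measured along the y axis (the residuals e = y - ŷ). The data points and the function name `fit_line` are invented for illustration; a package such as PASW/SPSS would normally do this for you.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept: the line passes through the means
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared residuals for the line a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data points.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
best_sse = sum_squared_errors(xs, ys, a, b)

print(f"y_hat = {a:.2f} + {b:.2f}*x, SSE = {best_sse:.3f}")
```

Nudging either coefficient away from the fitted values makes the sum of squared errors larger, which is what "best" means here.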

What is R Squared?

Coefficient of determination, R², is a measure of the goodness of fit of a regression

R² ranges from 0 to 1.

R² = 1 is a perfect fit (all data points fall on the estimated line)

R² = 0 means that the variable(s) have no explanatory power

Having R² closer to 1 helps choose which math model is best suited to a problem

R?

- All by itself, R is the 'correlation coefficient'

R can be from -1 to +1

R=+1 means perfect positive correlation (Y goes up as X goes up)

R=-1 means perfect negative correlation (Y goes down as X goes up)

- Yes, R and R² have completely different names
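The correlation coefficient and the coefficient of determination can both be computed by hand, as in the sketch below. It assumes a simple straight-line regression with one X variable, in which case R squared equals the square of R. The data sets and function names are invented for illustration.

```python
from math import sqrt
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation coefficient R, between -1 and +1."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(xs, ys):
    """R squared = 1 - SSE/SST for the least-squares line through the data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total
    return 1 - sse / sst


xs = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]            # exactly y = 1 + 2x
falling = [11, 9, 7, 5, 3]           # exactly y = 13 - 2x
noisy = [2.1, 3.9, 6.2, 7.8, 10.1]   # made-up measurements

for label, ys in (("rising", rising), ("falling", falling), ("noisy", noisy)):
    print(label, round(correlation(xs, ys), 4), round(r_squared(xs, ys), 4))
```

A perfectly falling line has R = -1 but still has a perfect fit, which is why R squared alone does not tell you the direction of the relationship.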

Expressing Uncertainty

- We can show the uncertainty in our measurements by putting the standard error after each term
- A line is given in the form y = b0 + b1*x
- Show standard errors with y = (b0 +/- se_b0) + (b1 +/- se_b1)*x
- See example on next slide

Y = (6.2+/-1.9) + (1.3 +/-0.42) X

- The numbers in parentheses are the estimated coefficients, plus or minus the standard errors associated with those coefficients
- In the example the constant parameter (here the y-intercept) was estimated to be 6.2 with a standard error of 1.9
- The parameter associated with x (here, the slope) was estimated to be 1.3 with a standard error of 0.42

Level of Confidence

- Generally, we can say that the actual value of a parameter lies within +/- 2 standard errors of its estimated value with a 95% level of confidence
- Thus the value of the constant parameter lies between 2.4 (i.e. 6.2 – 2*1.9) and 10.0 (=6.2 + 2*1.9) with a 95% level of confidence
- With this level of confidence, the parameter estimate associated with x (slope) lies between .46 and 2.14
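The interval arithmetic for the example coefficients can be reproduced with a short sketch. It uses the rule-of-thumb multiplier of 2 from the slide; the function name is an assumption made for this example.

```python
def confidence_interval(estimate: float, std_error: float,
                        multiplier: float = 2.0) -> tuple:
    """Return (low, high) = estimate -/+ multiplier * standard error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


# Coefficients from the example Y = (6.2 +/- 1.9) + (1.3 +/- 0.42) X.
intercept_low, intercept_high = confidence_interval(6.2, 1.9)
slope_low, slope_high = confidence_interval(1.3, 0.42)

print(f"intercept: {intercept_low:.1f} to {intercept_high:.1f}")
print(f"slope:     {slope_low:.2f} to {slope_high:.2f}")
```

Neither interval contains zero, which is another way of saying both coefficients are significantly different from zero at this level of confidence.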

Level of Confidence

- The default level of confidence for statistical testing is 95%
- Life-and-death measurements (e.g. medical testing) use 99%
- Some wimpy software studies might use level of confidence as low as 80%
- Anything below 95% confidence is a sign of really weak data, and they’re struggling to find a connection

The “t” Statistic

- The t-statistic is defined as t = (parameter estimate) / (standard error)
- If |t| > 2 then the parameter estimate is significantly different from zero at the 95% level of confidence
- In the earlier example, Y = (6.2+/-1.9) + (1.3 +/-0.42) X:

t = 6.2/1.9 = 3.26 for the constant term

t = 1.3/0.42 = 3.1 for the slope

Hence both terms are significantly different from zero.

- Note: For very precise work, use |t| > 1.96 instead of 2
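The significance check can be written as a small helper. This sketch uses the rule-of-thumb threshold of 2 by default and allows 1.96 to be passed in; the function names and the third, non-significant coefficient are invented for illustration.

```python
def t_statistic(estimate: float, std_error: float) -> float:
    """t = (parameter estimate) / (standard error)."""
    return estimate / std_error


def is_significant(estimate: float, std_error: float,
                   critical: float = 2.0) -> bool:
    """True if |t| exceeds the critical value (2, or 1.96 for precise work)."""
    return abs(t_statistic(estimate, std_error)) > critical


# Coefficients from the example Y = (6.2 +/- 1.9) + (1.3 +/- 0.42) X.
t_constant = t_statistic(6.2, 1.9)
t_slope = t_statistic(1.3, 0.42)

# A hypothetical coefficient whose standard error is large relative to its size.
weak_is_significant = is_significant(0.5, 0.4)

print(round(t_constant, 2), round(t_slope, 1), weak_is_significant)
```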

What can we ignore?

- If a parameter estimate associated with a variable is not significantly different from zero at the 95% level of confidence, then the variable should be omitted from the analysis
- This is important later for seeing if a curve fit is useful; if the coefficients pass the t-test, they may be used

95% Confidence Interval

- The 95% Confidence Interval (CI) helps limit how much of a data set’s distribution is practically possible
- Any parameter with a measured mean and non-zero std error could fall in any range of values – though that range may be very unlikely to occur
- The 95% CI lets us exclude everything outside of [mean +/- 2 std errors] as “too unlikely”

The Normal Distribution

- The “normal” or Gaussian distribution is the familiar bell shaped curve
- It’s the basis for the 95% CI
- The width of the distribution is measured by the standard deviation σ
- The area under the curve within +/- 1σ from the middle covers 68.27% of all events
- Within +/- 2σ covers 95.45%

The Normal Distribution

- Within +/- 3σ covers 99.73%
- Within +/- 4σ covers 99.9937%
- Within +/- 5σ covers 99.999943%
- Within +/- 6σ covers 99.9999998%
- The Six Sigma quality objective is to have the number of defects under a few parts per million (ppm)
- 3.4 ppm, for reasons discussed on their site, instead of the value implied above
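The coverage figures for the normal distribution can be reproduced from the error function in Python's standard library. This is a sketch for checking the numbers, using the identity that the area within k standard deviations of the mean is erf(k / sqrt(2)); the function name is an assumption.

```python
import math


def coverage(k: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in range(1, 7):
    inside = coverage(k)
    outside_ppm = (1 - inside) * 1_000_000   # events outside the band, per million
    print(f"+/- {k} sigma: {100 * inside:.7f}% inside, {outside_ppm:.4g} ppm outside")
```

The parts-per-million column shows how quickly the tails shrink as the band widens.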
