INFO 630 Evaluation of Information Systems Prof. Glenn Booker

Week 1 – Statistics foundation

INFO630 Week 1

Syllabus

  • This class focuses on understanding the types of measurements which can support a software development or maintenance project, including many key business metrics

  • We will use the statistics program PASW (was SPSS) to manipulate data and generate graphs early in the course

Course overview

  • The course has three main objectives

    • How measurements are made (the statistical foundations); week 1

    • How to choose what to measure (GQ(I)M approach and Ishikawa’s tools); week 4

    • The rest of the course is devoted to understanding different types of measurements used for software development and maintenance

My background

  • DOD and FAA contractor for 18 years

  • Use a Systems Engineering approach - because software doesn’t live in a vacuum!

    • Mostly work with long-lived systems, so maintenance issues get lots of attention

    • Metrics focus on supporting decision making during a project

Week 1 overview

  • Identify the need for better measurement in software projects

  • Discuss software life cycles

  • Discuss process and quality models (ISO 9000, CMMI, 6σ, etc.)

  • Define measurement scales and basic measures

  • Introduce key statistical concepts: R², 95% confidence interval, t-statistic

Software Crisis

  • For every six new large-scale software systems put into operation, two others are canceled

  • Average software development project overshoots its schedule by 50%

  • Three quarters of all large scale systems are operating failures that either do not function as intended or are not used at all

Software Crisis

  • Most computer code is handcrafted from raw programming languages by artisans using techniques they neither measure nor are able to repeat consistently

  • There is a desperate need to evaluate software product and process through measurement and analysis

  • That’s why we have required this course!

Software Life Cycles

  • Some measurements are based on traditional software development life cycles, such as the waterfall life cycle

  • They can be adapted to other life cycles

Waterfall Life Cycle Model

  • [Diagram] Phases in sequence: Conceptual Development, Requirements Analysis, Architectural (High Level) Design, Detailed (Low Level) Design, Coding & Unit Test, System Testing

Waterfall Model

  • Conceptual Development includes defining the overall purpose of the product, who would use it, and how it relates to other products

  • Requirements Analysis includes definition of WHAT the product must do, such as performance goals, types of functionality, etc.

Waterfall Model

  • Architectural Design, or high level design, determines the internal and external interfaces, component boundaries and structures, and data structures

  • Detailed Design, or low level design, breaks the high level design down into detailed requirements for every module

Waterfall Model

  • Coding is the actual writing of source code, scripts, macros, and other artifacts

  • Unit Testing covers testing the functionality of each module against its requirements

  • System Testing can include “string” or component tests of several related modules, integration testing of several major components, and full scale system testing

Waterfall Model

  • After system testing, there may be early release options, such as alpha and beta testing, before official release of the product

  • Early releases test the ability of your organization to deliver and support the product, respond to customer inquiries, and fix problems

Prototyping Life Cycle

  • When requirements are very unclear, an iterative prototyping approach can be used to resolve interface and feature requirements before the rest of development is done

    • Do preliminary requirements analysis

    • Iterate Quick Design, Build Prototype, Refine Design until customer is happy

Prototyping Life Cycle

  • Then resume full scale development of the system using some other life cycle model

  • It’s critical to do quick development cycles during prototyping, or else you’re just redeveloping the whole system over and over

    Spiral Life Cycle

    • Used for resolving severe risks before development begins, the spiral life cycle uses more types of techniques than just prototyping to resolve each big risk

      • Then another life cycle is used to develop the system

      • That’s a key point – the spiral life cycle isn’t used by itself!

    Iterative Life Cycle

    • Many modern techniques, such as the Rational Unified Process (RUP), advocate an iterative life cycle

    • RUP has four major phases, defined by the maturity of the system rather than traditional life cycle activities

      • Inception, Elaboration, Construction, and Transition

    Iterative Life Cycle

    • Like the spiral, iterative life cycles are driven by the need to resolve key risks, but here they are resolved all the way to implementation

      • Much more focus on early implementation of the core system, then building on it with each iteration

    Cleanroom Methodology

    • The Cleanroom methodology is a mathematically rigorous approach to software development

      • Uses formal design specification, statistical testing, and no unit testing

      • Produces software with certifiable levels of reliability

      • Very rarely used in the US

    Life Cycle Standards

    • The IEEE Software Engineering Standards are one source of information on many aspects of software development and maintenance

    • The standard ISO/IEC 12207, “Software Life Cycle Processes” has collected all major life cycle activities into one overall guidance document

    You can download ISO/IEC 12207 – see IEEE instructions on my web site

    Who cares…

    • …about statistics and measuring software activities?

      • The main models for guiding a software project, ISO 9000 and the Capability Maturity Model Integration (CMMI), both recommend use of statistical process control (SPC) techniques to help predict future performance by an organization

      • Six Sigma is all about SPC

    Process Maturity Models

    • Quality standards and goals are often embodied in process maturity standards, to guide organizations’ process improvement efforts

    • The primary software standard is the Software Engineering Institute’s (SEI’s) Capability Maturity Model Integration (CMMI)

    CMMI

    • Describes five maturity levels:

      • 1. Initial; all processes are ad hoc, chaotic, not well defined. Do your own thing.

      • 2. Managed (called “Repeatable” in the older SW-CMM); a project follows a set of defined processes for management and conduct of software development

      • 3. Defined; every project within the organization follows processes tailored from a common set of templates

      • 4. Quantitatively Managed (called “Managed” in the older SW-CMM); statistical control over processes has been achieved

      • 5. Optimizing; defect prevention and application of innovative new process methods are used

    Other CMM’s

    • CMMI is based on the original CMM for Software (SW-CMM)

      • The latter led to many other variations before the models were “integrated” in 2000

    • The index of all SEI reports was ftp://ftp.sei.cmu.edu/public/documents/sei.documents.pdf

      • (File missing as of Sept. 2010)

    Malcolm Baldrige

    • The Malcolm Baldrige National Quality Award (MBNQA) is a US-based quality award created in 1988 by the Department of Commerce

      • Includes a broader scope, such as customer satisfaction, strategic planning, and human resource management

    ISO 9000

    • The international standard for quality management of an organization is ISO 9000

    • Now applies to almost every type of business, but was first used for manufacturing

      • Hence it includes activities like ‘calibration of tools’

    ISO 9000

    • ISO 9000 is facility-based, whereas CMMI is organization-based

      • A building gets ISO 9000 certification, but a project or organization gets CMMI

    • Was revised and republished in 2008

      • Previous editions were 1987, 1994, and 2000

      • “ISO 9001:2008 and ISO 14001: 2004 are implemented by over a million organizations in 175 countries.” – from here

    Six Sigma

    • Six Sigma was founded by Motorola

    • It’s best known for its Black Belt and Green Belt certifications

    • It focuses on process improvements needed to consistently achieve extremely high levels of quality

    Enter Measurement

    • Measurement is critical to all process and quality models (CMMI, ISO 9000, MBNQA, etc.)

    • Need to define basic concepts of measurement so we can speak the same language

    Engineering in a Nutshell (Software or otherwise)

    • [Diagram] Elements shown: Resources (People!), Technology and Tools, Processes, and Product

    Engineering in a Nutshell

    • So in order to create any Product, we need Resources to use Tools in accordance with some Processes

    • Each of these major areas (Product, Resources, Tools, and Processes) can be a focus of measurement

    Measurement Needs

    • Statistical meaning - need a long series of measurements for one project, and/or measurements from many projects

    • Could use measurement to test specific hypotheses

    • Industry uses of measurement are to help make decisions and track progress

    • Need scales to make measurements!

    Measurement Scales

    • The initials of the measurement scales spell the French word for black, noir (as in “film noir”)

      • Nominal (least useful)

      • Ordinal

      • Interval

      • Ratio (most useful)

    • ‘NOIR’ is just a mnemonic to remember their sequence

    Nominal Scale

    • A nominal (“name”) scale groups or classifies things into categories, which:

      • Must be jointly exhaustive (cover everything)

      • Must be mutually exclusive (can’t be in two categories at once)

      • Can be listed in any sequence (none is better or worse)

    Nominal Scale

    • Common examples include

      • Gender, e.g. “This room contains 19 people, of whom 10 are female, and 9 male”

      • Portions of a system, e.g. suspension, drivetrain, body, etc.

      • Colors (5 blue shirts and 3 red shirts)

      • Job titles (though you could argue they’re hierarchical)

    Ordinal Scale

    • This measurement ranks things in order

    • Sequence is important, but the intervals between ranks are not defined numerically

    • Rank is relative, such as “greater than” or “less than”

      • E.g. grades, CMM Maturity levels, inspection effectiveness ratings

    Interval Scale

    • An interval scale measures quantitative differences, not just relative rank

    • Addition and subtraction are allowed

      • Only examples: common temperature scales (°F or °C), or a single date (Feb 15, 1962)

      • Interval scale measurements are very rare!!

    • A zero point, if any, may be arbitrary (90 °F is *not* six times hotter than 15 °F!)

    Ratio Scale

    • A ratio scale is an interval scale with a non-arbitrary zero point

    • Allows division and multiplication

      • E.g. defect rates (defects/KSLOC), test scores, absolute temperature (K or R), anything you can count, lengths, weights, etc.

    • The “best” type of scale to use, whenever feasible
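The difference between an interval scale and a ratio scale can be checked numerically. The sketch below is illustrative only (the function name is an assumption, not part of the course material): it converts two Fahrenheit readings to kelvins, a ratio scale, before dividing them.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert an interval-scale reading (°F) to a ratio-scale one (K)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# Dividing the raw Fahrenheit readings gives 6, but that number is
# meaningless because 0 °F is an arbitrary zero point.
naive_ratio = 90.0 / 15.0

# On the absolute (kelvin) scale, division is meaningful.
true_ratio = fahrenheit_to_kelvin(90.0) / fahrenheit_to_kelvin(15.0)

print(f"naive ratio of Fahrenheit readings: {naive_ratio:.2f}")
print(f"ratio of absolute temperatures:     {true_ratio:.3f}")
```

On the absolute scale, 90 °F is only about 16% hotter than 15 °F.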

    Scale Hierarchy

    • Measurement scales are hierarchical: ratio (best) / interval / ordinal / nominal

    • Lower level scales can always be derived if data is from a higher scale

      • E.g. defect rates (a ratio scale) could be converted to {High, Medium, Low} or {Acceptable, Not Acceptable}, which are ordinal scales
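As a rough sketch of deriving a lower-level scale from a higher one, the function below maps a defect rate (ratio scale) onto ordinal labels. The function name and both thresholds are hypothetical, chosen only for illustration.

```python
def to_ordinal(defects_per_ksloc, low_max=1.0, high_min=5.0):
    """Map a ratio-scale defect rate to an ordinal label.

    The thresholds are hypothetical; a real project would set its own.
    """
    if defects_per_ksloc < low_max:
        return "Low"
    if defects_per_ksloc < high_min:
        return "Medium"
    return "High"


rates = [0.4, 2.7, 8.1]  # made-up defect rates, in defects per kSLOC
labels = [to_ordinal(rate) for rate in rates]
print(labels)
```

The conversion only works in one direction: the label "Medium" cannot be turned back into a defect rate.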

    Why Are Scales Important?

    • The types of statistical analyses which are possible depend on the type of scale used for the measurements

      • In statistics, this is roughly broken into parametric tests (for interval or ratio scaled data) or non-parametric tests (for nominal or ordinal scaled data)

      • Some tests are more specific about the data scale(s) needed, e.g. a test might require ordinal data

    Basic Measures - Ratio

    • Used for two exclusive populations

      • Everything fits into one category or the other, never both

    • Ratio = (# of testers) : (# of developers)

    • E.g. tester to developer ratio is 1:4
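A ratio like this can be reduced to lowest terms with a few lines of Python. The helper name and head counts below are made up for illustration.

```python
from math import gcd


def as_ratio(first, second):
    """Express two counts as a ratio in lowest terms, e.g. '1:4'."""
    divisor = gcd(first, second)
    return f"{first // divisor}:{second // divisor}"


testers, developers = 3, 12  # made-up head counts
ratio = as_ratio(testers, developers)
print(f"tester to developer ratio is {ratio}")
```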

    Proportions and Fractions

    • Used for multiple (> 2) populations

    • Proportion = (Number in this population) / (Total number in all populations)

    • Sum of all proportions equals unity

      • E.g. survey results

    • Proportions are based on integer units, whereas fractions are based on real-numbered units

      • E.g. what proportion of students answered A, B, C, or D on an exam question?
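The exam-question example can be sketched as follows. The answer list is invented, and the variable names are only illustrative.

```python
from collections import Counter

# Made-up answers from ten students to one exam question
answers = ["A", "C", "B", "A", "D", "A", "C", "B", "A", "C"]

counts = Counter(answers)
n = len(answers)

# Proportion = (number in this category) / (total number)
proportions = {choice: counts[choice] / n for choice in sorted(counts)}

# A proportion multiplied by 100 becomes a percentage
percentages = {choice: 100 * p for choice, p in proportions.items()}

for choice in proportions:
    print(f"{choice}: {proportions[choice]:.2f} ({percentages[choice]:.0f}%), N = {n}")
```

With N = 10 this sample is smaller than the ~30 to 50 usually wanted before percentages are quoted, which is why N is printed alongside each figure.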

    Percentage

    • A proportion or fraction multiplied by 100 becomes a percentage

    • Only cite percentages when N (total population measured) is above ~30 to 50; always provide N for completeness

      • Why? Statistical methods are meaningless for very small populations

    Rate

    • Rate conveys the change in a measurement, such as over time, dx/dt.

      • Rate = (# observed events / # of opportunities)*constant

    • Rate requires exposure to the risk being measured

      • E.g. defects per kSLOC = (# defects)/(# of SLOC)*1000
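A minimal sketch of the rate calculation, assuming size is counted in SLOC so that the constant 1000 converts the result to a per-kSLOC figure. The function name and the numbers are illustrative.

```python
def defects_per_ksloc(defects, sloc):
    """Rate = (observed events / opportunities) * constant, with constant = 1000."""
    if sloc <= 0:
        raise ValueError("sloc must be positive")
    return defects / sloc * 1000


# Made-up project: 42 defects found in 12,000 source lines of code
rate = defects_per_ksloc(defects=42, sloc=12_000)
print(f"{rate:.1f} defects per kSLOC")
```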

    Data Analysis

    • Raw data is collected, such as the date a particular problem was reported

    • Refined data is extracted from one or more pieces of raw data, e.g. the time it took that problem to be resolved

    • Refined data is analyzed to produce derived data, such as the average time to resolve problems
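The raw, refined, and derived stages can be sketched with a few problem reports. The dates are invented and the variable names are only illustrative.

```python
from datetime import date
from statistics import mean

# Raw data: (date reported, date resolved) for each problem report (made up)
raw = [
    (date(2010, 9, 1), date(2010, 9, 4)),
    (date(2010, 9, 2), date(2010, 9, 9)),
    (date(2010, 9, 5), date(2010, 9, 7)),
]

# Refined data: number of days each problem took to resolve
days_to_resolve = [(resolved - reported).days for reported, resolved in raw]

# Derived data: average time to resolve problems
average_days = mean(days_to_resolve)

print(f"days to resolve: {days_to_resolve}")
print(f"average: {average_days} days")
```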

    Models

    • A model focuses on selected elements of the problem at hand and ignores irrelevant ones

      • May show how parts of the problem relate to each other

      • May be expressed as equations, mappings, or diagrams, such as

        • Effort = a + b*SLOC

          • where SLOC = source lines of code

        • Effort = a*(SLOC)^b

      • May be derived before or after measurement (theory vs. empirical)
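The two effort models can be written as small functions: a linear form, a + b*SLOC, and a power form, a*SLOC^b. The coefficient values below are hypothetical placeholders; real values would come from theory or from fitting measured project data.

```python
def effort_linear(sloc, a, b):
    """Linear model: Effort = a + b*SLOC."""
    return a + b * sloc


def effort_power(sloc, a, b):
    """Power model: Effort = a * SLOC**b."""
    return a * sloc ** b


# Hypothetical coefficients, for illustration only
linear_estimate = effort_linear(1000, a=2.0, b=0.01)
power_estimate = effort_power(100, a=3.0, b=0.5)

print(f"linear model: {linear_estimate:.1f}")
print(f"power model:  {power_estimate:.1f}")
```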

    Exponential Notation

    • You might see output of the form +2.78E-12

      • This example means +2.78 * 10^-12

      • A negative exponent, e.g. –12, makes it a small number; 10^-12 is 0.000000000001

      • The leading number, here +2.78, controls whether it is a positive or negative number
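Most programming languages read and write this notation directly. As a small illustration in Python (not PASW/SPSS output), the same value can be parsed and then shown in fixed-point and exponential form:

```python
from math import isclose

value = float("+2.78E-12")  # parse the exponential form

fixed = f"{value:.14f}"  # fixed-point with 14 decimal places
sci = f"{value:.2e}"     # exponential form with 2 decimal places

print(fixed)
print(sci)

# The sign of the leading number sets the sign of the value;
# the exponent only sets its magnitude.
negative = float("-2.78E-12")
print(negative < 0)
```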

    Precision

    • Keep your final output to a consistent level of precision, e.g. don’t report one number as “12” and another as “11.862512598235678563256091”

    • Pick a reasonable level of precision (“significant digits”) similar to the accuracy of your inputs

    • Wait until the final answer to round off
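A short sketch of rounding only at the end, assuming made-up inputs that are each known to three significant digits:

```python
# Made-up inputs, each known to three significant digits
measurements = [11.9, 12.4, 11.3]

# Keep full precision during the calculation
average = sum(measurements) / len(measurements)

# Round only when reporting, to the precision of the inputs
reported = f"{average:.3g}"

print(f"unrounded: {average}")
print(f"reported:  {reported}")
```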

    Graphing

    • A typical graph shows Y on the vertical axis, and X on the horizontal axis

      • Y is the dependent variable, and X is the independent variable, since you can pick any value of X and determine its matching value of Y iff (if and only if) Y is a function of X

    • PASW/SPSS will sometimes ask for X and Y, other times independent and dependent variables

    Linear Regression

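    • [Graph] Data points plotted as y against x, with the fitted line ŷ = a + b*x

    • Each observed value is y = ŷ + e, where ŷ is the value predicted by the line and e is the error (residual)

    • Choose the “best” line by minimizing the sum of the squares of the vertical distances (the errors e) between the empirical points (your data) and the line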
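The course uses PASW/SPSS for regression, but the least-squares calculation itself is short. The sketch below uses the standard closed-form formulas for a one-variable fit of y = a + b*x; the data and the function name are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data

a, b = fit_line(xs, ys)

# Residuals: differences between each observed y and the fitted line
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(f"y = {a:.2f} + {b:.2f}*x")
```

For a least-squares line with an intercept, the residuals always sum to zero (up to rounding), which is a quick sanity check on the fit.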

    What is R Squared?

    Coefficient of determination, R², is a measure of the goodness of fit of a regression

    R² ranges from 0 to 1.

    R² = 1 is a perfect fit (all data points fall on the estimated line)

    R² = 0 means that the variable(s) have no explanatory power

    Having R² closer to 1 helps choose which math model is best suited to a problem
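One common way to compute the coefficient of determination is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that definition with made-up data to show the two extreme cases; the function name is illustrative.

```python
def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


ys = [2.0, 4.0, 6.0, 8.0]  # made-up observations

# A model that reproduces every point exactly: perfect fit
perfect_fit = r_squared(ys, [2.0, 4.0, 6.0, 8.0])

# A model that always predicts the mean of y: no explanatory power
no_fit = r_squared(ys, [5.0, 5.0, 5.0, 5.0])

print(perfect_fit, no_fit)
```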

    R?

    • All by itself, R is the 'correlation coefficient'

      R can be from -1 to +1

      R=+1 means perfect positive correlation (Y goes up as X goes up)

      R=-1 means perfect negative correlation (Y goes down as X goes up)

    • Yes, R and R² have completely different names
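A minimal sketch of the correlation coefficient, using the standard Pearson formula on made-up data. For a straight-line fit with a single x variable, squaring this value gives the coefficient of determination.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4]
r_up = correlation(xs, [2, 4, 6, 8])    # Y goes up as X goes up
r_down = correlation(xs, [8, 6, 4, 2])  # Y goes down as X goes up

print(r_up, r_down)
```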

    Expressing Uncertainty

    • We can show the uncertainty in our measurements by putting the standard error after each term

    • A line is given in the form y = b0 + b1*x

    • Show standard errors with y = (b0 +/- se_b0) + (b1 +/- se_b1)*x, where se_b0 and se_b1 are the standard errors of b0 and b1

      • See example on next slide

    Y = (6.2+/-1.9) + (1.3 +/-0.42) X

    • The numbers in parentheses are the estimated coefficients, plus or minus the standard errors associated with those coefficients

      • In the example the constant parameter (here the y-intercept) was estimated to be 6.2 with a standard error of 1.9

      • The parameter associated with x (here, the slope) was estimated to be 1.3 with a standard error of 0.42

    Level of Confidence

    • Generally, we can say that the actual value of a parameter is in the range of +/- 2 standard errors of its estimated value with a 95% level of confidence

      • Thus the value of the constant parameter lies between 2.4 (i.e. 6.2 – 2*1.9) and 10.0 (=6.2 + 2*1.9) with a 95% level of confidence

      • With this level of confidence, the parameter estimate associated with x (slope) lies between 0.46 (=1.3 – 2*0.42) and 2.14 (=1.3 + 2*0.42)
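The interval arithmetic can be sketched as a small helper, assuming the rule of thumb of two standard errors on either side of the estimate. The function name is illustrative.

```python
def approx_ci95(estimate, std_error, multiplier=2.0):
    """Approximate 95% confidence interval: estimate +/- multiplier * std_error."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


intercept_ci = approx_ci95(6.2, 1.9)  # constant term from the example
slope_ci = approx_ci95(1.3, 0.42)     # slope from the example

print(f"intercept: {intercept_ci[0]:.2f} to {intercept_ci[1]:.2f}")
print(f"slope:     {slope_ci[0]:.2f} to {slope_ci[1]:.2f}")
```

Passing `multiplier=1.96` gives the slightly narrower interval used for more precise work.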

    Level of Confidence

    • The default level of confidence for statistical testing is 95%

    • Life-and-death measurements (e.g. medical testing) use 99%

    • Some wimpy software studies might use level of confidence as low as 80%

      • Anything below 95% confidence is a sign of really weak data, and that the authors are struggling to find a connection

    The “t” Statistic

    • The t-statistic is defined as t = (parameter estimate) / (standard error)

      • If |t| > 2 then the parameter estimate is significantly different from zero at the 95% level of confidence

      • In the example on slide 56:

        t = 6.2/1.9 = 3.26 for the constant term

        t = 1.3/0.42 = 3.1 for the slope

        Hence both terms are significantly different from zero.

      • Note: For very precise work, use |t| > 1.96 instead of 2
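The t-statistic check is a one-line calculation. The sketch below applies it to the example coefficients; the function names and the third, non-significant case are made up for illustration.

```python
def t_statistic(estimate, std_error):
    """t = (parameter estimate) / (standard error)."""
    return estimate / std_error


def is_significant(estimate, std_error, threshold=2.0):
    """True if |t| exceeds the threshold (2 by rule of thumb, 1.96 for precise work)."""
    return abs(t_statistic(estimate, std_error)) > threshold


t_intercept = t_statistic(6.2, 1.9)
t_slope = t_statistic(1.3, 0.42)

print(f"constant term: t = {t_intercept:.2f}, significant: {is_significant(6.2, 1.9)}")
print(f"slope:         t = {t_slope:.2f}, significant: {is_significant(1.3, 0.42)}")

# A made-up estimate that fails the test: |t| = 0.5 / 0.4 = 1.25
print(is_significant(0.5, 0.4))
```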

    What can we ignore?

    • If a parameter estimate associated with a variable is not significantly different from zero at the 95% level of confidence, then the variable should be omitted from the analysis

    • This is important later for seeing if a curve fit is useful; if the coefficients pass the t-test, they may be used

    95% Confidence Interval

    • The 95% Confidence Interval (CI) narrows down which values of a parameter are practically plausible

    • Any parameter with a measured mean and non-zero std error could in principle take almost any value – though values far from the mean are very unlikely to occur

    • The 95% CI lets us exclude everything outside of [mean +/- 2 std errors] as “too unlikely”

    The Normal Distribution

    • The “normal” or Gaussian distribution is the familiar bell shaped curve

      • It’s the basis for the 95% CI

    • The width of the distribution is measured by the standard deviation σ

      • The area under the curve within +/- 1σ from the middle covers 68.27% of all events

      • Within +/- 2σ covers 95.45%

    The Normal Distribution

    • Within +/- 3σ covers 99.73%

    • Within +/- 4σ covers 99.9937%

    • Within +/- 5σ covers 99.999943%

    • Within +/- 6σ covers 99.9999998%

  • The Six Sigma quality objective is to have the number of defects under a few parts per million (ppm)

    • 3.4 ppm, for reasons discussed on their site, instead of the value implied above
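The coverage figures for the normal distribution can be reproduced with the error function from Python's standard library: the fraction of events within k standard deviations of the mean is erf(k / sqrt(2)). This is a sketch for checking the numbers, not part of the course tooling.

```python
from math import erf, sqrt


def coverage(k):
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in range(1, 7):
    inside = coverage(k)
    outside_ppm = (1 - inside) * 1_000_000
    print(f"+/- {k} sigma: {inside * 100:.7f}% inside, {outside_ppm:.4f} ppm outside")
```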
