
ST 370 Probability and Statistics for Engineers Lecture 1 Introduction

ST 370 Probability and Statistics for Engineers Lecture 1 Introduction. Dr. Zhao-Bang Zeng Department of Statistics NC State University. Syllabus.


Presentation Transcript


  1. ST 370 Probability and Statistics for Engineers Lecture 1 Introduction Dr. Zhao-Bang Zeng Department of Statistics NC State University

  2. Syllabus • Textbook: D.C. Montgomery and G.C. Runger. Applied Statistics and Probability for Engineers. 5th edition, Wiley, 2011. We will cover selected topics in Chapters 1-9, 11, 13, and 14. • Course Webpage: http://statgen.ncsu.edu/zeng/ST370/ • Lecture: TH 3:00 pm-4:15 pm, 232A Withers Hall • Office Hours: • Instructor: Wed/Fri 10-12am, or by appointment at 3211C Broughton Hall • TA: Tues 1:00-3:00pm at Tutorial Center in 1101 SAS Hall • Homework: Assignments will be submitted and graded through WebAssign (http://webassign.ncsu.edu) • Exams: Two midterms (Sept 23, Oct 28) and a final exam (Dec 9). • Computing Resources: StatCrunch • Optional Computing Resources in R lab (R-Lab), MATLAB (M-Lab), and Splus labs (S-Lab)

  3. Course objectives By the end of the course you should be able to: • Construct basic numeric and graphical summaries of data • Plan and analyze simple factorial designs • Calculate probabilities using basic probability distributions • Make inferences using basic statistics • (more detailed learning objectives at course webpage)

  4. Course Topics and Objectives • Introduction to types of studies, data collection • Graphical and numerical summaries • Design of experiments • Factorial data analysis • Analysis of Variance • Simple linear regression • Discrete random variables • Continuous random variables • Normal Distribution • Sampling Distribution of the sample mean • Confidence Intervals for µ • Hypothesis Testing for µ

  5. What Do Engineers Do? • An engineer is someone who solves problems of interest to society with the efficient application of scientific principles by: • Refining existing products • Designing new products or processes

  6. The Creative Process

  7. Statistics Supports The Creative Process • The field of statistics deals with the collection, presentation, analysis, and use of data to: • Make decisions • Solve problems • Design products and processes • It is the science of data.

  8. Variability • Statistical techniques are useful to describe and understand variability. • By variability, we mean successive observations of a system or phenomenon do not produce exactly the same result. • Statistics gives us a framework for describing this variability and for learning about potential sources of variability.

  9. An Engineering Example of Variability Eight prototype units are produced and their pull-off forces are measured (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1. Not all of the prototypes have the same pull-off force: the measurements differ from unit to unit, and this is the variability we want to describe. The dot diagram is a very useful plot for displaying a small body of data - say up to about 20 observations. This plot allows us to see easily two features of the data: the location, or the middle, and the scatter or variability. Figure 1-2 Dot diagram of the pull-off force data.
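
The location and scatter of the eight pull-off forces can be checked with a short script. This is only a sketch: it uses the sample mean and sample standard deviation as one common choice of summaries, and the text-based `dot_diagram` helper is a hypothetical stand-in for Figure 1-2, not the plot from the slides.

```python
from collections import Counter
from statistics import mean, stdev

# Pull-off forces (pounds) for the eight prototypes, as listed on the slide.
pull_off_force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]


def dot_diagram(values, step=0.1):
    """Return a text dot diagram: one row per 0.1 lb, one '*' per observation.

    Hypothetical helper for illustration only.
    """
    counts = Counter(round(v, 1) for v in values)
    lo, hi = min(counts), max(counts)
    n_steps = round((hi - lo) / step)
    rows = []
    for i in range(n_steps + 1):
        x = round(lo + i * step, 1)
        rows.append(f"{x:5.1f} | " + "*" * counts.get(x, 0))
    return "\n".join(rows)


location = mean(pull_off_force)   # the "middle" of the data
scatter = stdev(pull_off_force)   # sample standard deviation (divides by n - 1)

print(dot_diagram(pull_off_force))
print(f"mean = {location:.2f} lb, sample standard deviation = {scatter:.2f} lb")
```

The diagram shows the repeated value 12.6 as two stacked marks, and the two summaries give one number for the location and one for the scatter.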

  10. Statistics • Statistics provides a framework for describing the variability in a system, and for learning the impact on the system of various factors.

  11. Methods of statistics follow a process A preliminary step: Identify the research objective • What is the question to be answered? • How do we collect data (observed values on individual subjects) about the things we want to make a statement about? • What group of interest, or population, does the statement refer to?

  12. The process 1. Collect the information needed to answer the questions • Gaining access to the entire population may pose problems, and thus we typically look at a subset of the population, called a sample, to observe the variable of interest

  13. Example: Want opinion on issue at NCSU • Variable of interest = opinion • One observation = one student • Sample = students giving opinion • Population = who do we want to generalize to? Possibilities are • Everyone at NCSU (census) • Engineering students • Male students • Undergrads @ NCSU • ST370 students

  14. Example: lifetime of a lightbulb • Variable of interest = lifetime (in hours) • One observation = one lightbulb • Sample = 30 lightbulbs (30 lifetimes) • Population = all lightbulbs that could be manufactured Here the population is conceptual, and not finite

  15. Enumerative vs. Analytical • In an enumerative study, we have a finite population (e.g. population is our class) • In an analytical study, we have an infinite/conceptual population (it does not all exist at one time or place)

  16. Why do we take samples? • Why do we take samples (instead of observing the whole population)? • The population may be too large • Time restrictions • The population might be conceptual like in the example above • Impractical (the experiment breaks what we are testing) • Limited resources to collect accurate data • Population might be inaccessible

  17. The process 2. Organize and summarize the information • After we observe and record data (values of one or more variables in a sample), we can perform descriptive statistics that describe the data through numerical measurements, tables, charts, graphs ….. • We want to know about the distribution of the variable(s); that is, the possible values and the corresponding prevalence of different (sets of) possible values …..

  18. Summarizing data (cont.) Sometimes we might settle for summaries of the distribution:  • Summaries of the distribution of the whole population are called parameters • Summaries of the distribution of the sample (observed values) only are called statistics

  19. Parameters vs. Statistics
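
The distinction between a parameter and a statistic can be illustrated with a small script. This is a sketch under stated assumptions: the finite `population` of 1000 lightbulb lifetimes is made up for illustration (the lightbulb population on the earlier slide is conceptual, not finite), and the sample size of 30 mirrors that slide.

```python
import random
from statistics import mean

# HYPOTHETICAL finite population: 1000 lifetimes (hours) between 900 and 1099.
# These values are invented for illustration; they are not from the lecture.
population = [900 + (i * 37) % 200 for i in range(1000)]

# Parameter: a summary of the whole population (usually unknown in practice).
mu = mean(population)

# Statistic: the same summary computed from a sample only.
rng = random.Random(370)          # fixed seed so the run is repeatable
sample = rng.sample(population, 30)
x_bar = mean(sample)

print(f"parameter (population mean) = {mu:.1f} hours")
print(f"statistic (sample mean, n = {len(sample)}) = {x_bar:.1f} hours")
```

A different seed gives a different sample and a different value of the statistic, while the parameter stays fixed.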

  20. The process 3. Draw conclusions from the information • Make an inferential statement on the question posed, based on the information collected from the sample, in reference to the population.

  21. Example A textile manufacturer is investigating a new drapery yarn, which the company claims has a mean thread elongation of 12 kilograms. A random sample of 12 specimens has a mean thread elongation of 11.5 kilograms. How reliable is this estimate of the true mean value? How can we test the hypothesis that the true mean is 12 kilograms?

  22. Hypothesis Tests • Hypothesis Test • A statement about a process behavior value. • Compared to a claim about another process value. • Data is gathered to support or refute the claim. • One-sample hypothesis test: • Example: • drapery yarn mean thread elongation = 12 kilograms • vs. • drapery yarn mean thread elongation ≠ 12 kilograms
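
One way to carry out the drapery yarn test is a one-sample z-test, sketched below. The slides give only the claimed mean (12 kg), the sample mean (11.5 kg), and the sample size (12), so the standard deviation `sigma = 0.5` is an assumption made purely for illustration, as is the choice of a z-test with a known standard deviation.

```python
import math

mu_0 = 12.0    # claimed mean thread elongation (kg), from the slide
x_bar = 11.5   # observed sample mean (kg), from the slide
n = 12         # number of specimens, from the slide
sigma = 0.5    # ASSUMPTION: population standard deviation (kg), not given in the slides


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# How reliable is the estimate? The standard error of the sample mean.
standard_error = sigma / math.sqrt(n)

# Test H0: mu = 12 against H1: mu != 12 (two-sided).
z = (x_bar - mu_0) / standard_error
p_value = 2 * normal_cdf(-abs(z))

# Approximate 95% confidence interval for the true mean.
ci_95 = (x_bar - 1.96 * standard_error, x_bar + 1.96 * standard_error)

print(f"standard error = {standard_error:.3f} kg")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.5f}")
print(f"95% CI for the mean: ({ci_95[0]:.2f}, {ci_95[1]:.2f}) kg")
```

Under the assumed standard deviation, the claimed value of 12 kg falls outside the 95% interval, so the data would not support the claim. A different assumed `sigma` could change that conclusion.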

  23. Example: SAT Scores • Parents and teachers have been concerned about declining SAT scores and sought ways to halt the decline. • 50 students (24 males and 26 females), matched according to socio-economic background, participated in a study to examine the effect of classroom atmosphere (strict or permissive) on student performance, as measured by SAT scores at the end of the school year. • The students were divided into two groups of 25 each (12 males and 13 females), with Group 1 studying under a strict atmosphere and Group 2 studying under a very permissive atmosphere.

  24. Example: SAT Scores • After nine months, all students were given the same standardized tests: the math test and the verbal test.

  25. Example: SAT Scores • This example involves experimental design, data collection, data analysis, and statistical inference. • How? • Questions: • Does stricter classroom atmosphere increase the average score? • Is the group size 50 large enough to make a confident conclusion? • Why were the students “matched according to socio-economic background?” • Why “12 males and 13 females per group”?

  26. For this example: • Population: the entire group of individuals of that region that take the SAT • Sample: the 50 students selected in the study • Sample size: 50 • Statistical inference: Based on the data from the study, we infer whether a stricter classroom atmosphere increases SAT scores in general
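
The "12 males and 13 females per group" split can be produced by randomizing within each sex, sketched below. The slides do not say how the students were actually divided, so random assignment within strata is an assumption, and the student IDs and the `stratified_split` helper are hypothetical.

```python
import random

# HYPOTHETICAL roster: 24 males and 26 females, as in the study description.
students = [(f"M{i:02d}", "M") for i in range(1, 25)] + \
           [(f"F{i:02d}", "F") for i in range(1, 27)]


def stratified_split(roster, rng):
    """Randomly split each sex in half so both groups get the same sex mix."""
    groups = {"strict": [], "permissive": []}
    for sex in ("M", "F"):
        stratum = [s for s in roster if s[1] == sex]
        rng.shuffle(stratum)                  # random order within the stratum
        half = len(stratum) // 2
        groups["strict"].extend(stratum[:half])
        groups["permissive"].extend(stratum[half:])
    return groups


groups = stratified_split(students, random.Random(2024))

for name, members in groups.items():
    males = sum(1 for _, sex in members if sex == "M")
    print(f"{name}: {len(members)} students, {males} males, {len(members) - males} females")
```

Balancing sex across the two groups keeps a difference in scores between groups from being explained by a difference in sex mix.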

  27. Types of Variables • A categorical or qualitative variable places an individual into one of several groups or categories. • A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.

  28. Example: SAT Scores • Which variables are categorical or qualitative? • Which variables are quantitative?

  29. Discrete vs. continuous variables A discrete variable is quantitative and has either a finite number of possible values or a countable number of values (can be lined up with 0,1,2,3,…)   • Counts are a classical example A continuous variable is quantitative and has an uncountably infinite number of possible values (takes values in intervals or a continuum)  • Lifetime of a light bulb is continuous, though we tend to make it discrete by grouping into number of days, etc.

  30. Basic Methods of Collecting Data Three basic methods of collecting data: • A retrospective study using historical data • Data collected in the past for other purposes. • An observational study • Data, presently collected, by a passive observer. • A designed experiment • Data collected in response to process input changes.

  31. Observational vs. Experimental • Observational study – investigator’s role is basically passive. Individuals in a sample are observed and values are recorded, but the process is disturbed as little as possible, i.e. no attempt is made to manipulate or influence the variables of interest. This type of study is good for establishing whether two variables are related, or for learning characteristics of a population. Observational studies are carried out when control is unethical or impossible.

  32. Observational vs. experimental • Experimental study (Designed experiment) – investigator’s role is active. The controllable variables of the system are changed and the output data is recorded. Inference is made about which variables are responsible for the observed variation. Another way of saying this is that treatments are applied to experimental units, to try to determine the effects of the treatment on the response variable. This type of study is better for establishing causation.

  33. Causality • It is easier to make inferential statements about causality from a balanced designed experiment • Experiments designed with basic principles such as randomization are needed to establish cause-and-effect relationships. However: • Real systems are complex, so complete randomization may be difficult to execute. • There may be important variables in the background that are changing and are the true reason for instances of favorable system behavior (lurking variables)
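
The effect of a lurking variable can be seen in a small simulation, sketched below. Everything in it is hypothetical: the data are simulated, the variable names are invented, and by construction the input has no causal effect on the response.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(33)   # fixed seed so the run is repeatable
n = 2000

# A background (lurking) variable drives the response.
lurking = [rng.gauss(0, 1) for _ in range(n)]
response = [z + rng.gauss(0, 0.5) for z in lurking]

# Observational setting: the input is also driven by the lurking variable.
x_observed = [z + rng.gauss(0, 0.5) for z in lurking]

# Designed experiment: the input is set at random, independent of the background.
x_randomized = [rng.gauss(0, 1) for _ in range(n)]

r_observational = pearson(x_observed, response)
r_randomized = pearson(x_randomized, response)

print(f"observational correlation: {r_observational:.2f}")
print(f"randomized correlation:    {r_randomized:.2f}")
```

The observational data show a strong correlation even though the input does nothing, while randomizing the input breaks its link to the lurking variable and the correlation falls to near zero.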
