Overview

Too many variables, and too many people or things, cause thorny problems in data analysis, and the issue is not simply the availability of sufficient computing power to handle all that data.

zeno

Presentation Transcript


  1. Overview
     • Too many variables, and too many people or things, cause thorny problems in data analysis, and the issue is not simply the availability of sufficient computing power to handle all that data.
     • Principal components analysis seeks to identify and quantify a small number of underlying components that account for most of the variation in the data by analyzing the original, observable variables. In many cases, we can wind up working with just a few (on the order of, say, three to ten) principal components or factors instead of tens or hundreds of conventionally measured variables.
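
The reduction described in slide 1 can be sketched in R. This is a minimal illustration, not the presenter's analysis: the data are simulated, the names `latent`, `weights`, and `observed` are hypothetical, and the 90% threshold is an arbitrary choice. Ten observed variables are generated from two hidden factors, and `prcomp` is used to see how many components are needed to carry most of the variance.

```r
# Simulated data (an assumption for illustration): 200 observations of
# 10 observed variables driven by 2 hidden factors plus a little noise.
set.seed(42)
n <- 200
latent  <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
weights <- matrix(runif(2 * 10, min = 0.5, max = 1.5), nrow = 2, ncol = 10)
observed <- latent %*% weights + matrix(rnorm(n * 10, sd = 0.1), nrow = n)

# Principal components on centered, scaled variables.
pca <- prcomp(observed, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component, and the number of
# components needed to reach 90% of the total.
var_share <- pca$sdev^2 / sum(pca$sdev^2)
n_needed  <- which(cumsum(var_share) >= 0.90)[1]

print(round(var_share, 3))
print(n_needed)
```

Because the simulated variables depend on only two factors, one or two components should be enough here. Real data rarely separate this cleanly, which is why the slide speaks of three to ten components.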

  2. Observable variables [Diagram: observable variables X1, X2, X3 and components Z1, Z2, Z3] Which component explains the most variance?
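
One way to answer the question on slide 2 is to look at the eigenvalues of the correlation matrix. The sketch below assumes three simulated variables standing in for X1, X2, X3 (the slide's actual data are not given) and computes the proportion of variance for components labeled Z1, Z2, Z3.

```r
# Hypothetical observed variables X1, X2, X3 (simulated; not from the slides).
set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.5)   # strongly related to X1
x3 <- rnorm(100)                  # generated independently of X1 and X2
X <- cbind(X1 = x1, X2 = x2, X3 = x3)

# Eigen decomposition of the correlation matrix.
# For symmetric input, eigen() returns eigenvalues in decreasing order.
e <- eigen(cor(X))

# Each eigenvalue is the variance of one component; dividing by the
# sum gives the proportion of variance that component explains.
prop_var <- e$values / sum(e$values)
names(prop_var) <- c("Z1", "Z2", "Z3")
print(round(prop_var, 3))

# Component scores: standardized data projected onto the eigenvectors.
Z <- scale(X) %*% e$vectors
colnames(Z) <- names(prop_var)
```

Since the eigenvalues come back sorted, the first component (Z1) always explains at least as much variance as any other.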

  3. Data Structures (framework source: Hadley Wickham)
     Numeric vector: a <- c(1,2,5.3,6,-2,4)
     Character vector: b <- c("one","two","three")
     Matrix: y <- matrix(1:20, nrow=5, ncol=4)
     List: w <- list(name="Fred", age=5.3)
     Data frame: d <- c(1,2,3,4); e <- c("red", "white", "red", NA); f <- c(TRUE,TRUE,TRUE,FALSE); mydata <- data.frame(d,e,f); names(mydata) <- c("ID","Color","Passed")

  4. Identity Matrix Inverse
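
The slide text for this topic did not survive, so the following is only a small sketch of the two ideas in its title, using a made-up 2 x 2 matrix `A`: the identity matrix leaves a matrix unchanged under multiplication, and a matrix times its inverse gives the identity.

```r
# A small invertible matrix (values chosen only for illustration).
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)

I2    <- diag(2)     # 2 x 2 identity matrix
A_inv <- solve(A)    # inverse of A

# Multiplying by the identity leaves A unchanged.
same_as_A <- isTRUE(all.equal(A %*% I2, A))

# A matrix times its inverse gives the identity (up to rounding error).
gives_identity <- isTRUE(all.equal(A %*% A_inv, I2))

print(same_as_A)
print(gives_identity)
```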

  5. Covariance
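
As a hedged illustration of the covariance slide, whose body is missing from the transcript, the sketch below computes a sample covariance by hand and compares it with R's built-in `cov`. The vectors `x` and `y` are made-up values.

```r
# Two short numeric vectors (made-up values for illustration).
x <- c(2, 4, 6, 8)
y <- c(1, 3, 2, 5)

# Sample covariance by hand: sum of products of deviations from the
# means, divided by n - 1.
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)

# Built-in equivalent.
cov_builtin <- cov(x, y)

# cov() on a matrix returns the full covariance matrix;
# its diagonal holds the variances of the columns.
cov_matrix <- cov(cbind(x, y))

print(cov_manual)
print(cov_matrix)
```

The covariance matrix is the starting point for principal components when the variables share a common scale; when they do not, the correlation matrix is the usual alternative.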

  6. R PRINCOMP

  7. Principal Components

  8. Principal Components

  9. Factor Loadings
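
A minimal sketch of how loadings can be inspected with `princomp`, the function named on slide 6. The choice of the built-in `USArrests` data set is an assumption; the slides do not say which data were analyzed.

```r
# USArrests ships with R (datasets package). Using it here is an
# assumption, not the presenter's data.
fit <- princomp(USArrests, cor = TRUE)   # cor = TRUE: use the correlation matrix

print(summary(fit))   # standard deviation and variance share per component

L <- loadings(fit)    # one column of weights per component
print(L)              # the print method blanks out small values by default

# In princomp the loadings are the eigenvectors, so each column
# has unit length.
col_lengths <- colSums(unclass(L)^2)
print(col_lengths)
```

A large absolute loading means the original variable contributes strongly to that component; the sign only indicates direction and can flip between runs or software packages.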

  10. Categorical Variables
