1 / 16

Advanced Excel for Data Analysis

Advanced Excel for Data Analysis. Yale University StatLab. What will be covered. Following up on Intro to Excel Workshop Importing and exporting data Data management cleaning and manipulation Statistics and data analysis Pivot tables Graphing and charts Writing custom Macros/VBA

glynn
Download Presentation

Advanced Excel for Data Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Advanced Excel for Data Analysis Yale University StatLab

  2. What will be covered • Following up on Intro to Excel Workshop • Importing and exporting data • Data management • cleaning and manipulation • Statistics and data analysis • Pivot tables • Graphing and charts • Writing custom Macros/VBA • More time for specific questions at the end

  3. Quick Review of Some Basics • We’re using Excel 2007 • Excel 2010 looks different again (improvement) • Recent Excel improvements/changes • More rows (not 65536 anymore) • Note new file suffixes: .xlsx and .xlsm • Not backwards compatible -save as .xls to re-open in older versions-free download allows older versions to open • More levels of sorting and filtering • Better charting

  4. When and Why Excel • It’s a spreadsheet • Great for calculations and formulas • Okay for Data Management • But that’s what most people use it for • Should probably be using Access • Everybody has Excel (including Macs) • No VBA or SharePoint access for Macs • That’s how we got the Data • Ubiquitous file format widely imported and exported • We’ve been using it our whole lives, right? • Most of us actually only scratch the surface.

  5. Importing And Exporting • Important for Importing and Exporting: Define Variable Types (string, date, int) • Open • “All Files” (*.*) • Text to Column, Delimited (basic) • Let’s see an example http://statlab.stat.yale.edu/help/workshops/introData/dataintro • StatTransfer • Supports Excel from/to anything • Does batch file conversion • Embedding in Word/Powerpoint • Keeps data linked so if you change data you don’t have to re-import to document

  6. Working with Data • Treat Data Like Data, not a bunch of cells • Defining Data Ranges • Name your Sheets • Name your data ranges • Conditional formatting and Data Validation • Quick way to determine whether your data looks good • Set ranges so that if a value falls outside the allowed range, the cell formatting changes • Create a unique ID row if it doesn’t exist • Allows you to restore order and merge

  7. Filtering, Sorting and Finding Data • Most common question: I need to find unique values or a subset of my data. • Sorting and Filtering – make sure you highlight the entire dataset! • Adding a conditional dummy variable is easy • Remember to copy and paste as values • Lookups • VLOOKUP and HLOOKUP • Used with named ranges

  8. Text Manipulation • If cell A1 = “abcde ” • TRIM(A7)=abcde • gets rid of extra spaces • RIGHT (A7,2)=de • and LEFT, MID • SUBSTITUTE (A7, “c”, “o”)=abode • Comparing Cells (especially in Macros): • If cell A1 = “12345” (a string, not a #) • VALUE(A1)=12345 • Useful for building syntax, repetitive text for other programs

  9. Statistics and Data Analysis • Many basic summary functions • AVERAGE, STDEV, CORREL, etc. • CONFIDENCE for confidence intervals • Probability distributions • NORMDIST, NORMINV, etc. • Array functions • LINEST, FORECAST • Data Analysis Tools • Data tab -> Data Analysis • If not installed, upper left circle -> Add-Ins • Linear regression, Chi-square

  10. Pivot Tables and Pivot Charts • Easy snapshot views of your data. • Pivot charts for Large Data • Great for transforming time series data into tables • Be aware: Default calculation is “COUNT” and you probably want “SUM” • Click-and-drag interface makes this easy to use, but requires manual refresh anytime your data changes

  11. Graphing and Charts • Dependent on column/row as x,y • You may need to copy, paste special/transpose to get the chart type you want • Charts for publishing • Black and white usually preferred • Patterns not gradients • Dots Per Inch (DPI) • Charts should depend on what is being conveyed • Bar graph (measurements) • Scatter plots (correlation, multivariate) • No Pie Charts

  12. More Advanced Charts • Trend lines • Changing the data to series • Removing/Hiding lines • Error Bars • Stock (High-Low-Close) • Excel wants defined variable names, but it can be tricked • Excellent tips for advanced graphing online • http://peltiertech.com/

  13. Writing Your Own Macros • Be careful: You can’t UNDO running a macro. • Use the record function to get the idea and then customize for your needs. • Basic structure is Object Oriented • Object.property() • The built-in Editor will show options • Supports If, then, else, for loops , while loops • Be careful: You can’t UNDO running a macro.

  14. More advanced • Financial Calculation functions • Present value, IRR, amortization • Database connections • What-if analysis • Forms (with Data Validation) • Password protection

  15. Resources • Lots of Yale resources available • StatLab Consultants • Sherlock Campbell • Themba Flowers • Orbis.yale.edu • We have access to several online resources • Excel Hacks (O’Reilly) • YouTube. (Oh really?) • Internet Searches • Bing might actually be better for Office

  16. Thank you Questions?

More Related