
Lecture1 – Introduction and Organization

Lecture1 – Introduction and Organization. Rice ELEC 697 Farinaz Koushanfar Fall 2006. Summary. Syllabus Course outline Motivation Class census. Syllabus – ELEC 697. Title: “Applications of Modern Statistical Learning Theory in Embedded Networked Systems” Instructor


Presentation Transcript


  1. Lecture1 – Introduction and Organization Rice ELEC 697 Farinaz Koushanfar Fall 2006

  2. Summary • Syllabus • Course outline • Motivation • Class census

  3. Syllabus – ELEC 697 • Title: “Applications of Modern Statistical Learning Theory in Embedded Networked Systems” • Instructor • Farinaz Koushanfar, Rice University • Meeting time • 02:30 PM - 03:50 PM TR (Tuesday/Thursday) • Meeting place • 2014, Duncan Hall • Prerequisites • Self-contained, but assumes undergraduate-level knowledge of probability and math

  4. Syllabus - Overview and Goals • Overview • Practical statistical learning methods and tools • Modeling and optimizing emerging embedded systems • Research areas: embedded networked systems, sensor networks, and your own research area, assuming you will need the methods there • Emphasis on the methods rather than the theoretical aspects • Goals • Solid understanding of state-of-the-art learning methods • Hands-on experience with statistical modeling software • Applications of statistical modeling in sensor networks (SN), the Internet, networks, intrusion detection, CAD, and VLSI • A universal tool for your own research

  5. Syllabus – Book and More… • Textbook • The Elements of Statistical Learning: Data Mining, Inference, and Prediction, T. Hastie; R. Tibshirani; J. Friedman; New York: Springer, 2001. • Recommended further reading • Pattern Classification (2nd ed.), R. Duda; P. Hart; D. Stork; Wiley Interscience, 2001. • Modern Applied Statistics with S-PLUS (3rd ed.), W. Venables; B. Ripley; Springer, 1999. • Papers from the literature • Course webpage • http://www.ece.rice.edu/~fk1/classes/ELEC697.htm

  6. Syllabus – Grading and Project • Grading • Weekly assignments (20%) • Mid-semester oral presentation (15%) • Paper presentation and discussion (15%) • Class project report (30%) • Class project presentation (20%) • Project • Groups of 1 or 2 (collaborations encouraged) • Dataset to analyze and model, can be more theoretical • Either propose or select from my projects/datasets

  7. Syllabus - Software • Hands-on experience with a data analysis and modeling tool • S programming language (S-PLUS/R) • You can download R from CRAN at: http://cran.us.r-project.org/ • Documentation is also available at CRAN • Many more resources are available on the web

  8. Course Outline • Week 1: Orientation and overview of supervised learning and its applications in embedded networks • Week 2: Intro to R, linear regression, model selection, validation • Week 3: Applications of regression in embedded networks (HW 0) • Week 4: Linear classification: LDA, logistic regression, separating hyperplanes • Week 5: Applications of classification in embedded networks (HW 1) • Week 6: Available datasets, possible project proposals, and project selection • Week 7: Model assessment and selection • Week 8: Applications of model selection and validation in embedded networked systems (HW 2)

  9. Course Outline (Cont’d) • Week 9: Kernel methods • Week 10: Applications of kernel methods in embedded networked systems (HW 3) • Week 11: Mid-term project proposal and presentations • Week 12: Model inference and averaging: boosting, ML, EM • Week 13: Applications of model inference in embedded networked systems (HW 4) • Week 14: Progress report: presenting the work related to your project and your goals • Week 15: Summary • Week 16: Final project presentations and reports, plus paper presentations

  10. Class Census • Tell me about yourself! • Your name • Your year of study • Your field or your interest • Your advisor

  11. Statistical Learning - General • Plays a key role in science, finance, and industry • Examples: • Predict the probability of a second heart attack (demographic, diet, and clinical measures) • Predict stock prices 6 months ahead (company performance and economic data) • Identify the digits in a handwritten ZIP code • Estimate the glucose level in a diabetic patient's blood (infrared absorption spectrum) • Identify the risk factors for prostate cancer (clinical and demographic variables)

  12. Sensor Networks (SN) • Figure panels: xbow MICA2 DOT motes; Contaminant Transport; Seismic Response; Environmental Sensing • Courtesy of Prof. Deborah Estrin (UCLA-CENS)

  13. Statistical Learning - SN • Classification/target detection • Modeling biological systems • Inter-sensor modeling • Sleep coordination, compression, intrusion detection/security • Characterization of sensors, a rapidly growing market, e.g. • Pressure sensors: revenue of $4,018.8M in 2004, projected $5,545.1M in 2011 • Image sensors: $4B++ in 2005, led by the camera phone application • Fiber-optic sensors: $288.1M now, will be $304.3M in 2006 • Bio-sensors: ?? • Proximity, photoelectric, and linear displacement sensors: $1B in 2004, will be $1.05B in 2007 • Nano-sensors: will grow by more than 30% by 2009 • Source: Sensors & Transducers Magazine (S&T e-Digest), Vol. 62, Issue 12, December 2005, pp. 456-461

  14. Statistical Learning – VLSI/CAD • Nanometer-scale devices: increased process variation and decreased predictability of circuit performance • Traditionally, corner-case models were used, which are pessimistic • The magnitude of variations in gate length is predicted to increase from 35% in a 130nm technology to ~60% in a 70nm technology • The variations are specified as the fraction 3σ/μ • The major trade-off is computational efficiency • Figure labels: Photoresist line pattern; PDF • Source: King, Wada, Woo, IEEE Transactions on Semiconductor Manufacturing, Vol. 17, No. 2, May 2004
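The variation fraction on this slide can be illustrated with a short calculation. This is a minimal sketch, assuming the fraction is the usual ratio of three standard deviations to the mean (3σ/μ); the function name `three_sigma_over_mean` and the gate-length values are hypothetical and chosen only for illustration.

```python
import statistics


def three_sigma_over_mean(samples):
    """Return 3*sigma/mu for a list of measured parameter values."""
    mu = statistics.mean(samples)
    sigma = statistics.pstdev(samples)  # population standard deviation
    return 3.0 * sigma / mu


# Hypothetical gate-length measurements in nm (illustrative values only,
# not taken from the cited paper).
gate_lengths_nm = [128.0, 131.0, 130.0, 133.0, 127.0, 131.0]
print(f"3*sigma/mu = {three_sigma_over_mean(gate_lengths_nm):.1%}")
```

A larger ratio means the parameter is less predictable relative to its nominal value, which is the trend the slide describes for smaller technology nodes.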

  15. Sources of Variations • Process variations • The value of process parameters observed after fabrication • Parametric yield: the fraction of manufactured samples that meet the performance constraints • Environmental variations • Modeling variations • Power and delay models used to perform design, analysis and optimization are inaccurate • Other sources • Change in process parameters with time • Hot electrons • Process instability
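Parametric yield, as defined above, can be estimated by sampling. The sketch below assumes a single performance constraint (a maximum delay) and a Gaussian delay distribution; the names `parametric_yield` and `simulate_delays`, the nominal delay, and the spread are all hypothetical choices for illustration, not values from the lecture.

```python
import random


def parametric_yield(delays, max_delay):
    """Fraction of samples whose delay meets the constraint delay <= max_delay."""
    passing = sum(1 for d in delays if d <= max_delay)
    return passing / len(delays)


def simulate_delays(n, nominal=1.0, sigma=0.05, seed=0):
    """Draw n hypothetical circuit delays (ns) from a Gaussian model."""
    rng = random.Random(seed)
    return [rng.gauss(nominal, sigma) for _ in range(n)]


delays = simulate_delays(10_000)
print(f"estimated yield at 1.10 ns: {parametric_yield(delays, 1.10):.3f}")
```

A corner-case analysis would instead design for the worst sampled delay, which is why the previous slide calls it pessimistic.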

  16. The Theme of the Course • About practical learning methods: something you can learn and use in your research • This is neither an embedded system design course nor a sensor network design course! • The research topics are there to motivate real applications of statistical learning in other fields • You do not need any prior knowledge of these subjects to follow this course • Dynamic reading list

  17. Learning from Data • Supervised learning • Outcome measurement: either categorical or quantitative • Predict outcome from a set of features • Training set of data • A good learner can predict a testing set well • Unsupervised learning • Only features, no outcome
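The supervised-learning workflow above (fit on a training set, then check predictions on a separate testing set) can be sketched as follows. The data are synthetic and the learner is a deliberately simple threshold rule; `make_data`, `fit_threshold`, `predict`, and `accuracy` are hypothetical names, and the rule assumes class 1 has the larger feature mean.

```python
import random


def make_data(n, seed):
    """Synthetic two-class data: one feature, class 1 is shifted upward."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feature = rng.gauss(3.0 * label, 1.0)
        data.append((feature, label))
    return data


def fit_threshold(train):
    """Learn a threshold halfway between the two class means."""
    by_class = {0: [], 1: []}
    for x, y in train:
        by_class[y].append(x)
    mean0 = sum(by_class[0]) / len(by_class[0])
    mean1 = sum(by_class[1]) / len(by_class[1])
    return (mean0 + mean1) / 2.0


def predict(threshold, x):
    """Predict class 1 when the feature exceeds the learned threshold."""
    return 1 if x > threshold else 0


def accuracy(threshold, data):
    """Fraction of (feature, label) pairs predicted correctly."""
    return sum(predict(threshold, x) == y for x, y in data) / len(data)


train = make_data(200, seed=1)
test = make_data(200, seed=2)  # held out: never used for fitting
threshold = fit_threshold(train)
print("train accuracy:", accuracy(threshold, train))
print("test accuracy:", accuracy(threshold, test))
```

The test accuracy is the number that matters here, since it measures how well the learner predicts data it has not seen.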

  18. Example 1: Email Spam • Categorical outcome: spam or email • 4601 email messages • Rule-based learning, e.g. • if (%george < 0.6) & (%you > 1.5) then spam else email
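The rule on this slide translates directly into code. This is a minimal sketch, assuming that %george and %you denote the percentage of words in a message equal to "george" and "you"; `word_percentage` and `classify` are hypothetical helper names, and the whitespace tokenization is a simplification that does not strip punctuation.

```python
def word_percentage(message, word):
    """Percentage of whitespace-separated words in message equal to word."""
    words = message.lower().split()
    if not words:
        return 0.0
    return 100.0 * words.count(word) / len(words)


def classify(pct_george, pct_you):
    """Apply the slide's rule: spam if %george < 0.6 and %you > 1.5."""
    if pct_george < 0.6 and pct_you > 1.5:
        return "spam"
    return "email"


# Hypothetical message, for illustration only.
message = "you have won a prize claim it now because you deserve it"
label = classify(word_percentage(message, "george"),
                 word_percentage(message, "you"))
print(label)
```

The thresholds 0.6 and 1.5 come from the slide; in practice they would be learned from the labeled messages rather than fixed by hand.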

  19. Example 2: Prostate Cancer • Correlation between the level of prostate-specific antigen (PSA) and clinical predictors • Regression problem!
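A regression problem of this kind predicts a quantitative outcome from predictors. The sketch below fits a least-squares line with a single predictor; `fit_line` is a hypothetical name, and the numbers are made up for illustration and are not the prostate cancer data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up predictor and response values (not real clinical measurements).
predictor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
response = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

intercept, slope = fit_line(predictor, response)
print(f"response ~ {intercept:.2f} + {slope:.2f} * predictor")
```

With several clinical predictors, the same idea extends to multiple linear regression, which the course outline covers in Week 2.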

  20. Example 3: Handwritten Digit Recognition • Automatic envelope sorting procedure • 16x16 8-bit grayscale, intensity from 0-255 • Classification problem!
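To treat digit recognition as a classification problem, each 16x16 image is typically turned into a vector of 256 features. The sketch below shows that step plus a 1-nearest-neighbor classifier, which is one possible choice and not necessarily the method used in the lecture; the function names and the toy "dark"/"bright" images are hypothetical stand-ins for real digit scans.

```python
def to_feature_vector(image):
    """Flatten a 16x16 image of 0-255 intensities into 256 values in [0, 1]."""
    if len(image) != 16 or any(len(row) != 16 for row in image):
        raise ValueError("expected a 16x16 image")
    return [pixel / 255.0 for row in image for pixel in row]


def nearest_neighbor_label(train, query):
    """Label of the training vector closest to query (squared Euclidean)."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    _, best_label = min(train, key=lambda pair: dist2(pair[0], query))
    return best_label


# Toy training set: two uniform images standing in for labeled digits.
blank = [[0] * 16 for _ in range(16)]
full = [[255] * 16 for _ in range(16)]
train = [(to_feature_vector(blank), "dark"),
         (to_feature_vector(full), "bright")]

mostly_dark = [[30] * 16 for _ in range(16)]
print(nearest_neighbor_label(train, to_feature_vector(mostly_dark)))
```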
