
A Data Mining Course for Computer Science and non Computer Science Students

Jamil Saquer, Computer Science Department, Missouri State University, Springfield, MO. Outline: Introduction, Motivation, Challenges, Design of the Course, Topics Covered, Assignments, Examination Format, Conclusion.


Presentation Transcript


  1. A Data Mining Course for Computer Science and non Computer Science Students • Jamil Saquer • Computer Science Department, Missouri State University, Springfield, MO

  2. Outline • Introduction • Motivation • Challenges • Design of the Course • Topics Covered • Assignments • Examination Format • Conclusion

  3. Introduction • What is data mining (DM)? • The non-trivial process of identifying valid, novel, useful, and ultimately understandable patterns in large volumes of data • DM is an interdisciplinary topic • Has much in common with machine learning and pattern recognition

  4. Motivation for the Course • Introducing more electives • Introducing graduate-level CS courses • Informatics Program • Of interest to faculty members and students from other departments • Author’s main area of research

  5. Challenges in Designing the Course • Diverse student population • CS vs. non-CS • undergrad vs. grad • Solution • Informatics program in design stages • MNAS CS option is new • Therefore, emphasis on undergrad CS students

  6. Accommodating Other Students • Minimize prerequisites • CS 2 (or even CS 1) • Capable of using DM software • Scientific background/mentality • One from business, another from GGP • For grad CS students: • Project requires more research • Tests could be a little different • Emphasize understanding basic DM concepts and using software for mining data

  7. Design of the Course • Used book by Dunham • Book divided into 3 parts • About 1 week spent on definitions, applications, motivations, challenges, … • Core of the course spent on core DM subjects: classification, clustering, mining association rules • Last week for project presentations

  8. Classification • Assigning objects to classes • Supervised learning • Example: classify a military vehicle as a friendly or an enemy vehicle • Methods covered include: decision trees, naïve Bayes, k-nearest neighbor, backpropagation
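As an illustration of one of the methods listed on this slide, the following is a minimal sketch of k-nearest neighbor classification. It is not taken from the course materials: the function name `knn_classify`, the two numeric features, and the toy vehicle data are all assumptions made up for this example.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_tuple, label) pairs.
    """
    # Sort training points by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up numeric features per vehicle.
train = [
    ((1.0, 1.0), "friendly"),
    ((1.2, 0.8), "friendly"),
    ((0.9, 1.1), "friendly"),
    ((5.0, 5.0), "enemy"),
    ((5.2, 4.8), "enemy"),
    ((4.9, 5.1), "enemy"),
]

print(knn_classify(train, (1.1, 1.0)))
print(knn_classify(train, (5.1, 5.0)))
```

Because the labels come from the training data, this is supervised learning: the classifier can only assign classes it has already seen.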

  9. Clustering • Grouping similar objects into clusters • Unsupervised learning • Example: cluster Web log data to discover groups of similar access patterns • Techniques covered include: link algorithms, nearest neighbor, k-means, PAM, BIRCH, DBSCAN, CURE, ROCK
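To make one of the listed techniques concrete, here is a minimal sketch of k-means. It is an illustration only, not the course's implementation: the function name `k_means`, the choice to pass initial centroids explicitly (so the run is deterministic), and the toy points standing in for Web log features are all assumptions.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Cluster `points` around the given initial `centroids`.

    Returns (final_centroids, clusters), where clusters[i] holds the
    points assigned to final_centroids[i].
    """
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two made-up numeric features per session.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, [(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(clusters)
```

No labels are supplied here, which is what makes the method unsupervised: the groups emerge from distances between the points alone.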

  10. Association Rules • Finding items that frequently occur together • Example: diapers and beer are often bought together • Techniques covered: Apriori, sampling, partitioning, FP-growth
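The level-wise idea behind Apriori can be sketched as follows. This is a simplified illustration under stated assumptions, not the course's code: the names `apriori` and `confidence`, the use of raw support counts instead of fractions, and the toy baskets are all invented for this example.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support_count} for itemsets in >= min_count baskets."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        # Number of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions)

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: count(c) for c in current}
        level = {c: n for c, n in level.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical toy baskets.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"diapers", "beer", "bread"},
]
frequent = apriori(baskets, min_count=2)
print(frequent)
print(confidence(frequent, {"diapers"}, {"beer"}))
```

The prune step is what keeps the search manageable: a candidate is counted against the data only if all of its smaller subsets were already found to be frequent.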

  11. Assignments • Students need to learn how to mine data • One assignment on each core DM topic • Apply two different algorithms to at least two data sets, one of which must be relatively large • Can use any DM package (e.g., Weka) • Students write a report • Students learn how to run an experiment

  12. Term Project • Group projects • Either provide a non-trivial implementation of a DM algorithm • Or learn about a DM topic not discussed in class • Graduate students are required to read at least three research papers and to write a report • All students present their projects in class

  13. Examination Format • Open book • Two types of questions • The first type requires basic knowledge of the material • Definitions, T/F, short answers • The second type requires applying certain algorithms to small data sets

  14. Conclusion • DM is an interesting course for CS and non-CS students • DM can be taught to non-CS students • A DM course can be taught to students with minimal CS background

  15. Questions
