Data Mining Information Retrieval Web Search. Logistics. Instructor: Byron Gao http://cs.txstate.edu/~jg66/ Contact, office hours etc … Course page Database prerequisite? Textbook Jiawei Han and Micheline Kamber Workload Grading: curve History Weighting Late policy, academic honesty
Figure labels: Statistical Summary, Querying, and Reporting; Data Preprocessing/Integration, Data Warehouses; Paper, Files, Web documents, Scientific experiments, Database Systems
Architecture: Typical Data Mining System
Figure labels: Data Mining Engine; Database or Data Warehouse Server; data cleaning, integration, and selection; Database
Data Mining: Confluence of Multiple Disciplines (figure label: Algorithm)
login-name < department < university < country
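The schema hierarchy above can be read as an ordering of attributes from most specific to most general. The following is a minimal sketch of rolling records up to a chosen level, assuming each record already carries a value for every level. The field names, helper names (generalize, roll_up), and sample records are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Schema hierarchy from the slide, ordered from most specific to most general.
LEVELS = ["login_name", "department", "university", "country"]

# Hypothetical records; every value here is made up for illustration.
RECORDS = [
    {"login_name": "alice", "department": "CS", "university": "Univ A", "country": "USA"},
    {"login_name": "bob", "department": "CS", "university": "Univ A", "country": "USA"},
    {"login_name": "carol", "department": "Math", "university": "Univ B", "country": "Canada"},
]


def generalize(record, level):
    """Return the record's value at the requested hierarchy level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    return record[level]


def roll_up(records, level):
    """Count records after generalizing each one to the requested level."""
    return Counter(generalize(r, level) for r in records)


if __name__ == "__main__":
    for level in LEVELS:
        print(level, dict(roll_up(RECORDS, level)))
```

Moving to a higher level merges more records into each group, which is the effect a concept hierarchy is meant to give when mining at multiple levels of abstraction.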
Simplicity: e.g., (association) rule length, (decision) tree size
Certainty: e.g., confidence, P(A|B) = #(A and B) / #(B), classification reliability or accuracy, certainty factor, rule strength, rule quality, discriminating weight, etc.
Utility: potential usefulness, e.g., support (association), noise threshold (description)
Novelty: not previously known, surprising (used to remove redundant rules, e.g., Illinois vs. Champaign rule implication support ratio)
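The confidence and support measures mentioned above can be computed by counting transactions. The following is a minimal sketch, assuming transactions are represented as sets of items. The function names and the sample transactions are hypothetical, and support is taken here as the fraction of transactions containing an itemset, which is one common convention.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def support(transactions, itemset):
    """Fraction of all transactions that contain itemset."""
    if not transactions:
        return 0.0
    return count_containing(transactions, itemset) / len(transactions)


def confidence(transactions, a, b):
    """P(A|B) = #(A and B) / #(B), as written on the slide."""
    count_b = count_containing(transactions, b)
    if count_b == 0:
        return 0.0  # assumption: report 0 when B never occurs
    count_ab = count_containing(transactions, set(a) | set(b))
    return count_ab / count_b


# Hypothetical transactions, made up for illustration.
TRANSACTIONS = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"milk", "bread"}))
    print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))
```

A mining system would typically compare these values against user-chosen thresholds to decide which patterns are worth reporting.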