
A Conceptual Framework of Data Mining

A Conceptual Framework of Data Mining. Y.Y. Yao Department of Computer Science, University of Regina Regina, Sask., Canada S4S 0A2 yyao@cs.uregina.ca http://www.cs.uregina.ca/~yyao. Acknowledgements. Thanks to Professors Wang Jue Zhou Zhi-Hua Zhou Aoying

Presentation Transcript


  1. A Conceptual Framework of Data Mining Y.Y. Yao Department of Computer Science, University of Regina Regina, Sask., Canada S4S 0A2 yyao@cs.uregina.ca http://www.cs.uregina.ca/~yyao

  2. Acknowledgements • Thanks to Professors Wang Jue, Zhou Zhi-Hua, and Zhou Aoying for the kind invitation and this opportunity.

  3. Motivations “The question typically is not what is an ecosystem, but how do we measure certain relationships between populations, how do some variables correlate with other variables, and how can we use this knowledge to extend our domain.” Salthe, S.N. Evolving Hierarchical Systems, Their Structure and Representation

  4. Motivations “… the scientist is usually not, on the other hand, a self-conscious epistemologist. That would mean going beyond his area of narrow training for the purpose of questioning its point. Functioning as a scientist means functioning within the rules of a game learned during the apprenticeship in which examination of the philosophic foundations of the game plays a characteristically tiny role.”

  5. Motivations (Data Mining) • One is more interested in algorithms for finding “knowledge” than in what knowledge is. • One is more interested in an implementation-oriented view or framework of data mining than in a conceptual framework for understanding the nature of data mining.

  6. Data mining • Function-oriented approaches: Requirements • Theory-oriented approaches: Mathematical/statistical methods • Procedure/process-oriented approaches: KDD processes • There does not exist a conceptual framework for data mining.

  7. Motivations (General) • We are more interested in doing than understanding. • We are more interested in actual systems and methods than a powerful point of view. • We are more interested in solving a real world problem than acquisition of knowledge. • We have enough knowledge, but not sufficient wisdom in using the knowledge.

  8. Motivations • Four international workshops have been held on the foundations of data mining. • There still does not exist a well-accepted and non-controversial framework. • Many papers do not cover the “foundations of data mining”.

  9. The question • How to view and study data mining? • What can we learn from our experiences? • From other fields. • From well established branches.

  10. Knowledge structure and problem solving in physics • Reif and Heller, 1982 • “Effective problem solving in a realistic domain depends crucially on the content and structure of the knowledge about the particular domain.” • The knowledge about physics “specifies special descriptive concepts and relations described at various levels of abstractness, is organized hierarchically, and is accompanied by explicit guidelines specifying when and how this knowledge is to be applied.”

  11. Knowledge structure and education • Experts and novices differ in their knowledge organization. • Experts are able to establish multiple representations of the same problem at different levels of granularity. • Experts are able to see the connections between different grain-sized knowledge.

  12. Cognitive Science • Posner, 1989 • According to the cognitive science approach, to learn a new field is to build appropriate cognitive structures and to learn to perform computations that will transform what is known into what is not yet known.

  13. A New View • Data mining as a field of study, rather than simply a collection of algorithms or a combination of several fields. • The study of data mining may be viewed as a scientific enquiry into the nature of data mining and the scope of data mining methods.

  14. Three basic questions • What are the foundations of data mining? • What is the scope of the foundations of data mining? • What are the differences between existing research and research on the foundations of data mining?

  15. A potential solution • The study of the nature of data mining • The study of data mining methods • The philosophical foundations • The theoretical foundations • The mathematical foundations • The technological foundations

  16. A conceptual framework • A layered framework can be established. • Each layer/level deals with the problem in a different context: • in the mind and in the abstract • in the machine • in applications.

  17. A layered model of Data Mining • Philosophy level • Algorithm/technique level • Application level

  18. A layered model • Philosophy level: What is knowledge? • The study of knowledge & knowledge discovery in mind and in the abstract. • What is knowledge representation? • How to express and communicate knowledge? • What is the relationship between knowledge in mind and in the real world? • How to classify knowledge? • How to organize knowledge?

  19. A layered model • Technique level: How to discover knowledge? • The study of knowledge & knowledge discovery in machine. • How to encode, store, and retrieve knowledge in a computer? • How to develop an efficient algorithm? • How to improve an existing technique?

  20. A layered model • Application level: How to use the discovered knowledge? • The study of the applications of discovered knowledge. • Is the discovered knowledge useful? • Is the discovered knowledge meaningful? • How to use the knowledge?

  21. A layered model • Philosophy level: The study of knowledge & knowledge discovery in mind and in the abstract. • Technique level: The study of knowledge & knowledge discovery in machine. • Application level: The study of the applications of discovered knowledge. • The division among the three levels is not clear-cut, and the levels may overlap. • The inner layers establish a foundation for the outer layers. • The outer layers may raise questions for the inner layers.

  22. A layered model of KDD • The results from the philosophy level provide guidelines and set the stage for the algorithm and application levels. • Philosophical study does not depend on the availability of specific techniques. • Technical study is not constrained by a particular application. • The existence of a type of knowledge in data is unrelated to whether we have an algorithm to extract it. • The existence of an algorithm does not necessarily imply that the discovered knowledge is meaningful and useful.

  23. A layered model of KDD • The three levels represent the understanding, discovery, and utilization of knowledge. • Each of them is indispensable in the study of intelligence and intelligent systems. • They must be considered together in a common framework through multi-disciplinary studies, rather than in isolation.

  24. Application of the layered framework • Concept formation and learning can be studied within the layered framework. • The reconsideration brings a better understanding of the problem.

  25. Application of the layered framework • Concept formation and learning can be studied within the layered framework. • The reconsideration brings a better understanding of the problem.

  26. Philosophy level study of concept • Classical view A concept is described jointly by its intension and extension.

  27. Philosophy level study of concept

  28. Philosophy level study of concept • Two basic issues of concept formation Aggregation aims at the identification of a group of objects so that they form the extension of a concept. Characterization attempts to describe a set of objects as their intension.

  29. Philosophy level study of concept • Classical view • Concept formation (diagram labels: Differentiation, Aggregation, Integration, Characterization)

  30. Philosophy level study of concept • Classical view • Concept formation (diagram labels: Differences, vs., Aggregation, Characterization)

  31. Philosophy level study of concept • Classical view • Concept formation (diagram labels: Differences, vs., Similarities, Aggregation, Characterization)

  32. Philosophy level study of concept • Classical view • Concept formation (diagram labels: vs., Aggregation, Extension, Characterization, Intension, Shell)

  33. Philosophy level study of concept • Concept formation • Concept learning • Context • Hierarchy

  34. Philosophy level study of concept • Concept formation • Concept learning • Context • Hierarchy

  35. Technique level study of concept • Given a context: • Search for the intension • Search for the extension • Analyze the relationships between concepts

  36. Technique level study of concept • Intensions of concepts defined by a language

  37. Technique level study of concept • Intensions of concepts defined by a language

  38. Technique level study of concept • Conjunctive concept space

  39. Technique level study of concept • Conjunctive concept space
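The slide's own definition of the conjunctive concept space is not preserved in this transcript. As a minimal sketch, assuming a formula is a conjunction of attribute-value pairs in which each attribute appears at most once, the space can be enumerated as follows. The attribute names, the `DOMAINS` dictionary, and the function `conjunctive_concepts` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product

# Hypothetical attribute domains (not taken from the slides).
DOMAINS = {
    "height": ["short", "tall"],
    "hair": ["blond", "dark", "red"],
}

def conjunctive_concepts(domains):
    """Yield every conjunction of attribute-value pairs.

    A formula is a frozenset of (attribute, value) pairs with each
    attribute used at most once. The empty frozenset stands for the
    most general formula (no condition at all).
    """
    attrs = sorted(domains)
    for k in range(len(attrs) + 1):
        for chosen in combinations(attrs, k):
            for values in product(*(domains[a] for a in chosen)):
                yield frozenset(zip(chosen, values))

space = list(conjunctive_concepts(DOMAINS))
```

Under these assumptions the size of the space is the product of (domain size + 1) over all attributes, since each attribute is either left unconstrained or fixed to one of its values.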

  40. Technique level study of concept

  41. Technique level study of concept • Extensions of concepts defined by an information table

  42. Technique level study of concept

  43. Technique level study of concept • Extensions of concepts defined by an information table
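One way to read "extensions of concepts defined by an information table" is that the extension of a formula is the set of objects in the table that satisfy it. The sketch below assumes that reading and a conjunctive formula represented as a set of attribute-value pairs. The table contents and the names `TABLE` and `extension` are hypothetical.

```python
# Hypothetical information table: each object maps attributes to values.
TABLE = {
    "o1": {"height": "short", "hair": "blond", "class": "+"},
    "o2": {"height": "short", "hair": "dark", "class": "-"},
    "o3": {"height": "tall", "hair": "red", "class": "+"},
    "o4": {"height": "tall", "hair": "dark", "class": "-"},
    "o5": {"height": "tall", "hair": "blond", "class": "+"},
}

def extension(formula, table):
    """Return the set of objects satisfying every (attribute, value) pair."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(attr) == value for attr, value in formula)
    }

tall = frozenset({("height", "tall")})
tall_dark = frozenset({("height", "tall"), ("hair", "dark")})

tall_objects = extension(tall, TABLE)
tall_dark_objects = extension(tall_dark, TABLE)
```

Adding a condition to a conjunction can only shrink the extension, which is why the empty formula has the whole universe as its extension in this sketch.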

  44. Technique level study of concept • Relationship between concepts in an information table

  45. Technique level study of concept • Relationship between concepts in an information table
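Relationships between concepts can be examined through their extensions. The following sketch assumes two set-based relationships, a sub-concept relation (extension inclusion) and disjointness. The slide's actual definitions are not preserved here, so the function names `is_subconcept` and `are_disjoint` and the sample table are assumptions.

```python
# Hypothetical information table (illustration only).
TABLE = {
    "o1": {"height": "short", "hair": "blond"},
    "o2": {"height": "short", "hair": "dark"},
    "o3": {"height": "tall", "hair": "red"},
    "o4": {"height": "tall", "hair": "dark"},
    "o5": {"height": "tall", "hair": "blond"},
}

def extension(formula, table):
    """Objects that satisfy every (attribute, value) pair of the formula."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(attr) == value for attr, value in formula)
    }

def is_subconcept(phi, psi, table):
    """True when every object satisfying phi also satisfies psi."""
    return extension(phi, table) <= extension(psi, table)

def are_disjoint(phi, psi, table):
    """True when no object satisfies both phi and psi."""
    return extension(phi, table).isdisjoint(extension(psi, table))

red = frozenset({("hair", "red")})
tall = frozenset({("height", "tall")})
short = frozenset({("height", "short")})
```

Note that these relationships hold relative to one table: in this toy data every red-haired object happens to be tall, so `red` is a sub-concept of `tall` here, but that need not hold in another table.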

  46. Technique level study of concept • Probabilistic measures:

  47. Technique level study of concept • Probabilistic measures:
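The specific probabilistic measures shown on the slide are not preserved in this transcript. As an illustration only, the sketch below computes three measures commonly used for rules over an information table: generality |m(φ)|/|U|, confidence |m(φ∧ψ)|/|m(φ)|, and coverage |m(φ∧ψ)|/|m(ψ)|, where m(·) denotes the extension. Whether these are the measures the author intended is an assumption, as are all names and the sample data.

```python
from fractions import Fraction

# Hypothetical information table (illustration only).
TABLE = {
    "o1": {"hair": "blond", "class": "+"},
    "o2": {"hair": "dark", "class": "-"},
    "o3": {"hair": "red", "class": "+"},
    "o4": {"hair": "dark", "class": "-"},
    "o5": {"hair": "blond", "class": "+"},
}

def extension(formula, table):
    """Objects that satisfy every (attribute, value) pair of the formula."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(attr) == value for attr, value in formula)
    }

def generality(phi, table):
    """Fraction of all objects that satisfy phi."""
    return Fraction(len(extension(phi, table)), len(table))

def confidence(phi, psi, table):
    """Fraction of objects satisfying phi that also satisfy psi."""
    m_phi = extension(phi, table)
    if not m_phi:
        return Fraction(0)  # convention chosen here for an empty extension
    return Fraction(len(m_phi & extension(psi, table)), len(m_phi))

def coverage(phi, psi, table):
    """Fraction of objects satisfying psi that also satisfy phi."""
    m_psi = extension(psi, table)
    if not m_psi:
        return Fraction(0)  # convention chosen here for an empty extension
    return Fraction(len(extension(phi, table) & m_psi), len(m_psi))

blond = frozenset({("hair", "blond")})
positive = frozenset({("class", "+")})
```

Exact fractions are used instead of floats so that the values can be compared without rounding concerns.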

  48. Technique level study of concept Concept learning as search

  49. Technique level study of concept Concept learning as search

  50. Technique level study of concept Concept learning as search
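"Concept learning as search" can be illustrated with a general-to-specific, breadth-first search over conjunctive formulas. This is a sketch of one possible search strategy, not the procedure from the slides (which is not preserved here): it starts from the empty formula and adds one attribute-value pair at a time until the extension is non-empty and lies inside the target set of objects. The table, the target, and the name `learn_conjunctions` are hypothetical.

```python
from collections import deque

# Hypothetical information table and target concept (illustration only).
TABLE = {
    "o1": {"height": "short", "hair": "blond"},
    "o2": {"height": "short", "hair": "dark"},
    "o3": {"height": "tall", "hair": "red"},
    "o4": {"height": "tall", "hair": "dark"},
    "o5": {"height": "tall", "hair": "blond"},
}
TARGET = {"o1", "o3", "o5"}  # extension of the concept to be learned

def extension(formula, table):
    """Objects that satisfy every (attribute, value) pair of the formula."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(attr) == value for attr, value in formula)
    }

def learn_conjunctions(table, attrs, target):
    """Return the most general conjunctions whose non-empty extension
    is contained in the target set."""
    start = frozenset()
    queue = deque([start])
    seen = {start}
    solutions = []
    while queue:
        formula = queue.popleft()
        # Skip formulas that merely specialize an already found solution.
        if any(s <= formula for s in solutions):
            continue
        ext = extension(formula, table)
        if not ext:
            continue  # no object satisfies the formula
        if ext <= target:
            solutions.append(formula)
            continue
        used = {attr for attr, _ in formula}
        for attr in attrs:
            if attr in used:
                continue
            for value in sorted({row[attr] for row in table.values()}):
                child = formula | {(attr, value)}
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
    return solutions

found = learn_conjunctions(TABLE, ["height", "hair"], TARGET)
```

Because the search is breadth-first, shorter (more general) formulas are examined before their specializations, so the pruning step keeps only minimal descriptions of parts of the target.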
