University of Economics, Prague

Presentation Transcript


  1. University of Economics, Prague: MLNET-related activities of the Laboratory for Intelligent Systems and the Dept. of Information and Knowledge Engineering (http://lisp.vse.cz/~berka/MLNet.html)

  2. Research
     • probabilistic methods - decomposable probability models and Bayesian networks
     • symbolic methods - generalized association rules and decision rules
     • logical calculi for knowledge discovery in databases
     (c) Petr Berka, LISp, 2000

  3. People: Petr Berka, Jiří Ivánek, Radim Jiroušek, Jan Rauch, Vojtěch Svátek, Tomáš Kočka

  4. Software: LISp-Miner
     • two data mining procedures: 4FT-Miner (generalised association rules) and KEX (decision rules)
     • a large preprocessing module, including SQL
     • rules are output in database format, so users can implement their own interpretation procedures

  5. LISp-Miner procedures
     • 4FT-Miner (GUHA procedure): generalised association rules in the form Ant ~ Suc / Cond
     • KEX: weighted decision rules in the form Ant ==> C (weight)
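The slides give only the form of a KEX rule, Ant ==> C (weight), not how the weights of several applicable rules are combined. As a rough illustration, the sketch below assumes a PROSPECTOR-style pseudo-Bayesian combination, in line with the title of the ECML'94 reference on slide 15. The function names `combine` and `classify`, the dictionary encoding of antecedents, and the neutral starting weight 0.5 are all assumptions made for this example, not details taken from the presentation.

```python
from functools import reduce


def combine(w1: float, w2: float) -> float:
    """Pseudo-Bayesian combination of two weights in (0, 1).

    ASSUMPTION: this operator is not given in the slides; it is the
    usual PROSPECTOR-style formula, for which 0.5 is a neutral element.
    """
    num = w1 * w2
    den = num + (1.0 - w1) * (1.0 - w2)
    if den == 0.0:
        raise ValueError("weights 0 and 1 cannot be combined")
    return num / den


def classify(example: dict, rules: list) -> float:
    """Combine the weights of all rules whose antecedent matches.

    Each rule is a pair (antecedent, weight); an antecedent is a dict
    of attribute -> required value (hypothetical encoding). An empty
    antecedent matches every example.
    """
    applicable = [w for ant, w in rules if ant.items() <= example.items()]
    return reduce(combine, applicable, 0.5)


# Hypothetical rules for the class "bad loan".
rules = [
    ({}, 0.6),                     # default rule
    ({"Sex": "F"}, 0.7),
    ({"District": "Brno"}, 0.2),
]
score = classify({"Sex": "F", "District": "Prague"}, rules)
```

With this operator, weights above 0.5 push the combined score towards the class and weights below 0.5 push it away, so a rule with weight 0.5 has no effect.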

  6. 4FT-Miner
     Data matrix:

             |          CLIENTS             |             LOANS
     Id      | Age  Sex  Salary  District   | Amount  Payment  Months  Quality
     1       | 45   F    28 000  Prague     | 48 000  1 000    48      good
     ...     | ...  ...  ...     ...        | ...     ...      ...     ...
     70 000  | 18   M    12 000  Brno       | 36 000  2 000    18      bad

     Problem: Are there segments of clients SC and segments of loans SL such that being in SC is at 90% equivalent to having a loan from SL, and there are at least 100 such clients?

     "Ant is at 90% equivalent to Suc": Ant ⇔_{0.90, 100} Suc is true iff a/(a+b+c) ≥ 0.9 ∧ a ≥ 100, where a, b, c, d are the frequencies of the four-fold table:

              Suc   ¬Suc
     Ant      a     b
     ¬Ant     c     d

     • a - number of objects satisfying Ant and Suc
     • b - number of objects satisfying Ant and not satisfying Suc
     • c - number of objects not satisfying Ant and satisfying Suc
     • d - number of objects satisfying neither Ant nor Suc
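The quantifier on slide 6 can be checked mechanically once Ant and Suc have been evaluated for every object. The following is a minimal sketch of that check, not the LISp-Miner implementation; the function names `four_fold_table` and `quantifier_holds` and the boolean-column representation are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def four_fold_table(ant: Sequence[bool],
                    suc: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Count the frequencies (a, b, c, d) for two boolean columns.

    ant[i] / suc[i] say whether object i satisfies Ant / Suc.
    """
    if len(ant) != len(suc):
        raise ValueError("columns must have the same length")
    a = b = c = d = 0
    for in_ant, in_suc in zip(ant, suc):
        if in_ant and in_suc:
            a += 1          # Ant and Suc
        elif in_ant:
            b += 1          # Ant, not Suc
        elif in_suc:
            c += 1          # not Ant, Suc
        else:
            d += 1          # neither
    return a, b, c, d


def quantifier_holds(a: int, b: int, c: int,
                     p: float = 0.9, base: int = 100) -> bool:
    """True iff a/(a+b+c) >= p and a >= base (d is not used)."""
    total = a + b + c
    return total > 0 and a >= base and a / total >= p


table = four_fold_table([True, True, False, False],
                        [True, False, True, False])
```

For instance, `quantifier_holds(950, 30, 20)` evaluates the frequencies shown on slide 7. Note that d does not enter the condition, so objects satisfying neither Ant nor Suc have no influence on the result.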

  7. 4FT-Miner
     • Input:
       • Data matrix
       • Quantifier ⇔_{0.90, 100}
       • Derived attributes for SC (possible Ant): Age (7 values), Sex (2 values), Salary (3 values), District (77 values)
       • Derived attributes for SL (possible Suc): Amount (6 values), Duration (5 values), Quality (2 values)
     • Output:
       • All relations Ant ⇔_{0.90, 100} Suc that are true in the data matrix (5 equivalences from about 5 million possible relations)
       • An example: Age(20 - 30) ∧ Sex(F) ∧ Salary(low) ∧ District(Prague) ⇔_{0.90, 100} Amount<20,50) ∧ Quality(Bad)

                Suc    ¬Suc
       Ant      950    30
       ¬Ant     20     69 000

         a/(a+b+c) = 950/(950+30+20) = 0.95 ≥ 0.9 and a = 950 ≥ 100
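Slide 7 describes the task as a search: candidate antecedents and succedents are built from the derived attributes, and only the pairs satisfying the quantifier are reported. The slides do not say how 4FT-Miner organises or prunes this search, so the following is only a naive brute-force sketch on a toy data set. The names `conjunctions`, `satisfies` and `mine`, the `max_len` limit, and the four example rows are all hypothetical.

```python
from itertools import combinations, product


def conjunctions(attributes, max_len):
    """Yield conjunctions as tuples of (attribute, value) literals,
    using each attribute at most once per conjunction."""
    names = sorted(attributes)
    for k in range(1, max_len + 1):
        for chosen in combinations(names, k):
            for values in product(*(attributes[n] for n in chosen)):
                yield tuple(zip(chosen, values))


def satisfies(row, conj):
    """True iff the row has the required value for every literal."""
    return all(row[attr] == val for attr, val in conj)


def mine(rows, ant_attrs, suc_attrs, p, base, max_len=2):
    """Return all (Ant, Suc, a, b, c) with a/(a+b+c) >= p and a >= base."""
    found = []
    for ant in conjunctions(ant_attrs, max_len):
        for suc in conjunctions(suc_attrs, max_len):
            a = b = c = 0
            for row in rows:
                in_ant, in_suc = satisfies(row, ant), satisfies(row, suc)
                if in_ant and in_suc:
                    a += 1
                elif in_ant:
                    b += 1
                elif in_suc:
                    c += 1
            if a > 0 and a >= base and a / (a + b + c) >= p:
                found.append((ant, suc, a, b, c))
    return found


# Toy data (invented for this example, not from the PKDD'99 set).
rows = [
    {"Sex": "F", "District": "Prague", "Quality": "bad"},
    {"Sex": "F", "District": "Prague", "Quality": "bad"},
    {"Sex": "M", "District": "Brno", "Quality": "good"},
    {"Sex": "F", "District": "Brno", "Quality": "good"},
]
ant_attrs = {"Sex": ["F", "M"], "District": ["Prague", "Brno"]}
suc_attrs = {"Quality": ["good", "bad"]}
result = mine(rows, ant_attrs, suc_attrs, p=0.9, base=2)
```

Every candidate pair is scanned against every row here, which is only workable for tiny inputs. The number of candidates grows multiplicatively with the attribute value counts, which is consistent with the "about 5 million possible relations" reported on the slide.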

  8. KEX - classification

  9. KEX - learning

  10. LISp-Miner

  11. LISp-Miner

  12. LISp-Miner

  13. LISp-Miner

  14. 4FT-Miner and KEX applications
      • truck reliability assessment
      • quality control in a brewery
      • segmentation of clients of a bank
      • short-term electric load prediction

  15. LISp-Miner references
      • Berka, P. - Ivanek, J.: Automated Knowledge Acquisition for PROSPECTOR-like Expert Systems. In: (Bergadano, deRaedt eds.) Proc. ECML'94, Springer 1994, 339-342.
      • Berka, P. - Rauch, J.: Data Mining using GUHA and KEX. In: (Callaos, Yang, Aguilar eds.) 4th Int. Conf. on Information Systems, Analysis and Synthesis ISAS'98, 1998, Vol. 2, 238-244.
      • Rauch, J.: Classes of Four Fold Table Quantifiers. In: (Zytkow, Quafafou eds.) Principles of Data Mining and Knowledge Discovery. Springer 1998, 203-211.

  16. Datasets: PKDD'99 Discovery Challenge data (http://lisp.vse.cz/pkdd99/chall.htm)
      • financial data: clients of a bank, their accounts, transactions, loans, etc.
      • medical data: patients with collagen disease

  17. Financial data

  18. Medical data

  19. Other activities
      • Organized conferences: http://lisp.vse.cz/ecml97/, http://lisp.vse.cz/pkdd99/
      • Teaching (in Czech): KDD, KDD seminar, ML

  20. New projects: SOL-EU-NET project "Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise" (supported by EU grant IST-1999-11.495)
