
Comparing between machine learning methods for a remote monitoring system.



Presentation Transcript


  1. Comparing between machine learning methods for a remote monitoring system. Ronit Zrahia Final Project Tel-Aviv University

  2. Overview • The remote monitoring system • The project database • Machine learning methods: • Discovery of Association Rules • Inductive Logic Programming • Decision Tree • Applying the methods to the project database and comparing the results

  3. Remote Monitoring System - Description • Support Center has ongoing information on the customer’s equipment • Support Center can, in some situations, know that a customer is going to be in trouble • Support Center initiates a call to the customer • A specialist connects to the site remotely and tries to eliminate the problem before it has an impact

  4. Remote Monitoring System - Description [Architecture figure: customer-site Products (AIX/NT/95) connect over TCP/IP (FTP) to a Gateway (AIX/NT), which connects via modem over TCP/IP (Mail/FTP) to the Support Server.]

  5. Remote Monitoring System - Technique • One of the machines on site, the Gateway, is able to initiate a PPP connection to the support server or to ISP • All the Products on site have a TCP/IP connection to the Gateway • Background tasks on each Product collect relevant information • The data collected from all Products is transferred to the Gateway via ftp • The Gateway automatically dials to the support server or ISP, and sends the data to the subsidiary • The received data is then imported to database

  6. Project Database • 12 columns, 300 records • Each record includes failure information of one product at a specific customer site • The columns are: record no., date, IP address, operating system, customer ID, product, release, product ID, category of application, application, severity, type of service contract

  7. Project Goals • Discover valuable information from database • Improve the products marketing and the customer support of the company • Learn different learning methods, and use them for the project database • Compare the different methods, based on the results

  8. The Learning Methods • Discovery of Association Rules • Inductive Logic Programming • Decision Tree

  9. Discovery of Association Rules - Goals • Finding relations between products which are bought by the customers • Impacts on product marketing • Finding relations between failures in a specific product • Impacts on customer support (failures can be predicted and handled before they have an impact)

  10. Discovery of Association Rules - Definition • A technique developed specifically for data mining • Given • A dataset of customer transactions • A transaction is a collection of items • Find • Correlations between items as rules • Example • Supermarket baskets

  11. Determining Interesting Association Rules • Rules have confidence and support • IF x and y THEN z with confidence c • If x and y are in the basket, then so is z in c% of cases • IF x and y THEN z with support s • The rule holds in s% of all transactions

  12. Discovery of Association Rules - Example • Input Parameters: confidence=50%; support=50% • If A then C: c=66.6% s=50% • If C then A: c=100% s=50%

  13. Itemsets are Basis of Algorithm • Rule A => C • s=s(A, C) = 50% • c=s(A, C)/s(A) = 66.6%
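The transaction table behind these numbers is not reproduced in the transcript; the sketch below uses a toy four-basket dataset chosen so that the slide's figures come out, and shows how support and confidence are computed:

```python
# Toy transactions chosen so the slide's figures come out:
# s(A => C) = 50%, c(A => C) = 66.6%, c(C => A) = 100%.
transactions = [
    {"A", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"B"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """c(antecedent => consequent) = s(antecedent + consequent) / s(antecedent)."""
    combined = set(antecedent) | set(consequent)
    return support(combined, transactions) / support(antecedent, transactions)

print(support({"A", "C"}, transactions))       # 0.5
print(confidence({"A"}, {"C"}, transactions))  # 0.666...
print(confidence({"C"}, {"A"}, transactions))  # 1.0
```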

  14. Algorithm Outline • Find all large itemsets • Sets of items with at least minimum support • Apriori algorithm • Generate rules from large itemsets • For ABCD and AB in large itemset the rule AB=>CD holds if ratio s(ABCD)/s(AB) is large enough • This ratio is the confidence of the rule

  15. Pseudo Algorithm
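The pseudo-code image on this slide did not survive the transcript; below is a minimal Apriori sketch (level-wise candidate generation with the subset-pruning step), run on the classic four-transaction example from the Apriori literature:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    # L1: frequent single items
    current = {frozenset([i]) for i in items
               if sum(i in t for t in transactions) / n >= min_support}
    frequent = set(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have size k
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent; then count support
        current = {c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                   and sum(c <= t for t in transactions) / n >= min_support}
        frequent |= current
        k += 1
    return frequent

transactions = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
large = apriori(transactions, min_support=0.5)
print(frozenset({"B", "C", "E"}) in large)  # True: {B,C,E} appears in 2 of 4 baskets
```

Rules are then generated exactly as slide 14 describes: for a large itemset ABCD and subset AB, emit AB => CD when s(ABCD)/s(AB) clears the confidence threshold.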

  16. Relations Between Products

  17. Relations Between Failures

  18. Inductive Logic Programming - Goals • Finding the preferred customers, based on: • The number of products bought by the customer • The types of failures (i.e., severity levels) that occurred in the products

  19. Inductive Logic Programming - Definition • Inductive construction of first-order clausal theories from examples and background knowledge • The aim is to discover, from a given set of pre-classified examples, a set of classification rules with high predictive power • Examples: • IF Outlook=Sunny AND Humidity=High THEN PlayTennis=No

  20. Horn clause induction Given: P: ground facts to be entailed (positive examples); N: ground facts not to be entailed (negative examples); B: a set of predicate definitions (background theory); L: the hypothesis language; Find a predicate definition H in L (the hypothesis) such that • for every p in P: B ∧ H ⊨ p (completeness) • for every n in N: B ∧ H ⊭ n (consistency)

  21. Inductive Logic Programming - Example • Learning about the relationships between people in a family circle

  22. Algorithm Outline • A space of candidate solutions and an acceptance criterion characterizing solutions to an ILP problem • The search space is typically structured by means of the dual notions of generalization (induction) and specialization (deduction) • A deductive inference rule maps a conjunction of clauses G onto a conjunction of clauses S such that G is more general than S • An inductive inference rule maps a conjunction of clauses S onto a conjunction of clauses G such that G is more general than S • Pruning principles: • When B ∧ H does not entail a positive example, specializations of H can be pruned from the search • When B ∧ H entails a negative example, generalizations of H can be pruned from the search

  23. Pseudo Algorithm

  24. The preferred customers If ( Total_Products_Types( Customer ) > 5 ) and ( All_Severity(Customer) < 3 ) then Preferred_Customer
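The learned rule can be restated as a small predicate. As a sketch (not the project's actual data schema), the customer is modeled here simply as a list of product types and a list of failure severities:

```python
def preferred_customer(product_types, failure_severities):
    """The induced rule: more than 5 distinct product types,
    and every reported failure has severity below 3."""
    return len(set(product_types)) > 5 and all(s < 3 for s in failure_severities)

print(preferred_customer(["P1", "P2", "P3", "P4", "P5", "P6"], [1, 2, 2]))  # True
print(preferred_customer(["P1", "P2"], [1]))                                # False
```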

  25. Decision Trees - Goals • Finding the preferred customers • Finding relations between products which are bought by the customers • Finding relations between failures in a specific product • Compare the Decision Tree results to the results of the previous algorithms

  26. Decision Trees - Definition • Decision tree representation: • Each internal node tests an attribute • Each branch corresponds to attribute value • Each leaf node assigns a classification • Occam’s razor: prefer the shortest hypothesis that fits the data • Examples: • Equipment or medical diagnosis • Credit risk analysis

  27. Algorithm outline • A ← the “best” decision attribute for the next node • Assign A as the decision attribute for the node • For each value of A, create a new descendant of the node • Sort training examples to leaf nodes • If training examples are perfectly classified, then STOP; else iterate over the new leaf nodes

  28. Pseudo algorithm
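This pseudo-code slide is also missing from the transcript; a compact ID3-style sketch of the outline above (best-attribute selection by information gain, then recursive splitting) might look like:

```python
from collections import Counter
from math import log2

def id3(examples, attributes, target):
    """ID3 sketch: examples are dicts; returns a nested-dict tree or a label."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:
        return labels[0]                                 # perfectly classified: leaf
    if not attributes:
        return Counter(labels).most_common(1)[0][0]      # majority leaf

    def entropy(exs):
        counts = Counter(e[target] for e in exs)
        total = len(exs)
        return sum(-c / total * log2(c / total) for c in counts.values())

    def gain(a):
        values = {e[a] for e in examples}
        return entropy(examples) - sum(
            len(sub) / len(examples) * entropy(sub)
            for v in values
            for sub in [[e for e in examples if e[a] == v]])

    best = max(attributes, key=gain)                     # A <- the "best" attribute
    tree = {best: {}}
    for v in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == v]
        tree[best][v] = id3(subset, [a for a in attributes if a != best], target)
    return tree

examples = [
    {"Outlook": "Sunny", "Play": "No"},
    {"Outlook": "Sunny", "Play": "No"},
    {"Outlook": "Overcast", "Play": "Yes"},
]
print(id3(examples, ["Outlook"], "Play"))  # nested dict; branch order may vary
```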

  29. Information Measure • Entropy measures the impurity of the sample of training examples S: Entropy(S) = −Σ p_i log2(p_i) • p_i is the probability of making a particular decision i • There are c possible decisions • The entropy is the amount of information needed to identify the class of an object in S • Maximized when all p_i are equal • Minimized (0) when all but one p_i are 0 (the remaining one is 1)
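Stated as code, this is a direct transcription of the entropy formula:

```python
from math import log2

def entropy(class_counts):
    """Entropy(S) = -sum(p_i * log2 p_i) over the classes present in S."""
    total = sum(class_counts)
    return sum(-c / total * log2(c / total) for c in class_counts if c > 0)

print(round(entropy([9, 5]), 3))  # 0.94  (the [9+,5-] sample used on the next slides)
print(entropy([7, 7]))            # 1.0   (maximized: both classes equally likely)
print(entropy([14, 0]))           # 0.0   (minimized: only one class remains)
```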

  30. Information Measure • Estimate the gain in information from a particular partitioning of the dataset • Gain(S, A) = expected reduction in entropy due to sorting on A • The information that is gained by partitioning S on attribute A is: Gain(S, A) = Entropy(S) − Σ over v in Values(A) of (|S_v| / |S|) Entropy(S_v) • The gain criterion can then be used to select the partition which maximizes information gain
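A sketch of the gain computation, checked against the [9+,5-] PlayTennis sample that the example slides use (exact arithmetic gives 0.152 for Humidity; the slide's .151 comes from rounding the intermediate entropies first):

```python
from math import log2

def entropy(class_counts):
    total = sum(class_counts)
    return sum(-c / total * log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, partition_counts):
    """Gain(S, A) = Entropy(S) - sum(|S_v|/|S| * Entropy(S_v))."""
    total = sum(parent_counts)
    weighted = sum(sum(sub) / total * entropy(sub) for sub in partition_counts)
    return entropy(parent_counts) - weighted

# Humidity splits [9+,5-] into high=[3+,4-] and normal=[6+,1-]
print(round(information_gain([9, 5], [[3, 4], [6, 1]]), 3))  # 0.152
# Wind splits [9+,5-] into weak=[6+,2-] and strong=[3+,3-]
print(round(information_gain([9, 5], [[6, 2], [3, 3]]), 3))  # 0.048
```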

  31. Decision Tree - Example

  32. Decision Tree - Example (Continued) Which attribute is the best classifier? • Splitting S: [9+,5-], E=0.940 on humidity: high → [3+,4-], E=0.985; normal → [6+,1-], E=0.592 • Splitting S on wind: weak → [6+,2-], E=0.811; strong → [3+,3-], E=1.00 • Gain (S, Humidity) = .940 - (7/14).985 - (7/14).592 = .151 • Gain (S, Wind) = .940 - (8/14).811 - (6/14)1.0 = .048 • Gain(S, Outlook) = 0.246 • Gain(S, Temperature) = 0.029

  33. Decision Tree Example – (Continued) • Splitting {D1, D2, …, D14} [9+,5-] on outlook: sunny → {D1,D2,D8,D9,D11} [2+,3-] (?); overcast → {D3,D7,D12,D13} [4+,0-] → Yes; rain → {D4,D5,D6,D10,D14} [3+,2-] (?) • Ssunny = {D1,D2,D8,D9,D11} • Gain(Ssunny, Humidity) = .970 – (3/5)0.0 – (2/5)0.0 = .970 • Gain(Ssunny, Temperature) = .970 – (2/5)0.0 – (2/5)1.0 – (1/5)0.0 = .570 • Gain(Ssunny, Wind) = .970 – (2/5)1.0 – (3/5).918 = .019

  34. Decision Tree Example – (Continued) • The final tree: outlook = sunny → test humidity (high → No; normal → Yes); outlook = overcast → Yes; outlook = rain → test wind (strong → No; weak → Yes)

  35. Overfitting • A tree that fits the training data too closely may not generalize; this is called overfitting • How can we avoid overfitting? • Stop growing when a data split is not statistically significant • Grow the full tree, then post-prune • The post-pruning approach is more common • How to select the “best” tree: • Measure performance over the training data • Measure performance over a separate validation data set

  36. Reduced-Error Pruning • Split data into training and validation set • Do until further pruning is harmful: • Evaluate impact on validation set of pruning each possible node (plus those below it) • Greedily remove the one that most improves validation set accuracy • Produces smallest version of most accurate sub-tree
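The loop above can be sketched over a nested-dict tree. This bottom-up variant prunes wherever validation accuracy does not get worse; for brevity a single majority label is passed in, where a full implementation would track the majority class of the training examples at each node:

```python
def classify(tree, example):
    """Walk a nested-dict tree ({attr: {value: subtree}}) down to a leaf label."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][example[attr]]
    return tree

def accuracy(tree, data, target):
    return sum(classify(tree, e) == e[target] for e in data) / len(data)

def reduced_error_prune(tree, validation, target, majority):
    """Replace a subtree by the majority label whenever that does not
    reduce accuracy on the validation set (bottom-up sketch)."""
    if not isinstance(tree, dict):
        return tree                                   # already a leaf
    attr = next(iter(tree))
    for value, subtree in tree[attr].items():
        tree[attr][value] = reduced_error_prune(subtree, validation, target, majority)
    if accuracy(majority, validation, target) >= accuracy(tree, validation, target):
        return majority                               # prune: collapse to a leaf
    return tree

tree = {"humidity": {"high": "No", "normal": "Yes"}}
validation = [{"humidity": "high", "play": "No"}, {"humidity": "normal", "play": "No"}]
print(reduced_error_prune(tree, validation, "play", "No"))  # prints No: the split does not help here
```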

  37. The Preferred Customer Target attribute is TypeOfServiceContract [Tree figure: the root tests MaxSev at 2.5; one branch is a leaf (NO: 7, YES: 0), the other tests NoOfProducts at 4.5, with leaves (NO: 3, YES: 8) and (NO: 0, YES: 3).]

  38. Relations Between Products Target attribute is Product3 [Tree figure: the root tests Product6 (0/1); one branch is a leaf (NO: 0, YES: 15), the other tests Product2 and then Product9, with leaves (NO: 0, YES: 1), (NO: 4, YES: 0), and (NO: 0, YES: 1).]

  39. Relations Between Failures Target attribute is Application5 [Tree figure: the root tests Application10 (0/1); one branch is a leaf (NO: 0, YES: 11), the other tests Application8 and then Application2, with leaves (NO: 2, YES: 2), (NO: 1, YES: 0), and (NO: 5, YES: 1).]
