Associative Classification (AC) Mining for A Personnel Scheduling Problem

Fadi Thabtah

Presentation Transcript


  1. Associative Classification (AC) Mining for A Personnel Scheduling Problem Fadi Thabtah

  2. Trainer scheduling problem [Diagram: a schedule assigns courses (events) to resources: locations, staff (trainers), and timeslots]

  3. Trainer scheduling problem • Assigning a number of training courses (events) to a limited number of training staff, locations, and timeslots • Each course has a numerical priority value • Each trainer is penalised depending on the travel distance

  4. Objective Function: MAX (total priority for scheduled events - total penalty for training staff)
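
A minimal sketch of evaluating such an objective, assuming it is the total priority of the scheduled events minus the total travel penalty of the trainers. The `Assignment` record, its field names, and the sample courses are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """One scheduled course; all field names are hypothetical."""
    course: str
    trainer: str
    priority: float        # numerical priority value of the course
    travel_penalty: float  # penalty for the trainer's travel distance


def objective(schedule):
    """Total priority of scheduled events minus total trainer penalty."""
    total_priority = sum(a.priority for a in schedule)
    total_penalty = sum(a.travel_penalty for a in schedule)
    return total_priority - total_penalty


# Made-up example schedule with two assigned courses.
schedule = [
    Assignment("Course A", "Trainer 1", priority=10, travel_penalty=2),
    Assignment("Course B", "Trainer 2", priority=7, travel_penalty=4),
]
print(objective(schedule))  # (10 + 7) - (2 + 4)
```

A larger value is better, so a search method would try to raise this number by scheduling high-priority courses with trainers who travel little.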

  5. Hyperheuristic approach • Operates at a higher level of abstraction than metaheuristics • You may think of it as a supervisor that manages the choice of simple local search neighbourhoods (low-level heuristics) at any time

  6. Low-level heuristics • Problem-oriented • Represent simple methods used by human experts • Easy to implement • Examples: • Add new event to the schedule • Swap two events in the schedule • Replace one event in the schedule by another
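
The three example heuristics can be sketched as small functions. This is a simplified illustration that assumes a schedule is an ordered list of event names (position standing in for timeslot) plus a set of events not yet scheduled; trainers, locations, and feasibility checks are deliberately left out, and all function names are hypothetical.

```python
import random


def add_event(schedule, unscheduled, rng):
    """Add one unscheduled event to the schedule (no-op if none remain)."""
    if not unscheduled:
        return schedule, unscheduled
    event = rng.choice(sorted(unscheduled))
    return schedule + [event], unscheduled - {event}


def swap_events(schedule, unscheduled, rng):
    """Swap the positions of two scheduled events."""
    if len(schedule) < 2:
        return schedule, unscheduled
    i, j = rng.sample(range(len(schedule)), 2)
    new = list(schedule)
    new[i], new[j] = new[j], new[i]
    return new, unscheduled


def replace_event(schedule, unscheduled, rng):
    """Replace one scheduled event by an unscheduled one."""
    if not schedule or not unscheduled:
        return schedule, unscheduled
    i = rng.randrange(len(schedule))
    incoming = rng.choice(sorted(unscheduled))
    new = list(schedule)
    outgoing, new[i] = new[i], incoming
    return new, (unscheduled - {incoming}) | {outgoing}


rng = random.Random(0)
print(replace_event(["a"], {"b"}, rng))
```

Each function returns a new schedule rather than modifying its input, so a controlling algorithm can compare the perturbed solution with the current one before deciding which to keep.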

  7. [Diagram: the hyperheuristic chooses among Low Level Heuristic 1, Low Level Heuristic 2, and Low Level Heuristic 3; the chosen heuristic turns the current solution into a perturbed solution]

  8. Building a Schedule Using a Hyperheuristic [Diagram: starting from an initial solution, the hyperheuristic algorithm selects a low-level heuristic from the set of low-level heuristics and applies it to the current solution to obtain a perturbed solution. Objective values and CPU time are reported back to the hyperheuristic, and the current solution is updated according to the acceptance criterion.]
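
The loop in this slide can be sketched as follows. This is a deliberately simple variant, assuming random selection of the low-level heuristic and an "accept if not worse" acceptance criterion; the slides do not fix either choice. The toy problem (event values, `CAPACITY`, and both heuristics) is hypothetical.

```python
import random


def hyperheuristic(initial, heuristics, objective, iterations, seed=0):
    """Random-choice hyperheuristic with a not-worse acceptance criterion."""
    rng = random.Random(seed)
    current, current_value = initial, objective(initial)
    for _ in range(iterations):
        llh = rng.choice(heuristics)       # selected low-level heuristic
        perturbed = llh(current, rng)      # perturbed solution
        value = objective(perturbed)
        if value >= current_value:         # acceptance criterion
            current, current_value = perturbed, value
    return current, current_value


# Toy problem: each event has a net value; at most CAPACITY events fit.
VALUES = {"A": 5, "B": 3, "C": -1, "D": 4, "E": 2}
CAPACITY = 3


def objective(schedule):
    return sum(VALUES[e] for e in schedule)


def add_event(schedule, rng):
    free = sorted(set(VALUES) - set(schedule))
    if len(schedule) >= CAPACITY or not free:
        return schedule
    return schedule | {rng.choice(free)}


def replace_event(schedule, rng):
    free = sorted(set(VALUES) - set(schedule))
    if not schedule or not free:
        return schedule
    out = rng.choice(sorted(schedule))
    return (schedule - {out}) | {rng.choice(free)}


best, value = hyperheuristic(
    frozenset(), [add_event, replace_event], objective, iterations=200
)
print(sorted(best), value)
```

The hyperheuristic itself never looks inside a solution: it only sees objective values, which is what lets the same controller be reused across problems once a set of low-level heuristics is supplied.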

  9. Advantages of hyperheuristics • Cheap and fast to implement • Produce solutions of good quality (comparable to those obtained by hard-to-implement metaheuristic methods) • Require limited domain-specific knowledge • Robustness: can be effectively applied to a wide range of problems and problem instances

  10. Current Hyperheuristic Approaches • Simple hyperheuristics (Cowling et al., 2001-2002) • Choice-function-based (Cowling et al., 2001-2002) • Based on genetic algorithms (Cowling et al., 2002; Han et al., 2002) • Hybrid hyperheuristics (Cowling, Chakhlevitch, 2003-2004)

  11. Why Data Mining? Scenario: while constructing a solution to the scheduling problem, the hyperheuristic must choose an appropriate low-level heuristic (LLH) at each choice point, so an expert decision maker is needed (classification). Two approaches: • Learn the performance of LLHs from past schedules to predict the appropriate LLH in the current one • Learn and predict the LLH while constructing the schedule, so-called learning "on the fly"

  12. Classification: A Two-Step Process 1. Classifier building: describing a set of predetermined classes 2. Classifier usage: • Calculate the error rate • If the error rate is acceptable, apply the classifier to test data [Diagram: training data → classification algorithm → classification rules, which are applied to test data; the class is the LLH]

      Training data:

      RowId  A1  A2  Class
      1      x1  y1  c1
      2      x1  y2  c2
      3      x1  y1  c2
      4      x1  y2  c1
      5      x2  y1  c2
      6      x2  y1  c1
      7      x2  y3  c2
      8      x1  y3  c1
      9      x2  y4  c1
      10     x3  y1  c1

      Test data:

      RowId  A1  A2
      1      x1  y1
      2      x2  y4
      3      x1  y1
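
The two steps can be sketched with a deliberately simple stand-in classifier that predicts the majority class for each value of one attribute. This is not one of the algorithms discussed in the talk; the `phase` attribute, the LLH names used as classes, and the data rows are all made up for illustration.

```python
from collections import Counter, defaultdict


def build_classifier(training, attribute):
    """Step 1: learn 'attribute value -> majority class' rules."""
    counts = defaultdict(Counter)
    for row in training:
        counts[row[attribute]][row["class"]] += 1
    rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    default = Counter(r["class"] for r in training).most_common(1)[0][0]
    return rules, default


def predict(classifier, row, attribute):
    rules, default = classifier
    return rules.get(row[attribute], default)  # unseen value -> default class


def error_rate(classifier, labelled, attribute):
    """Step 2: fraction of labelled rows the classifier gets wrong."""
    wrong = sum(predict(classifier, r, attribute) != r["class"] for r in labelled)
    return wrong / len(labelled)


# Hypothetical data: the class is the LLH that worked at a choice point.
training = [
    {"phase": "early", "class": "add"},
    {"phase": "early", "class": "add"},
    {"phase": "early", "class": "swap"},
    {"phase": "late", "class": "swap"},
    {"phase": "late", "class": "swap"},
]
validation = [
    {"phase": "early", "class": "add"},
    {"phase": "late", "class": "swap"},
    {"phase": "late", "class": "add"},
    {"phase": "mid", "class": "swap"},
]

classifier = build_classifier(training, "phase")
print(error_rate(classifier, validation, "phase"))
```

If the measured error rate is judged acceptable, `predict` would then be applied to unlabelled rows, which in this setting means suggesting an LLH at a new choice point.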

  13. Learning the Performance of LLH [Diagram: the hyperheuristic is applied K times to produce solutions; data mining techniques applied to these solutions produce a rule set (if/then), which guides the derived hyperheuristic algorithm]

  14. Association Rules Mining • A strong tool that aims to find relationships between variables in a database. • It is widely applied, especially in market basket analysis, to infer items from the presence of other items in the customer's shopping cart • Example: if a customer buys milk, what is the probability that he/she buys cereal as well? • Unlike classification, the target class is not pre-specified in association rule mining. • Advantages: • Item shelving • Sales promotions • Future planning

      Transactional database:

      Transaction Id  Items                       Time
      12              bread, milk, juice          10:12
      13              bread, juice, milk          12:13
      14              milk, beer, bread, juice    13:22
      15              bread, eggs, milk           13:26
      16              beer, basket, bread, juice  15:11
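
The "if a customer buys milk" question can be answered on the five transactions listed in this slide. The sketch below uses juice instead of cereal, because cereal does not occur in the listed transactions; the function names are illustrative, and the definitions used are the standard ones (support as the fraction of transactions containing an itemset, confidence as the fraction of antecedent transactions that also contain the consequent).

```python
transactions = {
    12: {"bread", "milk", "juice"},
    13: {"bread", "juice", "milk"},
    14: {"milk", "beer", "bread", "juice"},
    15: {"bread", "eggs", "milk"},
    16: {"beer", "basket", "bread", "juice"},
}


def count(itemset):
    """Number of transactions that contain every item in the itemset."""
    return sum(itemset <= items for items in transactions.values())


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    return count(antecedent | consequent) / count(antecedent)


# Rule: milk -> juice
print(support({"milk", "juice"}))       # 3 of 5 transactions
print(confidence({"milk"}, {"juice"}))  # 3 of the 4 milk transactions
```

Any itemset can appear on the right-hand side here, which is the contrast with classification: no single target attribute is singled out in advance.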

  15. Associative Classification (AC) • A special case of association rule mining that considers only the class label as the consequent of a rule. • Derives a set of class association rules from the training data set that satisfy certain user constraints, i.e. support and confidence thresholds. • Aims to discover the correlations between objects and class labels. • Examples: • CBA • CPAR • CMAR

  16. AC Steps [Diagram: the associative classification algorithm takes the training data, finds the frequent ruleitems (attribute values that pass the support threshold), and produces class association rules for the user]

  17. Rule support and confidence Given a training data set T, for a rule R: condition → c • The support of R, denoted sup(R), is the number of objects in T matching R's condition and having class label c • The confidence of R, denoted conf(R), is the number of objects matching R's condition and having class label c over the number of objects matching R's condition • Any item whose support is larger than the user-specified minimum support is called a frequent itemset
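
These definitions can be turned into a small rule miner. The sketch below counts support as a number of objects, as in the slide, and keeps every rule that reaches both thresholds. The ten rows are an illustrative training set in the style of the two-attribute table from slide 12 (assumed values), the thresholds are chosen for the example rather than taken from the experiments, and the brute-force enumeration is not the search strategy of CBA, MCAR, or MMAC.

```python
from collections import Counter
from itertools import combinations

ATTRS = ("A1", "A2")
ROWS = [  # (A1, A2, class); illustrative values
    ("x1", "y1", "c1"), ("x1", "y2", "c2"), ("x1", "y1", "c2"),
    ("x1", "y2", "c1"), ("x2", "y1", "c2"), ("x2", "y1", "c1"),
    ("x2", "y3", "c2"), ("x1", "y3", "c1"), ("x2", "y4", "c1"),
    ("x3", "y1", "c1"),
]


def class_association_rules(rows, min_sup, min_conf):
    """Return (condition, class, sup, conf) for rules passing both thresholds."""
    cond_count = Counter()  # objects matching a rule condition
    rule_count = Counter()  # objects matching the condition and the class
    for *values, label in rows:
        items = list(zip(ATTRS, values))
        for size in range(1, len(items) + 1):
            for cond in combinations(items, size):
                cond_count[cond] += 1
                rule_count[cond, label] += 1
    rules = []
    for (cond, label), sup in rule_count.items():
        conf = sup / cond_count[cond]
        if sup >= min_sup and conf >= min_conf:
            rules.append((cond, label, sup, conf))
    return rules


for cond, label, sup, conf in class_association_rules(ROWS, min_sup=3, min_conf=0.6):
    print(cond, "->", label, "sup =", sup, "conf =", conf)
```

On these rows, the rule A1 = x1 → c1 matches five objects by its condition, three of which have class c1, so its support is 3 and its confidence is 3/5. Enumerating every attribute combination grows quickly with the number of attributes, which is why practical algorithms prune the search using the support threshold.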

  18. Currently Developed Techniques • MCAR (Thabtah et al., Proceedings of the 3rd IEEE International Conference on Computer Systems and Applications, pp. 1-7) • MMAC (Thabtah et al., Journal of Knowledge and Information Systems (2006) 00:1-21) MCAR characteristics: • Combines two general data mining approaches, i.e. association rule mining and classification • Suitable for traditional classification problems • Employs a new method of finding the rules MMAC characteristics: • Produces classifiers of the form: that are suitable not only for traditional binary classification problems but also for multi-class label problems such as medical diagnosis and text classification. • Presents three evaluation accuracy measures

  19. Data and Experiments Learning approach: learn the performance of LLHs from past schedules to predict the appropriate LLH in the current one. Support = 5%, confidence = 40%. Number of data sets: 12-16 UCI data sets and 9 solutions of the trainer scheduling problem. Algorithms used: • CBA (AC algorithm) • MMAC (AC algorithm) • Decision tree algorithms (C4.5) • Covering algorithms (RIPPER) • Hybrid classification algorithm (PART)

  20. Relative prediction accuracy, in terms of PART, for the accuracy measures of the MMAC algorithm

  21. Relative prediction accuracy, in terms of CBA, for the accuracy measures of the MMAC algorithm

  22. Number of Rules of CBA, PART and Top-label

  23. Accuracy (%) for PART, RIPPER, CBA and MMAC on UCI data sets

  24. Comparison between AC algorithms on 12 UCI data sets

  25. MCAR vs. CBA and C4.5 On UCI data sets

  26. Conclusions • Associative classification is a promising approach in data mining • Since more than one LLH could improve the objective function in the hyperheuristic, we need multi-label rules in the classifier • Associative classifiers produce more accurate classification models than traditional classification algorithms such as decision trees and rule induction approaches • One challenge in associative classification is the exponential growth of rules, so pruning becomes essential

  27. Future Work • Constructing a hyperheuristic approach for the personnel scheduling problem • Investigating the use of multi-class label classification algorithms with a hyperheuristic • Implementing new data mining techniques based on dynamic learning that are suitable for scheduling and optimization problems • Investigating rule pruning in AC mining

  28. Questions?
