A Genetic Algorithm for Workload Scheduling In Cloud Based e-Learning

Octavian Morariu, Cristina Morariu, Theodor Borangiu
University Politehnica Bucharest

A resource view on public (large) and private (small) clouds
Public clouds:
- Amazon, Microsoft Azure, Google
- General purpose
- Unlimited resources
Private clouds:
- More specific purpose
- Limited resources
- Request-based provisioning of resources
Because private clouds have a more specific purpose, they offer an opportunity to generate near-optimal schedules for workloads.

Resource allocation
Public clouds: FIFS, with no approval process involved.
Private clouds: request/approve model.
- The cloud administrator has to ensure workload distribution and capacity when approving requests for cloud resources.
- The approval process implies using a reservation model for resources.

Applications of small clouds
- Software testing
- Product support
- Product development
- Data mining and reporting
- Proofs of concept
- e-Learning

Optimum Workload Scheduling
Takes into consideration the primary factors that affect performance in a virtualized environment and the dependencies between them.
Defining workload types:
- CPU intensive
- IO intensive
Memory over-commitment (VMware): allows more memory to be allocated to the virtual machines than is actually available in the physical machine, provided the virtual machines are running similar operating systems.
The IO profile of a workload has a great impact on virtualization performance.

Cloud Computing in E-Learning (VCL Model)
VCL (Virtual Computing Laboratory), University of North Carolina
The usage model is based on the workload types:
- Single Seat (VCL-Desktop)
- Multiple Synched Seats (VCL-Class)
- Servers (VCL-Server)
- Research Clusters (VCL-Research)
- High Performance Computing Clusters (VCL-HPC)
Resources are allocated either on demand (the "now" model) or by reservation.

Considered Scenario
Workloads are provisioned and de-provisioned automatically based on a pre-generated schedule.
The student uses the workload to complete the laboratory activities during the scheduled time slot. When the time slot expires, the scheduler stops the workload, de-provisions it and stores it in its current state.
The workload for the next student is then prepared, provisioned and started.

Request Model
- Planned Scheduling Requests: submitted by the professors before the learning cycle starts. The weekly schedule is generated from these requests.
- One-Off Scheduling Requests: can be submitted at any time by professors and are subject to manual approval by cloud administrators, according to the existing load.

Genetic Algorithm Prototype
Data objects and the LabInstance data structure
The central object, LabInstance, contains a reference to a Professor object, a StudentGroup and a Laborator, and holds the duration of the class and how many times it repeats per week.
The Laborator object contains a reference to a WorkloadType instance, which stores the workload characteristics in terms of CPU profile, IO profile and operating system details.
The StudentGroup class contains the name of the student group and the number of students.
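The object model described above could be sketched with Python dataclasses as follows. Only the class names come from the slides; the field names and types (for example `cpu_intensive`, `duration_slots`, `repeats_per_week`) are assumptions made for illustration, not the authors' actual prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadType:
    # Workload characteristics: CPU profile, IO profile, operating system.
    cpu_intensive: bool       # assumed boolean encoding of the CPU profile
    io_intensive: bool        # assumed boolean encoding of the IO profile
    operating_system: str


@dataclass(frozen=True)
class Professor:
    name: str                 # assumed field


@dataclass(frozen=True)
class StudentGroup:
    name: str
    student_count: int


@dataclass(frozen=True)
class Laborator:
    name: str                 # assumed field
    workload_type: WorkloadType


@dataclass(frozen=True)
class LabInstance:
    professor: Professor
    student_group: StudentGroup
    laborator: Laborator
    duration_slots: int       # class duration, assumed to be counted in time slots
    repeats_per_week: int     # how many times the class repeats per week


# Hypothetical example data.
linux_io = WorkloadType(cpu_intensive=False, io_intensive=True, operating_system="Linux")
lab = LabInstance(
    professor=Professor("P1"),
    student_group=StudentGroup("G1", 25),
    laborator=Laborator("Databases", linux_io),
    duration_slots=2,
    repeats_per_week=1,
)
```

Frozen dataclasses are used here so that instances can be shared safely between candidate schedules; this is a design choice of the sketch, not something the slides state.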

The Fitness Function (Hard Conditions)
- Total Load: iterates over all the time slots and computes the sum of all workloads scheduled in each time slot. If the total load scheduled is less than the maximum estimated capacity of the cloud, 10 points are awarded.
- Student Group Overlap: checks that there is no overlap in the schedule for the same student group. In other words, it ensures that a student group is not scheduled twice in the same time slot. If the condition is fulfilled for all student groups, 7 points are awarded.
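A minimal sketch of the two hard conditions follows. The schedule layout (a list of `(time_slot, group_name, load)` tuples), the function name `hard_score` and the reading that the 10 points are awarded only when every time slot stays below capacity are assumptions; the slides give only the point values.

```python
from collections import defaultdict


def hard_score(schedule, max_capacity):
    """Score the hard conditions for one candidate schedule.

    schedule: list of (time_slot, group_name, load) tuples (hypothetical layout).
    max_capacity: maximum estimated capacity of the cloud per time slot.
    """
    load_per_slot = defaultdict(int)
    groups_per_slot = defaultdict(list)
    for slot, group, load in schedule:
        load_per_slot[slot] += load
        groups_per_slot[slot].append(group)

    score = 0
    # Total Load: 10 points if the load in every slot is below capacity
    # (assumed interpretation of the per-slot sum).
    if all(total < max_capacity for total in load_per_slot.values()):
        score += 10
    # Student Group Overlap: 7 points if no group appears twice in a slot.
    if all(len(groups) == len(set(groups)) for groups in groups_per_slot.values()):
        score += 7
    return score
```

For example, `hard_score([(0, "G1", 10), (0, "G2", 10), (1, "G1", 10)], 25)` satisfies both conditions, while scheduling `"G1"` twice in slot 0 loses the 7 overlap points.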

The Fitness Function (Soft Conditions)
- Memory Over-commitment: checks that only a single set of WorkloadTypes is scheduled in each time slot. For each slot that fulfills this condition, 3 points are awarded. When all time slots have been evaluated, the score is divided by the number of time slots and added to the global score.
- CPU Intensive: iterates over the time slots and checks that no more than two CPU-intensive WorkloadTypes are scheduled at the same time. For each time slot that passes this check, one point is awarded. The score is again divided by the number of time slots and added to the global score.
- IO Intensive: checks that no more than two IO-intensive WorkloadTypes are scheduled in each time slot.
- Uniform Distribution: computes a factor characterizing the distribution of workloads. The factor is computed by first calculating the average number of workloads scheduled across all time slots and then evaluating the difference between that average and the number of workloads scheduled in each slot. A threshold of 20 workloads is considered acceptable, so if the threshold is respected across all time slots, 5 points are added to the global score.
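The soft conditions could be sketched as below. Several details are assumptions, since the slides leave them open: the slot layout (a list of per-slot workload lists), the reading of "a single set of WorkloadType" as "a single operating system per slot", and the value of one point per slot for the IO check, which mirrors the CPU check because no point value is given for it.

```python
def soft_score(slots, threshold=20):
    """Score the soft conditions for one candidate schedule.

    slots: one entry per time slot; each entry is a list of
    (operating_system, cpu_intensive, io_intensive) tuples (hypothetical layout).
    """
    slot_count = len(slots)
    if slot_count == 0:
        return 0.0

    # Memory Over-commitment: 3 points per slot with at most one operating
    # system (assumed meaning of "a single set of WorkloadType").
    memory = sum(3 for s in slots if len({w[0] for w in s}) <= 1)
    # CPU Intensive: 1 point per slot with at most two CPU-intensive workloads.
    cpu = sum(1 for s in slots if sum(1 for w in s if w[1]) <= 2)
    # IO Intensive: assumed 1 point per slot with at most two IO-intensive workloads.
    io = sum(1 for s in slots if sum(1 for w in s if w[2]) <= 2)

    # Per-slot scores are divided by the number of time slots.
    score = (memory + cpu + io) / slot_count

    # Uniform Distribution: 5 points if every slot is within the threshold
    # of the average number of workloads per slot.
    average = sum(len(s) for s in slots) / slot_count
    if all(abs(len(s) - average) <= threshold for s in slots):
        score += 5
    return score
```

The result would be added to the hard-condition score to obtain the global fitness of a schedule.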

The Selection Operation
Done in two steps:
- Computing the fitness value for each individual in the population
- Sorting the population based on the results
The best 65% of individuals are selected for the crossover operation.
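The two selection steps can be sketched as a sort followed by a truncation. The function name `select_best` and the integer rounding of the 65% cut are assumptions.

```python
def select_best(population, fitness, percent=65):
    """Return the best `percent` of the population, fittest first.

    population: list of individuals (any objects accepted by `fitness`).
    fitness: function mapping an individual to a numeric score.
    """
    # Steps 1 and 2: compute the fitness of each individual and sort by it.
    ranked = sorted(population, key=fitness, reverse=True)
    # Keep the best 65%, rounded down (assumed rounding rule).
    keep = len(ranked) * percent // 100
    return ranked[:keep]
```

With a population of 20 individuals this keeps the 13 fittest as crossover parents.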

Crossover Operation
Represents the combination of two Schedule instances that produces an offspring.
The crossover operation is implemented by generating a random number X (the crossover point) between 1 and N, where N is the number of LabInstances.
The offspring inherits the schedule of the first parent from LabInstance 1 to LabInstance X and the schedule of the second parent from LabInstance X+1 to LabInstance N.
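The single-point crossover described above can be sketched as follows, assuming each Schedule is encoded as a list in which position i holds the time slot assigned to LabInstance i+1. The encoding and the function name `crossover` are assumptions.

```python
import random


def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover of two schedules of equal length.

    Each schedule is a list of time slots, one per LabInstance (assumed encoding).
    """
    n = len(parent_a)
    if n != len(parent_b):
        raise ValueError("parents must schedule the same LabInstances")
    # Crossover point X between 1 and N, inclusive.
    x = rng.randint(1, n)
    # LabInstance 1..X from the first parent, X+1..N from the second.
    return parent_a[:x] + parent_b[x:]
```

Note that when X equals N the offspring is a copy of the first parent, which follows directly from the range given for X.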

The Mutation Operation
Is applied to a randomly chosen subset of individuals in each generation.
Consists of rescheduling one LabInstance from the data structure. The new schedule is generated by assigning a new time slot to the selected LabSchedule instance.
Both the LabSchedule instance and the new time slot are chosen randomly.
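A minimal sketch of the mutation step, under the same assumed list encoding (one time slot per LabInstance). The function name `mutate` and the `slot_count` parameter are hypothetical.

```python
import random


def mutate(schedule, slot_count, rng=random):
    """Return a copy of `schedule` with one randomly chosen entry rescheduled.

    schedule: list of time slots, one per LabInstance (assumed encoding).
    slot_count: number of available time slots in the week.
    """
    mutated = list(schedule)
    # Pick the instance to reschedule at random.
    index = rng.randrange(len(mutated))
    # Assign it a random time slot (which may equal the old one).
    mutated[index] = rng.randrange(slot_count)
    return mutated
```

The sketch returns a new list rather than modifying its argument, so a parent schedule that is still in the population is left untouched.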

Genetic Algorithm Structure
Step1: generateInitialPopulation()
Step2: while (best individual fitness < min_fitness) {
Step3:     do_crossover(best 65% individuals)
Step4:     calculate_fitness(offsprings)
Step5:     remove_worst(worst 35% individuals)
Step6:     calculate_best_individual_fitness
Step7: }
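The loop above could be sketched in Python as follows. This is an illustrative reading, not the authors' prototype: the list encoding of a schedule, the population size, the mutation rate of 0.2, the `max_generations` safeguard and the choice to replace the worst 35% with offspring of the best 65% are all assumptions. The fitness function is passed in, so any scoring scheme can be plugged in.

```python
import random


def run_ga(fitness, lab_count, slot_count, min_fitness,
           population_size=20, max_generations=200, seed=0):
    """Evolve schedules until the best one reaches `min_fitness`.

    A schedule is a list of `lab_count` time slots (assumed encoding).
    Requires a population large enough that the best 65% has at least 2 members.
    """
    rng = random.Random(seed)
    # Step1: generate a random initial population.
    population = [[rng.randrange(slot_count) for _ in range(lab_count)]
                  for _ in range(population_size)]
    population.sort(key=fitness, reverse=True)
    generation = 0
    # Step2: loop until the best individual is fit enough
    # (the generation cap is an added safeguard against endless loops).
    while fitness(population[0]) < min_fitness and generation < max_generations:
        # Step5 (done first here): drop the worst 35%, keep the best 65%.
        parents = population[:population_size * 65 // 100]
        offspring = []
        # Step3: refill the population by crossing random pairs of parents.
        while len(parents) + len(offspring) < population_size:
            a, b = rng.sample(parents, 2)
            x = rng.randint(1, lab_count)
            child = a[:x] + b[x:]
            # Mutation on a random subset of offspring (assumed rate of 0.2).
            if rng.random() < 0.2:
                child[rng.randrange(lab_count)] = rng.randrange(slot_count)
            offspring.append(child)
        # Step4 and Step6: evaluate everyone and find the best individual.
        population = sorted(parents + offspring, key=fitness, reverse=True)
        generation += 1
    return population[0]


# Toy fitness for demonstration only: number of distinct time slots used.
best = run_ga(lambda ind: len(set(ind)), lab_count=5, slot_count=10, min_fitness=5)
```

Because the parents are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next.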

Experimental Results Load distribution for Windows (in gray)/Linux (in black)

Experimental Results CPU intensive (in gray)/IO intensive (in black)

Thank You
