Qiang Guan, Ziming Zhang and Song Fu University of North Texas

Efficient and Accurate Anomaly Identification Using Reduced Metric Space in Cloud Computing Systems

Introduction • Anomaly detection is a vital element of operations in large-scale datacenters. • It means detecting patterns in a given data set that do not conform to an established normal behavior.

Challenges • Continuous monitoring and large system scale produce an overwhelming volume of data collected by health monitoring tools. • The large number of measured metrics makes the data model extremely complex. • High metric dimensionality causes low detection accuracy and high computational complexity.

This paper • Presents a metric selection framework for online anomaly detection in utility clouds. • Selects the most essential metrics by applying metric selection and extraction methods. • Identifies anomalies using an incremental clustering approach. • Implements a prototype and evaluates its performance.

Dimensionality Reduction • Transforms the collected health-related performance data to a new metric space with only the most important metrics preserved. • In this paper: • Metric selection using mutual information. • Metric extraction by metric space combination and separation.

Metric Selection • Select the best subset of the original metric set based on mutual information. • The mutual information of two random variables is a quantity that measures the mutual dependence of the two random variables.
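
For reference, the standard definition of mutual information for two discrete random variables $X$ and $Y$ is given below. The slide does not show which estimator the authors use, so this is only the textbook form; $p(x, y)$ is the joint distribution and $p(x)$, $p(y)$ are the marginals.

```latex
I(X; Y) = \sum_{x} \sum_{y} p(x, y) \, \log \frac{p(x, y)}{p(x)\, p(y)}
```

$I(X; Y) = 0$ exactly when $X$ and $Y$ are independent, and it grows as one variable tells more about the other.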

Metric Selection(Cont.) • However, finding the optimal metric subset is NP-hard. ⇒ Use an incremental search method instead.

Incremental Search Method • Given $S_{k-1}$, select as the $k$-th metric the one from the remaining metrics $(M - S_{k-1})$ that maximizes dependency(). • → $S_1 \subset S_2 \subset \dots \subset S_n$
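
The greedy step above can be sketched in Python. The slide leaves the argument of dependency() unspecified, so the score used here (relevance to the class label minus mean redundancy with already selected metrics, in the style of mRMR) is an assumption, as are the names `mutual_information`, `score`, and `incremental_search` and the histogram estimator with `bins=8`.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats (an assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def score(data, labels, candidate, selected):
    """Hypothetical dependency score: relevance minus mean redundancy."""
    relevance = mutual_information(data[:, candidate], labels)
    if not selected:
        return relevance
    redundancy = np.mean([mutual_information(data[:, candidate], data[:, s])
                          for s in selected])
    return relevance - redundancy

def incremental_search(data, labels, n_select):
    """data: (samples, metrics). Returns nested subsets S_1, S_2, ... of column indices."""
    remaining = list(range(data.shape[1]))
    selected, subsets = [], []
    for _ in range(n_select):
        # Pick the k-th metric from M - S_{k-1} with the best score.
        best = max(remaining, key=lambda c: score(data, labels, c, selected))
        selected.append(best)
        remaining.remove(best)
        subsets.append(list(selected))
    return subsets
```

Each returned subset extends the previous one by exactly one metric, which gives the nested chain shown on the slide.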

Incremental Search Method(Cont.) • Choosing $S_{n^*}$: • Find the range of $i$ where the cross-validation error $err_i$ has small mean and small variance. • $err^* = \min_i(err_i)$ • $n^*$ equals the smallest $i$ for which $S_i$ has error $err^*$.
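
A minimal sketch of the last two bullets, assuming the candidate range of $i$ has already been chosen and the cross-validation errors are supplied as a plain list. The function name `choose_n_star` is hypothetical.

```python
def choose_n_star(errors):
    """errors[i - 1] holds the cross-validation error err_i of subset S_i.

    Returns (n_star, err_star): the smallest i whose error equals the minimum.
    """
    err_star = min(errors)
    # list.index returns the first match, i.e. the smallest such i.
    n_star = errors.index(err_star) + 1
    return n_star, err_star
```

For example, with errors `[0.30, 0.12, 0.08, 0.08, 0.09]` the minimum 0.08 is reached first at $i = 3$, so the smaller of the two equally good subsets is preferred.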

Metric Extraction • Creates new metrics by transformation or combination of the original metrics. • Two methods: • Metric space combination • Metric space separation

Metric Space Combination • Dataset $D = [x_1, x_2, \dots, x_L]$ • Record $x_i = [x_{1,i}, x_{2,i}, \dots, x_{n,i}]^T$ • Covariance matrix of $D$: $V = DD^T$ • Calculate the eigenvalues $\{\lambda_i\}$ of $V$ and sort them in descending order. • Choose $n'$ by:

Metric Space Combination(Cont.) • The corresponding $n'$ eigenvectors are the new metrics. • Apply the Gram-Schmidt orthogonalization process to compute the eigenvectors $\{e_j\}$.
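
The combination steps can be sketched with NumPy. This is an illustration under assumptions, not the authors' code: it centers the data so that $DD^T$ is proportional to the covariance, takes $n'$ as a caller-supplied parameter (`n_prime`), and uses `numpy.linalg.eigh`, which already returns orthonormal eigenvectors, in place of an explicit Gram-Schmidt step. The function name `combine_metric_space` is hypothetical.

```python
import numpy as np

def combine_metric_space(D, n_prime):
    """D: array of shape (n_metrics, L_records), one record per column.

    Returns (new_data, eigvals, E): the records projected onto the top
    n_prime eigenvectors, all eigenvalues in descending order, and the
    (n_metrics, n_prime) matrix of selected eigenvectors.
    """
    Dc = D - D.mean(axis=1, keepdims=True)   # center each metric
    V = Dc @ Dc.T                            # V = D D^T on centered data
    eigvals, eigvecs = np.linalg.eigh(V)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    E = eigvecs[:, :n_prime]                 # top n' eigenvectors as columns
    return E.T @ Dc, eigvals, E
```

Each row of the returned `new_data` is one new metric, expressed as a linear combination of the original metrics.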

Metric Space Separation • Separate desired data from mixed data. • Record $x = [x_1, x_2, \dots, x_L]^T$ • Component $e = [e_1, e_2, \dots, e_{n'}]^T$ • $x = Ae \rightarrow e = Wx$ • Find an optimal transformation matrix $W$ so that the $\{e_j\}$ are maximally independent.

Metric Space Separation(Cont.) • Independent component analysis (ICA) • A computational method for separating a multivariate signal into additive subcomponents. • A special case of blind source separation.
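
One way to estimate the unmixing matrix $W$ is a FastICA-style fixed-point iteration, sketched below with NumPy only. The slides do not say which ICA algorithm the authors used, so the algorithm choice, the tanh nonlinearity, the square case (as many components as input metrics), and the names `separate_metric_space` and `_decorrelate` are all assumptions.

```python
import numpy as np

def _decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W, making W orthogonal."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return (vecs / np.sqrt(vals)) @ vecs.T @ W

def separate_metric_space(X, n_iter=200, tol=1e-6, seed=0):
    """X: (n_metrics, n_samples) mixed data. Returns (E, W) with E = W @ Xc,
    where Xc is X with each row centered."""
    Xc = X - X.mean(axis=1, keepdims=True)
    # Whitening: K maps Xc to data with identity covariance.
    vals, vecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    K = (vecs / np.sqrt(vals)).T
    Z = K @ Xc
    n, m = Z.shape
    W = _decorrelate(np.random.default_rng(seed).standard_normal((n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                       # nonlinearity g(y)
        g_prime_mean = np.mean(1.0 - G ** 2, axis=1)
        W_new = _decorrelate(G @ Z.T / m - g_prime_mean[:, None] * W)
        # Converged when every row of W_new is (anti)parallel to the old row.
        done = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if done:
            break
    W_total = W @ K                              # unmixing matrix for Xc
    return W_total @ Xc, W_total
```

The recovered components are uncorrelated with unit variance by construction; as with any ICA method, their order and sign are arbitrary.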

Incremental Clustering • Data points are considered one at a time and assigned to existing groups without affecting the existing groups significantly. • “A data point goes into the nearest group if the Euclidean distance between this point and the centroid of the group is smaller than δ; else create a new group.” • Update the centroid after a new point comes in. • Adjust δ if cloud operators find false positives, i.e. records that are normal but were labeled as anomalies.
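
The assignment rule quoted above can be sketched as a small class. The class name `IncrementalClusterer`, its attributes, and the running-mean centroid update are assumptions made for illustration; the slides only say that the centroid is updated after each new point.

```python
import numpy as np

class IncrementalClusterer:
    """One-pass clustering with a distance threshold delta."""

    def __init__(self, delta):
        self.delta = delta      # operators may adjust this threshold later
        self.centroids = []     # one centroid per group
        self.counts = []        # number of points in each group

    def add(self, point):
        """Assign one point and return the index of its group."""
        p = np.asarray(point, dtype=float)
        if self.centroids:
            dists = [np.linalg.norm(p - c) for c in self.centroids]
            g = int(np.argmin(dists))
            if dists[g] < self.delta:
                # Join the nearest group and update its centroid (running mean).
                self.counts[g] += 1
                self.centroids[g] += (p - self.centroids[g]) / self.counts[g]
                return g
        # No group is close enough: create a new one.
        self.centroids.append(p.copy())
        self.counts.append(1)
        return len(self.centroids) - 1
```

Because each point is compared only against the current centroids, the cost per record grows with the number of groups rather than with the number of records seen so far.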

Experiment Setting • 362 servers. • Each server hosts up to ten VMs. • Benchmarks: • RUBiS distributed online service benchmark • MapReduce jobs • Fault injection • CPU, memory, disk, and network faults.

Experiment Setting(Cont.) • Monitoring tools • sysstat: runtime performance data in Dom0 • Modified perf: performance counters from the hypervisor. • 518 metrics in total (182 + 336). • However, only 406 are non-constant. • Monitored every minute from 2011/01/20 to 2011/08/11.
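
Dropping constant metrics is a simple preprocessing step. A minimal sketch, assuming the samples are stored as a NumPy array with one row per observation and one column per metric (the function name `drop_constant_metrics` is hypothetical):

```python
import numpy as np

def drop_constant_metrics(samples):
    """samples: (n_samples, n_metrics). Returns (filtered, kept_indices)."""
    # A metric is constant when its maximum equals its minimum over all samples.
    keep = np.flatnonzero(samples.max(axis=0) != samples.min(axis=0))
    return samples[:, keep], keep
```

A constant metric carries no information for separating normal from anomalous records, so removing it does not change the detection problem.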

Metric Selection Result • 406 → 14 metrics • Metric space reduced by 96.6%

Metric Extraction Results • Metric extraction and metric selection vs. metric extraction only.

Detection Precision

Conclusion • Anomaly detection is important for self-managing cloud resources and enhancing system dependability. • The paper presents a metric selection framework with metric selection and extraction mechanisms. • The selected and extracted metric set contributes to highly efficient and accurate anomaly detection.
