Common Anomaly Detection Platform

Tony Xing, Senior Product Manager @ Microsoft

Presentation Transcript


  1. Common Anomaly Detection Platform • Tony Xing • Senior Product Manager @ Microsoft

  2. Bio • Senior Product Manager of Shared Data team @ Microsoft • Data quality and anomaly detection • NRT datasets • Data Ingestion • Senior Product Manager of Skype Data team @ Microsoft • Real time analytics • Anomaly detection • Cross platform SDKs

  3. Agenda • Context • Anomaly detection 101 • Problem statement • Design principles • How it works • Algorithms • Challenges and future work

  4. Context

  5. Shared Data

  6. Shared Data

  7. Anomaly Detection 101

  8. What is Anomaly Detection • Anomaly detection is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset • Widely used in • System health monitoring • Business metric monitoring • Application performance monitoring • “My current value is not what it should be as of right now”

  9. Rule setting vs. automated • Automate the process of finding outliers across streams of data with a time dimension

  10. Problem Statement • Manual rule setting is impossible for a large number of time series • A single AD algorithm cannot fit all signal types • Precision vs. recall • Analysis and diagnostics when issues happen • Near real time detection • Scalable • Customers need flexibility in plugging in different sources

  11. What is CAP • One stop shop for metric monitoring, analysis and diagnostics • Key capabilities • Automation: full automation from creating rules to detection, without human intervention • Extensibility: new data sources and anomaly detection algorithms can be plugged in • Scalability & real time: Azure service with linear scale-out • Finer granularity: supports time series AD at hour/minute level • REST APIs: available for all operations, allowing easy integration into other product experiences • Algorithm tuning: allows easier tuning of algorithms

  12. How it works – Automation • Onboarding: helps data owners register the incoming streams • Creating rules & detecting: the rule-creation component creates detection rules, which the detecting component then uses to detect potential anomalies; these components contain machine learning and statistical analysis algorithms • Alerting: once anomalies are found, the alerting component sends the anomaly info to the data owner

  13. How it works - Extensibility • Defined a generic interface for training and detection • Each algorithm provider implements the defined interface • For example, for each data point we expect the following from algorithm providers • Whether it is an anomaly • The value predicted/expected by the algorithm • The suggested lower bound • The suggested upper bound • Confidence level • …
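
The slide lists the per-point outputs but not the interface itself, so the following is only a minimal sketch of what such a contract might look like. The names `DetectionResult`, `AlgorithmProvider`, `train`, `detect` and the toy `ThreeSigmaProvider` are hypothetical and are not taken from CAP; the confidence score is a crude placeholder.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Protocol, Sequence


@dataclass(frozen=True)
class DetectionResult:
    """Per-point output listed on the slide (field names are assumptions)."""
    is_anomaly: bool
    expected_value: float
    lower_bound: float
    upper_bound: float
    confidence: float


class AlgorithmProvider(Protocol):
    """Hypothetical generic interface: one training call, one detection call."""

    def train(self, history: Sequence[float]) -> None: ...

    def detect(self, value: float) -> DetectionResult: ...


class ThreeSigmaProvider:
    """Toy provider: flags values more than k standard deviations from the training mean."""

    def __init__(self, k: float = 3.0) -> None:
        self.k = k
        self.mu = 0.0
        self.sigma = 0.0

    def train(self, history: Sequence[float]) -> None:
        # stdev needs at least two training points.
        self.mu = mean(history)
        self.sigma = stdev(history)

    def detect(self, value: float) -> DetectionResult:
        band = self.k * self.sigma
        lower, upper = self.mu - band, self.mu + band
        distance = abs(value - self.mu)
        # Placeholder confidence: distance from the mean relative to the band, capped at 1.
        confidence = 1.0 if band == 0 else min(1.0, distance / band)
        return DetectionResult(
            is_anomaly=not (lower <= value <= upper),
            expected_value=self.mu,
            lower_bound=lower,
            upper_bound=upper,
            confidence=confidence,
        )


provider: AlgorithmProvider = ThreeSigmaProvider()
provider.train([10, 11, 9, 10, 10, 11, 9, 10])
result = provider.detect(20)
```

Because the platform only depends on the shared result shape, a different algorithm could be swapped in without changing the alerting side.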

  14. How it works – Extensibility

  15. How it works - Scalability

  16. Algorithms Intro

  17. Algorithm - Service Insider • Good for time series with a periodic pattern • Holt-Winters algorithm: train model and predict • Improvements for robustness • Use Median Absolute Deviation (MAD) to get a robust estimate • Handling of missing data and noise (e.g., data smoothing) • Automatically captures slow, regular trends and seasonal patterns • GLR (Generalized Likelihood Ratio): used to detect anomalies • Improvements • Floating Threshold GLR, to dynamically adjust the model using new input data • Outlier removal for noisy data
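
To make the Holt-Winters and MAD bullets concrete, here is a rough sketch of textbook additive Holt-Winters forecasting combined with a MAD-based residual check. It is not the Service Insider implementation: the initialisation, the smoothing parameters `alpha`, `beta`, `gamma`, and the threshold `k` are all assumptions, and the GLR step from the slide is not shown.

```python
from statistics import median


def holt_winters_additive(series, period, alpha=0.3, beta=0.05, gamma=0.2):
    """One-step-ahead forecasts for series[period:] (additive Holt-Winters)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = series[:period], series[period:2 * period]
    # Assumed initialisation: level and seasonal terms from the first season,
    # trend from the average change between the first two seasons.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period * period)
    season = [x - level for x in first]

    forecasts = []
    for t in range(period, len(series)):
        s = season[t % period]
        forecasts.append(level + trend + s)
        x = series[t]
        prev_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % period] = gamma * (x - level) + (1 - gamma) * s
    return forecasts


def mad_flags(residuals, k=3.0):
    """Flag residuals farther than k robust standard deviations from the median."""
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation for normal data
    if scale == 0:
        return [r != med for r in residuals]
    return [abs(r - med) > k * scale for r in residuals]


# Periodic signal with small deterministic jitter and one spike at the end.
base = [10.0, 20.0, 30.0, 20.0]
signal = [base[t % 4] + ((t * 7) % 5 - 2) * 0.1 for t in range(40)]
signal[-1] = 80.0
forecasts = holt_winters_additive(signal, period=4)
residuals = [x - f for x, f in zip(signal[4:], forecasts)]
flags = mad_flags(residuals)
```

The median and MAD are used instead of the mean and standard deviation so that the spike itself does not inflate the threshold that is meant to catch it.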

  18. Other Improvements • Automatic detection of time series types (seasonal/non-seasonal) • Automatic detection of seasonality/trend, instead of manual setting • Feedback channels for end users to intuitively tune the algorithms

  19. Azure ML - Exchangeability Martingale • Good at detecting slow upward/downward trends, spikes and dips, and changes in dynamic range • General framework for online change detection in time series • Has the property we are interested in changed in distribution? • User specifies the meaning of “new value strangeness” given history • At each time t we receive a new value and add it to the history • For each item i in the history, s[i] = strangeness function of (value[i], history) • Let p[t] = (#{i: s[i] > s[t]} + r*#{i: s[i] == s[t]})/N, where r is uniform in (0,1) and N is the number of items in the history • Uniform r makes sure p is uniform
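
The p-value recipe on this slide can be sketched in a few lines. The strangeness function below (distance from the history mean) is just one hypothetical choice, since the slide leaves it to the user. The slide also does not say how the p-values are combined, so the power martingale with `epsilon=0.92` is an assumption taken from the general exchangeability-martingale literature, not from the Azure ML implementation.

```python
import math
import random


def strangeness(value, history):
    """Hypothetical strangeness: distance of a value from the mean of the history."""
    return abs(value - sum(history) / len(history))


def p_value(scores, rng):
    """p[t] from the slide; scores[-1] is the strangeness of the newest value."""
    s_t = scores[-1]
    greater = sum(1 for s in scores if s > s_t)
    equal = sum(1 for s in scores if s == s_t)  # always counts the newest item itself
    return (greater + rng.random() * equal) / len(scores)


def log_power_martingale(values, epsilon=0.92, seed=0):
    """Running log of the power martingale: product of epsilon * p ** (epsilon - 1)."""
    rng = random.Random(seed)
    history, log_m, path = [], 0.0, []
    for v in values:
        history.append(v)
        scores = [strangeness(x, history) for x in history]
        p = max(p_value(scores, rng), 1e-12)  # guard against log(0)
        log_m += math.log(epsilon) + (epsilon - 1) * math.log(p)
        path.append(log_m)
    return path


# Stable alternating signal followed by a single shifted value.
path = log_power_martingale([0.0, 1.0] * 30 + [10.0])
```

Small p-values (new values that are stranger than almost everything in the history) push the martingale up, so a sustained rise in `path` suggests that the distribution has changed. Working in the log domain avoids overflow on long streams.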

  20. Azure ML - Exchangeability Martingale

  21. Algorithm – Exponential Smoothing

  22. Result • Evaluation of exponential smoothing • In some cases, a periodic signal with a trend can generate many false positives
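
The false-positive behaviour is easy to reproduce with a toy detector. The sketch below uses simple exponential smoothing with a fixed tolerance; the detector, `alpha` and `tolerance` are illustrative assumptions, not the ES-based algorithm evaluated in the talk, and the example signal is periodic without a trend.

```python
def ses_flags(series, alpha=0.3, tolerance=5.0):
    """Flag points that deviate from the exponentially smoothed level by more than tolerance."""
    level = series[0]
    flags = [False]  # the first point has no forecast to compare against
    for x in series[1:]:
        flags.append(abs(x - level) > tolerance)
        level = alpha * x + (1 - alpha) * level
    return flags


# A perfectly regular signal: a peak of 10 on every fourth point.
periodic = [0.0, 0.0, 0.0, 10.0] * 5
flags = ses_flags(periodic)
flagged_indices = [i for i, f in enumerate(flags) if f]
```

Every regular peak is flagged, because the smoothed level carries no notion of seasonality and so the expected peak always looks like a surprise.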

  23. Result Evaluation - ServiceInsider

  24. Result Evaluation – EM

  25. Result Evaluation – ES based

  26. Result Evaluation – ServiceInsider and Azure ML

  27. Challenges and Future Work • Real time vs. accuracy • Automated handling of data pattern change • Easy tuning or usage of different algorithms

  28. Real time vs. Accuracy • Some data streams are not stable in terms of data point latency

  29. Data Pattern Change

  30. Easy Tuning • Tuning the algorithm parameters to achieve the right detection precision and recall is a pain for users • Service insider: 2 parameters • EM based: 7 parameters • ES based: 3 parameters • Creative UI to hide those details • Do without human tuning at all!

  31. Questions!
