
Event Correlation & Life Cycle Management

Event Correlation & Life Cycle Management. How will they coexist in the NFV world? May 10, 2017. Dale Sorsby Bill Coward Michael Evenchick. Introduction & Monitoring. Introduction to NFV. Network Function Virtualization as defined by Wikipedia is:


Presentation Transcript


  1. Event Correlation & Life Cycle Management How will they coexist in the NFV world? May 10, 2017 Dale Sorsby Bill Coward Michael Evenchick

  2. Introduction & Monitoring

  3. Introduction to NFV • Network Function Virtualization (NFV) as defined by Wikipedia is: • "a network architecture concept that uses the technologies of IT virtualization to virtualize entire classes of network node functions into building blocks that may connect, or chain together, to create communication services." • NFV is not a Virtual Network Function (VNF) but the environment in which VNFs exist.

  4. Network Service Architecture

  5. NFV Functional Swim Lanes

  6. Monitoring – What? • Cloud Infrastructure (Compute, Network, Storage) • Management Infrastructure (VIM, VNFM, NFVO) • Virtual Resources (CPU, memory, disk) • Virtual Instances (VNFs) • Virtual Services (full service chain through the cloud) • Overlay Networks (tunnel to cloud, cloud networks) • Underlay Networks (interfaces and connectivity of devices) • Physical Network Devices (routers, switches, etc.)

  7. Network Monitoring [Architecture diagram. Labels: Cross Domain Orchestrator, Visualization, Correlation, Collection, SDWAN Controller, Network Orchestrator, NFV Orchestrator, VNF Manager, SDN Controller (SDNC), OpenStack - VIM, Central Data Center, Regional Data Center, Provider Core, Branch/Campus, Managed Access, Servers, VNF, Internet, Distributed Cloud, Centralized Cloud, SDWAN]

  8. Infrastructure Monitoring [Architecture diagram. Labels as on slide 7, without Servers and VNF]

  9. Application Monitoring [Architecture diagram. Labels as on slide 7, without Servers and VNF]

  10. Service Monitoring – Coordinated Effort [Architecture diagram. Labels as on slide 7, plus Internet (IPsec)]

  11. Operationalizing NFV • Don't reinvent the wheel • Use built-in intelligence • Use the strengths of existing systems • Brownfield • Integrate with existing assurance systems • Don't introduce new applications/views to operators

  12. Service Assurance - Collection and Challenges • Integration • How do you know everything has been gathered? • Incomplete alerts/notifications • Reliable • Comprehensive • New approach for monitoring NFV environments – things change

  13. Event Correlation

  14. Event Correlation - Definition • Event Correlation: an automated process of understanding and revealing relationships between complex system events. • Requires holistic awareness: telemetry and intelligence • Integrates multiple data sources, data types, and protocols • Scale: thousands of physical and logical resource elements • Event relationships can be topological, temporal, or service-based … • Data, Information, Knowledge, Wisdom
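
As a rough illustration of the temporal relationship mentioned on this slide, the following Python sketch groups events on the same resource when they arrive close together in time. The `Event` fields, the `correlate_temporal` name, and the 30-second window are assumptions made for this example; they are not taken from the presentation or from any particular correlation product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some reference point (assumed unit)
    resource: str     # hypothetical resource identifier, e.g. a VNF name
    message: str


def correlate_temporal(events, window_seconds=30.0):
    """Group events per resource; an event joins the open group for its
    resource if it arrives within window_seconds of that group's last event."""
    groups = []
    open_group = {}  # resource -> most recent group for that resource
    for event in sorted(events, key=lambda e: e.timestamp):
        group = open_group.get(event.resource)
        if group and event.timestamp - group[-1].timestamp <= window_seconds:
            group.append(event)
        else:
            group = [event]
            groups.append(group)
            open_group[event.resource] = group
    return groups


events = [
    Event(0.0, "vnf-1", "link down"),
    Event(5.0, "vnf-1", "bgp peer lost"),
    Event(7.0, "vnf-2", "cpu high"),
    Event(120.0, "vnf-1", "link down"),
]
groups = correlate_temporal(events)
for group in groups:
    print(group[0].resource, [e.message for e in group])
```

Here the two early `vnf-1` events fall into one group, while the `vnf-1` event two minutes later starts a new one. A real system would combine this with topology and service relationships rather than rely on time alone.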

  15. Event Correlation - Challenges • The challenge: make sense of events, or the lack of them, so that they are actionable. • Internal OpenStack telemetry may have scalability and sizing challenges: Ceilometer, RabbitMQ, Nagios, Heka… • Integration with and access to multiple systems and data types/formats • Healing collisions: VNFM, NFVO, Heat … must yield to the CDO • Open source event correlation tools: Simple Event Correlator (SEC), Drools, RiverMuse… SP/CG ?

  16. Event Correlation - Opportunity • Closed-loop monitoring, alerting, and healing • Event aggregation, suppression, prioritization, routing, enrichment • Historical knowledge supports and identifies service impact / ripple • Effective event correlation supports decision-making knowledge and automation/LCM • Reduced operating expense, improved inter-department incident coordination • Improved event-to-incident resolution process and time • Smarter, not harder

  17. Event Correlation – Future • ONAP: Data Collection, Analytics, and Events (DCAE) • OPNFV: VNF Event Streaming (VES) project, common data model • Open source: Vitrage, Monasca, Zabbix • Assisted/machine learning and A.I. capabilities • Need for a comprehensive framework for life cycle management

  18. Life Cycle Management

  19. Life Cycle Management, Where? And When? • It is great to automate! So... • Everything wants to perform Life Cycle Management • VIM: "OpenStack Heat" will perform life cycle management when it detects issues with the items that it orchestrated. • SDN Controller (e.g. Contrail) can perform life cycle management under some conditions, similar to Heat, when it owns the Service Instances. • VNFM will perform life cycle management of the VNF • NFVO will perform life cycle management of the NFV service through the cloud • Cross Domain Orchestrator (CDO) can also perform life cycle management of NFV services and possibly the underlay network • Result is confusion and overlap • So which system is really responsible for what? • And where is the correlation?

  20. Trouble Scenario – VNF Fails

  21. Trouble Scenario – VNF Fails • Heat determines the failure and attempts to restore the VNF • SDN Controller determines the failure and attempts to restore the VNF • VNFM determines the failure and attempts to restore the VNF • NFVO determines the service outage and attempts to restore the service • The Cross Domain Orchestrator determines the service outage and attempts to restore the service. • Will all 5 try to heal? Probably not, but experience has shown that multiple elements do heal, and the result is failed healing and systems being out of sync. • Heat determines the failure and instantiates a new VNF • The new VNF claims ports on the network • The NFVO determines the failure and attempts to destroy the Heat stack • This fails because not all the ports on the network can be freed • Unwanted Result: Systems are now out of sync and the status of the service is in question. If brought up by Heat, any customer-specific configuration placed on the VNF by the VNFM will be missing.
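
One way to picture avoiding the collision described on this slide is a shared claim that lets only one system heal a given resource at a time. The Python sketch below is an assumption for illustration only: `HealingArbiter`, `try_claim`, and `release` are hypothetical names, and nothing here reflects an actual Heat, VNFM, NFVO, or CDO interface.

```python
import threading


class HealingArbiter:
    """Hypothetical coordinator: at most one system may heal a resource at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}  # resource -> name of the system currently healing it

    def try_claim(self, resource, system):
        """Return True if `system` now holds (or already held) the healing claim."""
        with self._lock:
            owner = self._owners.setdefault(resource, system)
            return owner == system

    def release(self, resource, system):
        """Drop the claim, but only if `system` is the current owner."""
        with self._lock:
            if self._owners.get(resource) == system:
                del self._owners[resource]


arbiter = HealingArbiter()
systems = ["Heat", "SDN Controller", "VNFM", "NFVO", "CDO"]
results = {name: arbiter.try_claim("vnf-42", name) for name in systems}
print(results)  # only the first claimant is granted the right to heal

arbiter.release("vnf-42", "Heat")
print(arbiter.try_claim("vnf-42", "VNFM"))  # claim is free again after release
```

This sketch grants the claim to whichever system asks first. A policy in which domain orchestrators yield to the Cross Domain Orchestrator would need a priority rule on top of this, which is left out here.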

  22. Trouble Scenario – VNF Management Network Failure

  23. Trouble Scenario – VNF Management Network Failure • VNFM determines VNF failure and attempts to heal the VNF • NFVO determines service issues and attempts to heal the service. • Will these occur? It depends on what is being monitored on the VNF and how, but experience shows this happens. If either one happens, a service that may actually have been fine will be impacted. • Unwanted Result: a 5-minute outage caused entirely by the systems put in place to minimize customer impact.
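
The point of this scenario is that losing the management network does not by itself mean the service is down. A minimal Python sketch of that check follows, assuming two independent health signals are available; the `decide_action` function, its parameters, and the returned labels are hypothetical.

```python
def decide_action(mgmt_reachable: bool, data_plane_ok: bool) -> str:
    """Choose a response from two independent health signals (both assumed inputs).

    Returns one of: "none", "alert-only", "heal".
    """
    if not data_plane_ok:
        # Customer traffic is affected, so healing is justified.
        return "heal"
    if not mgmt_reachable:
        # Service still passes traffic; healing would cause the outage.
        return "alert-only"
    return "none"


print(decide_action(mgmt_reachable=False, data_plane_ok=True))
print(decide_action(mgmt_reachable=False, data_plane_ok=False))
```

With these inputs, a management-only failure produces an alert for operators instead of a rebuild of a VNF that is still carrying traffic.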

  24. Trouble Scenario – Network Outage

  25. Trouble Scenario – Network Outage • Detection of traffic flow issue • VNFM determines VNF failure and attempts to heal the VNF • NFVO determines service issues and attempts to heal the service. • Cross Domain Orchestrator determines there is an issue with the service and attempts to heal the service? • Will these occur? It depends on what is being monitored on the VNF and how, but experience shows this happens. If any one of these happens, a service that may actually have been fine will be impacted. • Unwanted Result: a 5-minute outage caused entirely by the systems put in place to minimize customer impact, when in reality the customer might otherwise have seen only a limited outage of 10-30 seconds.
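
Since the slide contrasts a 10-30 second network blip with a 5-minute healing outage, one possible mitigation is a hold-down period: heal only if the failure persists longer than a transient outage would. The Python sketch below assumes periodic probe results are available; the `should_heal` name, the 5-second probe interval, and the 60-second hold-down are illustrative assumptions, not values from the presentation.

```python
import math


def should_heal(probe_failed, sample_interval_s=5.0, hold_down_s=60.0):
    """Return True only if every probe in the last hold_down_s seconds failed.

    probe_failed: list of booleans, oldest first, True meaning the probe failed.
    """
    needed = math.ceil(hold_down_s / sample_interval_s)
    if len(probe_failed) < needed:
        return False  # not enough history yet to rule out a transient blip
    return all(probe_failed[-needed:])


# A 30-second blip (6 failed probes at 5-second intervals) does not trigger healing.
print(should_heal([False] * 10 + [True] * 6))
# A failure lasting the full 60-second hold-down does.
print(should_heal([False] * 10 + [True] * 12))
```

The trade-off is that a genuine failure is healed later by the length of the hold-down, so the value would need to be tuned against how long healing itself takes.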

  26. Cross Domain Orchestration • It is still in its infancy but could be part of the answer. • It will need to ensure all of the necessary events are collected • It will need to tightly integrate with correlation • It will need to tightly integrate with all Domain Orchestrators to have an end-to-end view of the service • It will need to permit each Domain Orchestrator to control its own domain

  27. Life Cycle Management & Existing Correlation • Lessons Learned • Work with Operations and the systems they use. The best tools are only the best because they get used! • Systems working with other systems is the only way to achieve success with end-to-end life cycle management. So plan to integrate with other systems from the beginning, not as an afterthought! • Existing systems will need to learn and handle the new world or they will eventually have to be replaced! • "Stay in your swim lane but understand the world is bigger than you!"
