
Network Management

Network Management. Lecture 4. Performance Management. The practice of optimizing network service response time. It also entails managing the consistency and quality of individual and overall network services.



  1. Network Management Lecture 4

  2. Performance Management • The practice of optimizing network service response time. • It also entails managing the consistency and quality of individual and overall network services. • The most important task is measuring user/application response time. • For most users, response time is the critical performance success factor; it shapes how both your users and application administrators perceive the network's success. (Cisco)

  3. What is Performance Management • Quantification of performance indicators on • Server • Network • Workstation • Applications • Standard performance goals are: • Response time • Utilization • Throughput • Capacity

  4. In Performance Management • Need to • Maintain continuous indicator for performance evaluation • Verify what levels of service need to be maintained • Identify actual and potential bottlenecks • Establish and report on usage trends

  5. Objectives of Performance Management • Need to ensure that the network highway remains accessible and uncongested • Provide a consistent level of service • Avoid degradation of performance • Provide proactive management

  6. Performance Indicators Required • Transmission capacity • Expressed in bits per second • Signal propagation delay • Time required for a signal to reach its destination • The longer the propagation distance, the longer the delay
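The propagation-delay point above can be made concrete with a small calculation. This is an illustrative sketch: the velocity factor of 0.67 is a typical value for fibre and copper media, not a measured one.

```python
# Illustrative propagation-delay calculation (hypothetical link parameters).
SPEED_OF_LIGHT_M_S = 3e8
VELOCITY_FACTOR = 0.67  # signals in fibre/copper travel at roughly 2/3 c

def propagation_delay_ms(distance_m: float) -> float:
    """Return one-way signal propagation delay in milliseconds."""
    return distance_m / (SPEED_OF_LIGHT_M_S * VELOCITY_FACTOR) * 1000

# A 100 km link adds roughly half a millisecond of one-way delay,
# regardless of how much transmission capacity (bps) the link has.
print(round(propagation_delay_ms(100_000), 3))
```

Note that this delay is independent of bandwidth: it is the distance, not the bit rate, that sets the floor.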

  7. Performance Indicators Required • Topology • Star, tree, ring, bus, or a combination of star and ring • Limits the number of workstations or hosts per cable segment that can be attached to the network • The higher the number of nodes, the lower the performance

  8. Performance Indicators Required • Frame/Packet Size • Most LANs are designed to support only a specific, fixed frame or packet size • If a message is larger than the frame size, it must be broken into smaller units • An increase in the number of frames per message adds delay
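The fragmentation effect above can be sketched numerically. The payload and header sizes below are hypothetical round numbers, chosen only to show how per-frame overhead grows with frame count.

```python
import math

# Sketch: a message larger than the frame payload is split into frames,
# and each frame carries fixed header overhead (hypothetical sizes).
def frames_needed(message_bytes: int, payload_per_frame: int) -> int:
    """Number of frames required to carry the message."""
    return math.ceil(message_bytes / payload_per_frame)

def total_bytes_on_wire(message_bytes: int, payload_per_frame: int,
                        header_bytes: int) -> int:
    """Message bytes plus the per-frame header overhead."""
    n = frames_needed(message_bytes, payload_per_frame)
    return message_bytes + n * header_bytes

# A 10 000-byte message with a 1500-byte payload needs 7 frames;
# each extra frame adds header bytes and per-frame processing delay.
print(frames_needed(10_000, 1500))            # 7
print(total_bytes_on_wire(10_000, 1500, 26))  # 10_182
```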

  9. Performance Indicators Required • Access protocols • The most influential metric • e.g. CSMA/CD, Token Ring • User traffic profile • Time of use • Type of message generated by the user (single, broadcast) • Number of users online

  10. Performance Indicators Required • Buffer Size • A piece of memory used to receive, store, process and forward messages • If the buffer is too small, delays or discarding of packets may occur

  11. Performance Indicators Required • Data Collision and Retransmission • Collision is inevitable • Factors to be considered • Time it takes to detect collision • Transmission time of collided messages

  12. Performance Indicators Required • Resource usage • How much of a resource is used by a user or application • How much reserve is left • Processing delays • Can be caused by both host and network • Host delays - divided into system and application processing delays

  13. Performance Indicators Required • Processing Delays (cont'd) • Network delays - caused by both hardware and software • (e.g. network card vs. network driver) • Throughput • A measurement of transmission capacity • A statistical measurement over time
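Since throughput is a statistical measurement over time, a common technique is to compute it from periodic byte-counter samples, as an SNMP poller would collect from an interface counter such as ifInOctets. The sketch below assumes cumulative counters and ignores counter wraparound.

```python
# Sketch: throughput as a statistical measurement over a sampling window,
# computed from (timestamp, cumulative byte count) pairs.
def throughput_bps(samples):
    """Average throughput in bits per second across the whole window.

    samples: list of (timestamp_seconds, cumulative_byte_count) pairs,
    e.g. from periodic polls of an interface's byte counter.
    """
    (t0, b0) = samples[0]
    (t1, b1) = samples[-1]
    return (b1 - b0) * 8 / (t1 - t0)

# Three polls, one minute apart: 18 MB transferred over 120 seconds.
samples = [(0, 0), (60, 7_500_000), (120, 18_000_000)]
print(throughput_bps(samples))  # 1_200_000.0 bits per second
```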

  14. Performance Indicators Required • Availability • Service availability from an end user's point of view • If delays are long, then even if the network is up, as far as the end user is concerned it is virtually unavailable

  15. Performance Indicators Required • Fairness of measured data • Important to measure at both peak and average levels (the peak-to-average ratio) • Collect data at known high-usage and average-usage periods • Sample measurement • Measurement of traffic volume • Ensure the sampling interval is the same as above

  16. Performance Management Measurement Methods • Collect data on current utilisation of network devices and links • Static vs dynamic • One-off or continuous sampling • Event reporting or polling • Analyse the relevant data • Set utilisation thresholds • Simulate the network
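The polling-and-threshold step can be sketched minimally. In a real deployment the utilisation readings would come from SNMP polls of interface counters; here the readings dictionary is a hypothetical stand-in, and the 80% threshold is an illustrative choice, not a vendor guideline.

```python
# Minimal polling-style sketch: compare sampled link utilisation against a
# configured threshold and report the links that exceed it.
THRESHOLD = 0.80  # illustrative: flag links running above 80% utilisation

def over_threshold(readings: dict[str, float]) -> dict[str, float]:
    """Return only the links whose utilisation exceeds the threshold."""
    return {link: u for link, u in readings.items() if u > THRESHOLD}

# Hypothetical utilisation samples (fraction of capacity in use).
readings = {"eth0": 0.35, "eth1": 0.92, "serial0": 0.81}
print(over_threshold(readings))  # {'eth1': 0.92, 'serial0': 0.81}
```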

  17. Performance Management Measurement Methods • Collect a good sample size • Do not rely on just one measurement • Take several and average them • Ensure samples are representative • Take measurements at different times of the day/week • Compare loads (e.g. lunchtime load vs. end-of-month load)

  18. Performance Management Measurement Methods • Beware of the unexpected • Unusual use on the day of the test • Backups at 3 a.m.

  19. Threshold and Exception Reporting • Define indicators • Determine the frequency of measurements • Define a threshold for each indicator • Get guidelines from vendors

  20. Threshold and Exception Reporting • Design reporting systems • Determine information areas and indicators • Which equipment, networks or objects are monitored • Determine the distribution matrix • Who gets reports • How often and at what level of detail • Presentation
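The per-indicator thresholds and the distribution matrix can be combined into a simple exception-reporting step. All indicator names, threshold values and recipient groups below are illustrative assumptions, not vendor guidelines.

```python
# Sketch of exception reporting: each indicator has a threshold, and a
# distribution matrix decides which group receives each exception report.
THRESHOLDS = {"cpu_util": 0.85, "packet_loss": 0.05, "response_time_ms": 500}
DISTRIBUTION = {
    "cpu_util": ["ops"],
    "packet_loss": ["ops", "noc-manager"],
    "response_time_ms": ["service-desk"],
}

def exceptions(measurements: dict[str, float]) -> list[tuple[str, list[str]]]:
    """Return (indicator, recipients) for every indicator over its threshold."""
    return [(k, DISTRIBUTION[k]) for k, v in measurements.items()
            if v > THRESHOLDS[k]]

print(exceptions({"cpu_util": 0.91, "packet_loss": 0.01,
                  "response_time_ms": 620}))
```

Only the indicators that breach their thresholds generate reports, which is the point of exception reporting: routine measurements stay out of the distribution.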

  21. Network Performance Analysis • Data Analysis • What are the effects of hardware/software on the network? • Dependent on • Network type/protocols • Packet size • Buffer size • Processes running • Routing algorithms

  22. Network Performance Tuning • Tune to service requirements • Calculate payback in advance • Observe the 80-20 rule and 1:4 internet traffic rule • Focus on critical resources • Determine when capacity is exhausted • Define objectives • Determine time frames

  23. System Design for Better Performance • CPU speed is more important than network speed • Speeding up the network has no effect if processing is the bottleneck • Reduce packet count to reduce software overheads • Each packet has its associated overheads • The more packets, the more overhead • Increase packet size to reduce per-packet overheads

  24. System Design for Better Performance • Minimise context switching • e.g. kernel to user mode • Wastes processing time and power • Reduced by having library procedures buffer outgoing data internally until a substantial amount has been collected before sending • Minimise copying • e.g. copying from a device buffer to a kernel buffer to a network-layer buffer to a transport-layer buffer • Copy steps should be eliminated where not required
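The internal-buffering idea above can be sketched as a small wrapper: many cheap writes are accumulated, and the expensive send path (in practice, a system call and its context switch) is crossed only when enough data has been collected. The class and sizes are hypothetical, chosen for illustration.

```python
# Sketch: a library-level wrapper that buffers small writes and only calls
# the expensive send function once a substantial amount has accumulated,
# reducing per-call overhead such as context switches.
class BufferedSender:
    def __init__(self, send_fn, flush_at: int = 1500):
        self.send_fn = send_fn    # the expensive call (stands in for a syscall)
        self.flush_at = flush_at  # hypothetical flush threshold in bytes
        self.buf = bytearray()

    def write(self, data: bytes):
        self.buf += data
        if len(self.buf) >= self.flush_at:
            self.flush()

    def flush(self):
        if self.buf:
            self.send_fn(bytes(self.buf))
            self.buf.clear()

sent = []
s = BufferedSender(sent.append, flush_at=10)
for chunk in (b"abc", b"defg", b"hij", b"k"):  # four application writes...
    s.write(chunk)
s.flush()
print(len(sent))  # ...but only two calls into the expensive send path
```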

  25. Event Correlation Techniques • Basic elements • Detection and filtering of events • Correlation of observed events using AI • Localize the source of the problem • Identify the cause of the problem • Techniques • Rule-based reasoning • Model-based reasoning • Case-based reasoning • Codebook correlation model • State transition graph model • Finite state machine model

  26. Rule-Based Reasoning

  27. Rule-Based Reasoning • The rule-based paradigm is an iterative process • RBR is "brittle" if no precedent exists • Exponential growth of the knowledge base poses a scalability problem • Problem with instability around thresholds: if packet loss < 10%, alarm green; if 10% <= packet loss < 15%, alarm yellow; if packet loss >= 15%, alarm red • Solution: use fuzzy logic
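The instability of the crisp rules can be seen directly: a packet-loss reading hovering around 10% flips the alarm between green and yellow on every sample. A fuzzy membership function degrades gracefully instead. The crisp rules below come from the slide; the fuzzy membership ranges (8-17%) are an illustrative assumption.

```python
# Crisp threshold rules from the slide: unstable near the 10% boundary.
def crisp_alarm(loss_pct: float) -> str:
    if loss_pct < 10:
        return "green"
    if loss_pct < 15:
        return "yellow"
    return "red"

# Sketch of a fuzzy alternative: degree of membership in "yellow",
# rising linearly over 8-12% loss and falling over 13-17% (assumed ranges).
def fuzzy_yellow(loss_pct: float) -> float:
    if loss_pct <= 8 or loss_pct >= 17:
        return 0.0
    if loss_pct < 12:
        return (loss_pct - 8) / 4
    if loss_pct <= 13:
        return 1.0
    return (17 - loss_pct) / 4

print(crisp_alarm(9.9), crisp_alarm(10.1))    # flips: green -> yellow
print(fuzzy_yellow(9.9), fuzzy_yellow(10.1))  # changes only slightly
```

A reading oscillating between 9.9% and 10.1% changes the fuzzy membership by only 0.05 rather than flipping the alarm colour outright.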

  28. Configuration for RBR Example

  29. RBR Example

  30. Model-Based Reasoning • Object-oriented model • Model is a representation of the component it models • Model has attributes and relations to other models • Relationship between objects reflected in a similar relationship between models

  31. MBR Event Correlator Example: Hub 1 fails • The failure is recognized by the Hub 1 model • The Hub 1 model queries the router model • If the router model declares a failure, the Hub 1 model declares NO failure (the fault lies upstream) • If the router model declares no failure, the Hub 1 model declares a failure

  32. Case-Based Reasoning • Unit of knowledge • RBR: rule • CBR: case • CBR is based on cases experienced before, extended to the current situation by adaptation • Three adaptation schemes • Parameterized adaptation • Abstraction / re-specialization adaptation • Critic-based adaptation

  33. CBR: Matching Trouble Ticket Example: File transfer throughput problem

  34. CBR: Parameterized Adaptation • A = f(F) • A' = f(F') • The functional relationship f remains the same
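Parameterized adaptation can be sketched in a few lines: the stored case fixes the functional relationship f, and a new problem with a different parameter value F' reuses the same f to propose an adapted solution A'. The function f below (relating file size to acceptable transfer time, echoing the trouble-ticket example) is a hypothetical illustration.

```python
# Sketch of CBR parameterized adaptation: same functional relationship f,
# new parameter value. f is an assumed, illustrative case relationship.
def f(file_size_mb: float) -> float:
    """Hypothetical case knowledge: expected transfer time grows with size."""
    return 2.0 * file_size_mb  # assumed 2 seconds per MB from the stored case

stored_case = {"F": 100, "A": f(100)}  # original case: A = f(F)
new_F = 250                            # new problem parameter F'
adapted_A = f(new_F)                   # adapted solution: A' = f(F')

print(stored_case["A"], adapted_A)  # 200.0 500.0
```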

  35. CBR: Abstraction / Re-specialization • Two possible resolutions • A = f(F): adjust the network load level • B = g(F): adjust the bandwidth • Resolution chosen based on the constraint imposed

  36. CBR: Critic-Based Adaptation • Human expertise introduces a new case • N (network load) is an additional parameter added to the functional relationship

  37. CBR-Based Critter

  38. Codebook Correlation Model: Generic Architecture • Yemini et al. proposed this model • Monitors capture alarm events • The configuration model contains the configuration of the network • The event model represents events and their causal relationships • The correlator correlates alarm events with the event model and determines the problem that caused the events

  39. Codebook Approach Approach: • Correlation algorithms are based on a coding approach to event correlation • Problem events are viewed as messages generated by the system and encoded in sets of alarms • The correlator decodes the problem messages to identify the problems Two phases: 1. Codebook selection phase: the problems to be monitored are identified and the symptoms they generate are associated with each problem; this produces the codebook (problem-symptom matrix) 2. The correlator compares alarm events with the codebook and identifies the problem.

  40. Causality Graph • Each node is an event • An event may cause other events • Directed edges start at a causing event and terminate at a resulting event • Picture causing events as problems and resulting events as symptoms

  41. Labeled Causality Graph • Ps are problems and Ss are symptoms • P1 causes S1 and S2 • Note the directed edge from S1 to S2 is removed; S2 is caused directly, or indirectly via S1, by P1 • S2 could also be caused by either P2 or P3

  42. Codebook • The codebook is the problem-symptom matrix • It is derived from the causality graph after removing the directed edges representing propagation between symptoms • Number of symptoms ≥ number of problems • 2 rows are adequate to uniquely identify 3 problems

  43. Correlation Matrix • Correlation matrix is reduced codebook

  44. State Transition Model • Used in Seagate’s NerveCenter correlation system • Integrated in NMS, such as OpenView • Used to determine the status of a node

  45. State Transition Model Example • NMS pings hubs every minute • Failure indicated by the absence of a response
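The ping-driven status tracking above fits naturally into a state transition table: each one-minute ping cycle feeds the hub's state machine, and a missed response moves it toward "down". The states and the two-strikes rule below are illustrative assumptions, not NerveCenter's actual configuration.

```python
# Sketch of a state transition model for hub status, driven by the outcome
# of each one-minute ping (True = response received, False = no response).
# States and transition rules are illustrative.
TRANSITIONS = {
    ("up", True): "up",
    ("up", False): "suspect",     # first missed ping
    ("suspect", True): "up",      # response returned: recover
    ("suspect", False): "down",   # second consecutive miss: declare down
    ("down", True): "up",
    ("down", False): "down",
}

def run(responses, state="up"):
    """Feed a sequence of ping outcomes through the state machine."""
    for ok in responses:
        state = TRANSITIONS[(state, ok)]
    return state

print(run([True, False, True]))   # transient miss: ends back at "up"
print(run([True, False, False]))  # two consecutive misses: ends "down"
```

The intermediate "suspect" state is what keeps a single lost ping from immediately raising a failure event.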

  46. State Transition Graph

  47. Finite State Machine Model • The finite state machine model is a passive system; the state transition graph model is an active system • An observer agent present in each node reports abnormalities (e.g. a Web agent) • A central system correlates the events reported by the agents • A failure is detected when a node enters an illegal state
