Management Pack Development

1. Management Pack Development
Vlad Joanovic, Senior Program Manager, Microsoft Corporation

2. Agenda
- Building manageable applications
- Common integration patterns
- Customer data
- Technical best practices and architecture
- Understand how to model an application and use the model to build a management pack
- Understand the steps required to build a management pack
- What to do next and what questions to ask
- Summary
- References

3. Operations Manager's Focus Areas
- Best for Windows
- Best for Physical and Virtual
- New Monitoring Scenarios
  - End-to-end IT service management (distributed apps, web services)
  - Client Monitoring
  - Audit and Compliance
  - SP2: Non-Windows monitoring
- Best for Data Center Management
- Solution-based selling

4. Microsoft Investing in Ops Mgr
- Operations Manager Team:
  - SP1 and developing SP2
    - Includes non-Windows monitoring
  - Management Packs
    - Over 85 Operations Management Packs delivered since RTM
    - Windows Server 2008 MPs and Support
  - New MP Authoring Console!
- Other Microsoft Areas:
  - Corporate: Engyro acquisition for Ops Mgr interoperability
  - Microsoft Consulting Services: Server Infrastructure Optimization
  - Visio: New Connectors for Visio 2007
  - Solution Accelerator: Service Level Dashboard for System Center Operations Manager 2007 - New Beta!
  - Visual Studio: Team System Management Model Designer
  - Forefront: next generation security products (Stirling), built on Operations Manager 2007

5. Our Commitment to MP Partners
- Make it as easy as possible for you to develop onto Operations Manager
  - Authoring Console and documentation
  - Responsive business and technical resources
  - Built on open standards such as WS-Management and common models
- Options for us to work together
  - Joint sales and marketing plans
  - Content (case studies), training, and readiness to support joint efforts
  - Online marketing through System Center Alliance: www.microsoft.com/systemcenter/alliance
  - Management Pack catalog
- Frequent communication
  - Joint collaboration in planning for future releases
- By building partner solutions to extend Operations Manager you increase your opportunities, and we close more joint sales!

6. Thinking about monitoring
- Most instrumentation seen in the wild today...
  - Doesn't tell me if a service or application is working well
  - Was great for the developer while debugging
  - Reports a symptom, and is rarely suitable on its own to make a diagnosis
- Most monitoring today is...
  - Added on by the people who are responsible for keeping the application running
  - Rarely a part of the up-front system or application design effort
  - A best guess by the person or team who designed the monitoring rules, based on what instrumentation is visible after setting up the application in a test environment
- Today we'll talk about improving these
  - Just food for thought if you are shipping applications that need to be monitored

7. Life cycle states & measurements

8. Three important questions
- Is my application healthy?
  - Use health measures to show there are no customer-impacting issues
  - Look at redundant measures that detect elements that have failed
  - Look at the balance of work across the system
  - Are critical dependencies able to perform in concert without major disruption to users?
- Are the users of my application happy?
  - How fast do your pages load from request to responsiveness?
  - Look at abandon page rates relative to overall traffic
  - Can an end-to-end interaction happen without interruption?
  - Consider artificial transactions as a weak proxy for these
- How well do the parts of my application work together?
  - Look at subsystem measures that signal imbalances
  - Instrument for detecting problems where they occur
  - Be able to follow a call from end to end if necessary

9. Failure-mode analysis
- Definition
  - An up-front design effort for a monitoring plan that is similar to threat modeling
- Produces
  - Instrumentation plan
    - Design artifacts used to write code that helps detect failures
    - Typically shows up in specs
  - Monitoring plan
    - Used by operations to configure the monitoring system(s) during deployment
  - Health model
    - Describes health at the end-user and subsystem level
    - Used to understand impact of specific types of failures on each subsystem
    - Guides mitigation and recovery documentation
    - Helps drive escalations from Tier 1 to Tier 2
- Driven by
  - Monitoring champion: ensures that monitoring is part of the design process

10. Failure mode analysis: process
- Step 1: List what can go wrong and cause harm to the service
  - Identify all failure modes: list predictable ways to fail
  - Understand if an item is a way to fail or an effect of a failure
  - Prioritize according to impact on service health, probability, cost
  - Include physical, software, and network components
- Step 2: Identify a detection strategy for each failure mode
  - Each high-impact item needs at least two detection methods
  - Detection can be a measure or event, or can require a watchdog (code)
- Step 3: Add these detection elements to your code effort
  - Some are probes, some are monitors (automate as much as possible)
- Step 4: Write your management pack
- Result is the basis of instrumentation and monitoring plans
  - Failure modes are root causes
  - Detecting root causes directly is optimal
  - Inferring root causes via symptoms requires correlation
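The prioritization in Step 1 and the "at least two detection methods" rule in Step 2 can be sketched as a small check. This is only an illustration, not part of the original deck: the 1-5 scoring scales, the `FailureMode` type, the threshold of 4 for "high impact", and the example detection names are all assumptions, and cost is left out of the score for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    name: str
    impact: int        # hypothetical scale: 1 (low) .. 5 (high)
    probability: int   # hypothetical scale: 1 (rare) .. 5 (frequent)
    detections: list = field(default_factory=list)  # names of detection methods


def prioritize(modes):
    """Order failure modes by impact * probability, highest first."""
    return sorted(modes, key=lambda m: m.impact * m.probability, reverse=True)


def under_covered(modes, high_impact=4, min_detections=2):
    """Return names of high-impact modes with fewer than min_detections methods."""
    return [
        m.name
        for m in modes
        if m.impact >= high_impact and len(m.detections) < min_detections
    ]


# Example entries borrowed from the failure-mode list on slide 11;
# the scores and detection methods are made up for illustration.
modes = [
    FailureMode("Config file missing", 5, 2, ["startup event", "file watchdog"]),
    FailureMode("Discovery layer not responding", 5, 3, ["timeout counter"]),
    FailureMode("Log cannot be written", 2, 2, []),
]

for mode in prioritize(modes):
    print(mode.name, mode.impact * mode.probability)
print("Needs more detection:", under_covered(modes))
```

A list like the output of `under_covered` is one way to spot gaps before the instrumentation plan is finalized.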

11. Example: Blue client library
- Log cannot be written
- Garbage collection interferes with expected behaviors
- Machine.config is not set up right
- Wrong version of .NET framework on machine
- CRC failures
- Discovery layer responding slowly
- Discovery layer not responding
- Discovery layer returns incorrect location
- Location cache is out of date
- Unrecognized or bad blob format causes crash
- Remote code execution exploit occurs
- Version mismatch
- Assembly load fails
- Unable to find a specific collection
- Unable to communicate with anything
- Authorization: Client library not able to communicate
- Connection pool is not adding new connections
- Client goes into infinite loop and spams blue
- Memory not available to hold working set
- Config file missing
- Config file corrupt
- Config file points to wrong places
- Communicates with wrong end point

12. Coverage matrix

13. Common integration patterns
- An application that runs on Windows
  - Windows Service
  - Windows Application
  - Web Service
- An application that runs where there is no HealthService
  - Exposes syslog, SNMP, WS-Man, or other providers of management data
- Proxy management
  - Another application exists that is already monitoring

14. App that runs on Windows
- The agent is where monitoring happens
- The management server, gateway and RMS have "agents" too, so monitoring can happen there
  - OpsMgr has already defined and discovered the objects that represent the Windows computers being monitored with an "Agent" (this includes management servers, gateways and RMSs)
  - This means a user has to deploy an OpsMgr agent to the server/client that you want to monitor
  - If discovery for your objects is going to run within OpsMgr, your first discovery has to be associated with something OpsMgr already knows about

15. An application not on HS
"No health service" means any machine or system that does not have a local agent / health service running on it. It therefore needs to be discovered from a Windows machine that does have a health service, and its data is proxied through that Windows machine.
Note: Proxy monitoring needs to be turned on for any agent that wants to discover instances that are not "hosted" by the Windows computer that is responsible for that HS. This can be changed from Administration -> Agents: right-click, choose Properties, then Security.

16. Proxy Management
The proxy machine should be a Windows machine that has an agent on it, or a management server, so that the proxy software (i.e. the service) can be monitored by a standard Windows application MP.

17. Pattern Comparison
* If the computer is monitored agentless, then the object is owned by the HS on the management server or RMS that monitors that computer (the user picks this when they choose how to monitor the computer). 99% of our customers choose to monitor with an agent.
** The RMS is the default HS that manages any object that is discovered through the SDK. This can be changed by specifying a special shouldmanage relationship between the object and the HS that should monitor it. SDK objects, after being discovered, typically have events and performance data created against them that should go through the workflows associated with those objects; the SDKevent and SDKperformance data sources only run on the RMS. If there is a really large amount of data that needs to be collected through the SDK/connector, then the SDK data source method is not the right one. A ballpark estimate of when there is too much data is around 100-500 data items per minute being inserted (depends on number of objects, number of associated workflows, etc.). If larger streams of data are expected, then new custom performance and event data sources would need to be developed and used instead of the SDK ones for data generation for the workflows. These can execute on management servers or agents that are dedicated to the connector without negatively affecting the performance of the RMS. For more detail on building data source modules, contact vladj@microsoft.com
*** Discovery could be done as in the no-health-service case and does not have to be done through the SDK. For discovery data rules, the maximum discovery size (i.e. the XML that the discovery data produces) is 4 MB.

18. MP Methodology
- Build application to be manageable
- Gather monitoring requirements and start building MP
  - Failure mode analysis
  - Seek out the knowledge and encode it in the MP
    - Often in the heads of customers, support staff, operators, subject matter experts, etc.
- Deploy it in real data centers
- Refine it based on real-world data
  - Is it noisy? Or not identifying all the unhealthy situations?
  - Did it identify the problems before a customer called the helpdesk with an issue?
- Regularly update MP as more knowledge is gained
  - Also provide requirements into application for missing monitoring scenarios
Notes:
This doesn't just apply to an application; it could apply to any service, system or device, etc. The key is to build the application to be manageable from the beginning. The first point really means having the right instrumentation so a monitoring product like OpsMgr can monitor it: events in the event log, log files, SNMP, syslog, WS-Man.
MP requirements need to be identified: what should be monitored? How can it be monitored? It is simple if everything runs on Windows, as that is where we have infrastructure (and soon to have it on non-Windows).
The key is to start building a MP. The first one is never perfect, and it is important to gather the knowledge of how users monitor your application today (i.e. playbooks or operations guides) and encode this into the MP, which OpsMgr can execute at pennies a transaction and which scales better than people manually checking things.
After this is done it needs to be validated in a real-world environment; this is the only way to tell if the MP adds value. Key things to look for: does the MP identify problems before users call about them? Is it saying there is a problem when everything is just fine?
Once some MP is there, it needs to be refined and continually updated to stay fresh with knowledge. The potential here is huge: imagine our support team helping 1 customer with an issue, spending lots of time figuring out what the issue was and how it happened, and then encoding this knowledge/scenario into a new MP. Any customer in the world that uses that new MP will not have to go through the same troubleshooting experience, and will know directly from the MP that there is an issue and how it should be fixed.

19. Management Pack contents
- Class definitions: "the model"
- Discoveries
- Monitors: "health model"
- Tasks
  - Diagnostics, recovery, generic tasks
  - Console tasks
- Rules
  - Event and performance collection
  - Generic rules
- Knowledge
- Views
- Reports
Notes:
- Class definitions and their structure (the model)
- Discoveries define how to discover instances of the classes and their relationships to other objects
- Monitors specify the health states to detect and how to detect problems
- Tasks
  - Diagnostic tasks describe what extra data is needed to troubleshoot the problem
  - Recovery tasks describe how to fix the problem
  - Generic tasks describe common actions that can be executed on demand for a given instance of an object type
- Rules
  - Event and performance collection rules specify the event and performance data that needs to be collected
  - Generic rules describe what actions to execute, and when, for all instances of an object type
- Knowledge that is displayed in alerts and monitors
- Reports that provide common sets of data against the model

20. Getting started
- Data flow
  - Where can the objects be discovered?
  - How is the instrumentation data going to get to an OpsMgr HealthService?
  - Is something else already "monitoring"?
- Building the actual MP
  - Standalone Authoring Console
  - Operations Manager Console
  - XML file
  - OpsMgr SDK APIs

21. Building a MP
- Define the application/model
- Define how to discover the application
- Define the health model with knowledge
- Define views
- Define tasks: diagnostic, recoveries, other
- Define reports

22. Who are the customers?
- "Anxious IT Managers Don't Sleep Well", InformationWeek, March 12, 2007
  - Two out of three IT managers say that they are kept awake at night worrying about work
  - 75 percent admit ongoing anxiety about application performance concerns
  - 25% of the respondents reported suffering physical symptoms, including nausea, headaches, migraines, panic attacks, heart arrhythmia, and muscle twitches. And nightmares.
- Terry Beehr, a Central Michigan University professor of psychology:
  - "If IT goes down, a lot of other departments can't do their work."
  - "IT is 24-by-7, plus that's combined with heavy workloads and work that needs to be done quickly."

23. What do the customers want?
- What MP features are most important?
  - Health model 61%
  - Alerting rules 23%
  - Alert views 6%, health views 6%, reports 3%
- What features are least important?
  - Tasks 50%
  - Alert views 16%, performance views 10%, reports 10%
- Which of the following monitoring aspects is most applicable?
  - Availability 84%
  - Configuration 10%, performance 6%
- 65% of customers expect SOME MP 30 days after product release
- 84% of customers expect knowledge to be kept up to date and live with current knowledge bases (quarterly updates)
- 61% of customers expect "quality" in the MP over timeliness or coverage
- 81% of customers want to know when a MP is available or has been updated
  - Get them listed in our catalog and tell us the version so we can let users know when they have the wrong version

24. MP quality
- Coverage
  - Are all features being monitored?
  - Monitor the critical ones, or the ones that break the most and cause the most support cases, first (release, then expand)
- Depth
  - Health and availability
  - Performance
  - Transactions (synthetic or real)
- Quality
  - Customer overrides required?
  - Knowledge correct and up to date
  - Tasks to help users resolve common problems

25. Quality MP author questions
- Does the MP show the object as red/down when it is really green/up?
  - False alerts are a waste of time
  - Would this red alert be something that a user should get out of bed at 3am for?
- Are there cases where the helpdesk learned about a problem with the application before the management pack said there was a problem?
  - These are important to ask customers about and follow up on:
    - Is this something that can be added to the MP? Instrumentation or transaction?
    - Or is the application missing instrumentation to enable the MP?
- Remember the 3 important questions!!!

26. Define the application
- Step 1: Define your application
  - Define the key components
  - Define the important properties
  - Define how the key components are tied together
- Step 2: Identify how it relates to the rest of the OpsMgr model
  - Common base classes
  - Why relate to other classes?
    - Roll up health, allow the user to see associations
  - 3 general approaches
    - Discover these in your MP
    - Discover these in another MP (which depends on your MP)
    - User discovers this in a distributed application
Notes: Step 2 can really be done 3 different ways:
1. You can discover the associations to the OpsMgr classes from your MP. This is what you have to do if your classes are hosted directly by the OpsMgr classes (or indirectly, through hosting relationships in the classes you inherited from). For example, you inherit from computer role and define a new computer role for your application. When you discover an instance of your application you always have to tell OpsMgr which object (a computer in this case) is hosting your instance. This automatically ties your objects to something users are familiar with (a computer), and they will be able to drill into your health model from the computer's health explorer.
2. These relationships can be discovered in a different management pack that depends on yours and on the Microsoft MP you have depended on. This can be done for reference or containment relationships. For example, your application uses SQL: depending on which HS is monitoring SQL, and on whether you want to place a dependency on SQL in your base MP, your application definition could contain or reference the SQL DB that you use.
3. There are many cases where the relationships are very hard to determine. In this case the user can use our distributed application designer to drag and drop a collection of instances from many different MPs into 1 distributed application definition, so that the app's health is a union of all the objects that the user added in.

27. OpsMgr Model
- Keys to success
  - A model is never perfect; approximations of the real world never are
  - Need to balance complexity and simplicity: keep it simple!
- Pitfalls to avoid
  - Marking objects as internal if they are required to be in distributed applications
  - Modeling classes for objects that are too transient or not useful for a typical admin
  - Modeling objects that have a very large instance space on 1 HS
    - Any time more than 1 object exists on a HS, think about cookdown strategies

28. OpsMgr Model
- OpsMgr ships a large collection of predefined classes or types
- Decide which class or type is closest to your model
- Use the predefined class or type as a foundation for your application model

29. Model

30. Operating System Model

31. Computer Role Model

32. Defining the application DEMO

33. Discovery
- 2 different execution & data flow methods
  - Discovery rule in OpsMgr: runs at timed intervals (e.g. every hour) or based on some event
  - SDK discovered: based on when the connector calls the SDK API
- Incremental vs snapshot
  - Incremental means the logic around removals and additions lives somewhere else
  - Snapshot means a discovery submission will always include ALL instances
    - OpsMgr implements the business logic around removals and additions
- HealthService "hosting" implications
  - The HS that discovered the object by default "hosts" all workflows against it
  - When the SDK discovers an object, the RMS hosts it (this can be changed)
- Reference counting and removing objects
  - If X discovery rules discover an object, X rules need to undiscover that object
  - If discovery is disabled by an override, or the object it was targeted to goes away, instances may remain (use the PowerShell command in the notes)
Notes:
- A discovery rule in OpsMgr is executed by the HS in monitoringhost.exe using the specified credentials (the action account by default).
- Objects are ref counted, so if X discovery rules discover them then X discovery rules would have to "undiscover" them. If your discovery rules are no longer even running due to a targeting change (or because the discovery rule was disabled), then you would have to delete the instances through this PowerShell command: http://blogs.msdn.com/boris_yanushpolsky/archive/2007/11/20/opsmgr-sp1-removing-instances-for-which-discovery-is-disabled.aspx
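The reference-counting rule on this slide can be illustrated with a toy model. This is a sketch of the idea only, not OpsMgr's implementation: the `DiscoveryStore` class, its method names, and the rule and object names are all hypothetical.

```python
from collections import defaultdict


class DiscoveryStore:
    """Toy model: an object exists while at least one discovery rule claims it."""

    def __init__(self):
        # object id -> set of discovery rule names that discovered it
        self._sources = defaultdict(set)

    def discover(self, rule, obj):
        self._sources[obj].add(rule)

    def undiscover(self, rule, obj):
        self._sources[obj].discard(rule)
        if not self._sources[obj]:
            # No rule claims the object any more, so it leaves the model.
            del self._sources[obj]

    def exists(self, obj):
        return obj in self._sources


store = DiscoveryStore()
store.discover("RuleA", "sql-instance-1")
store.discover("RuleB", "sql-instance-1")

store.undiscover("RuleA", "sql-instance-1")
print(store.exists("sql-instance-1"))  # still claimed by RuleB

store.undiscover("RuleB", "sql-instance-1")
print(store.exists("sql-instance-1"))  # no remaining claims
```

In this model, a rule that stops running never calls `undiscover`, which mirrors why instances can be left behind when a discovery is disabled.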

34. Discovery timeline (snapshot) MP imported -new classes and relationships exist in the model -anything targeting an object that already exists (like Microsoft.Windows.Server.2003) will automatically be sent to the healthservices that �host� those instances -eg. SQL server MP defines a discovery that looks at every windows server (if SQL is installed on a client that is monitored by the same OpsMgr management group then it will NOT monitor it) -This discovery looks at the registry to see if SQL is installed � it will do this against ALL objects of type Microsoft.Windows.Server.Computer. This object is owned by the HS that is local to that windows server (except if that machine is being monitored agentless � then the object is �owned� by a HS on the management server or RMS that is configured to monitor that server agentlessly � user decides this during discovery of the system) Discovery runs on the HSs that match its target and send the output to the mgmt server �Discoveries can be based on things like registry keys (which are cheap and every MP should use this as the first and only discovery rule that gets associated to every object in our system) to things like WMI queries, or as completely flexible as scripts � vbs or js or pearl (anything supported by cscript) as there is a scripting api that allows discovery data to be created. Lets say in our example there is only 1 server being monitored by the OpsMgr mgmt group that is a Microsoft.Windows.Server.Computer and it also happens to have 2 instances of SQL installed on it. In this case 2 instances of SQL would come back from the discovery rule (they run independently on different healthservices). OpsMgr would see these 2 instances � look at what exists for this class already in the DB (at this point nothing) and then it inserts these 2 objects so if the user looks at a state view that shows all SQLDBEngine instances they will see these 2 objects. 
MP imported:
- New classes and relationships exist in the model.
- Anything targeting an object that already exists (like Microsoft.Windows.Server.2003) is automatically sent to the health services (HSs) that "host" those instances.
- E.g. the SQL Server MP defines a discovery that looks at every Windows server (if SQL is installed on a client that is monitored by the same OpsMgr management group, it will NOT be monitored).
- This discovery looks at the registry to see if SQL is installed; it does this against ALL objects of type Microsoft.Windows.Server.Computer.
This object is owned by the HS that is local to that Windows server. The exception is a machine that is monitored agentlessly: then the object is "owned" by an HS on the management server or RMS that is configured to monitor that server agentlessly (the user decides this during discovery of the system). Discovery runs on the HSs that match its target and sends the output to the management server.
Discoveries can be based on registry keys (which are cheap, and every MP should use this as the first and only discovery rule that gets associated to every object in our system), on WMI queries, or on scripts, which are completely flexible: VBScript, JScript or Perl (anything supported by cscript), as there is a scripting API that allows discovery data to be created.
Let's say in our example there is only 1 server being monitored by the OpsMgr management group that is a Microsoft.Windows.Server.Computer, and it also happens to have 2 instances of SQL installed on it. In this case 2 instances of SQL would come back from the discovery rule (discovery rules run independently on different health services). OpsMgr would see these 2 instances, look at what already exists for this class in the DB (at this point nothing), and insert these 2 objects, so if the user looks at a state view that shows all SQLDBEngine instances they will see these 2 objects.
Now let's say the user deletes 1 instance of SQL from the machine that the discovery just ran on. The discovery rule that looks for SQL instances is configured to run daily. The discovery rule is in snapshot mode (only in a script can it be changed to incremental), so it runs without any state carried over from the previous run: it always discovers all instances, period. This allows OpsMgr to infer (at the DB level, where the discovery data is being inserted) that 1 of the instances must not exist anymore, so it is removed from the model and the user no longer sees the state associated with it.
The same happens with discovery through the SDK, if snapshot is turned on. The only difference is that nothing schedules discovery through the SDK (unless you use a rule from OpsMgr to schedule it). Basically, the connector can submit discovery data anytime it wants. The big question is whether it has the means to give OpsMgr deltas and do the business logic to keep the model up to date, or whether it just wants to submit ALL instances ALL the time and let OpsMgr figure it out. For small discovery data sets (fewer than 100 objects) snapshot is convenient. As the discovery data set grows to 1,000s or 10,000s of objects, you will definitely want to make sure that your connector does not use snapshot, because large amounts of discovery data are expensive to process.
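
The difference between snapshot and incremental discovery described above can be sketched as follows. This is only an illustration in plain Python, not OpsMgr code: the function names and the instance names INST1/INST2 are hypothetical, and the model is reduced to a set of instance keys.

```python
def apply_snapshot(model, discovered):
    """Snapshot mode: the submission is the complete list of instances.

    Anything in the model that is missing from the submission is
    inferred to be gone and is removed.
    """
    discovered = set(discovered)
    added = discovered - model
    removed = model - discovered
    return discovered, added, removed


def apply_incremental(model, adds=(), removes=()):
    """Incremental mode: the submitter sends only the deltas it knows about."""
    return (set(model) | set(adds)) - set(removes)


# First daily run: both SQL instances are discovered (hypothetical names).
model = set()
model, added, removed = apply_snapshot(model, ["INST1", "INST2"])
first_run = (sorted(added), sorted(removed))

# The user deletes INST2; the next snapshot run simply omits it.
model, added, removed = apply_snapshot(model, ["INST1"])
second_run = (sorted(added), sorted(removed))

# A connector that tracks changes itself submits only the removal.
delta_model = apply_incremental({"INST1", "INST2"}, removes=["INST2"])
```

With snapshot, the receiving side has to compare every submission against the whole model, which is why it gets expensive as the data set grows; with incremental, that work moves into the connector.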

35. Discovery guidelines
Keys to success
- Use the registry as a discovery source when targeting an object that OpsMgr discovered (i.e. Windows computer, server, client)
- A simple model is simple to discover
Pitfalls to avoid
- Discovering data too often: makes the default configuration report useless
  - General rule: balance discovery frequency with the amount of likely change. Side with less frequent discovery or use an event-driven model
  - If it changes a lot and is important to track, think about using a perf counter or event
- Making the model too complex: too many classes, relationships

36. Discovery sources on HS
- Registry
  - Look for presence of HKLM\Software\YourApp
  - Look for HKLM\Software\YourApp\Version = 6.0.6278.0
- WMI
  - Perform "select * from Win32_LogicalDisk"
- Script
  - Anything your heart desires (and can be implemented in a script)

37. Discovering the application DEMO

38. Defining Health Model
- Provide a way for OpsMgr to understand whether the application is healthy or not
  - Monitors (state is based on monitor state transitions)
  - Monitors can create alerts for state transitions
  - Rules: create alerts that do not impact state
- Provide a way for OpsMgr to automatically diagnose or fix the problem (optional)
  - Diagnostics
  - Recoveries
- Provide knowledge to the operator on how to fix the application
A set of monitors needs to be defined for each object. By definition, an object that has no monitors associated with it will show up as "unmonitored", even if it has open alerts associated with it. The model defined earlier constrains how health can be rolled up (health can roll up over any containment relationship, and this includes hosting). The HS hosting the object could also prevent the desired rollup. The job/contract of the MP/model is to make sure the "state" of an object is always up to date. Users use the objects in distributed applications (DA), so having the right state is critical. Users mostly use the alerts generated from monitors or rules, so create alerts on health state transitions.
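
To make the rollup idea concrete, here is a minimal sketch in plain Python. It is not how OpsMgr implements rollup; it assumes a worst-state policy, treats rollup as a simple function instead of a dependency monitor, and ignores unmonitored children. The state names and the `object_state` helper are hypothetical.

```python
# Higher number = worse state (assumed ordering for this sketch).
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def object_state(monitor_states, child_states=()):
    """Return an object's state from its own monitors and its contained/hosted children.

    An object with no monitors is "unmonitored", whatever alerts it has.
    """
    if not monitor_states:
        return "unmonitored"
    candidates = list(monitor_states)
    candidates += [s for s in child_states if s != "unmonitored"]
    return max(candidates, key=SEVERITY.__getitem__)


# Three databases hosted by one database engine (hypothetical example).
db1 = object_state(["healthy"])
db2 = object_state(["healthy", "critical"])
db3 = object_state([])  # no monitors defined for this object

engine = object_state(["healthy"], child_states=[db1, db2, db3])
```

Here `db3` stays "unmonitored" and the engine turns "critical" because one hosted database is critical, which is the behaviour a user would expect to see in a distributed application view.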

39. Monitors (State monitoring)

40. Health Model

41. Health Model: Roll up

42. Health Model guidelines
Keys to success
- Provide a HM that captures the different symptoms and appropriately reports state
Pitfalls to avoid
- When marking rules or monitors as "Remotable", make sure they really are remotable
  - For agentless monitoring
- To be public or not to be public (meaning Internal)
  - If Internal, no customer will be able to add diagnostics or recoveries
- For script-based monitors
  - Do implement OnDemand detection
- If an HS hosts multiple instances (like SQL databases)
  - Make sure monitors have a criterion on the instance key
- Run As profiles
  - Customers want to run the agent action account as a low-privilege account
  - Use a profile for all things resulting in workflows: rules, monitors, tasks and discoveries

43. Blue client library

44. Defining the health model DEMO

45. Cookdown strategy
- Will your model include monitoring more than 1 object of the same class on an HS?
- Workflows are created in the HS by associated rules/monitors/tasks
- Workflows are per instance: if there are 100s or 1000s of instances, there will be 100s or 1000s * (rules + monitors) = a much bigger number
- RMS and MS have an ability in which workflows run per type * (rules + monitors) = a much smaller number
  - Called "special aggregate monitors"
- Cookdown could have a similar effect at reducing load on the HS and MonitoringHost.exe
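
The arithmetic on this slide can be worked through with a small sketch. The counts below (1,000 instances, 10 rules, 15 monitors, 1 type) are made-up numbers for illustration only.

```python
def per_instance_workflows(instances, rules, monitors):
    """Workflows when every rule and monitor is instantiated per object instance."""
    return instances * (rules + monitors)


def per_type_workflows(types, rules, monitors):
    """Workflows when rules and monitors run once per type instead of per instance."""
    return types * (rules + monitors)


# Hypothetical MP: 10 rules and 15 monitors targeting one class.
per_instance = per_instance_workflows(instances=1000, rules=10, monitors=15)
per_type = per_type_workflows(types=1, rules=10, monitors=15)
```

With these numbers the per-instance design loads 25,000 workflows on the health service, against 25 for the per-type design, which is the gap the slide is pointing at.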

46. Cookdown dataflow
- Scenario: an HS has 5 instances of LOBAPP
- Data is coming from the WMI class LOBAPPPerf; there is an instance in WMI for each instance in OpsMgr
- Data sources are expensive
If the configuration being passed into a module (including a data source module) is exactly the same in 2 or more instances, then the health service will "cook" these down and wire the workflow as described on the right. If the configuration being passed into each DS is different, then it cannot be cooked down. For example, imagine in the above WMI case that the WQL query being used was something like this:
    select * from LOBAPPPerf where LOBAppPerfKey = $Target/Property[Type="MyCompanyname.OpsMgrLOBAPPClass"]/key$
This could not be cooked down, as the config for each DS is clearly different: the LOBAPP key property is being passed into the query (which is part of the configuration for a WMI data source module). A better way to do this would be to specify this as the query:
    select * from LOBAPPPerf
This can be cooked down, and then in the rest of the workflow a condition detection module can match the specific instance (this is far more efficient than running the WQL query 5 times).
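
The cookdown rule described here (identical data source configuration is shared, differing configuration is not) can be sketched in plain Python. This models only the grouping idea, not the health service itself; the instance keys A to E, the `QueueLength` column and both helper functions are hypothetical.

```python
from collections import defaultdict


def count_datasource_runs(workflows):
    """Group (instance_key, ds_config) pairs by identical data source config.

    Each distinct configuration costs one data source execution.
    """
    groups = defaultdict(list)
    for instance_key, ds_config in workflows:
        groups[ds_config].append(instance_key)
    return len(groups), dict(groups)


def condition_detection(rows, instance_key):
    """Pick out the rows that belong to one instance after the shared query ran."""
    return [row for row in rows if row["LOBAppPerfKey"] == instance_key]


keys = ["A", "B", "C", "D", "E"]  # 5 instances of LOBAPP on one HS

# Key substituted into the query: every instance has a different config.
per_instance = [
    (k, f"select * from LOBAPPPerf where LOBAppPerfKey = '{k}'") for k in keys
]
# Same query for every instance: the config is identical.
shared = [(k, "select * from LOBAPPPerf") for k in keys]

runs_per_instance, _ = count_datasource_runs(per_instance)
runs_shared, _ = count_datasource_runs(shared)

# Result of the single shared query, filtered later per instance.
rows = [{"LOBAppPerfKey": k, "QueueLength": i} for i, k in enumerate(keys)]
rows_for_c = condition_detection(rows, "C")
```

The first design needs 5 data source runs and the second needs 1, with the per-instance matching moved into a cheap filter step.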

47. Scripting
- Extend the monitoring abilities of OpsMgr 2007
  - Discovery
  - Create new tasks
  - Create new monitor types
  - Execute monitoring business logic in rules
  - Run synthetic transactions to simulate common user behaviour
- Good when no monitoring instrumentation exists for a particular scenario
- Easier to debug scripts
Debugging scripts:
- Use Visual Studio for script debugging.
- Use Script Debugger. Find it here: http://www.microsoft.com/downloads/details.aspx?familyid=2f465be0-94fd-4569-b3c4-dffdf19ccd99&displaylang=en
  When it is installed you can launch a script in the debugger with the //x switch:
      cscript.exe //x script.vbs parameter1 parameter2
  The script debugger allows you to set breakpoints and to step through code line by line, examining the contents of the variables in the command window. To see the contents of a variable in the command window, enter ?variableName.
- Use a debugging tool like PrimalScript.
- For XML formatting, use XmlSpy or Visual Studio for some quick formatting.
- Give the script a true/false argument, "LogDetail"; when it is true, dump out verbose log information. Set it as an overridable parameter.
- Use poor man's debugging techniques like MsgBox or WScript.Echo in the script.

48. Reporting
- Out of the box reports
  - Availability, configuration
- Extend reporting
  - New generic reports or linked (specialized) reports
  - Supporting database objects (views, functions, stored procedures)
  - New storage structures (tables) and collection for new data types

49. SCE & management packs
- SCE = System Center Essentials
- OpsMgr 2007 MP with rules/monitors "Enabled" value configured as:
  - OnEssentialMonitoring = enabled in both OpsMgr and Essentials by default
    - Critical rules/monitors appropriate for all environments
  - OnStandardMonitoring = enabled only in OpsMgr by default
    - Warning and information rules/monitors appropriate for complex environments/IT specialists
- Customer can override: enable rules/monitors in Essentials that are marked OnStandardMonitoring
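
The defaults on this slide amount to a small decision table, sketched below in plain Python. Only the two "Enabled" values named on the slide are handled; the product strings "OpsMgr" and "Essentials" and the `is_enabled` helper are hypothetical names for this illustration.

```python
def is_enabled(enabled_value, product, override=None):
    """Return whether a rule/monitor runs, given its Enabled value and the product.

    A customer override, when present, wins over the default.
    """
    if override is not None:
        return override
    if enabled_value == "OnEssentialMonitoring":
        return product in ("OpsMgr", "Essentials")
    if enabled_value == "OnStandardMonitoring":
        return product == "OpsMgr"
    raise ValueError(f"unhandled Enabled value: {enabled_value}")


# Defaults per product, then an Essentials customer overriding a standard rule.
critical_in_sce = is_enabled("OnEssentialMonitoring", "Essentials")
warning_in_sce = is_enabled("OnStandardMonitoring", "Essentials")
warning_in_opsmgr = is_enabled("OnStandardMonitoring", "OpsMgr")
warning_overridden = is_enabled("OnStandardMonitoring", "Essentials", override=True)
```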

50. Releasing a MP
- During creation an MP is an XML file
- Before releasing it to customers, you are required to "seal" it
  - Not a way to protect IP
  - Verifies your identity during import
  - Makes it "read only" and enforces backwards compatibility during upgrades

51. Putting it all together
- Your objects can be added to a distributed application
- Value and customer scenarios are enabled through the health model associated with your object(s)

52. Distributed Applications DEMO

53. Summary
- Understand which pattern you are building and the data flow strategy
  - Discovery, instrumentation
- Building a management pack
  - Define the application
  - Define discovery
  - Define the application health model
  - Seal your management pack
- Release and repeat: focus on knowledge and customer value
Think of the contract signed between OpsMgr and an MP:
- The MP will define classes
- OpsMgr will ensure workflows run on the HS for the types specified, including executing discovery rules, monitors, rules, tasks
- The MP will keep instances/relationships up to date with the real-world environment
- The MP will keep state correct and not show green when the object should be or really is red, or red when the object is green (there is also yellow)

54. References
- MP authoring guide
- Develop against OpsMgr SP1
- OpsMgr Authoring console
- Microsoft management packs as samples
- Blogs
- Feedback, or for more information on building MPs: vladj@microsoft.com
MP authoring links and information
OpsMgr 2007 SP1 RTM bits:
- Upgrade to be installed on RTM: http://www.microsoft.com/downloads/details.aspx?FamilyId=EDE38D83-32D1-46FB-8B6D-78FA1DCB3E85&displaylang=en
- Full install that includes SP1: http://www.microsoft.com/downloads/details.aspx?FamilyId=C3B6A44C-A90F-4E7D-B646-957F2A5FFF5F&displaylang=en
Here is where you can get the latest authoring console:
- http://download.microsoft.com/download/f/4/3/f438d6a0-290c-42b8-8f9c-c6660f89e1aa/OpsMgr07_x64_AuthConsole.exe
- http://download.microsoft.com/download/f/4/3/f438d6a0-290c-42b8-8f9c-c6660f89e1aa/OpsMgr07_x86_AuthConsole.exe
Here are our authoring guides:
- Operations Manager 2007 Management Pack Authoring Guide: http://download.microsoft.com/download/7/4/d/74deff5e-449f-4a6b-91dd-ffbc117869a2/OM2007_AuthGuide.doc
- The report authoring guide for Operations Manager 2007: http://download.microsoft.com/download/7/4/d/74deff5e-449f-4a6b-91dd-ffbc117869a2/OpsMgr2007_RprtGuide.doc
MP authoring resources and lots of samples: http://www.authormps.com
Blogs:
- OpsMgr++: http://blogs.msdn.com/boris_yanushpolsky
- Notes on System Center: http://blogs.msdn.com/mariussutara/default.aspx
- System Center modules: http://blogs.msdn.com/sampatton/default.aspx
- SDK and connectors: http://blogs.msdn.com/jakuboleksy/default.aspx
- Report authoring: http://blogs.msdn.com/eugenebykov/
- Clive's Support blog: http://blogs.technet.com/cliveeastwood/default.aspx
- PowerShell: http://blogs.msdn.com/scshell/
Here is some info on how to generate the XML files for MPs that are imported into your OpsMgr system, as the Microsoft MPs are very complete samples. All MPs can be found at our catalog site here: http://www.microsoft.com/technet/prodtechnol/scp/catalog.aspx
These XML files are useful to view in the authoring console or as XML directly. You can export them via the command shell or the SDK. In both cases, you need to find the management pack you want to export through whatever method you prefer and then export it as below.
Command Shell Example:
    Get-ManagementPack -Name "Microsoft.SystemCenter.2007" | Export-ManagementPack -Path "C:\"
SDK Example:
    using System;
    using System.Collections.ObjectModel;
    using Microsoft.EnterpriseManagement;
    using Microsoft.EnterpriseManagement.Configuration;
    using Microsoft.EnterpriseManagement.Configuration.IO;
    namespace WorkSamples
    {
        partial class Program
        {
            static void ExportManagementPack()
            {
                // Connect to the local management group
                ManagementGroup mg = new ManagementGroup("localhost");
                // Get any management pack you want (name, public key token, version)
                ManagementPack managementPack = mg.GetManagementPack("Microsoft.SystemCenter.2007", "31bf3856ad364e35", new Version("6.0.5000.0"));
                // Provide the directory you want the file created in
                ManagementPackXmlWriter xmlWriter = new ManagementPackXmlWriter(@"C:\");
                xmlWriter.WriteManagementPack(managementPack);
            }
        }
    }

55. Questions
