
PhD Dissertation Defense Energy-Efficient Pro-active Techniques for Safe & Survivable Cyber-Physical Systems

By: Tridib Mukherjee. Committee: Prof. Sandeep Gupta, Prof. Karamvir Chatha, Prof. Partha Dasgupta, Prof. Daniel Stanzione.


Presentation Transcript


  1. PhD Dissertation Defense: Energy-Efficient Pro-active Techniques for Safe & Survivable Cyber-Physical Systems By: Tridib Mukherjee Committee: Prof. Sandeep Gupta, Prof. Karamvir Chatha, Prof. Partha Dasgupta, Prof. Daniel Stanzione

  2. Outline • Cyber-Physical Systems (CPS) • Crisis response planning and preparedness • Energy-efficient job management in data centers • Ad hoc Networks • Conclusions/ Future Research Directions

  3. Cyber Physical Systems (CPS) • Dynamic distributed systems that monitor, coordinate, control, integrate and facilitate physical processes. • The physical environment can include human inhabitants. • Computing entities are autonomic and embedded. • Operations in computing entities affect the physical environment and vice versa. • Key issues: physical interactions; critical applications. From interactive to pro-active systems • Pro-active systems can anticipate an event and act in advance to avoid or minimize its consequences. • Migration from interactive to pro-active computing for “systems intimately connected to the world around” was suggested in 2000 by David Tennenhouse, Director of Intel Research. • Pro-active CPS can involve actions in both the physical and the cyber world. • Examples of pro-active operations in the physical environment: pre-setting the cooling in a data center to avoid equipment redline temperatures; preparedness drills for responding to crises/disasters. (Figures courtesy: Idealog Magazine; Vanderbilt University & Drexel University)

  4. Research Problem and Approach • Three major problems of pro-active operations in CPS: • they can be difficult to achieve under uncertain environments (e.g. crisis response); • they can lead to a high cost of operation for large-scale CPS (e.g. data centers); • they can be highly energy-inefficient for energy-constrained computing entities (e.g. ad hoc networks). • Research approach: constraint-based optimization to balance pro-activity for three applications with different objectives and requirements: • Crisis preparedness: pro-active planning and evaluation of crisis response when actions’ outcomes are uncertain, while meeting real-time constraints for human survivability. • Data centers: pro-active job scheduling to dynamically reduce cooling demands while meeting thermal constraints for equipment safety. • Ad hoc networks: pro-active route management to meet end-to-end reliability constraints while minimizing the energy overhead. Key question: how to balance pro-activity depending on system requirements?

  5. Research Contributions

  6. Outline • Cyber-Physical Systems (CPS) • Crisis response planning and preparedness • Energy-efficient job management in data centers • Ad hoc Networks • Conclusions/ Future Research Directions

  7. Importance of Crisis Preparedness • In 2004, over $4 billion of Homeland Security Grants allocated for assistance to the first responders. • In 2005, $7.4 billion fund budgeted for Emergency Preparedness and Response (around 20% of the total budget). • over $3.5 billion (50%) budgeted for assistance to first responders. • Since March 1, 2003, approximately $8 billion awarded to state, tribal and local governments to prevent, prepare for, respond to and recover from acts of terrorism and all hazards.

  8. Critical Application: Fire in Building • Detection of the critical event: detect the fire using information from sensors; notify 911 and provide information to the first responders. • Detection of additional critical events: detect trapped people (human interaction with trapped people and rescuers). • Crisis response: • Analyze the spatial properties: how to reach the source of the fire; which exits are closest; whether the closest exit is free to get out. • Determine the required actions: instruct the inhabitants to go to the nearest safe place; coordinate with the rescuers to evacuate (normally using ad hoc networks). • Phases shown: detection, crisis response, preparedness, recovery, and evaluation of crisis response; this requires pro-active evaluation and planning of the crisis response. • Survivability: effectiveness of the response plan in avoiding disasters (life/property losses). Key question: how to evaluate and plan actions with uncertain outcomes?

  9. Criticality & Critical Event Management • Critical events • Cause emergencies/crises. • Lead to loss of lives/property. • Criticality • Effects of critical events on the smart-infrastructure. • Critical state: state of the system under criticality. • Window-of-opportunity (W): temporal constraint for criticality. • Survivability: effectiveness of the criticality response actions in minimizing disasters. [State diagram: a critical event moves the system from the NORMAL STATE to a CRITICAL STATE; a timely criticality response within the window-of-opportunity returns it to the NORMAL STATE; mismanagement of any criticality leads to DISASTER (loss of lives/property).]

  10. Related Work • Preparedness measures: unaware of uncertainties; cumbersome documents. • Model-based verification: different modeling options • Hybrid automata can capture continuous-time dynamics in the physical world. • A special case is timed I/O automata, which can capture time variation in the system. • Recent work has focused on probabilistic timed automata. • We use a Markov Decision Process, which can enable developing stochastic planning policies. [Figure: related work placed by level of abstraction (human, cyber, physical, cyber-physical); labels include reliability, formal modeling, QoS, preparedness drills, physical lay-out design, real-time, objective evaluation, personnel training, pro-activity and synergistic planning.]

  11. Background on Model-based Verification/Analysis • Model-based analysis is normally used to verify critical systems such as avionics. • No need to generate actual scenarios that put lives/property at risk. • Formal models abstract the system behavior. • Expected system properties depend on the requirements. • Formal models are analyzed through model checking to verify the system properties. • We use model-based analysis to evaluate the effectiveness of crisis response processes, using a Markov Decision Process based Criticality Response Model (CRM) and the Criticality Response Evaluation Tool (CRET). • CRM can also be used to develop Criticality Response Planning (CRP) policies. [Figure: system behavior is abstracted into formal models and system requirements into expected properties; model checking performs property and requirement verification, here criticality response evaluation.]

  12. Proposed Markov Decision Process: Criticality Response behavior Model (CRM) • State-based stochastic model • The system is in different critical states. • A state represents the combination of criticalities in the system. • States are organized in a hierarchical manner. • A level in the hierarchy represents the number of criticalities in each state in that level. • The normal state has level 0 (i.e. there are no criticalities in the normal state). • Critical Events • Make state transitions down the hierarchy. • Associated with criticality characteristics: • window-of-opportunity • probability of the critical event • time to detect the criticality • Mitigative Link (ML) • Corresponds to response actions. • Makes state transitions up the state hierarchy. • Associated with response action characteristics: • probability of the action’s success, considering uncertainties due to human involvement • time to complete the action • Survivability: the probability of reaching the normal state depends on the MLs’ success probabilities, the additional criticality probabilities and conformity to the window-of-opportunity. T. Mukherjee, K. Venkatasubramanian and S. K. S. Gupta, "Performance Modeling of Critical Event Management for Ubiquitous Computing Applications", Proceedings of ACM MSWiM (MSWiM'06), Torremolinos, Spain, October 2006.

  13. Reachability to the Normal State • Reachability to the normal state sn is evaluated from any arbitrary critical state s. • An action’s Q-value (qualifiedness) is determined by the probability of reaching the normal state when the action is performed. • s’ is an immediate upstream state when action a is performed at s; p(s, a, s’) is the probability of that transition. • The Q-value of an action distinguishes three cases: s’ = sn; s’ ≠ sn and the window-of-opportunity (WOOP) is met; WOOP is not met. • The probability of reaching the normal state from state i combines: the probability of a criticality at state i; the probability of reaching the normal state if any additional criticality occurs at state i; and the probability of reaching the normal state if no additional criticality occurs at state i. • The normal state is stochastically reachable from a state iff the maximum Q-value from that state is non-zero.
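
The recursion described on this slide can be sketched as follows. This is only an illustrative reading of the slide, not the dissertation's exact formulation: the toy criticalities, the action table, the single global window-of-opportunity `WINDOW`, and the names `reach` and `q_value` are all assumptions made for the example.

```python
from functools import lru_cache

NORMAL = frozenset()   # normal state: no active criticalities
WINDOW = 6             # hypothetical window-of-opportunity, in time units

# Hypothetical mitigative links: action -> (criticality mitigated, success probability, duration)
ACTIONS = {
    "extinguish": ("fire", 0.8, 3),
    "rescue": ("trapped", 0.9, 2),
}
# Hypothetical probability that an additional criticality occurs at a critical state
EVENT_PROB = {"trapped": 0.2}


@lru_cache(maxsize=None)
def reach(state, elapsed):
    """Probability of reaching the normal state from `state` after `elapsed` time units."""
    if state == NORMAL:
        return 1.0
    # Additional criticalities that are not active yet in this state.
    events = {c: p for c, p in EVENT_PROB.items() if c not in state}
    p_none = 1.0 - sum(events.values())
    # No additional criticality: follow the action with the maximum Q-value.
    best = max((q_value(state, a, elapsed) for a in ACTIONS), default=0.0)
    # An additional criticality moves the system one level down the hierarchy.
    return p_none * best + sum(p * reach(state | {c}, elapsed) for c, p in events.items())


def q_value(state, action, elapsed):
    """Q-value of `action`: zero if not applicable or if the window would be missed."""
    criticality, p_success, duration = ACTIONS[action]
    if criticality not in state or elapsed + duration > WINDOW:
        return 0.0
    return p_success * reach(state - {criticality}, elapsed + duration)
```

For this toy model, `reach(frozenset({"fire"}), 0)` evaluates to about 0.784, so the normal state is stochastically reachable from the fire-only state; any state whose maximum Q-value is zero would not be.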

  14. CRP strategies • Optimal – at each state select action with max Q-value. • Greedy – at each state, select action with optimum values of immediate parameters • e.g. Minimum Time (MT), Maximum Probability (MP), Maximum number of Mitigated Criticalities (MMC). • Markov Decision Planning (MDP) – At each state, select action with maximum utility • utility uses the state-based stochastic model parameters.
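
As a rough illustration of how the optimal and greedy strategies differ, the sketch below picks one action from a set of candidates under each selection rule. The `Candidate` fields, the example actions and their numbers are hypothetical; the MDP-based strategies are not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    q_value: float      # probability of reaching the normal state via this action
    time: float         # time to complete the action
    probability: float  # success probability of the action itself
    mitigated: int      # number of criticalities the action mitigates


# Each strategy is a key function; the action with the largest key is selected.
STRATEGIES = {
    "optimal": lambda c: c.q_value,   # maximum Q-value
    "MT": lambda c: -c.time,          # Minimum Time
    "MP": lambda c: c.probability,    # Maximum Probability
    "MMC": lambda c: c.mitigated,     # Maximum number of Mitigated Criticalities
}


def select(candidates, strategy):
    """Return the candidate action chosen by the named strategy."""
    return max(candidates, key=STRATEGIES[strategy])


# Hypothetical candidate actions at one critical state.
EXAMPLE = [
    Candidate("evacuate", q_value=0.70, time=5, probability=0.90, mitigated=1),
    Candidate("suppress", q_value=0.60, time=2, probability=0.70, mitigated=2),
    Candidate("shelter", q_value=0.50, time=3, probability=0.95, mitigated=1),
]
```

With these numbers the greedy rules disagree with the optimal one: MT and MMC pick `suppress`, MP picks `shelter`, and only the Q-value rule picks `evacuate`.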

  15. MDP-based CRP strategies • At a state, an action has a utility based on the action’s probability, its reward, and the expected maximum utility from the next state. • An action’s reward function can be a combination of the associated parameters. • Locally maximum criticality Mitigation Per unit Time (LMPT) • Reward = number of criticalities mitigated per unit time. • No knowledge of subsequent criticalities in the reward. • Subsequent Criticality Aware locally maximum criticality Mitigation Per unit Time (SCAMPT) • The reward is the same as in LMPT except that the probabilities of subsequent criticalities are taken into account. Tridib Mukherjee and Sandeep K. S. Gupta, "CRM: A Formal Method to Model & Evaluate Crises Response of Distributed Cyber-Physical Systems", under review in TPDS.

  16. CRM for fire emergencies in Offshore Oil & Gas Production Platforms (OGPP) • Criticalities • c1: Fire Alarm. • c2: Imminent danger, e.g. health hazards. • c3: Assistance required to others, e.g. trapped personnel. • c4: Evacuation path not tenable. • Window-of-opportunity: survival time under asphyxiation. • State transition probabilities are derived from the established probability distribution in [1]. • Enables objective evaluation of criticality response in OGPP to improve crisis preparedness. [Figure: CRM state hierarchy from "Fire Alarm" through its combinations with imminent danger, assistance required and non-tenable path, down to the state with all four criticalities, with a transition probability on each link.] [1] D. G. DiMattia, F. I. Khan, and P. R. Amyotte, "Determination of human error probabilities for offshore platform musters", Journal of Loss Prevention in the Process Industries, vol. 18, pp. 488–501, 2005. Tridib Mukherjee and Sandeep K. S. Gupta, "A Modeling Framework for Evaluating Effectiveness of Smart-Infrastructure Crises Management Systems", 2008 IEEE International Conference on Technologies for Homeland Security (HST'08), Waltham, MA, USA, April 2008.

  17. Sample Q-value Analysis • Preparedness: Q-value based analysis allows comparison among plans for • Different numbers of criticalities • Different detection and action completion times • Different states (i.e. different combinations of simultaneous criticalities) • Other applications: resource access control to facilitate the planned actions under emergencies.

  18. Criticality Response Evaluation Tool (CRET) • Model representation: AADL-based criticality response system architecture specification, with an AADL-based CRM specification and an AADL-based CRP specification. • Can specify any response planning policy, transcending the proposed CRP strategies. • Model parsing: AADL OSATE analysis plug-ins. • Model processing: XML representation and analysis software using Matlab; Q-value analysis. • Preparedness: check reachability to the normal state based on Q-value analysis. [Figure: system architecture with criticality detection/monitoring components (Criticality 1 to Criticality n), a model-based decision-making component, and response actuation components (response to Criticality 1 to Criticality n).] Tridib Mukherjee and Sandeep K. S. Gupta, "CRET: A Crisis Response Evaluation Tool to Improve Crisis Preparedness", 2009 IEEE International Conference on Technologies for Homeland Security (HST'09), Waltham, MA, USA, May 2009.

  19. Summary of Contributions • Crisis Response Model (CRM) • Markov decision process based modeling of crisis response • Development of Q-value as evaluation criterion for reachability to the normal state • Crisis Response Planning (CRP) • Optimal and naïve (greedy) strategies • Markov decision planning strategies • Crisis Response Evaluation Tool (CRET) • Objective evaluation of crisis response Publications: T. Mukherjee, K. Venkatasubramanian and S. K. S. Gupta, "Performance Modeling of Critical Event Management for Ubiquitous Computing Applications", Proceedings of ACM MSWiM (MSWiM'06), Torremolinos, Spain, October 2006. Tridib Mukherjee and Sandeep K. S. Gupta, "CRM: A Formal Method to Model & Evaluate Crises Response of Distributed Cyber-Physical Systems", under review in TPDS. Tridib Mukherjee and Sandeep K. S. Gupta, "A Modeling Framework for Evaluating Effectiveness of Smart-Infrastructure Crises Management Systems", 2008 IEEE International Conference on Technologies for Homeland Security (HST'08), Waltham, MA, USA, April 2008. Tridib Mukherjee and Sandeep K. S. Gupta, "CRET: A Crisis Response Evaluation Tool to Improve Crisis Preparedness", 2009 IEEE International Conference on Technologies for Homeland Security (HST'09), Waltham, MA, USA, May 2009. K. Venkatasubramanian, T. Mukherjee, and S. K. S. Gupta, "CAAC - An Adaptive and Proactive Access Control Approach for Emergencies for Smart Infrastructures", accepted in the Special Issue on Adaptive Security Systems in ACM Transactions on Autonomic and Adaptive Systems (TAAS). S. K. S. Gupta, T. Mukherjee, and K. Venkatasubramanian, "Criticality Aware Access Control Model For Pervasive Applications", Proceedings of 4th IEEE Conf. on Pervasive Computing (PERCOM), Pisa, Italy, 2006.

  20. Outline • Cyber-Physical Systems (CPS) • Crisis response planning and preparedness • Energy-efficient job management in data centers • Ad hoc Networks • Conclusions/ Future Research Directions

  21. Importance of the Problem • Cooling is the chief driver of increased data center construction cost, costing up to $5000 per square foot in initial purchase price. • Cooling is one of the leading contributors to ongoing total cost of ownership, costing one half to one watt per watt spent on computation. • If we can eliminate even 25% of total cooling costs, that can translate to a $1-$2 million annual cost reduction in a single large data center.

  22. Related Work [Figure: thermal management techniques placed along a software dimension (application, middleware, O/S, firmware) and a physical dimension (IC, case/chassis, room), separated into the proactive approach and reactive solutions. Techniques shown: thermal-aware data center job scheduling, thermal-aware VM, CPU load balancing, dynamic voltage scaling, dynamic frequency scaling, fan speed scaling, circuitry redundancy.]

  23. Heat Interferences in Data Centers • Safety: inlet temperatures should stay within the redline temperature to avoid equipment failure. • Problem: cooling has to be pro-actively set very low to keep all inlet temperatures under the redline. • Solution: proactive spatio-temporal job scheduling to minimize interference and cooling demands. [Figure: self interference and cross interference between equipment outlets and inlets.]

  24. Typical HPC Job Characteristics • Job execution times are usually overestimated during submission in HPC data centers. • Jobs can be spread over time to reduce peak utilization • Trade-off with throughput, turn-around time and resource utilization. From job traces at ASU HPC data center

  25. Conventional Spatial and Temporal Scheduling

  26. Balancing Utilization Over Time

  27. Conceptual overview of thermal-aware job scheduling • Balancing utilization over time reduces the peak computing resource utilization, leaving room for thermal-aware spatial scheduling at all times. • Spatial job scheduling (placement) determines the temperature distribution at any time, using a linear thermal model. • The temperature distribution determines the equipment peak air inlet temperature. • The peak air inlet temperature determines the upper bound on the CRAC temperature setting. • The CRAC temperature setting determines its efficiency (Coefficient of Performance; source: HP). • The lower the peak inlet temperature, the higher the CRAC efficiency. • There is a spatio-temporal job schedule that minimizes the total energy (cooling + computing) consumption. Find it! Q. Tang, T. Mukherjee, S. K. S. Gupta, and P. Cayton, "Sensor-based Fast Thermal Evaluation Model for Energy-efficient High-performance Datacenters", in the International Conf. Intelligent Sensing Info. Proc. (ICISIP2006), Dec 2006. T. Mukherjee, G. Varsamopoulos, S. K. S. Gupta, and S. Rungta, "Measurement-based Power Profiling of Datacenter Equipment" (Extended Abstract), in the Workshop on Green Computing (with CLUSTER2007), Austin, USA, Sep 2007.
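
The chain on this slide (placement, temperature distribution, peak inlet temperature, CRAC setting, CRAC efficiency) can be illustrated with a small calculation. This is a sketch under stated assumptions: the quadratic CoP curve is a fit commonly attributed to measurements at an HP data center and its coefficients should be treated as an assumption here; the heat matrix, power values and redline temperature are made up; `cop` and `cooling_power` are hypothetical names.

```python
def cop(supply_temp_c):
    """CRAC Coefficient of Performance at a supply temperature in deg C (assumed quadratic fit)."""
    return 0.0068 * supply_temp_c ** 2 + 0.0008 * supply_temp_c + 0.458


def cooling_power(server_power_w, heat_matrix, redline_c):
    """Return (cooling power in W, supply temperature in deg C) for one placement.

    Linear thermal model (assumed form): the inlet temperature rise of server i
    is sum_j heat_matrix[i][j] * server_power_w[j].
    """
    rises = [sum(d * p for d, p in zip(row, server_power_w)) for row in heat_matrix]
    # Highest supply temperature that keeps the hottest inlet at the redline.
    supply = redline_c - max(rises)
    return sum(server_power_w) / cop(supply), supply


# Hypothetical 2-server room: entry [i][j] is deg C of inlet rise at i per watt drawn at j.
HEAT_MATRIX = [[0.02, 0.01],
               [0.01, 0.03]]

placement_a = cooling_power([300, 100], HEAT_MATRIX, redline_c=25)  # load on server 0
placement_b = cooling_power([100, 300], HEAT_MATRIX, redline_c=25)  # load on server 1
```

Both placements draw the same computing power, but putting the load on the server that recirculates less heat allows a supply temperature of about 18 °C instead of 15 °C, and therefore a lower cooling power.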

  28. Thermal-aware Job Scheduling Problem PROBLEM: Given a set of incoming jobs, find a job schedule (i.e. job start times) and placement (i.e. server assignment) that minimize the total data center energy consumption subject to meeting the job deadlines (submitted times for execution). This requires 3D (job x server x time) decision-making. • Terms in the formulation: cooling energy, computing energy, job migration overhead, and the supply temperature upper bound. • Capacity constraint: the servers assigned stay within the servers available. • Servers required: the required number of servers is assigned to each job. • Deadline constraint: job finish time is earlier than the deadline. • Arrival constraint: job start time is later than the arrival time. T. Mukherjee, A. Banerjee, G. Varsamopoulos, and S. K. S. Gupta, "Spatio-temporal Thermal-Aware Job Scheduling to Minimize Energy Consumption in Virtualized Heterogeneous Data Centers", Elsevier Journal on Computer Networks (ComNet), Special Issue on Virtualized Data Centers, accepted (2009).

  29. Thermal-aware Job Scheduling Algorithms SCINT Algorithm: heuristic solution (genetic algorithm). • Takes a feasible solution and performs mutations for a certain number of iterations. • Spreads the jobs over time while meeting the deadlines. • Offline in nature, requiring the job backlog information. • Takes hours of operation. EDF-LRH Algorithm: tries to mimic the behavior of SCINT by spreading jobs using the Earliest Deadline First (EDF) scheduling approach. • Places jobs on the servers contributing the Lowest Recirculated Heat (LRH). • Online in nature, maintaining EDF job queues as and when jobs arrive. • Takes milliseconds of operation. FCFS Algorithm: keeps the conventional temporal scheduling approach but uses thermal-aware job placement techniques for energy savings. • Places jobs on the servers contributing the Lowest Recirculated Heat (LRH). • Online in nature, taking milliseconds of operation. T. Mukherjee, A. Banerjee, G. Varsamopoulos, and S. K. S. Gupta, "Spatio-temporal Thermal-Aware Job Scheduling to Minimize Energy Consumption in Virtualized Heterogeneous Data Centers", Elsevier Journal on Computer Networks (ComNet), Special Issue on Virtualized Data Centers, accepted (2009).
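
A minimal sketch of the EDF-LRH idea at a single scheduling instant is given below: jobs are taken in earliest-deadline-first order and placed on the idle servers that contribute the least recirculated heat. It is a simplification rather than the published algorithm (no time slots, no job durations), and the function name `edf_lrh`, the job tuples and the recirculation values are assumptions for illustration.

```python
import heapq


def edf_lrh(jobs, recirculation):
    """Place jobs on idle servers at one scheduling instant.

    jobs: list of (deadline, name, servers_needed) tuples.
    recirculation: dict mapping server -> recirculated heat contribution.
    Returns (placement, waiting): servers per placed job, and jobs left in the queue.
    """
    queue = list(jobs)
    heapq.heapify(queue)                                  # EDF queue: earliest deadline first
    idle = sorted(recirculation, key=recirculation.get)   # LRH order: lowest recirculated heat first
    placement, waiting = {}, []
    while queue:
        _deadline, name, needed = heapq.heappop(queue)
        if needed <= len(idle):
            placement[name] = idle[:needed]               # take the coolest-contributing idle servers
            idle = idle[needed:]
        else:
            waiting.append(name)                          # not enough idle servers right now
    return placement, waiting


# Hypothetical jobs and per-server recirculation contributions.
JOBS = [(30, "B", 1), (10, "A", 2), (20, "C", 2)]
RECIRCULATION = {"s1": 0.4, "s2": 0.1, "s3": 0.3, "s4": 0.2}
placement, waiting = edf_lrh(JOBS, RECIRCULATION)
```

Here job A (earliest deadline) receives the two servers with the lowest recirculation, C receives the remaining two, and B waits for the next scheduling instant.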

  30. Total Energy Consumption • SCINT saves up to 60% of energy consumption. • EDF-LRH mimics the behavior of SCINT, especially at low average data center utilization.

  31. Summary of Contributions • Problem formulation to minimize energy consumption in data centers • Spatio-temporal thermal-aware job scheduling algorithms • Offline algorithm SCINT • Online algorithm EDF-LRH • Measurement-based power profiling of data center equipment • Linear power model • Preliminary software architecture • Configure MOAB for thermal-aware job placement. Publications: Q. Tang, T. Mukherjee, S. K. S. Gupta, and P. Cayton, "Sensor-based Fast Thermal Evaluation Model for Energy-efficient High-performance Datacenters", in the International Conf. Intelligent Sensing Info. Proc. (ICISIP2006), Dec 2006. T. Mukherjee, G. Varsamopoulos, S. K. S. Gupta, and S. Rungta, "Measurement-based Power Profiling of Datacenter Equipment" (Extended Abstract), in the Workshop on Green Computing (with CLUSTER2007), Austin, USA, Sep 2007. T. Mukherjee, A. Banerjee, G. Varsamopoulos, and S. K. S. Gupta, "Spatio-temporal Thermal-Aware Job Scheduling to Minimize Energy Consumption in Virtualized Heterogeneous Data Centers", Elsevier Journal on Computer Networks (ComNet), Special Issue on Virtualized Data Centers, accepted (2009). T. Mukherjee, Q. Tang, C. Ziesman, S. K. S. Gupta, and P. Cayton, "Software Architecture for Dynamic Thermal Management in Data Centers", International Conference on Communication Systems Software (COMSWARE), Bangalore, India, Jan 2007. T. Mukherjee, Q. Tang, C. Ziesman, and S. K. S. Gupta, "Dynamic Thermal Control and Management towards Reducing Utility Cost in Data Centers", International Workshop on Feedback Control Implementation and Design in Computing Systems and Networks (FeBID), 2006. T. Mukherjee, S. K. S. Gupta, and P. Cayton, "Demo - Temperature-aware job placement in data centers using Moab cluster management software", Research@Intel Day, Intel, Santa Clara, June 2006.

  32. Outline • Cyber-Physical Systems (CPS) • Crisis response planning and preparedness • Energy-efficient job management in data centers • Ad hoc Networks • Conclusions/ Future Research Directions

  33. Optimum Tuning of Pro-active Route Maintenance in ad-hoc networks

  34. Application-aware Adaptive Optimization Sub-layer

  35. Proactive Routing Protocol Classification and Research Contributions • Proactive protocol (PP) classes: • PP+B: employs only beacons (BFST, SS-SPST etc.). • PP+BT: employs beacons and triggered updates (WRP, OLSR etc.). • PP+BP: employs beacons and periodic updates (FSR, IARP etc.). • PP+BTP: employs beacons, periodic and triggered updates (DSDV, TBRPF etc.). • Contributions: • Analytical model for determining the optimum β & φ for different proactive protocols [1, 2, 3]. • Developing a PP+B type of protocol that maintains energy-efficient routes. • Improves Self-Stabilizing Shortest Path Spanning Tree (SS-SPST) for energy-efficiency [4, 5]. [1] T. Mukherjee, S. K. S. Gupta, and G. Varsamopoulos, "Energy Optimization for Proactive Unicast Route Maintenance in MANETs under End-to-End Reliability Requirements", Elsevier Journal on Performance Evaluation, Vol. 66, Issue 3-5, pages 141-157, Mar 2009. [2] T. Mukherjee, S. K. S. Gupta, and G. Varsamopoulos, "Analytical Model for Optimizing Periodic Route Maintenance in Proactive Routing for MANETs", in the Proc. of ACM MSWiM, Crete Island, Greece, Oct 2007. [3] T. Mukherjee, S. K. S. Gupta, and G. Varsamopoulos, "Application-Aware Adaptive Tuning of Proactive Routing Protocols for MANETs", under review in Transactions on Autonomic and Adaptive Systems (TAAS). [4] T. Mukherjee, G. Varsamopoulos, and S. K. S. Gupta, "Self-Managing Energy-Efficient Multicast Support in MANETs under End-to-End Reliability Constraints", Elsevier Journal on Computer Networks (ComNet), Vol. 53, Issue 10, pages 1603-1627, July 2009. [5] T. Mukherjee, G. Sridharan, and S. K. S. Gupta, "Energy-Aware Self-Stabilization in Mobile Ad Hoc Networks: A Multicasting Case Study", in the 21st IEEE Int'l Parallel and Distributed Processing Symposium (IPDPS), Long Beach, California, 26-30 March 2007.

  36. Outline • Cyber-Physical Systems (CPS) • Crisis response planning and preparedness • Energy-efficient job management in data centers • Ad hoc Networks • Conclusions/ Future Research Directions

  37. Conclusions • Pro-activity needs to be incorporated in a synergistic manner to ensure safety and survivability in CPSs. • Pro-activity requires handling the uncertain outcomes of pro-active actions. • Pro-activity leads to high energy consumption. • Crisis preparedness and planning for human survivability under crisis • Abstracting the crisis response behavior of the system as a whole can take into account the human uncertainties in the physical world. • This facilitates stochastic planning and evaluation of the crisis response. • Model-based verification and analysis enables evaluation of the crisis response in an objective manner. • Ad hoc networks: dynamic determination of the period of route maintenance can effectively balance the energy-reliability trade-off. • Data center thermal management for thermal safety of the equipment • Dynamically reducing cooling demands through thermal-aware job scheduling and placement can save up to 60% of the energy consumption while ensuring the users’ perception of job completion.

  38. Future Research Directions • Abstract modeling for CPS • Physical interference modeling, which can be governed by differential equations for physical dynamics. • Crisis preparedness • Consider action cost in the analysis of response processes: enhance the actions’ Q-value with the cost. • Model dynamics in complex scenarios: a dynamic, unpredictable state-space instead of a static, predictable state-space. • Model composition in distributed and composite systems: derive a system-level global stochastic model by combining multiple sub-system-level local stochastic models (e.g. a fire in a hospital requires two sub-systems: i) fire management; and ii) medical emergency management). • Data center • Integration of power management with thermal-aware job scheduling. • Integration of cooling control with thermal-aware job scheduling to develop a synergistic control architecture.

  39. Questions ?? Impact Lab (http://impact.asu.edu) Creating Humane Technologies for Ever-Changing World

  40. Background • Pro-active systems can anticipate an event and act in advance to avoid or minimize the consequences of the event. • Pro-active CPS is necessary to address the following design requirements: • Safety: the impact of the physical interactions should remain within a desirable limit to avoid any damage to the physical and computing entities. • Survivability: the operations in the physical and cyber subsystems should cause no harm to the human inhabitants. • Migration from interactive to pro-active computing for “systems intimately connected to the world around” was suggested in 2000 by David Tennenhouse, Director of Intel Research. • Pro-active CPS can involve actions in both the physical and the cyber world. • Examples of pro-active operations in the physical environment • Safety: pre-setting the cooling in a data center to avoid equipment redline temperatures. • Survivability: preparedness drills for responding to crises/disasters. • Examples of pro-active operations in the cyber world • Safety: scheduling jobs in data centers such that equipment redline temperatures are avoided. • Survivability: pro-active route maintenance in ad hoc networks employed for crisis response, to ensure low latency for information exchange.

  41. Example Cyber-Physical Systems • Utilities • Advanced Electric Power Grid • Water Distribution • Pressure Pipes Gas/Oil • Search & Rescue • Crisis Response, etc. • Monitoring Systems • Pervasive Health Monitoring • Monitoring of fire and chemical radiation plumes • Wild-life Monitoring • Forest Monitoring

  42. Design Decisions • Design requirements shown: security, survivability, reliability, safety, real-time, quality. • Critical applications should be able to avoid/handle dangerous physical conditions (e.g. life/property losses). • Interactions between physical and cyber components should not detrimentally impact the physical conditions. • This dissertation focuses on the safety & survivability of CPS.

  43. Physical Interactions (Interference)

  44. Reachability Metric • Reachability is expressed in terms of the Q-value, or qualifiedness, of actions: the probability of reaching the normal state based on • probabilities of Mitigative Links (MLs); • probabilities of Critical Links (CLs) at intermediate states; • conformity to timing requirements. • The Q-value is a quantitative measure to evaluate crisis response.

  45. Execution Times

  46. AADL based criticality response system architecture specification • CRM components (criticalities, critical states, state transitions, response actions, windows of opportunity, action times) are mapped to AADL constructs (events in system, system modes, event dependent mode transitions, mode properties). [Figure: system architecture with criticality detection/monitoring components (Criticality 1 to Criticality n), a decision-making component, and response actuation components (response to Criticality 1 to Criticality n).]

  47. Criticality Specification [Figure: MCMA components (criticalities, critical states, state transitions, response actions, windows of opportunity, action times) mapped to AADL constructs (events in system, system modes, event dependent mode transitions, mode properties).]

  48. State and State Transition Specification [Figure: same mapping of MCMA components to AADL constructs as on the previous slide.]

  49. State and State Transition Specification [Figure: same mapping of MCMA components to AADL constructs as on the previous slide.]

  50. Sample Schema for Intermediate XML representation Allows any expressions to specify policies
