Optimal redundancy allocation for information technology disaster recovery in the network economy

Benjamin B.M. Shao

IEEE Transactions on Dependable and Secure Computing, Vol. 2, No. 3, July-September 2005

Presented by: Derek KD Jiang 江坤道

- Introduction
- Redundancy for IT disaster recovery
- Redundancy allocation model
- Solution procedure
- Examples
- Conclusion

- Modern organizations have become increasingly reliant on IT to facilitate business operation.
- The issue of how to strengthen IT capability so that a company can prevent or quickly recover from disasters becomes a serious concern.

- Perform an impact analysis to:
  - Identify the disasters likely to occur in the environment.
  - Evaluate how vulnerable IT functions are to those disasters.
  - Take the necessary measures to protect those IT functions according to their importance.

- This paper incorporates redundancy into critical IT functions and aims to maximize their survivability against potential disasters.

- Adopting a cluster-centric approach, this paper concentrates on managing resources around independent clusters of IT functions, where each cluster is assigned its own dedicated solutions.
- An optimization model is proposed, taking into account the significance of IT functions, the cost of IT solutions, and the availability of resources subject to budget limitation.

Redundancy for IT disaster recovery

- Redundancy is a design principle of having one or more backup systems in case of failure of the main system.
- Using redundancy to prepare for disasters offers two potential advantages:
  - Proactive prevention
  - Reactive recovery

- The objective is to select among competing alternatives for redundancy level and reap the best returns from a limited budget.
- A quantitative model can provide the guidelines for allocating optimal redundancy levels to critical IT functions needing to be protected.

Redundancy allocation model

- Suppose an organization plans to add redundancy measures under a limited budget.
- Several possible disasters have been identified with the potential to affect IT functions and to cause business discontinuity.
- How should redundancy be allocated to IT functions so that survivability is maximized while the cost remains within budget?

- The redundancy allocation problem (RAP) is formulated below
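
The formulation itself did not survive in these notes. The sketch below is a reconstruction that agrees with the definitions that follow and with the numbers in the worked example later in the talk; it should not be read as the paper's typeset equations. The symbols w_m (weight of IT function m), p_d (probability of disaster scenario d), C_mi (cost of solution i for function m), n_m (number of candidate solutions) and B come from that example, while X_mi as a 0-1 selection variable and D as the number of scenarios are assumptions of this sketch.

```latex
\begin{align}
\max_{X}\quad & S = \sum_{m=1}^{M} w_m \sum_{d=1}^{D} p_d
      \left[ 1 - \prod_{i=1}^{n_m} \left( 1 - S_{mid} \right)^{X_{mi}} \right] \\
\text{s.t.}\quad & \sum_{i=1}^{n_m} X_{mi} \ge 1, \qquad m = 1, \dots, M \\
& \sum_{m=1}^{M} \sum_{i=1}^{n_m} C_{mi} X_{mi} \le B \\
& X_{mi} \in \{0, 1\}
\end{align}
```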

- Survivability Smid is defined here as the likelihood that IT asset i withstands disaster d and keeps IT function m operational.

IT function m fails against disaster d only when all of its selected solutions fail at the same time.

In other words, as long as one of the selected solutions survives the disaster, the IT function remains in operation.
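
The "fails only when all selected solutions fail" rule can be illustrated with a few lines of Python. This is a minimal sketch that assumes the selected solutions fail independently, which is what multiplying the failure probabilities implies; the function name is hypothetical, and the two survival rates are the flooding values of bridges 2 and 4 from the example later in the talk.

```python
from math import prod


def function_survivability(solution_survival_rates):
    """Probability that at least one selected solution survives a disaster.

    Assumes independent failures: the function fails only if every
    selected solution fails, so survivability = 1 - product of failures.
    """
    return 1.0 - prod(1.0 - s for s in solution_survival_rates)


# One solution alone: survivability equals its own survival rate.
single = function_survivability([0.09])

# Two redundant solutions: 1 - (0.91 * 0.79)
redundant = function_survivability([0.09, 0.21])
```

Adding the second solution raises the survivability under flooding from 0.09 to about 0.28, which is the gain that the budget constraint has to be traded against.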

- The selection constraint ensures that at least one solution is selected and allocated to each IT function. Notably, an IT function with a single solution, i.e., without redundancy, is allowed.
- The budget constraint indicates that the total cost cannot exceed the budget limit B.

Solution procedure

- The proposed model is a 0-1 integer programming problem with a nonlinear objective function.
- Due to the nonlinearity of the objective function, LR cannot be employed to tackle this problem.
- A partial enumeration procedure based on probabilistic dynamic programming is presented.

The sum of failure probabilities of each IT function due to any disasters.

The recursive formula, where m < M

- We define the state T as the available budget and the stage m as the IT function under consideration.
- Let Fm(T) be the failure rate of the system composed of IT functions m, m+1, …, M with available budget T.
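
The equations on these slides were lost, so the following is a sketch of the objective and of the recursion for m < M, reconstructed to agree with the worked example later in the talk rather than copied from the paper. Here v_mid = 1 - S_mid is the failure probability used in that example; the feasible set written as a calligraphic X_m(T), meaning the selections for function m that contain at least one solution and cost no more than T, is notation introduced for this sketch.

```latex
\begin{align}
F &= \sum_{m=1}^{M} w_m \sum_{d} p_d \prod_{i=1}^{n_m} v_{mid}^{X_{mi}},
     \qquad v_{mid} = 1 - S_{mid} \\
F_m(T) &= \min_{X_m \in \mathcal{X}_m(T)}
     \left\{ w_m \sum_{d} p_d \prod_{i=1}^{n_m} v_{mid}^{X_{mi}}
     + F_{m+1}\!\left( T - \sum_{i=1}^{n_m} C_{mi} X_{mi} \right) \right\},
     \qquad m < M \\
\mathcal{X}_m(T) &= \left\{ X_m \in \{0,1\}^{n_m} :
     \sum_{i=1}^{n_m} X_{mi} \ge 1,\ \sum_{i=1}^{n_m} C_{mi} X_{mi} \le T \right\}
\end{align}
```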

- For stage (IT function) m, the state (budget) T cannot exceed the total available budget B minus the minimum costs that must be allocated to stages 1, …, m-1.
- T must be at least equal to the cost of the least expensive solution in the current stage to ensure at least one solution for IT function m.

For T outside this range, Fm(T) is defined as 1, so that state is never chosen.
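
The two bounds on T can be computed mechanically. The snippet below is a small sketch with hypothetical helper names; the cost lists are those of the two-LAN example later in the talk, and stages are numbered from 1 as in the slides.

```python
def valid_budget_range(costs, stage, budget):
    """Return (lower, upper) bounds of the state T for a 1-based stage.

    upper: total budget minus the cheapest solution of every earlier stage.
    lower: the cheapest solution of the current stage.
    """
    earlier_stages = costs[:stage - 1]
    upper = budget - sum(min(stage_costs) for stage_costs in earlier_stages)
    lower = min(costs[stage - 1])
    return lower, upper


def failure_at(table, t, lower, upper):
    """Look up Fm(T); any T outside [lower, upper] is given failure rate 1."""
    return table[t] if lower <= t <= upper else 1.0


# Solution costs per IT function: bridges for LAN1, switches for LAN2.
costs = [[8, 2, 4, 6], [4, 6, 5]]

stage2_range = valid_budget_range(costs, stage=2, budget=14)  # (4, 12)
stage1_range = valid_budget_range(costs, stage=1, budget=14)  # (2, 14)
```

Returning 1 for out-of-range states means an infeasible split of the budget can never beat a feasible one, so no separate feasibility check is needed when comparing candidates.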

- Fm(T) of (4) deals with the risks of disaster occurrence and involves the calculation of the expected failure rate of IT function m given the remaining budget T.
- The computation starts at stage m = M, where only IT function M remains.
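
The equation for the initial stage is missing from these notes. A plausible form, reconstructed from the worked example later in the talk and offered only as a sketch, drops the look-ahead term because no IT function follows stage M; v_Mid = 1 - S_Mid denotes the failure probability of solution i for function M under scenario d.

```latex
\begin{equation}
F_M(T) = \min_{X_M} \; w_M \sum_{d} p_d \prod_{i=1}^{n_M} v_{Mid}^{X_{Mi}}
\quad \text{s.t.} \quad
\sum_{i=1}^{n_M} X_{Mi} \ge 1, \qquad
\sum_{i=1}^{n_M} C_{Mi} X_{Mi} \le T, \qquad
X_{Mi} \in \{0, 1\}
\end{equation}
```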

- The optimal objective function value F* is obtained as F1(B), representing the minimum overall failure rate of the whole system composed of all M IT functions with a budget of B.
- The original maximum overall survivability S* of RAP is then equal to 1 - F1(B).

Examples

- Two LANs (M=2) with weights w1=0.3 and w2=0.7, respectively.
- A flooding disaster occurs with a likelihood of 0.05 (i.e., p1=0.05 for flooding and p2=0.95 for no disaster).
- The organization considers incorporating redundant bridges into LAN1 and redundant switches into LAN2 with a budget of B=14.

- For LAN1
  - Four types of bridges are available (n1=4), with C11=8, C12=2, C13=4, and C14=6.
  - The survival rates are S111=0.1, S121=0.09, S131=0.15, and S141=0.21 (i.e., v111=0.9, v121=0.91, v131=0.85, v141=0.79).
  - Their availabilities when no disaster occurs are S112=0.9999, S122=0.9993, S132=0.9997, and S142=0.9995 (i.e., v112=0.0001, v122=0.0007, v132=0.0004, v142=0.0005).

- For LAN2
  - Three types of switches are available (n2=3), with C21=4, C22=6, and C23=5.
  - The survival rates are S211=0.06, S221=0.1, and S231=0.2 (i.e., v211=0.94, v221=0.9, v231=0.8).
  - Their availabilities when no disaster occurs are S212=0.9994, S222=0.9990, and S232=0.9996 (i.e., v212=0.0006, v222=0.0010, v232=0.0004).

- Start with stage m=2.
  - Since the least expensive switch for LAN2 has cost C21=4, and the least expensive bridge for LAN1 has cost C12=2, the valid range for T is 4 ≤ T ≤ 12.
  - Equation (6) then calculates F2(T) for T=4, …, 12. Take F2(6) for example:

With T=6 only one switch is affordable, so the candidates are (X21, X22, X23) = (0, 0, 1), (0, 1, 0), and (1, 0, 0). The minimum F2(6) = 0.02827 is associated with (0, 0, 1).
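
The arithmetic behind the three candidates can be written out from the data given for LAN2. This is a worked sketch that assumes the stage cost has the form w2 (p1 v2i1 + p2 v2i2) for a single selected switch i, which reproduces the reported minimum.

```latex
\begin{align}
(1,0,0):\quad & 0.7\,(0.05 \times 0.94 + 0.95 \times 0.0006) = 0.7 \times 0.04757 = 0.033299 \\
(0,1,0):\quad & 0.7\,(0.05 \times 0.90 + 0.95 \times 0.0010) = 0.7 \times 0.04595 = 0.032165 \\
(0,0,1):\quad & 0.7\,(0.05 \times 0.80 + 0.95 \times 0.0004) = 0.7 \times 0.04038 = 0.028266 \approx 0.02827
\end{align}
```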

- Next, we proceed to find the optimal solution F1(14) in the final stage m=1.

The minimum F1(14) is associated with (X11, X12, X13, X14) = (0, 1, 0, 1), with F* = 0.03905 using F2(6) = 0.02827. Namely, the maximum survivability S* against flooding equals 1 - F* = 1 - 0.03905 = 0.96095.
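
The whole example can be checked with a short dynamic-programming script. This is a sketch under stated assumptions, not the paper's implementation: it enumerates every 0-1 selection per stage, memoizes on (stage, remaining budget), and uses the failure probabilities v exactly as listed above (including v132=0.0004, although S132=0.9997 would imply 0.0003). All function and variable names are hypothetical.

```python
from functools import lru_cache
from itertools import product
from math import prod

weights = [0.3, 0.7]                    # w1, w2
scenario_probs = [0.05, 0.95]           # p1 (flooding), p2 (no disaster)
costs = [[8, 2, 4, 6], [4, 6, 5]]       # C_mi
# fail[m][i] = (v_mi1, v_mi2): failure probability per scenario
fail = [
    [(0.90, 0.0001), (0.91, 0.0007), (0.85, 0.0004), (0.79, 0.0005)],
    [(0.94, 0.0006), (0.90, 0.0010), (0.80, 0.0004)],
]
BUDGET = 14


def stage_failure(m, x):
    """Weighted expected failure of function m under selection x."""
    return weights[m] * sum(
        p * prod(v[d] for v, chosen in zip(fail[m], x) if chosen)
        for d, p in enumerate(scenario_probs)
    )


@lru_cache(maxsize=None)
def solve(m, t):
    """Return (minimum failure, selections) for functions m.. with budget t."""
    best = (1.0, None)                  # infeasible states get failure rate 1
    for x in product((0, 1), repeat=len(costs[m])):
        cost = sum(c for c, chosen in zip(costs[m], x) if chosen)
        if sum(x) == 0 or cost > t:     # need at least one affordable solution
            continue
        if m == len(costs) - 1:         # last function: nothing left to add
            rest, tail = 0.0, ()
        else:
            rest, tail = solve(m + 1, t - cost)
            if tail is None:            # later functions cannot be covered
                continue
        total = stage_failure(m, x) + rest
        if total < best[0]:
            best = (total, (x,) + tail)
    return best


f_star, plan = solve(0, BUDGET)
print(f"F* = {f_star:.5f}, S* = {1 - f_star:.5f}, plan = {plan}")
```

Running the sketch selects bridges 2 and 4 for LAN1 and switch 3 for LAN2, in agreement with the result above. Full enumeration per stage is affordable here because each function has at most four candidate solutions; larger instances would need the partial enumeration the paper describes.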

Conclusion

- Contributions
  - It presents one of the earliest quantitative studies to allocate redundancy for recovery planning.
  - An exact solution method based on probabilistic dynamic programming is presented to help obtain the optimal solution of redundancy allocation.
  - Through sensitivity analysis, the model can further help IT managers make better decisions.

- IT plays an extremely important role in modern business operations; nevertheless, it has potential vulnerabilities to disasters.
- The redundancy allocation problem (RAP) model proposed in this paper can fulfill the need for a structured decision analysis of recovery planning.

- For future research, we can further categorize assets into hardware, software, and other types to examine the impacts of each asset type on the redundancy allocation decisions.
- Specific assumptions of dependent IT functions or shared solutions can be made to address a different set of IT disaster recovery problems.

Thanks for your patience