
Distributed Systems Faults and Their Solutions

Distributed systems are a good topic for a PhD thesis, but most students face many problems during their thesis project. Here is a list of those problems with their solutions. For more information, visit: www.techsparks.co.in


Presentation Transcript


  1. 2017 Distributed Systems: Common Faults and Their Solutions. www.techsparks.co.in [Techsparks] +91-96531-59085

  2. Distributed Systems: Common Faults and Their Solutions
     What is a fault? When the underlying assumptions of a system are violated, this is referred to as a fault. The disrupted internal data state that reflects a fault is called an error. A failure is the externally visible deviation from the specification.
     1. Faults in Distributed Systems
        1. Data corruption
        2. Hanging processes
        3. Misleading return values
        4. Misbehaving machines
        5. Hardware/software/network outages
        6. Overcommitment of resources
        7. Insufficient disk space
     2. Silent-Fail-Stutter Model
        Reasons for this type of failure:
        1. System memory

  3. Silent-fail-stutter is an appropriate model for system memory because a memory tester program can discover a corrupt memory chip, but running this test incurs a cost.
        2. Processor cache
           The behaviour is not fail-stop because a processor with a faulty cache does not retain information about a cache block failure across reboots, even when the failure is permanent.
     3. Implications of Silent-Fail-Stutter
        Since a component may fail without signalling the failure to other components, some components must verify the state of each component, either periodically or on certain events, and report any detected failure to the other components in the distributed system. If a single component performs the checks, designers should make sure that it is more trustworthy than the components it checks. Alternatively, all the components can cooperate to perform the checking in a distributed manner.
     4. Failure Detection
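
The checking described under "Implications of Silent-Fail-Stutter" might be sketched roughly as follows. This is only an illustration, not the presentation's implementation: the `Monitor` class, its method names, and the two stand-in probes are hypothetical. The sketch assumes a single, trusted checker that runs a probe for each component and reports any failure to the subscribed components.

```python
from typing import Callable, Dict, List


class Monitor:
    """Hypothetical trusted checker for components that may fail silently."""

    def __init__(self) -> None:
        self._probes: Dict[str, Callable[[], bool]] = {}
        self._listeners: List[Callable[[str], None]] = []

    def register(self, name: str, probe: Callable[[], bool]) -> None:
        # A probe returns True when the component looks healthy.
        self._probes[name] = probe

    def subscribe(self, listener: Callable[[str], None]) -> None:
        # Listeners stand in for the other components that must be told.
        self._listeners.append(listener)

    def check_once(self) -> List[str]:
        """Run every probe once and report each failed component."""
        failed = []
        for name, probe in self._probes.items():
            try:
                healthy = probe()
            except Exception:
                # A probe that crashes is treated as a failed component.
                healthy = False
            if not healthy:
                failed.append(name)
                for listener in self._listeners:
                    listener(name)
        return failed


def memory_ok() -> bool:
    # Stand-in for a real memory tester program (assumed healthy here).
    return True


def cache_ok() -> bool:
    # Stand-in for a cache check (assumed faulty here).
    return False


monitor = Monitor()
monitor.register("memory", memory_ok)
monitor.register("cache", cache_ok)

reported: List[str] = []
monitor.subscribe(reported.append)

failed = monitor.check_once()
print(failed)
```

Calling `check_once` from a timer would give the periodic variant, and calling it from an event handler would give the event-driven one. In both cases the monitor itself has to be more trustworthy than the components it checks.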

  4. Failure detection means determining whether a failure has occurred and, if needed, tracing its cause. A sudden failure of a lower-level component can cause the higher-level components in the chain to fail, so tracing the fault may require moving down the hierarchy.
     5. Evaluation
        We looked at how components should convey the results of tests and decided to use a database to log the test results and the timestamps of the tests. We also recorded the results of application execution in the database.
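
As a rough sketch of the two ideas above, the code below walks down a component hierarchy to find the lowest-level failing component and logs every test result with a timestamp. Everything in it is assumed for illustration: the dependency chain, the test outcomes, the table layout, and the choice of SQLite are not taken from the presentation.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Optional

# Hypothetical hierarchy: each component lists the lower-level
# components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "application": ["scheduler"],
    "scheduler": ["network", "storage"],
    "network": [],
    "storage": [],
}

# Hypothetical test outcomes: storage is broken, so everything
# above it in the chain fails as well.
TESTS: Dict[str, Callable[[], bool]] = {
    "application": lambda: False,
    "scheduler": lambda: False,
    "network": lambda: True,
    "storage": lambda: False,
}


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open the results database and create the log table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results ("
        "component TEXT, passed INTEGER, tested_at REAL)"
    )
    return conn


def run_test(conn: sqlite3.Connection, component: str) -> bool:
    """Run one component test and log its result with a timestamp."""
    passed = TESTS[component]()
    conn.execute(
        "INSERT INTO test_results VALUES (?, ?, ?)",
        (component, int(passed), time.time()),
    )
    conn.commit()
    return passed


def trace_fault(conn: sqlite3.Connection, component: str) -> Optional[str]:
    """Return the lowest-level failing component, or None if the test passes."""
    if run_test(conn, component):
        return None
    for dependency in DEPENDS_ON[component]:
        cause = trace_fault(conn, dependency)
        if cause is not None:
            return cause
    # No dependency failed, so this component is the likely origin.
    return component


conn = open_log()
root_cause = trace_fault(conn, "application")
rows = conn.execute(
    "SELECT component, passed FROM test_results ORDER BY rowid"
).fetchall()
print(root_cause, rows)
```

Keeping the timestamp next to each result makes it possible to reconstruct later in what order the components were found to be failing.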

  5. 6. Conclusions
        We have successfully detected faults in large distributed systems and proposed the silent-fail-stutter fault model to model component behaviour precisely while keeping it tractable.
     7. References
        1. Foster, I., Kesselman, C., Tuecke, S.: The anatomy of the grid: Enabling scalable virtual organizations. International Journal of Supercomputing Applications (2001)
        2. Patterson, D.A., Gibson, G.A., Katz, R.H.: A case for redundant arrays of inexpensive disks (RAID). In: Boral, H., Larson, P.A. (eds.): Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, June 1-3, 1988. ACM Press (1988) 109–116
        3. Avizienis, A., Laprie, J.: Dependable computing: From concepts to design diversity. In: Proceedings of the IEEE. Volume 74. (1986) 629–638
