Backward and forward looking at dependable and secure computing

Yinghua Min, Fellow of IEEE
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
At PRDC09, 2009/11/16


Outline
  • Historical review of dependable computing
    • FTCS
    • DSN
    • IFIP WG10.4
    • PRDC
  • New challenges of dependable and secure computing
  • Old techniques facing new environments
  • Concentrated on practical problems, rather than conceptual games
FTCS
  • Established in 1970
  • FTC for critical applications
    • Aviation
    • Spaceflight
    • Railway transportation
  • A highly academic symposium
Dependable computing
  • People understood that our area needed some extension.
  • A. Avizienis and Jean-Claude Laprie proposed the concept of Dependable Computing at FTCS-15 in 1985.
  • Human beings were then included as part of the system.
    • Malicious faults
  • FTCS
  • DCCA

DSN in 2000

  • Since 2000
  • DSN has pioneered the fusion between security and dependability.
  • Understanding the need to simultaneously fight against cyber attacks, accidental faults, design errors, and unexpected operating conditions.
PRDC
  • 1989 Joint Symposium on Fault-Tolerant Computing, Chongqing, China, July 18-20, 1989
  • 1991 Pacific Rim International Symposium on Fault-Tolerant Systems, Kawasaki, Japan
  • 1999 Pacific Rim International Symposium on Dependable Computing, Hong Kong, China.
    • Keynote: Computer Crime in Hong Kong (Mr. Anthony Fung)
      • From the HK police department
    • Computer Crime and Internet Fraud
    • Its evidence for litigation support
Trusted Computing
  • Trusted Computing Platform Alliance (TCPA) in 1999
  • TCG since 2003
  • TPM → TCM (Trusted C Module) 2008
  • Trusted root → security chip → trusted BIOS → trusted OS → trusted systems
  • Basically for PCs in the area of secure computing
IEEE Transactions on Dependable and Secure Computing
  • Since 2004
  • Separate dependable computing from secure computing
System dependability
  • The system dependability situation has been getting worse rather than improving in recent years. Quoting the AMSD Roadmap, “the availability that was typically achievable by (wired) telecommunication services, and computer systems in the 1990s was 99.999 percent to 99.9 percent. Now cellular phone services, and web-based services, typically achieve an availability of only 99 per cent to 90 per cent” (AMSD Roadmap 2003, p. 31).

AMSD: the European Commission’s Accompanying Measure on System Dependability
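
To make the quoted availability figures concrete, the following minimal sketch converts an availability percentage into expected downtime per year. It assumes a 365-day year and steady-state availability; the name `downtime_per_year` is illustrative and does not come from the presentation or the AMSD Roadmap.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_per_year(availability_percent: float) -> float:
    """Return expected downtime in minutes per year for a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # The four availability levels quoted on the slide.
    for a in (99.999, 99.9, 99.0, 90.0):
        minutes = downtime_per_year(a)
        print(f"{a:>7}% available -> {minutes:10.1f} min/year ({minutes / 60:7.1f} h)")
```

Under these assumptions, 99.999 percent corresponds to roughly five minutes of downtime per year, while 90 percent corresponds to about 36.5 days.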

New challenges
  • Three key requirements for computers
    • High performance
    • Low power
    • Dependability
      • Nano-ICs, more vulnerable
        • to transient (or soft) errors
        • to permanent malfunctions due to materials aging or wearout mechanisms.
  • Nano-scale IC reliability
  • Counterfeit ICs
  • Dependability and security in cloud computing
  • Signal integrity
  • Dependable software needs evidence.
Nano-scale IC reliability
  • The "International Technology Roadmap for Semiconductors" [SIA] estimates that by 2019 the feature size of process technology will reach 7nm, but only between 10% and 20% of chips will be defect free.
  • Power densities are expected to skyrocket and on-chip temperatures to increase
  • Small delay defects, adjacent-line coupling, crosstalk, and process-variation-induced unreliability
    • variability-tolerant design
    • appropriate measures must be taken, such as fault tolerance, redundancy, repair, and reconfiguration.
Counterfeit Electronic Components
  • Examples (figure captions):
    • Device lead condition shows parts were used
    • Marking indicates an Op Amp from ADI, but the package contains a die for a Voltage Reference from PMI
    • Evidence of prior marking for a part with inferior performance …
    • Part number indicates a CLCC package, but this package is a CDP, accompanied by a bogus test report
  • These are incidents that jeopardize the performance and reliability of electronics.
Baofeng.com incident in China
  • Network outages in Jiangsu, Anhui, Guangxi, Henan, Gansu, and Zhejiang in China, May 19, 2009
  • The network failure was caused by the domain name system (DNS) failure of Baofeng.com, the website of the Chinese music player provider
  • The failure further caused a surge in DNS server queries and degraded the processing performance of the network.
  • The servers of DNSPod were attacked by a malicious virus.
  • Was the incident caused by a software fault or an attack? Maybe both.
Bohrbugs and Mandelbugs
  • Bohrbugs
    • An unusual software bug that consistently makes its presence known under conditions that are either well-defined, possibly unknown or both.
  • Mandelbugs
    • A bug whose behavior doesn't appear malicious, but which is so complex that it appears only after errors have accumulated for some time.
  • Bohrbugs behaving like Mandelbugs
    • Becoming an attack
Dependability in the Cloud
  • On April 26, 2008, Amazon’s Elastic Compute Cloud (EC2) had an outage
    • due to a single customer applying a very large set of unusual firewall rules
    • triggering a performance degradation bug in Amazon’s distributed firewall.
  • Availability and privacy are serious challenges for applications hosted on cloud infrastructure.
Challenges on cloud infrastructure
  • Cloud applications increase risk levels
    • Sharing of cloud resources by entities that engage in a wide range of behaviors and employ best practices to varying degrees
  • An environment with a few large cloud infrastructure providers
    • increases the risk of common mode outages affecting a large number of applications
    • provides highly visible targets for attackers.
  • Multiple administrative domains between the application and infrastructure operators reduce end-to-end system visibility and error propagation information, making problem detection and diagnosis very difficult.
  • A cloud provider's economies of scale allow levels of investment in redundancy and dependability that smaller operators may not be able to afford.
Old FTC techniques facing new environments
  • Checkpointing
  • Redundancy
  • Software fault-tolerance in middleware
  • ECC in mass storage systems
  • Fault detection and diagnosis in virtual machines
  • Assessment of dependability and security
Checkpointing for supercomputers
  • Periodic checkpointing → cooperative checkpointing
    • At runtime, the application requests a checkpoint.
    • The system grants or denies the checkpoint (to skip some of them)
      • based on various system-wide heuristics, including disk or network usage and reliability information.
  • Using cooperative checkpointing in one instance
    • reduced bounded slowdown by a factor of nine,
    • improved system utilization, and lost no more work to failures than periodic checkpointing
      • even when event prediction had a 90% false negative rate.
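
The grant-or-deny step of cooperative checkpointing can be sketched as a small gatekeeper function. This is only an illustration of the idea on the slide: the `SystemState` fields, the thresholds, and the order of the rules are all assumptions, not the heuristics used in the work the slide summarizes.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    """Snapshot of system-wide conditions (all fields are illustrative assumptions)."""
    io_utilization: float      # fraction of disk/network bandwidth in use, 0..1
    failure_risk: float        # estimated probability of a failure soon, 0..1
    seconds_since_last: float  # time since the last checkpoint actually taken


def grant_checkpoint(state: SystemState,
                     io_limit: float = 0.8,
                     risk_threshold: float = 0.05,
                     max_interval: float = 3600.0) -> bool:
    """Decide whether to perform (True) or skip (False) a requested checkpoint."""
    # Never let the amount of unsaved work grow without bound.
    if state.seconds_since_last >= max_interval:
        return True
    # A likely failure justifies the checkpoint even if I/O is busy.
    if state.failure_risk >= risk_threshold:
        return True
    # Otherwise skip while the I/O subsystem is busy and checkpoint when it is idle.
    return state.io_utilization < io_limit


if __name__ == "__main__":
    busy_but_safe = SystemState(io_utilization=0.95, failure_risk=0.01, seconds_since_last=600.0)
    print(grant_checkpoint(busy_but_safe))  # request is skipped under these thresholds
```

The application still places its checkpoint requests where saving state is cheapest; the system only decides which of those requests are worth the I/O cost.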

Checkpointing at micro-operation level
[Figure labels: Violation occurs; Violation detected]
  • Sliding window based on sensor delay
  • Delayed-commit: completed results are held in buffers until verified to be correct
    • Noise-speculative
    • Noise-verified
  • Rollback to a previous noise-verified state when a violation is detected
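
The delayed-commit idea above can be modeled in software as a buffer of noise-speculative results that become noise-verified once the sensor delay has elapsed without a reported violation. The sketch below is a simplified behavioral model, not the hardware design from the slide; the class name, the cycle-based timing, and the assumption that a violation is always reported within `sensor_delay` cycles are all illustrative.

```python
from collections import deque


class DelayedCommitBuffer:
    """Behavioral model of delayed commit with a sliding verification window."""

    def __init__(self, sensor_delay: int):
        self.sensor_delay = sensor_delay  # max cycles before a violation is reported
        self.speculative = deque()        # (cycle, result) pairs, noise-speculative
        self.committed = []               # noise-verified results

    def complete(self, cycle: int, result) -> None:
        """Buffer a completed result as noise-speculative."""
        self.speculative.append((cycle, result))

    def tick(self, cycle: int, violation_detected: bool) -> list:
        """Advance to `cycle`; return results that were squashed and must be redone."""
        if violation_detected:
            # Everything still inside the window is suspect: roll back to the
            # last noise-verified state by discarding all speculative results.
            squashed = [result for _, result in self.speculative]
            self.speculative.clear()
            return squashed
        # No violation reported: results older than the sensor delay are verified.
        while self.speculative and self.speculative[0][0] + self.sensor_delay <= cycle:
            self.committed.append(self.speculative.popleft()[1])
        return []
```

Results are committed strictly in completion order, so the committed list always represents a state that a later rollback never has to undo.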
Redundancy
  • At the application level and at the hardware level.
  • Byzantine fault tolerance
    • Algorithms that are robust to arbitrary types of failures in distributed systems.
    • Do not require any centralized control that is guaranteed to always work correctly.
  • Data integrity
    • Redundancy in different places
    • RAID (redundant array of independent disks), a fault-tolerant storage device that uses data redundancy.
  • Synchronization is a big challenge.
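
As one concrete instance of data redundancy, the sketch below shows single-parity protection of the kind used by parity-based RAID levels: the parity block is the XOR of the data blocks, so any one lost block can be rebuilt from the survivors. The slide does not name a RAID level, so this choice and the function names are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"ABCD", b"EFGH", b"IJKL"]   # three data "disks"
    parity = xor_blocks(data)            # stored on a fourth "disk"
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

A single parity block tolerates the loss of exactly one block; losing two blocks at once is not recoverable with this scheme.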
Software fault-tolerance in middleware
  • Optimal fault tolerance strategy for both stateless and stateful Web services
    • Retry
    • Recovery block
    • N-version programming
  • Network characteristics:
    • Freedom
    • Dynamic
    • Multi-tier service
  • Debugging performance problems in multi-tier services composed of black boxes.
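
The three strategies named above (retry, recovery block, N-version programming) can be sketched as small wrappers around callables. This is a minimal illustration under simplifying assumptions: the services are local Python functions rather than Web services, results are hashable so they can be voted on, and all function names are hypothetical.

```python
from collections import Counter


def retry(operation, attempts=3):
    """Retry: call the same operation again after a failure."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:
            last_error = error
    raise RuntimeError("all retries failed") from last_error


def recovery_block(alternates, acceptance_test):
    """Recovery block: try alternates in order until one passes the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate()
        except Exception:
            continue  # a crashing alternate is treated like a failed test
        if acceptance_test(result):
            return result
    raise RuntimeError("no alternate passed the acceptance test")


def n_version(versions):
    """N-version programming: run every version and return the strict-majority result."""
    results = []
    for version in versions:
        try:
            results.append(version())
        except Exception:
            pass  # a crashed version simply casts no vote
    if results:
        value, votes = Counter(results).most_common(1)[0]
        if votes > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")
```

Retry suits transient failures of a single service, the recovery block needs a trustworthy acceptance test, and N-version programming trades extra executions for not needing such a test.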
Soft errors
  • Soft errors involve changes to data
  • Cosmic rays creating energetic neutrons and protons
  • The importance of soft errors increases as chip technology advances.
    • chip-level soft error
      • the radioactive atoms in the chip's material decay and release alpha particles into the chip.
      • Built-in Soft Error Resilience (BISER) Cell
    • system-level soft error
      • the data being processed is hit with a noise phenomenon
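
A soft error typically shows up as a flipped bit in stored or processed data. The sketch below illustrates one standard software-visible countermeasure, a Hamming(7,4) error-correcting code that repairs any single flipped bit in a 7-bit codeword. It is not the BISER cell mentioned above, which is a circuit-level technique; the code is swapped in here only to show how a single-bit upset can be detected and corrected.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the flipped bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[5] ^= 1  # simulate a single-bit soft error
    print(hamming74_decode(stored) == word)
```

This code corrects one flipped bit per codeword; two simultaneous flips in the same codeword would be miscorrected, which is why practical memories often add an extra overall parity bit.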
Transient Faults
  • Program replication
    • N-version programming
    • Time-redundant techniques
    • Virtual duplex systems
    • Tandem Nonstop Cyclone is a custom system designed to use process replicas for transaction processing workloads.
  • Transient Fault Tolerance for Multi-core Architectures
  • Redundancy at the process level
  • Ensuring correct hardware execution or ensuring correct software execution
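
Time redundancy can be sketched as repeated execution with comparison: the same computation runs more than once, and a result is accepted only when two consecutive runs agree. The sketch assumes the computation is deterministic in the absence of faults and has no side effects; the function name and the `max_rounds` limit are illustrative.

```python
def run_with_time_redundancy(compute, max_rounds=3):
    """Accept a result only when two consecutive executions of `compute` agree."""
    previous = compute()
    for _ in range(max_rounds):
        current = compute()
        if current == previous:
            return current
        # Disagreement: a transient fault probably hit one of the runs, so run again.
        previous = current
    raise RuntimeError("results never agreed; the fault may not be transient")


if __name__ == "__main__":
    # Simulated executions: the first run is corrupted by a transient fault.
    outcomes = iter([41, 42, 42])
    print(run_with_time_redundancy(lambda: next(outcomes)))
```

The cost is at least a doubling of execution time, which is the usual trade-off against the extra hardware or processes that replication in space requires.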
Assessment of dependability and security
  • The original definition of dependability is the ability to deliver service that can justifiably be trusted.
    • Justification
    • Evaluation
    • Benchmarking
    • Standardization
  • There is a dependability and security gap, often perceived by users as a lack of trustworthiness in computer applications, that is in fact undermining the network and service infrastructures that constitute the very core of the knowledge-based society.
Difficulties for assessment
  • The assessment of dependability in a standard and comparable way, considering all
    • Component failures
    • Software bugs
    • Human mistakes
    • Interaction mistakes
    • Malicious attacks
  • The quality of measurements
  • The assessment of dependability in component based, dynamic and adaptive systems and networks
  • The integration with the development process
Denial of service (DoS)
  • Effects of DoS attacks are experienced by users as a severe slowdown, service quality degradation, or service disruption.
  • We need accurate, quantitative, and versatile DoS impact metrics regardless of the underlying mechanism for service denial, attack dynamics, legitimate traffic mix, or network topology.
  • Measuring DoS through selected legitimate traffic parameters:
    • packet loss,
    • traffic throughput or goodput,
    • request/response delay,
    • transaction duration, and
    • allocation of resources.
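
The traffic parameters listed above can be computed from per-transaction records of legitimate traffic. The sketch below is a simplified illustration: the `Transaction` record, its field names, and the summary keys are hypothetical, and the formulas are plain ratios and averages rather than the metric definitions of any particular published DoS study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    """One legitimate client transaction (hypothetical record layout)."""
    packets_sent: int
    packets_received: int
    payload_bytes_delivered: int
    started_at: float             # seconds
    finished_at: Optional[float]  # None if the transaction never completed


def summarize(transactions, window_seconds):
    """Summarize legitimate-traffic health over an observation window."""
    sent = sum(t.packets_sent for t in transactions)
    received = sum(t.packets_received for t in transactions)
    delivered = sum(t.payload_bytes_delivered for t in transactions)
    completed = [t for t in transactions if t.finished_at is not None]
    durations = [t.finished_at - t.started_at for t in completed]
    return {
        "packet_loss": 1 - received / sent if sent else 0.0,
        "goodput_bytes_per_s": delivered / window_seconds,
        "mean_duration_s": sum(durations) / len(durations) if durations else None,
        "failed_fraction": 1 - len(completed) / len(transactions) if transactions else 0.0,
    }
```

Comparing such a summary for the same legitimate workload with and without an attack gives a measure of impact that does not depend on how the attack denies service.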
Conceptual games
  • Trustworthy computing
  • Trusted computing
  • Secure computing
  • Dependable computing
  • Confident computing
Concluding remarks
  • Dependable computing is an enduring topic for information technology
    • Dependability is as important as high performance and low power.
  • New challenges are coming with the advance of IT
  • The gap between academia and industry
  • Concentrate on practical problems, rather than conceptual games