Backward and forward looking at dependable and secure computing
Yinghua Min

Fellow of IEEE

Institute of Computing Technology,

Chinese Academy of Sciences, Beijing, China

At PRDC09, 2009/11/16


Outline

  • Historical review of dependable computing

    • FTCS

    • DSN

    • IFIP WG10.4

    • PRDC

  • New challenges of dependable and secure computing

  • Old techniques facing new environments

  • Concentrate on practical problems, rather than conceptual games


FTCS

  • Established in 1970

  • Fault-tolerant computing (FTC) for critical applications

    • Aviation

    • Spaceflight

    • Railway transportation

  • A highly academic symposium


Dependable computing

  • People understood that our area needed some extension.

  • A. Avizienis and Jean-Claude Laprie proposed the concept of Dependable Computing at FTCS-15 in 1985.

  • Since then, human beings have been regarded as part of the system.

    • Malicious faults

  • FTCS and DCCA merged to form DSN in 2000


DSN

  • Since 2000

  • DSN has pioneered the fusion between security and dependability.

  • Understanding the need to simultaneously fight against cyber attacks, accidental faults, design errors, and unexpected operating conditions.


PRDC

  • 1989 Joint Symposium on Fault-Tolerant Computing, Chongqing, China, July 18-20, 1989

  • 1991 Pacific Rim International Symposium on FTS, Kawasaki, Japan

  • 1999 Pacific Rim International Symposium on Dependable Computing, Hong Kong, China

    • Keynote: Computer Crime in Hong Kong (Mr. Anthony Fung)

      • From the HK police department

    • Computer Crime and Internet Fraud

    • Its evidence for litigation support


Trusted Computing

  • Trusted Computing Platform Alliance (TCPA) in 1999

  • TCG since 2003

  • TPM → TCM (Trusted Cryptography Module) 2008

  • Trusted root → security chip → trusted BIOS → trusted OS → trusted systems

  • Basically for PCs in the area of secure computing
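
The chain from trusted root to trusted systems can be illustrated with a hash-extend measurement, in the style of the platform configuration registers used by TPM-like chips. This is only a sketch under assumptions: the stage names, the use of SHA-256, and the function names `extend` and `measure_chain` are illustrative and do not come from the talk.

```python
import hashlib


def extend(register: bytes, component: bytes) -> bytes:
    """Fold the digest of one boot component into the running register."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(register + digest).digest()


def measure_chain(components: list[bytes]) -> bytes:
    """Measure each stage in boot order, starting from an all-zero register."""
    register = bytes(32)
    for blob in components:
        register = extend(register, blob)
    return register


if __name__ == "__main__":
    # Hypothetical stage images; real systems would hash the actual binaries.
    good = measure_chain([b"bios-image", b"os-loader", b"os-kernel"])
    bad = measure_chain([b"bios-image", b"patched-loader", b"os-kernel"])
    print("measurements differ:", good != bad)
```

Because every stage is folded into the value left by the previous one, changing or reordering any stage changes the final measurement, which is why each link can vouch for the next.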


IEEE Transactions on Dependable and Secure Computing

  • Since 2004

  • Separate dependable computing from secure computing


System dependability

  • The system dependability situation has been getting worse rather than improving in recent years. Quoting the AMSD Roadmap, “the availability that was typically achievable by (wired) telecommunication services, and computer systems in the 1990s was 99.999 percent to 99.9 percent. Now cellular phone services, and web-based services, typically achieve an availability of only 99 per cent to 90 per cent” (AMSD Roadmap 2003, p. 31).

  • AMSD: the European Commission’s Accompanying Measure on System Dependability
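
To make the quoted availability figures concrete, the short calculation below converts a percentage into expected downtime per year. It is a back-of-the-envelope sketch that assumes a 365-day year and uniformly distributed outages; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected unavailable hours per year for a given availability."""
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.999, 99.9, 99.0, 90.0):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% available: {hours:.2f} h/year ({hours * 60:.1f} min)")
```

On these assumptions, 99.999 percent corresponds to roughly five minutes of downtime per year, while 90 percent corresponds to 876 hours, about 36.5 days.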


New challenges

  • Three key requirements for computers

    • High performance

    • Low power

    • Dependability

      • Nano-ICs, more vulnerable

        • to transient (or soft) errors

        • to permanent malfunctions due to materials aging or wearout mechanisms.

  • Nano-scale IC reliability

  • Counterfeit ICs

  • Dependability and security in cloud computing

  • Signal integrity

  • Dependable software needs evidence.


Nano-scale IC reliability

  • The "International Technology Roadmap for Semiconductors" [SIA] estimates that by 2019 the feature size of process technology will reach 7nm, but only between 10% and 20% of chips will be defect free.

  • Power densities are expected to skyrocket and on-chip temperatures to increase

  • Unreliability induced by small delay defects, adjacent-line coupling, crosstalk, and process variation

    • variability-tolerant design

    • appropriate countermeasures such as fault tolerance, redundancy, repair, and reconfiguration


Counterfeit Electronic Components

  • Device lead condition shows the parts were used

  • Marking indicates an Op Amp from ADI, but the package contains a die for a Voltage Reference from PMI

  • Evidence of prior marking for a part with inferior performance

  • Part number indicates a CLCC package, but this package is a CDP

  • Parts accompanied by a bogus test report

  • These are incidents that jeopardize the performance and reliability of electronics.


Baofeng.com incident in China

  • Network outages in Jiangsu, Anhui, Guangxi, Henan, Gansu, and Zhejiang in China, May 19, 2009

  • The network failure was triggered by a domain name system (DNS) failure affecting Baofeng.com, the website of the Chinese music player provider

  • The failure then caused a surge of DNS server queries and degraded the processing performance of the network.

  • The servers of DNSPod were attacked by a malicious virus.

  • Was the incident caused by a software fault or by an attack? Maybe both.


Bohrbugs and Mandelbugs

  • Bohrbugs

    • A software bug that manifests consistently under a well-defined, though possibly unknown, set of conditions.

  • Mandelbugs

    • A bug whose behavior does not appear malicious, but whose causes are so complex that it shows up only after errors have accumulated for some time.

  • Bohrbugs behaving like Mandelbugs

    • Becoming an attack


Dependability in the Cloud

  • On April 26, 2008, Amazon’s Elastic Compute Cloud (EC2) had an outage

    • due to a single customer applying a very large set of unusual firewall rules

    • triggering a performance degradation bug in Amazon’s distributed firewall.

  • Availability and privacy are serious challenges for applications hosted on cloud infrastructure.


Challenges on cloud infrastructure

  • Cloud applications increase risk levels

    • Sharing of cloud resources by entities that engage in a wide range of behaviors and employ best practices to varying degrees

  • An environment with a few large cloud infrastructure providers

    • increases the risk of common mode outages affecting a large number of applications

    • provides highly visible targets for attackers.

  • Multiple administrative domains between the application and infrastructure operators reduce end-to-end system visibility and error propagation information, making problem detection and diagnosis very difficult.

  • A cloud provider's economies of scale allow levels of investment in redundancy and dependability that smaller operators may not be able to afford.


Old FTC techniques facing new environments

  • Checkpointing

  • Redundancy

  • Software fault-tolerance in middleware

  • ECC in mass storage systems

  • Fault detection and diagnosis in virtual machines

  • Assessment of dependability and security


Checkpointing for supercomputers

  • Periodic checkpointing → cooperative checkpointing

    • At runtime, the application requests a checkpoint.

    • The system grants or denies the checkpoint (to skip some of them)

      • based on various system-wide heuristics, including disk or network usage and reliability information.

  • Using cooperative checkpointing in one instance

    • reduced bounded slowdown by a factor of nine,

    • improved system utilization, and lost no more work to failures than periodic checkpointing

      • even when event prediction had a 90% false negative rate.
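
The request/grant protocol described above can be sketched as a small gatekeeper function. This is a minimal illustration, not the policy evaluated in the talk: the `SystemState` fields, the thresholds, and the decision order are all assumptions chosen to show how I/O usage and reliability information might feed the decision.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    io_load: float         # fraction of disk/network bandwidth in use, 0..1
    failure_risk: float    # estimated chance of a failure soon, 0..1
    work_at_risk_s: float  # seconds of computation since the last checkpoint


def grant_checkpoint(state: SystemState,
                     max_io_load: float = 0.8,
                     risk_threshold: float = 0.05,
                     max_unsaved_s: float = 3600.0) -> bool:
    """Decide whether the system grants an application's checkpoint request."""
    # Never let too much unsaved work pile up, whatever the load.
    if state.work_at_risk_s >= max_unsaved_s:
        return True
    # A likely failure justifies the I/O cost.
    if state.failure_risk >= risk_threshold:
        return True
    # Otherwise checkpoint only when I/O is cheap; skip the request if busy.
    return state.io_load <= max_io_load


if __name__ == "__main__":
    busy_and_safe = SystemState(io_load=0.95, failure_risk=0.01, work_at_risk_s=120)
    print("granted:", grant_checkpoint(busy_and_safe))
```

The application keeps requesting checkpoints at its natural safe points; the denied requests are the ones that periodic checkpointing would have paid for without much benefit.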


Checkpointing at micro-operation level

[Figure: timeline of processor state showing the committed state, noise-speculative and noise-verified results, and the points where a violation occurs and is detected]

  • Sliding window based on sensor delay

  • Delayed commit: completed results are held in buffers until verified to be correct

    • Noise-speculative

    • Noise-verified

  • Rollback to a previous noise-verified state when a violation is detected
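
The delayed-commit idea can be modelled in software as a buffer whose entries become committable only after they have aged past the sensor delay. The sketch below is a simplified model under stated assumptions: the class and method names are hypothetical, time is counted in whole cycles, and a violation is assumed to be reported no later than `sensor_delay` cycles after it occurs and before results of the reporting cycle are buffered.

```python
from collections import deque


class DelayedCommitBuffer:
    """Holds completed results until the noise sensor could have flagged them."""

    def __init__(self, sensor_delay: int):
        self.sensor_delay = sensor_delay
        self.speculative = deque()  # (cycle, result): noise-speculative
        self.committed = []         # noise-verified results, in program order

    def complete(self, cycle: int, result) -> None:
        """Buffer a completed result, then commit entries older than the window."""
        self.speculative.append((cycle, result))
        while self.speculative and cycle - self.speculative[0][0] >= self.sensor_delay:
            self.committed.append(self.speculative.popleft()[1])

    def violation_detected(self) -> int:
        """Roll back: drop every speculative result and report how many were lost."""
        discarded = len(self.speculative)
        self.speculative.clear()
        return discarded


if __name__ == "__main__":
    buf = DelayedCommitBuffer(sensor_delay=3)
    for cycle, value in enumerate("abcd"):
        buf.complete(cycle, value)
    print("committed:", buf.committed, "to re-execute:", buf.violation_detected())
```

Only the results still inside the sliding window are re-executed after a violation; everything already committed is known to predate any undetected noise event.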


Redundancy

  • At the application level and at a hardware level.

  • Byzantine fault tolerance

    • Algorithms that are robust to arbitrary types of failures in distributed systems.

    • They do not require any centralized control that is guaranteed to always work correctly.

  • Data integrity

    • Redundancy in different places

    • RAID (redundant array of independent disks), a fault-tolerant storage device that uses data redundancy.

  • Synchronization is a big challenge.
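
As one concrete form of data redundancy, parity-based RAID levels store the XOR of the data blocks so that any single lost block can be rebuilt from the rest. The sketch below shows only that arithmetic; block placement, striping, and the synchronization problem mentioned above are out of scope, and the function names are illustrative.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_blocks(data)
    # Pretend the disk holding data[1] failed.
    print(rebuild([data[0], data[2]], parity))
```

XOR parity tolerates exactly one missing block per stripe; losing two blocks at once requires a stronger code.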


Software fault-tolerance in middleware

  • Optimal fault tolerance strategy for both stateless and stateful Web services

    • Retry

    • Recovery block

    • N-version programming

  • Network characteristics:

    • Freedom

    • Dynamic

    • Multi-tier service

  • Debugging performance problems of multi-tier services composed of black boxes.
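
Two of the strategies listed above, retry and recovery block, can be sketched as small wrappers around a service call. This is a generic illustration rather than the optimal strategy the slide refers to: the function names, the attempt count, and the acceptance test are assumptions.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3) -> T:
    """Re-invoke the same operation until it succeeds or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # sketch: treat every exception as transient
            last_error = error
    raise RuntimeError("all attempts failed") from last_error


def recovery_block(alternates: Sequence[Callable[[], T]],
                   acceptable: Callable[[T], bool]) -> T:
    """Try alternates in order; return the first result passing the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate()
        except Exception:
            continue
        if acceptable(result):
            return result
    raise RuntimeError("no alternate passed the acceptance test")


if __name__ == "__main__":
    primary = lambda: -1   # hypothetical service returning a bad value
    backup = lambda: 4     # hypothetical fallback service
    print(recovery_block([primary, backup], acceptable=lambda r: r >= 0))
```

Retry suits stateless calls that fail transiently, whereas a recovery block needs an acceptance test and, for stateful services, a way to restore state before the next alternate runs.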


Soft errors

  • Soft errors involve changes to data

  • Cosmic rays creating energetic neutrons and protons

  • The importance of soft errors increases as chip technology advances.

    • chip-level soft error

      • the radioactive atoms in the chip's material decay and release alpha particles into the chip.

      • Built-in Soft Error Resilience (BISER) Cell

    • system-level soft error

      • the data being processed is hit with a noise phenomenon
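
Since a soft error changes stored data without damaging the hardware, an error-correcting code can undo it. The sketch below uses a textbook Hamming(7,4) code to correct a single flipped bit. It is a generic illustration of the principle and is unrelated to the circuit-level BISER cell named above; the function names are illustrative.

```python
def encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(word: list[int]) -> tuple[list[int], int]:
    """Return (data bits, error position); position 0 means no error was seen."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome names the flipped position
    if position:
        w[position - 1] ^= 1
    return [w[2], w[4], w[5], w[6]], position


if __name__ == "__main__":
    codeword = encode([1, 0, 1, 1])
    codeword[4] ^= 1  # simulate a particle strike flipping position 5
    print(correct(codeword))
```

A single flip in any of the seven positions is located and repaired; two simultaneous flips would be miscorrected, which is why memories often add one more parity bit for double-error detection.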


Transient Faults

  • Program replication

    • N-version programming

    • Time-redundant techniques

    • Virtual duplex systems

    • Tandem NonStop Cyclone is a custom system designed to use process replicas for transaction processing workloads.

  • Transient Fault Tolerance for Multi-core Architectures

  • Redundancy at the process level

  • Ensuring correct hardware execution or ensuring correct software execution
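
Time redundancy can be sketched as running the same computation several times and accepting the majority result, on the assumption that a transient fault is unlikely to corrupt two runs in the same way. The function name and the default of three runs are illustrative choices.

```python
from collections import Counter
from typing import Callable, Hashable


def run_with_time_redundancy(compute: Callable[[], Hashable], runs: int = 3) -> Hashable:
    """Execute compute() repeatedly and return the strict-majority result."""
    results = [compute() for _ in range(runs)]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= runs // 2:
        raise RuntimeError("no majority: results disagree too often")
    return value


if __name__ == "__main__":
    # Hypothetical computation whose second run is hit by a transient fault.
    outcomes = iter([42, 41, 42])
    print(run_with_time_redundancy(lambda: next(outcomes)))
```

The same voter works for process-level replicas running in parallel; time redundancy trades latency for hardware, replication does the opposite.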


Assessment of dependability and security

  • The original definition of dependability is the ability to deliver service that can justifiably be trusted.

    • Justification

    • Evaluation

    • Benchmarking

    • Standardization

  • There is a dependability and security gap that users often perceive as a lack of trustworthiness in computer applications, and that is in fact undermining the network and service infrastructures that constitute the very core of the knowledge-based society.


Difficulties for assessment

  • The assessment of dependability in a standard and comparable way, considering all of:

    • Component failures

    • Software bugs

    • Human mistakes

    • Interaction mistakes

    • Malicious attacks

  • The quality of measurements

  • The assessment of dependability in component based, dynamic and adaptive systems and networks

  • The integration with the development process


Denial of service (DoS)

  • Effects of DoS attacks are experienced by users as a severe slowdown, service quality degradation, or service disruption.

  • We need accurate, quantitative, and versatile DoS impact metrics regardless of the underlying mechanism for service denial, attack dynamics, legitimate traffic mix, or network topology.

  • Measuring DoS through selected legitimate traffic parameters:

    • packet loss,

    • traffic throughput or goodput,

    • request/response delay,

    • transaction duration, and

    • allocation of resources.
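
The traffic parameters listed above can be computed from per-transaction records of legitimate clients. The sketch below is one possible formulation under assumptions: the `Transaction` fields, the metric names, and the choice to count goodput only for completed transactions are illustrative, not definitions from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    sent_packets: int
    received_packets: int
    payload_bytes: int      # application bytes delivered
    start_s: float
    end_s: Optional[float]  # None if the transaction never completed


def dos_metrics(transactions: list[Transaction], window_s: float) -> dict:
    """Summarize legitimate-traffic health over an observation window."""
    sent = sum(t.sent_packets for t in transactions)
    received = sum(t.received_packets for t in transactions)
    done = [t for t in transactions if t.end_s is not None]
    durations = [t.end_s - t.start_s for t in done]
    return {
        "packet_loss": (1 - received / sent) if sent else 0.0,
        "goodput_bytes_per_s": sum(t.payload_bytes for t in done) / window_s,
        "mean_duration_s": sum(durations) / len(durations) if durations else None,
        "failed_fraction": (1 - len(done) / len(transactions)) if transactions else 0.0,
    }


if __name__ == "__main__":
    sample = [
        Transaction(10, 8, 1000, 0.0, 2.0),
        Transaction(10, 2, 0, 0.0, None),  # starved during the attack
    ]
    print(dos_metrics(sample, window_s=10.0))
```

Comparing these numbers against a baseline measured without the attack gives an impact figure that does not depend on how the service was denied.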


Conceptual games

  • Trustworthy computing

  • Trusted computing

  • Secure computing

  • Robustness

  • Survivability

  • Adoptability

  • Dependable computing

  • Availability

  • Maintainability

  • Reliability

  • Confident computing

  • Controllability

  • Cybersecurity

  • Manageability

  • Assurance

  • Usability

  • Integrity

  • Safety


Concluding remarks

  • Dependable computing is a perennial topic for information technology

    • Dependability is as important as high performance and low power.

  • New challenges are coming with the advance of IT

  • The gap between academia and industry

  • Concentrate on practical problems, rather than conceptual games


Thank you for your attention!

