
Physical Programming: Beyond Mere Logic

Bran Selic, Rational Software Canada, bselic@rational.com


Presentation Transcript


  1. Physical Programming: Beyond Mere Logic
     Bran Selic, Rational Software Canada, bselic@rational.com

  2. What I am Hoping For
     [Figure: “E THEORY AND PRACTICE OF SOFTWARE”]

  3. The Ideal and the Real
     [Figure: Plato]
     • By focussing on the imperfect world of physical reality we may miss the essence
     • Software seems much closer to the “ideal” world

  4. The Software World
     • Fundamental design principle: separate program logic from the underlying implementation technology
        • separation of concerns
        • software portability
     [Figure labels: Program Logic; HL Programming Languages; Computing Environment & Technology]

  5. The Real-Time Software World
     [Figure labels: Program Logic; HL Programming Languages; Computing Environment & Technology]
     • Key question: How long will it take?
     • The quantitative characteristics of the computing environment encroach upon the purity of the logic
        • software design involves engineering tradeoffs

  6. A Simple Programming Application
     [Figure labels: CPU; Printer; DB]
     • Traverse a transactions log database and print all transactions pertaining to a specific account

       open (DB);
       for i := 1 to DB.size do
         record := read (DB);
         if (record.acctNo = myAccount) then
           print (record);
       enddo;
       close (DB);

  7. Porting to a Distributed Environment
     [Figure labels: replicated DB servers; CPU; Printer; DB; Network]
     • Can it really be this simple?

       Original (local) version:

       open (DB);
       for i := 1 to DB.size do
         record := read (DB);
         if (record.acctNo = myAccount) then
           print (record);
       enddo;
       close (DB);

       Distributed (RPC) version:

       RPC_open (DB);
       for i := 1 to DB.size do
         record := RPC_read (DB);
         if (record.acctNo = myAccount) then
           print (record);
       enddo;
       RPC_close (DB);

  8. Some (Unstated!) Assumptions
     • The CPU and database are fast enough for the needs of the application
        • e.g., random access database hardware
     • The CPU and database fail as a unit
        • i.e., no need to contend with failures of the database
     • Communication is reliable
        • order preserving
        • exactly-once semantics
     • A system never has anything more important to do than what it is doing at the moment

  9. Partial Failures
     • Distributed systems can exhibit partial failures
        • fault tolerance: ability to recover from partial failures
     • Issue: failure recovery strategy
        • fault detection
        • failure recovery
        • fault diagnosis
     • Issue: how do other sites detect that a site has failed?
        • (apparent) lack of activity/response
        • how do we distinguish between a failed site and a lost message?
     • Timeout is the only general mechanism available
        • how long do we wait?
     • Tradeoff between responsiveness and degree of certainty
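The timeout tradeoff can be made concrete with a minimal Python sketch. The class `TimeoutDetector`, its method names, and the numbers are hypothetical and chosen only for illustration. The detector can only *suspect* a site: silence longer than the timeout looks the same whether the site crashed, is slow, or its message was lost.

```python
class TimeoutDetector:
    """Suspects a site when nothing has been heard from it for longer than timeout_s.

    Illustrative only: a suspicion is not proof of failure.
    """

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heard = {}

    def heard_from(self, site, now_s):
        # Record the time of the most recent message from this site.
        self.last_heard[site] = now_s

    def suspects(self, site, now_s):
        # A site never heard from is treated as suspected.
        if site not in self.last_heard:
            return True
        return (now_s - self.last_heard[site]) > self.timeout_s


# The site is alive, but its next message is 0.8 s late (or was lost in transit).
impatient = TimeoutDetector(timeout_s=0.5)   # responsive, less certain
patient = TimeoutDetector(timeout_s=2.0)     # more certain, less responsive
for detector in (impatient, patient):
    detector.heard_from("db-site", now_s=10.0)

impatient_verdict = impatient.suspects("db-site", now_s=10.8)
patient_verdict = patient.suspects("db-site", now_s=10.8)
print(impatient_verdict, patient_verdict)
```

The short timeout reacts quickly but wrongly suspects a live site; the long timeout avoids that mistake at the price of reacting later to a real failure.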

  10. A More Realistic Distribution Scenario
      • Dealing with partial failures

        DB := locate_database (Network) exception abort;
        RPC_open (DB) exception do
          DB := locate_database (Network) exception abort;
        enddo;
        for i := 1 to DB.size do
          record := RPC_read (DB) exception do
            DB := locate_database (Network) exception abort;
            for j := 1 to (i-1) do RPC_read (DB) exception abort;
            retry;
          enddo;
          if (record.acctNo = myAccount) then
            print (record);
        enddo;
        RPC_close (DB);

      • Most of the code is in the exception handlers!
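A runnable approximation of the failover logic is sketched below in Python. Everything here is an assumption made for illustration: the `Replica` class simulates a replicated DB server, `SiteDown` stands in for an RPC failure, and records are read by index, so the slide's step of re-reading the first i-1 records to reposition a sequential cursor is not needed.

```python
class SiteDown(Exception):
    """Raised by the simulated RPC layer when a replica does not answer."""


class Replica:
    """Hypothetical stand-in for a replicated DB server.

    If fail_after is set, the replica stops answering after that many reads.
    """

    def __init__(self, records, fail_after=None):
        self.records = records
        self.fail_after = fail_after
        self.reads = 0

    def read(self, index):
        if self.fail_after is not None and self.reads >= self.fail_after:
            raise SiteDown()
        self.reads += 1
        return self.records[index]


def collect_account(replicas, my_account):
    """Return the records for my_account, switching replicas when a read fails.

    Assumes at least one replica and that all replicas hold the same log.
    """
    remaining = list(replicas)
    db = remaining.pop(0)
    matches = []
    i = 0
    while i < len(db.records):
        try:
            record = db.read(i)
        except SiteDown:
            if not remaining:
                raise              # plays the role of "exception abort"
            db = remaining.pop(0)  # plays the role of locate_database
            continue               # plays the role of "retry"
        if record["acctNo"] == my_account:
            matches.append(record)
        i += 1
    return matches


log = [{"acctNo": 1, "amt": 10}, {"acctNo": 2, "amt": 5}, {"acctNo": 1, "amt": 7}]
result = collect_account([Replica(log, fail_after=1), Replica(log)], my_account=1)
print(result)
```

Even in this small sketch the recovery path takes about as many lines as the main loop, which is the point the slide makes.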

  11. Asynchronous Events and Fault Tolerance
      • Partial system failures are only one kind of event that may need to be handled in the course of execution of a distributed program
      • Others:
         • high-priority situations (e.g., imminent deadlines)
         • aborts
      • These events are often unpredictable
         • may occur at any point in the execution of a program
         • fault tolerance requires that whenever they occur and whatever they are, we need to deal with them

  12. Revisiting An Old Assumption
      [Figure labels: Step N; Step N+1; Step N+2; Exception!; Handler AN; Handler AN+1; Handler B]
      • Is the traditional “main path” focussed programming style appropriate when exceptions are the rule?

  13. Asynchronous Event Handling
      [Figure labels: Step N; Step N+1; Step N+2; Event S; Event A; Event B; etc.; Handler AN; Handler AN+1; Handler AN+2; Handler B]
      • This is nicely captured by the state-event matrix of finite state machines

  14. A Conclusion
      • In an event-driven and deadline-based application, a state machine-based programming model may be more appropriate than the traditional algorithmic (“main path”) programming model
      • The environment strikes back
         • the program logic is strongly affected by the environment
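A state-event matrix can be sketched in Python as a table keyed by (state, event). The state names, event names, and handler names below are hypothetical; they loosely follow the figure labels (steps N to N+2, a state-specific Event A, and an Event B that is handled the same way in every state).

```python
STATES = ("step_n", "step_n1", "step_n2")

# (state, event) -> (handler name, next state). All names are illustrative.
MATRIX = {
    ("step_n", "done"): ("proceed", "step_n1"),
    ("step_n1", "done"): ("proceed", "step_n2"),
    ("step_n", "A"): ("handler_A_n", "step_n"),
    ("step_n1", "A"): ("handler_A_n1", "step_n1"),
    ("step_n2", "A"): ("handler_A_n2", "step_n2"),
}
# Event B may arrive in any state and gets the same handler everywhere.
for state in STATES:
    MATRIX[(state, "B")] = ("handler_B", state)


def run(events, state="step_n"):
    """Feed events through the matrix; return the final state and handlers invoked."""
    trace = []
    for event in events:
        handler, state = MATRIX[(state, event)]
        trace.append(handler)
    return state, trace


final_state, trace = run(["done", "A", "B", "done"])
print(final_state, trace)
```

Every cell of the matrix is an explicit decision about what to do with a given event in a given state, so no combination is left to an afterthought in an exception handler. A pair missing from the table (here, "done" in the last step) raises `KeyError`, which makes unhandled combinations visible.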

  15. Communication Media Failures
      • Message loss
         • due to hardware failures
         • due to software failures (e.g., buffer overflow)
      • Message reordering
         • due to different paths
         • due to variable delays (e.g., due to variable message lengths)
         • retransmission due to fault-tolerant protocols
      • Message duplication
         • due to faulty hardware
         • retransmission due to fault-tolerant protocols
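One common way to mask these three failure modes is to tag messages with sequence numbers, which is a technique not named on the slide. The Python sketch below assumes a hypothetical `InOrderReceiver` that discards duplicates, buffers out-of-order arrivals, and reports gaps that may indicate loss.

```python
class InOrderReceiver:
    """Delivers payloads in sequence-number order, exactly once each."""

    def __init__(self):
        self.next_seq = 0     # next sequence number to deliver
        self.pending = {}     # out-of-order messages waiting for a gap to fill
        self.delivered = []

    def receive(self, seq, payload):
        if seq < self.next_seq or seq in self.pending:
            return            # duplicate: already delivered or already buffered
        self.pending[seq] = payload
        # Deliver as long as the next expected message is available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

    def missing(self):
        """Sequence numbers below the highest buffered one that have not arrived."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]


rx = InOrderReceiver()
# Message 1 overtakes message 0, message 1 is duplicated, message 2 is lost.
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (3, "d")]:
    rx.receive(seq, payload)
print(rx.delivered, rx.missing())
```

The receiver cannot tell whether message 2 is lost or merely late; deciding when to request a retransmission again comes down to a timeout.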

  16. Transmission Delays
      [Figure labels: two Processing Sites; observer; on; off; “State?”; “on”]
      • Possibility of out of date status information

  17. Relativistic Effects
      [Figure labels: clientA; notifier1; notifier2; clientB; events E1 and E2 shown in different orders along the time axis]
      • Relativistic effects:
         • different observers see different event orderings (due to different and variable transmission delays)
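The effect is easy to reproduce with a few lines of Python. The send times and delays below are made-up numbers, and `observed_order` is a hypothetical helper: each observer orders events by arrival time, which is send time plus the delay on its own path.

```python
def observed_order(send_times, delays):
    """Order events by arrival time at one observer (send time + path delay)."""
    arrivals = {event: send_times[event] + delays[event] for event in send_times}
    return sorted(arrivals, key=arrivals.get)


# E1 is sent before E2 (illustrative times, arbitrary units).
send_times = {"E1": 0.0, "E2": 1.0}

# Each client sits at a different "distance" from each notifier.
client_a = observed_order(send_times, {"E1": 1.0, "E2": 5.0})
client_b = observed_order(send_times, {"E1": 5.0, "E2": 1.0})
print(client_a, client_b)
```

Neither client is wrong: without extra machinery such as agreed timestamps or an ordering protocol, there is no single observed order for the two events.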

  18. Distribution Transparencies
      [Figure labels: client and server Processing Sites, each with a Reliable Comm Service, connected by a Communications Medium]
      • Providing supporting layers of functionality that shield the application from the undesirable effects of distribution
         • e.g., reliable communication protocols

  19. Impossibility Result No. 1
      It is not possible to guarantee that agreement can be reached in finite time over an asynchronous communication medium, if the medium is lossy or one of the distributed sites can fail
      • Fischer, M., N. Lynch, and M. Paterson, “Impossibility of Distributed Consensus with One Faulty Process,” Journal of the ACM, (32, 2), April 1985.

  20. Impossibility Result No. 2
      Even when communication is fully reliable, it is not possible to guarantee common knowledge if communication delays are unbounded
      • Halpern, J.Y., and Moses, Y., “Knowledge and common knowledge in a distributed environment,” Journal of the ACM, (37, 3), 1990.

  21. The “End-To-End” Argument
      • Transparency mechanisms are intended to protect the application from observing the undesirable effects of distribution
      • Most transparency types require distributed agreement!
      • The end-to-end argument [Saltzer et al.]:
         • if transparency cannot be guaranteed, the application is not really shielded from the effects of distribution
         • the overhead of introducing transparency mechanisms may not be justified

  22. Stepping Back...
      • Most distribution problems are a consequence of the encroachment of the physical world into the pliable and limitless “logical” world of software
         • the problem is fundamental (e.g., the end-to-end argument)
      • Traditional Programming = Logic
      • Physical Programming = Logic + Physics
         • like traditional engineers, software designers must take into account the raw material out of which they spin their logic
         • finite resources, finite delays, finite reliability...

  23. Quality of Service Concepts
      • The physical characteristics of software can be specified using the general notion of Quality of Service (QoS): a specification of how well a service is (to be) performed
         • e.g., throughput, capacity, response time
         • usually a quantitative measure
      • QoS specifications are two-sided:
         • offered QoS: the QoS that is offered to clients
         • required QoS: the QoS required by a client

  24. Resources and Quality of Service
      • Resource: an element whose functional capacity is limited, directly or indirectly, by the finite capacities of the underlying physical computing environment
      • The services of a resource are characterized by one or more QoS attributes
         • capacity, reliability, availability, response time, etc.
      [Figure labels: Client with Resource Demand and RequiredQoS; Resource offering service S1 with OfferedQoS; constraint {RequiredQoS ≤ OfferedQoS}]

  25. Simple Example
      • Concurrent tasks accessing a monitor with known response time characteristics
      [Figure labels: Client1 and Client2 each call access ( ) on myMonitor. Required QoS: {Deadline = 5 ms} and {Deadline = 3 ms}. Offered QoS: {MaxExecutionTime = 4 ms}]
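A back-of-the-envelope check of these numbers is sketched below in Python. The analysis model is an assumption, not taken from the slide: the monitor admits one caller at a time, a caller may have to wait for one full execution per competing caller, and the call names `call_a` and `call_b` are placeholders because the transcript does not show which client owns which deadline.

```python
def worst_case_response_ms(max_exec_ms, competing_callers):
    """Own execution plus one full execution per caller that may be ahead in the queue.

    Assumes mutual exclusion with no preemption and one pending call per competitor.
    """
    return max_exec_ms * (competing_callers + 1)


MAX_EXECUTION_MS = 4.0                              # offered QoS of the monitor
deadlines_ms = {"call_a": 5.0, "call_b": 3.0}       # required QoS of the two calls

contended = worst_case_response_ms(MAX_EXECUTION_MS, competing_callers=1)
uncontended = worst_case_response_ms(MAX_EXECUTION_MS, competing_callers=0)

meets_when_contended = {c: contended <= d for c, d in deadlines_ms.items()}
meets_when_alone = {c: uncontended <= d for c, d in deadlines_ms.items()}
print(contended, meets_when_contended, meets_when_alone)
```

Under these assumptions the 3 ms deadline cannot be met even without contention, and the 5 ms deadline is met only when the other client is not already inside the monitor.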

  26. Types and “Physical” Types
      • The purpose of types is to tell us about the externally relevant properties of software components so that we can validate whether they are being used appropriately
      • Physical types: type specifications that incorporate QoS characteristics
      • Answer two key engineering questions:
         • can this component support the “load” intended for it?
         • what does this component require to support its offered QoS?

  27. Physical Type Example
      • A semaphore type:

        class Semaphore {
          {heap= 10 bytes}   -- required QoS
          {CPU 5 MIPS}       -- required QoS
          get(){proc 0.4*CPU us;stack=4 bytes};
          rel(){proc 0.4*CPU us;stack=4 bytes};
        }

      • Usage:

        mySema : Semaphore;
        mySema.get() {proc 3 us}   -- req. QoS

  28. Violation of Encapsulation?
      • Aren’t the offered QoS characteristics a consequence of the implementation?
      • Not necessarily...
      • The offered QoS characteristics can and should be defined independently of the implementation
         • the “worst-case” numbers of traditional engineering
      • The contractual obligations that the component designer is willing to assume

  29. Physical Type Checking
      • Can physical types be statically checked?
      • The good news: Yes, they can (in most cases)
      • The bad news: this typically requires complex analysis methods (queueing network analysis, schedulability analysis, etc.)
         • but then, model checking and theorem proving are not simple either
      • Some issues:
         • Typically, QoS-based analyses cannot be done incrementally: the full system context is required
            • but then, the same holds for many formal verification methods
         • Each type of QoS (e.g., bandwidth, CPU performance) combines differently
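The remark that each type of QoS combines differently can be illustrated with a short Python sketch. The composition rules below are textbook simplifications assumed for the example, not rules stated in the slides: components are chained in series, delays add up, bandwidth is limited by the narrowest link, and reliabilities multiply when failures are independent.

```python
import math


def end_to_end_delay_ms(delays_ms):
    """Delays along a serial path accumulate."""
    return sum(delays_ms)


def end_to_end_bandwidth_mbit_s(bandwidths_mbit_s):
    """A serial path is only as fast as its slowest link."""
    return min(bandwidths_mbit_s)


def end_to_end_reliability(reliabilities):
    """With independent failures, every component in the chain must work."""
    return math.prod(reliabilities)


# Illustrative values for a three-hop path.
delay = end_to_end_delay_ms([2.0, 5.0, 1.0])
bandwidth = end_to_end_bandwidth_mbit_s([100, 70, 1000])
reliability = end_to_end_reliability([0.99, 0.9])
print(delay, bandwidth, reliability)
```

Because the combining operator differs per attribute, a physical type checker needs a composition rule for each kind of QoS, and the result for one component generally depends on the rest of the path.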

  30. Required QoS
      [Figure labels: Client; ResourceA; ResourceB; services S1 and S2; CPU; Physical Processor]
      • Like all guarantees, the offered QoS is contingent on the component getting what it needs to do its job
      • There are two distinct dimensions to this:
         • the peer dimension
         • the layering dimension

  31. Logical Viewpoint
      • Example: logical view of aircraft simulator software
      [Figure labels: INSTRUCTOR STATION; AIRFRAME; ATMOSPHERE MODEL; PILOT CONTROLS; CONTROL SURFACES; GROUND MODEL; ENGINES]

  32. Engineering (Realization) Viewpoint
      [Figure labels: Processors; OS processes; stacks; TCP/IP sockets; Ethernet LAN]
      • The realization of a specific set of logical components using facilities of the run-time environment

  33. Viewpoints and Mappings
      [Figure: the Logical Viewpoint (INSTRUCTOR STATION, AIRFRAME, ATMOSPHERE MODEL, PILOT CONTROLS, CONTROL SURFACES, GROUND MODEL, ENGINES) connected by realization mappings to the Engineering Viewpoint (Processors, OS processes, stacks, TCP/IP sockets, Ethernet LAN)]

  34. The Engineering Viewpoint
      • The engineering viewpoint represents the “raw material” out of which we construct the logical viewpoint
         • the quality of the outcome is only as good as the quality of the ingredients that are put in
         • as in all true engineering, the quantitative aspects of the logical model are often crucial (How long will it take? How much will be required?…)

  35. Distributed Systems Dilemma
      • Dilemma: How can we account for the engineering characteristics of the system without prematurely and possibly unnecessarily committing to a specific technology?
      • Proposed solution: Include in the logical model a generic (technology-neutral) specification of the required/expected characteristics of the engineering environment

  36. Viewpoint Separation
      [Figure labels: Logical Viewpoint; Required Environment; Engineering Viewpoint (alternative A) and Engineering Viewpoint (alternative B), each with UNIX Process and WinNT Process]
      • Required Environment: a technology-neutral environment specification required by the logical elements of a model

  37. Required Environment Specifications
      • What a logical component needs in order to perform its function according to spec
      [Figure: the Airframe logical element (client) is linked by a realization mapping to engineering elements (resources: CPU, LAN)
         required QoS values: CPU: 3 MIPS; Bandw.: 70 Mbit/s; Mem: 2 MB
         offered QoS values: 20 MB; 3 MIPS; 100 Mbit/s]
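The comparison in the figure can be written out as a small Python check. The dictionary keys and the `shortfalls` helper are hypothetical, and the pairing of offered values with attributes is inferred from their units. The sketch also assumes that every attribute here is a capacity, where a larger offered value satisfies a smaller requirement.

```python
# Required QoS of the Airframe logical element (values from the slide).
required = {"cpu_mips": 3, "bandwidth_mbit_s": 70, "memory_mb": 2}

# Offered QoS of the engineering environment (values from the slide,
# matched to attributes by unit, which is an assumption).
offered = {"cpu_mips": 3, "bandwidth_mbit_s": 100, "memory_mb": 20}


def shortfalls(required, offered):
    """Return {attribute: (required, offered)} for every requirement not met.

    An attribute the environment does not offer at all counts as 0.
    """
    return {
        name: (need, offered.get(name, 0))
        for name, need in required.items()
        if offered.get(name, 0) < need
    }


gaps = shortfalls(required, offered)
slow_lan_gaps = shortfalls(required, {**offered, "bandwidth_mbit_s": 10})
print(gaps, slow_lan_gaps)
```

With the slide's numbers the environment is adequate, although the CPU requirement is met with no margin. Swapping in a slower LAN shows how the same logical specification rejects an inadequate environment without naming any particular technology.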

  38. Required Environment Partitions
      • Logical elements often share common QoS requirements
      [Figure: INSTRUCTOR STATION, AIRFRAME, ATMOSPHERE MODEL, PILOT CONTROLS, CONTROL SURFACES, GROUND MODEL and ENGINES grouped into QoS domains (e.g., failure unit, uniform comm properties)]

  39. QoS Domains
      • Specify a domain in which certain QoS values apply throughout:
         • failure characteristics (failure modes, availability, reliability)
         • CPU speeds
         • communications characteristics (delay, throughput, capacity)
         • etc.
      • The QoS values of a domain can be compared against those of a concrete engineering environment to see if a given environment is adequate for a specific model

  40. “Physical” Programming
      • The notions of QoS and QoS domains enable the design of distributed systems that properly account for the effects of distribution and other non-transparent physical phenomena, while allowing for a high degree of portability and technology independence
      • They are also the basis for formal verification of realization mappings
         {required QoS ≤ QoS of the proposed engineering environment}
      • May also be used to automatically synthesize engineering environments that satisfy a given QoS specification of a logical model

  41. Conclusions and an Appeal...
      • The physical aspects of software will not go away
         • ignoring them can be perilous, especially when working with distributed systems
         • most interesting software systems of the future will be distributed and will have stringent dependability requirements (“cannot reboot the Internet”)
      • What is needed is a proper theoretical framework for dealing with physical types
      • The QoS framework described here is currently being incorporated into a profile of UML for real-time applications
