Programming Languages: The Essence of Computer Science



  1. Programming Languages: The Essence of Computer Science Robert Harper, Carnegie Mellon University, October 2002

  2. CS Is About Programming • How to build systems. • Better, faster, cheaper. • Reliable, maintainable, extensible. • Evaluating and comparing systems. • Performance, behavior, security. • Compliance, usability. • What can and can’t be done. • Algorithms and complexity.

  3. Programming Is Linguistic • Programming is an explanatory activity. • To yourself, now and in the future. • To whoever has to read your code. • To the compiler, which has to make it run. • Explanations require language. • To state clearly what is going on. • To communicate ideas over space and time.

  4. Programming Languages Are Essential • Therefore languages are of the essence in computer science. • Programming languages, in the familiar sense. • Description languages, design patterns, architecture languages, specification languages.

  5. Some Conventional Wisdom • PL research is over. • The language of the future is X … for various values of X. • It’s not about the one, true language! • Anyone can design a PL. • Clearly, anyone does. • But look what you get! TCL, Perl, TeX, C++. • PL research is irrelevant to practice. • Seen as a purely academic pursuit. • But the tide is turning as successes accumulate.

  6. Some Accomplishments • High-level languages. • Static type disciplines. • Automatic storage management. • Objects, classes, ADT’s. • Sophisticated compiler technology. • Specification and verification.

  7. Why People Don’t Notice • It takes decades to go from research to practice. • Similar to other areas, such as algorithms, AI. • Large installed base of programmers. • Ideas become “common sense” long before they are widely used. • Even very simple things, such as lexical scope, were once controversial! • Not to mention data abstraction, OOP, ….

  8. Some Important Trends • Safety with performance. • High-level languages with good compilers. • Low-level languages with good type systems. • Languages, not tools. • The design should live with the code. • Conformance should be checkable throughout the evolution of the code.

  9. Some Important Trends • Interplay between theory and practice. • “Prove theorems and report numbers.” • Build systems to assess practicality and to discover new problems. • Full-spectrum coverage. • Functional and imperative. • High and low-level languages. • Design and implementation.

  10. Some Important Trends • Type theory as the GUT of PL’s. • Provides a precise criterion for safety and sanity of a design. • “Features” correspond to types. • Close connections with logics and semantics.

  11. What Is A Type System? • Static semantics: the well-formed programs. • Dynamic semantics: the execution model.

  12. What is a Type System? • Safety theorem: types predict behavior. • Types describe the states of an abstract machine model. • Execution behavior must cohere with these descriptions. • Thus a type is a specification and a type checker is a theorem prover.
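
Stated in the conventional form (this formulation is standard, not quoted from the slides), the safety theorem splits into two parts:

    Preservation:  if e : τ and e steps to e′, then e′ : τ.
    Progress:      if e : τ, then e is a value, or e can take a step.

Together these guarantee that a well-typed program never gets stuck, which is the sense in which execution behavior must cohere with the types.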

  13. Types Are Specifications • Examples: • e : float means that e evaluates to a value in, say, floating point register 0. • e : float → int means that e is a procedure that is called with arg in FR0 and returns with result in GR0. • e : queue means that e behaves like a queue.
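
To make the last point concrete, here is a hypothetical OCaml sketch (OCaml standing in for the ML family used elsewhere in the talk; the signature and names are illustrative, not from the slides) of how a type can serve as a behavioral specification:

    module type QUEUE = sig
      type 'a queue                                      (* representation hidden from clients *)
      val empty   : 'a queue                             (* the empty queue *)
      val enqueue : 'a -> 'a queue -> 'a queue           (* add an element at the back *)
      val dequeue : 'a queue -> ('a * 'a queue) option   (* take from the front; None if empty *)
    end

Saying e : 'a queue commits e to exactly this interface: the checker enforces the calling conventions, and the abstraction boundary keeps clients from depending on the representation.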

  14. Types Are Formal Methods • Type checking is the most successful formal method! • In principle there are no limits. • In practice there is no end in sight. • Examples: • Using types for low-level languages, say inside a compiler. • Extending the expressiveness of type systems for high-level languages.

  15. Types in Compilation • Conventional compilers: Source = L1 → L2 → … → Ln = Target, with a type system T1 for the source only. • Types apply only to the source code. • Type check, then discard types. • If compiler is correct, target code is safe.

  16. Typed Intermediate Languages • Generalize syntax-directed translation to type-directed translation. • Intermediate languages come equipped with a type system. • Compiler transformations translate both a program and its type. • Translation preserves typing: if e : T, then e* : T* after translation.

  17. Typed Intermediate Languages • Type-directed translation: Source = L1 : T1 → L2 : T2 → … → Ln : Tn = Target. • Transfers typing properties from source code to object code. • Check integrity of compiler. • Exploit types during code generation.
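
As a minimal sketch of type-directed translation (assuming a toy source and intermediate language; this is not TILT's actual IL), the translation below maps both terms and their types, so that if e : T in the source then the translated term has type T* in the IL:

    (* Source: booleans and integers.  IL: integers only; the type translation
       records the decision to represent booleans as 0/1. *)
    type src_ty  = SBool | SInt
    type src_exp = STrue | SFalse | SNum of int | SIf of src_exp * src_exp * src_exp

    type il_ty  = IInt
    type il_exp = INum of int | IIfZero of il_exp * il_exp * il_exp
                  (* IIfZero (scrutinee, branch if zero, branch if nonzero) *)

    let trans_ty (t : src_ty) : il_ty =
      match t with SBool -> IInt | SInt -> IInt

    let rec trans_exp (e : src_exp) : il_exp =
      match e with
      | STrue  -> INum 1
      | SFalse -> INum 0
      | SNum n -> INum n
      | SIf (c, e1, e2) ->
          (* false compiles to 0, so the zero branch is the else branch *)
          IIfZero (trans_exp c, trans_exp e2, trans_exp e1)

In a real typed IL the type structure is rich enough (closures, records, machine-level values) that re-checking each intermediate program catches compiler bugs, which is the "check integrity of the compiler" point above.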

  18. Certifying Compilers • Types on object code certify its safety. • Type checking ensures safety. • Type information ensures verifiability. • Examples of certified object code: • TAL = typed assembly language. • PCC = bare code + proof of safety. • Many variations are being explored.

  19. TAL Example
      fact: ∀ρ. {r1:int, sp:{r1:int, sp:ρ}::ρ}
        jgz  r1, positive
        mov  r1, 1
        ret
      positive:
        push r1                            ; sp : int::{r1:int, sp:ρ}::ρ
        sub  r1, r1, 1
        call fact[int::{r1:int, sp:ρ}::ρ]
        imul r1, r1, r2
        pop  r2                            ; sp : {r1:int, sp:ρ}::ρ
        ret

  20. Types for Low-Level Languages • What is a good type system for a low-level language? • Should expose data representations. • Should allow for low-level “hacks”. • Should be verifiably safe. • Should not compromise efficiency. • Current systems make serious compromises. • Very weak safety properties. • Force atomicity of complex operations.

  21. Example: Memory Allocation • Most type systems take an atomic view of constructors. • Allocate and initialize in “one step”. • Even HLL’s like Java impose restrictions. • We’d like to expose the “fine structure”. • Support code motion such as coalescing. • Allow incremental initialization. • But nevertheless ensure safety!

  22. Example: Memory Allocation • An allocation protocol (used in TILT): • Reserve: obtain raw, un-initialized space. • Initialize: assign values to the parts. • Allocate: baptize as a valid object. • Current type systems cannot handle this. • Partially initialized objects. • In-place modification of parts. • Interaction with collected heap.
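
A hypothetical OCaml rendering of the reserve/initialize/allocate protocol, using a phantom state parameter (the signature and operation names are illustrative; OCaml cannot enforce the linearity and ordering the following slides supply):

    type reserved   (* raw, uninitialized space *)
    type valid      (* a fully initialized, collectable object *)

    module type ALLOC = sig
      type 'state obj                                      (* an object in state 'state *)
      val reserve    : int -> reserved obj                 (* obtain n words of raw space *)
      val initialize : reserved obj -> int -> int -> unit  (* set word i in place *)
      val allocate   : reserved obj -> valid obj           (* baptize as a valid object *)
      val read       : valid obj -> int -> int             (* reads only on valid objects *)
    end

The phantom index documents the intended transitions, but nothing here prevents reading a reserved object through a stale alias or forgetting to allocate it; those are exactly the guarantees the ordered type system adds.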

  23. Example: Memory Allocation • [Diagram: the heap during the allocation protocol, showing partially initialized cells (?, 1, ?, 2, ?, 0, ?) and the pointers labeled AP, LP, and HP.]

  24. A Low-Level Type System • Borrow two ideas from linear logic. • Restricted and unrestricted variables. • A modality to mediate between them. • Restricted variables are resources. • Think: pointer into middle of an object. • Unrestricted variables are standard. • Think: heap pointer.

  25. A Low-Level Type System • Variables are bound to valid objects. • Can be used freely. • Garbage collected when inaccessible. • Resources are bound to parts of objects-in-creation. • Cannot be passed around freely. • Explicitly allocated and disposed.

  26. Restrictions on Resources • Linearity: use resources exactly once. • Admits re-typing after initialization. • Ensures allocation before general usage. • Ordering: resource adjacency matters. • Admits "pointers into the middle" of objects. • Supports in-place, piecemeal initialization.

  27. Variables and Resources • Typing judgments distinguish ordinary variables x from ordered resources a. • Ordering of the x's does not matter. • Abstract "mobile" locations. • Ordering and usage of the a's does matter. • Abstract "pinned" locations, with (a form of) pointer arithmetic.

  28. Low-Level Type Constructors
      • Contiguous data: τ1 • τ2.
        • Two contiguous values.
        • Two adjacent words: int • int.
      • Mobile data object: !τ.
        • A fully initialized object of type τ.
      • Example: τ1 × τ2 := !(τ1 • τ2).
        • A pointer to an adjacent pair of values.

  29. Allocating a Pair • Allocate (1,2): • Reserve space at a. • Create names for the parts. • Initialize a1, using it up. • Re-introduce a1. • Fuse the parts and allocate: resource a is used up! • The resource context must be empty on return.
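
Using the hypothetical ALLOC signature sketched after slide 22, the same steps read as follows (again only a sketch; it is the ordered type system, not OCaml, that checks the resource is fully initialized and used exactly once):

    module Alloc_pair (A : ALLOC) = struct
      let pair_1_2 () =
        let a = A.reserve 2 in   (* reserve two adjacent words: an ordered resource *)
        A.initialize a 0 1;      (* initialize the first component in place *)
        A.initialize a 1 2;      (* initialize the second component *)
        A.allocate a             (* fuse the parts and allocate: a is used up *)
    end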

  30. What Have We Gained? • The ordered type system ensures: • All reserved data is eventually allocated. • Initialization can happen in any order. • Cannot read un-initialized memory. • May also be used for inter-operability. • Adjacency requirements. • Marshalling and un-marshalling. • Precise control over representations.

  31. Types for High-Level Languages • Recall: types express invariants. • Calling sequence, data representation. • Abstraction boundaries (polymorphism). • Theme: capture more invariants. • Representation invariants. • Protocol compliance. • Trade-off expressiveness for simplicity and convenience.

  32. Data Structure Invariants • Example: bit strings. • nat : no leading 0's. • pos : a non-zero nat. • bits : any old thing. • Goal: check such properties of code. • inc takes nat to pos, preserves pos. • Left shift preserves nat, pos.

  33. Data Structure Invariants • Properties of interest: pos ≤ nat ≤ bits. • Operations: • ε : bits • add0 : bits → bits ∧ nat → pos • add1 : bits → bits ∧ nat → pos

  34. Data Structure Invariants • Logical consequences: • add0 : nat → pos ∧ nat → nat • add1 : pos → pos • ε : bits • Type check code to check invariants! • Simple bi-directional algorithm suffices. • Example: increment.

  35. Data Structure Invariants • Increment: inc : bits → bits ∧ nat → pos • fun inc (ε) ⇒ 1 | inc (b0) ⇒ b1 | inc (b1) ⇒ (inc b)0

  36. Data Structure Invariants • Fully mechanically checkable! • Type check inc twice, once for each conjunct. • First pass: assume argument is just bits, derive that result is therefore bits. • Second pass: assume argument is nat, derive that result is pos. • Requires checking entailment of properties. • Decidable for subtype-like behavior.
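
A runnable OCaml counterpart of the increment example, with nat ("no leading 0's") and pos ("non-zero nat") written as runtime predicates, since plain OCaml types cannot state the refinements (the datatype and helper names are mine, not the slides'):

    type bits = E | B0 of bits | B1 of bits   (* E is the empty string; B0/B1 append a low bit *)

    let rec inc (b : bits) : bits =
      match b with
      | E    -> B1 E            (* inc ε    ⇒ 1        *)
      | B0 b -> B1 b            (* inc (b0) ⇒ b1       *)
      | B1 b -> B0 (inc b)      (* inc (b1) ⇒ (inc b)0 *)

    let rec is_nat = function   (* no leading zeros *)
      | E -> true
      | B0 b -> b <> E && is_nat b
      | B1 b -> is_nat b

    let is_pos b = is_nat b && b <> E   (* a non-zero nat *)

    (* spot-check the second conjunct, nat → pos, on a few inputs *)
    let () =
      List.iter
        (fun b -> assert (not (is_nat b) || is_pos (inc b)))
        [E; B1 E; B0 (B1 E); B1 (B1 E)]

The assertion mirrors the second type-checking pass: on every nat input the result of inc is pos. The first pass, bits → bits, is just OCaml's ordinary typing of inc.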

  37. Value Range Invariants • Array bounds checking (a la Pascal): • [0..size(A)] • Null and non-null objects: • null, really(C) • Aliasing: • its(c): the object c itself

  38. Watch Out! • Such types involve dynamic entities. • [0..size(A)] : A is an array. • its(o) : o is a run-time object. • But types are static! • What is an expression of type [0..size(if … then A else B)]??? • How to get static checking?

  39. Dependent Types • Solution: compile-time proxies. • Track values insofar as possible. • Existentially quantify when it's not. • Examples: • 0 : its(0) • + : its(1) × its(2) → its(3) • if … then 1 else 2 : ∃n. its(n)
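
A hypothetical OCaml GADT sketch of these compile-time proxies (the singleton type snat plays the role of its(n); all names are illustrative). The type parameter tracks the value when it is statically known, and an existential package is used when it is not:

    type z          (* type-level zero *)
    type 'n s       (* type-level successor *)

    type _ snat =                            (* singletons: the type determines the value *)
      | Zero : z snat
      | Succ : 'n snat -> 'n s snat

    type packed = Pack : 'n snat -> packed   (* ∃n. its(n) *)

    (* "if … then 1 else 2 : ∃n. its(n)": the branches have different indices,
       so the result must be existentially packaged. *)
    let choose (b : bool) : packed =
      if b then Pack (Succ Zero) else Pack (Succ (Succ Zero))

Giving + a type like its(m) × its(n) → its(m+n) in general additionally requires type-level addition, which is where full dependent or richly indexed types come in.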

  40. Types for the World • What about the state of the world? • The lock l is held. • The file f is open. • The contents of cell c is positive. • But such properties change as execution proceeds! • Here I’m holding the lock, there I’m not. • The file is open now, not later.

  41. Types for the World • Want a simple, checkable type system for the “world”. • Types characterize the current state. • Types of the state change over time. • But what kind of type system? • Properties of the world are ephemeral. • Facts are no longer persistent. • Need: a logic of ephemera.

  42. Types for the World • Ephemera behave strangely: • Cannot be replicated: holding twice is not the same as holding once. • Cannot be ignored: must not lose track of whether you hold a lock. • Once again linear logic does the trick! • Model parts of the world as resources. • Type changes track state changes.

  43. Types for the World • A very simple locking protocol: • acquire : ∀l. its(l) / free(l) → unit / held(l) • release : ∀l. its(l) / held(l) → unit / free(l) • new : unit → ∃l. unit / free(l) • Here "/" separates the value type from the world type; the arrow is a "linear" function type: the world precondition gets "used up" and is "replaced" by the postcondition.
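
A hypothetical capability-passing rendering of this protocol in OCaml (illustrative only; OCaml cannot make the capabilities linear, so nothing stops a caller from reusing a stale free lock, which is precisely what the linear typing adds):

    type free    (* world state: lock not held *)
    type held    (* world state: lock held by us *)

    module type LOCKS = sig
      type 'state lock                        (* a lock in world state 'state *)
      val create  : unit -> free lock         (* new : unit → ∃l. unit / free(l) *)
      val acquire : free lock -> held lock    (* its(l) / free(l) → unit / held(l) *)
      val release : held lock -> free lock    (* its(l) / held(l) → unit / free(l) *)
    end

Under a linear discipline the argument capability is consumed by the call, so acquiring a lock twice, or forgetting to release it, shows up as a type error rather than a deadlock or a race.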

  44. Types for the World • What does this buy you? • Stating and enforcing simple protocol properties. • Locality of reasoning: focus only on what changes, not what stays the same. • It’s much harder than it sounds! • Separation logic: Reynolds and O’Hearn. • But it can be done: Vault Project at MSR.

  45. Summary • PL research is providing creative, useful solutions to practical problems. • Building on decades of fundamental research, esp. in logic and type theory. • There is much, much more coming! • Many ripe fruit, lots of new developments. • Many good problems yet to be addressed.

  46. Acknowledgements • Much of the work discussed here was done by or in collaboration with Karl Crary (CMU), Frank Pfenning (CMU), David Walker (Princeton), and Greg Morrisett (Cornell). • Plus dozens of our current and former students at Carnegie Mellon.
