Type Systems for Modularity

Type Systems for Modularity Robert Harper Fall Semester, 2002

This Course • Graduate reading seminar. • Pre-requisites: PL core or permission. • Experience with SML or O’Caml. • Web page: http://www.cs.cmu.edu/~rwh/courses/module systems • Course format: • Students present papers. • I will go first.

This Course • Expectations: • The class works iff everyone contributes equally and actively. • It is imperative that you come to class prepared by having read the relevant papers for that day. • Offenders will fail the course. • If you don’t have time, don’t take the course.

Goals • Goal: develop the type theory of module systems. • Surprisingly intricate! • Confluence of many important issues. • Methodology: typed λ-calculus. • Declarative type system. • Scalable. • Supports type-based implementations.

Namespace Management • Divide the program namespace into segments. • Avoid cluttering the global namespace. • Facilitate team development and code re-use. • Generalization of lexical scoping. • Name resolution relative to a module.

Namespace Management • Supported by most languages: • Packages in Java, Lisp. • Modules in Modula-2, -3. • Structures in ML.

Namespace Management
    structure Url = struct
      val push = …
      val pull = …
    end
    structure Stack = struct
      val push = …
    end

Namespace Management
    structure S = struct
      type elt = …
      val enq = …
      val deq = …
      val null = …
    end

Types and Values • Structures in ML contain type definitions and value definitions. • Functions and exceptions are particular kinds of values. • Datatypes are modules. • Constituents are accessed by paths. • S.x: the value component x of structure S. • S.t: the type component t of structure S. • Types now involve modules!

Hierarchical Structure • Divide large programs into relatively independent “chunks”. • Structure programs as trees or DAGs of components. • Isolate components within other components to enforce locality.

Hierarchical Structure
    structure Thread = struct
      structure TQueue = struct
        val deq = …
      end
      val yield = …
      val spawn = …
    end

Hierarchical Structure • Paths are extended to navigate through sub-structures: • S.T.U.x: the value x in U of T of S. • S.T.t: the type t in T of S. • Paths are also called long names.
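The path notation can be tried out on a small nested structure. The following is a minimal sketch in Standard ML; the structure names S, T, U and the value 42 are illustrative only and do not come from the course material.

```sml
(* A three-level hierarchy: U sits inside T, which sits inside S. *)
structure S = struct
  structure T = struct
    type t = int
    structure U = struct
      (* U can refer to t because it is nested inside T. *)
      val x : t = 42
    end
  end
end

(* Outside S, both the type and the value are reached by long names. *)
val y : S.T.t = S.T.U.x + 1
```

Because the declaration of t is visible, S.T.t is known to be int here, which is why the addition type-checks.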

Interface Ascription • Ascription = associating an interface with a module. • Descriptive: characterize visible properties of a module. • Just a “sanity check”. • Prescriptive: characterize and limit visible properties of a module. • Imposes restrictive “view”.

Interface Matching • When does one interface match another? • When does I satisfy J’s requirements? • Structural: interfaces match based on strength of requirements. • To match means to fulfill requirements. • Flexible, supports re-use. • But can incur “unintended” matches.

Interface Matching • Nominal: interfaces match only if explicitly declared to do so. • Inflexible, requires revision. • But avoids accidental matches. • Structural seems more practical. • Cannot anticipate the future.

Java Interfaces • Java interfaces are nominal, descriptive. • Must explicitly declare the interfaces a class can have. • Interfaces describe behavior, not code. • Java interfaces are not types. • Really “fully abstract” classes with special hacks for “multiple inheritance”. • Fundamentally a kludge.

ML Signatures • ML signatures are structural, prescriptive. • Signatures require that a module have certain components with specified properties. • Types of values. • Type equality relationships. • Signature ascription limits the client’s view of a module to what is specified. • Transparent: propagate type definitions. • Opaque: no implicit propagation.

ML Signatures
    (* The injection is named inj because "in" is a reserved word in SML. *)
    signature SIG = sig
      type t
      val trans : t -> t
      val out : t -> int
      val inj : int -> t
    end
    (* Transparent ascription *)
    structure S : SIG = struct
      type t = int * int
      fun trans (x, y) = (y + 1, x)
      fun out (x, y) = y
      fun inj x = (x, 0)
      fun f (x, y) = (x + 1, y + 1)
    end

ML Signatures
    signature SIG = sig
      type t
      val trans : t -> t
      val out : t -> int
      val inj : int -> t
    end
    (* Opaque ascription *)
    structure S :> SIG = struct
      type t = int * int
      fun trans (x, y) = (y + 1, x)
      fun out (x, y) = y
      fun inj x = (x, 0)
      fun f (x, y) = (x + 1, y + 1)
    end
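The difference between the two ascriptions shows up only at the client. The sketch below restates the example so that it runs on its own, with two assumptions that are not in the slides: the injection is called inj (since "in" is a reserved word in SML), and the implementation is bound once as Impl and then ascribed twice, as T (transparent) and O (opaque).

```sml
signature SIG = sig
  type t
  val trans : t -> t
  val out : t -> int
  val inj : int -> t
end

(* Unascribed implementation, shared by both views below. *)
structure Impl = struct
  type t = int * int
  fun trans (x, y) = (y + 1, x)
  fun out (x, y) = y
  fun inj x = (x, 0)
  fun f (x, y) = (x + 1, y + 1)   (* hidden by SIG in both views *)
end

structure T : SIG = Impl    (* transparent: T.t = int * int is propagated *)
structure O :> SIG = Impl   (* opaque: O.t is abstract *)

(* Accepted: the client may build a T.t directly from a pair. *)
val a = T.out (3, 4)

(* Accepted: values of O.t are obtained only through the interface. *)
val b = O.out (O.trans (O.inj 3))

(* Rejected by the type checker if uncommented, since O.t is not int * int:
   val c = O.out (3, 4)
*)
```

In both views the extra component f is inaccessible, because ascription is prescriptive: only what SIG specifies is visible.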

Principal Signatures • The principal signature of a module completely characterizes the compile-time significance of that module. • Every other signature is a weakening of the principal signature. • Type checkers work by computing principal signatures. • It is never obvious whether a module system has principal signatures (often they do not).

Signature Matching • ML defines a sub-signature pre-ordering S ≤ S’ stating that S is stronger than S’. • At least as many components with the required types. • At least as much sharing. • Ascription: check that the principal signature matches the target signature. • Principal = least with respect to this ordering.
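A small illustration of the ordering, as a sketch only: the signature names STRONG and WEAK, the functor NeedsWeak, and the structure A are hypothetical. STRONG has more components and reveals more type information than WEAK, so any structure matching STRONG can be used where WEAK is required.

```sml
(* WEAK asks for one value of some unspecified type. *)
signature WEAK = sig
  type t
  val x : t
end

(* STRONG has an extra component and also reveals that t = int. *)
signature STRONG = sig
  type t = int
  val x : t
  val y : t
end

(* A client that only needs WEAK. *)
functor NeedsWeak (X : WEAK) = struct
  val got = X.x
end

structure A : STRONG = struct
  type t = int
  val x = 1
  val y = 2
end

(* Accepted: the signature of A is stronger than WEAK. *)
structure R = NeedsWeak (A)
val one : int = R.got
```

Matching here is structural: A was never declared to implement WEAK, it simply fulfills WEAK's requirements.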

Parameterization • Instantiate generic modules. • Abstract common patterns. • Support reusable libraries. • Separate compilation is a form of parameterization. • Client parameterized on provider.

Parameterization • Few languages, apart from ML, support parameterized modules. • Can be hacked around using class loaders, but not easily or as a linguistic mechanism. • Functors are “first-order” parameterized modules. • Take and yield structures. • Some kind of “function”, but what kind?

ML Functors
    signature ORDSET = sig
      type t
      val leq : t * t -> bool
    end
    functor StringDict (structure Key : ORDSET) = struct
      type dict = … Key.t …
      fun insert (x, k, d) = …
    end

Dependency • Hierarchy and parameterization introduce dependent signatures. • Signature of a structure may refer to types in a substructure. • Result signature of a functor may refer to types in the argument. • Key: Signatures depend on structures.

Dependency
    signature THREAD = sig
      type thread
      structure TQ : sig
        type queue
        val deq : queue -> thread
      end
      val block : thread * TQ.queue -> unit
    end

Dependency • Result signature of a functor typically depends on the argument. • E.g., the key type of a dictionary is the carrier of the ordered set. • Need dependent functor signatures to be fully expressive. • Characteristic of ML modules.

Dependency
    functor Dict (structure Key : ORDSET
                  structure Elt : SET) :>
      sig
        type dict
        val insert : Key.t * Elt.t * dict -> dict
        …
      end

Dependency • How to define dependent signatures in isolation? • Contains a “free reference” to the type component of some structure. • Two main solutions: • Parameterized signatures [Haskell]. • Sharing specifications/type definitions [ML]. • Also transparent ascription.

Dependency
    signature QUEUE = sig
      type elt
      type queue
      val insert : elt * queue -> unit
    end
    signature THREAD = sig
      type thread
      structure TQ : QUEUE where type elt = thread
      …
    end

Dependency
    signature DICT = sig
      type elt
      type key
      type dict
      val insert : key * elt * dict -> dict
      …
    end

Dependency
    functor Dict (structure Key : ORDSET
                  structure Elt : SET)
      :> DICT where type elt = Elt.t
                and type key = Key.t =
    struct
      …
    end
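To make the dependent result signature concrete, here is one possible runnable completion of the Dict functor. Several things are assumptions made for illustration and are not in the slides: the definition of SET (taken to specify just a type t), the extra components empty and lookup, the sorted association-list representation, and the argument structures IntKey and StringElt.

```sml
signature ORDSET = sig
  type t
  val leq : t * t -> bool
end

(* Assumed: the slides do not define SET. *)
signature SET = sig type t end

signature DICT = sig
  type elt
  type key
  type dict
  val empty : dict                          (* assumed extra component *)
  val insert : key * elt * dict -> dict
  val lookup : key * dict -> elt option     (* assumed extra component *)
end

functor Dict (structure Key : ORDSET
              structure Elt : SET)
  :> DICT where type elt = Elt.t
            and type key = Key.t =
struct
  type elt = Elt.t
  type key = Key.t
  (* Association list kept sorted by key; hidden by the opaque ascription. *)
  type dict = (key * elt) list
  val empty = []
  fun insert (k, e, []) = [(k, e)]
    | insert (k, e, (k', e') :: rest) =
        if Key.leq (k, k') then
          if Key.leq (k', k) then (k, e) :: rest   (* equal keys: replace *)
          else (k, e) :: (k', e') :: rest          (* k is strictly smaller *)
        else (k', e') :: insert (k, e, rest)
  fun lookup (_, []) = NONE
    | lookup (k, (k', e') :: rest) =
        if Key.leq (k, k') andalso Key.leq (k', k) then SOME e'
        else lookup (k, rest)
end

structure IntKey = struct
  type t = int
  fun leq (a : int, b) = a <= b
end
structure StringElt = struct type t = string end

structure D = Dict (structure Key = IntKey
                    structure Elt = StringElt)

(* key = int and elt = string are known to the client; D.dict is not. *)
val d = D.insert (2, "two", D.insert (1, "one", D.empty))
val r = D.lookup (2, d)
```

Without the where type clauses the application would still type-check, but D.key and D.elt would be abstract and the client could not pass 2 or "two" to D.insert.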

Dependency • Using where type is similar to instantiation of a parameterized signature. • A parameterized signature would take types as arguments. • But we avoid having to make a decision in advance about which types are parameters and which are results! • Can change from one situation to the next. • Awkward to work with parameterization and instantiation.
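The flexibility claimed above can be seen by specializing a single signature in different ways. This is a sketch under assumptions: the components empty and size, the derived signature names, and the structure SI are hypothetical additions to the DICT signature from the slides.

```sml
signature DICT = sig
  type elt
  type key
  type dict
  val empty : dict                        (* assumed extra component *)
  val insert : key * elt * dict -> dict
  val size : dict -> int                  (* assumed extra component *)
end

(* Fix only the key type; elt stays abstract. *)
signature STRING_KEYED = DICT where type key = string

(* Fix only the element type; key stays abstract. *)
signature INT_VALUED = DICT where type elt = int

(* Fix both, by chaining two where type clauses. *)
signature STRING_INT = DICT where type key = string where type elt = int

structure SI :> STRING_INT = struct
  type key = string
  type elt = int
  type dict = (key * elt) list
  val empty = []
  fun insert (k, e, d) = (k, e) :: d
  val size = length
end

val n = SI.size (SI.insert ("a", 1, SI.empty))
```

Nothing in DICT itself designates key or elt as a parameter; each use site decides which types to reveal.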

Data Abstraction • Concrete = public representation with no restrictions on use. • Abstract = private representation usable only via specific public operations. • Want programmer control over whether a type is abstract or concrete. • All or none is too inflexible.

Data Abstraction • Abstract types are opaque. • No revelation of type identity. • Concrete types are transparent. • Type identity is revealed. • Translucent = opaque + transparent. • Under programmer control.

Data Abstraction • Type definitions in a signature reveal type identity. • Transparent ascription (:) augments target signature with definitions of all opaque types. • where type adds “new” definitions to an “old” signature. • A form of specification inheritance.
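A translucent signature mixes the two kinds of type specification. The following sketch is illustrative only; the COUNTER signature and its components are hypothetical names, not from the slides.

```sml
signature COUNTER = sig
  type count = int    (* transparent: identity revealed in the signature *)
  type t              (* opaque: identity withheld *)
  val new : t
  val tick : t -> t
  val read : t -> count
end

structure Counter :> COUNTER = struct
  type count = int
  type t = int
  val new = 0
  fun tick c = c + 1
  fun read c = c
end

(* Accepted: Counter.count is known to be int, so the result can be added to 10. *)
val n = Counter.read (Counter.tick (Counter.tick Counter.new)) + 10

(* Rejected if uncommented, since Counter.t is abstract:
   val bad = Counter.tick 5
*)
```

Even under opaque ascription the definition of count survives, because it is written in the signature itself rather than propagated from the implementation.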

Data Abstraction
    signature BIGNUM = sig
      type T
      val from_int : int -> T
      val * : T * T -> T
      …
    end
    structure BigNum :> BIGNUM = …

Data Abstraction
    signature GROUP = sig
      type T
      val e : T
      val inv : T -> T
      val * : T * T -> T
    end
    structure Z :> GROUP = struct
      type T = int
      val e = 0
      val inv = ~
      val op * = op +
    end

Data Abstraction
    signature GROUP_Z = GROUP where type T = int
    structure Z :> GROUP_Z = …
    structure Z : GROUP = …

Data Abstraction
    signature SS_DICT = sig
      type elt = string
      type key = string
      type dict
      …
    end
    structure SS_Dict :> SS_DICT = …
    structure SS_Dict : DICT = …

Two Forms of Sharing • Type sharing: equate abstract types without revealing their representation. • Two “views” of the same abstract type. • Symmetric specification of relationship. • Module sharing: equal modules have the same implementations. • Specify interpretation of a type without imposing abstraction.

Type Sharing
    signature IPS = sig
      structure S : FIELD
      structure V : VECTOR
      sharing type S.t = V.scalar
      val dot : V.t * V.t -> S.t
    end
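The sharing specification is what lets code combine field operations with vector components. The sketch below is one way to flesh out the IPS example; the contents of FIELD and VECTOR, the functor MkIPS, and the instances IntField and IntVec are all assumptions, since the slides show only IPS itself.

```sml
(* Assumed minimal FIELD and VECTOR signatures. *)
signature FIELD = sig
  type t
  val zero : t
  val add : t * t -> t
  val mul : t * t -> t
end

signature VECTOR = sig
  type t
  type scalar
  val components : t -> scalar list
end

signature IPS = sig
  structure S : FIELD
  structure V : VECTOR
  sharing type S.t = V.scalar
  val dot : V.t * V.t -> S.t
end

(* The same sharing constraint on the parameters lets the body pass
   V.scalar values to S.mul without knowing what either type is. *)
functor MkIPS (structure S : FIELD
               structure V : VECTOR
               sharing type S.t = V.scalar) : IPS =
struct
  structure S = S
  structure V = V
  fun dot (u, v) =
    ListPair.foldl (fn (a, b, acc) => S.add (acc, S.mul (a, b)))
                   S.zero
                   (V.components u, V.components v)
end

structure IntField = struct
  type t = int
  val zero = 0
  fun add (a : int, b) = a + b
  fun mul (a : int, b) = a * b
end

structure IntVec = struct
  type scalar = int
  type t = int list
  fun components (v : t) = v
end

structure P = MkIPS (structure S = IntField
                     structure V = IntVec)

val d = P.dot ([1, 2, 3], [4, 5, 6])
```

The constraint is symmetric: it equates S.t and V.scalar without saying what either one is, so the functor body treats both as the same abstract type.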

Structure Sharing • Originally Standard ML supported structure sharing. • Each structure had a unique identity. • Could insist that two structures have the same identity. • But this was dropped, to simplify the language. • O’Caml never had it.

Separate Compilation • Implementation-on-interface dependency. • Client sees only the interface of the provider. • Recompile client only if interface changes. • Implementation-on-implementation dependency. • Client sees the code of the provider, so cannot be separately compiled. • Typified by inheritance: the fragile base-class (fbc) problem.

Separate Compilation • Desirable: arbitrary choice of separated or integrated compilation of modules. • “Cleave” the program at any module boundary. • Requirement: fully syntactic interfaces. • Must capture full static significance of a module in an interface. • The principal signature of a module must be syntactically definable in the language.

Incremental Recompilation • On-implementation dependencies and non-syntactic interfaces prohibit separate compilation. • Cannot cut dependency by an interface. • Alternative: incremental recompilation. • Believe the interface, if present. • Elaborate the code if no interface present. • Do the “least” amount of work.

Separate Compilation and Incremental Recompilation • SML/NJ CM supports IR. • CM files describe makeup of system. • Automatic dependency analysis. • Not really possible, so restrictions are made. • TILT and O’Caml support SC and IR. • Cannot always cleave program, but respects a given cleavage.

First- vs Second-Class Modules • First-class is not always better than second class! • Must “assume the worst”, which reduces flexibility and expressiveness. • But first-class modules can be stored away, passed around at will. • No restriction on their role as values. • Can conditionally compute modules based on run-time values.

First- vs Second-Class Modules • Second-class is not always better than first class! • Cannot store modules or create them based on run-time conditions. • But unknown modules are relatively benign. • Stronger type sharing relationships.

The Phase Distinction • Two phases: • Compile time (static) • Run time (dynamic). • Static type checking = type checking without testing code equality. • Essential for decidability. • Otherwise the language “does not respect the phase distinction”.
