XMλ

hina


  1. XMλ

  2. Contents
     • What is the problem?
     • Hosoya’s approach
     • Shields’ approach
     • XMLambda and the UH
     • Conclusion

  3. What is the problem?
     XML is a standard language for first-order, tree-like datatypes.
     XML works well for describing static documents, but documents are typically dynamic, generated by a server.
     Implementing a server for dynamic documents in conventional languages is hard:
     • no direct support for XML or scripting-language syntax
     • no compile-time checks to ensure valid documents
     Can custom languages developed for XML be embedded as combinator libraries within a Haskell-like language?

  4. XML
     element Msg  = (((To|Bcc)* & From), Body)
     element To   = String
     element Bcc  = String
     element From = String
     element Body = P*
     element P    = String

     <Msg>
       <To>jrommes@cs.uu.nl</To>
       <Bcc>doaitse@cs.uu.nl</Bcc>
       <From>joep@geevers.com</From>
       <Body>
         <P>Our presentation is finished!</P>
       </Body>
     </Msg>

  5. XML
     element Msg  = (((To|Bcc)* & From), Body)
     element To   = String
     element Bcc  = String
     element From = String
     element Body = P*
     element P    = String

     | : union
     * : sequence
     & : unordered tuple
     , : ordered tuple

  6. What we are looking for: XML → functional program
     document-type definition → type definitions
     regular expression → type
     element → term
     document validation → type checking

  7. Possible solutions
     • Using a universal datatype
       data Element = Atom String | Node String (List Element)

  8. data Element = Atom String | Node String (List Element)

     Node "Msg" [
       Node "To"   [Atom "jrommes@cs.uu.nl"],
       Node "Bcc"  [Atom "doaitse@cs.uu.nl"],
       Node "From" [Atom "joep@geevers.com"],
       Node "Body" [
         Node "P" [Atom "Our..."]
       ]
     ]

     No validation possible
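As a minimal sketch of why the universal datatype allows no static validation, the Haskell below transcribes the slide's `Element` type, using the built-in list type in place of `List`. The names `invalidMsg` and `looksLikeMsg` are hypothetical and not part of the slides; the run-time check only inspects the top-level tags of a `Msg`.

```haskell
-- Universal representation from the slide: every document is an Element.
data Element
  = Atom String            -- character data
  | Node String [Element]  -- tagged node with its children
  deriving (Eq, Show)

-- The message from the slide.
validMsg :: Element
validMsg =
  Node "Msg"
    [ Node "To"   [Atom "jrommes@cs.uu.nl"]
    , Node "Bcc"  [Atom "doaitse@cs.uu.nl"]
    , Node "From" [Atom "joep@geevers.com"]
    , Node "Body" [Node "P" [Atom "Our..."]]
    ]

-- Hypothetical invalid document: no From, no top-level Body, and a Body
-- nested inside To. It has the same type as validMsg, so it type checks.
invalidMsg :: Element
invalidMsg = Node "Msg" [Node "To" [Node "Body" []]]

-- Hypothetical run-time check of the top-level shape
-- ((To|Bcc)* & From), Body. Children are not inspected.
looksLikeMsg :: Element -> Bool
looksLikeMsg (Node "Msg" kids) =
  case map tag kids of
    [] -> False
    ts -> last ts == "Body" && headerOk (init ts)
  where
    tag (Node t _) = t
    tag (Atom _)   = "#text"
    recipient t = t == "To" || t == "Bcc"
    -- From may come before or after the block of recipients.
    headerOk ("From" : rs) = all recipient rs
    headerOk ts = not (null ts) && last ts == "From" && all recipient (init ts)
looksLikeMsg _ = False
```

Both values have type `Element`, so only a run-time check such as `looksLikeMsg` can tell them apart.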

  9. Possible solutions
     • Using a universal datatype
     • Using newtype declarations
       newtype Msg  = Msg (List (Either To Bcc), From, Body)
       newtype From = From String
       newtype To   = To String
       newtype Bcc  = Bcc String
       newtype Body = Body (List P)
       newtype P    = P String

  10. newtype Msg  = Msg (List (Either To Bcc), From, Body)
      newtype From = From String
      newtype To   = To String
      newtype Bcc  = Bcc String
      newtype Body = Body (List P)
      newtype P    = P String

      Msg ( [ Left (To "jrommes@cs.uu.nl"),
              Right (Bcc "doaitse@cs.uu.nl") ],
            From "joep@geevers.com",
            Body [ P "Our..." ] )

      Sound, but not complete.
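The newtype encoding can be written as ordinary Haskell, sketched below with built-in lists and tuples. The helper `recipients` is a hypothetical addition for illustration. One possible reading of "sound, but not complete" is that every well-typed `Msg` is a valid document, while a valid document that places `From` before the recipients has no representation, because the tuple fixes the order.

```haskell
newtype To   = To String   deriving (Eq, Show)
newtype Bcc  = Bcc String  deriving (Eq, Show)
newtype From = From String deriving (Eq, Show)
newtype P    = P String    deriving (Eq, Show)
newtype Body = Body [P]    deriving (Eq, Show)

-- The unordered tuple (&) of the DTD is approximated by an ordered tuple.
newtype Msg = Msg ([Either To Bcc], From, Body) deriving (Eq, Show)

-- The message from the slide.
msg :: Msg
msg =
  Msg
    ( [Left (To "jrommes@cs.uu.nl"), Right (Bcc "doaitse@cs.uu.nl")]
    , From "joep@geevers.com"
    , Body [P "Our..."]
    )

-- Hypothetical helper: all recipient addresses, To and Bcc alike.
recipients :: Msg -> [String]
recipients (Msg (rs, _, _)) = map address rs
  where
    address (Left (To a))   = a
    address (Right (Bcc a)) = a
```

Unlike the universal datatype, a message without a `From` component is rejected by the type checker here.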

  11. Possible solutions
      • Using a universal datatype
      • Using newtype declarations
      • Using regular expression types as primitive (Hosoya)

  12. Possible solutions
      • Using a universal datatype
      • Using newtype declarations
      • Using regular expression types as primitive (Hosoya)
      • Using type-indexed rows (Shields)

  13. Hosoya’s approach

  14. Why Regular Expression Types?
      • Static typechecking: generated XML documents conform to the DTD
      • Or: invalid documents can never arise
      • For example: a <table> must have at least one <tr>

  15. Why Regular Expression Patterns?
      • Convenient programming constructs for manipulating documents
      • For instance, jump over arbitrary-length data and extract specific data:
        type Person = person[Name,Email*,Tel?]
        match p with
          person[Name,Email+,Tel] -> …
          …

  16. XDuce: Values
      • Primitives represent XML documents (trees)
      • For example:
        person[name["Joep"],
               email["Joep@geevers.com"]]
      • That is, a value is a sequence of nodes

  17. XDuce: Regular Expression Types
      • Types correspond to document schemas
      • Familiar XML regular expressions:
        type Tel   = tel[String]
        type Tels  = Tel*
        type Recip = Bcc|Cc
        (Name, Tel*), Addr
        T? = T|()
        T+ = T,T*

  18. Subtyping
      • Many algebraic laws:
        • Associativity of concatenation and union: A|(B|C) = (A|B)|C
        • Commutativity of union: A|B = B|A
      • These laws are crucial for XML processing, but lead to a complicated specification

  19. Subtyping
      • Subtyping as set inclusion
      • First define which values belong to a type
      • One type is a subtype of another if the former denotes a subset of the latter
      • For example: (Name*, Tel*) <: (Name|Tel)*
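Subtyping as set inclusion can be illustrated with a small Haskell sketch. It makes several simplifying assumptions that are not in the slides: types range over a flat alphabet of element names (element contents are ignored), membership is decided with Brzozowski derivatives, and inclusion is only tested on all sequences up to a given length. It is therefore a bounded check, not XDuce's actual subtyping algorithm. All names are hypothetical.

```haskell
import Control.Monad (replicateM)

-- Regular expression types over a flat alphabet of element names.
data Ty
  = Empty      -- denotes no values at all
  | Eps        -- the empty sequence ()
  | El String  -- a single element, identified by its tag only
  | Seq Ty Ty  -- concatenation  T , U
  | Alt Ty Ty  -- union          T | U
  | Star Ty    -- repetition     T*
  deriving (Eq, Show)

-- Does the type contain the empty sequence?
nullable :: Ty -> Bool
nullable Empty     = False
nullable Eps       = True
nullable (El _)    = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

-- Brzozowski derivative: the type of what may follow a leading element x.
deriv :: String -> Ty -> Ty
deriv _ Empty  = Empty
deriv _ Eps    = Empty
deriv x (El y) = if x == y then Eps else Empty
deriv x (Seq a b)
  | nullable a = Alt (Seq (deriv x a) b) (deriv x b)
  | otherwise  = Seq (deriv x a) b
deriv x (Alt a b) = Alt (deriv x a) (deriv x b)
deriv x (Star a)  = Seq (deriv x a) (Star a)

-- A sequence of tags belongs to a type if the iterated derivative is nullable.
member :: [String] -> Ty -> Bool
member xs t = nullable (foldl (flip deriv) t xs)

-- Bounded inclusion test: every value of s up to length n is a value of t.
subtypeUpTo :: Int -> [String] -> Ty -> Ty -> Bool
subtypeUpTo n alphabet s t =
  all (\v -> not (member v s) || member v t) candidates
  where
    candidates = concatMap (\k -> replicateM k alphabet) [0 .. n]

-- The two types from the slide: (Name*, Tel*) and (Name|Tel)*.
namesThenTels, anyMix :: Ty
namesThenTels = Seq (Star (El "name")) (Star (El "tel"))
anyMix        = Star (Alt (El "name") (El "tel"))
```

With these definitions, `subtypeUpTo 4 ["name", "tel"] namesThenTels anyMix` holds, while the converse fails because a `tel` followed by a `name` belongs only to `anyMix`.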

  20. Pattern Matching: Exhaustiveness
      type Person = person[Name,Email*,Tel?]
      match p with
        person[Name,Email+,Tel?] -> …
        person[Name,Email*,Tel] -> …
      • Not exhaustive
      • Use subtyping to check: the input type must be a subtype of the union of the pattern types

  21. Pattern Matching: Irredundancy
      match p with
        person[Name,Email*,Tel?] -> …
        person[Name,Email+,Tel] -> …
      • Second clause redundant
      • A clause is redundant iff all the input values that can be matched by the pattern can also be matched by preceding patterns
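The exhaustiveness and irredundancy conditions can be replayed on the two `Person` examples with a deliberately small Haskell model. It assumes that a `person[Name,Email*,Tel?]` value is fully described by its number of emails and the presence of a `Tel`, and it enumerates a few such shapes instead of using subtyping. The types `Shape`, `Rep`, `Opt`, and `Pat` are hypothetical and not XDuce constructs.

```haskell
-- Shape of a person[Name,Email*,Tel?] value:
-- (number of Email elements, whether a Tel is present).
type Shape = (Int, Bool)

data Rep = ZeroOrMore | OneOrMore deriving (Eq, Show)  -- Email* / Email+
data Opt = Optional | Required    deriving (Eq, Show)  -- Tel?   / Tel

-- Pat rep opt stands for the pattern person[Name, Email<rep>, Tel<opt>].
data Pat = Pat Rep Opt deriving (Eq, Show)

matches :: Pat -> Shape -> Bool
matches (Pat rep opt) (emails, hasTel) = repOk && optOk
  where
    repOk = rep == ZeroOrMore || emails >= 1
    optOk = opt == Optional || hasTel

-- Counts 0..2 suffice: the patterns only distinguish "none" from "some".
shapes :: [Shape]
shapes = [(n, t) | n <- [0 .. 2], t <- [False, True]]

-- Every input shape is matched by at least one clause.
exhaustive :: [Pat] -> Bool
exhaustive ps = all (\s -> any (`matches` s) ps) shapes

-- Indices of clauses whose matches are all covered by preceding clauses.
redundant :: [Pat] -> [Int]
redundant ps =
  [ i
  | (i, p) <- zip [0 ..] ps
  , all (\s -> not (matches p s) || any (`matches` s) (take i ps)) shapes
  ]

-- Clauses of the exhaustiveness and irredundancy examples.
exhaustivenessExample, irredundancyExample :: [Pat]
exhaustivenessExample = [Pat OneOrMore Optional, Pat ZeroOrMore Required]
irredundancyExample   = [Pat ZeroOrMore Optional, Pat OneOrMore Required]
```

In this model the first example is not exhaustive because the shape `(0, False)`, a person with no email and no telephone, is matched by neither clause; in the second example the clause at index 1 is reported as redundant.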

  22. Pattern Matching: Type Inference
      type Name = name[String]
      match (ps as Person*) with
        person[name[val n as String],Email*,Tel?],rest -> …
      • Avoid excessive type annotations
      • Use input type and pattern to infer types of
        • bare variables (rest)
        • bound variables (n)

  23. Functions
      • First-order functions (explicitly typed): fun f(P):T = e
      • For example:
        fun tels(val ps as Person*):Tel* =
          match ps with
            person[Name,Email*,tel[val t]],rest -> tel[t],tels(rest)
            person[Name,Email*],rest -> tels(rest)
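For comparison, here is a rough Haskell analogue of `tels`, assuming a hypothetical record encoding of `Person` in which the optional `Tel` becomes a `Maybe` field. It is a sketch of the same traversal, not a translation of XDuce semantics; the clause for the empty list is added because Haskell's list pattern matching needs it to be total.

```haskell
-- Hypothetical record encoding of person[Name,Email*,Tel?].
data Person = Person
  { name   :: String
  , emails :: [String]
  , tel    :: Maybe String
  } deriving (Eq, Show)

newtype Tel = Tel String deriving (Eq, Show)

-- Collect the telephone numbers of all persons that have one.
tels :: [Person] -> [Tel]
tels (Person _ _ (Just t) : rest) = Tel t : tels rest
tels (Person _ _ Nothing  : rest) = tels rest
tels []                           = []
```

For example, applying `tels` to a list where only the first person has a telephone number returns a one-element list containing that number.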

  24. Higher-order Functions
      • Functions as first-class citizens
      • Why desirable?
        • Abstraction
      • Not supported by XDuce
      • What is needed?
        • Subtyping for arrow types
      • So why not support higher-order functions?

  25. Higher-order Functions
      • Function definitions given by a fixed set G
      • G is used in T-APP (instead of the standard rule)
      • Consequence: T-ABS fails
      • Fix: redefine T-APP
      • Type annotations needed to check the pattern match

  26. Parametric Polymorphism
      • Generic typing using type variables instead of actual types
      • Why desirable?
        • Abstraction from the structure of the problem
      • What is needed?
        • Type abstraction
        • Type application
      • So why no parametric polymorphism?

  27. Parametric Polymorphism
      • Problems: forall X . (U|X) -> (T|X)
      • Pattern matching problems:
        • Exhaustiveness / irredundancy checks
        • Type inference
      • Typing constraints cannot be represented:
        forall X {U,T}.(U|X) -> (T|X)

  28. Conclusions
      • Typed language with XML documents as primitive values
      • Regular expression types are fundamental
      • Regular expression pattern matching
      • No higher-order functions
      • No parametric polymorphism

  29. Shields’ approach
      “It is required that content models in element type declarations be deterministic”
      Consequence 1: regular expressions must be 1-unambiguous.
      Unions and unordered tuples are formed from distinct members.
      ((To, Bcc) & (Bcc, To)) is 1-unambiguous
      ((Bcc, To) & Bcc) is not
      ((To | Bcc) & Bcc) is not

  30. Shields’ approach
      “It is required that content models in element type declarations be deterministic”
      Consequence 2: it is possible to transform any XML element into a term:
      * sequence → list
      , tuple → tuple
      | union → type-indexed sum
      & unordered tuple → type-indexed product
      | and & are both formed from type-indexed rows

  31. Type-Indexed Rows
      A type-indexed row is a list of types.
      Type constructors:
      • Empty: Row
      • (_#_): Type → Row → Row
      For example: (Int # Bool # Empty)

  32. Type-indexed product (TIP):
      • (All _): Row → Type
      Type-indexed coproduct (TIC):
      • (One _): Row → Type

  33. Insertion Constraints
      Insertion constraints are used to guarantee distinctness of elements:
      a ins (Int # Bool # Empty)
        constrains a to be any type other than Int or Bool
      (List b) ins (Int # Bool # Empty)
        is true

  34. Type-indexed product (TIP):
      • Triv: All Empty
      • (_ && _): extension
        forall (a: Type) (b: Row) . a ins b => a → All b → All (a#b)
      Type-indexed coproduct (TIC):
      • (Inj _): injection
        forall (a: Type) (b: Row) . a ins b => a → One (a#b)
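The extension rule with its insertion constraint can be approximated in GHC Haskell. The sketch below assumes a type-level list as the row, which is ordered, unlike the rows on the slides, and encodes `a ins b` as a hypothetical closed type family `Ins` that demands `a` differs from every member of the row. The constructor `:&&` and the helpers `firstField` and `size` are illustrative names, and the coproduct `One` is omitted.

```haskell
{-# LANGUAGE ConstraintKinds      #-}
{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE GADTs                #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE TypeOperators        #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Kind (Constraint, Type)
import Data.Type.Equality (type (==))

-- Ins a r holds when the type a does not occur in the row r.
type family Ins (a :: Type) (r :: [Type]) :: Constraint where
  Ins a '[]      = ()
  Ins a (b ': r) = ((a == b) ~ 'False, Ins a r)

-- Type-indexed product: at most one field per type.
data All (r :: [Type]) where
  Triv  :: All '[]
  (:&&) :: Ins a r => a -> All r -> All (a ': r)

infixr 5 :&&

-- A product with one Bool field and one Int field.
example :: All '[Bool, Int]
example = True :&& (1 :: Int) :&& Triv

-- Rejected by the type checker, because Bool already occurs in the row:
-- bad = True :&& False :&& Triv

-- Project the first field of a non-empty product.
firstField :: All (a ': r) -> a
firstField (x :&& _) = x

-- Number of fields in a product.
size :: All r -> Int
size Triv          = 0
size (_ :&& rest)  = 1 + size rest
```

Because the row here is an ordered list, `All '[Bool, Int]` and `All '[Int, Bool]` are different types in this sketch, whereas the slides treat rows as unordered.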

  35. let tuple = \(x && y && Triv) . (x, y)
      in tuple (True && 1 && Triv)

      Type checking:
        Unify All(x#y#Empty) and All(Int#Bool#Empty)
        Under constraint: x ins (y#Empty)
      Overall term has type (Int, Bool) or (Bool, Int)!

  36. Equality constraints
      (c # d # Empty) eq (Int # Bool # Empty)
      The constraint propagates until enough information is available to simplify it.

  37. Simplifying constraints
      • Simple unification:
          (a → Int) eq (Bool → b)
          ⇒ a eq Bool, Int eq b
      • Row unification:
          (Int # a # Empty) eq (Bool # b # Empty)
          ⇒ (Int eq b), (a # Empty) eq (Bool # Empty)
      • Insertion:
          (a,b) ins (Bool # c # Empty)
          ⇒ (a,b) ins (c # Empty)
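The "simple unification" step can be sketched as standard first-order unification in Haskell. The tiny type language below (variables, constants, and arrows) and the names `Ty`-style constructors `TVar`, `TCon`, `:->`, `solve`, and `bind` are assumptions for illustration; row unification and insertion constraints are not covered.

```haskell
-- A tiny type language: variables, constants, and function arrows.
data Type
  = TVar String
  | TCon String
  | Type :-> Type
  deriving (Eq, Show)

infixr 5 :->

-- A substitution maps type variables to types.
type Subst = [(String, Type)]

-- Does variable v occur in the type?
occurs :: String -> Type -> Bool
occurs v (TVar w)  = v == w
occurs _ (TCon _)  = False
occurs v (a :-> b) = occurs v a || occurs v b

-- Replace every occurrence of variable v by t.
subst :: String -> Type -> Type -> Type
subst v t (TVar w)
  | v == w    = t
  | otherwise = TVar w
subst _ _ (TCon c)  = TCon c
subst v t (a :-> b) = subst v t a :-> subst v t b

-- Solve a list of "eq" constraints, or fail with Nothing.
solve :: [(Type, Type)] -> Maybe Subst
solve [] = Just []
solve ((l, r) : rest)
  | l == r = solve rest
solve ((TVar v, t) : rest) = bind v t rest
solve ((t, TVar v) : rest) = bind v t rest
solve ((a :-> b, c :-> d) : rest) = solve ((a, c) : (b, d) : rest)
solve _ = Nothing

-- Bind v to t (after an occurs check) and solve the remaining constraints.
bind :: String -> Type -> [(Type, Type)] -> Maybe Subst
bind v t rest
  | occurs v t = Nothing
  | otherwise  = do
      s <- solve [(subst v t l, subst v t r) | (l, r) <- rest]
      -- Resolve t against the bindings found for the remaining constraints.
      Just ((v, foldl (\ty (w, u) -> subst w u ty) t s) : s)
```

Solving the slide's constraint `(a → Int) eq (Bool → b)`, written `solve [(TVar "a" :-> TCon "Int", TCon "Bool" :-> TVar "b")]`, yields the bindings `a := Bool` and `b := Int`.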

  38. Introducing fresh type names
      • Monomorphic:
          newtype xCoord = Int
          All (xCoord # Int # Empty)
      • Polymorphic:
          newtype xCoord = \(a:Type).a
        This allows the same newtype to appear more than once within a record!
      • Introducing opaque newtypes: type arguments are ignored in insertion constraints:
          newtype opaque xCoord = \(a:Type).a

  39. XMLambda and UH: Conclusion
      • Why regular expression types (Hosoya)?
        • Fundamental regular expression types
        • Powerful pattern matching
        • No higher-order functions and polymorphism
        • Subtyping and parametric polymorphism?
      • Why type-indexed rows (Shields)?
        • Flexibility: more general than regular expression types
        • All nice characteristics of FP
        • Constraint system?
