### Refactoring Functional Programs

Simon Thompson

with

Huiqing Li

Claus Reinke

www.cs.kent.ac.uk/projects/refactor-fp

Overview

- Review mini-project.
- Implementation of HaRe.
- Larger-scale examples.
- Case study.

Mini-project feedback

- Refactorings performed.
- Refactorings and language features?
- Machine support feasible? Useful?
- ‘Not-quite’ refactorings? Support possible here?

Examples

- Argument permutations (NB partial application).
- (Un)group arguments.
- Slice function for a component of its result.
- Error handling / exception handling.

More examples

- Introduce type synonym, selectively.
- Introduce ‘branded’ type.
- Modify the return type of a function from T to Maybe T, Either T S, [T].
- Ditto for input types … and modify variable names correspondingly.

Proof of concept …

- To show proof of concept it is enough to:
- build a stand-alone tool,
- work with a subset of the language,
- pretty print the results of refactorings.

… or a useful tool?

- Integrate with existing program development tools: a stand-alone program that links to the editors emacs and vim; other IDEs are also possible.
- Work with the complete language: Haskell 98?
- Preserve the formatting and comments in the refactored source code.
- Allow users to extend and script the system.

The refactorings in HaRe

- Rename
- Delete
- Lift / Demote
- Introduce definition
- Remove definition
- Unfold
- Generalise
- Add / remove params

- Move def between modules
- Delete / add to exports
- Clean imports
- Make imports explicit
- Data type to ADT

All these refactorings are module aware.

The Implementation of HaRe

- Information gathering
- Pre-condition checking
- Program transformation
- Program rendering

Information needed

- Syntax: replace the function called sq, not the variable sq … parse tree.
- Static semantics: replace this function sq, not all the sq functions … scope information.
- Module information: what is the traffic between this module and its clients … call graph.
- Type information: replace this identifier when it is used at this type … type annotations.

Infrastructure: decisions

- Build a tool that can interoperate with emacs, vim, … yet act separately.
- Leverage existing libraries for processing Haskell 98, for tree transformation … as few modifications as possible.
- Be as portable as possible, in the Haskell space.
- Abstract interface to compiler internals?

Haskell landscape (end 2002)

- Parser: many
- Type checker: few
- Tree transformations: few
- Difficulties
- Haskell 98 vs. Haskell extensions.
- Libraries: proof of concept vs. distributable.
- Source code regeneration.
- Real project

Programatica

- Project at OGI to build a Haskell system …
- … with integral support for verification at various levels: assertion, testing, proof etc.
- The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis …
- … freely available under BSD licence.


First steps … lifting and friends

- Use the Haddock parser … full Haskell given in 500 lines of data type definitions.
- Work by hand over the Haskell syntax: 27 cases for expressions …
- Code for finding free variables, for instance …

Finding free variables ‘by hand’

- instance FreeVbls HsExp where
- freeVbls (HsVar v) = [v]
- freeVbls (HsApp f e)
- = freeVbls f ++ freeVbls e
- freeVbls (HsLambda ps e)
- = freeVbls e \\ concatMap paramNames ps
- freeVbls (HsCase exp cases)
- = freeVbls exp ++ concatMap freeVbls cases
- freeVbls (HsTuple _ es)
- = concatMap freeVbls es
- … etc.
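To make the boilerplate problem concrete, here is a minimal runnable sketch of the same 'by hand' style. The four-constructor `Expr` type is a hypothetical stand-in for the Haddock syntax tree, not the real `HsExp`; every constructor still needs its own traversal clause.

```haskell
import Data.List (nub)

-- Toy expression type (an assumption for illustration; the real
-- Haskell syntax has many more cases).
data Expr
  = Var String
  | App Expr Expr
  | Lam [String] Expr
  | Tuple [Expr]
  deriving Show

class FreeVbls t where
  freeVbls :: t -> [String]

instance FreeVbls Expr where
  -- The only two "interesting" cases: variables and binders.
  freeVbls (Var v)    = [v]
  freeVbls (Lam ps e) = filter (`notElem` ps) (freeVbls e)
  -- Everything else is plain traversal boilerplate.
  freeVbls (App f e)  = nub (freeVbls f ++ freeVbls e)
  freeVbls (Tuple es) = nub (concatMap freeVbls es)
```

With 27 expression forms rather than four, the traversal-only clauses dominate the two that carry meaning.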

This approach

- Boilerplate code … 1000 lines for 100 lines of significant code.
- Error prone: significant code lost in the noise.
- Want to generate the boiler plate and the tree traversals …
- … DrIFT: Winstanley, Wallace
- … Strafunski: Lämmel and Visser

Strafunski

- Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points.
- Top-down / bottom-up, type-preserving / type-unifying, full / stop / one.

Strafunski in use

- Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, …
- Strafunski allows us to work within Haskell …
- Other options? Generic Haskell, Template Haskell, AG, …
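The idea of a generic traversal with special behaviour at a few points can be sketched without Strafunski, using `gmapQ` and `cast` from `Data.Data` in base. This is a stand-in technique, not the Strafunski API, and the `Expr` type is a hypothetical toy.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, cast, gmapQ)
import Data.List (nub)

-- Toy syntax tree (an assumption for illustration).
data Expr
  = Var String
  | Lit Int
  | App Expr Expr
  | Lam String Expr
  | If Expr Expr Expr
  deriving (Data, Show)

freeVars :: Expr -> [String]
-- Ad hoc cases: variables and binders.
freeVars (Var v)   = [v]
freeVars (Lam x b) = filter (/= x) (freeVars b)
-- Generic case: collect from all immediate children, whatever the constructor.
freeVars e         = nub (concat (gmapQ sub e))
  where
    -- Children that are expressions are traversed; others contribute nothing.
    sub :: Data d => d -> [String]
    sub d = maybe [] freeVars (cast d)
```

Adding a new constructor without binders needs no new clause, which is the property the slides attribute to the generated traversals.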

Rename an identifier

- rename :: (Term t) => PName -> HsName -> t -> Maybe t
- rename oldName newName = applyTP worker
- where
- worker = full_tdTP (idTP `adhocTP` idSite)
- idSite :: PName -> Maybe PName
- idSite v@(PN name orig)
- | v == oldName
- = return (PN newName orig)
- idSite pn = return pn
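The shape of this rename can be imitated in plain GHC with `gmapT` from `Data.Data`. The sketch below is not Strafunski: `everywhereTD` and `adhoc` are hand-written helpers playing the roles of `full_tdTP` and `adhocTP`, and `PName` here is a hypothetical name type carrying a unique binding-site id.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, Typeable, cast, gmapT)
import Data.Maybe (fromMaybe)

-- A name plus a unique id for its binding site (assumed representation).
data PName = PN String Int deriving (Data, Eq, Show)

data Expr
  = Var PName
  | App Expr Expr
  | Lam PName Expr
  deriving (Data, Eq, Show)

-- Apply a generic transformation at every node, top-down.
everywhereTD :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhereTD f x = gmapT (everywhereTD f) (f x)

-- Lift a type-specific function to a generic one: identity elsewhere.
adhoc :: (Typeable a, Typeable b) => (b -> b) -> a -> a
adhoc f = fromMaybe id (cast f)

rename :: Data t => PName -> String -> t -> t
rename old new = everywhereTD (adhoc idSite)
  where
    -- Only occurrences that refer to the same binding site are changed.
    idSite pn@(PN _ orig)
      | pn == old = PN new orig
      | otherwise = pn
```

Because equality includes the binding-site id, a different `x` bound elsewhere is left alone, which is the scope information the earlier slide asks for.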

The coding effort

- Transformations: straightforward in Strafunski …
- … the chore is implementing conditions that the transformation preserves meaning.
- This is where much of our code lies.

Move f from module A to B

- Is f defined at the top-level of B?
- Are the free variables in f accessible within module B?
- Will the move require recursive modules?
- Remove the definition of f from module A.
- Add the definition to module B.
- Modify the import/export lists in module A, B and the client modules of A and B if necessary.
- Change uses of A.f to B.f or f in all affected modules.
- Resolve ambiguity.


Program rendering example

- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
- where sq :: Int->Int
- sq x = x ^ pow
- pow = 2 :: Int
- main = sumSquares 10 20
- Promote the definition of sq to top level

Program rendering example

- module Main where
- sumSquares x y
- = sq pow x + sq pow y where pow = 2 :: Int
- sq :: Int->Int->Int
- sq pow x = x ^ pow
- main = sumSquares 10 20
- Using a pretty printer: comments lost and layout quite different.

Program rendering example

- -- This is an example
- module Main where
- sumSquares x y = sq x + sq y
- where sq :: Int->Int
- sq x = x ^ pow
- pow = 2 :: Int
- main = sumSquares 10 20
- Promote the definition of sq to top level

Program rendering example

- -- This is an example
- module Main where
- sumSquares x y = sq pow x + sq pow y
- where pow = 2 :: Int
- sq :: Int->Int->Int
- sq pow x = x ^ pow
- main = sumSquares 10 20
- Layout and comments preserved.

Token stream and AST

- White space and comments in the token stream.
- Modification of the AST guides the modification of the token stream.
- After a refactoring, the program source is extracted from the token stream not the AST.
- Heuristics associate comments with program entities.
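Why rendering from the token stream preserves layout can be seen in a small sketch. The lexer below is deliberately naive and entirely hypothetical (it does not recognise comments or string literals as units, unlike a real Haskell lexer): white space and everything else that is not an identifier is kept verbatim, so rendering is just concatenation.

```haskell
import Data.Char (isAlphaNum)

-- Identifiers are edited; all other text is carried along untouched.
data Token = Ident String | Other String deriving (Eq, Show)

isIdentChar :: Char -> Bool
isIdentChar c = isAlphaNum c || c == '_' || c == '\''

-- Split the source into alternating identifier and non-identifier runs.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c:_)
  | isIdentChar c = let (w, rest) = span isIdentChar s in Ident w : tokenize rest
  | otherwise     = let (w, rest) = break isIdentChar s in Other w : tokenize rest

-- Rendering never consults a pretty printer: it replays the tokens.
render :: [Token] -> String
render = concatMap text
  where
    text (Ident s) = s
    text (Other s) = s

-- A change decided elsewhere (e.g. on the AST) is replayed on the tokens.
renameTok :: String -> String -> [Token] -> [Token]
renameTok old new = map step
  where
    step (Ident s) | s == old = Ident new
    step t = t
```

Here `render . tokenize` is the identity on source text, so any token not touched by the refactoring comes back exactly as it was written.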

Production tool

- Programatica parser and type checker
- Refactor using a Strafunski engine
- Render code from the token stream and syntax tree.

Production tool (optimised)

- Programatica parser and type checker
- Refactor using a Strafunski engine
- Render code from the token stream and syntax tree.
- Pass lexical information to update the syntax tree and so avoid reparsing

What have we learned?

- Emerging Haskell libraries make it practical(?)
- Efficiency and robustness
- type checking large systems,
- linking,
- editor script languages (vim, emacs).
- Limitations of editor interactions.
- Reflections on Haskell itself.

Reflections on Haskell

- Cannot hide items in an export list (cf import).
- Field names for prelude types?
- Scoped class instances not supported.
- ‘Ambiguity’ vs. name clash.
- ‘Tab’ is a nightmare!
- Correspondence principle fails …

Correspondence

- Operations on definitions and operations on expressions can be placed in one to one correspondence
- (R.D.Tennent, 1980)

Correspondence

- Definitions: where. Expressions: let.
- Definitions: f x y = e. Expressions: \x y -> e.
- Definitions: f x | g1 = e1 | g2 = e2. Expressions: f x = if g1 then e1 else if g2 … …

Function clauses

f x

| g1 = e1

f x

| g2 = e2

Can ‘fall through’ a function clause … no direct correspondence in the expression language.

f x = if g1 then e1 else if g2 …

No clauses for anonymous functions … no reason to omit them.

Work in progress

- ‘Fold’ against definitions … find duplicate code.
- All, some or one? Effect on the interface …
- f x = … e … e …
- Traditional program transformations
- Short-cut fusion
- Warm fusion

Where next?

- Opening up to users: API or little language?
- Link with other IDEs (and front ends?).
- Detecting ‘bad smells’.
- More useful refactorings supported by us.
- Working without source code.

Larger-scale examples

- More complex examples in the functional domain; often link with data types.
- Dawning realisation that some refactorings can be pretty powerful.
- Bidirectional … no right answer.

Algebraic or abstract type?

- data Tr a
- = Leaf a |
- Node a (Tr a) (Tr a)

- flatten :: Tr a -> [a]
- flatten (Leaf x) = [x]
- flatten (Node _ s t)
- = flatten s ++
- flatten t

- Tr
- Leaf
- Node

Algebraic or abstract type?

- Tr
- isLeaf
- isNode
- leaf
- left
- right
- mkLeaf
- mkNode

- data Tr a
- = Leaf a |
- Node a (Tr a) (Tr a)
- isLeaf = …
- isNode = …
- …

- flatten :: Tr a -> [a]
- flatten t
- | isLeaf t = [leaf t]
- | isNode t
- = flatten (left t)
- ++ flatten (right t)

Algebraic or abstract type?

Pattern matching syntax is more direct …

… but can achieve a considerable amount with field names.

Other reasons? Simplicity (due to other refactoring steps?).

Allows changes in the implementation type without affecting the client: e.g. might memoise.

Problematic with a primitive type as carrier.

Allows an invariant to be preserved.

Outside or inside?

- Tr
- isLeaf
- isNode
- leaf
- left
- right
- mkLeaf
- mkNode

- data Tr a
- = Leaf a |
- Node a (Tr a) (Tr a)
- isLeaf = …
- isNode = …
- …

- flatten :: Tr a -> [a]
- flatten t
- | isLeaf t = [leaf t]
- | isNode t
- = flatten (left t)
- ++ flatten (right t)

Outside or inside?

- Tr
- isLeaf
- isNode
- leaf
- left
- right
- mkLeaf
- mkNode
- flatten

- data Tr a
- = Leaf a |
- Node a (Tr a) (Tr a)
- isLeaf = …
- isNode = …
- flatten t = …

Outside or inside?

If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.

The more outside the better, therefore.

If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.

Layered types possible: put the utilities in a privileged zone.

Memoise flatten :: Tr a -> [a]

- data Tree a
- = Leaf { val::a } |
- Node { val::a,
- left,right::(Tree a) }
- leaf = Leaf
- node = Node
- flatten (Leaf x) = [x]
- flatten (Node x l r) =
- (x : (flatten l ++ flatten r))

- data Tree a
- = Leaf { val::a,
- flatten:: [a] } |
- Node { val::a,
- left,right::(Tree a),
- flatten::[a] }
- leaf x = Leaf x [x]
- node x l r
- = Node x l r (x : (flatten l ++
- flatten r))

Memoise flatten

- Invisible outside the implementation module, if tree type is already an ADT.
- Field names in Haskell make it particularly straightforward.
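A runnable version of the memoised representation is sketched below. In a real project the constructors would be hidden behind an export list such as `module Tree (Tree, leaf, node, val, flatten) where`; that header is an assumption and is shown only as a comment so the fragment stays self-contained.

```haskell
-- Assumed export list, hiding Leaf and Node from clients:
--   module Tree (Tree, leaf, node, val, flatten) where

data Tree a
  = Leaf { val :: a, flatten :: [a] }
  | Node { val :: a, left, right :: Tree a, flatten :: [a] }

-- Smart constructors compute the flattened list once, at construction time.
leaf :: a -> Tree a
leaf x = Leaf x [x]

node :: a -> Tree a -> Tree a -> Tree a
node x l r = Node x l r (x : (flatten l ++ flatten r))
```

Clients that already went through `leaf`, `node` and `flatten` compile unchanged, because `flatten` is now simply a field selector with the same type.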

Data type or existential type?

Algebraic data type:

data Shape

= Circle Float |

Rect Float Float

area :: Shape -> Float

area (Circle r) = pi*r^2

area (Rect h w) = h*w

perim :: Shape -> Float

perim (Circle r) = 2*pi*r

perim (Rect h w) = 2*(h+w)

Existential type:

data Shape

= forall a. Sh a => Shape a

class Sh a where

area :: a -> Float

perim :: a -> Float

data Circle = Circle Float

instance Sh Circle where

area (Circle r) = pi*r^2

perim (Circle r) = 2*pi*r

data Rect = Rect Float Float

instance Sh Rect where

area (Rect h w) = h*w

perim (Rect h w) = 2*(h+w)
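The existential version can be tried out as follows. This is a sketch assuming GHC's `ExistentialQuantification` extension (it is not Haskell 98); `shapeArea` and `totalArea` are hypothetical helpers added to show how clients use the wrapped values.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Sh a where
  area  :: a -> Float
  perim :: a -> Float

data Circle = Circle Float
data Rect   = Rect Float Float

instance Sh Circle where
  area  (Circle r) = pi * r * r
  perim (Circle r) = 2 * pi * r

instance Sh Rect where
  area  (Rect h w) = h * w
  perim (Rect h w) = 2 * (h + w)

-- Any type with an Sh instance can be wrapped as a Shape.
data Shape = forall a. Sh a => Shape a

-- Unwrapping gives access only to the class operations.
shapeArea :: Shape -> Float
shapeArea (Shape s) = area s

totalArea :: [Shape] -> Float
totalArea = sum . map shapeArea
```

The trade-off runs both ways: new shapes can be added without touching existing code, but adding a new operation now means touching the class and every instance.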

Constructor or constructor?

Derived function:

data Expr

= Epsilon | .... |

Then Expr Expr |

Star Expr

plus e = Then e (Star e)

Extra constructor:

data Expr

= Epsilon | .... |

Then Expr Expr |

Star Expr |

Plus Expr

Monadification: expressions

data Expr

= Lit Integer | -- Literal integer value

Vbl Var | -- Assignable variables

Add Expr Expr | -- Expression addition: e1+e2

Assign Var Expr -- Assignment: x:=e

type Var = String

type Store = [ (Var, Integer) ]

lookup :: Store -> Var -> Integer

lookup st x = head [ i | (y,i) <- st, y==x ]

update :: Store -> Var -> Integer -> Store

update st x n = (x,n):st

Monadification: evaluation

Before:

eval :: Expr -> Store -> (Integer, Store)

eval (Lit n) st

= (n,st)

eval (Vbl x) st

= (lookup st x,st)

After:

evalST :: Expr -> State Store Integer

evalST (Lit n)

= do

return n

evalST (Vbl x)

= do

st <- get

return (lookup st x)

Monadification: evaluation 2

Before:

eval (Add e1 e2) st

= (v1+v2, st2)

where

(v1,st1) = eval e1 st

(v2,st2) = eval e2 st1

eval (Assign x e) st

= (v, update st' x v)

where

(v,st') = eval e st

After:

evalST (Add e1 e2)

= do

v1 <- evalST e1

v2 <- evalST e2

return (v1+v2)

evalST (Assign x e)

= do

v <- evalST e

st <- get

put (update st x v)

return v
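The two evaluators can be checked against each other with a self-contained sketch. Two assumptions differ from the slides: the `State` monad is hand-rolled here so that nothing beyond the Prelude and `Data.Maybe` is needed, and the store lookup is called `fetch` and defaults unbound variables to 0, which avoids both the clash with the Prelude's `lookup` and the partial `head`.

```haskell
import Data.Maybe (fromMaybe)

data Expr = Lit Integer | Vbl Var | Add Expr Expr | Assign Var Expr
type Var   = String
type Store = [(Var, Integer)]

-- Assumption: unbound variables read as 0.
fetch :: Store -> Var -> Integer
fetch st x = fromMaybe 0 (lookup x st)

update :: Store -> Var -> Integer -> Store
update st x n = (x, n) : st

-- Direct style: the store is threaded by hand.
eval :: Expr -> Store -> (Integer, Store)
eval (Lit n) st = (n, st)
eval (Vbl x) st = (fetch st x, st)
eval (Add e1 e2) st = (v1 + v2, st2)
  where (v1, st1) = eval e1 st
        (v2, st2) = eval e2 st1
eval (Assign x e) st = (v, update st' x v)
  where (v, st') = eval e st

-- Minimal state monad, standing in for a library State.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Monadic style: the threading is hidden inside (>>=).
evalST :: Expr -> State Store Integer
evalST (Lit n) = return n
evalST (Vbl x) = do
  st <- get
  return (fetch st x)
evalST (Add e1 e2) = do
  v1 <- evalST e1
  v2 <- evalST e2
  return (v1 + v2)
evalST (Assign x e) = do
  v <- evalST e
  st <- get
  put (update st x v)
  return v
```

For any expression `e` and store `st`, `runState (evalST e) st` should equal `eval e st`, which is the sense in which monadification is a refactoring rather than a change of behaviour.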

Classes and instances

- type Store = [(Var, Int)]
- empty :: Store
- empty = []
- get :: Var -> Store -> Int
- get v st = head [ i | (var,i) <- st, var==v]
- set :: Var -> Int -> Store -> Store
- set v i = ((v,i):)

Classes and instances

- type Store = [(Var, Int)]
- empty :: Store
- get :: Var -> Store -> Int
- set :: Var -> Int -> Store -> Store
- empty = []
- get v st = head [ i | (var,i) <- st, var==v]
- set v i = ((v,i):)

Classes and instances

- class Store a where
- empty :: a
- get :: Var -> a -> Int
- set :: Var -> Int -> a -> a
- instance Store [(Var, Int)] where
- empty = []
- get v st = head [ i | (var,i) <- st, var==v]
- set v i = ((v,i):)
- Need newtype wrapper in Haskell 98 …
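The newtype wrapper mentioned in the last bullet could look like this. `ListStore` is a hypothetical name; the wrapper is needed because a Haskell 98 instance head must be a type constructor applied to distinct type variables, which a list of pairs is not.

```haskell
type Var = String

class Store a where
  empty :: a
  get   :: Var -> a -> Int
  set   :: Var -> Int -> a -> a

-- Wrapping the association list gives a legal Haskell 98 instance head.
newtype ListStore = ListStore [(Var, Int)]

instance Store ListStore where
  empty = ListStore []
  -- As on the slide, reading an unset variable is a runtime error.
  get v (ListStore st)   = head [ i | (var, i) <- st, var == v ]
  set v i (ListStore st) = ListStore ((v, i) : st)
```

A second representation, for example a finite map, can then be added as another instance without changing client code written against the class.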

end

Understanding a program

- Take a working semantic tableau system written by an anonymous 2nd year student …
- … refactor to understand its behaviour.
- Nine stages of unequal size.
- Reflections afterwards.

Built-in types

[Prop]

[[Prop]]

used for branches and tableaux respectively.

Modify by adding

type Branch = [Prop]

type Tableau = [Branch]

Change required throughout the program.

Simple edit: but be aware of the order of substitutions: avoid

type Branch = Branch

v1: Name types

Existing names

tableaux

removeBranch

remove

become

tableauMain

removeDuplicateBranches

removeBranchDuplicates

and add comments clarifying the (intended) behaviour.

Add test datum.

Discovered some edits undone in stage 1.

Use of the type checker to catch errors.

test will be useful later?

v2: Rename functions

Change from literate form:

Comment …

> tableauMain tab

> = ...

to

-- Comment …

tableauMain tab

= ...

Editing easier: implicit assumption was that it was a normal script.

Could make the switch completely automatic?

v3: Literate → normal script

From explicit recursion:

displayBranch

:: [Prop] -> String

displayBranch [] = []

displayBranch (x:xs)

= (show x) ++ "\n" ++

displayBranch xs

to

displayBranch

:: Branch -> String

displayBranch

= concat . map (++"\n") . map show

Abstraction: move from explicit list representation to operations such as map and concat which could be over any collection type.

First time round added incorrect (but type correct) redefinition … only spotted at next stage.

Version control: un/redo etc.
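Since a type-correct but wrong redefinition slipped through at this step, a small equivalence check is worth having. In this sketch `Prop` is a hypothetical stand-in (the student's type is not shown on the slides), and the original definition is kept under the name `displayBranchRec` for comparison.

```haskell
-- Stand-in proposition type (an assumption for illustration).
data Prop = Atom String | NOT Prop deriving (Eq, Show)

type Branch = [Prop]

-- Original: explicit recursion over the list.
displayBranchRec :: [Prop] -> String
displayBranchRec []     = []
displayBranchRec (x:xs) = show x ++ "\n" ++ displayBranchRec xs

-- Refactored: a pipeline of collection operations.
displayBranch :: Branch -> String
displayBranch = concat . map (++ "\n") . map show
```

Comparing the two on a handful of branches, including the empty one, would have caught the faulty redefinition before the next stage.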

v4: Modify function definitions

removeBranchDup :: Branch -> Branch

removeBranchDup [] = []

removeBranchDup (x:xs)

| x == findProp x xs = [] ++ removeBranchDup xs

| otherwise = [x] ++ removeBranchDup xs

findProp :: Prop -> Branch -> Prop

findProp z [] = FALSE

findProp z (x:xs)

| z == x = x

| otherwise = findProp z xs

v5: Algorithms and types (1)

removeBranchDup :: Branch -> Branch

removeBranchDup [] = []

removeBranchDup (x:xs)

| findProp x xs = [] ++ removeBranchDup xs

| otherwise = [x] ++ removeBranchDup xs

findProp :: Prop -> Branch -> Bool

findProp z [] = False

findProp z (x:xs)

| z == x = True

| otherwise = findProp z xs

v5: Algorithms and types (2)

removeBranchDup :: Branch -> Branch

removeBranchDup = nub

findProp :: Prop -> Branch -> Bool

findProp = elem

v5: Algorithms and types (3)

removeBranchDup :: Branch -> Branch

removeBranchDup = nub

Fails the test! Two duplicate branches output, with different ordering of elements.

The algorithm used is the 'other' nub algorithm, nubVar:

nub [1,2,0,2,1] = [1,2,0]

nubVar [1,2,0,2,1] = [0,2,1]

Code using lists in a particular order to represent sets.
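The difference between the two algorithms is which occurrence of a duplicate survives. The sketch below generalises the student's recursion to any `Eq` type and gives one possible definition of `nubVar`; the slides elide the actual definition, so `reverse . nub . reverse` is an assumption chosen to reproduce the behaviour shown.

```haskell
import Data.List (nub)

-- The student's algorithm: drop an element if it occurs again later,
-- so the LAST occurrence of each value is the one that is kept.
removeDup :: Eq a => [a] -> [a]
removeDup [] = []
removeDup (x:xs)
  | x `elem` xs = removeDup xs
  | otherwise   = x : removeDup xs

-- Hypothetical nubVar: Data.List.nub keeps FIRST occurrences, so
-- reversing before and after makes it keep last occurrences instead.
nubVar :: Eq a => [a] -> [a]
nubVar = reverse . nub . reverse
```

Both functions return the same set of elements; only code that depends on list order, as this program does, can tell them apart.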

v5: Algorithms and types (4)

Add the definition:

nubVar = …

to the module

ListAux.hs

and replace the definition by

import ListAux

Editing easier: implicit assumption was that it was a normal script.

Could make the switch completely automatic?

v6: Library function to module

Renamings: including foo and bar and contra (becomes notContra).

An instance of filter,

looseEmptyLists

is defined using filter, and subsequently inlined.

Put auxiliary function into a where clause.

Generally cleans up the script for the next onslaught.

v7: Housekeeping

splitNotNot :: Branch -> Tableau

splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps)

removeNotNot :: Branch -> Branch

removeNotNot [] = []

removeNotNot ((NOT (NOT _)):ps) = ps

removeNotNot (p:ps) = p : removeNotNot ps

solveNotNot :: Branch -> Tableau

solveNotNot [] = [[]]

solveNotNot ((NOT (NOT p)):_) = [[p]]

solveNotNot (_:ps) = solveNotNot ps

v8: Algorithm (1)

splitXXX, removeXXX, solveXXX for each of nine rules.

The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.

Aim: generic versions of split, remove, solve

Change order of rule application … effect on duplicates.

Add map sort to top level pipeline before duplicate removal.

v8: Algorithm (2)

Wholesale replacement of lists by a Set library.

map → mapSet

foldr → foldSet (careful!)

filter → filterSet

The library exposes the representation: pick, flatten.

Use with discretion … further refactoring possible.

Library needed to be augmented with

primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
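What such a primitive recursion operator might do is sketched below over `Data.Set` from the containers package. This is an assumption on two counts: the case study used its own Set library, and the choice of the minimum element as the one to peel off is arbitrary. `pairs` is a hypothetical usage example.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- Like foldr, but the step function also sees the rest of the set.
primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
primRecSet f z s = case Set.minView s of
  Nothing        -> z
  Just (x, rest) -> f x rest (primRecSet f z rest)

-- Example: every unordered pair of distinct elements, once each.
pairs :: Set a -> [(a, a)]
pairs = primRecSet step []
  where step x rest acc = [ (x, y) | y <- Set.toList rest ] ++ acc
```

Access to the remaining set is what a plain fold lacks, and it is also why the operator exposes an element ordering that a pure set interface would hide.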

v9: Replace lists by sets.

Drastic simplification: no explicit worries about

… ordering (and equality), (removal of) duplicates.

Hard to test intermediate stages: type change is all or nothing …

… work with dummy definitions and the type checker.

Further opportunities: why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).

v9: Replace lists by sets (2)

Conclusions of the case study

- Heterogeneous process: some small, some large.
- Are all these stages strictly refactorings: some semantic changes always necessary too?
- Importance of type checking for hand refactoring … and of testing when there are any semantic changes.
- Undo, redo, reordering the refactorings … CVS.
- In this case, directional … not always the case.

Teaching and learning design

- Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
- Learning a language: learn how you could modify the programs that you have written …
- … appreciate the design space, and
- … the features of the language.

Conclusions

- Refactoring + functional programming: good fit.
- Real benefit from using available libraries … with work.
- Want to use the tool in building itself.
- Much more to do than we have time for.
