refactoring functional programs n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Refactoring Functional Programs PowerPoint Presentation
Download Presentation
Refactoring Functional Programs

Loading in 2 Seconds...

play fullscreen
1 / 77

Refactoring Functional Programs - PowerPoint PPT Presentation


  • 112 Views
  • Uploaded on

Refactoring Functional Programs. Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp. Session 2. Overview. Review mini-project. Implementation of HaRe. Larger-scale examples. Case study. Mini-project feedback. Refactorings performed.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Refactoring Functional Programs' - indira-duncan


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
refactoring functional programs

Refactoring Functional Programs

Simon Thompson

with

Huiqing Li

Claus Reinke

www.cs.kent.ac.uk/projects/refactor-fp

overview
Overview
  • Review mini-project.
  • Implementation of HaRe.
  • Larger-scale examples.
  • Case study.
mini project feedback
Mini-project feedback
  • Refactorings performed.
  • Refactorings and language features?
  • Machine support feasible? Useful?
  • ‘Not-quite’ refactorings? Support possible here?
examples
Examples
  • Argument permutations (NB partial application).
  • (Un)group arguments.
  • Slice function for a component of its result.
  • Error handling / exception handling.
more examples
More examples
  • Introduce type synonym, selectively.
  • Introduce ‘branded’ type.
  • Modify the return type of a function fromTtoMaybeT, Either T S, [T].
  • Ditto for input types … and modify variable names correspondingly.
proof of concept
Proof of concept …
  • To show proof of concept it is enough to:
  • build a stand-alone tool,
  • work with a subset of the language,
  • pretty print the results of refactorings.
or a useful tool
… or a useful tool?
  • Integrate with existing program development tools: stand-alone program links to editors emacs and vim, any other IDEs also possible.
  • Work with the complete language: Haskell 98?
  • Preservetheformattingand commentsin the refactored source code.
  • Allow users to extend and script the system.
the refactorings in hare
The refactorings in HaRe
  • Rename
  • Delete
  • Lift / Demote
  • Introduce definition
  • Remove definition
  • Unfold
  • Generalise
  • Add / remove params

Move def between modules

Delete /add to exports

Clean imports

Make imports explicit

Data type to ADT

All these refactorings are module aware.

the implementation of hare
The Implementation of HaRe

Information

gathering

Pre-condition

checking

Program

transformation

Program

rendering

information needed
Information needed
  • Syntax: replace the function called sq, not the variable sq…… parse tree.
  • Static semantics: replace this function sq, not all the sqfunctions …… scope information.
  • Module information: what is the traffic between this module and its clients …… call graph.
  • Type information: replace this identifier when it is used at this type …… type annotations.
infrastructure decisions
Infrastructure: decisions
  • Build a tool that caninteroperatewith emacs, vim, … yet actseparately.
  • Leverage existing libraries for processing Haskell 98, for tree transformation … as few modifications as possible.
  • Be as portable as possible, in the Haskell space.
  • Abstract interface to compiler internals?
haskell landscape end 2002
Haskell landscape (end 2002)
  • Parser: many
  • Type checker: few
  • Tree transformations: few
  • Difficulties
  • Haskell 98 vs. Haskell extensions.
  • Libraries: proof of concept vs. distributable.
  • Source code regeneration.
  • Real project
programatica
Programatica
  • Project at OGI to build a Haskell system …
  • … with integral support for verification at various levels: assertion, testing, proof etc.
  • The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis …
  • … freely available under BSD licence.
the implementation of hare1
The Implementation of HaRe

Information

gathering

Pre-condition

checking

Program

transformation

Program

rendering

first steps lifting and friends
First steps … lifting and friends
  • Use the Haddock parser … full Haskell given in 500 lines of data type definitions.
  • Work by hand over the Haskell syntax: 27 cases for expressions …
  • Code for finding free variables, for instance …
finding free variables by hand
Finding free variables ‘by hand’
  • instance FreeVbls HsExp where
  • freeVbls (HsVar v) = [v]
  • freeVbls (HsApp f e)
  • = freeVbls f ++ freeVbls e
  • freeVbls (HsLambda ps e)
  • = freeVbls e \\ concatMap paramNames ps
  • freeVbls (HsCase exp cases)
  • = freeVbls exp ++ concatMap freeVbls cases
  • freeVbls (HsTuple _ es)
  • = concatMap freeVbls es
  • … etc.
this approach
This approach
  • Boilerplate code … 1000 lines for 100 lines of significant code.
  • Error prone: significant code lost in the noise.
  • Want to generate the boiler plate and the tree traversals …
  • … DriFT: Winstanley, Wallace
  • … Strafunski: Lämmel and Visser
strafunski
Strafunski
  • Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points.
  • Top-down / bottom up, type preserving / unifying,

full

stop

one

strafunski in use
Strafunski in use
  • Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, …
  • Strafunski allows us to work within Haskell …
  • Other options? Generic Haskell, Template Haskell, AG, …
rename an identifier
Rename an identifier
  • rename:: (Term t)=>PName->HsName->t->Maybe t
  • rename oldName newName =applyTPworker
  • where
  • worker =full_tdTP(idTP‘adhocTP‘idSite)
  • idSite :: PName -> Maybe PName
  • idSite v@(PN name orig)
  • | v == oldName
  • = return (PN newName orig)
  • idSite pn = return pn
the coding effort
The coding effort
  • Transformations: straightforward in Strafunski …
  • … the chore is implementing conditions that the transformation preserves meaning.
  • This is where much of our code lies.
move f from module a to b
Move f from module A to B
  • Is fdefined at the top-level of B?
  • Are the free variables in faccessible within module B?
  • Will the move require recursive modules?
  • Remove the definition of f from module A.
  • Add the definition to module B.
  • Modify the import/export lists in module A, Band the client modules ofAandB if necessary.
  • Change uses of A.f to B.f or f in all affected modules.
  • Resolve ambiguity.
the implementation of hare2
The Implementation of HaRe

Information

gathering

Pre-condition

checking

Program

transformation

Program

rendering

program rendering example
Program rendering example
    • -- This is an example
    • module Main where
    • sumSquares x y = sq x + sq y
    • where sq :: Int->Int
    • sq x = x ^ pow
    • pow = 2 :: Int
    • main = sumSquares 10 20
  • Promote the definition of sq to top level
program rendering example1
Program rendering example
    • module Main where
    • sumSquares x y
    • = sq powx + sq powy where pow = 2 :: Int
    • sq :: Int->Int->Int
    • sq pow x = x ^ pow
    • main = sumSquares 10 20
  • Using a pretty printer: comments lost and layout quite different.
program rendering example2
Program rendering example
    • -- This is an example
    • module Main where
    • sumSquares x y = sq x + sq y
    • where sq :: Int->Int
    • sq x = x ^ pow
    • pow = 2 :: Int
    • main = sumSquares 10 20
  • Promote the definition of sq to top level
program rendering example3
Program rendering example
    • -- This is an example
    • module Main where
    • sumSquares x y = sq powx + sq powy
    • where pow = 2 :: Int
    • sq :: Int->Int->Int
    • sq pow x = x ^ pow
    • main = sumSquares 10 20
  • Layout and comments preserved.
token stream and ast
Token stream and AST
  • White space and comments in the token stream.
  • Modification of the AST guides the modification of the token stream.
  • After a refactoring, the program source is extracted from the token stream not the AST.
  • Heuristics associate comments with program entities.
production tool
Production tool

Programatica

parser and

type checker

Refactor

using a

Strafunski

engine

Render code

from the

token stream

and

syntax tree.

production tool optimised
Production tool (optimised)

Programatica

parser and

type checker

Refactor

using a

Strafunski

engine

Render code

from the

token stream

and

syntax tree.

Pass lexical

information to

update the

syntax tree

and so avoid

reparsing

what have we learned
What have we learned?
  • Emerging Haskell libraries make it practical(?)
  • Efficiency and robustness
      • type checking large systems,
      • linking,
      • editor script languages (vim, emacs).
  • Limitations of editor interactions.
  • Reflections on Haskell itself.
reflections on haskell
Reflections on Haskell
  • Cannot hide items in an export list (cf import).
  • Field names for prelude types?
  • Scoped class instances not supported.
  • ‘Ambiguity’ vs. name clash.
  • ‘Tab’ is a nightmare!
  • Correspondence principle fails …
correspondence
Correspondence
  • Operations on definitions and operations on expressions can be placed in one to one correspondence
  • (R.D.Tennent, 1980)
correspondence1
Definitions

where

f x y = e

f x

| g1 = e1

| g2 = e2

Expressions

let

\x y -> e

f x = if g1 then e1 else if g2 … …

Correspondence
function clauses
f x

| g1 = e1

f x

| g2 = e2

Can ‘fall through’ a function clause … no direct correspondence in the expression language.

f x = if g1 then e1 else if g2 …

No clauses for anonymous functions … no reason to omit them.

Function clauses
work in progress
Work in progress
  • ‘Fold’ against definitions … find duplicate code.
  • All, some or one? Effect on the interface …
    • f x = … e … e …
  • Traditional program transformations
      • Short-cut fusion
      • Warm fusion
where next
Where next?
  • Opening up to users: API or little language?
  • Link with other IDEs (and front ends?).
  • Detecting ‘bad smells’.
  • More useful refactorings supported by us.
  • Working without source code.
slide40
API

Refactorings

Refactoring

utilities

Strafunski

Haskell

slide41
DSL

Combining forms

Refactorings

Refactoring

utilities

Strafunski

Haskell

larger scale examples
Larger-scale examples
  • More complex examples in the functional domain; often link with data types.
  • Dawning realisation that can some refactorings are pretty powerful.
  • Bidirectional … no right answer.
algebraic or abstract type
Algebraic or abstract type?
  • data Tr a
  • = Leaf a |
  • Node a (Tr a) (Tr a)
  • flatten :: Tr a -> [a]
  • flatten (Leaf x) = [x]
  • flatten (Node s t)
  • = flatten s ++
  • flatten t
  • Tr
  • Leaf
  • Node
algebraic or abstract type1
Algebraic or abstract type?
  • Tr
  • isLeaf
  • isNode
  • leaf
  • left
  • right
  • mkLeaf
  • mkNode
  • data Tr a
  • = Leaf a |
  • Node a (Tr a) (Tr a)
  • isLeaf = …
  • isNode = …
  • flatten :: Tr a -> [a]
  • flatten t
  • | isleaf t = [leaf t]
  • | isNode t
  • = flatten (left t)
  • ++ flatten (right t)
algebraic or abstract type2

Pattern matching syntax is more direct …

… but can achieve a considerable amount with field names.

Other reasons? Simplicity (due to other refactoring steps?).

Allows changes in the implementation type without affecting the client: e.g. might memoise

Problematic with a primitivetype as carrier.

Allows an invariant to be preserved.

Algebraic or abstract type?
outside or inside
Outside or inside?
  • Tr
  • isLeaf
  • isNode
  • leaf
  • left
  • right
  • mkLeaf
  • mkNode
  • data Tr a
  • = Leaf a |
  • Node a (Tr a) (Tr a)
  • isLeaf = …
  • isNode = …
  • flatten :: Tr a -> [a]
  • flatten t
  • | isleaf t = [leaf t]
  • | isNode t
  • = flatten (left t)
  • ++ flatten (right t)
outside or inside1
Outside or inside?
  • Tr
  • isLeaf
  • isNode
  • leaf
  • left
  • right
  • mkLeaf
  • mkNode
  • flatten
  • data Tr a
  • = Leaf a |
  • Node a (Tr a) (Tr a)
  • isLeaf = …
  • isNode = …
  • flatten t = …
outside or inside2

If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.

The more outside the better, therefore.

If inside can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.

Layered types possible: put the utilities in a privileged zone.

Outside or inside?
memoise flatten tr a a
Memoise flatten :: Tr a->[a]
  • data Tree a
  • = Leaf { val::a } |
  • Node { val::a,
  • left,right::(Tree a) }
  • leaf = Leaf
  • node = Node
  • flatten (Leaf x) = [x]
  • flatten (Node x l r) =
  • (x : (flatten l ++ flatten r))
  • data Tree a
  • = Leaf { val::a,
  • flatten:: [a] } |
  • Node { val::a,
  • left,right::(Tree a),
  • flatten::[a] }
  • leaf x = Leaf x [x]
  • node x l r
  • = Node x l r (x : (flatten l ++
  • flatten r))
memoise flatten
Memoise flatten
  • Invisible outside the implementation module, if tree type is already an ADT.
  • Field names in Haskell make it particularly straightforward.
data type or existential type
Data type or existential type?

data Shape data Shape

= Circle Float | = forall a. Sh a => Shape a

Rect Float Float

class Sh a where

area :: Shape -> Float area :: a -> Float

area (Circle f) = pi*r^2 perim :: a -> Float

area (Rect h w) = h*w

data Circle = Circle Float

perim :: Shape -> Float

perim (Circle f) = 2*pi*r instance Sh Circle

perim (Rect h w) = 2*(h+w) area (Circle f) = pi*r^2

perim (Circle f) = 2*pi*r

data Rect = Rect Float

instance Sh Rect

area (Rect h w) = h*w

perim (Rect h w) = 2*(h+w)

constructor or constructor
Constructor or constructor?

data Expr data Expr

= Epsilon | .... | = Epsilon | .... |

Then Expr Expr | Then Expr Expr |

Star Expr Star Expr |

Plus Expr

plus e = Then e (Star e)

monadification expressions
Monadification: expressions

data Expr

= Lit Integer | -- Literal integer value

Vbl Var | -- Assignable variables

Add Expr Expr | -- Expression addition: e1+e2

Assign Var Expr -- Assignment: x:=e

type Var = String

type Store = [ (Var, Integer) ]

lookup :: Store -> Var -> Integer

lookup st x = head [ i | (y,i) <- st, y==x ]

update :: Store -> Var -> Integer -> Store

update st x n = (x,n):st

monadification evaulation
Monadification: evaulation

eval :: Expr -> evalST :: Expr ->

Store -> (Integer, Store) State Store Integer

eval (Lit n) st evalST (Lit n)

= (n,st) = do

return n

eval (Vbl x) st evalST (Vbl x)

= (lookup st x,st) = do

st <- get

return (lookup st x)

monadification evaulation 2
Monadification: evaulation 2

eval :: Expr -> evalST :: Expr ->

Store -> (Integer, Store) State Store Integer

eval (Add e1 e2) st evalST (Add e1 e2)

= (v1+v2, st2) = do

where v1 <- evalST e1

(v1,st1) = eval e1 st v2 <- evalST e2

(v2,st2) = eval e2 st1 return (v1+v2)

eval (Assign x e) st evalST (Assign x e)

= (v, update st' x v) = do

where v <- evalST e

(v,st') = eval e st st <- get

put (update st x v)

return v

classes and instances
Classes and instances
  • Type Store = [Int]
  • empty :: Store
  • empty = []
  • get :: Var -> Store -> Int
  • get v st = head [ i | (var,i) <- st, var==v]
  • set :: Var -> Int -> Store -> Store
  • set v i = ((v,i):)
classes and instances1
Classes and instances
  • Type Store = [Int]
  • empty :: Store
  • get :: Var -> Store -> Int
  • set :: Var -> Int -> Store -> Store
  • empty = []
  • get v st = head [ i | (var,i) <- st, var==v]
  • set v i = ((v,i):)
classes and instances2
Classes and instances
    • class Store a where
    • empty :: a
    • get :: Var -> a -> Int
    • set :: Var -> Int -> a -> a
    • instance Store [Int] where
    • empty = []
    • get v st = head [ i | (var,i) <- st, var==v]
    • set v i = ((v,i):)
  • Need newtype wrapper in Haskell 98 …

end

understanding a program
Understanding a program
  • Take a working semantic tableau system written by an anonymous 2nd year student …
  • … refactor to understand its behaviour.
  • Nine stages of unequal size.
  • Reflections afterwards.
an example tableau

((AB)C)

(AC)

(AB)

C

A

C

MakeBTrue

MakeAandCFalse

A

B

An example tableau

((AC)((AB)C))

v1 name types
Built-in types

[Prop]

[[Prop]]

used for branches and tableaux respectively.

Modify by adding

type Branch = [Prop]

type Tableau = [Branch]

Change required throughout the program.

Simple edit: but be aware of the order of substitutions: avoid

type Branch = Branch

v1: Name types
v2 rename functions
Existing names

tableaux

removeBranch

remove

become

tableauMain

removeDuplicateBranches

removeBranchDuplicates

and add comments clarifying the (intended) behaviour.

Add test datum.

Discovered some edits undone in stage 1.

Use of the type checker to catch errors.

test will be useful later?

v2: Rename functions
v3 literate normal script
Change from literate form:

Comment …

> tableauMain tab

> = ...

to

-- Comment …

tableauMain tab

= ...

Editing easier: implicit assumption was that it was a normal script.

Could make the switch completely automatic?

v3: Literate  normal script
v4 modify function definitions
From explicit recursion:

displayBranch

:: [Prop] -> String

displayBranch [] = []

displayBranch (x:xs)

= (show x) ++ "\n" ++

displayBranch xs

to

displayBranch

:: Branch -> String

displayBranch

= concat . map (++"\n") . map show

Abstraction: move from explicit list representation to operations such as map and concat which could be over any collection type.

First time round added incorrect (but type correct) redefinition … only spotted at next stage.

Version control: un/redo etc.

v4: Modify function definitions
v5 algorithms and types 1
removeBranchDup :: Branch -> Branch

removeBranchDup [] = []

removeBranchDup (x:xs)

| x == findProp x xs = [] ++ removeBranchDup xs

| otherwise = [x] ++ removeBranchDup xs

findProp :: Prop -> Branch -> Prop

findProp z [] = FALSE

findProp z (x:xs)

| z == x = x

| otherwise = findProp z xs

v5: Algorithms and types (1)
v5 algorithms and types 2
removeBranchDup :: Branch -> Branch

removeBranchDup [] = []

removeBranchDup (x:xs)

| findProp x xs = [] ++ removeBranchDup xs

| otherwise = [x] ++ removeBranchDup xs

findProp :: Prop -> Branch -> Bool

findProp z [] = False

findProp z (x:xs)

| z == x = True

| otherwise = findProp z xs

v5: Algorithms and types (2)
v5 algorithms and types 3
removeBranchDup :: Branch -> Branch

removeBranchDup = nub

findProp :: Prop -> Branch -> Bool

findProp = elem

v5: Algorithms and types (3)
v5 algorithms and types 4
removeBranchDup :: Branch -> Branch

removeBranchDup = nub

Fails the test! Two duplicate branches output, with different ordering of elements.

The algorithm used is the 'other' nub algorithm, nubVar:

nub [1,2,0,2,1] = [1,2,0]

nubVar [1,2,0,2,1] = [0,2,1]

Code using lists in a particular order to represent sets.

v5: Algorithms and types (4)
v6 library function to module
Add the definition:

nubVar = …

to the module

ListAux.hs

and replace the definition by

import ListAux

Editing easier: implicit assumption was that it was a normal script.

Could make the switch completely automatic?

v6: Library function to module
v7 housekeeping
Remanings: including foo and bar and contra (becomes notContra).

An instance of filter,

looseEmptyLists

is defined using filter, and subsequently inlined.

Put auxiliary function into a where clause.

Generally cleans up the script for the next onslaught.

v7: Housekeeping
v8 algorithm 1
splitNotNot :: Branch -> Tableau

splitNotNot ps = combine (removeNotNot ps) (solveNotNot ps)

removeNotNot :: Branch -> Branch

removeNotNot [] = []

removeNotNot ((NOT (NOT _)):ps) = ps

removeNotNot (p:ps) = p : removeNotNot ps

solveNotNot :: Branch -> Tableau

solveNotNot [] = [[]]

solveNotNot ((NOT (NOT p)):_) = [[p]]

solveNotNot (_:ps) = solveNotNot ps

v8: Algorithm (1)
v8 algorithm 2
splitXXX removeXXX solveXXX for each of nine rules.

The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.

Aim: generic versions of splitremovesolve

Change order of rule application … effect on duplicates.

Add map sort to top level pipeline before duplicate removal.

v8: Algorithm (2)
v9 replace lists by sets
Wholesale replacement of lists by a Set library.

map mapSet

foldr foldSet (careful!)

filter filterSet

The library exposes the representation: pick, flatten.

Use with discretion … further refactoring possible.

Library needed to be augmented with

primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b

v9: Replace lists by sets.
v9 replace lists by sets 2
Drastic simplification: no explicit worries about

… ordering (and equality), (removal of) duplicates.

Hard to test intermediate stages: type change is all or nothing …

… work with dummy definitions and the type checker.

Further opportunities: why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).

v9: Replace lists by sets (2)
conclusions of the case study
Conclusions of the case study
  • Heterogeneous process: some small, some large.
  • Are all these stages strictly refactorings: some semantic changes always necessary too?
  • Importance of type checking for hand refactoring … … and testing when any semantic changes.
  • Undo, redo, reordering the refactorings … CVS.
  • In this case, directional … not always the case.
teaching and learning design
Teaching and learning design
  • Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
  • Learning a language: learn how you could modify the programs that you have written …
  • … appreciate the design space, and
  • … the features of the language.
conclusions
Conclusions
  • Refactoring + functional programming: good fit.
  • Real benefit from using available libraries … with work.
  • Want to use the tool in building itself.
  • Much more to do than we have time for.