13/05/2013 Gijs Dekkers, Gaëtan de Menten, Raphaël Desmet
LIAM2: introduction and demo model. Presentation for the pôle Prévoyance of the Caisse de Dépôt et de Gestion, Rabat, Morocco, 13/05/2013. Gijs Dekkers, Gaëtan de Menten, Raphaël Desmet.

LIAM2
  • Introduction and demo model
  • Presentation for the pôle Prévoyance of the Caisse de Dépôt et de Gestion
  • Rabat, Morocco.

13/05/2013

Gijs Dekkers

Gaëtan de Menten

Raphaël Desmet

Introduction to Liam2
  • Tool for the development of dynamic microsimulation models with dynamic cross-sectional ageing.
  • Not a microsimulation model in itself (unlike MIDAS)
  • Simulation framework that allows for comprehensive modelling and various simulation techniques
  • Prospective / Retrospective simulation
  • Work in progress …
    • Immigration
    • Weights
    • More sophisticated regressions and simulation techniques
    • Speed optimisation
  • You get it for free!
How to get it.
  • Check http://liam2.plan.be
  • This website contains
    • The LIAM 2 executable.
    • A synthetic dataset of 20,200 individuals grouped in 14,700 households in HDF5 format.
    • A small model containing
      • Fertility and mortality (aligned)
      • Educational attainment level
      • Some labour market characteristics
    • Documentation
      • A LIAM 2 user guide
    • A ready-to-use “bundle” of Notepad++ integrated with LIAM 2 and the synthetic dataset.
Overview
  • Written in Python
    • High level open source language
    • Efficient libraries, mostly written in C
  • Input
    • Model description: text file (YAML)
    • Alignment: CSV files
    • Internal data engine: HDF5 file format and library for storing scientific data (meteorology, astronomy, …)
  • Output
    • HDF5 file, CSV file on demand
    • Interactive console
Model definition: the simulation file
  • Declare entities (=data)
    • What is modelled? (person, household, enterprise, …)
    • Entity characteristics
      • fields:
        • what do we know about an individual?
        • what do we want to know?
        • How can we store the data?
          • flag: boolean (e.g. alive/dead, male/female)
          • discrete/category: integer (e.g. single/married/divorced/…)
          • continuous/value: float (e.g. income)
      • links: interaction between entities
        • same kind: who is the mother?
        • different kinds: in what household does the person live?
  • Globals (=data)
    • External time series
      • Eg. macroeconomic context
Model definition: the simulation file
  • Simulation (=model)
    • Processes:
      • What happens to the entities in their lives?
      • In what order?
    • input: Which input file to use?
    • output: Where is the output?
    • start_period: What is the first period to simulate?
    • periods: How many periods do we want to simulate?
Simple simulation file

entities:
    person:
        fields:
            # period and id are implicit
            - age: int
            - gender: bool

        processes:
            age: age + 1
            isfemale: gender == True

simulation:
    processes:
        - person: [age, isfemale]
    input:
        file: base.h5
    output:
        file: output.h5
    start_period: 2002   # first simulated period
    periods: 20

Liam2 bundled with Notepad++

[Figure: Notepad++ window, with labels "model/YAML" and "Interactive console"]

Liam2 bundled with Notepad++
  • Simulation file (YAML-format, yml extension, highlighting)
    • indentation (grouping, levels)
    • colon, dash, brackets, double quotes, quotes, ...
    • comment (#)
  • Console
    • run: F6
    • import: F5
    • output
    • interactive (history)
LIAM 2 – demo model
  • First simulation
    • simple entity
    • simple functions
    • first run
    • some output
Basic setup
  • Description of the data : entities
    • fields:
      • name
      • type = bool (boolean), int (integer) or float
      • initialdata (data from input or new data)
  • The model definition: processes
    • model definition (transformation, regressions, alignment, ...)
  • Order of the processes: simulation
    • database (input, output)
    • what processes and when? (model order)
    • start_period, number of periods
Basic simulation file

entities:
    person:
        fields:
            # period and id are implicit
            - age: int
            - gender: bool

        processes:
            age: age + 1

simulation:
    processes:
        - person: [age]
    input:
        file: base.h5
    output:
        file: output.h5
    start_period: 2002   # first simulated period
    periods: 20

Simple simulation (to run the file, press F6)

entities:
    person:
        fields:
            # period and id are implicit
            - age: int
            - gender: bool
            # fields not present in input
            - agegroup: {type: int, initialdata: false}

        processes:
            age: age + 1
            agegroup: 5 * trunc(age / 5)

simulation:
    processes:
        - person: [age, agegroup]
    input:
        file: simple2001.h5
    output:
        file: simulation.h5
    start_period: 2002
    periods: 2
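
For intuition, the agegroup expression above maps each age to the lower bound of its five-year band. The following is a minimal Python sketch of that arithmetic only; it is not LIAM2 code, and the function name is made up for illustration.

```python
import math


def agegroup(age):
    """Lower bound of the 5-year band containing `age`.

    Mirrors the model expression: 5 * trunc(age / 5).
    """
    return 5 * math.trunc(age / 5)


# A few illustrative ages (made-up values).
for age in (0, 4, 5, 17, 64):
    print(age, "->", agegroup(age))
```

In the model, LIAM2 evaluates the expression for all individuals of the entity at once; the loop here only stands in for that.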

Console output

Using simulation file: 'C:\usr\Liam2Suite\Synthetic\demo00.yml'
reading data from C:\usr\Liam2Suite\Synthetic\simple2001.h5 ...
person ...
period 2002
- loading input data
  * person ... done (0 ms elapsed).
    -> 20200 individuals
- 1/2 age ... done (2 ms elapsed).
- 2/2 agegroup ... done (3 ms elapsed).
- storing period data
  * person ... done (2 ms elapsed).
    -> 20200 individuals
period 2002 done (0.01 second elapsed).

period 2003

top 10 processes:
- agegroup: 0.01 second
- age: 3 ms
total for top 10 processes: 0.01 second

Output
  • Internal format = HDF5 file
  • Write to the console
    • show(expr1[, expr2, … ]): evaluates the expressions and shows the result
    • dump(expr1[, expr2, …, filter, missing, header]): produces a table with the given expressions evaluated over many (possibly all) individuals of the dataset.
  • Write to CSV files
    • csv(expr1[, expr2, …, suffix, fname, mode]): writes values to a CSV file
  • Pivot tables:
    • groupby(expr1[, expr2, …, filter=None, percent=False])
Some functions
  • Expressions
    • Arithmetic operators: +, -, *, /, ** (exponent), % (modulo)
    • Comparison operators: <, <=, ==, !=, >=, >
    • Boolean operators: and, or, not
    • Conditional expressions: if(condition, expression_if_true, expression_if_false)
  • Mathematical functions
    • abs, log, exp, round, trunc, ...
  • Aggregate functions
    • grpcount, grpsum, grpavg, grpstd, grpmax, grpmin
  • Temporal functions
    • lag, value_for_period, duration, tavg, tsum
  • Random functions
    • uniform, normal, randint
Simple simulation (to run the file, press F6)

entities:
    person:
        fields:
            # period and id are implicit
            - age: int
            - gender: bool
            # fields not present in input
            - agegroup: {type: int, initialdata: false}

        processes:
            age: age + 1
            agegroup: if(age < 50, 5 * trunc(age / 5),
                         10 * trunc(age / 10))
            # produces 2 csv files (one per period): "person_20xx.csv"
            # default name for csv-file = {entity}_{period}.csv
            dump_info: csv(dump(id, age, gender))
            show_demography: show(groupby(agegroup, gender))

simulation:
    processes:
        - person: [age, agegroup,
                   dump_info, show_demography]
    input:
        file: simple2001.h5
    output:
        file: simulation.h5
    # first simulated period
    start_period: 2002
    periods: 2
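
To make the agegroup and groupby lines above concrete, here is a small Python sketch of what they compute: five-year bands below 50, ten-year bands from 50 on, then a count per (agegroup, gender) cell. It is an illustration under assumptions, not LIAM2 code; the helper names and the sample data are made up.

```python
import math
from collections import Counter


def agegroup(age):
    # Mirrors: if(age < 50, 5 * trunc(age / 5), 10 * trunc(age / 10))
    if age < 50:
        return 5 * math.trunc(age / 5)
    return 10 * math.trunc(age / 10)


def groupby_count(rows):
    """Count individuals per (agegroup, gender) cell, like a pivot table."""
    return Counter((agegroup(age), gender) for age, gender in rows)


# Made-up sample of (age, gender) pairs.
people = [(3, True), (4, False), (47, True), (52, True), (58, True), (71, False)]
table = groupby_count(people)

for cell in sorted(table):
    print(cell, table[cell])
```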

Interactive console

Welcome to LIAM interactive console.
 help: print this help
 q[uit] or exit: quit the program
 entity [name]: set the current entity (this is required before any query)
 period [period]: set the current period (if not set, uses the last period simulated)
 fields [entity]: list the fields of that entity (or the current entity)
 show is implicit on all commands

>>> period 2002
current period set to 2002
>>> entity person
current entity set to person
>>> grpcount(gender)
10100
>>> grpcount(not gender)
10100

Remarks
  • All output functions can be used both during the simulation and in the interactive console
  • Some examples - show
    • show(groupby(age, gender, filter=age<=10))
    • show(grpcount(age >= 18))
    • show(grpcount(not dead), grpavg(age, filter=not dead))
    • show("Count:", grpcount(), "\nAverage age:", grpavg(age), "\nAge std dev:", grpstd(age))
  • Some examples – csv
    • csv(grpavg(age))
    • csv(period, grpavg(age), fname='avg_income.csv', mode='a')
  • Some examples – groupby
    • groupby(trunc(age/10),gender)
    • groupby(trunc(age/10),gender, percent=True)
Links, init, procedures, choice: demo02.yml

Links: model interaction
  • second entity (e.g. household)
  • links: interaction between entities (e.g. persons, households)
  • one2many (one household has many persons)

person:
    fields:
        # period and id are implicit
        - age: int
        - gender: bool
        ...
        - hh_id: int

household:
    fields:
        # period and id are implicit
        - nb_persons: int
        - nb_children: int

    links:
        persons: {type: one2many,
                  target: person,
                  field: hh_id}

Use the links: aggregate functions

entities:
    household:
        fields:
            # period and id are implicit
            - nb_persons: {type: int, initialdata: false}
            - nb_children: {type: int, initialdata: false}

        links:
            persons: {type: one2many, target: person, field: hh_id}

        processes:
            household_composition:
                - nb_persons: countlink(persons)
                - nb_children: countlink(persons, age < 18)

To use information stored in linked entities, you have to use aggregate functions:
  • countlink (e.g. countlink(persons) gives the number of persons in the household)
  • sumlink (e.g. sumlink(persons, income) sums the incomes of all members of a household)
  • avglink (e.g. avglink(persons, age) gives the average age of the members of a household)
  • minlink, maxlink (e.g. minlink(persons, age) gives the age of the youngest member of the household)
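
The link aggregates described above can be pictured as "collect the persons whose hh_id points at this household, then aggregate". A rough Python sketch follows, with made-up data. The function names echo the LIAM2 ones for readability, but these are hypothetical stand-ins: they take a household id, whereas in LIAM2 the first argument is the link name.

```python
# Made-up person records; hh_id is the field the one2many link is built on.
persons = [
    {"id": 1, "hh_id": 10, "age": 41, "income": 2000.0},
    {"id": 2, "hh_id": 10, "age": 9, "income": 0.0},
    {"id": 3, "hh_id": 11, "age": 70, "income": 1200.0},
]


def members(hh_id):
    """All persons linked to the given household."""
    return [p for p in persons if p["hh_id"] == hh_id]


def countlink(hh_id, cond=lambda p: True):
    """Number of linked persons, optionally restricted by a condition."""
    return sum(1 for p in members(hh_id) if cond(p))


def sumlink(hh_id, field):
    return sum(p[field] for p in members(hh_id))


def minlink(hh_id, field):
    return min(p[field] for p in members(hh_id))


nb_persons = countlink(10)                             # like countlink(persons)
nb_children = countlink(10, lambda p: p["age"] < 18)   # like countlink(persons, age < 18)
print(nb_persons, nb_children, sumlink(10, "income"), minlink(10, "age"))
```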

many2one and the “.”-function

entities:
    person:
        fields:
            - age: int
            - gender: bool
            # link fields
            - hh_id: int

        links:
            household: {type: many2one, target: household, field: hh_id}

many2one: links an item of the entity to one other item, either in the same entity (e.g. a person to their mother) or in another entity (e.g. a person to their household).

To access the value of a field of a linked item, use:

    link_name.field_name

        processes:
            # produces "person_20xx_info.csv"
            dump_info: csv(dump(id, age, gender, household.nb_persons),
                           suffix='info')

many2one and the “.”-function

person:
    fields:
        # period and id are implicit
        - age: int
        - gender: bool
        # link fields
        - mother_id: int
        - partner_id: int
        - hh_id: int

    links:
        mother: {type: many2one, target: person, field: mother_id}
        partner: {type: many2one, target: person, field: partner_id}
        household: {type: many2one, target: household, field: hh_id}
        children: {type: one2many, target: person, field: mother_id}

Some examples:

    mother.age
    mother.mother.age
    age - partner.age
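
Conceptually, a many2one link such as mother is an id lookup, and chaining links (mother.mother.age) repeats that lookup. The Python sketch below illustrates the idea with made-up data. The UNSET value of -1 and the choice to return None for a missing link are assumptions of this sketch, not a statement of how LIAM2 represents missing links internally.

```python
UNSET = -1  # assumption: sentinel id meaning "no linked item"

# Made-up persons, keyed by id.
persons = {
    1: {"age": 80, "mother_id": UNSET},
    2: {"age": 55, "mother_id": 1},
    3: {"age": 30, "mother_id": 2},
}


def follow(person_id, link_field):
    """Return the id of the linked person, or None if the link is unset."""
    target = persons[person_id][link_field]
    return None if target == UNSET else target


def mother_age(person_id):
    """Analogue of mother.age."""
    mother = follow(person_id, "mother_id")
    return None if mother is None else persons[mother]["age"]


def grandmother_age(person_id):
    """Analogue of mother.mother.age: follow the link twice."""
    mother = follow(person_id, "mother_id")
    return None if mother is None else mother_age(mother)


print(mother_age(3), grandmother_age(3), grandmother_age(2))
```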

Simulation: init - processes

simulation:
    init:
        - household: [init_region_id, household_composition]
    processes:
        - household: [household_composition]
        - person: [ageing, dump_info]
    input:
        file: simple2001.h5
    output:
        file: simulation.h5
    # first simulated period
    start_period: 2002
    periods: 2

init: executes the processes in start_period - 1 (here 2001) to initialise the household variables

processes: executes in 2002, 2003

Simulation: procedures – local variables

processes:
    ageing:
        - age: age + 1
        - juniors: 5 * trunc(age / 5)
        - plus50: 10 * trunc(age / 10)
        - agegroup: if(age < 50, juniors, plus50)
    dump_info: csv(dump(id, age, gender, hh_id, household.nb_persons,
                        mother.age, partner.age), suffix='info')
    show_demography: show(groupby(agegroup, gender))

procedures
  • single process (e.g. dump_info)
  • multi process (e.g. ageing)
  • local variables
    • temporary: only available in the ageing procedure
    • not stored (e.g. juniors, plus50 in the ageing procedure)
Stochastic changes I: probabilistic simulation

entities:
    household:
        fields:
            # period and id are implicit
            - nb_persons: {type: int, initialdata: false}
            - nb_children: {type: int, initialdata: false}
            - region_id: {type: int, initialdata: false}

        links:
            persons: {type: one2many, target: person, field: hh_id}

        processes:
            init_region_id:
                - region_id: choice([0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4])

choice
  • region_id: 10% chance to get 0, 20% for 1, 30% for 2 and 40% for 3
  • beware: the probabilities must sum to 100%
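
As a rough illustration of what choice does for one individual, the Python sketch below draws a value with the given probabilities and refuses probabilities that do not sum to 1. It uses the standard library's random.choices; the explicit check and the function signature are assumptions of this sketch rather than a description of LIAM2's implementation.

```python
import random


def choice(values, probabilities, rng=random):
    """Draw one of `values` with the given probabilities."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return rng.choices(values, weights=probabilities, k=1)[0]


# Fixed seed so the illustration is repeatable.
rng = random.Random(0)
draws = [choice([0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4], rng) for _ in range(10000)]

# Observed shares should be close to 10%, 20%, 30% and 40%.
shares = {v: draws.count(v) / len(draws) for v in range(4)}
print(shares)
```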
Stochastic changes II: behavioural equations
  • Logit:
    • logit_regr(expr, filter=None, align=percentage)
    • logit_regr(expr, filter=None, align='filename.csv')
  • Alignment :
    • align(expr, [take=take_filter,] [leave=leave_filter,] fname='filename.csv')

  • Continuous (expr + normal(0, 1) * mult + error_var):
    • cont_regr(expr, filter, mult, error_var)
  • Clipped continuous (always positive):
    • clip_regr(expr, filter, mult, error_var)
  • Log continuous (exponential of continuous):
    • log_regr(expr, filter, mult, error_var)
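
The three continuous variants above differ only in what is done to the value of expr + normal(0, 1) * mult + error_var. A minimal Python sketch for a single individual follows, taking the formula from the slide. The filter argument is left out, and reading "always positive" as "negative values are set to 0" is an assumption of this sketch.

```python
import math
import random


def cont_regr(expr, mult=0.0, error_var=0.0, rng=random):
    """expr + normal(0, 1) * mult + error_var, as stated on the slide."""
    return expr + rng.gauss(0, 1) * mult + error_var


def clip_regr(expr, mult=0.0, error_var=0.0, rng=random):
    """Continuous value, with negatives clipped to 0 (assumed reading)."""
    return max(cont_regr(expr, mult, error_var, rng), 0.0)


def log_regr(expr, mult=0.0, error_var=0.0, rng=random):
    """Exponential of the continuous value."""
    return math.exp(cont_regr(expr, mult, error_var, rng))


rng = random.Random(1)
print(cont_regr(1.5, mult=0.2, rng=rng))
print(clip_regr(-2.0, mult=0.2, rng=rng))
print(log_regr(0.0, mult=0.2, rng=rng))
```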
logit + align example

processes:
    ageing:
        - age: age + 1
    birth:
        - to_give_birth: logit_regr(0.0,
                                    filter=not gender and (age >= 15) and (age <= 50),
                                    align='al_p_birth.csv')

logit_regr(expr, filter, align)
  • expr: the logit expression (here the constant 0.0)
  • filter: selects the individuals of the entity to which the regression applies
  • align: applies alignment using al_p_birth.csv
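
One way to picture alignment: among the individuals that pass the filter, flag those with the highest scores until the target proportion is reached. The Python sketch below is a deliberately simplified illustration under that assumption. It uses a single target proportion, whereas the alignment file in the example (al_p_birth.csv) presumably holds proportions per category; the function and its arguments are hypothetical.

```python
def align(scores, eligible, proportion):
    """Flag the highest-scoring eligible individuals.

    The number flagged is round(n_eligible * proportion).
    Returns one boolean per individual.
    """
    candidates = [i for i, ok in enumerate(eligible) if ok]
    n_take = round(len(candidates) * proportion)
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:n_take])
    return [i in chosen for i in range(len(scores))]


# Made-up scores; individual 2 does not pass the filter.
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
eligible = [True, True, False, True, True, True]

# 40% of the 5 eligible individuals means 2 are selected.
flags = align(scores, eligible, 0.4)
print(flags)
```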
macros: easier to read, maintain

Without macros:

processes:
    ageing:
        - age: age + 1
    birth:
        - to_give_birth: logit_regr(0.0,
                                    filter=not gender and (age >= 15) and (age <= 50),
                                    align='al_p_birth.csv')

With macros:

person:
    fields:
        - age: int
        . . .

    macros:
        MALE: True
        FEMALE: False
        ISMALE: gender
        ISFEMALE: not gender

    processes:
        ageing:
            - age: age + 1
        birth:
            - to_give_birth: logit_regr(0.0,
                                        filter=ISFEMALE and (age >= 15) and (age <= 50),
                                        align='al_p_birth.csv')

  • macros
    • defined on entity level
    • re-evaluated on each execution
Life cycle functions – new – create new entities

birth:
    - to_give_birth: logit_regr(0.0,
                                filter=ISFEMALE and (age >= 15) and (age <= 50),
                                align='al_p_birth.csv')
    - new('person', filter=to_give_birth,
          mother_id = id,
          hh_id = hh_id,
          age = 0,
          partner_id = UNSET,
          civilstate = SINGLE,
          gender = choice([MALE, FEMALE], [0.51, 0.49]))

new
  • entity name: what to create (the same entity or another one, e.g. a household on marriage)
  • filter: who creates a new item
  • set initial values for a selection of variables
Life cycle functions – remove – remove entities

death:
    - dead: if(ISMALE,
               logit_regr(0.0, align='al_p_dead_m.csv'),
               logit_regr(0.0, align='al_p_dead_f.csv'))
    - civilstate: if(partner.dead, WIDOW, civilstate)
    - partner_id: if(partner.dead, UNSET, partner_id)
    - show('Avg age of dead men', grpavg(age, filter=dead and ISMALE))
    - show('Avg age of dead women', grpavg(age, filter=dead and ISFEMALE))
    - show('Widows', grpsum(ISWIDOW))
    - remove(dead)

remove
  • filter: who has to be removed?
  • Item is removed from the entity set
    • No data is available for that period and later
    • Historical data is still accessible
    • Links must be cleaned manually if necessary
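
The last point above is why the death process resets partner_id before calling remove: a link to a removed item would otherwise point at nothing. A small Python sketch of that pattern, with made-up data; the UNSET value of -1 and the function name are assumptions of this sketch.

```python
UNSET = -1  # assumption: sentinel id meaning "no linked item"

# Made-up persons, keyed by id.
persons = {
    1: {"dead": True, "partner_id": 2},
    2: {"dead": False, "partner_id": 1},
    3: {"dead": False, "partner_id": UNSET},
}


def remove_dead(population):
    """Drop dead persons and clean partner links that pointed at them."""
    dead_ids = {pid for pid, p in population.items() if p["dead"]}
    survivors = {
        pid: dict(p) for pid, p in population.items() if pid not in dead_ids
    }
    for person in survivors.values():
        if person["partner_id"] in dead_ids:
            person["partner_id"] = UNSET  # manual link cleaning
    return survivors


survivors = remove_dead(persons)
print(survivors)
```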
Remove empty households

entities:
    household:
        fields:
            - nb_persons: {type: int, initialdata: false}

        links:
            persons: {type: one2many, target: person, field: hh_id}

        processes:
            household_composition:
                - nb_persons: countlink(persons)
                - nb_children: countlink(persons, age < 18)
            clean_empty: remove(nb_persons == 0)
. . .

simulation:
    processes:
        - person: [list of processes]
        - household: [household_composition, clean_empty]

Debugging possibilities
  • show and dump functions
    • skip_shows: if set to True, annuls all show() functions
  • interactive console
    • period
    • entity
    • output: aggregate, groupby functions
  • breakpoint
    • breakpoint()
    • breakpoint(2021)
    • step (or s)
    • resume (or r)
  • random_seed
    • fix the random seed if you want several runs of a simulation to use the same random numbers.
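
The effect of fixing the seed can be shown with any pseudo-random generator: the same seed gives the same sequence of draws, so two runs can be compared line by line. A tiny Python sketch using the standard library follows; the simulate function is a made-up stand-in for a model run, not LIAM2 code.

```python
import random


def simulate(seed, n=5):
    """Stand-in for a stochastic run: n yes/no draws from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n)]


run_a = simulate(seed=42)
run_b = simulate(seed=42)
print(run_a == run_b)  # identical seeds give identical draws
```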
Matching - aka Marriage market
  • matches individuals from subset 1 with individuals from subset 2
    • Give each individual in subset 1 a particular order (orderby)
    • Compute the score of all (unmatched) individuals in subset 2
    • take the best score

matching(set1filter=boolean_expr,
         set2filter=boolean_expr,
         orderby=difficult_match,
         score=coef1 * field1 + coef2 * other.field2 + ...)
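
The three steps listed above amount to a greedy matching. The Python sketch below follows them literally: order subset 1, score every still-unmatched member of subset 2 for each individual in turn, and take the best. Processing the highest orderby values first (hardest to match first) is an assumption of this sketch, and the data and the age-gap score are made up for illustration.

```python
def matching(set1, set2, orderby, score):
    """Greedy matching; returns {id from set1: id from set2}."""
    unmatched = list(set2)
    pairs = {}
    # Assumption: individuals with the highest `orderby` value go first.
    for a in sorted(set1, key=orderby, reverse=True):
        if not unmatched:
            break
        best = max(unmatched, key=lambda b: score(a, b))
        pairs[a["id"]] = best["id"]
        unmatched.remove(best)
    return pairs


# Made-up candidates.
women = [{"id": 1, "age": 30}, {"id": 2, "age": 60}]
men = [{"id": 3, "age": 58}, {"id": 4, "age": 33}]

avg_age_men = sum(m["age"] for m in men) / len(men)

pairs = matching(
    women,
    men,
    # Harder to match the further the age is from the average age of the men.
    orderby=lambda w: abs(w["age"] - avg_age_men),
    # Illustrative score: the smaller the age gap, the better.
    score=lambda w, m: -abs(w["age"] - m["age"]),
)
print(pairs)
```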

Marriage

marriage:
    - in_couple: ISMARRIED
    - to_couple: if((age >= 18) and (age <= 90) and not in_couple,
                    if(ISMALE,
                       logit_regr(0.0, align='al_p_mmkt_m.csv'),
                       logit_regr(0.0, align='al_p_mmkt_f.csv')),
                    False)
    - difficult_match: if(to_couple and ISFEMALE,
                          abs(age - grpavg(age, filter=to_couple and ISMALE)),
                          nan)
    - partner_id: if(to_couple,
                     matching(set1filter=ISFEMALE, set2filter=ISMALE,
                              score=- 0.4893 * other.age + 0.0131 * other.age ** 2 ...,
                              orderby=difficult_match),
                     partner_id)
    - justcoupled: to_couple and (partner_id != UNSET)
    - civilstate: if(justcoupled, MARRIED, civilstate)

New links, change links

marriage:
    - in_couple: ISMARRIED
    ...
    - civilstate: if(justcoupled, MARRIED, civilstate)
    - newhousehold: new('household', filter=justcoupled and ISFEMALE,
                        region_id=choice([0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4]))
    - hh_id: if(justcoupled,
                if(ISMALE, partner.newhousehold, newhousehold),
                hh_id)
    - csv(dump(id, age, gender, partner.id, partner.age,
               partner.gender, hh_id, filter=justcoupled),
          suffix='new_couples')

new link
  • change the value of the linked field
Remove links

divorce:
    - agediff: if(ISFEMALE and ISMARRIED, age - partner.age, 0)
    # select females to divorce
    - divorce: logit_regr(0.6713593 * household.nb_children
                          - 0.0785202 * dur_in_couple
                          + 0.1429621 * agediff - 0.0088308 * agediff ** 2 - 4.546278,
                          filter = ISFEMALE and ISMARRIED and (dur_in_couple > 0),
                          align = 'al_p_divorce.csv')
    # break link to partner
    - to_divorce: divorce or partner.divorce
    - partner_id: if(to_divorce, UNSET, partner_id)
    - civilstate: if(to_divorce, DIVORCED, civilstate)
    - dur_in_couple: if(to_divorce, 0, dur_in_couple)
    # move out males
    - hh_id: if(ISMALE and to_divorce,
                new('household',
                    region_id=household.region_id),
                hh_id)

1. Graduate people

ineducation:
    # unemployed if graduated
    - workstate: if(ISSTUDENT and
                    (((age >= 16) and IS_LOWER_SECONDARY_EDU) or
                     ((age >= 19) and IS_UPPER_SECONDARY_EDU) or
                     ((age >= 24) and IS_TERTIARY_EDU)),
                    UNEMPLOYED,
                    workstate)
    - show('num students', grpsum(ISSTUDENT))

2. Retire people

globals:
    periodic:
        - WEMRA: float

# retire
- workstate: if(ISMALE,
                if((age >= 65), RETIRED, workstate),
                if((age >= WEMRA), RETIRED, workstate))

globals
  • variables that do not relate to any particular entity
  • periodic globals can have a different value for each period
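
A periodic global can be thought of as a table indexed by period, looked up with the period currently being simulated. The Python sketch below mirrors the retirement rule above under that reading. The WEMRA values and the numeric workstate codes are made up for illustration.

```python
# Made-up values: one WEMRA value per simulated period.
WEMRA = {2002: 62.0, 2003: 63.0, 2004: 64.0}

INWORK, RETIRED = 1, 4  # hypothetical workstate codes


def retire(age, is_male, workstate, period):
    """Men retire at 65; women at the WEMRA value of the current period."""
    threshold = 65 if is_male else WEMRA[period]
    return RETIRED if age >= threshold else workstate


# The same 63-year-old woman is retired in 2003 but not under the 2004 value.
print(retire(63, False, INWORK, 2003), retire(63, False, INWORK, 2004))
```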
3. Pick people … to work in 2002

inwork:
    - work_score: UNSET
    # men
    - work_score: if(ISMALE and (age > 15) and (age < 65) and ISINWORK,
                     logit_score(-0.196599 * age + 0.0086552 * age ** 2 - 0.000988 * age ** 3
                                 + 0.1892796 * ISMARRIED + 3.554612),
                     work_score)
    - work_score: if(ISMALE and (age > 15) and (age < 50) and ISUNEMPLOYED,
                     logit_score(0.9780908 * age - 0.0261765 * age ** 2 + 0.000199 * age ** 3
                                 - 12.39108),
                     work_score)
    # women
    # align on Number of Workers / Population by age class
    - work: if((age > 15) and (age < 65),
               if(ISMALE,
                  align(work_score, leave=ISSTUDENT or ISRETIRED, fname='al_p_inwork_m.csv'),
                  align(work_score, leave=ISSTUDENT or ISRETIRED, fname='al_p_inwork_f.csv')),
               False)
    - workstate: if(work, INWORK, workstate)
    - workstate: if(not work and lag(ISINWORK), -1, workstate)

4. Pick people … to be unemployed in 2002 + 5. Remain …

unemp_process:
    - unemp_score: -1
    - unemp_condition: (age > 15) and (age < 65) and not ISINWORK
    # Probability of being unemployed from being unemployed previously
    - unemp_score: if(unemp_condition and lag(ISUNEMPLOYED),
                      logit_score(- 0.1988979 * age + 0.0026222 * age ** 2 + ...),
                      unemp_score)
    # Probability of being unemployed from being inwork previously
    - unemp_score: if(unemp_condition and lag(ISINWORK),
                      logit_score(0.1396404 * age - 0.0024024 * age ** 2 + ...),
                      unemp_score)
    # Alignment of unemployment based on those not selected by inwork
    # [Number of new unemployed / (Population - Number of Workers)] by age
    # The condition below must correspond to the denominator above
    - unemp: if((age > 15) and (age < 65) and not ISINWORK,
                align(unemp_score, leave=ISSTUDENT or ISRETIRED,
                      fname='al_p_unemployed.csv'),
                False)
    - workstate: if(unemp, UNEMPLOYED, workstate)
    - workstate: if((workstate == -1) and not unemp, OTHERINACTIVE, workstate)

Import data: demo_import.yml

Import data (to run the file, press F5)

# this is an "import" file. To use it press F5 in liam2 environment, or run
# the following command in a console:
# INSTALL_PATH\liam2 import demo_import.yml
output: simple2001.h5

entities:
    person:
        path: input\person.csv
        fields:
            # period and id are implicit
            - age: int
            - gender: bool
            - ...

    household:
        path: input\household.csv
        # if fields are not specified, they are all imported

Optional

globals:
    periodic:
        path: input\globals_transposed.csv
        transposed: true

entities:
    person:
        path: input\person.csv
        fields:
            - age: int
            - gender: bool
        # if you want to keep your csv files intact but you use different names
        # in your simulation than in the csv files, you can specify name changes
        # here. The format is: "newname: oldname"
        oldnames:
            gender: male
        # if you want to invert the value of some boolean fields (True -> False
        # and False -> True), add them to the "invert" list below.
        invert: [gender]
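
To show what the oldnames and invert options amount to, the Python sketch below applies the same two transformations to a small CSV: the column male is renamed to gender, and its boolean values are then flipped. It only illustrates the idea and is not LIAM2's import code; the CSV content, the column types and the function name are made up.

```python
import csv
import io

# Made-up input: the CSV stores a "male" column, the model wants "gender".
CSV_TEXT = "id,age,male\n1,34,True\n2,29,False\n"


def import_persons(text, oldnames, invert):
    """Read persons, apply 'newname: oldname' renames, then invert booleans."""
    rename = {old: new for new, old in oldnames.items()}
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {rename.get(key, key): value for key, value in raw.items()}
        row["id"] = int(row["id"])
        row["age"] = int(row["age"])
        row["gender"] = row["gender"] == "True"
        for field in invert:
            row[field] = not row[field]  # True -> False and False -> True
        rows.append(row)
    return rows


rows = import_persons(CSV_TEXT, oldnames={"gender": "male"}, invert=["gender"])
print(rows)
```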