
LIAM2*

LIAM2*. Gijs Dekkers, Federal Planning Bureau and Katholieke Universiteit Leuven. Slides made by Gaëtan de Menten. A sneak preview. *Life-cycle Income Analysis Model.


Presentation Transcript


  1. LIAM2* Gijs Dekkers, Federal Planning Bureau and Katholieke Universiteit Leuven. Slides made by Gaëtan de Menten. A sneak preview... *Life-cycle Income Analysis Model. Paper presented at the « Treasury Brown Bag Lunch Meeting », Ministero dell'Economia e delle Finanze, Rome, February 14th, 2011

  2. LIAM 2: the foundations • LIAM by Cathal O’Donoghue • Used in the AIM project, developing MIDAS for Belgium, Italy and Germany. • Updating, extending and considerable problem solving by Geert Bryon (FPB) • PROGRESS-MiDaL project (Grant VS/2009/0569) • FPB (Be): development, application and testing • Gaëtan de Menten: development • Geert Bryon: application and testing • Gijs Dekkers: management, data and a bit of application • CEPS/INSTEAD (Lux): testing • IGSS (Lux): investment, testing • Cathal O’Donoghue (Teagasc, Ire), Howard Redway (Ministry of Work and Pensions, UK): comments and conceptual assistance

  3. Overview of this sneak preview • Current features • Current performance • Demonstration • TODO(?)

  4. Current features • Input • Simulation: a text file • Alignment: CSV files • Initial field data: an hdf5 file • Output • hdf5 file • Converters • Old data format (Tab-separated text files) <-> hdf5

  5. The setup of a model
     constants
       per_period:
         • PARAMETER-NAME: float
     entities
       entity-name1 (e.g. Household):
         fields
         links
         processes
       entity-name2 (e.g. Person):
         fields
         links
         processes
     simulation:
       init:
       processes:
         • entity: [list of processes, separated by commas]
       input:
         path: "path name"
         file: "file name.h5"
       output:
         path: "path name"
         file: "file name.h5"
       start_period:
       periods:
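
The layout on this slide can be pictured as a nested mapping. The sketch below writes it as a plain Python dict purely for illustration: the real model is a text file, and every concrete name, path and number here (`RETIREMENT_AGE`, `ageing`, `base.h5`, 2002, 20) is a hypothetical placeholder. The helper `undefined_processes` is likewise an assumption, showing one sanity check such a structure makes easy.

```python
# Hypothetical skeleton mirroring the slide's layout; all names and values
# are made-up placeholders, not taken from an actual LIAM2 model.
model = {
    "constants": {"per_period": [{"RETIREMENT_AGE": "float"}]},
    "entities": {
        "household": {"fields": [], "links": {}, "processes": {}},
        "person": {
            "fields": [{"age": "int"}, {"male": "bool"}],
            "links": {},
            "processes": {"ageing": "age + 1"},
        },
    },
    "simulation": {
        "init": [],
        "processes": [{"person": ["ageing"]}],
        "input": {"path": "input", "file": "base.h5"},
        "output": {"path": "output", "file": "simulation.h5"},
        "start_period": 2002,
        "periods": 20,
    },
}


def undefined_processes(model):
    """Return (entity, process) pairs that are scheduled but never defined."""
    missing = []
    for step in model["simulation"]["processes"]:
        for entity, names in step.items():
            defined = model["entities"].get(entity, {}).get("processes", {})
            missing += [(entity, name) for name in names if name not in defined]
    return missing


print(undefined_processes(model))
```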

  6. Current features • Language • Python • High level, concise, readable, easy interface with C • Lots of 3rd party libraries (especially scientific tools) • But uses some efficient (open source) libraries written mostly in C • Numpy • Numexpr • PyTables

  7. Current features • Can declare “fields” with a type • float, int, bool • Evaluate simple expressions • Arithmetic operators: +, -, *, /, **, % • 0.51 * age + 0.023 * age ** 2 - 0.0012 * age ** 3 • Comparison operators: <, <=, ==, !=, >=, > • age < 20 • Boolean operators: and, or, not • not male and (age >= 15) and (age <= 50) • Conditional expressions: if(condition, iftrue, iffalse) • if(age < 65, earnings, pension)
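
To make the expression semantics concrete, here is a minimal plain-Python sketch of how such expressions could be evaluated once per individual over whole columns. It is an illustration only: the sample data and the helper name `if_` are assumptions, and it does not use the Numpy/Numexpr machinery the slides mention.

```python
# Each list is one field, with one value per individual (made-up sample data).
age = [12, 30, 70]
male = [True, False, False]
earnings = [0.0, 2500.0, 0.0]
pension = [0.0, 0.0, 1200.0]


def if_(condition, iftrue, iffalse):
    """Element-wise if(condition, iftrue, iffalse) over equal-length columns."""
    return [t if c else f for c, t, f in zip(condition, iftrue, iffalse)]


# not male and (age >= 15) and (age <= 50)
fertile = [(not m) and a >= 15 and a <= 50 for m, a in zip(male, age)]

# if(age < 65, earnings, pension)
income = if_([a < 65 for a in age], earnings, pension)

print(fertile)  # [False, True, False]
print(income)   # [0.0, 2500.0, 1200.0]
```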

  8. Current features • Store fields • for each period (if the field is declared) • age: "age + 1" • as temporaries (the value is lost after each period) • ischild: "age < 18" • Macros (re-evaluated wherever they appear) • ISCHILD: "age < 18" • difference with temporaries: • ischild: "age < 18" • before1: "if(ischild, 1, 2)" • before2: "if(ISCHILD, 1, 2)" # before1 == before2 • age: "age + 1" • after1: "if(ischild, 1, 2)" • after2: "if(ISCHILD, 1, 2)" # after1 != after2 !! # after1 == before1
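
The temporary-versus-macro difference can be reproduced in a few lines of ordinary Python. This is a sketch of the idea, not of how LIAM2 implements it: a temporary is modelled as a stored value, a macro as a function that is called again at each use, and the single person aged 17 is made-up data.

```python
person = {"age": 17}

# Temporary: evaluated once, then kept as a plain value.
ischild = person["age"] < 18


# Macro: kept as an expression and re-evaluated wherever it is used.
def ISCHILD():
    return person["age"] < 18


before1 = 1 if ischild else 2
before2 = 1 if ISCHILD() else 2

person["age"] += 1  # age: "age + 1"

after1 = 1 if ischild else 2    # still based on the value stored earlier
after2 = 1 if ISCHILD() else 2  # sees the updated age

print(before1, before2, after1, after2)  # 1 1 1 2
```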

  9. Current features • Functions • Per individual • abs, log, exp • clip • 0.25 * clip(age ** 3, 0, 100000) • round • round(age / 10.0, 2) • min/max • min(age, 99) • max(pension, benefit)
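
As a rough illustration of these per-individual functions, the sketch below applies them element by element to small made-up columns. The list-based `clip` helper is an assumption written for this example; it only mimics the behaviour the slide describes.

```python
def clip(values, lower, upper):
    """Limit every value to the interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


age = [5, 40, 120]
pension = [0.0, 900.0, 1100.0]
benefit = [300.0, 0.0, 1500.0]

# 0.25 * clip(age ** 3, 0, 100000)
scaled = [0.25 * v for v in clip([a ** 3 for a in age], 0, 100000)]

# round(age / 10.0, 2)
decades = [round(a / 10.0, 2) for a in age]

# min(age, 99) and max(pension, benefit)
capped_age = [min(a, 99) for a in age]
best_income = [max(p, b) for p, b in zip(pension, benefit)]

print(scaled)       # [31.25, 16000.0, 25000.0]
print(decades)      # [0.5, 4.0, 12.0]
print(capped_age)   # [5, 40, 99]
print(best_income)  # [300.0, 900.0, 1500.0]
```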

  10. Current features • Functions • Aggregate functions • grpcount, grpsum, grpavg, grpstd, grpmax, grpmin • abs(age - grpavg(age)) • Normal: random numbers with a normal distribution • normal(loc=0.0, scale=grpstd(errsal)) • Some functions accept a filter argument • abs(age - grpavg(age, filter=male), filter=not male)
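
An aggregate function reduces a whole column to one number, which can then be combined with per-individual values again. The following is a minimal sketch under that reading; the `grpavg` defined here is a stand-in written for this example (it assumes at least one individual passes the filter), and the data are invented.

```python
def grpavg(values, filter=None):
    """Average of values, restricted to rows where filter is true."""
    if filter is None:
        filter = [True] * len(values)
    selected = [v for v, keep in zip(values, filter) if keep]
    return sum(selected) / len(selected)


age = [20, 30, 40, 50]
male = [True, False, True, False]

avg_all = grpavg(age)                # one number for the whole population
avg_male = grpavg(age, filter=male)  # same, but over men only

# abs(age - grpavg(age)): the aggregate is broadcast back to each individual
deviation = [abs(a - avg_all) for a in age]

print(avg_all, avg_male)  # 35.0 30.0
print(deviation)          # [15.0, 5.0, 5.0, 15.0]
```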

  11. Current features • lag/value_for_period • Only simple expressions and explicitly saved aggregates for now • value_for_period(inwork and not male, 2002) • lag(sum_twr) • matching: match two sets of individuals (aka marriage market) • matches individuals from set 1 with individuals from set 2 • follows a particular order (given by an expression) • for each individual in set 1, computes the score of all (unmatched) individuals in set 2 and takes the best-scoring one • matching(set1filter=to_marry and not male, set2filter=to_marry and male, orderby=difficult_match)
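
The matching procedure described on this slide can be sketched as a greedy loop. Several details below are assumptions, since the slide does not give them: set 1 is processed in descending `orderby` order, and the score is an invented one (negative age difference). The sample individuals are made up.

```python
def matching(set1, set2, orderby, score):
    """Greedy matching: each set-1 individual takes the best unmatched set-2 one."""
    unmatched = list(set2)
    pairs = []
    # Assumption: individuals with the highest orderby value are handled first.
    for person in sorted(set1, key=orderby, reverse=True):
        if not unmatched:
            break
        best = max(unmatched, key=lambda candidate: score(person, candidate))
        unmatched.remove(best)
        pairs.append((person["id"], best["id"]))
    return pairs


women = [
    {"id": 1, "age": 30, "difficult_match": 0.2},
    {"id": 2, "age": 45, "difficult_match": 0.9},
]
men = [{"id": 10, "age": 32}, {"id": 11, "age": 44}, {"id": 12, "age": 60}]


def age_score(a, b):
    """Hypothetical score: the closer in age, the better."""
    return -abs(a["age"] - b["age"])


pairs = matching(women, men, orderby=lambda p: p["difficult_match"], score=age_score)
print(pairs)  # [(2, 11), (1, 10)]
```

Because every individual in set 1 scans all remaining candidates in set 2, the work in this sketch grows roughly with the product of the two set sizes.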

  12. Current features • Many-to-one links • partner.age • grpavg(partner.age - age) • partner.father.age • partner.get(earnings + benefits)
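
A many-to-one link can be thought of as a field holding the id of another individual, which is then used to look up values on that individual. The sketch below illustrates this reading only; the `partner_id` field, the `via_link` helper and the data are all assumptions made for the example.

```python
persons = {
    1: {"age": 40, "partner_id": 2, "earnings": 2000.0, "benefits": 0.0},
    2: {"age": 38, "partner_id": 1, "earnings": 1500.0, "benefits": 200.0},
    3: {"age": 25, "partner_id": None, "earnings": 1000.0, "benefits": 0.0},
}


def via_link(person, link_field, expr, missing=None):
    """Evaluate expr on the individual that link_field points to."""
    target_id = person[link_field]
    if target_id is None:
        return missing
    return expr(persons[target_id])


# partner.age
partner_age = {
    pid: via_link(p, "partner_id", lambda target: target["age"])
    for pid, p in persons.items()
}

# partner.get(earnings + benefits), evaluated for person 1
partner_income = via_link(
    persons[1], "partner_id", lambda target: target["earnings"] + target["benefits"]
)

print(partner_age)     # {1: 38, 2: 40, 3: None}
print(partner_income)  # 1700.0
```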

  13. Current features • One-to-many links • countlink(link, filter) • countlink(persons) • countlink(children, age < 18) • sumlink(link, expr, filter) • sumlink(persons, earnings, age >= 18) • avglink(link, expr, filter) • avglink(children, age, not male) • minlink/maxlink(link, expr, filter) • minlink(children, age, not male)
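
One-to-many links go the other way: from one household to all of its members. A minimal sketch of `countlink` and `sumlink` under that interpretation follows; the `hh` field, the lambda-based filters and the data are assumptions for illustration, not the actual implementation.

```python
persons = [
    {"id": 1, "hh": 100, "age": 40, "earnings": 2000.0},
    {"id": 2, "hh": 100, "age": 12, "earnings": 0.0},
    {"id": 3, "hh": 100, "age": 19, "earnings": 500.0},
    {"id": 4, "hh": 200, "age": 70, "earnings": 0.0},
]


def members(hh):
    """All persons linked to household hh."""
    return [p for p in persons if p["hh"] == hh]


def countlink(hh, filter=lambda p: True):
    return sum(1 for p in members(hh) if filter(p))


def sumlink(hh, expr, filter=lambda p: True):
    return sum(expr(p) for p in members(hh) if filter(p))


n_persons = countlink(100)                            # countlink(persons)
n_children = countlink(100, lambda p: p["age"] < 18)  # countlink(children, age < 18)
adult_earnings = sumlink(100, lambda p: p["earnings"], lambda p: p["age"] >= 18)

print(n_persons, n_children, adult_earnings)  # 3 1 2500.0
```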

  14. Current features • Regressions • Logit • logit_regr(expr, filter, align) • Continuous (expr + normal(0, 1) * mult + error) • cont_regr(expr, filter, align, mult, error_var) • Clipped continuous (always positive) • clip_regr(expr, filter, align, mult, error_var) • Log continuous (exponential of continuous) • log_regr(expr, filter, align, mult, error_var) • Alignment • Fixed percentage or two-dimensional table in a CSV file
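
The sketch below illustrates two of these ideas in plain Python: the continuous formula quoted on the slide, and one common way to apply a fixed-percentage alignment to logit scores (rank the eligible individuals and select the target share). The ranking approach, the coefficients and the helper names `align_fixed` and `logistic` are assumptions for this example; the slides do not say how alignment is carried out internally.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def align_fixed(scores, eligible_mask, fraction):
    """Select the given fraction of eligible individuals, highest scores first."""
    eligible = [i for i, keep in enumerate(eligible_mask) if keep]
    n_select = round(fraction * len(eligible))
    ranked = sorted(eligible, key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:n_select])
    return [i in chosen for i in range(len(scores))]


def cont_regr(expr, mult=0.0, error=0.0, rng=None):
    """expr + normal(0, 1) * mult + error, one draw per individual."""
    rng = rng or random.Random(0)
    return [e + rng.gauss(0, 1) * mult + error for e in expr]


age = [25, 35, 45, 55, 65]
of_working_age = [True, True, True, True, False]

# Hypothetical logit with invented coefficients.
scores = [logistic(2.0 - 0.05 * a) for a in age]
inwork = align_fixed(scores, of_working_age, fraction=0.5)

print(inwork)  # [True, True, False, False, False]
```

With `mult=0.0` and `error=0.0`, `cont_regr` returns the deterministic part unchanged, which is a convenient way to check a model before adding noise.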

  15. Current features • Lifecycle functions • new: create new individuals • new('person', filter=to_give_birth) • remove: remove individuals from the dataset • remove(dead) • remove(nb_persons == 0) • Miscellaneous functions • show: print anything to the console • show(grpcount(age >= 18)) • show(grpcount(not dead), grpavg(age, filter=not dead))
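
The two lifecycle functions can be illustrated with a list of records. This is a sketch of the behaviour described on the slide only: the id allocation, the `mother_id` field and the initial values given to newborns are assumptions made for the example.

```python
persons = [
    {"id": 1, "age": 30, "to_give_birth": True, "dead": False},
    {"id": 2, "age": 85, "to_give_birth": False, "dead": True},
]


def new(persons, filter):
    """Create one new individual for every existing one matching the filter."""
    next_id = max(p["id"] for p in persons) + 1
    babies = []
    for p in persons:
        if filter(p):
            babies.append({"id": next_id, "age": 0, "to_give_birth": False,
                           "dead": False, "mother_id": p["id"]})
            next_id += 1
    return persons + babies


def remove(persons, filter):
    """Drop every individual matching the filter."""
    return [p for p in persons if not filter(p)]


persons = new(persons, lambda p: p["to_give_birth"])  # new('person', filter=to_give_birth)
persons = remove(persons, lambda p: p["dead"])        # remove(dead)

print([p["id"] for p in persons])  # [1, 3]
```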

  16. Current features • Miscellaneous functions • dump: produce a table with the expressions given as argument • show(dump(age, age / 10, filter=id < 20)) • groupby (aka “pivot table”): group individuals by their value for the given expressions, and optionally compute an expression for each group • show(groupby((age / 10, gender))) • show(groupby((agegroup, gender, inwork), grpcount())) • show(groupby(agegroup, grpavg(income))) • show(groupby((inwork, gender), id, filter=age < 10)) • csv: write a table to a csv file • csv(dump(age, age / 10, gender), suffix='age') • Show: interactive assessment of results: command line
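
The "pivot table" idea behind groupby can be sketched with a dictionary of groups. The function below is a stand-in written for this example, not the real one: it takes Python callables instead of expressions, uses integer division (`//`) to form ten-year age groups, and runs on invented data.

```python
from collections import defaultdict


def groupby(rows, key, aggregate=len):
    """Group rows by key(row) and reduce each group with aggregate (default: count)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return {k: aggregate(members) for k, members in sorted(groups.items())}


persons = [
    {"age": 23, "gender": "F", "income": 1800.0},
    {"age": 27, "gender": "M", "income": 2200.0},
    {"age": 31, "gender": "F", "income": 2600.0},
    {"age": 38, "gender": "F", "income": 3000.0},
]

# groupby((age / 10, gender)): head count per age group and gender
counts = groupby(persons, key=lambda p: (p["age"] // 10, p["gender"]))

# groupby(agegroup, grpavg(income)): average income per age group
avg_income = groupby(
    persons,
    key=lambda p: p["age"] // 10,
    aggregate=lambda members: sum(p["income"] for p in members) / len(members),
)

print(counts)      # {(2, 'F'): 1, (2, 'M'): 1, (3, 'F'): 2}
print(avg_income)  # {2: 2000.0, 3: 2800.0}
```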

  17. Current Performance • For a simple model: • birth (using alignment data from MIDAS) • chronic illness (using a fixed percentage alignment) • marriage market • earnings (using macro alignment) • Or at least what I think macro alignment is... • death (using alignment data from MIDAS)

  18. Current Performance • 10,000 persons, 20 periods • 2.65 s (on a Dell Latitude laptop computer) • 100,000 persons, 20 periods • 29 s • 1,000,000 persons, 20 periods • 16 min 31 s, of which approx. 83% is spent in the marriage market • ~180 MB RAM • 897 MB output file • could be compressed if needed • For a complete model with 100,000 persons • probably under 10 min
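
A little arithmetic on the reported timings shows how the run time scales. The figures are the ones on the slide; the interpretation in the comments is only a reading of those numbers, not a measurement.

```python
# Reported run times in seconds, keyed by population size (20 periods each).
timings = {10_000: 2.65, 100_000: 29.0, 1_000_000: 16 * 60 + 31}

total_large = timings[1_000_000]          # 991 s
marriage_market = 0.83 * total_large      # approx. share spent matching
everything_else = total_large - marriage_market

# How much slower each tenfold increase in population is.
ratio_small = timings[100_000] / timings[10_000]
ratio_large = timings[1_000_000] / timings[100_000]

print(f"marriage market: {marriage_market:.0f} s, rest: {everything_else:.0f} s")
print(f"10k -> 100k: x{ratio_small:.1f}, 100k -> 1M: x{ratio_large:.1f}")
```

The first tenfold increase costs about 11 times more time, the second about 34 times more. That faster-than-linear growth is consistent with the marriage market dominating the largest run.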

  19. Demonstration

  20. TODO • Automated tests (aka “unit tests”) • Documentation • User manual • Code • Speed optimizations • Clean up the code

  21. Questions or comments?
