Sylvie Huet. Modelling from data: an experience in modelling rural demography. Laboratoire d’Ingénierie pour les Systèmes Complexes. From data to models, Cergy-Pontoise, 27-28 June 2013.

Context: demography in rural municipalities

Evolution of rural areas in Europe?

Coupling demography and residential mobility of people in order to study their evolution at a very local scale: the municipality level


Decision support in demography generally uses microsimulation modelling (O’Donoghue, 2001; Li and O’Donoghue, 2012).

Space and residential mobility

Coupling microsimulation and agent-based modelling

No integrated theory exists, so we extract and use data to build a globally coherent theory through dynamic modelling at both the individual level and the municipality level

A first application to the Cantal population (French region) (Huet et al. 2012a, 2012b).


An interesting motivation

A well identified overall modelling choice

A marvellous applied research question

But data! As a constraint, as theories, as results…

“No frills, no waffle, only results”

Summary: everything through the prism of data

Finding and inventorying data

Choosing data for dynamic modelling

Finding data

Can’t build a specific survey: the problem is too large

Can’t use a reweighted sample of individuals: not enough of them, and too difficult to access

Tenacity… and then

Finding and inventorying data

At first, we had nothing…

and finally we have too much!

Réseau chambre d’hôtes

Finances Communales DGF


Taxes de séjour

CORINE Land Cover

Recensements 1990, 1999, 2006, …

Histoires familiales

Histoire de vie 2003

Household Panel

Base permanente des Equipements

Tables de mobilité 1999

SIRENE entreprises

Enquête logements

Inventaire Communal 1998, 1998

Labour Force Survey

SITADEL (logements)

Enquêtes générations 1988, 1998, …

Distribution des salaires (INSEE)

Recensements agricoles 1988, 1998, 2005

Revenus des ménages

ISSP sens du travail

Enquête Emploi

Turning confusion into results




Criteria to choose


Finding and inventorying data

Choosing data for dynamic modelling

Criteria to choose among all the data?

Quantity of work

People and ideas

Building the various dynamics (and their couplings)

Calibrating and validating the model

1. Criteria: time and cognitive costs!

  • The ones we don’t really talk about, linked to the quantity of work

  • Cost in terms of investigation of the data sources

  • Ease of use with statistical tools, and representativeness

  • Possible reuse of generic objects and dynamics in other countries

Laborious, difficult, not recognised, …

not publishable, not a research problem, too long to explain…

What a costly approach!

For every possible source, this requires studying:

  • The list of questions

  • The list of variables (not necessarily the direct answers to questions)

  • The list of modalities for each variable

  • Representativeness at various scales, for various populations…

  • Understanding the hidden/upstream models and theories

Many people always use the same survey, just as we always use the same tools or the same methods

2. Criteria: working with people and ideas

  • In interdisciplinary work, the ones you don’t think of a priori:

  • Understandable for the people involved (and comparable with other models)

  • Working with research partners

  • A compromise to decide on

  • Or whom you are going not to understand

Criteria: working with researchers and ideas

Why not use wages?

  • The existing/chosen data were not collected under the hypotheses of their theories: misunderstanding, disagreement

  • Some, especially modellers, don’t usually use data

  • Some, especially modellers, have difficulty understanding what individual-based modelling means

3. Criteria: building the various dynamics

  • To build the various dynamics (and their couplings)

  • Possible interconnectivity of various sources

Example: jointly using the LFS and the Census, which both give the “same” activity sectors and socio-professional categories, makes it possible to define the employment offer at the municipality level (Census) and the way an individual chooses a job and changes it (LFS)
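As a rough illustration of this interconnection, the sketch below derives a job offer per municipality from census-like records and job-change probabilities from LFS-like records, both keyed by the same (sector, socio-professional category) pair. All records, field names and function names are hypothetical; the real sources have their own formats and nomenclatures.

```python
from collections import Counter

# Hypothetical census extract: one record per job located in a municipality.
census_jobs = [
    {"municipality": "A", "sector": "agriculture", "spc": "farmer"},
    {"municipality": "A", "sector": "services", "spc": "employee"},
    {"municipality": "B", "sector": "services", "spc": "employee"},
    {"municipality": "B", "sector": "services", "spc": "employee"},
]

# Hypothetical LFS extract: observed transitions between (sector, spc) states.
lfs_transitions = [
    (("agriculture", "farmer"), ("agriculture", "farmer")),
    (("agriculture", "farmer"), ("services", "employee")),
    (("services", "employee"), ("services", "employee")),
    (("services", "employee"), ("services", "employee")),
]


def job_offer(jobs):
    """Count jobs per municipality and (sector, spc) category (census side)."""
    offer = Counter()
    for job in jobs:
        offer[(job["municipality"], job["sector"], job["spc"])] += 1
    return offer


def transition_probabilities(transitions):
    """Estimate P(destination | origin) over (sector, spc) states (LFS side)."""
    counts = Counter(transitions)
    totals = Counter(origin for origin, _ in transitions)
    return {(o, d): n / totals[o] for (o, d), n in counts.items()}


offer = job_offer(census_jobs)
probs = transition_probabilities(lfs_transitions)
```

The shared (sector, spc) key is what lets an individual who changes state according to `probs` be matched against the jobs available in `offer`.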

Criteria: building the various dynamics

  • Problem of statistical representation (example of low-density areas representing a small part of the population: 39% rural or peri-urban areas)

(Example in Cantal: number of farmers in Cantal; no problem accessing housing, but problems accessing services)

European Household Panel or National Census?

Census: data sources at a low (local) level are rare, as are theories and/or knowledge

Criteria: building the various dynamics starting from wrong data

With the wrong data, in the sense of irrelevant or inconvenient, but chosen for their capacity to “reveal” a relevant dynamic

The number of in- and out-migrants has this property, since it links every process related to mobility, starting with the decision to move

Choosing a decision to move: “checking model”

Family reasons are the most cited reasons for the decision to move (impact on the needed size)





With a decision based only on the size of the current dwelling, old people move too much

Assessing the chosen decision to move

LITERATURE (statistical analysis from data)

(Debrand and Taffin 2006) notice that

moving decreases with age

But also

moving to a larger dwelling is much more common than moving to a smaller one

And finally we can also reproduce the critical values and, more simply, decide to move with a lower probability when the need is to decrease the dwelling size
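A minimal sketch of such a decision rule is given below, assuming a household compares a needed size with its current dwelling size and moves with a lower probability when it would have to downsize. The need rule and the two probabilities are illustrative assumptions, not the calibrated values of the model.

```python
import random


def needed_rooms(household_size):
    """Hypothetical need rule: one room per person plus one shared room."""
    return household_size + 1


def move_probability(household_size, current_rooms, p_upsize=0.5, p_downsize=0.1):
    """Probability of deciding to move, given the gap between need and dwelling.

    p_upsize and p_downsize are illustrative values, not calibrated ones.
    """
    need = needed_rooms(household_size)
    if need > current_rooms:
        return p_upsize    # dwelling too small
    if need < current_rooms:
        return p_downsize  # dwelling too large: move less often
    return 0.0             # dwelling fits: no size-driven move


def decides_to_move(household_size, current_rooms, rng):
    """Draw the decision to move for one household."""
    return rng.random() < move_probability(household_size, current_rooms)


rng = random.Random(42)
moved = decides_to_move(1, 5, rng)  # a single person in a five-room dwelling
```

With such an asymmetry, households that shrink with age (the typical case for old people) are no longer pushed to move as often as growing households.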

Choosing dynamics to ensure consistency (in case again of wrong data)

Counterintuitive choices to ensure consistency between endogenous submodels, parameterised by calibration, and exogenous submodels, parameterised from data.

Example: in the residential mobility model, people may migrate out of the region if and only if they have first found a new place of residence inside the region!

=> only because we know only the probability of leaving the region versus moving inside it (i.e. the problem of the unknown decision to move)
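This ordering can be sketched as follows, under the assumption that the only data-driven parameter is the probability of leaving the region given that a move takes place. The function and parameter names are hypothetical.

```python
import random


def relocation_outcome(found_residence_inside, p_leave_region, rng):
    """Return 'stay', 'move_inside' or 'leave_region' for one household.

    Only P(leave | move) is assumed known from data, so leaving is drawn
    only after the household has found a new residence inside the region.
    """
    if not found_residence_inside:
        return "stay"
    if rng.random() < p_leave_region:
        return "leave_region"
    return "move_inside"
```

The endogenous part (searching for and finding a residence) decides whether a move happens at all; the exogenous probability then only splits the movers between leaving and staying in the region.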

4. Criteria: calibration and validation

  • To calibrate (i.e. find the parameters of the dynamics chosen through the checking-model procedure) and validate the model:

  • Temporal continuity of the definitions and availability, including the initial state (e.g. 1990, 1999, 2006, dwelling size…)

  • Relevance of the spatial scale at which the data are available

  • Critical indicators about the temporal evolution, especially related to “initially” unknown dynamics

Example for Cantal…

The Cantal: data for calibration







decreasing municipalities: red

increasing municipalities: blue

An almost impossible calibration despite the data and because of the data

Aim to respect the trend (not only the absolute difference from the measures at various dates). What use is a small overall distance if the trend is not the same?

A combination of all the trends is almost impossible to obtain…

=> Requires a quasi-continuous loop of rebuilding the model

Small distance but bad trend
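The difference between the two criteria can be illustrated with a small sketch that scores a simulated series both by its absolute distance to the observations and by the number of periods where the direction of change disagrees. The series are invented for illustration and the two measures are only one possible formalisation.

```python
def absolute_distance(simulated, observed):
    """Sum of absolute differences at each measurement date."""
    return sum(abs(s - o) for s, o in zip(simulated, observed))


def trend_signs(series):
    """Sign (+1, 0, -1) of the change between consecutive dates."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def trend_mismatches(simulated, observed):
    """Number of inter-date periods where the direction of change differs."""
    return sum(s != o for s, o in zip(trend_signs(simulated), trend_signs(observed)))


# Hypothetical population counts at three census dates.
observed = [1000, 990, 980]       # slowly decreasing municipality
candidate_a = [985, 990, 995]     # close to the data, but increasing
candidate_b = [1020, 1010, 1000]  # further from the data, but decreasing

dist_a = absolute_distance(candidate_a, observed)
dist_b = absolute_distance(candidate_b, observed)
bad_a = trend_mismatches(candidate_a, observed)
bad_b = trend_mismatches(candidate_b, observed)
```

Here `candidate_a` wins on distance alone while getting every trend wrong, which is the situation a distance-only calibration cannot detect.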

A never ending validation

Too much data, in a way… how to choose how to restrict the validation process? I don’t know at this stage.

As with the calibration problem, you can never be satisfied, since you have a lot of data: almost all the data you did not retain for building the initialisation or the dynamics

Synthesis, at this point of my study, of what data bring to the dynamic modelling of large systems at a low level

In the end, very difficult to use as a predictive tool, even though microsimulations (built from data) are usually built for this purpose and considered reliable, since they propose a consistent theory extracted from data

Much more useful (probably even more so than classical theoretical approaches or discrete choice models) for learning about composing dynamics, since they consider many coupled dynamics (instead of hypothesising that they are negligible): the checking-dynamics procedure

Data challenges the interdisciplinary work (instead of simplifying)!

What a richness and a nightmare!

