Workshop on Complex Systems Research Initiative: An Introduction to Agent-Based Modelling

Edmund Chattoe-Brown ([email protected])

Department of Sociology, University of Leicester, UK


This research is funded by the Economic and Social Research Council of the UK as part of the National Centre for Research Methods.

Thanks are due to Nigel Gilbert (SIMIAN Co Director) for the use of some training materials initially developed primarily by him.

Thanks to you all for inviting me!

The usual disclaimer applies.


Plan of the workshop

  • Mornings: Introductory lecture/discussion.

  • The rest: Discussion, questions.

  • Afternoon: Hands on, initially exploring existing models then (?) programming.

  • Generally: Your proposed research.

Plan for day 1

The role of research methods in shaping what we see. Examples of qualitative and quantitative research and the need for a “third way”.

A very brief interlude on social versus physical science.

A simple “running” example: The Schelling segregation model. (Microcosm.)

What should we learn from this example?

Key concepts: Emergence, non-linearity, complexity.

The distinctive methodology of ABSS/MAM and data.


Opening Thoughts

“I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” (The Psychology of Science, Abraham Maslow)

“When scientists and mathematicians fail to find positive clues leading towards solutions of their problems, they sometimes reverse their frontal strategies and employ reductio ad absurdum, which by a process of eliminating all the impossibles and improbables, leaves a residue of least absurd, ergo most plausible solutions, which may be reduced, by physically testing to unequivocable answers.” (Buckminster Fuller, foreword to Confessions of a Trivialist, p. ix)


Overall goals

To introduce a novel method (MAM/ABSS) for understanding the social world using relevant examples.

To distinguish it clearly from some existing methods and thus lay out a coherent research strategy arising from it.

To introduce (and provide hands on experience for) a “typical” piece of software (NetLogo) for implementing that method.

To offer a “vision” of the future in research of this kind.


What are we used to?

I may well be talking to quite a diverse audience. I shall try not to assume too much.

I’ll start with sociology and we can take it from there. I can’t always promise obvious relevance of examples but this isn’t just laziness!

The two main methods of representing theory in sociology are narratives and equations.

These are almost invariably associated with qualitative (ethnographic) and quantitative (statistical) analysis respectively.

Other methods: Experiments/randomised control trials, history, analysis of artifacts/documents, monitoring … (Interesting!)


Example of narrative analysis

“Turkish interviewees do not include themselves when they are evaluating the status of ‘Turkish women’ in general. While referring to ‘Turkish women’, most Turkish interviewees use the pronoun ‘they’:

Turkish women are more home-oriented. I think that they are left in the backstage because they do not have education, because they are not given equal opportunities with men. (T3)

One of the Turkish interviewees stated that it was difficult for her to answer the questions related to her status ‘as a woman’, because:

I don’t think of myself as a Turkish women, but as a Turkish person. I mean I never think about what kind of role I have in the society as a woman. (T1)

Most Norwegian interviewees, on the other hand, identify with ‘Norwegian women’ in general, and they refer to ‘Norwegian women’ as ‘we’:

I think that in a way Norwegian women, that is we, at least have our rights on paper. We have equal rights for education and we have good welfare arrangements … (N1)” (Sümer, Acta Sociologica, 1998, 41(1), p. 122)


Narrative analysis pros and cons

As rich as you want it to be.

Crosses levels of analysis (self reports on decision making).

Limited at some unknown and fuzzy barrier with psychology.

Real dangers of subjectivity (should be “regulated” by the method though).

The price of that richness is that incompleteness, ambiguity and inconsistency can exist within the narrative and be hard to spot.

TANSTAAFL (“there ain’t no such thing as a free lunch”): Rich but “expensive” to collect and analyse, especially with observational data.

Can it generalise?


Example of quantitative analysis

“The most important empirical findings of this study can be summarized as follows:

… there is a moderate tendency for individuals with higher service class origins to be more likely than others to enrol in PhD programmes.

The estimated effect of class drops to zero when controlling for parents’ education and employment in research or higher education.

The overall implication of these findings is that the transition from graduate to doctoral studies is influenced by social origins to a considerable degree. Thus, the notion that such effects disappear at transitions at higher educational levels - due either to changes over the life course or to differential social selection - is not supported.” (Mastekaasa, Acta Sociologica, 2006, 49(4), pp. 448-449.)


Quantitative analysis pros and cons

Cannot be too rich, or the model can no longer be “solved” or “fitted”.

Mostly completely explicit (though some methodological background may be tacit, i.e. assumptions about distributions of data), thus avoiding ambiguity, incompleteness and inconsistency.

Can it particularise?

Hits the data collection and analysis problem of “atomisation”: “50 cases per variable” rule of thumb in simple regression.


Aside: No theory, no data, no logo

Example: Educational success.

Girls and boys go through a school system, get grades/qualifications and reach different levels.

They may start biologically different, be socialised differently, form different peer groups, be selected differently into schools or subjects, be treated differently by teachers, develop different interests and motivations, be offered different resources, choose differently and so on.

All these processes unfold in parallel, in diverse combinations for diverse individuals.


What does that mean for methods?

If individuals are unique, we can all give up (but there are reasons not to be so pessimistic).

We often disagree (fruitlessly?) on where social regularities lie: Attributes versus practices.

Clearly gender is associated with educational success through all these processes but the notion of causality is much harder to apply. Why would there be “big” patterns to find?

Ethnography can subject tiny parts of sequences to detailed examination (and practices should generalise) but cannot look at the whole.


Stepping back: Levels of description

A micro level, where individual action occurs in an “environment”.

A macro level (environment), which shapes and is shaped by the micro level.

The eminent sociologist James S. Coleman argues that in order to explain properly, a theory must link one level by a process description to another. (Mechanism/middle range sociology.)

There are grounds for arguing that, although they may appear (or claim) to, neither statistical nor ethnographic accounts actually do this.


Aside: Physical and social systems

Physical systems cannot give accounts of themselves nor respond adaptively to their “environment”.

They “follow” the same “laws of nature” that we try to deduce from them. (Atoms in gas.)

Regularities in social systems cannot be of this kind because of reflection and adaptation.

The unique (but fuzzy edged) domain of social action arises from the almost unique ability of humans to make rich models of their world (including social science models). Marx?


Cashing this out: Segregation model

Agents live on a square grid so each has a maximum of 8 neighbours.

There are two “types” of agents (red and green) and some grid spaces are vacant. Initially agents and vacancies are distributed randomly.

All agents decide what to do in the same very simple way.

Each agent has a preferred proportion (PP) of neighbours of its own kind. (A PP of 0.5 means you want at least half your neighbours to be your own kind, but you would accept all of them, i.e. PP is a minimum.)

If an agent is in a position that satisfies its PP then it does nothing; otherwise it moves to a vacancy chosen at random.

A time period is defined (arbitrarily) as the time it takes for each agent (chosen in random order to avoid non robust patterns) to “take a turn at” deciding and possibly moving.



I’m going to show you exactly how the computer does this before too long.

In a nutshell, the description amounts to:

Create the world.

Do some things to each agent and repeat.
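The “create the world, then do some things to each agent and repeat” recipe can be sketched in a few lines. This is a minimal illustration in Python rather than NetLogo, so that it runs standalone. It is not the NetLogo library model. The names (make_world, like_share, step, run), the non-wrapping grid edges and the rule that an agent with no neighbours counts as satisfied are all assumptions made for the sketch.

```python
import random

EMPTY, RED, GREEN = 0, 1, 2

def make_world(size, vacancy_rate, rng):
    """Create the world: agents and vacancies placed at random on a size x size grid."""
    def random_cell():
        return EMPTY if rng.random() < vacancy_rate else rng.choice([RED, GREEN])
    return [[random_cell() for _ in range(size)] for _ in range(size)]

def like_share(world, r, c):
    """Proportion of the (at most 8) occupied neighbours that share this agent's type."""
    size = len(world)
    same = occupied = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue  # skip the agent itself and anything off the grid (assumption: no wrapping)
            if world[rr][cc] != EMPTY:
                occupied += 1
                same += world[rr][cc] == world[r][c]
    # Assumption: an agent with no neighbours at all is treated as satisfied.
    return 1.0 if occupied == 0 else same / occupied

def step(world, pp, rng):
    """One time period: each agent, in random order, decides and possibly moves."""
    size = len(world)
    agents = [(r, c) for r in range(size) for c in range(size) if world[r][c] != EMPTY]
    rng.shuffle(agents)
    moves = 0
    for r, c in agents:
        if like_share(world, r, c) >= pp:
            continue  # PP is satisfied: do nothing
        vacancies = [(vr, vc) for vr in range(size) for vc in range(size)
                     if world[vr][vc] == EMPTY]
        if not vacancies:
            continue
        vr, vc = rng.choice(vacancies)  # move to a vacancy chosen at random
        world[vr][vc], world[r][c] = world[r][c], EMPTY
        moves += 1
    return moves

def run(size=20, vacancy_rate=0.2, pp=0.3, periods=50, seed=1):
    """Create the world, then repeat time periods until nobody moves or time runs out."""
    rng = random.Random(seed)
    world = make_world(size, vacancy_rate, rng)
    for _ in range(periods):
        if step(world, pp, rng) == 0:
            break
    return world
```

Calling run(pp=0.3) and run(pp=1.0) and printing the resulting grids is one way to explore the two questions that follow. The default sizes and rates here are arbitrary.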


Initial random state



Aside: This is a NetLogo “world window”.


Two questions

What is the smallest PP (i.e. a number between 0 and 1) that will produce clusters?

What happens when the PP is 1?



About 0.3.

No clusters form.

Revisit 1: Had you “seen” the cluster data generated by PP=0.3, might you (if of a particular political or sociological persuasion) have attributed xenophobia to the system?

Reflection: Is PP=0.1 behaviourally indistinguishable in cross section from PP=1? Problem?


Why and so what?

Because PP is a minimum, people are always happy “inside” a cluster of their own kind.

If a cluster is “full” (no internal vacancies) then it cannot be disrupted.

Whether clusters form depends on whether their shape is compatible with the PP for each “edge agent”. (No “sharp corners” possible: Minimum size?)

When PP is 1, no shape of the cluster edge is compatible with the satisfaction of edge agents so the cluster cannot form.

An aggregate entity (the cluster) thus becomes a structuring principle for individuals.


Simple individuals/complex system

Counter-intuitive macro (social) results from simple micro interactions. A non-linear (and complex) system.


A vision: To be revisited/expanded

Simulation is a “macroscope” (or “complexoscope”) because it allows us to “see” complexity in a way that is similar to the way that a microscope allows us to see very small things.

The explicit process specification (that should mirror real social processes) shows us why existing methods have difficulty linking micro and macro levels. The “process” in a statistical model is just the equation system linking variables. In qualitative research there may be no such process. (The reasons why are interesting and puzzling.)


Connection 1: Data and methods

This is a patently unrealistic model: Identical decisions, random movement, no housing market, no schools or jobs to attend to. (I chose it deliberately!)

How, broadly, would it be made more realistic?

Using qualitative methods to study neighbourhoods, perceptions and decision processes.

Using quantitative methods to compare (in some sense) the simulated clusters with some real ones. Does this look anything like residential patterns by ethnicity in Toronto? How like? (I’ll return to this.)

Existing research methods are used in ways that are clearly different but certainly not unrecognisable.
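One hedged illustration of comparing patterns “in some sense”: compute the same summary index on a simulated grid and on a coded real map, then compare the two numbers. The index below (the mean share of like neighbours) is an assumption chosen for simplicity, not a measure the workshop prescribes. The function name and the two toy grids are hypothetical.

```python
def mean_like_share(rows):
    """Average, over agents with at least one neighbour, of the share of
    neighbours of the same type. 'R' and 'G' are agents, '.' is a vacancy."""
    n_rows, n_cols = len(rows), len(rows[0])
    shares = []
    for r in range(n_rows):
        for c in range(n_cols):
            me = rows[r][c]
            if me == ".":
                continue
            same = occupied = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0) or not (0 <= rr < n_rows and 0 <= cc < n_cols):
                        continue
                    if rows[rr][cc] != ".":
                        occupied += 1
                        same += rows[rr][cc] == me
            if occupied:
                shares.append(same / occupied)
    return sum(shares) / len(shares)

# Two toy patterns: a fully mixed checkerboard and two solid blocks.
mixed = ["RGRG", "GRGR", "RGRG", "GRGR"]
blocks = ["RRGG", "RRGG", "RRGG", "RRGG"]

if __name__ == "__main__":
    print(mean_like_share(mixed), mean_like_share(blocks))
```

A single index like this is exactly the kind of weak test discussed later under similarity: many very different maps share the same value.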


Connection 2: Explanation

It is the simulation that links the interplay of situated micro processes (choosing agents with neighbours) with macroscopic patterns (clusters).

A social theory is thus represented neither as a narrative nor as a set of equations but as a computer programme. (Coleman is happy!)

The rigour of quantitative research is retained (complete specification) but the behaviour only needs to be “generated” not “solved” or “fitted” so can be of arbitrary sophistication. (I’ll show this.)

If we can “generate” something then we have explained it. (Methodology hazard!)


Connection 3: Complexity concepts

Complexity: “Rich” patterns (here, non-linearity for example) do not need to come from “rich” agents or “rich” interactions. They can arise from simple interactions between simple agents. World view?

Emergence: The need to use categories at one level of description that do not make sense at another. (You cannot have a one agent cluster or a one car traffic jam.)

Non-linearity: We cannot assume things we often do assume (large effects imply large causes, similar effects have “close” causes).


Informal thoughts on methodology

These will be made more rigorous later.

Generally, don’t use MAM/ABSS to “explain” a straight line. The idea of “over fitting” (and Occam’s Razor) applies but we can’t formalise it as we can in statistics.

We need to worry about “how many” simulations can match a given real system. This is our “leap of faith”.

We discover this by general experience (clustering, Power Law, S-shaped innovation curve) and address it by “bar raising” and choice of research question/model.

Some of these issues arise not from weaknesses in the methodology itself but from the fact it is still being established. (Equating poor methodology and poor practice is a defence mechanism against novelty.)


Similarity in the Schelling model

A two (three?) state system.

Hollow versus full clusters, direct red/green interfaces versus vacancy buffers. (Vacancy chains idea.)

Exact match to Toronto?

Cluster sizes of correct distribution (but no location stability across runs?)

Cluster “shapes” correct?

“There are clusters”: Actually pretty weak.

Now consider 3 types: Separated versus concentric clusters. The latter is much more discriminating.

Or, what is internal structure of clusters with regard to PP? (Most tolerant at edges?)

Naïve (but useful?) notion: Ratio of possible world states to states compatible with your theory as measure of “power”.
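The naïve “power” ratio can be worked through on a toy world small enough to enumerate. The sketch below assumes a ring of 10 cells, each red or green, and two hypothetical theories: “there are clusters” (at least one pair of adjacent like cells) and “there are exactly two clusters” (exactly two colour boundaries around the ring). None of these choices comes from the slides. They only make the arithmetic concrete.

```python
from itertools import product

def boundaries(state):
    """Number of places around the ring where adjacent cells differ in colour."""
    n = len(state)
    return sum(state[i] != state[(i + 1) % n] for i in range(n))

def has_cluster(state):
    """Weak theory: at least one pair of adjacent cells share a colour."""
    return boundaries(state) < len(state)

def two_clusters(state):
    """Stronger theory: exactly one red arc and one green arc."""
    return boundaries(state) == 2

def power(n_cells, compatible):
    """Ratio of all possible world states to the states compatible with a theory."""
    states = list(product("RG", repeat=n_cells))
    matching = sum(1 for s in states if compatible(s))
    return len(states) / matching

if __name__ == "__main__":
    print("there are clusters:", power(10, has_cluster))
    print("exactly two clusters:", power(10, two_clusters))
```

On this toy ring the weak theory rules out only the two perfectly alternating states (1022 of 1024 remain), while the stronger one leaves 90 states, so its ratio is roughly eleven times larger.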


Richness in the Schelling model

Emphasis (so far) on spatial pattern.

What about “biography” or “history” of agents?

What are the effects of in- and out-migration in producing a dynamic rather than static equilibrium? (Convergence as an “artefact” or a finding?)

What are the distributions of any heterogeneous parameters (PP for example) with respect to clusters?

Very loose idea: Can we “fit” on some comparisons of real and simulated data and then “explain” on others? (Hazard warning: We don’t know how orthogonal different “aspects” - like biographies and clusters - are.)


A speculation

Some research methods must be “radical innovations” (rather than just “more of the same”).

If MAM/ABSS is such an RI, what follows? Humility needed!

Possible evidence of MAM/ABSS as an RI is its ability/requirement to reuse existing data and draw attention to novel data previously ignored.

But this casts doubt on the “origins” of the Schelling model: “If I were you, I wouldn’t start from here at all”.


How to start with MAM/ABSS?

Think of it just like research design.

What (one sentence?) are we trying to do/explain? How does phosphorus “move” around Lake Simcoe? How best can we make it do something different?

Why did we pick this method? (In some sense reasons already given but need to defend against claims of existing methods, i.e. don’t “explain a straight line”.)

What is known? (TANSTAAFL again: Not just in all the relevant domains but in MAM/ABSS too!)


Why do all this work?

Read a few articles at random.

Make a set of weakly grounded assumptions.

Build a model: Throw in a few more invisible assumptions so it can’t be replicated.

Play with the model, get “results” and publish in an enclave simulation journal.

Defend your arbitrary assumptions to the death against others with equally arbitrary ones. Avoid collecting data to decide.

Be ignored by domain experts. Ignore them.

Wait for MAM/ABSS to become a footnote in social science history.


Plan for day 2

Going from informal methodology to formal. How to turn these general guidelines into a plan for a research project.

What relevant parts of NetLogo do we need to know about and (broadly) how do they work?


Developing the vision 1

MAM/ABSS is new which offers huge opportunities for innovation and originality. I can “offer you” whole social science disciplines with barely any models.

The price we pay is that we cannot yet fall back on a widely agreed “normal science”.

We have to raise our own standards “from within” without wrecking the community.

We have to “try harder” to convince the rest of the world.

We have to manage the “us or them” boundary especially carefully.

You may decide (quite reasonably) that you want to “come back later”.


Developing the vision 2

It can be done: We are looking for “win win” ideas.

Social networks have an enormous number of potential characterisations courtesy of existing Social Network Analysis. If you can “generate” simulated networks that look like real networks according to many of these (potentially orthogonal) characterisations, you are really on to something.

First win: Tools (Which n measures of social networks make the most effective index for similarity?)

Second win: Perspective (What can we learn if we treat existing SNA data as a sample rather than a bunch of “ethnographically unique” case studies?)


Formalising the methodology 1

The “Gilbert and Troitzsch Box”


Formalising the methodology 2

Choice of target: Clear research question. Avoid TOE: Theory of Everything. (Geographer example from Borges.)

Choice of target: Research question, theory (or theories), process in unknown environment, model.

Process of abstraction: Start from key “stylised facts” in domain. (Class example. Citation test.)

Process of abstraction: Not all abstractions are equally “harmful”. (“The assumptions you don’t realise you are making are the ones that will do you in”. Compare existing methods? More later.)


Formalising the methodology 3

Similarity: Already raised. How high can you go? Transparency and replication? Do it yourself like Darwin? (Commenting code.)

Identification of novel data requirements may reintroduce the really strong falsification test that is so appealing in physical science (Einstein and Mercury perihelion, position of Pluto conditional on theory of gravitation being true). Can’t do this with statistics because model fitting requires all data “up front”.

More on methodology later as needed.


MAM/ABSS abstraction example

A return to the Turkish/Norwegian women.

Is it significant that I sometimes say “we” and sometimes “they?”

Perhaps groups behave in certain ways and I either wish to behave in that way or in some other.

Suppose there are a number of “roles” that prescribe actions in different social settings.

I may choose a role (self interest?) but will I be accepted in it? (White rastafarians.) It depends how I behave. Maybe the role I am “put in” most often shapes how I see the world and how it satisfies me. (Role strain? Roles are two sided.)

A dynamic between behaviours, roles and interests? How does it unfold? Do roles mutate?

Reality check: what do we need to know here? Very broadly, how people behave, how they think they ought to behave and how they feel about it. “Killer” app? Women and work?


Back to no theory, no data, no logo

Phosphorus and Lake Simcoe: A blanket apology in advance.

Surprising how often we are “following stuff” around a system whether the stuff is dirty syringe needles, phosphorus or gazelles.

Goal is water phosphorus levels not much above the natural “carrying capacity”, a huge reduction over a short period. “Instant attention” of policy makers? What else are people doing with this? (NHS epidemics example.)

Where does the phosphorus come from, how does it move or “stick” and what removes it from the area of study?

Set of “phosphorus actors” and “P actions”: Fertilisation by golf courses and farmers, waste water run off from residential areas, sewage works, manufacturing, other. Levels on rivers and open water are the key measurement points. Abstract by not modelling dog walkers … Padding?

Exogenous processes: Air pollution from other regions, outflow from the study region, natural “leaching” from some patches perhaps? An “accounting” approach based on overview of existing knowledge?



Physical processes: Can phosphorus be absorbed or naturally converted at some locations up to some level? How does it behave in ponds and lakes? Does it “coat” patches? (Relatively easy to “split” raindrops?)

What “can” we do? “PhosLok?”, taxes and subsidies, dredging/scrubbing, prohibitions and enforcement, “giving up” on some rivers and making them “sewers”, relocation, drains and changes to wastewater management. Out of the box thinking (Perhaps motivated by the simulation itself: What if we could move this lake?) and collecting the union of suggestions from stakeholders and feeding them back iteratively.

Back to Buckminster Fuller: Are there solutions that appear to cost impossibly much or are simply unacceptable to all but one stakeholder? Do some proposed solutions simply appear not to work? Interesting question: How much does it matter if the model is “wrong” in comparing the relative costs of different strategies?

How “social” a model is this? Do we need to model how the local community forms advice networks or shops for groceries or just how they allocate crops to fields and decide when to feed their lawns? (This is why having a clear research question matters: Does this move phosphorus?)


Getting started

We now have a pretty good NOTNODNOL blueprint for our literature review (and phosphorus is a pretty good search term!)

We also have some notion of what kind of team “leaders” we might need (hydrologist/chemist, some sort of social scientist/community studies person and modeller). Models as common language.

We are looking for physical models, problem regions, management strategies, relevant social science on behaviour change in particular groups. (Don’t close in too soon though: What other water run off product problems are there?)


The world

Patches and attributes: Altitude, water held, surface water on patch to flow away, even cloud saturation above? (Don’t dismiss “kludges”.)

Rules of “transfer”: Water downhill, surface water by patch permeability, surface water by patch surface, clouds by (exogenous?) wind direction.

I don’t know how much of this is “known” or how existing models “translate” to this level of abstraction. I know some atmosphere models do this!

Some aspects (water flow) are likely to be good approximations (pooling) at low cost.
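A hedged sketch of the simplest “water downhill” transfer rule, again in Python for standalone running: each patch passes all its surface water to its lowest strictly lower neighbour, and water pools wherever no neighbour is lower. The function name, the all-or-nothing transfer and the synchronous update are assumptions. Permeability, surface type and clouds are left out.

```python
def flow_step(altitude, water):
    """Move each patch's surface water to its lowest strictly lower neighbour.

    altitude and water are equally sized grids (lists of rows). Water on a
    patch with no lower neighbour stays where it is, which produces pooling.
    """
    rows, cols = len(altitude), len(altitude[0])
    new_water = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target_alt, tr, tc = altitude[r][c], r, c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and altitude[rr][cc] < target_alt:
                        target_alt, tr, tc = altitude[rr][cc], rr, cc
            new_water[tr][tc] += water[r][c]
    return new_water

if __name__ == "__main__":
    slope = [[3, 2, 1]]          # a one-row hillside falling to the right
    water = [[1.0, 1.0, 1.0]]
    for _ in range(2):
        water = flow_step(slope, water)
        print(water)
```

Total water is conserved at every step, which gives a cheap sanity check before any phosphorus “accounting” is layered on top.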


Relevant NetLogo

Earth Sciences (Grand Canyon).

file-open "realplacedata.txt"  ; this file is just a list of altitudes extracted from other data (not an NL issue)

let patch-elevations file-read

file-close


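Note: patch-elevations is a local variable holding the list read from the file. The Grand Canyon model then assigns each value, patch by patch, to the elevation variable declared in patches-own (a foreach over the sorted patches), so the mapping takes only a line or two.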

See also how this programme makes buttons for “tracking” raindrops work.



Back to the “us and them problem”.

Except with policy makers/funders “in charge” (who have to be handled with tact), it is not enough to say that a phenomenon exists to require a model redesign. This is “death by detail”.

There must be data and reasonable grounds (perhaps from other studies) for thinking that the effect “matters”. Clear research designs are also defensible.

This is far from trivial. (Example of SNA and large scale survey data.)


What about “brains?”

Schelling agents had decision processes based on observation but they didn’t have “memories” or “practices” to draw on in alternative situations.

Mostly, agent brains are represented as sets of “if then” rules, partly for interpretability and partly for data access. (Other possibilities exist if needed like “learning systems”.)

Like most programming languages NL has “data structures”. For example, lists representing the x, y co-ordinates of my “required” daily activities.

Example: Social Science (El Farol).


Doing things to lists

set foo (list (random 10) (random 10) 7 2)

set foo (list (list 0 0 0) (list 1 1 1))

set foo but-first foo  ; also but-last: past behaviours being forgotten

if empty? foo [ do-thing ]

set foo filter [? < 3] [1 2 4 5 6 8 2]  ; NetLogo 6 and later: filter [x -> x < 3] [1 2 4 5 6 8 2]

set foo fput 2 [3 4 5]

set bar (item 2 [2 3 4 5])  ; note: indexing starts from 0, so this reports 4

set foo (replace-item 2 [2 3 4] 15)

Look at NetLogo Dictionary in Help.



Strings look rather like lists but are sequences of characters, so they can mix words, numbers and punctuation.

A nifty trick (like LISP) is to use run (or runresult) to “execute” strings as NL code; read-from-string only reads literal values such as numbers and lists. So, for example, suppose you want an agent to act by if … then … rules. If you put these in procedures they are “hard coded” for each run. If what you actually want is for agents to be able to change their set of practices (borrowing from others or deleting failed rules), store the rules as strings (probably a list of strings) and then execute them one at a time in each situation.
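The idea of practices as data that agents can borrow or delete can be sketched without executing strings at all. The Python illustration below swaps string execution for a lookup table of named rules. The Agent class, the rule names and the “rain” situation are hypothetical and not taken from any NetLogo model.

```python
# Each rule is a (condition, action) pair looked up by name. Agents hold only
# the names, so their set of practices is plain data that can change mid-run.
RULES = {
    "fertilise-if-dry": (lambda situation: situation["rain"] < 5, "fertilise"),
    "wait-if-wet": (lambda situation: situation["rain"] >= 5, "wait"),
    "always-fertilise": (lambda situation: True, "fertilise"),
}

class Agent:
    def __init__(self, practices):
        self.practices = list(practices)  # ordered rule names, highest priority first

    def act(self, situation):
        """Try each rule in turn and return the action of the first one that applies."""
        for name in self.practices:
            condition, action = RULES[name]
            if condition(situation):
                return action
        return "do-nothing"

    def borrow_from(self, other, name):
        """Imitation: copy a rule another agent uses and give it top priority."""
        if name in other.practices and name not in self.practices:
            self.practices.insert(0, name)

    def drop(self, name):
        """Forget a failed rule."""
        if name in self.practices:
            self.practices.remove(name)

if __name__ == "__main__":
    a = Agent(["always-fertilise"])
    b = Agent(["wait-if-wet", "fertilise-if-dry"])
    print(a.act({"rain": 10}))
    a.borrow_from(b, "wait-if-wet")
    print(a.act({"rain": 10}))
```

A lookup table gives up the full flexibility of arbitrary code strings, but it keeps every possible practice visible in one place, which helps with interpretability and replication.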



Once agents have “brains”, communication and imitation fall out very naturally.

Examples: Reputation in the Prisoner’s Dilemma, the Gilbert and Troitzsch “shopping agents”.

Warning! Don’t let your model develop feature creep. This is not a model of how we diffuse better practices in communities. We only want to know what happens if we change the distribution of behaviours.

Is a farmer just a “ghostly presence” floating over a farm?


A Problem

How to make systematic use of past data?

If someone else read what you read, how similar would their model be? Defensibility?

Idea of inductive coding in qualitative research: What do papers “talk about” and “how much?” (More tomorrow?)

Sources: Literature reviews as a first cut, the “raw literature” once you have some structuring ideas (NOTNODNOL models are useful here), experts, stakeholder interviews, social science “common sense”.


“Version 0”

The simplest model you can think of that addresses the problem, contains all the “boxes” (key processes) and actually works. Small?

Now we can say more about “safe” abstractions and forward development. Having a fixed “lay down” for phosphorus across all patches is almost certainly wrong but, within the development framework, is trivial to fix in version 1. By contrast, a network free model would be awful to “fix” in the next version (assuming networks were needed; they may not be here).


Uses of version 0

Learn the skills.

Build the team and get conversation going.

Get wider input.

Show potentially interested partners/funders.

Scope additional data requirements. (Sensitivity analysis “starting from where you are”.)

Example: Do I need “real” weather? Suppose I have a “certain amount” of rain to distribute. How much (and in what way) does it matter if I distribute it randomly, in “lumps” or in “lumps by altitude?” You can “fake” this before deciding whether to “invest”.
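The “fake it before you invest” weather experiment might look something like the following sketch. It distributes the same total rainfall over a row of patches in three ways: randomly unit by unit, in “lumps”, and in lumps weighted by altitude. The function names, the one-dimensional row of patches and the choice of altitude as a sampling weight are illustrative assumptions only.

```python
import random

def rain_random(total, n_patches, rng):
    """Drop the rain one unit at a time on patches chosen at random."""
    rain = [0] * n_patches
    for _ in range(total):
        rain[rng.randrange(n_patches)] += 1
    return rain

def rain_lumps(total, n_patches, lump, rng):
    """Drop the rain in lumps of a fixed size on patches chosen at random."""
    rain = [0] * n_patches
    remaining = total
    while remaining > 0:
        amount = min(lump, remaining)
        rain[rng.randrange(n_patches)] += amount
        remaining -= amount
    return rain

def rain_lumps_by_altitude(total, altitudes, lump, rng):
    """Drop lumps with probability proportional to patch altitude.

    Assumption: altitudes are non-negative and at least one is positive.
    """
    rain = [0] * len(altitudes)
    patches = list(range(len(altitudes)))
    remaining = total
    while remaining > 0:
        amount = min(lump, remaining)
        target = rng.choices(patches, weights=altitudes, k=1)[0]
        rain[target] += amount
        remaining -= amount
    return rain

if __name__ == "__main__":
    rng = random.Random(42)
    altitudes = [10, 20, 40, 80, 40, 20, 10, 5]
    print(rain_random(100, 8, rng))
    print(rain_lumps(100, 8, 25, rng))
    print(rain_lumps_by_altitude(100, altitudes, 25, rng))
```

Feeding each of the three rainfall patterns into the same version 0 and comparing the outputs is the sensitivity test: if the results barely differ, “real” weather data may not be worth the investment yet.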


Throwing it back

From “here”, what are the problems you envisage with SimCoeSim (or other models like homeless epidemiology and urban land use?)

How to do specific things in programming?

How to represent certain processes or abstract them?


Plan for day 3

Wrapping up the methodological outline.

Some “short takes” on the state of the art in various respects.

Some passing reflections on “large scale” research across disciplines.

Avoiding bad practice in MAM/ABSS. Getting “through” to publication or effective policy advice.


More methodology: Parameters

Avoid unmeasurable parameters generally. (But allow for considerable creativity in research methods: Firefighter example.)

Not all measurable parameters need yet be measured for scientific status. Unproblematically measurable is better.

“Quality” of models depends on progressive and iterative refinement of values by “significance”.

Too many unanchored parameters make a model capable of anything. Searching the parameter space from scratch is impossibly time consuming unless you start from plenty of “best guesses”.
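A back-of-envelope calculation shows why searching from scratch is so costly. The sketch assumes a full grid search in which every parameter is tried at the same number of values and each combination is replicated to average out randomness. The function name and the example numbers are illustrative, not figures from any real project.

```python
def runs_needed(n_parameters, values_per_parameter, replications):
    """Simulation runs required for a full grid search of the parameter space."""
    return values_per_parameter ** n_parameters * replications

if __name__ == "__main__":
    # Ten unanchored parameters, five candidate values each, 20 replications.
    print(runs_needed(10, 5, 20))
    # Same model with seven parameters pinned by "best guesses", three left free.
    print(runs_needed(3, 5, 20))
```

With these illustrative numbers the unanchored search needs 195,312,500 runs, against 2,500 once seven parameters are pinned by best guesses.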


More on parameters

Keep a parameter log with current value and rationale. Look for “the weakest link” and delegate. Some “parameters” are innocuous like divisions of colour scales.

Hypothesis: In principle, a model should have no tunable parameters except those susceptible to policy. Here, the journey is more important than the destination.

Defensibility is important here. Critics of the model can’t just say a parameter value is “wrong”. They have to say what is better, why and, ideally, how much it matters.
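A parameter log needs no special tooling. One possible shape is sketched below, with the field names, the three confidence labels and every entry invented purely as placeholders. The point is only that each value travels with its rationale and that the “weakest links” can be listed mechanically.

```python
from dataclasses import dataclass

@dataclass
class ParameterEntry:
    name: str
    value: float
    rationale: str
    source: str
    confidence: str  # one of "measured", "estimated", "guess" (labels are an assumption)

# Placeholder entries for illustration only: none of these values is real data.
LOG = [
    ParameterEntry("preferred_proportion", 0.3, "placeholder rationale", "placeholder source", "estimated"),
    ParameterEntry("vacancy_rate", 0.2, "placeholder rationale", "placeholder source", "measured"),
    ParameterEntry("rain_lump_size", 25.0, "placeholder rationale", "none yet", "guess"),
]

def weakest_links(log):
    """Names of the parameters that still rest on a guess: candidates to delegate."""
    return [entry.name for entry in log if entry.confidence == "guess"]

if __name__ == "__main__":
    print(weakest_links(LOG))
```

A critic who thinks a value is wrong then has somewhere to record what is better and why.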


Is this achievable?

Abdou, Mohamed and Gilbert, Nigel (2009) Modelling the Emergence and Dynamics of Workplace Segregation, Mind and Society, 8, pp. 173-191.

We are starting to be able to point to a few examples with “gold standard” methodology but it has taken a while for what is needed to become clear.



An elegant model is one that has a favourable ratio of parameters to explained phenomena. The more phenomena with the fewer parameters the better. You know it when you see it: Read models too!

Example: Farmers choose crops ex ante. Crop totals (plus exogenous “weather” plus practice if needed) determine a “fragment” of market price ex post (a big lump will be the rest of Canada or even the world market) and this feeds back to regulate cropping decisions and potentially also farm failure/growth. These two (measurable?) parameters are all it takes to “close” the economic system neatly as a first approximation. (Though we need to know farmer goals too.)


The Bacharach Conjecture

A model that makes minimally sensible assumptions in all processes will outperform one with excellent assumptions in a few processes and plainly silly ones elsewhere. (Perfect competition anyone?)

Chattoe-Brown’s Lemma: It will also outperform models with “missing” processes relevant to a given research question.

The combination of systematic literature reviewing and consensual development of “version 0” is intended to generate models on the “right” side of this conjecture/lemma.



Many social and physical processes are internal to the target with bi-directional links.

Some things affect the system but are not significantly affected by it relative to the research question. (So, in fact, rainfall may be affected by surface evaporation but, in a model of phosphorus transport in a local region, rainfall can be treated, reasonably, as exogenous as long as surface evaporation is only a small effect and doesn’t partake of phosphorus transport itself.)


Version control

Have a rolling list of versions and decide which amendments should go in which version.

Don’t have a rolling programme. You’ll need finished versions to “publish” and the project will get in a mess with “spaghetti code”.

You can’t really “hack” if the code needs to be passed around within the team.

Establish a work plan and procedure for transferring tasks between team members, i.e. a complete specification of (agreed) upgrades to the next version for the modeller. Summary reports between, say, physical and social science “groups”.


Setting up for model testing

Because MAM/ABSS are complex and data hungry, you don’t get “a lot” of opportunities to test them. This makes these tests very important and they need to be planned for in advance.

Version 0 is supposed to “capture” existing knowledge and is judged on ability to reproduce stylised facts.

Can you achieve “hold backs” in various ways? (Historical data, subsets of phosphorus sampling “stations”.)

If you have parameter values you can’t set even approximately, can you tune the model to fit one “output dimension” and then use others as hold backs? (Clusters versus biographies?)

This is another reason why careful choice of parameters is important. Inability to set parameters with principled values (however imperfect) is a potential “waste” of testing opportunities which are much more “expensive” than data.

Think seriously about things like not giving whoever tunes the parameters the hold back data.
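A minimal sketch of the hold back idea, assuming a toy one-parameter model: the parameter is tuned on some sampling “stations” only and the fit is then checked once on the stations that were held back. The model, the records and the candidate values are all invented for illustration.

```python
def toy_model(rate, inputs):
    """Stand-in for a full simulation run: predicts one output per input."""
    return [rate * x for x in inputs]


def mean_abs_error(predicted, observed):
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def split_hold_back(records, every=3):
    """Hold back every n-th record; the tuner only ever sees the rest."""
    tuning = [r for i, r in enumerate(records) if i % every != 0]
    held_back = [r for i, r in enumerate(records) if i % every == 0]
    return tuning, held_back


def tune(candidates, records):
    """Pick the candidate parameter value with the lowest error on `records`."""
    inputs = [x for x, _ in records]
    observed = [y for _, y in records]
    return min(candidates,
               key=lambda rate: mean_abs_error(toy_model(rate, inputs), observed))


# Invented (input, observation) pairs, one per hypothetical sampling station.
records = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 12.0)]
tuning, held_back = split_hold_back(records)
best_rate = tune([1.5, 2.0, 2.5], tuning)
hold_back_error = mean_abs_error(
    toy_model(best_rate, [x for x, _ in held_back]),
    [y for _, y in held_back],
)
print(best_rate, hold_back_error)
```

Keeping split_hold_back and tune in separate hands is one way to make sure whoever tunes never sees the held back records.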


Iterative development: Recap

Conditional on a “finished” version, which parameter values are most significant to the ability of the model to “match” real data? (Very dangerous to “muddle” exploration of parameters with changes to program functionality.)

What is the confidence we have in these crucial parameter values?

How should we allocate “research effort” (and of what kind) to most increase this confidence? Out of the box thinking? Quick and dirty laboratory experiments? More literature reviewing?

“Reverse” sensitivity analysis. If we could halve the uncertainty on this parameter value, would this get us into a single phase of system behaviour?

“Selling” research ideas to others? Meta-analysis? Larger samples for existing findings? (Close friendship example.)
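The “reverse” sensitivity question can be made concrete with a small sketch. It assumes we can run the model at a parameter value and classify the resulting behaviour into a named phase; toy_classify below is a hypothetical stand-in for that (expensive) step and its threshold is invented.

```python
def toy_classify(value):
    """Stand-in for 'run the model and label the behaviour'. Threshold is invented."""
    return "mixed" if value < 0.25 else "segregated"


def phases_in_range(classify, low, high, samples=21):
    """Set of phases seen when the parameter is swept across [low, high]."""
    step = (high - low) / (samples - 1)
    return {classify(low + i * step) for i in range(samples)}


def halve_uncertainty(low, high):
    """Keep the midpoint but halve the width of the uncertainty interval."""
    mid = (low + high) / 2
    quarter = (high - low) / 4
    return mid - quarter, mid + quarter


low, high = 0.2, 0.6                       # current (hypothetical) uncertainty
print(phases_in_range(toy_classify, low, high))
new_low, new_high = halve_uncertainty(low, high)
print(phases_in_range(toy_classify, new_low, new_high))
```

If the halved interval yields a single phase, research effort spent narrowing that parameter would pay off directly; if not, effort may be better spent elsewhere.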


Systematic reviewing 1

If it is correct that MAM/ABSS is a distinctive method that can “reuse” old data, how do we actually do this?

Example of social capital: Reading surveys and overviews identifies several key dimensions that are more or less central. (Everyone agrees on networks but rather few on bureaucracy.)

What is the minimal version 0 model that can integrate these dimensions in an “elegant” way?

Given this framework/V0M, what does each specific paper or book contribute?


Systematic reviewing 2

Start with a narrative (usually an interview, here a journal article).

“Chop it up” into theoretically significant sections, i.e. here the respondent is talking about “danger” or “respect”.

Refine the “codes” (categories in the narrative) and produce “memos” (any ideas created by the data: “This sounds like patriarchy”, “Get a better tape recorder”).

Repeat sequentially with more narratives. This should make the codes robust, show their ubiquity, suggest relations between them and “falsify” memos against a sample for their “proper” use (organising principle versus “nice idea”).

The model thus reflects the “most agreed” aspects of a phenomenon in a warrantable way.

Matters slightly complicated by endogenous evolution of fields. (Choice and refusal example.)


Developing middle range theory

Several people have asked me about “generic” agent architectures. Some exist but aren’t that useful because there is no such thing as a “generic social behaviour”.

However, there are areas where disciplinary boundaries or the simple “difficulty” of theorising have created recognisable gaps in what is needed for certain kinds of MAM/ABSS.

For example, models tend to be spatial, social or relational but rarely two and never (to my knowledge) all three.


PACT models

In reality, as busy academics, we fully recognise the social world as consisting primarily of places and people at specified times.

Who we meet where defines specific relations (colleagues) and underpins the “generation” of different social ties: Which of your colleagues would you also consider “a friend” to invite home to meet your family or drink with on a non work day? How did they get that way?


Version 0

Each agent has a “time plan”, simply a sequence of places to “be at” at each point in the day.

While there, they might meet anyone else who is there too (geography structures networks). There is a very low rate of “random” meeting (bus stops).

Version 1 allows some voluntary activities (let’s all meet in bar x at 5), easily adding “weekends”.

Version 2 allows remote communication and deliberate adaptation by how you “get on” with different people. (A “good party” will be reproduced. People search networks and “recommend”.)

Can make use of distinctive data like “oral histories” of friendships and time diaries. This will be pretty useful.
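A minimal sketch of the version 0 idea, assuming each time plan is just a list of place names, one per time slot. The agent names, places and probabilities are invented, and simulate_day is a hypothetical helper rather than code from the actual model.

```python
import random
from collections import Counter
from itertools import combinations


def simulate_day(time_plans, meet_prob=0.5, random_meet_prob=0.01, rng=None):
    """Count meetings between pairs of agents over one day.

    Agents in the same place in the same slot meet with probability meet_prob;
    agents in different places meet only at the low "random" rate.
    """
    rng = rng or random.Random(0)
    meetings = Counter()
    agents = sorted(time_plans)
    n_slots = len(time_plans[agents[0]])
    for slot in range(n_slots):
        for a, b in combinations(agents, 2):
            same_place = time_plans[a][slot] == time_plans[b][slot]
            chance = meet_prob if same_place else random_meet_prob
            if rng.random() < chance:
                meetings[(a, b)] += 1
    return meetings


# Invented time plans: four slots in a day.
plans = {
    "ann": ["home_a", "office", "office", "bar"],
    "bob": ["home_b", "office", "lab", "bar"],
    "cat": ["home_c", "lab", "lab", "home_c"],
}
print(simulate_day(plans))
```

Repeating the day many times and keeping the counts gives a weighted network in which geography structures who is tied to whom.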


Other theory needed

Real dynamic decision making and thus “real” communication (co-evolution of problem representation and solutions rather than “hard coded” shared representation).

Behavioural underpinnings of realistic network dynamics. (Neighbour greetings example.)

Effective but economical representations of organisations and hierarchies and their impact. (Organisations as networks of “vacancies”.)

Coherent agent representations of norms based on empirical data in well defined domains.

Models that can cope with (and explore how agents cope with) “genuine novelty”.


Reconfiguring/“freeing” methods

How do we create/convince “theory building” or “gap filling” ethnographers?

How do we refocus at least some statistical analysis on “measures of similarity” between complex objects (rather than straight lines!)?

Is it easier to teach programmers sociology or sociologists to program?

Why are methods so often tied to disciplines and “flavoured” accordingly? (Experimental cultures example.) Can we “free” methods?


4 “No Nos” (and “Yes Yes” variants)

Models that are of interest only to the designer and his/her friends. [Models that cast light on unsolved problems in a particular domain.]

Models that don’t even capture the stylised facts of knowledge to date. (Econophysics?) [Models which systematically encapsulate what is known “to date” in an elegant way.]

Models that even the designer doesn’t fully understand. [Models the designer can explain to non-modellers in a way that provokes intelligent responses.]

Flaky or “do anything” models. [Models that generate a systematic programme of data synthesis and/or collection. Models that make robust predictions that are relatively insensitive to parameter changes within the known uncertainty.]


Quick thoughts on publication

Avoid the “no nos”.

Stand your ground on wrong headed criticisms.

“Borrow” and accumulate effective responses (examples) for the “standard objections”. (“Et tu? defence” example.)

Judge your target/audience and don’t write “boiler plate” you don’t need.

For “print”, figure out how to get your outputs into graphs and tables.

Think seriously about code and sample runs on the web. (Many journals now endorse this.)


What haven’t I talked about?

The floor is yours!


Now read on

Ahrweiler, P., Pyka, A. and Gilbert, N. 2004. Simulating knowledge dynamics in innovation networks. In R. Leombruni and M. Richiardi (eds.) Industry and labor dynamics: The agent-based computational economics approach. Singapore: World Scientific Press.

Chattoe, E. 2006. Using simulation to develop testable functionalist explanations: A case study of church survival, British Journal of Sociology, 57(3), September, pp. 379-397.

Chattoe, E. and Hamill, H. 2005. It's Not Who You Know - It's What You Know About People You Don't Know That Counts: Extending the Analysis of Crime Groups as Social Networks, British Journal of Criminology, 45(6), November, pp. 860-876.

Chattoe-Brown, E. 2009. The Social Transmission of Choice: A Simulation with Applications to Hegemonic Discourse, Mind and Society, 8(2), December, pp. 193-207.

Epstein, J. M. and Axtell, R. 1996. Growing artificial societies: Social science from the bottom up. Washington, DC and Cambridge, MA: Brookings Institution Press and MIT Press.

Gilbert, N. and Troitzsch, K. G. 2005. Simulation for the Social Scientist, second edition. Buckingham: Open University Press. [Red cover edition. Important: Don’t get the blue cover first edition by mistake. Examples not in NL!]

Gilbert, N. 2007. Agent based models. Quantitative Applications in the Social Sciences 153. London: Sage.

Gilbert, N. 2007. A generic model of collectivities, Cybernetics and Systems, 38(7), September, pp. 695-706.

Ramanath, A. M. and Gilbert, N. 2004. The design of participatory agent-based social simulations. Journal of Artificial Societies and Social Simulation, 7(4).



JASSS, Journal of Artificial Societies and Social Simulation.


NetLogo.

simsoc email distribution list.

ESSA, European Social Simulation Association.

NAACSOS, North American Association for Computational Social and Organization Sciences.

CSSS, Computational Social Science Society <tbc!>.


Welcome to NetLogo

Click on the NL icon.

Go to Files > Models Library.

Select and click on Social Science to show options.

Select Segregation and then hit the Open button bottom right.


Things to observe




World window.

Interface, Information and Procedures buttons.

Speed slider.


Experiments with Schelling

Try population of 1000 with similar wanted at 50%.

Now try a population of 2000.

What happens with a population of 2500? Why?

Now try SW at 75% with populations at 2000 and 2500. What are the differences in observable behaviour?

What is the difference (for 2000 agents) between the behaviour of the system when SW is 75% and when it is 40%?

How many different “dimensions” of the system can vary based on different combinations of two parameters?

Can you get the system to do anything else that is interesting?
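For anyone who wants to repeat these experiments outside NetLogo, here is a minimal sketch of a Schelling-style model. It is not the NetLogo Segregation code: the wrap-around grid, the rule that unhappy agents jump to a random empty cell, and all names and default values are simplifying assumptions.

```python
import random


def occupied_neighbours(grid, r, c):
    """Types of the agents in the 8 surrounding cells (grid wraps at the edges)."""
    size = len(grid)
    cells = [grid[(r + dr) % size][(c + dc) % size]
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [x for x in cells if x is not None]


def is_happy(grid, r, c, similar_wanted):
    """Happy if the share of like neighbours is at least similar_wanted."""
    near = occupied_neighbours(grid, r, c)
    if not near:
        return True
    return sum(1 for x in near if x == grid[r][c]) / len(near) >= similar_wanted


def make_grid(size, population, rng):
    """Random grid with two agent types (0 and 1); None marks an empty cell."""
    if population > size * size:
        raise ValueError("population exceeds number of cells")
    cells = [i % 2 for i in range(population)] + [None] * (size * size - population)
    rng.shuffle(cells)
    return [cells[i * size:(i + 1) * size] for i in range(size)]


def step(grid, similar_wanted, rng):
    """Move every currently unhappy agent to a random empty cell; return moves made."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    unhappy = [(r, c) for r, c in cells
               if grid[r][c] is not None and not is_happy(grid, r, c, similar_wanted)]
    empty = [(r, c) for r, c in cells if grid[r][c] is None]
    rng.shuffle(unhappy)
    moved = 0
    for r, c in unhappy:
        if not empty:
            break
        i = rng.randrange(len(empty))
        er, ec = empty[i]
        grid[er][ec], grid[r][c] = grid[r][c], None
        empty[i] = (r, c)  # the vacated cell becomes available
        moved += 1
    return moved


def run(size, population, similar_wanted, max_steps=200, seed=0):
    """Run until nobody moves or max_steps is reached; return (grid, steps taken)."""
    rng = random.Random(seed)
    grid = make_grid(size, population, rng)
    for t in range(max_steps):
        if step(grid, similar_wanted, rng) == 0:
            return grid, t
    return grid, max_steps


grid, steps = run(size=20, population=300, similar_wanted=0.5)
print("steps taken:", steps)
```

Varying population relative to size * size, and similar_wanted between 0.4 and 0.75, mirrors the slider experiments above; behaviour may differ in detail from NetLogo because the movement rule is simplified.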


Other suggestive models

Earth Science: Erosion.

Earth Science: Grand Canyon.

Social Science: Wealth Distribution.

Social Science: Team Assembly.

