
The ARGUS Software of the SDC-project

Anco Hundepool

Statistics Netherlands

Washington, August 1999

Statistical Disclosure Control
  • the balance between the need for (more and more) information and
  • the privacy of the respondents
Statistical Disclosure Control
  • Need for detailed micro data files
    • electronic publications
    • computing power of users
  • Need for more detailed tables


Statistical Disclosure Control
  • Protection of privacy of respondents (persons, enterprises, institutions)

  • Respondents must be able to trust Statistical Offices!
  • Risks:
    • Intruders/ hackers
    • Accidental recognition
    • Advanced record linkage techniques
Statistical Disclosure Control
  • Produce ‘safe’ datafiles and tables
  • Apply data modification techniques
  • Preserve as much information as possible

Implemented in ARGUS!

Framework of development of ARGUS
  • SDC project
  • partly subsidised by EU (4th Framework)
  • Co-operation between The Netherlands, Italy (+Spain) and UK
General aims of SDC project
  • Methodological research in SDC
    • microdata, tables
    • concerning statistics, OR
    • geographical data
  • (general) SDC Software development
    • microdata (m-ARGUS)
    • tables (t-ARGUS)
SDC project members
  • Netherlands
    • CBS (ARGUS)
    • TU-Eindhoven (OR for microdata)
  • Italy
    • Istat (with Univ. of Rome)(Research/testing)
    • CPR-Padova (with Univ. Tenerife)(OR for tabular data)
SDC project members
  • UK
    • ONS (data)
    • Univ. Manchester (with Univ. of Southampton)(Research on SARs)
    • Univ. of Leeds (Geographical data)
Main software developed in SDC-project
  • m-ARGUS (CBS and TUE)
    • micro data
  • t-ARGUS (CBS and CPR)
    • tabular data
Ideas of m-ARGUS
  • Intruder uses information of identifying variables (e.g. region, sex, age, education, occupation) to identify records.
  • This leads to the sensitive information
m-ARGUS
  • Levels of protection
    • public use files (PUF)
    • micro files for researchers (MUC): universities, contract etc.
    • safe-setting
Ideas of m-ARGUS
  • a list of combinations of identifying variables must be checked
  • find value combinations that are unsafe
    • e.g. |a x b x c| <= threshold
    • threshold depends on level of protection
      • Public use files
      • Micro data for researchers (contract)
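
The check described above can be sketched as a frequency count over combinations of identifying variables. This is a minimal illustration, not ARGUS code: the function name `unsafe_combinations`, the sample records and the threshold value are all assumptions made for the example.

```python
from collections import Counter

def unsafe_combinations(records, keys, threshold):
    """Return the value combinations of `keys` that occur at most `threshold` times."""
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    return {combo: n for combo, n in counts.items() if n <= threshold}

# Hypothetical microdata with three identifying variables.
records = [
    {"region": "North", "sex": "F", "occupation": "baker"},
    {"region": "North", "sex": "F", "occupation": "baker"},
    {"region": "North", "sex": "M", "occupation": "mayor"},
    {"region": "South", "sex": "F", "occupation": "baker"},
]

# A stricter protection level (e.g. a public use file) would use a higher threshold.
unsafe = unsafe_combinations(records, ("region", "sex", "occupation"), threshold=1)
```

Here the two combinations that occur only once are flagged; raising the threshold flags more combinations and so demands more protection.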
Ideas of m-ARGUS
  • eliminate the unsafe combinations
    • by global recoding (age -> agegroup, region -> province)
    • local suppression (imputing missings)
    • interactively/automatically
  • with minimum information loss (entropy)
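
A rough sketch of the two techniques and of an entropy-based loss measure follows. The slides name entropy but give no formula, so the Shannon entropy of the value distribution used here is an assumption, as are the helper names `recode_age`, `suppress` and `entropy` and the 10-year age bands.

```python
import math
from collections import Counter

def recode_age(age, width=10):
    """Global recoding: map an exact age to an age-group label."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress(record, variable, missing=None):
    """Local suppression: replace one value in one record by a missing code."""
    out = dict(record)
    out[variable] = missing
    return out

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

ages = [23, 27, 31, 38, 45, 52, 58, 64]
groups = [recode_age(a) for a in ages]

# Information lost by recoding, measured as the drop in entropy.
loss = entropy(ages) - entropy(groups)
```

Global recoding changes a variable for every record, while local suppression touches only the records that remain unsafe, which is why a mix of the two can keep the loss small.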
m-ARGUS
  • For microdata
  • Developed in Borland C++
  • Windows-95/98
  • Version 3.0 last SDC-version
    • interactive/automatic global recoding
    • automatic local suppression
Features of m-ARGUS
  • can handle large microdata files
    • only tables derived from microdata are being used
  • flexible global recoding
  • options for automatic mix of global recoding and local suppression (TU Eindhoven)
Additional features of m-ARGUS
  • Micro-aggregation
  • Top/Bottom coding
  • Rounding
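
The three techniques listed above can be illustrated on a numeric variable as follows. These are simplified textbook versions written for this example, not the m-ARGUS implementations; the function names, the group size `k` and the handling of a short final group are assumptions.

```python
def top_code(values, cap):
    """Top coding: replace every value above `cap` by `cap`."""
    return [min(v, cap) for v in values]

def round_to_base(values, base):
    """Round non-negative values to the nearest multiple of `base` (halves go up)."""
    return [base * int(v / base + 0.5) for v in values]

def micro_aggregate(values, k):
    """Sort the values, form groups of k neighbours, and replace each value
    by the mean of its group. A final group smaller than k is merged into
    the previous group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    out = [0.0] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            out[i] = mean
    return out

incomes = [12, 15, 11, 90, 14, 13, 300]
aggregated = micro_aggregate(incomes, k=3)
```

Bottom coding is the mirror image of `top_code`, using `max` with a lower bound instead of `min` with a cap.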




[Figure: m-ARGUS diagram with the labels Generate tables, Global recoding, Local suppression, Micro aggregation, Top/bottom coding]

m-ARGUS input data
  • Data: Fixed format ASCII
  • Metadata
    • Name
    • Position
    • Missing values (2)
    • Identification level
    • Hierarchical coding
    • Codelist (opt.)
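
To make the role of the metadata concrete, the sketch below reads one fixed-format ASCII record using a per-variable description. The `Variable` class, its field names and the sample layout are hypothetical and are not the actual m-ARGUS metadata file format; in particular the explicit `length` field is an assumption, since the slide lists only a position.

```python
from dataclasses import dataclass

@dataclass
class Variable:
    name: str
    start: int              # 1-based start column in the record
    length: int             # field width (assumed; not listed on the slide)
    missing: tuple = ()     # up to two codes that denote a missing value
    ident_level: int = 0    # 0 = not identifying (assumed convention)

def parse_record(line, variables):
    """Cut one fixed-format line into named fields, mapping missing codes to None."""
    record = {}
    for var in variables:
        raw = line[var.start - 1:var.start - 1 + var.length].strip()
        record[var.name] = None if raw in var.missing else raw
    return record

# Hypothetical layout: region (cols 1-2), age (3-4), sex (5), occupation (6-7).
variables = [
    Variable("region", 1, 2, ident_level=1),
    Variable("age", 3, 2, ident_level=2),
    Variable("sex", 5, 1, ident_level=2),
    Variable("occupation", 6, 2, missing=("98", "99"), ident_level=3),
]

record = parse_record("NH34F99", variables)
```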
Using m-ARGUS
  • reading data file
  • generating tables
  • apply global recodes
  • local suppression
  • generate safe file
  • generate report
Ideas of t-ARGUS
  • identification of sensitive cells using e.g. dominance rule
    • at least n (e.g. 2) contributors to a cell
    • sum of largest 3 contributors >= 75% of the cell total (one large contributor could recalculate the contribution of its competitor)
  • easy part
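
The two rules above are simple to state in code. The sketch below uses the example parameters from the slide (2 contributors, largest 3, 75%); the function name `is_sensitive` and the sample cells are assumptions, and contributions are assumed to be non-negative.

```python
def is_sensitive(contributions, min_contributors=2, n=3, k=75.0):
    """Flag a cell that has too few contributors, or whose n largest
    contributors together account for at least k percent of the cell total."""
    if len(contributions) < min_contributors:
        return True
    total = sum(contributions)
    if total == 0:
        return False
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top / total >= k

dominated = is_sensitive([500, 30, 20, 10, 5])   # top 3 hold about 97% of the total
balanced = is_sensitive([100] * 10)              # top 3 hold 30% of the total
single = is_sensitive([40])                      # only one contributor
```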
Ideas of t-ARGUS
  • Eliminate/protect sensitive cells (hard part)
  • by applying SDC techniques
    • table redesign
    • cell suppression
    • rounding
    • interactively and/or automatically
  • with minimum information loss (e.g. cell weights)
Ideas of t-ARGUS
  • cell suppression in tables with marginals
  • identify primary sensitive cells
  • protect primary cells by suppressing additional (secondary) cells to prevent recalculation (to some approximation)
  • with minimal information loss (CPR)
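
A small example shows why secondary suppression is needed when marginals are published. The sketch assumes that all row and column totals are published unsuppressed, and it only detects exact recalculation; it does not cover approximate recalculation and does not minimise information loss, which the slides attribute to the optimisation routines from CPR. The function name and the table values are made up.

```python
def recoverable_cells(suppressed, n_rows, n_cols):
    """Return the suppressed interior cells that are the only suppressed cell
    in their row or column, and so follow exactly from a published total."""
    row_counts = [sum((r, c) in suppressed for c in range(n_cols)) for r in range(n_rows)]
    col_counts = [sum((r, c) in suppressed for r in range(n_rows)) for c in range(n_cols)]
    return {(r, c) for (r, c) in suppressed
            if row_counts[r] == 1 or col_counts[c] == 1}

table = [
    [5, 20, 30],
    [40, 50, 60],
    [15, 25, 35],
]

# Suppressing only the primary cell (0, 0) does not protect it:
# the row total minus the other published cells gives the value back.
row_total = sum(table[0])
recovered = row_total - table[0][1] - table[0][2]

primary_only = recoverable_cells({(0, 0)}, 3, 3)
with_secondary = recoverable_cells({(0, 0), (0, 1), (1, 0), (1, 1)}, 3, 3)
```

With three secondary cells added, every affected row and column contains two suppressed cells, so no single subtraction recovers the primary cell.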
t-ARGUS
  • 3-D tables
  • interactive table redesign
  • primary & secondary cell suppression
  • optimisation routines for automatic cell suppression
  • rounding










[Figure: only the label "Safe table" is recoverable]

Features of t-ARGUS
  • Initial run through microdata
    • Determine also top k per cell -> sensitive cells
    • Table redesign possible without going back to microdata
  • Uses procedures for secondary cell suppression using state-of-the-art optimisation algorithms (CPR)
  • Prepared for linked tables
t-ARGUS
  • Data: fixed format ASCII
  • Meta data:
    • Variable name
    • Start position
    • Field length
    • Status
t-ARGUS
  • Apply global recoding
  • Protect file with secondary suppression
  • Rounding
  • Safe table as ASCII or .WK1 (plus report)
t-ARGUS
  • Version 2.0 final SDC-version
  • requires commercial OR-solver (Xpress by Dash, UK, 600 GBP)
Future / CASC
  • Computational Aspects of Statistical Confidentiality
  • New European project-proposal (2000-2002)
    • Extending ARGUS
    • New research
  • Additional joint USA/EU-project?
CASC: m-ARGUS
  • Concentration on business/economic data
    • microaggregation
    • PRAM
    • Noise-addition/ masking
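
For orientation, PRAM (post-randomisation) is usually described as replacing each categorical value by a random draw from a transition probability matrix. The sketch below illustrates that idea only; the function name `pram`, the categories and the transition probabilities are invented for the example and say nothing about how CASC planned to implement it.

```python
import random

def pram(values, transition, rng):
    """Replace each value by a category drawn from its row of the transition matrix."""
    perturbed = []
    for value in values:
        categories = list(transition[value])
        weights = [transition[value][c] for c in categories]
        perturbed.append(rng.choices(categories, weights=weights, k=1)[0])
    return perturbed

# Hypothetical transition matrix: each value is kept with probability 0.8.
transition = {
    "low":  {"low": 0.8, "mid": 0.1, "high": 0.1},
    "mid":  {"low": 0.1, "mid": 0.8, "high": 0.1},
    "high": {"low": 0.1, "mid": 0.1, "high": 0.8},
}

values = ["low", "mid", "high", "mid"]
perturbed = pram(values, transition, random.Random(0))
```

Because the transition matrix is known, aggregate statistics can in principle be corrected for the perturbation, while individual records become uncertain.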
CASC: t-ARGUS
  • Hierarchical tables
  • Linked tables
  • Optimal solution vs. heuristics
  • Different input formats
CASC team
  • Statistics Netherlands
  • Istat (Italy)
  • ONS, Univ. Southampton, Manchester, London, Plymouth (UK)
  • Bundesamt, IAB (Germany)
  • Stat. Catalunya, Univ Tenerife (Spain)
  • Anco Hundepool
  • Statistics Netherlands
  • PO box 4000
  • 2200 JM Voorburg
  • The Netherlands
  • email
  • fax: +31 70 3375990
  • phone: +31 70 3375038