
Cell suppression in linked tables from structural business statistics using Tau-Argus 3.3.0: a conceptual framework

This paper presents a conceptual framework for cell suppression in linked tables from structural business statistics using Tau-Argus 3.3.0. It discusses the motivation, the need for a scheme to cope with the problem, and the steps involved in the protection process of the tables.


Presentation Transcript


  1. Cell suppression in linked tables from structural business statistics using Tau-Argus 3.3.0: a conceptual framework. Alessandra Capobianchi and Luisa Franconi, Istat, Division for Information Technology and Methodology, Italy. NTTS 2009, Brussels, 18-20 February 2009

  2. What are linked tables? Tables that present data on the same response variable and share some categories of at least one explanatory variable are called “linked tables”. Such an explanatory variable is called the “linked variable”.

  3. Motivation
     - Eurostat: Until now, Eurostat was in charge of protecting the tables requested by the SBS Regulations and performed a global confidentiality treatment. From 2008 Eurostat will no longer treat the confidentiality of the SBS tables, and each NIS has to take care of the protection process on its own.
     - SBS tables: Community Structural Business Statistics (SBS) are a set of hierarchical linked tables whose spanning variables present different levels of the hierarchy in different tables.
     - Software: Tau-Argus version 3.3.0 is available at the website of the ESSnet project, http://neon.vb.cbs.nl/casc/..%5Ccasc%5Ctau.htm . Currently it does not deal with hierarchical linked tables that present different levels of the hierarchy.
     - Need: to develop a scheme to cope with this problem.

  4. Community Structural Business Statistics
     • Community Structural Business Statistics (SBS) are collected within the framework of Council Regulation (EC, EURATOM) No. 58/97 of December 1996.
     • Definitions and table breakdowns are specified in a series of Commission and Council Regulations.
     • We focus on the first four annexes, covering the “business economy” (Annex 1), industry (Annex 2), distributive trades (Annex 3) and construction (Annex 4).
     • Aim: achieve the protection of the Structural Business Statistics deriving from these annexes.

  5. Conceptual scheme. The protection process of the SBS linked tables can be divided into three steps:
     1. Translate the legal framework into a set of tables
        • create a set of tables for Argus
     2. Analyse the links between tables
        • establish an order in the protection of the tables
     3. Apply Tau-Argus to each table according to the order previously established
        • maintain coherence in the suppression pattern

  6. 1. Translation. The tables we focus on are those related to:
     • annual enterprise statistics (at 4-digit NACE code)
     • annual enterprise statistics (at 3-digit NACE code) by size class
     • annual enterprise statistics (at 2-digit NACE code) by region (NUTS2)
     The main statistical unit is the enterprise, although some statistics are also produced for KAUs and for local units:
     • Enterprise: the smallest combination of legal units that is an organisational unit producing goods or services
     • KAU: kind-of-activity unit, which groups all the parts of an enterprise contributing to the performance of an activity at the class level (four digits) of NACE
     • Local unit: an enterprise or part thereof (e.g. a workshop, factory, warehouse, office, mine or depot) situated in a geographically identified place

  7. 1. Translation: some peculiarities. The general scheme comprising three types of tables nevertheless presents some relevant differences:
     • the first table is replicated with KAU as response unit;
     • in the second table the variable size class has two different classifications for different sectors (C-F and G-K);
     • for sector G:
       • the regional table is released at NACE 3 level instead of NACE 2;
       • only for this sector is there an additional table relating to NACE 3 by turnover class.

  8. 1. Translation: the set of tables. The tables considered in the general scheme need to be split into several tables that are homogeneous in the level of the classifying variables and in the response unit. Spanning variables are then defined for each table to be processed by Argus, so that the SBS regulations are fulfilled.

  9. 2. Analysis of the Set of Linked Tables. Analysing the hierarchy levels of the “linked variable” yields a scheme of relationships that fixes the order in which the tables are processed, from the most detailed level of the hierarchy to the most aggregated. The reason is that the more detailed cells of one table contribute to the marginal cells of other tables that present the linked variable at a less detailed level. Common cells need to present a coherent suppression pattern.
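
As a minimal illustration of why the detailed table has to be processed first, the sketch below rolls 4-digit NACE cells up into the 3-digit totals that appear as marginal cells in a less detailed table. The codes and values are invented for illustration, and aggregating by truncating the code string is an assumption about how the hierarchy is encoded, not something stated in the slides.

```python
from collections import defaultdict

def aggregate_to_level(cells, digits):
    """Sum cell values over the first `digits` characters of each NACE code."""
    totals = defaultdict(float)
    for code, value in cells.items():
        totals[code[:digits]] += value
    return dict(totals)

# Hypothetical 4-digit cells (code -> value of the response variable).
tab_4digit = {"1511": 120.0, "1512": 30.0, "1520": 45.0}

# Each 3-digit total is built from the 4-digit cells below it, so a
# suppression in the detailed table constrains the matching marginal
# cell in the tables processed later.
tab_3digit = aggregate_to_level(tab_4digit, 3)
print(tab_3digit)
```

If a 4-digit cell is suppressed, the 3-digit total it feeds into is exactly the kind of common cell whose status has to stay coherent across tables.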

  10. 2. Analysis of the Set of Linked Tables. The most detailed table is Tab1.1. It covers all enterprises, classified by NACE class (4-digit codes). This table will be called the “starting table”.

  11. 2. Analysis of the Set of Linked Tables. The next table to be processed should present the hierarchical level of the “linked variable” immediately above that of the starting table. In the Community SBS there are two such tables, Tab2.1 and Tab2.2, which:
      - present the 3-digit NACE code as classifying variable;
      - relate to different sectors of the economy, with no link between the two tables;
      - present two different classifications of the variable size class, relative to different sectors of the NACE code.

  12. 2. Analysis of the Set of Linked Tables

  13. 2. Analysis of the Set of Linked Tables
      • The next level of the hierarchy of the “linked variable” is the 2-digit NACE code.
      • In the Community SBS, Tab3.1 presents this hierarchical level and relates to sectors C to K, excluding G, of the NACE classification.
      • Tab3.1 presents marginals derived from Tab2.1 and Tab2.2 for sectors C to K excluding G.
      • The response units are different, but each enterprise either coincides with a local unit or comprises more than one local unit.

  14. 2. Analysis of the Set of Linked Tables

  15. 2. Analysis of the Set of Linked Tables. Sector G presents some relevant differences:
      - The regional table is released at NACE 3 level instead of NACE 2; this is Tab2.3, which is linked only to Tab2.2.
      - There is an additional table released at NACE 3 level by turnover class; this is Tab2.4, linked to tab3.2.
      In addition, for sectors C-F the table at NACE 4 level is presented not only for enterprises but also for KAUs; this is Tab1.2. Tab1.1 and Tab1.2 coincide almost perfectly, so it has been decided to apply the suppression pattern of Tab1.1 to Tab1.2.

  16. 2. Analysis of the Set of Linked Tables

  17. 3. Protection phase: a priori information. The order generated by the analysis of the links between the tables, as described in the previous scheme, aims to identify common cells in subsequent tables. Common cells need to present a coherent suppression pattern. Tau-Argus allows the user to fix a priori information for selected cells. This flexibility can be used to impose coherent suppression patterns on a set of tables.

  18. 3. Protection phase: a priori information. This a priori information is organised in a history file. In the history file the user can, before the protection phase, assign one of the following protection statuses to predetermined cells of the table.

  19. 3. Protection of a Set of Linked Tables. The protection process using secondary cell suppression starts from Tab1.1, the most detailed table. This table is protected by Argus according to the rules established by the Member State. To carry the information on the protection of this starting table over to later runs of Argus, we ask the software to save the “status” of each single cell of the protected Tab1.1. Tau-Argus allows five different output statuses.

  20. 3. Protection of a Set of Linked Tables. The second step is to protect the second table of the scheme, Tab2.1. All the suppressions applied to the common marginal cells in the previous table, Tab1.1, have to be replicated in the current table, Tab2.1. This is done by creating a history file for the current table (Tab2.1) containing a priori information that imposes on Argus the constraints stemming from the protection of the previous table (Tab1.1). Different types of constraints may arise for each common cell.
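
A minimal sketch of this step, under stated assumptions: cells are identified by a NACE code string, the status labels and the mapping from output status to a priori status are hypothetical placeholders (the correspondence actually used is not reproduced in this transcript), and the records returned are plain tuples rather than the real Tau-Argus file layout.

```python
def build_history(prev_status, current_cells, status_map):
    """Return a priori records for cells shared by the previous and current table.

    prev_status   : dict cell key -> output status saved for the previous table
    current_cells : iterable of cell keys present in the current table
    status_map    : dict output status -> a priori status (assumed, user supplied)
    """
    history = []
    for key in current_cells:
        if key in prev_status:  # a common cell
            history.append((key, status_map[prev_status[key]]))
    return history

# Hypothetical labels, not Tau-Argus codes.
STATUS_MAP = {"suppressed": "force_suppress", "published": "keep_safe"}

# Hypothetical cells: "151" and "152" are marginals shared by both tables.
prev = {"151": "suppressed", "152": "published", "1511": "suppressed"}
current = ["151", "152", "153"]

h21 = build_history(prev, current, STATUS_MAP)
print(h21)
```

Cells that exist only in the current table (here "153") get no a priori record and are left for the software to decide.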

  21. 3. Protection of a Set of Linked Tables. Output status in the previous table, meaning and corresponding status to be applied in the a priori information for the current table.

  22. 3. Process is applied following the relationship scheme
      • Tab1.1
        • Apply Tau-Argus
        • Create .saf file
        • Convert the information contained in the .saf file into a priori information for Tab2.1 and Tab2.2
        • Select common cells (Tab1.1 with Tab2.1; Tab1.1 with Tab2.2) and create the “History files” (H2.1, H2.2)
      • Tab2.2
        • Apply History file H2.2
        • Apply Tau-Argus
        • Create .saf file
        • Convert the information contained in the .saf file into a priori information for Tab3.1
        • Select common cells (Tab2.2 with Tab3.1) and create the “History file” (H3.1_2)
      • Tab2.1
        • Apply History file H2.1
        • Apply Tau-Argus
        • Create .saf file
        • Convert the information contained in the .saf file into a priori information for Tab3.1
        • Select common cells (Tab2.1 with Tab3.1) and create the “History file” (H3.1_1)
      • Tab3.1
        • Apply History files H3.1_1 and H3.1_2
        • Apply Tau-Argus
        • Create .saf file
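
The ordering in this scheme can be sketched as a topological sort over the links between tables. The table and history file names come from the slide; the dictionaries and the `processing_plan` helper are hypothetical bookkeeping for illustration, and nothing here calls Tau-Argus itself.

```python
from graphlib import TopologicalSorter

# Each table maps to the tables whose saved status feeds its history file(s).
PREDECESSORS = {
    "Tab1.1": [],
    "Tab2.1": ["Tab1.1"],
    "Tab2.2": ["Tab1.1"],
    "Tab3.1": ["Tab2.1", "Tab2.2"],
}

# History file to apply for each (previous table, current table) link.
HISTORY_FILES = {
    ("Tab1.1", "Tab2.1"): "H2.1",
    ("Tab1.1", "Tab2.2"): "H2.2",
    ("Tab2.1", "Tab3.1"): "H3.1_1",
    ("Tab2.2", "Tab3.1"): "H3.1_2",
}

def processing_plan(predecessors, history_files):
    """Return (table, [history files to apply]) pairs in a valid processing order."""
    order = TopologicalSorter(predecessors).static_order()
    return [
        (table, [history_files[(prev, table)] for prev in predecessors[table]])
        for table in order
    ]

plan = processing_plan(PREDECESSORS, HISTORY_FILES)
for table, histories in plan:
    print(table, histories)
```

Any order that puts Tab1.1 first and Tab3.1 last is valid; Tab2.1 and Tab2.2 are not linked to each other, so their relative order does not matter.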

  23. Conclusions and Further Work
      • Conclusions
        • We describe a process to protect linked hierarchical tables from SBS using Tau-Argus.
        • We have successfully applied the process to the Italian sample of SBS.
      • Further work
        • With the entry into force of the new SBS regulation, which brings changes due to the adoption of the new classification of economic activities, NACE Rev. 2, more work needs to be done.
        • Protection patterns harmonised between subsequent years need to be studied and carefully tuned, so that coherence is maintained not only within a year but also across successive years.
