
Harri Lehtinen (harri.lehtinen@stat.fi)

Dissemination of Statistical Data, Publications and Metadata - Process Based on Common Structure of Statistical Information (CoSSI)


Presentation Transcript


  1. Dissemination of Statistical Data, Publications and Metadata - Process Based on Common Structure of Statistical Information (CoSSI). Harri Lehtinen (harri.lehtinen@stat.fi)

  2. CoSSI (Common Structure of Statistical Information)
     • The point of departure for CoSSI was an infological analysis of the information being considered.
     • The analysis concluded that, although in practice the definition of statistical information has varied with the situation and application, statistical information does have a universal structure that can be simplified and generally accepted.
     • CoSSI describes this general structure, which does not depend on the format in which the statistical information is presented.
     => CoSSI defines the structures of statistical data, metadata and publications.

  3. XML based dissemination - CoSSI (www.stat.fi/cossi)
     • Modules:
       • Document metadata
       • Statistical metadata
       • Processing metadata
       • Publications
     • Data:
       • Matrices (XDF)
       • Tables (CALS)
       • Sparse matrix (KEYS)

  4. Implementation
     • Modular DTD system
       • Document Type Definitions
     • Use of standards
       • CALS, XDF, Dublin Core...
     • Statistical matrix (statinfo_xdf.dtd): statmeta.dtd, docmeta.dtd, xdf.dtd
     • Statistical table (statinfo_cals.dtd): statmeta.dtd, docmeta.dtd, cals.dtd
     • Publications and documents (publication.dtd): docmeta.dtd, statmeta.dtd, statinfo_cals.dtd, figure.dtd...
     • XML
       • One XML file -> data and metadata
       • Multilingual documents
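The "one XML file -> data and metadata" idea, combined with multilingual content, can be sketched in Python with the standard library. The element names (`statinfo`, `docmeta`, `statmeta`, `data`, `title`, `variable`, `value`) and the sample values are assumptions made for illustration only; the CoSSI DTDs define their own element names, which the slides do not show.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the reserved XML namespace; ElementTree writes it as "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_statinfo(titles, variable_name, values):
    """Build one document holding document metadata, statistical metadata and data.

    All element names here are hypothetical, not taken from the CoSSI DTDs.
    """
    root = ET.Element("statinfo")

    # Document metadata: one title element per language version.
    docmeta = ET.SubElement(root, "docmeta")
    for lang, text in titles.items():
        title = ET.SubElement(docmeta, "title")
        title.set(XML_LANG, lang)
        title.text = text

    # Statistical metadata: here only the variable name.
    statmeta = ET.SubElement(root, "statmeta")
    ET.SubElement(statmeta, "variable", name=variable_name)

    # Data part: one element per value.
    data = ET.SubElement(root, "data")
    for value in values:
        ET.SubElement(data, "value").text = str(value)
    return root


def title_in(root, lang):
    """Return the title for the requested language, or None if it is absent."""
    for title in root.iter("title"):
        if title.get(XML_LANG) == lang:
            return title.text
    return None


# Illustrative values only.
doc = build_statinfo(
    {"en": "Disposable income", "fi": "Käytettävissä olevat tulot"},
    "income",
    [10, 20],
)
xml_text = ET.tostring(doc, encoding="unicode")
```

Keeping every language version in the same file means a consumer selects a language at read time, for example with `title_in(doc, "fi")`, instead of maintaining one file per language.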

  5. Metadata
     • Statistical metadata
       • Information vital for the interpretation of numerical statistical information
     • Document metadata: information about
       • The producer of the document
       • The document's content
     • Processing metadata
       • Information that software needs to process the data

  6. Statistical metadata: content model of statistical metadata
     (Diagram; labels in slide order)
     • Document metadata
     • Statistical metadata
     • Variable name
     • Concept definition
     • Operational definition
     • Description
     • Calculation formula
     • Measurement unit
     • Classification
     • ID
     • Type
     • Author
     • Date
     • Values
     • Figure
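The content model on this slide can be read as a record per variable. The following is a minimal sketch under the assumption that ID, Type, Author, Date and Values belong to the classification; the flattened slide does not make that nesting explicit, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Classification:
    # Assumed grouping: these five labels follow "Classification" on the slide.
    id: str
    type: str = ""
    author: str = ""
    date: str = ""
    values: List[str] = field(default_factory=list)


@dataclass
class VariableMetadata:
    # Field names mirror the slide labels; the Python names are hypothetical.
    name: str
    concept_definition: str = ""
    operational_definition: str = ""
    description: str = ""
    calculation_formula: str = ""
    measurement_unit: str = ""
    classification: Optional[Classification] = None

    def missing_fields(self) -> List[str]:
        """List the text fields that are still empty, in declaration order."""
        text_fields = [
            "concept_definition",
            "operational_definition",
            "description",
            "calculation_formula",
            "measurement_unit",
        ]
        return [name for name in text_fields if not getattr(self, name)]


# Illustrative values only; the unit is made up for the example.
income = VariableMetadata(name="Disposable income", measurement_unit="EUR")
```

A check such as `income.missing_fields()` is one simple way to flag variables whose metadata is incomplete before publication.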

  7. Document metadata: content model of document metadata
     • Creator: Person
     • Publisher: Organisation
     • Contributor: Person
     • Date: Published, modified
     • Language: Main and other language
     • Document information: SVT and Category
     • Identifier: URN, URL, ISBN, ISSN, DOI, Number
     • Subject: Keywords
     • Content description
     • Type
     • Format
     • Rights
     • Coverage
     • Relations
     • Source
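Several labels on this slide (Creator, Publisher, Contributor, Date, Language, Identifier, Subject, and so on) match element names in the Dublin Core element set, which the Implementation slide lists among the standards used. Assuming the document metadata can be expressed with those elements, a serialisation might look like the sketch below. The wrapper element `docmeta` and the sample values are assumptions; the slides do not show how CoSSI itself encodes these fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Serialise a mapping of Dublin Core element name -> list of values.

    The wrapper element name "docmeta" is hypothetical.
    """
    root = ET.Element("docmeta")
    for name, values in fields.items():
        for value in values:
            # Repeatable elements (e.g. language) get one element per value.
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


# Illustrative values only.
docmeta = to_dublin_core(
    {
        "creator": ["Harri Lehtinen"],
        "language": ["fi", "en"],
        "subject": ["income"],
    }
)
docmeta_xml = ET.tostring(docmeta, encoding="unicode")
```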

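  8. Statistical data: content model of statistical data matrix
     The data matrix has one row per statistical unit (a_1, ..., a_i, ..., a_n) and one column per variable (x_1, x_2, ..., x_j, ..., x_p):

     $$
     \begin{pmatrix}
     x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1p} \\
     x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2p} \\
     \vdots & \vdots &        & \vdots &        & \vdots \\
     x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{ip} \\
     \vdots & \vdots &        & \vdots &        & \vdots \\
     x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{np}
     \end{pmatrix}
     $$

     • Title
     • Document metadata
     • Statistical metadata
     • Processing metadata
     • Statistical data matrix (XDF): Variables, Class values, Statistical units, Footnotes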
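The data matrix on this slide, with statistical units as rows and variables as columns, maps naturally onto a small container type. This is a sketch only: the class name, the method names and the sample values are assumptions, and it does not reproduce the XDF encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataMatrix:
    units: List[str]                    # statistical units a_1 .. a_n (rows)
    variables: List[str]                # variables x_1 .. x_p (columns)
    cells: List[List[Optional[float]]]  # cells[i][j] holds x_ij; None = missing

    def __post_init__(self):
        # The matrix must be n x p: one row per unit, one cell per variable.
        if len(self.cells) != len(self.units):
            raise ValueError("one row per statistical unit is required")
        for row in self.cells:
            if len(row) != len(self.variables):
                raise ValueError("every row needs one cell per variable")

    def value(self, unit, variable):
        """Return x_ij for the given statistical unit and variable."""
        i = self.units.index(unit)
        j = self.variables.index(variable)
        return self.cells[i][j]

    def column(self, variable):
        """Return all observations of one variable, in unit order."""
        j = self.variables.index(variable)
        return [row[j] for row in self.cells]


# Illustrative values only.
matrix = DataMatrix(
    units=["a1", "a2"],
    variables=["x1", "x2", "x3"],
    cells=[[1.0, 2.0, 3.0], [4.0, None, 6.0]],
)
```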

  9. Statistical table: content model of statistical table
     • Table title
     • Document metadata
     • Statistical metadata
     • Processing metadata
     • Statistical table (CALS): Column headings, Row headings, Numerical data, Table footnotes

  10. Documents and publications
     (Diagram; labels in slide order)
     • Document
     • Document metadata
     • Document main title
     • Ingress
     • Introduction
     • Abstract
     • Headnote
     • Product specification
     • Chapters: Title
     • Sections: Title, Paragraphs
     • Summary
     • Footnotes
     • Bibliography
     • Appendix
     • Definition lists

  11. Paragraph
     • Paragraph
     • List (unordered / ordered)
     • Statistical table
     • Figure
     • Link
     • Footnote reference
     • Bibliographical reference
     • Emphasis

  12. Implementation for PC-Axis
     • Need for an XML format for PC-Axis
       • The CoSSI matrix format is close to the PC-Axis data format and also supports multilingual data
     • Processing metadata for PC-Axis (pxmeta)
     • Mapping of PC-Axis metadata to the statistical, document and processing metadata of the CoSSI model
     • Three data formats
       • Matrix (XDF)
       • Table (CALS)
       • Keys (PC-Axis)
       => but the same metadata for all formats!
     • Allows more metadata than the original PC-Axis format
     • Automatic conversion between data formats

  13. CoSSI for PC-Axis: the data part is in different formats but everything else stays the same
     • Matrix: Docmeta, Procmeta, Statmeta, Data -> XDF
     • Table: Docmeta, Procmeta, Statmeta, Data -> CALS
     • Keys: Docmeta, Procmeta, Statmeta, Data -> Keys
     Information is the same in all formats!
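The point that only the data part changes between formats can be illustrated with a dense matrix and a sparse keyed form. This is a sketch under stated assumptions: the dictionary layout, the function names, and the placement of `units` and `variables` inside `statmeta` are hypothetical, the keyed form is a generic sparse representation and not the actual PC-Axis Keys syntax, and the CALS table format is left out.

```python
def matrix_to_keys(units, variables, cells):
    """Dense matrix -> sparse mapping {(unit, variable): value}; None cells are dropped."""
    keys = {}
    for unit, row in zip(units, cells):
        for variable, cell in zip(variables, row):
            if cell is not None:
                keys[(unit, variable)] = cell
    return keys


def keys_to_matrix(units, variables, keys):
    """Sparse mapping -> dense matrix; cells without a key become None."""
    return [[keys.get((unit, variable)) for variable in variables] for unit in units]


def convert(doc, target_format):
    """Return a copy of doc whose data part is in target_format.

    docmeta, procmeta and statmeta are carried over untouched.
    """
    units = doc["statmeta"]["units"]
    variables = doc["statmeta"]["variables"]
    if doc["format"] == target_format:
        data = doc["data"]
    elif target_format == "keys":
        data = matrix_to_keys(units, variables, doc["data"])
    elif target_format == "matrix":
        data = keys_to_matrix(units, variables, doc["data"])
    else:
        raise ValueError(f"unsupported format: {target_format}")
    return {**doc, "format": target_format, "data": data}


# Illustrative document; all values are made up.
matrix_doc = {
    "docmeta": {"title": "Example"},
    "procmeta": {"decimals": 0},
    "statmeta": {"units": ["a1", "a2"], "variables": ["x1", "x2"]},
    "format": "matrix",
    "data": [[1, None], [None, 4]],
}
keys_doc = convert(matrix_doc, "keys")
round_trip = convert(keys_doc, "matrix")
```

Because the metadata is passed through unchanged, converting to the keyed form and back yields the original document, which is the property the slide describes as "information is the same in all formats".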

  14. Dissemination process: Office97
     (Process diagram; recoverable labels)
     • Statistical application: SuperStar to PX, SAS to PX, PX-templates
     • .PX files
     • PX-Edit: manual or batch processing (checking, edit metadata)
     • Metadata: statistical metadata, classifications, processing metadata
     • Automatic publishing (timer controlled)
     • Database services: PX-Web, PC-Axis tables
     • PX-Edit or PC-Axis: manual or batch processing (exclusion, save as Excel or txt)
     • Publication production (monthly & quarterly publications, publication tables...): PX-Edit, publication editor, Word, Excel,...
     • FastWeb (timer controlled): conversion to XHTML, conversion to PDF
     • Outputs: HTML, PDF, XLS
     • Web site: www.stat.fi

  15. What we need:
     • More and better metadata
     • Validation
     • Language versions
     • All information in a single file
     • Archiving
     • Automatic conversion to different dissemination channels
     • Structured searches
     • SVG
     • Vendor-free solution
     • The ability to add new dissemination channels

  16. XML based dissemination process: XML and PC-Axis
     (Process diagram; recoverable labels)
     • Statistical application: PX-Edit -> PX & CoSSI, SuperStar -> PX & CoSSI, SAS -> PX & CoSSI
     • .PX files
     • Metadata: statistical metadata, classifications, processing metadata
     • Publication editor: Arbortext (monthly & quarterly publications, publication tables...)
     • eXist XML database
     • Publishing and preview
     • Conversion
     • Dissemination database
     • Database services: PX-Web (PC-Axis tables)
     • FastWeb-XML
     • Outputs: PDF, HTML, RSS, SDMX
     • Web site: www.stat.fi
     • Printing house

  17. XML based dissemination process: integration completed
     (Process diagram; recoverable labels)
     • Statistical application: PX-Edit -> PX & CoSSI, SuperStar -> PX & CoSSI, SAS -> PX & CoSSI
     • .xml files
     • Metadata: statistical metadata, classifications, processing metadata
     • Publication editor: Arbortext (monthly & quarterly publications, publication tables...)
     • eXist XML database
     • Publishing and preview
     • Conversion
     • Dissemination database
     • Database services: PX-Web (matrices, PXML)
     • FastWeb-XML
     • Outputs: PDF, HTML, RSS, SDMX
     • Web site: www.stat.fi
     • Printing house

  18. XML Database and Statistical Information

  19. eXist XML database
     • Statistical metadata
     • Statistical publications
     • Statistics
     • Statistical tables

  20. Statistical publication in the Arbortext editor

  21. Statistical metadata for a variable in a table (screenshot: statistical metadata for the variable "Disposable income")

  22. HTML output of a statistical publication with statistical metadata (screenshot label: link to the statistical metadata)

  23. User interface for publishing and preview
