Structuring and Standardisation of Data of the DDK by Keywords


  • Keywords – essentials of data description – use of thesauri and indices to describe datasets

  • 1. NUTS

  • 2. NUTS_Danube

  • 3. GEMET

  • 4. ISO 19115 topics


  • Keywords are Metadata

  • Metadata is information about data.

  • Keywords are metadata used to index a subject or dataset by:

  • commonly used words

  • formalised words (descriptive terms of a controlled vocabulary)


Thesauri and Indices to Describe Data

A thesaurus is a controlled vocabulary consisting of descriptors and non-descriptors which have specified relations among each other.

Indices are more or less simple alphabetical lists of keywords.

Thesaurus databases, built according to international standards such as ISO 2788 and ISO 5964, are generally arranged hierarchically by themes and topics.

Thesauri and indices are used to identify and locate records of a certain field of expertise labeled with these words.

The prime purpose of a thesaurus is to provide control over the terms (vocabulary) used for titling and indexing datasets.

Using keywords to describe a dataset helps us to retrieve the “right” data from a large database.
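The idea of descriptors and non-descriptors can be illustrated with a minimal sketch. The vocabulary, the dataset names and the function names below are hypothetical and are not taken from the DDK; the sketch only shows how non-descriptors (synonyms) are mapped to a preferred descriptor before datasets are indexed.

```python
# Hypothetical mini-thesaurus (not DDK content): descriptors are the
# preferred terms, non-descriptors are synonyms pointing to a descriptor.
DESCRIPTORS = {"river", "groundwater", "soil"}
NON_DESCRIPTORS = {
    "stream": "river",
    "watercourse": "river",
    "ground water": "groundwater",
}


def normalise_keyword(word):
    """Return the descriptor for a word, or None if it is not in the vocabulary."""
    term = word.strip().lower()
    if term in DESCRIPTORS:
        return term
    return NON_DESCRIPTORS.get(term)


def index_datasets(datasets):
    """Build an index mapping each descriptor to a sorted list of dataset names."""
    index = {}
    for name, keywords in datasets.items():
        for word in keywords:
            descriptor = normalise_keyword(word)
            if descriptor is not None:
                index.setdefault(descriptor, set()).add(name)
    return {descriptor: sorted(names) for descriptor, names in index.items()}


# Hypothetical datasets labelled with free-form keywords.
datasets = {
    "danube_gauges": ["Stream", "groundwater"],
    "soil_map": ["soil", "watercourse"],
}
index = index_datasets(datasets)
print(index["river"])  # both datasets are found under the descriptor "river"
```

Because every synonym resolves to one descriptor, a search for "river" also finds datasets that were labelled "stream" or "watercourse".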


  • Thesauri and indices used within the DDK

  • In general, two types of thesauri/indices are used to label data within the DDK:

  • A gazetteer to describe data by geographic identifiers (region, commune, place):

  • 1.    NUTS

  • 2.    NUTS_Danube

  • A technical thesaurus or index to describe data by technical terms:

  • 3.    GEMET thesaurus

  • 4.    Topic Category (ISO 19115)


Thesauri and indices used within the DDK

NUTS

NUTS (French: Nomenclature des unités territoriales statistiques) was established by Eurostat to provide a single, uniform breakdown of territorial units for the production of regional statistics for the European Union. NUTS is a multilingual index: region names are given in the language of the respective country. Project members should, wherever possible, reference their data to the NUTS regions.


  • Levels of NUTS

  • Three levels of NUTS are defined, with two levels of local administrative units (LAU 1 and LAU 2) below them, historically called NUTS levels 4 and 5. For the DDK, NUTS levels 0 to 3 are implemented. Note that not all countries have every level of division.

    • NUTS 0 – country

    • NUTS 1 – large regions, parts of a country / 3 to 7 million inhabitants

    • NUTS 2 – regions, groups of counties / 800,000 to 3,000,000 inhabitants

    • NUTS 3 – smaller regions, county or city / 150,000 to 800,000 inhabitants


Levels of NUTS

Classification of NUTS regions of the DDK members:


Levels of NUTS

Classification of the NUTS regions of Slovakia (SK, SLOVENSKA REPUBLIKA):

  • SK0 SLOVENSKA REPUBLIKA

    • SK01 Bratislavsky kraj

      • SK010 Bratislavsky kraj

    • SK02 Zapadne Slovensko

      • SK021 Trnavsky kraj

      • SK022 Trenciansky kraj

      • SK023 Nitriansky kraj

    • SK03 Stredne Slovensko

      • SK031 Zilinsky kraj

      • SK032 Banskobystricky kraj

    • SK04 Vychodne Slovensko

      • SK041 Presovsky kraj

      • SK042 Kosicky kraj

  • SKZ EXTRA-REGIO

    • SKZZ Extra-Regio

      • SKZZZ Extra-Regio
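The code structure in the list above can be read mechanically. The following is a minimal sketch, assuming the usual Eurostat convention that a NUTS code is a two-letter country code followed by one character per level, so that "SK" is level 0, "SK0" level 1, "SK02" level 2 and "SK021" level 3. The function names are illustrative only.

```python
def nuts_level(code):
    """Return the NUTS level of a code: country prefix plus one character per level."""
    if not 2 <= len(code) <= 5 or not code[:2].isalpha():
        raise ValueError(f"not a NUTS code: {code!r}")
    return len(code) - 2


def nuts_ancestors(code):
    """Return all parent codes, from the country (level 0) to the direct parent."""
    return [code[:n] for n in range(2, len(code))]


print(nuts_level("SK021"))      # Trnavsky kraj is a level 3 region
print(nuts_ancestors("SK021"))  # country, level 1 and level 2 parents
```

Storing only the most detailed code with a dataset is therefore enough: the coarser regions it belongs to can always be derived from the prefix.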


Thesauri and indices used within the DDK

NUTS_Danube

The “NUTS_Danube” is a gazetteer that follows the Eurostat NUTS framework for levels 1 to 3. The NUTS_Danube of the ARGE DONAULÄNDER covers those project member countries that are not included in the Eurostat NUTS, such as Serbia, Croatia or Moldova. The NUTS framework is especially important for the general scheme “Economy”, which uses and cites statistical data for the respective regions. Unlike the multilingual NUTS, NUTS_Danube uses English. All project members not represented in the Eurostat NUTS should use NUTS_Danube instead.


NUTS_Danube

Regions of the NUTS_Danube:

  • Croatia

    • HR2 – Eastern Croatia

  • Moldova

    • MD3 – South Moldova

  • Serbia

    • YU2 – Middle Serbia

    • YU4 – East Serbia

    • YU6 – Vojvodina

  • Ukraine

    • Part of Odessa
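The rule "use NUTS where Eurostat covers the country, otherwise NUTS_Danube" can be sketched as a small lookup. The region codes are copied from the list above. Treating the first two letters of a code as the country key, and the function names, are assumptions of this sketch, not part of the DDK specification.

```python
# Region codes taken from the NUTS_Danube list; keys are the two-letter
# prefixes of those codes (an assumption of this sketch).
NUTS_DANUBE = {
    "HR": {"HR2": "Eastern Croatia"},
    "MD": {"MD3": "South Moldova"},
    "YU": {"YU2": "Middle Serbia", "YU4": "East Serbia", "YU6": "Vojvodina"},
}


def gazetteer_for(region_code):
    """Name the gazetteer that should be used for a given region code."""
    prefix = region_code[:2].upper()
    return "NUTS_Danube" if prefix in NUTS_DANUBE else "NUTS"


def region_name(region_code):
    """Return the English NUTS_Danube name, or None for codes outside it."""
    code = region_code.upper()
    return NUTS_DANUBE.get(code[:2], {}).get(code)


print(gazetteer_for("YU6"), region_name("YU6"))
print(gazetteer_for("SK021"), region_name("SK021"))
```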


Thesauri and indices used within the DDK

GEMET - GEneral Multilingual Environmental Thesaurus

GEMET, the GEneral Multilingual Environmental Thesaurus, has been developed as an indexing, retrieval and control tool for the European Topic Centre on Catalogue of Data Sources (ETC/CDS) and the European Environment Agency (EEA), Copenhagen. The work has been carried out under a contract between the EEA and the ETC/CDS. The ETC/CDS is led by the Ministry of the Environment of Lower Saxony, includes members from Germany, Austria, Italy and Sweden, and benefits from the collaboration of other member countries of the European Union (EU) as well as of UNEP Infoterra.

The basic idea behind the development of GEMET was to use the best of the excellent multilingual thesauri already available, in order to save time, energy and funds. GEMET was conceived as a "general" thesaurus, aimed at defining a common general language, a core of general terminology for the environment.

The GEMET thesaurus covers all the topics within the Danube Data Base, so it is an excellent tool for describing data by technical keywords.


Hierarchical listing of the GEMET thesaurus:


NATURAL ENVIRONMENT, ANTHROPIC ENVIRONMENT

ANTHROPOSPHERE (built environment, human settlements, land setup)

ATMOSPHERE (air, climate)

BIOSPHERE (organisms, ecosystems)

ENVIRONMENT (natural environment, anthropic environment)

HYDROSPHERE (freshwater, marine water, waters)

LAND (landscape, geography)

LITHOSPHERE (soil, geological processes)

SPACE

TIME (chronology)


Hierarchical listing of the GEMET thesaurus:

ACCESSORY LISTS

FUNCTIONAL TERMS

GENERAL TERMS

HUMAN ACTIVITIES AND PRODUCTS, EFFECTS ON THE ENVIRONMENT  

AGRICULTURE, FORESTRY; ANIMAL HUSBANDRY; FISHERY

CHEMISTRY, SUBSTANCES, PROCESSES

EFFECTS, IMPACTS

ENERGY

INDUSTRY, CRAFTS; TECHNOLOGY; EQUIPMENTS

PHYSICAL ASPECTS, NOISE, VIBRATIONS, RADIATIONS

PRODUCTS, MATERIALS

RECREATION, TOURISM

RESOURCES (utilisation of resources)

TRADE, SERVICES

TRAFFIC, TRANSPORTATION

WASTES, POLLUTANTS, POLLUTION


Hierarchical listing of the GEMET thesaurus:

SOCIAL ASPECTS, ENVIRONMENTAL POLICY MEASURES

ADMINISTRATION, MANAGEMENT, POLICY, POLITICS, INSTITUTIONS, PLANNING

ECONOMICS, FINANCE

ENVIRONMENTAL POLICY

HEALTH, NUTRITION

INFORMATION, EDUCATION, CULTURE, ENVIRONMENTAL AWARENESS

LEGISLATION, NORMS, CONVENTIONS

RESEARCH, SCIENCES

RISKS, SAFETY

SOCIETY
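The listing above can be held as a simple two-level structure. The sketch below includes only a few of the groups. Reading the first entry of each slide as the super-group of the entries that follow it is this sketch's interpretation of the listing, and the function name is hypothetical.

```python
# A subset of the listing above: super-group -> groups.
# The nesting is this sketch's reading of the slides.
GEMET_GROUPS = {
    "NATURAL ENVIRONMENT, ANTHROPIC ENVIRONMENT": [
        "ATMOSPHERE", "BIOSPHERE", "HYDROSPHERE", "LITHOSPHERE",
    ],
    "HUMAN ACTIVITIES AND PRODUCTS, EFFECTS ON THE ENVIRONMENT": [
        "ENERGY", "RECREATION, TOURISM", "TRAFFIC, TRANSPORTATION",
    ],
    "SOCIAL ASPECTS, ENVIRONMENTAL POLICY MEASURES": [
        "ECONOMICS, FINANCE", "HEALTH, NUTRITION", "SOCIETY",
    ],
}


def super_group_of(group):
    """Return the super-group a group belongs to, or None if it is unknown."""
    wanted = group.strip().upper()
    for super_group, groups in GEMET_GROUPS.items():
        if wanted in groups:
            return super_group
    return None


print(super_group_of("hydrosphere"))
```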


Topic Category (ISO 19115)

Topic Category is an enumeration of “main themes” within ISO 19115. At least one item of the Topic Category must be used to describe a dataset.

A total of 19 topic categories are available to describe a dataset.

The lower list shows a selection of Topic Categories:
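The slide's own selection is not reproduced in this transcript. As a hedged sketch, the rule "at least one topic category per dataset" could be checked as follows. The 19 code names are given as recalled from the MD_TopicCategoryCode list of ISO 19115:2003 and should be verified against the standard before use. The function name is hypothetical.

```python
# Topic category codes as recalled from ISO 19115:2003
# (MD_TopicCategoryCode); verify against the standard before relying on them.
TOPIC_CATEGORIES = frozenset({
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "society",
    "structure", "transportation", "utilitiesCommunication",
})


def check_topic_categories(categories):
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    if not categories:
        problems.append("at least one topic category is required")
    for category in categories:
        if category not in TOPIC_CATEGORIES:
            problems.append(f"unknown topic category: {category}")
    return problems


print(check_topic_categories(["inlandWaters", "environment"]))
print(check_topic_categories([]))
```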


  • Structuring and Standardization of Data in the DDK by ISO Standard

  • What is ISO 19115?

  • 1. General explanation of ISO

  • 2. Metadata – description of DDK datasets

  • 3. Which metadata have to be collected within the DDK database (ISO conformity)


What is the ISO?

ISO (International Organization for Standardization) is a global network that identifies what International Standards are required by business, government and society.

ISO – a non-governmental organization – is a federation of the national standards bodies of 157 countries, one per country, from all regions of the world, including developed, developing and transitional economies.

ISO develops required standards in partnership with the sectors that will put them to use, adopts them by transparent procedures based on national input and delivers them to be implemented worldwide.

ISO has a current portfolio of 16 077 standards; one of them is ISO 19115, Geographic information – Metadata.


  • ISO 19115 (2003)

  • The objective of ISO 19115, Geographic information – Metadata, is to provide uniform rules for describing digital geographic data.

  • This standard is intended for information system analysts, program planners and developers of geographic information systems, as well as others who need to understand the basic principles and overall requirements of the standardization of geographic information.

  • ISO 19115 will:

  • Provide data producers with appropriate information to characterize their geographic data properly (for example, members of the DDK should describe their geographic data according to the ISO standard).

  • Facilitate the organization and management of metadata for geographic data.

  • Enable users to apply geographic data in the most efficient way by knowing its basic characteristics.

  • Facilitate data discovery, retrieval and reuse. Users will be better able to locate, access, evaluate, purchase and utilize geographic data.

  • Enable users to determine whether geographic data in a holding will be of use to them.


Structure of ISO 19115

ISO 19115 defines an extensive set of metadata elements; typically only a subset of the full number of elements is used.

However, it is essential that a basic minimum number of metadata elements be maintained for a dataset. These basic metadata are grouped together as “core metadata”.

A community may define its own metadata profile, such as the “UGV-profile” of the Bavarian State Ministry of the Environment, Health and Consumer Protection.

Each profile should contain the core metadata.

Metadata Model according to ISO


  • Metadata

  • Metadata is information about data, describing the what, when, who and how of data. Spatial metadata incorporates additional information about the where component of data.

  •  WHO created the data

  •  WHAT the content of the data is

  •  WHEN the data was created

  •  WHERE the data is stored

  •  HOW the data was developed

  •  Of WHICH quality the data is

  • Metadata can exist for datasets, web services and project information such as the DDK-project.

  • The UOK (UmweltObjektKatalog) is the metadata database of the Bavarian State Ministry of the Environment, Health and Consumer Protection. It refers to data within the Ministry's field of activity (e.g. documents, spatial information, services).


ISO 19115 – core data

To describe data in detail, all core elements should be specified. The mandatory metadata are the minimum level of data description. On their own they do not meet the requirements for geographic information and serve only as general information.

The above table lists core metadata elements which are required to describe a dataset.

Within the core metadata elements there are three obligation levels:

M: element is mandatory

C: element is mandatory under certain conditions

O: element is optional
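The three obligation levels can be turned into a simple completeness check. The table of core elements is not reproduced in this transcript, so the element names below are an illustrative subset chosen for this sketch, and the condition attached to the "C" element is hypothetical.

```python
# Illustrative subset of core elements with their obligation (M, C or O).
# The names and the condition are assumptions, not the DDK table.
CORE_ELEMENTS = {
    "title": "M",
    "reference_date": "M",
    "topic_category": "M",
    "abstract": "M",
    "geographic_location": "C",
    "spatial_resolution": "O",
}

# Hypothetical condition: the location is required for spatial datasets.
CONDITIONS = {
    "geographic_location": lambda record: record.get("is_spatial", False),
}


def missing_elements(record):
    """List the required elements that are absent or empty in a record."""
    missing = []
    for name, obligation in CORE_ELEMENTS.items():
        required = obligation == "M" or (
            obligation == "C" and CONDITIONS[name](record)
        )
        if required and not record.get(name):
            missing.append(name)
    return missing


record = {"title": "Danube gauges", "abstract": "Water levels", "is_spatial": True}
print(missing_elements(record))
```

Optional elements never appear in the result, and a conditional element is reported only when its condition holds for the record.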


ISO 19115 – core data

Metadata should be collected by DDK project members.

Metadata outside of the colored box are acquired by the Bavarian State Ministry.


EPSG Codes

The European Petroleum Survey Group (EPSG) was a scientific organisation linked to the European petroleum industry. It consisted of specialists working in applied geodesy, surveying and cartography related to oil exploration.

EPSG compiled and disseminated the EPSG Geodetic Parameter Set, a widely used database of Earth ellipsoids, datums, geographic and projected coordinate systems, and units of measurement.

The role of EPSG was taken over in 2005 by the newly formed OGP Surveying and Positioning Committee (OGP: International Association of Oil & Gas Producers).


  • EPSG Codes

  • EPSG codes of the Donauländer project members:

  • Romania:

    • EPSG:31600 (Stereo_33)

    • EPSG:31700 (Stereo_70)

  • Hungary:

    • EPSG:23700 (Hungarian_1972_Egyseges_Orszagos_Vetuleti)

  • Czech Republic:

    • EPSG:28462 (Pulkovo_1942_GK_Zone_2N)

    • EPSG:20002 (Pulkovo_1995_GK_Zone_2)

  • Yugoslavia:

    • EPSG:31176 (MGI_Balkans_6)

    • EPSG:31177 (MGI_Balkans_7)

  • Reference System Identifier (default setting in the DDK):

    • EPSG:31467 (DHDN_3_Degree_Gauss_Zone_3)

    • EPSG:31468 (DHDN_3_Degree_Gauss_Zone_4)

    • EPSG:31469 (DHDN_3_Degree_Gauss_Zone_5)

    • EPSG:4326 (GCS_WGS_1984)
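A reference system identifier supplied with a dataset can be checked against the DDK default settings listed above. This is a minimal sketch using only the Python standard library; the accepted spellings ("EPSG:31467" or "EPSG 4326") and the function names are assumptions of the sketch.

```python
import re

# DDK default reference systems, copied from the list above.
DDK_DEFAULT_EPSG = {
    31467: "DHDN_3_Degree_Gauss_Zone_3",
    31468: "DHDN_3_Degree_Gauss_Zone_4",
    31469: "DHDN_3_Degree_Gauss_Zone_5",
    4326: "GCS_WGS_1984",
}


def parse_epsg(identifier):
    """Extract the numeric code from strings such as 'EPSG:31467' or 'EPSG 4326'."""
    match = re.fullmatch(r"\s*EPSG[:\s]\s*(\d+)\s*", identifier, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an EPSG identifier: {identifier!r}")
    return int(match.group(1))


def is_ddk_default(identifier):
    """True if the identifier is one of the DDK default reference systems."""
    return parse_epsg(identifier) in DDK_DEFAULT_EPSG


print(is_ddk_default("EPSG:31468"))  # a DDK default system
print(is_ddk_default("EPSG:23700"))  # the Hungarian system is not a default
```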

