1st Workshop of COST Action C21: "Ontologies for Urban Development: Interfacing Urban Information Systems". Building an Address Gazetteer on top of an Urban Network Ontology. J.Nogueras-Iso, F.J.López, J.Lacasta, F.J.Zarazaga-Soria, P.R.Muro-Medrano. Geneva, 6-7 November 2006.
1st Workshop of COST Action C21: "Ontologies for Urban Development: Interfacing Urban Information Systems"

Building an Address Gazetteer on top of an Urban Network Ontology

J.Nogueras-Iso, F.J.López, J.Lacasta, F.J.Zarazaga-Soria, P.R.Muro-Medrano

Geneva, 6-7 November 2006

Outline
  • 1. Introduction
  • 2. A typical use-case: IDEZar
  • 3. Ontology building using a manual mapping
  • 4. Ontology building using an automated approach
  • 5. Conclusions
1. Introduction
  • The increasing relevance of geographic information for decision-making and resource management in diverse areas promoted the creation of Spatial Data Infrastructures (SDI)
  • SDI: a coordinated approach to technology, policies, standards, and human resources necessary for the effective acquisition, management, distribution and utilization of GI at different organization levels and involving both public and private institutions
  • Gazetteer Service
    • A typical component of an SDI
    • Directory of instances of a class or classes of features containing some information regarding position
    • Looks up geographic feature locations based on geographic identifiers
Address Gazetteer Service
  • In SDIs for local administrations such as a city council,
    • address gazetteer services represent one of the most important services that the councils must offer to their citizens
  • An Address Gazetteer Service
    • Specialized in Urban Network Features (addresses)
    • The councils are responsible for the management of urban networks, and these networks are used as reference information for other services at national level such as cadaster or census services
Creation of the contents of a gazetteer
  • It usually requires combining multiple repositories
    • The same feature (concept) is stored in different repositories, each contributing a different piece of attribute information
    • Typical problems of heterogeneity
      • Different data models (roles, granularity), encoding
  • Our proposal to deal with heterogeneity in this context:
    • Build an urban network ontology upon existing feature type taxonomies
2. A typical use-case: IDEZar
  • The IDEZar Project is the result of a collaboration agreement signed in March 2004 between the City Council and the University of Zaragoza
    • Zaragoza is a medium-sized city (some 650,000 inhabitants) in the northeast of Spain (capital of Aragón) that is growing fast in area and population. The municipality covers about 1000 km² and includes several towns
  • Objective: development of a local SDI for Zaragoza
    • To facilitate, increase and coordinate the use of spatial data by the Council
    • To develop applications for the citizens and to provide them with access to public sector information
IDEZar Service Architecture http://www.zaragoza.es/idezar/
  • SDI levels shown: IDEE (National SDI), IDEAr (Aragón – Regional SDI), IDEZar (Local SDI)
  • Services shown
    • <<WMS>> Base: street maps
    • <<WMS>> Base: orthoimages
    • <<WMS>> Environment-Thematic: Agenda 21, protected areas...
    • <<WMS>> Urban-Thematic: public services (libraries, police stations...), private services (pharmacies, parking...)
    • <<WMS>> IDEE-Base: base map up to 1:25000 of Spain
    • <<WCAS>> Catalog
    • <<Gazetteer>> IDEE-Nomenclátor: toponyms
    • <<Gazetteer>> Street names
    • <<Route planner>> Arriving at Zaragoza
  • Other elements shown: GeoPortal; Street Map and Gazetteer
Address related repositories
[Figure: repositories related to addresses, each labelled with the feature type taxonomy it uses. IDEZar (AYTO). Zaragoza City Council offices: Statistics Office (TVIAN), Informatics Office (AYTO), Tax Office (SIGLA), Urban Planning Office (AYTO, SIGLA). External organizations: National Statistics Institute (TVIAN), National Cadaster Office (SIGLA). Labelled data flows: address ranges, street types, street names, addresses, maps, electoral census, inhabitant census, property census, amendments (streets, addresses), site development updates, town planning updates, address updates.]
  • Multiple repositories
    • The data models are not very different
      • Feature = name + type + additional info (location, range, …)
    • But the taxonomies for urban network feature types differ
    • The repositories are not particularly well synchronized
Address related repositories
  • Statistics Office repository
    • Inhabitant/electoral census, exchanges from/to the National Statistics Institute
    • TVIAN (Tipo de Vía Normalizada, i.e. standardized street type): standardized network feature types of the National Statistics Institute
Address related repositories
  • Cadaster Office repository
    • Land/Tax management, exchanges from/to National Cadaster Office
    • SIGLA: network feature types of the Cadaster office
Address related repositories
  • Informatics Office repository
    • Central repository used for the assignment of new street names
    • AYTO: Network feature types of the council
Gazetteer content creation
  • Why do we need to combine all 3 repositories?
    • Not all features are in the 3 repositories
    • Attribute information is distributed in the different repositories
Gazetteer content creation II
  • Problems found while combining
    • Matching cannot be based solely on feature names
      • Two features may differ in type but not in name (e.g., Spain Square vs. Spain Avenue)
    • Which is the most appropriate feature type taxonomy for the gazetteer contents?
  • Solution proposed: define an urban network ontology
    • An ontology explicitly defines the concepts of a domain and the relations between them
    • This ontology will provide a unified model of the feature types that can be found in this domain
      • Making the necessary mappings to the particular taxonomies used in the different council offices or external organizations
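To illustrate why names alone are not a safe matching key, here is a minimal Python sketch that groups records first by name only and then by name plus a unified feature type. The records and the `UNIFIED_TYPE` mapping are hypothetical. Only the PL/PZ codes for squares and the AV code for avenues are taken from the slides.

```python
from collections import defaultdict

# Hypothetical records: (repository, street name, feature type code).
records = [
    ("AYTO", "España", "PL"),   # Spain Square, council taxonomy
    ("SIGLA", "España", "PZ"),  # Spain Square, cadaster taxonomy
    ("AYTO", "España", "AV"),   # Spain Avenue, council taxonomy
    ("SIGLA", "España", "AV"),  # Spain Avenue, cadaster taxonomy
]

# Assumed mapping from source-specific codes to a unified feature type.
# In the approach described above, this role is played by the ontology.
UNIFIED_TYPE = {
    ("AYTO", "PL"): "square",
    ("SIGLA", "PZ"): "square",
    ("AYTO", "AV"): "avenue",
    ("SIGLA", "AV"): "avenue",
}


def group_by_name(recs):
    """Group records by lower-cased name only (conflates square and avenue)."""
    groups = defaultdict(list)
    for repo, name, code in recs:
        groups[name.lower()].append((repo, code))
    return dict(groups)


def group_by_name_and_type(recs):
    """Group records by (name, unified feature type)."""
    groups = defaultdict(list)
    for repo, name, code in recs:
        key = (name.lower(), UNIFIED_TYPE[(repo, code)])
        groups[key].append((repo, code))
    return dict(groups)
```

Grouping by name alone yields a single group that mixes the square and the avenue, while the combined key separates them into two groups.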
How to build up the ontology
  • The construction of ontologies upon existing vocabularies is a classical and widely used approach
  • The underlying problem (ontology alignment)
    • How to find the relationships that hold between the entities represented in different taxonomies
  • Two approaches for the ontology construction
    • Manual mapping approach
    • Automated approach
3. Manual Mapping approach
[Figure: example mappings between the AYTO (City Council), SIGLA (Cadaster) and TVIAN taxonomies, shown as concepts and acronyms. Concepts shown: residential development, pedestrian street, pedestrian street segment, country house (south of Spain), square, minor road, street. Acronyms shown: "CL", "AN", "CN", "CM", "PZ", "PL", "CLP", "CLTP".]
  • Matching of terms (names + acronyms) between the different taxonomies
    • Difficulties: lack of semantic descriptions
  • Categories of matches
    • Exact match
    • Partial match: one concept is broader or narrower
    • No match
    • Provisional match: taxonomy errors (homonyms) imply erroneous matches
A more flexible approach
  • Previous approach
    • Too time-consuming and scales poorly
  • Improvement
    • Use a well-established, shared core ontology and map each source to this common core
  • New experiment: Use of URBISOC thesaurus
    • a thesaurus focused on Spanish terminology for Town Planning
    • developed by the CINDOC/CSIC institute (Centre for Scientific Information and Documentation / Spanish National Research Council)
A more flexible approach II
  • Use of Towntology ontology editor
    • Focused on ontology construction
    • Storage of concepts with several definitions that are in a process of selection and characterization
  • Although this improves scalability, it is still time-consuming and error-prone
4. Ontology building using an automated approach
  • Why?
    • Manual mappings are time-consuming
    • Some mappings may fail because content creators have not assigned the correct feature type
  • Technique proposed
    • Formal Concept Analysis (1980, Wille & Ganter …)
    • It enables the extraction of a hierarchy of concepts from the feature instances contained in the source repositories
Basics of FCA
  • Definition of formal contexts, triple $(G, M, I)$
    • $G$: objects
    • $M$: attributes
    • $I \subseteq G \times M$: binary relation between $G$ and $M$ (incidence matrix)
  • It is possible to extract formal concepts
    • Given $A \subseteq G$ and $B \subseteq M$, a pair $(A, B)$ is a formal concept if and only if
      • the set of all attributes shared by the objects in $A$ is identical to $B$
      • $A$ is also the set of all objects that share the attributes in $B$
  • Additionally, it is possible to establish a subconcept-superconcept relation
    • $(A_1, B_1) \leq (A_2, B_2) \iff A_1 \subseteq A_2 \;(\iff B_2 \subseteq B_1)$
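The definitions above can be made concrete with a small Python sketch. It uses a toy formal context (the objects `f1` to `f4` and their attribute assignments are invented for illustration) and enumerates all formal concepts by brute force, closing every subset of objects. This is only meant to illustrate the definitions and is not the algorithm used in the presentation.

```python
from itertools import combinations

# Toy formal context (hypothetical data): objects G, attributes M, relation I.
G = ["f1", "f2", "f3", "f4"]
M = ["SIGLA_CL", "AYTO_CL", "AYTO_CLP", "SIGLA_PZ"]
I = {
    ("f1", "SIGLA_CL"), ("f1", "AYTO_CL"),
    ("f2", "SIGLA_CL"), ("f2", "AYTO_CLP"),
    ("f3", "SIGLA_CL"), ("f3", "AYTO_CL"),
    ("f4", "SIGLA_PZ"),
}


def common_attributes(objects):
    """Attributes shared by every object in `objects`."""
    return frozenset(m for m in M if all((g, m) in I for g in objects))


def common_objects(attributes):
    """Objects that have every attribute in `attributes`."""
    return frozenset(g for g in G if all((g, m) in I for m in attributes))


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    for size in range(len(G) + 1):
        for subset in combinations(G, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)
            concepts.add((extent, intent))
    return concepts


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return c1[0] <= c2[0]
```

For this toy context the concept with intent `{SIGLA_CL, AYTO_CLP}` is a subconcept of the one with intent `{SIGLA_CL}`: its extent is smaller and its intent is larger, as the order relation requires. Brute-force enumeration is exponential in the number of objects, so it is only suitable for tiny examples.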
Applying FCA
  • How to obtain a unique repository of instances, i.e. the formal context required by FCA?
    • Traditional datalinking has been applied to the feature instances contained in the different databases
      • based on the analysis of the lexical and spatial similarities of feature attributes
    • Transform the datalinking matrix into the incidence matrix
      • Each checked cell (match of source features) generates an object/instance in the incidence matrix
      • The columns result from transforming urban network feature type codes (e.g., AYTO CODE, SIGLA CODE) into proper attributes with boolean values
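The transformation from matched feature pairs to a boolean incidence matrix can be sketched as follows. The feature identifiers, their codes and the `linked_pairs` list are hypothetical. The datalinking step itself (lexical and spatial similarity) is assumed to have run already and is not reproduced here.

```python
# Hypothetical source features: feature id -> feature type code.
ayto_features = {"a1": "PL", "a2": "CL", "a3": "CLP"}
sigla_features = {"s1": "PZ", "s2": "CL", "s3": "CL"}

# Assumed output of datalinking: pairs of source features judged to be
# the same real-world feature (the checked cells of the datalinking matrix).
linked_pairs = [("a1", "s1"), ("a2", "s2"), ("a3", "s3")]


def build_incidence(pairs, ayto, sigla):
    """Turn matched pairs into rows of boolean attributes, one per type code."""
    attributes = sorted(
        {f"AYTO_{code}" for code in ayto.values()}
        | {f"SIGLA_{code}" for code in sigla.values()}
    )
    rows = {}
    for a_id, s_id in pairs:
        present = {f"AYTO_{ayto[a_id]}", f"SIGLA_{sigla[s_id]}"}
        # Each matched pair becomes one object of the formal context.
        rows[(a_id, s_id)] = {attr: attr in present for attr in attributes}
    return attributes, rows
```

Each row then carries exactly two true values, one for the AYTO code and one for the SIGLA code of the matched pair, which is the input that FCA expects.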
[Figure: datalinking of 2718 features (18 AYTO codes) with 4318 features (35 SIGLA codes). The datalinking matrix is turned into the incidence matrix by replacing each matched feature by its code.]
Applying FCA
  • Obtain the concept lattice from the incidence matrix
    • NEXT CLOSED SET algorithm (Ganter 87)
  • Excerpt of the concept lattice shown in the figure (nodes labelled only with attributes)
    • supremum (least common superconcept)
    • AYTO_PL, SIGLA_PZ (square)
    • SIGLA_AV (avenue)
      • AYTO_AV, SIGLA_AV (traffic allowed avenue)
      • SIGLA_AV, AYTO_AVP (pedestrian avenue)
    • SIGLA_CL (street)
      • SIGLA_CL, AYTO_AN (carfree designed street)
      • SIGLA_CL, AYTO_CLP (pedestrianized street)
      • SIGLA_CL, AYTO_CL (traffic allowed street)
    • infimum (greatest common subconcept)
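As a rough illustration of how a next-closed-set enumeration works, the sketch below lists all closed attribute sets (concept intents) of a toy context in lectic order. The context, the attribute order and all function names are assumptions made for this example, and the code is a simplified rendering of the general idea rather than the implementation used by the authors.

```python
# Toy context (hypothetical): object -> set of attributes it has.
ATTRIBUTES = ["SIGLA_CL", "AYTO_CL", "AYTO_CLP", "SIGLA_PZ", "AYTO_PL"]
CONTEXT = {
    "f1": {"SIGLA_CL", "AYTO_CL"},
    "f2": {"SIGLA_CL", "AYTO_CLP"},
    "f3": {"SIGLA_PZ", "AYTO_PL"},
}


def closure(attrs):
    """Attributes shared by all objects that have every attribute in attrs."""
    attrs = frozenset(attrs)
    result = set(ATTRIBUTES)  # closure when no object matches
    for row in CONTEXT.values():
        if attrs <= row:
            result &= row
    return frozenset(result)


def next_closed_set(current, attributes, close):
    """Return the lectically next closed set after `current`, or None."""
    position = {m: i for i, m in enumerate(attributes)}
    current = set(current)
    for i in range(len(attributes) - 1, -1, -1):
        m = attributes[i]
        if m in current:
            current.discard(m)
        else:
            candidate = close(current | {m})
            # Accept only if no attribute placed before m was newly added.
            if all(position[x] >= i for x in candidate - current):
                return candidate
    return None


def all_closed_sets(attributes, close):
    """Enumerate every closed set, starting from the closure of the empty set."""
    closed = close(frozenset())
    result = []
    while closed is not None:
        result.append(closed)
        closed = next_closed_set(closed, attributes, close)
    return result
```

Calling `all_closed_sets(ATTRIBUTES, closure)` returns one intent per formal concept, and the matching extent can be recovered by collecting the objects that have all attributes of the intent. The appeal of this scheme is that each closed set is produced exactly once without storing the previously found ones for duplicate checks.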
Results
  • Experiment: combining COUNCIL_FEATURE and CADASTER_FEATURE databases
    • A concept lattice of 36 concepts from the original 53 concepts
  • Identification of equivalent concepts in both taxonomies
    • e.g., square (PL in AYTO and PZ in SIGLA)
  • Identification of subconcept-superconcept relations
    • e.g., street is a broader concept in SIGLA (CL), which has narrower concepts in AYTO:
      • traffic-allowed streets (CL)
      • pedestrianized streets (CLP)
      • car-free-designed streets (AN)
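The kind of relations reported above can be read off the incidence data by comparing attribute extents: two codes with the same extent are equivalent, and a code whose extent is strictly contained in another's is narrower. The sketch below uses an invented five-object context that mirrors the examples on the slide. The function names and the data are assumptions.

```python
# Hypothetical context mirroring the slide's examples: object -> attributes.
CONTEXT = {
    "f1": {"AYTO_PL", "SIGLA_PZ"},
    "f2": {"AYTO_PL", "SIGLA_PZ"},
    "f3": {"AYTO_CL", "SIGLA_CL"},
    "f4": {"AYTO_CLP", "SIGLA_CL"},
    "f5": {"AYTO_AN", "SIGLA_CL"},
}


def extent(attribute):
    """All objects that carry the given attribute."""
    return frozenset(obj for obj, attrs in CONTEXT.items() if attribute in attrs)


def relate(a, b):
    """Classify attribute a relative to attribute b by comparing extents."""
    ext_a, ext_b = extent(a), extent(b)
    if ext_a == ext_b:
        return "equivalent"
    if ext_a < ext_b:
        return "narrower"
    if ext_a > ext_b:
        return "broader"
    return "unrelated"
```

With this data, `relate("AYTO_PL", "SIGLA_PZ")` gives `"equivalent"` and `relate("AYTO_CLP", "SIGLA_CL")` gives `"narrower"`. On real data, mislabelled instances would blur these relations, so the outcome is best treated as a draft to be reviewed.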
5. Conclusions
  • The FCA approach seems to be more flexible
    • Dynamic building of the ontology (at least, a draft)
    • We don’t need to define the concepts, we just need to observe the data that exists
  • We have created a domain-specific ontology that facilitates the interoperability (synchronization, update and merge) of the separate repositories
  • Future work
    • Improve the efficiency of the method
    • Enrich the generated concepts with commonalities found in other feature attributes of the instances (e.g., geometry, perimeter, area)
    • Apply to other domains
      • Hydrology: NMA vs Water Agency repositories