Geographic reference analysis for geographic document querying

F. Bilhaut, T. Charnois, P. Enjalbert & Y. Mathet, {bilhaut, charnois, enjalbert, mathet}@info.unicaen.fr. GREYC, CNRS UMR 6072, University of Caen.


Presentation Transcript


  1. Geographic reference analysis for geographic document querying • F. Bilhaut, T. Charnois, P. Enjalbert & Y. Mathet • {bilhaut, charnois, enjalbert, mathet}@info.unicaen.fr • GREYC, CNRS UMR 6072, University of Caen

  2. The "GéoSem" project • Passage extraction from geographical documents • From a query to a ranked set of passages • Queries involve three dimensions: - time - phenomenon - space

  3. Excerpt from "Hérin" corpus • From 1965 to 1985, the number of high-school students has increased by 70%, but at different rhythms and intensities depending on academies and departments. Lower in South-West and Massif Central, moderate in Brittany and Paris, the rise has been considerable in Mid-West and Alsace. […] Also occurs the schooling duration increase, which was more important in departments where, in the middle of the 60's, study continuation after primary school was far from being systematic.

  4. Excerpt from "Hérin" corpus (same excerpt as slide 3) • [Highlighted: Time]

  5. Excerpt from "Hérin" corpus (same excerpt as slide 3) • [Highlighted: Time, Phenomenon]

  6. Excerpt from "Hérin" corpus (same excerpt as slide 3) • [Highlighted: Time, Phenomenon, Space]

  7. Queries • Which passages address educational difficulties in the west of France in the 50's? • Which passages address variations of the number of pupils in rural areas? • Which passages address the Calvados district?

  8. Queries • Which passages address educational difficulties in the west of France in the 50's? • Which passages address variations of the number of pupils in the Paris area? • Which passages address the Calvados district?

  9. Some Significant Spatial Expressions • Paris • in north of France • from south of Loire • Some seaboard towns • The quarter of / The / Fifteen / All / Some districts in north of France • Some seaboard towns of Normandy • The most rural districts situated from south of Loire

  10. The type "zone": a georeferenced area anchored in a named place • Paris • in north of France • Normandy • From Normandy to Alsace • from south of Loire

  11. The ‘LocGeo’ type • The canonical form: [quantification] + [type] + [zone]
      • Quant: quantification
      • Type: qualification, administrative type
      • Zone: position, named geo. entity
      • Examples: The quarter of / Fifteen / All / Some districts in north of France; Some seaboard towns of Normandy; The most rural districts situated from south of Loire; Some seaboard towns

  12. The ‘LocGeo’ type • Components: quant, type, zone
      • Quant: quantification
      • Type: qualification, administrative type
      • Zone: position, named geo. entity
      • Examples: The quarter of / Fifteen / All / Some districts in north of France; Some seaboard towns of Normandy; The most rural districts situated from south of Loire; Some seaboard towns

  13. Semantic Representation « Paris »
      zone:
        egn:
          ty_zone: town
          nom: Paris
          coord:
            Lat: 45.633333
            Long: 5.733333
        loc: internal

  14. Semantic Representation « Some seaboard towns in north of Normandy »
      locgeo:
        quant:
          type: relative
        type:
          ty_zone: town
          geo: seaboard
        zone:
          egn:
            ty_zone: region
            nom: Normandy
          loc: internal
          position: north
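
As an illustration of the canonical form [quantification] + [type] + [zone] and of the nested representation on slides 13 and 14, here is a minimal Python sketch. It is not the authors' DCG: the function `parse_locgeo`, the tiny dictionaries `QUANT`, `QUALIF`, `TYPES`, `GAZETTEER` and the set `POSITIONS` are hypothetical stand-ins for the real lexical base and gazetteer, and only a few English surface patterns are handled.

```python
# Hypothetical mini-lexicon; the real system uses a lexical base and a gazetteer.
QUANT = {"some": "relative", "all": "exhaustive"}
QUALIF = {"seaboard": ("geo", "seaboard")}
TYPES = {"towns": "town", "districts": "district"}
GAZETTEER = {"Normandy": "region", "France": "country"}  # name -> zone type
POSITIONS = {"north", "south"}


def parse_locgeo(text):
    """Parse '[quant] [qualif] type [zone]' into a nested dict (illustrative only)."""
    tokens = text.split()
    sem = {}

    # [quantification] (optional)
    if tokens and tokens[0].lower() in QUANT:
        sem["quant"] = {"type": QUANT[tokens.pop(0).lower()]}

    # [type]: optional qualification followed by a type noun
    ty = {}
    if tokens and tokens[0].lower() in QUALIF:
        key, value = QUALIF[tokens.pop(0).lower()]
        ty[key] = value
    if not tokens or tokens[0].lower() not in TYPES:
        raise ValueError(f"no known type noun in {text!r}")
    ty["ty_zone"] = TYPES[tokens.pop(0).lower()]
    sem["type"] = ty

    # [zone] (optional): 'in <position> of <Name>', 'of <Name>' or 'in <Name>'
    if tokens:
        zone = {"loc": "internal"}
        if (len(tokens) == 4 and tokens[0] == "in"
                and tokens[1] in POSITIONS and tokens[2] == "of"):
            zone["position"] = tokens[1]
            name = tokens[3]
        elif len(tokens) == 2 and tokens[0] in ("of", "in"):
            name = tokens[1]
        else:
            raise ValueError(f"unsupported zone pattern in {text!r}")
        if name not in GAZETTEER:
            raise ValueError(f"unknown place name {name!r}")
        zone["egn"] = {"ty_zone": GAZETTEER[name], "nom": name}
        sem["zone"] = zone

    return {"locgeo": sem}


result = parse_locgeo("Some seaboard towns in north of Normandy")
```

Here `result` has the same nesting as slide 14 (`quant`, `type`, `zone` under `locgeo`); an expression without a zone, such as "Some seaboard towns", simply yields no `zone` entry.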

  15. Implementation and (first) Results • Tokenisation and morphological analysis • A DCG (definite clause grammar) performs syntactic and semantic analysis together: - the grammar contains 160 rules - an internal lexical base of 200 entries - a gazetteer of 100,000 named places (France) • 900 expressions recognised and analysed from a geographical corpus (200 text pages) • Good results, but a precise quantitative evaluation remains to be done

  16. Semantic matching: Why? • A query: "Which passages address Paris?" • Corpora (Text A, Text B) with passages such as: […] In Paris and Toulouse […]; […] In Ile de France region […]; […] the northern half of France […]; […] the south of a Bordeaux-Genève line […] • [Diagram labels: 1, 2, 3]

  17. Semantic matching: How? • Spatial compatibility: is the zone denoted by the passage spatially compatible with that of the query (is there at least an intersection)? • Relevance degree: if the zone is compatible, how relevant is it w.r.t. the query? - probability - granularity

  18. Compatibility computation • Q1) Which passages address Paris?
      • P1) […] the capital city […]: YES (gazetteer)
      • P2) […] big cities in France.: YES (gazetteer + computation)
      • P3) […] the northern half of France […]: YES (GIS + computation)
      • P4) […] South of a Bordeaux-Genève line.: NO (GIS + computation)
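
The geometric part of the compatibility test for P3 and P4 can be sketched in a few lines of Python. This is only an assumption-laden illustration, not the project's GIS computation: coordinates are approximate, the plane is treated as flat, `NORTH_SOUTH_SPLIT_LAT` is an arbitrary hypothetical threshold for "the northern half of France", and the names `Point`, `in_northern_half` and `south_of_line` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    lat: float  # degrees north
    lon: float  # degrees east


# Approximate coordinates, for illustration only.
PARIS = Point(48.8566, 2.3522)
TOULOUSE = Point(43.6047, 1.4442)
BORDEAUX = Point(44.8378, -0.5792)
GENEVE = Point(46.2044, 6.1432)

# Assumption: a single latitude splits France into northern and southern halves.
NORTH_SOUTH_SPLIT_LAT = 46.5


def in_northern_half(p: Point) -> bool:
    """True if p lies north of the assumed splitting latitude."""
    return p.lat >= NORTH_SOUTH_SPLIT_LAT


def south_of_line(p: Point, west: Point, east: Point) -> bool:
    """True if p lies south of the line running from `west` to `east`.

    Uses the sign of the 2D cross product (flat-plane approximation):
    walking west to east, points on the right-hand side are to the south.
    """
    cross = ((east.lon - west.lon) * (p.lat - west.lat)
             - (east.lat - west.lat) * (p.lon - west.lon))
    return cross < 0


# Q1 "Which passages address Paris?" against P3 and P4 of the slide above.
p3_compatible = in_northern_half(PARIS)                 # True  -> YES
p4_compatible = south_of_line(PARIS, BORDEAUX, GENEVE)  # False -> NO
```

With these assumptions the sketch reproduces the YES for P3 and the NO for P4; a query about Toulouse would instead be compatible with P4.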

  19. "the northern half of France"

  20. "the south of a Bordeaux-Genève line"

  21. Relevance degree (1): Quantification • Query = "Calvados" (French district) • [Diagram label: GIS]
      • P1 = "The quarter of districts in north of France": r = 25%, rank 3
      • P2 = "All districts in north of France": r = 100%, rank 1
      • P3 = "Some districts in north of France": r = i/n = 5/52 = 9.6%, rank 4
      • P4 = "Fifteen districts in north of France": r = i/n = 15/52 = 29%, rank 2
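
The arithmetic on this slide can be reproduced with a short Python sketch. The figures n = 52 and i = 5 for "some" are taken from the slide; the function name `relevance` and the `(kind, value)` encoding of quantifiers are hypothetical choices made for the example.

```python
N_DISTRICTS = 52  # n on the slide: districts in the north of France
SOME_COUNT = 5    # i used on the slide for the vague quantifier "some"


def relevance(quant, n=N_DISTRICTS):
    """Chance that one given district (e.g. Calvados) is among those denoted."""
    kind, value = quant
    if kind == "all":
        return 1.0
    if kind == "fraction":   # e.g. "the quarter of"
        return value
    if kind == "count":      # e.g. "fifteen": i / n
        return value / n
    if kind == "some":       # vague quantity, approximated by SOME_COUNT
        return SOME_COUNT / n
    raise ValueError(f"unknown quantifier kind: {kind!r}")


passages = {
    "P1": ("fraction", 0.25),  # "The quarter of districts ..."
    "P2": ("all", None),       # "All districts ..."
    "P3": ("some", None),      # "Some districts ..."
    "P4": ("count", 15),       # "Fifteen districts ..."
}

scores = {name: relevance(q) for name, q in passages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
# ranking is ['P2', 'P4', 'P1', 'P3'], matching ranks 1 to 4 on the slide
```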

  22. Relevance degree (2): Granularity • Levels: country, region, district, city • Examples: "zone": ’the northern half of France’; region: "Basse Normandie"; district: "Calvados"; city: "Caen"
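
The slide does not say how granularity enters the relevance score. One possible reading, sketched below purely as an assumption, is that a passage is more relevant when its zone sits at a level close to that of the query. The names `LEVELS` and `granularity_relevance` and the formula 1 / (1 + distance) are hypothetical.

```python
# Granularity levels named on the slide, from coarsest to finest.
LEVELS = ["country", "region", "district", "city"]


def granularity_relevance(query_level: str, passage_level: str) -> float:
    """Assumed score: 1.0 for the same level, decreasing with level distance."""
    distance = abs(LEVELS.index(query_level) - LEVELS.index(passage_level))
    return 1.0 / (1.0 + distance)


# Query "Calvados" (a district) against zones at other levels.
same_level = granularity_relevance("district", "district")  # "Calvados"
one_apart = granularity_relevance("district", "region")     # "Basse Normandie"
finer = granularity_relevance("district", "city")           # "Caen"
```

Under this assumed formula "Basse Normandie" and "Caen" score the same against a district-level query, since each is one level away; a real scoring function might well treat coarser and finer zones differently.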

  23. % LocGeo: determiner (quantification), type, zone
      locgeo(locgeo:(det:Det..type:Type..Zone)) -->
          #prep, det(Det), type(Type), zone(Zone).
      det(Sem) --> [X], {lexique(X,[X|R],det,Sem)}.
      % Type: either a qualified type or a bare type noun
      type(X) --> typeQualif(X).
      type(ty_zone:N) --> nomtype(N).
      typeQualif(ty_zone:N..Q) --> option, nomtype(N), #prep, qualif(Q).
      nomtype(Sem) --> [X], {lexique(X,[X|R],nom,Sem)}.
      % Zone: a named geographic entity (egn), with or without coordinates
      zone(X) --> egn(X).
      egn(egn:(ty_zone:T..nom:Y..coord:C)) -->
          ls_lexiconExtDCG(np, type_sem:egn..type_zone:T..nom:Y..coord:C).
      egn(egn:(ty_zone:T..nom:Y)) -->
          [X], {lexique(X,[X|R],np,type_sem:egn..type_zone:T..nom:Y)}.

  24. % quelque = "some", tout = "all", région = "region", ville = "town"
      lexique(quelque,[quelque],det,type_sem:relatif..type:relatif_qualifie
              ..nb:'qualitatif:faible').
      lexique(tout,[tout,le],det,type_sem:exhaustif).
      lexique(région,[région],nom,type_sem:zone(administrative)
              ..nom_zone:région).
      lexique(ville,[ville],nom,type_sem:zone(administrative)
              ..nom_zone:ville).
      lexique('Bretagne',['Bretagne'],np,type_sem:egn..type_zone:région..nom:'Bretagne').
