1 / 57

by Shuchi Patel

Planning and Optimizing Semantic Information Requests Using Domain Modeling and Resource Characteristics. by Shuchi Patel. Outline. Motivation InfoQuilt Background Planning and Optimization IScape Execution Execution Monitoring Related Work Conclusions and Future Work. Motivation.

Download Presentation

by Shuchi Patel

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Planning and Optimizing Semantic Information Requests Using Domain Modeling and Resource Characteristics by Shuchi Patel

  2. Outline • Motivation • InfoQuilt • Background • Planning and Optimization • IScape Execution • Execution Monitoring • Related Work • Conclusions and Future Work

  3. Motivation • Explosion of data • Heterogeneities between sources • Limitations of web search techniques • Limitations of database search techniques • Need to manually integrate data

  4. Related Work • SIMS (USC) • TSIMMIS (Stanford, IBM Almaden) • Information Manifold (AT&T) • OBSERVER • Infomaster (Stanford)

  5. InfoQuilt - Goal • To provide an environment that allows users to query and analyze the data available from a multitude of diverse autonomous sources (including web-based sources), gain better understanding of the domains and their interactions as well as study hypothetical relationships to establish or disprove them.

  6. Important Building Blocks • Semantic domain modeling • Semantic inter-domain relationship modeling • Resource characteristics modeling • Complex operations • Semantic information request modeling • Learning paradigm

  7. Ontologies Terms/Concepts (Attributes) site Functional Dependencies (FDs) latitude eventDate longitude Disaster description Hierarchies region => latitude, longitude damage damagePhoto bodyWaveMagnitude Natural Disaster Man-made Disaster numberOfDeaths conductedBy magnitude Volcano explosiveYield NuclearTest magnitude > 0 Earthquake bodyWaveMagnitude > 0 magnitude < 10 bodyWaveMagnitude < 10 Domain Rules

  8. Ontologies.. NuclearTest ( site, explosiveYield, bodyWaveMagnitude, testType, eventDate, conductedBy, latitude, longitude, bodyWaveMagnitude > 0, bodyWwaveMagnitude < 10, testSite -> latitude longitude );

  9. Operations • Complex operators • Post-processing data • Simulations • Clarke Urban Growth Model Modeling urban growth and land use change

  10. Clarke Urban Growth Model (UGM) Source: http://edcdgs9.cr.usgs.gov/urban/factsht.pdf

  11. Inter-Ontological Relationships Anuclear testcould havecausedanearthquake if the earthquake occurredsome time afterthe nuclear test was conducted andin a nearby region. • NuclearTest Causes Earthquake • <=dateDifference( NuclearTest.eventDate, • Earthquake.eventDate ) < 30 • AND distance( NuclearTest.latitude, • NuclearTest.longitude, • Earthquake,latitude, • Earthquake.longitude ) < 10000

  12. Resource modeling • Attributes available • Data Characteristic (DC) rules A web site on earthquakes earthquakes after January 1, 1990 eventDate > “January 1, 1990”

  13. Resource Modeling... • Local Completeness Hartsfield International Airport (http://atlanta-airport.com/) flights to and from Atlanta airport toCity = “Atlanta” fromCity = “Atlanta”

  14. Resource Modeling... • Binding Pattern [toCity, toState, fromCity, fromState, departureMonth, departureDay] AirTran Airways (www.airtran.com)

  15. Information Scape (IScape) • Specified in terms of components of knowledge base • “understands” user’s information request “Find allearthquakeswith epicenter less than 5000 mile from the location at latitude 60.790 North and longitude 97.570 Eastand find alltsunamisthat they might have caused”

  16. Learning Paradigm • Understand domains and relationships between them • Query and analyze data from multiple autonomous and heterogeneous sources • Explore potential relationships • Analyze data to support or disprove potential relationships

  17. Where we are... • Motivation • ADEPT • Background • Planning and Optimization • IScape Execution • Execution Monitoring • Related Work • Conclusions and Future Work

  18. Planning and Optimization • IScapes are specified in terms of ontologies • Source selection • Execution plans that are executable • Plans that retrieve more complete information • Integrate data from sources • Optimization using domain and resource characteristics

  19. IScape 1 Resource Resource NuclearTestsDB( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, [dc] waveMagnitude > 3, [dc] eventDate > “January 1, 1985” ); SignificantEarthquakesDB( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, [dc] eventDate > “January 1, 1970” ); Resource NuclearTestSites( testSite, latitude, longitude ); Ontology Ontology NuclearTest( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, latitude, longitude, waveMagnitude > 0, waveMagnitude < 10, testSite -> latitude longitude ); Earthquake( eventDate, description, region, magnitude, latitude,longitude, numberOfDeaths, damagePhoto, magnitude > 0 ); Relationship NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake,latitude, Earthquake.longitude ) < 10000 IScape “Find all nuclear tests conducted by India or Pakistan after January 1, 1995 with seismic body wave magnitude > 4.5 and find all earthquakes that could have been caused due to these tests.”

  20. Semantic Check • Apply domain rules • Check if IScape is semantically correct • Constraint reduction

  21. IScape 1 Resource Resource NuclearTestsDB( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, [dc] waveMagnitude > 3, [dc] eventDate > “January 1, 1985” ); SignificantEarthquakesDB( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, [dc] eventDate > “January 1, 1970” ); Resource NuclearTestSites( testSite, latitude, longitude ); Ontology Ontology NuclearTest( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, latitude, longitude, waveMagnitude > 0, waveMagnitude < 10, testSite -> latitude longitude ); Earthquake( eventDate, description, region, magnitude, latitude,longitude, numberOfDeaths, damagePhoto, magnitude > 0 ); Relationship NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake,latitude, Earthquake.longitude ) < 10000 IScape “Find all nuclear tests conducted by India or Pakistan after January 1, 1995 with seismic body wave magnitude > 4.5 and find all earthquakes that could have been caused due to these tests.”

  22. Source Selection • One ontology at a time • First check locally complete sources • The DC rules of the resource should not falsify the IScape constraint • If none found, select all sources • The DC rules of a resource should not falsify the IScape’s constraint • Binding Patterns on resources should be respected

  23. IScape 1 Resource Resource NuclearTestsDB( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, [dc] waveMagnitude > 3, [dc] eventDate > “January 1, 1985” ); SignificantEarthquakesDB( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, [dc] eventDate > “January 1, 1970” ); Resource NuclearTestSites( testSite, latitude, longitude ); Ontology Ontology NuclearTest( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, latitude, longitude, waveMagnitude > 0, waveMagnitude < 10, testSite -> latitude longitude ); Earthquake( eventDate, description, region, magnitude, latitude,longitude, numberOfDeaths, damagePhoto, magnitude > 0 ); Relationship NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake,latitude, Earthquake.longitude ) < 10000 IScape “Find all nuclear tests conducted by India or Pakistan after January 1, 1995 with seismic body wave magnitude > 4.5 and find all earthquakes that could have been caused due to these tests.”

  24. Missing Attributes • Use functional dependencies (FD) • <attribute>+ -> <missing attributes><attribute>* • Couple withassociate resource Join (using LHS attributes) Primary.B = Associate.B AND Primary.C = Associate.C All available attributes (A, B, C, D) LHS attributes + missing attributes (B, C, E, F) BC -> DEF Associate resource Primary resource A, B, C, D A, B, C, E, F

  25. Missing Attributes… • Criteria for FD • All the missing attributes should be in the RHS of the FD • All attributes in the LHS of the FD should be available from the primary resource. Join (using LHS attributes) Primary.B = Associate.B AND Primary.C = Associate.C All available attributes (A, B, C, D) LHS attributes + missing attributes (B, C, E, F) BC -> DEF Associate resource Primary resource A, B, C, D A, B, C, E, F

  26. Missing Attributes… • Criteria for associate resource • Provide missing attributes + attributes in LHS of FD • Resource rules should not falsify query constraint • Resource rules should not falsify resource rules on primary resource • BP can be supplied by primary resource Join (using LHS attributes) BPSupplier and Join Primary.B = Associate.B AND Primary.C = Associate.C All available attributes (A, B, C, D) LHS attributes + missing attributes (B, C, E, F) BC -> DEF Associate resource Primary resource A, B, C, D A, B, C, E, F

  27. IScape 1 Resource Resource NuclearTestsDB( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, [dc] waveMagnitude > 3, [dc] eventDate > “January 1, 1985” ); SignificantEarthquakesDB( eventDate, description, region, magnitude, latitude, longitude, numberOfDeaths, damagePhoto, [dc] eventDate > “January 1, 1970” ); Resource NuclearTestSites( testSite, latitude, longitude ); Ontology Ontology NuclearTest( testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy, latitude, longitude, waveMagnitude > 0, waveMagnitude < 10, testSite -> latitude longitude ); Earthquake( eventDate, description, region, magnitude, latitude,longitude, numberOfDeaths, damagePhoto, magnitude > 0 ); Relationship NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake,latitude, Earthquake.longitude ) < 10000 IScape “Find all nuclear tests conducted by India or Pakistan after January 1, 1995 with seismic body wave magnitude > 4.5 and find all earthquakes that could have been caused due to these tests.”

  28. Sources Selected Use of FD to retrieve missing attributes Join Use of function to resolve syntactic heterogeneity testSiteEquals ( NuclearTestsDB.testSite, NuclearTestSites.testSite ) Select NuclearTest.waveMagnitude > 4.5 AND ( NuclearTest.conductedBy = “India” OR NuclearTest.conductedBy = “Pakistan” ) Resource Access Resource Access Resource Access NuclearTestsDB testSite, explosiveYield, waveMagnitude, testType, eventDate, conductedBy NuclearTestSites testSite, Latitude, longitude SignificantEarthquakesDB eventDate, description, Region, magnitude, Latitude, longitude, numberOfDeaths, damagePhoto

  29. Data Integration • Integrate data from all sources • Union • Function Evaluations • Constraint Checking • Relationship Evaluation • Aggregations • Projection

  30. Data Integration Union * Union * Join testSiteEquals ( NuclearTestsDB.testSite, NuclearTestSites.testSite ) Select NuclearTest.waveMagnitude > 4.5 AND ( NuclearTest.conductedBy = “India” OR NuclearTest.conductedBy = “Pakistan” ) Resource Access Resource Access Resource Access NuclearTestsDB NuclearTestSites SignificantEarthquakesDB

  31. Data Integration Select dateDifference ( “January 1, 1995”, NuclearTest.eventDate ) > 0 Function Evaluator dateDifference ( “January 1, 1995”, NuclearTest.eventDate ) Union * Union * Earthquake NuclearTest

  32. Data Integration Relationship Evaluator NuclearTest Causes Earthquake dateDifference ( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance ( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000 Select dateDifference ( “January 1, 1995”, NuclearTest.eventDate ) > 0 Function Evaluator dateDifference ( “January 1, 1995”, NuclearTest.eventDate ) Union * Union * Earthquake NuclearTest

  33. Data Integration… Project N.testSite, N.eventDate, N.testType, N.explosiveYield, N.waveMagnitude, N.conductedBy, E.eventDate, E.region, E.description, E.magnitude, E.numberOfDeaths, E.damagePhoto, dateDifference( N.eventDate, E.eventDate ), distance( N.latitude, N.longitude, E.latitude, E.longitude ) Relationship Evaluator NuclearTest Causes Earthquake N = NuclearTest E = Earthquake

  34. Results

  35. IScape 2 http://travel.yahoo.com Resource http://www.airtran.com Resource Resource YahooTravel ( airlineCompany, flightNo, aircraft, fromCity, fromState, toCity, toState, departureDate, meals, departureTime, arrivalTime, [fromCity, fromState, toCity, toState, departureDate] ); AirTranAirways( airlineCompany, flightNo, fromCity, fromState, toCity, toState, departureDate, fare, departureTime, arrivalTime, [dc] airlineCompany = “AirTran Airways”, [lc] airlineCompany = “AirTran Airways”, [fromCity, fromState, toCity, toState, departureDate]); WeatherChannel ( date, city, state, description, icon, hiTemp, loTemp, [city, state] ); http://www.weather.com Resource AirlineLogos ( airlineCompany, airlineLogo ); Ontology Ontology DirectFlight ( airlineCompany, airlineLogo, flightNo, aircraft, fromCity, fromState, toCity, toState, departureDate, fare, meals, departureTime, arrivalTime, airlineCompany -> airlineLogo ); DailyWeather ( date, city, state, description, icon, hiTemp, loTemp ); IScape “Find all direct flights from Atlanta, GA to Boston, MA for March 23, 2001 and show the weather in the destination city on that day.”

  36. Plan Generation • IScape is semantically correct • Locally complete sources available but not applicable

  37. IScape 2 http://travel.yahoo.com Resource http://www.airtran.com Resource Resource YahooTravel ( airlineCompany, flightNo, aircraft, fromCity, fromState, toCity, toState, departureDate, meals, departureTime, arrivalTime, [fromCity, fromState, toCity, toState, departureDate] ); AirTranAirways( airlineCompany, flightNo, fromCity, fromState, toCity, toState, departureDate, fare, departureTime, arrivalTime, [dc] airlineCompany = “AirTran Airways”, [lc] airlineCompany = “AirTran Airways”, [fromCity, fromState, toCity, toState, departureDate]); WeatherChannel ( date, city, state, description, icon, hiTemp, loTemp, [city, state] ); http://www.weather.com Resource AirlineLogos ( airlineCompany, airlineLogo ); Ontology Ontology DirectFlight ( airlineCompany, airlineLogo, flightNo, aircraft, fromCity, fromState, toCity, toState, departureDate, fare, meals, departureTime, arrivalTime, airlineCompany -> airlineLogo ); DailyWeather ( date, city, state, description, icon, hiTemp, loTemp ); IScape “Find all direct flights from Atlanta, GA to Boston, MA for March 23, 2001 and show the weather in the destination city on that day.”

  38. Binding Patterns • The execution plan should specify : • Which BP is to be used • How values for the BP attributes should be supplied • Values can be supplied in following ways • Query constraint • Attributes in other ontologies • Associate Resource

  39. Binding Pattern Using Associated Resource • Criteria for associate resource • Should supply all attributes needed for BP • Values for its BP, if any, should be supplied from IScape’s constraint only • Resource rules involving only BP attributes being retrieved should not falsify IScape’s constraint BP Supplier All attributes (A, B, C, D) BP attributes (A, B) BP attributes (A, B) Associate resource Primary resource A, B, C, D [A, B] A, B, C, E, F

  40. Binding Patterns… • BP of AirTranAirways and YahooTravel can be supplied from query constraint DirectFlight.fromCity = “Atlanta” AND DirectFlight.fromState = “GA” AND DirectFlight.toCity = “Boston” AND DirectFlight.toState = “MA” AND DirectFlight.departureDate = “March 23, 2001” AND DailyWeather.city = DirectFlight.toCity AND DailyWeather.state = DirectFlight.toState AND DailyWeather.date = DirectFlight.departureDate [ fromCity: (“Atlanta”), fromState: (“GA”), toCity: (“Boston”), toState: (“MA”), departureDate: (“March 23, 2001”) ]

  41. Binding Patterns… • BP of WeatherChannel to be supplied using attributes from DirectFlight DirectFlight.fromCity = “Atlanta” AND DirectFlight.fromState = “GA” AND DirectFlight.toCity = “Boston” AND DirectFlight.toState = “MA” AND DirectFlight.departureDate = “March 23, 2001” AND DailyWeather.city = DirectFlight.toCity AND DailyWeather.state = DirectFlight.toState AND DailyWeather.date = DirectFlight.departureDate

  42. Sources Selected Use BPSupplier Node when BP values are retrieved from another ontology Use of FD to retrieve missing attributes Join Join BPSupplier YahooTravel.airlineCompany = AirlineLogos.airlineCompany AirTranAirways.airlineCompany = AirlineLogos.airlineCompany W.city = F.toCity, W.state = F.toState, W.date = F.departureDate Resource Access Resource Access Resource Access Resource Access YahooTravel airlineCompany, flightNo, aircraft, departureDate, departureTime, … fromCity = “Atlanta”, fromState = “GA”, toCity = “Boston”, toState = “MA”, departureDate = “March 23, 2001” AirlineLogos airlineCompany, airlineLogo AirTranAirways airlineCompany, flightNo, aircraft, departureDate, departureTime, … fromCity = “Atlanta”, fromState = “GA”, toCity = “Boston”, toState = “MA”, departureDate = “March 23, 2001” WeatherChannel Desrcription, icon, hiTemp, loTemp, city, state, date Only one Resource Access node for one resource if possible BP supplied using IScape constraint W = DailyWeather F = DirectFlight

  43. Data Integration Intermediate Union Union * Join Join BPSupplier YahooTravel.airlineCompany = AirlineLogos.airlineCompany AirTranAirways.airlineCompany = AirlineLogos.airlineCompany W.city = F.toCity, W.state = F.toState, W.date = F.departureDate Resource Access Resource Access Resource Access Resource Access YahooTravel fromCity = “Atlanta”, fromState = “GA”, toCity = “Boston”, toState = “MA”, departureDate = “March 23, 2001” AirlineLogos airlineCompany, airlineLogo AirTranAirways fromCity = “Atlanta”, fromState = “GA”, toCity = “Boston”, toState = “MA”, departureDate = “March 23, 2001” WeatherChannel Desrcription, icon, hiTemp, loTemp, city, state, date W = DailyWeather F = DirectFlight

  44. Data Integration… BPSupplier W = DailyWeather F = DirectFlight W.city = F.toCity, W.state = F.toState, W.date = F.departureDate BP values are retrieved from intermediate union Union * Resource Access WeatherChannel Join Join YahooTravel.airlineCompany = AirlineLogos.airlineCompany AirTranAirways.airlineCompany = AirlineLogos.airlineCompany Resource Access Resource Access Resource Access YahooTravel AirlineLogos AirTranAirways

  45. Data Integration… Project F.airlineCompany, F.airlineLogo, F.flightNo, F.aircraft, …, W.description, W.icon, W.date, W.loTemp, W.hiTemp,… W = DailyWeather F = DirectFlight Join W.city = F.toCity AND W.state = F.toState AND W.day = F.departureDay AND W.month = F.departureMonth BPSupplier Union * Resource Access WeatherChannel Join Join Resource Access Resource Access Resource Access YahooTravel AirlineLogos AirTranAirways

  46. Results

  47. IScape Execution Knowledge Final Results IScape IScape Plan Plan Final Results Data retrieved Query Query Query

  48. Where we are... • Motivation • ADEPT • Background • Planning and Optimization • IScape Execution • Execution Monitoring • Related Work • Conclusions and Future Work

  49. Execution Monitoring • IScape Processing Monitor (IPM) GUI • High-level debugger • Allows monitoring how much time each phase of IScape processing takes • Allows localizing errors

  50. IScape Processing Monitor (IScape 1)

More Related