What Do You Want—Semantic Understanding? (You’ve Got to be Kidding) David W. Embley Brigham Young University Funded in part by the National Science Foundation
Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges
Grand Challenge Semantic Understanding • Can we quantify and specify the nature of this grand challenge?
Grand Challenge Semantic Understanding “If ever there were a technology that could generate trillions of dollars in savings worldwide …, it would be the technology that makes business information systems interoperable.” (Jeffrey T. Pollock, VP of Technology Strategy, Modulant Solutions)
Grand Challenge Semantic Understanding “The Semantic Web: … content that is meaningful to computers [and that] will unleash a revolution of new possibilities … Properly designed, the Semantic Web can assist the evolution of human knowledge …” (Tim Berners-Lee, …, Weaving the Web)
Grand Challenge Semantic Understanding “20th Century: Data Processing. 21st Century: Data Exchange. The issue now is mutual understanding.” (Stefano Spaccapietra, Editor in Chief, Journal on Data Semantics)
Grand Challenge Semantic Understanding “The Grand Challenge [of semantic understanding] has become mission critical. Current solutions … won’t scale. Businesses need economic growth dependent on the web working and scaling (cost: $1 trillion/year).” (Michael Brodie, Chief Scientist, Verizon Communications)
Why Semantic Understanding? • Because we’re overwhelmed with data: point and click is too slow; “Give me what I want when I want it.” • Because it’s the key to revolutionary progress: automated interoperability and knowledge sharing; automated negotiation in e-business; large-scale, in-silico experiments in e-science • We succeed in managing information if we can “[take] data and [analyze] it and [simplify] it and [tell] people exactly the information they want, rather than all the information they could have.” - Jim Gray, Microsoft Research
What is Semantic Understanding? Semantics: “The meaning or the interpretation of a word, sentence, or other language form.” Understanding: “To grasp or comprehend [what’s] intended or expressed.” - Dictionary.com
Can We Achieve Semantic Understanding? “A computer doesn’t truly ‘understand’ anything.” … But computers can manipulate terms “in ways that are useful and meaningful to the human user.” - Tim Berners-Lee Key Point: it only has to be good enough. And that’s our challenge and our opportunity!
Information Value Chain: Meaning • Knowledge • Information • Data (translating data into meaning)
Foundational Definitions • Meaning: knowledge that is relevant or activates • Knowledge: information with a degree of certainty or community agreement (ontology) • Information: data in a conceptual framework • Data: attribute-value pairs - Adapted from [Meadow92]
Data • Attribute-value pairs: fundamental for information and thus fundamental for knowledge and meaning • Data frame: extensive knowledge about a data item; everyday data (currency, dates, time, weights and measures); textual appearance, units, context, operators, I/O conversion; an abstract data type with an extended framework
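One way to picture a data frame is as an abstract data type that also carries recognizers and conversions. The Python sketch below is only an illustration under assumptions, not the presenter's implementation: the class name `DataFrameSketch`, its fields, and the mileage patterns are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DataFrameSketch:
    """Hypothetical data frame: knowledge about one kind of data item."""
    name: str
    value_pattern: str                    # textual appearance of a value
    context_pattern: str                  # keywords expected right after the value
    unit: str                             # unit of the internal value
    to_internal: Callable[[str], float]   # input conversion

    def recognize(self, text: str) -> Optional[float]:
        """Return the internal value if a value is followed by its context."""
        regex = re.compile(
            rf"({self.value_pattern})\s*(?:{self.context_pattern})",
            re.IGNORECASE,
        )
        match = regex.search(text)
        if match is None:
            return None
        return self.to_internal(match.group(1))

    def render(self, value: float) -> str:
        """Output conversion: internal value back to text with its unit."""
        return f"{value:,.0f} {self.unit}"


# Assumed patterns for a mileage data item (illustrative only).
mileage = DataFrameSketch(
    name="Mileage",
    value_pattern=r"\d{1,3}(?:,\d{3})*",
    context_pattern=r"miles\b|mi\b",
    unit="miles",
    to_internal=lambda s: float(s.replace(",", "")),
)

value = mileage.recognize("2002 Ford Thunderbird, 5,500 miles, $33,000")
```

Here the number is accepted only because the keyword "miles" follows it, so the price in the same text is not mistaken for a mileage.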
? Olympus C-750 Ultra Zoom Sensor Resolution: 4.2 megapixels Optical Zoom: 10 x Digital Zoom: 4 x Installed Memory: 16 MB Lens Aperture: F/8-2.8/3.7 Focal Length min: 6.3 mm Focal Length max: 63.0 mm
Digital Camera Olympus C-750 Ultra Zoom Sensor Resolution: 4.2 megapixels Optical Zoom: 10 x Digital Zoom: 4 x Installed Memory: 16 MB Lens Aperture: F/8-2.8/3.7 Focal Length min: 6.3 mm Focal Length max: 63.0 mm
? • Year: 2002 • Make: Ford • Model: Thunderbird • Mileage: 5,500 miles • Features: Red, ABS, 6 CD changer, keyless entry • Price: $33,000 • Phone: (916) 972-9117
Car Advertisement • Year: 2002 • Make: Ford • Model: Thunderbird • Mileage: 5,500 miles • Features: Red, ABS, 6 CD changer, keyless entry • Price: $33,000 • Phone: (916) 972-9117
?
Flight #   Class  From  Time/Date          To   Time/Date          Stops
Delta 16   Coach  JFK   6:05 pm  02 01 04  CDG  7:35 am  03 01 04  0
Delta 119  Coach  CDG   10:20 am 09 01 04  JFK  1:00 pm  09 01 04  0
Airline Itinerary
Flight #   Class  From  Time/Date          To   Time/Date          Stops
Delta 16   Coach  JFK   6:05 pm  02 01 04  CDG  7:35 am  03 01 04  0
Delta 119  Coach  CDG   10:20 am 09 01 04  JFK  1:00 pm  09 01 04  0
?
Monday, October 13, 2003
Group A      W  L  T  GF  GA  Pts.
USA          3  0  0  11   1  9
Sweden       2  1  0   5   3  6
North Korea  1  2  0   3   4  3
Nigeria      0  3  0   0  11  0
Group B      W  L  T  GF  GA  Pts.
Brazil       2  0  1   8   2  7
…
World Cup Soccer
Monday, October 13, 2003
Group A      W  L  T  GF  GA  Pts.
USA          3  0  0  11   1  9
Sweden       2  1  0   5   3  6
North Korea  1  2  0   3   4  3
Nigeria      0  3  0   0  11  0
Group B      W  L  T  GF  GA  Pts.
Brazil       2  0  1   8   2  7
…
? Calories 250 cal Distance 2.50 miles Time 23.35 minutes Incline 1.5 degrees Speed 5.2 mph Heart Rate 125 bpm
Treadmill Workout Calories 250 cal Distance 2.50 miles Time 23.35 minutes Incline 1.5 degrees Speed 5.2 mph Heart Rate 125 bpm
? Place Bonnie Lake County Duchesne State Utah Type Lake Elevation 10,000 feet USGS Quad Mirror Lake Latitude 40.711ºN Longitude 110.876ºW
Maps Place Bonnie Lake County Duchesne State Utah Type Lake Elevation 10,100 feet USGS Quad Mirror Lake Latitude 40.711ºN Longitude 110.876ºW
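The guessing game above suggests that attribute names alone often reveal the application domain. A minimal Python sketch of that idea follows. It assumes a simple overlap count as the scoring rule, which is not a method stated in the slides. The `DOMAINS` table and the function `guess_domain` are hypothetical names, with vocabularies copied from the example slides.

```python
from typing import Dict, Set

# Hypothetical attribute vocabularies, taken from the example slides.
DOMAINS: Dict[str, Set[str]] = {
    "Digital Camera": {"sensor resolution", "optical zoom", "digital zoom",
                       "installed memory", "lens aperture"},
    "Car Advertisement": {"year", "make", "model", "mileage", "features",
                          "price", "phone"},
    "Treadmill Workout": {"calories", "distance", "time", "incline",
                          "speed", "heart rate"},
}


def guess_domain(attributes: Set[str]) -> str:
    """Pick the domain whose vocabulary shares the most attribute names."""
    normalized = {a.lower() for a in attributes}
    scores = {name: len(vocab & normalized) for name, vocab in DOMAINS.items()}
    return max(scores, key=scores.get)


guess = guess_domain({"Year", "Make", "Model", "Mileage", "Price"})
```

A real classifier would also need to use the values (units, formats, ranges), since attribute labels are often missing from the source text.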
Information Extraction Ontologies [Figure labels: Source, Target, Information Extraction, Information Exchange]
What is an Extraction Ontology? • Augmented conceptual-model instance: object and relationship sets; constraints; data frame value recognizers • Robust wrapper (ontology-based wrapper): extracts information; works even when a site changes or when new sites come on-line
Extraction Ontology: Example
Car [-> object];
Car [0:1] has Year [1:*];
Car [0:1] has Make [1:*];
…
Car [0:*] has Feature [1:*];
PhoneNr [1:*] is for Car [0:1];
Year matches [4] constant {extract "\d{2}"; context "\b'[4-9]\d\b"; …} …
Mileage matches [8] keyword {"\bmiles\b", "\bmi\b.", …} …
…
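To illustrate how recognizer rules of this kind might be applied to an ad, here is a small Python sketch. It is an assumption-laden illustration, not the presenter's system. The year pattern is adapted from the Year rule, dropping the leading `\b` so that an apostrophe after whitespace also matches. The mileage value pattern, the function names, and the sample ad text are all hypothetical.

```python
import re
from typing import Dict, Optional

# Adapted from the Year rule; the leading \b is dropped (an assumption) so
# that an apostrophe at the start of the text or after a space still matches.
YEAR_CONTEXT = re.compile(r"'[4-9]\d\b")
YEAR_EXTRACT = re.compile(r"\d{2}")

# Keyword recognizers in the spirit of the Mileage rule.
MILEAGE_KEYWORDS = [re.compile(r"\bmiles\b", re.I), re.compile(r"\bmi\b", re.I)]
MILEAGE_VALUE = re.compile(r"\d+(?:,\d{3})*")  # assumed value pattern


def extract_year(text: str) -> Optional[str]:
    """Find the context first, then extract the two digits inside it."""
    context = YEAR_CONTEXT.search(text)
    if context is None:
        return None
    # The context always contains two digits, so this search cannot fail.
    return YEAR_EXTRACT.search(context.group(0)).group(0)


def extract_mileage(text: str) -> Optional[str]:
    """Return the last number that appears before a mileage keyword."""
    for keyword in MILEAGE_KEYWORDS:
        hit = keyword.search(text)
        if hit is None:
            continue
        numbers = MILEAGE_VALUE.findall(text[:hit.start()])
        if numbers:
            return numbers[-1]
    return None


def extract_car(text: str) -> Dict[str, Optional[str]]:
    return {"Year": extract_year(text), "Mileage": extract_mileage(text)}


# Made-up ad text for illustration.
record = extract_car("'97 Ford Taurus, 68,000 miles, red, ABS")
```

Because the rules key on the appearance and context of values rather than on page layout, the same recognizers keep working when the surrounding markup changes, which is the robustness claimed for ontology-based wrappers.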
Extraction Ontologies: An Example of Semantic Understanding • “Intelligent” Symbol Manipulation • Gives the “Illusion of Understanding” • Obtains Meaningful and Useful Results
A Variety of Applications • Information Extraction • High-Precision Classification • Schema Mapping • Semantic Web Creation • Agent Communication • Ontology Generation