
Context Knowledge Management for Armament Safety

Context Knowledge Management for Armament Safety. Stuart Madnick, Lynn Wu, MIT Sloan School of Management, {smadnick, linwu}@mit.edu. Information Integration & Re-Use Projects: Context Knowledge Management Approach to “Armament Safety Management”.


Presentation Transcript


  1. Context Knowledge Management for Armament Safety Stuart Madnick, Lynn Wu MIT Sloan School of Management {smadnick, linwu}@mit.edu

  2. Information Integration & Re-Use Projects. Stuart Madnick (smadnick@mit.edu). Context Knowledge Management Approach to “Armament Safety Management”.
     [Diagram of project areas; the grouping of items is not recoverable from the transcript.]
     Area labels: Technologies; Applications; Data Quality; Strategy, Policy & Legal Issues; Security.
     Items shown: COntext INterchange (COIN) (1); RFID; IT infrastructure; financial services (account aggregation); military logistics; security analysis; Total Data Quality (TDQM) Program (5); MIT Information Quality (MIT-IQ) Program; System Dynamics Modeling of State Stability (4); pros and cons of data standards; Stakeholder Perceptions of Security (2); economic model of alternatives to EU Database Directive (3); others.

  3. COntext INterchange (COIN) Project
     • CONTEXT MEDIATION
       - Automatic conflict detection and conversion
       - Derived data
       - Source selection
       - Source attribution
     • INPUT PROCESSING
       - Automatic web wrapping
       - Semi-structured text
       - Multi-source query plan and execution
     • OUTPUT PROCESSING
       - ODBC driver
       - Web publishing
     • Sources: databases, web pages
     • Receivers: applications, browsers
     • TRUSTED AGENTS
     APPLICATIONS: Financial services, electronic commerce, asset visibility, in-transit visibility.

  4. Key COIN Technologies
     • Web Wrapper
       - Extracts selected information from the web (HTML + XML)
       - Allows the web to be treated as a large relational SQL database
       - Can handle dynamic web sites, cookies, “login”, etc.
       - Performs SQL joins and unions involving databases and web sources
     • Context Mediator
       - Resolves semantic (meaning) differences
       - Enables meaningful aggregation and comparison

  5. Context: Multiple Perspectives . . . old lady or young lady ?

  6. Role of Context
     Examples of context-dependent values: 05-06-07, 06-05-07, 07-06-05 (dates); $ ? (currency)
     • CONTEXT VARIATIONS:
       - GEOGRAPHIC (US vs. UK)
       - FUNCTIONAL (CASH MGMT vs. LOANS)
       - ORGANIZATIONAL (CITIBANK vs. CHASE)
     Data: databases, web data, e-mail
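The date example on this slide can be made concrete with a minimal sketch. It parses one string under three date conventions; the context labels and the choice of a two-digit year are assumptions made for illustration, not part of the original slides.

```python
from datetime import datetime

# Hypothetical context labels mapped to strptime patterns (illustrative only).
DATE_CONTEXTS = {
    "US (mm-dd-yy)": "%m-%d-%y",
    "UK (dd-mm-yy)": "%d-%m-%y",
    "year first (yy-mm-dd)": "%y-%m-%d",
}


def interpret(raw):
    """Parse the same string under every context and return ISO dates."""
    return {
        label: datetime.strptime(raw, pattern).date().isoformat()
        for label, pattern in DATE_CONTEXTS.items()
    }


readings = interpret("05-06-07")
for label, iso in readings.items():
    print(f"{label}: {iso}")
```

The same seven characters yield three different calendar dates, which is why the receiver has to know the source's context before comparing or aggregating values.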

  7. Types of Context: Representational, Ontological, Temporal

  8. The 1999 Overture: Unit-of-measure mixup tied to loss of $125 million Mars Orbiter. “NASA’s Mars Climate Orbiter was lost because engineers did not make a simple conversion from English units to metric, an embarrassing lapse that sent the $125 million craft off course. . . . . . . The navigators ( JPL ) assumed metric units of force per second, or newtons. In fact, the numbers were in pounds of force per second as supplied by Lockheed Martin ( the contractor ).” Source: Kathy Sawyer, Boston Globe, October 1, 1999, page 1.

  9. Context Knowledge Management for Armament Safety: Motivation
     • Context Knowledge Management is an important challenge
     • Semantic inconsistency is present in databases, even in the military
     • For example, what does accident rate really mean?
       - Army Ground Accident Rate: # accidents/period-of-time
       - Per year
       - Per month
       - Per total actual personnel strength
       - Per operational personnel strength
     • How do we address such semantic inconsistencies?
     • How do we interpret different accident rates?
     • Need context knowledge management

  10. Motivating Example: Semantic Heterogeneity
      Unit A (weapon A123):
        Accident Rate: 0.01 (context: per week)
        Injury Rate: 77 (context: per month per pro-rated strength)
        Nuclear Test Safety Exclusion Zone: 2500 (context: feet)
        Radioactivity: 1 (context: curie)
      Unit B (weapon A123):
        Accident Rate: 0.52 (context: per year)
        Injury Rate: 170 (context: per month per personnel strength)
        Nuclear Test Safety Exclusion Zone: 762 (context: meters)
        Radioactivity: 3.7 x 10^10 (context: Bq)
      Equivalences between the two contexts: 0.01/week ↔ 0.52/year; 77/week/prs ↔ 170/ps; 2500 feet ↔ 762 meters; 1 curie ↔ 3.7 x 10^10 Bq
      • In the military, there are many ways to measure safety.
      • Accident and injury rates can be measured on a per-week, per-month, or per-year basis.
      • Nuclear testing data generally uses the U.S. Customary measurement system, since most nuclear testing has been done in the US. To conform with international standards, the US government has slowly been trying to convert the units to the metric system. However, even within the metric system, there is confusion between SI units and non-SI units.
      Disclaimer: The data above are artificial and are used for demonstration only.
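The equivalences on this slide are plain unit arithmetic, sketched below under stated assumptions. The record layout and field names are hypothetical. The injury rate is left out because converting between pro-rated and personnel strength needs strength figures the slide does not give.

```python
WEEKS_PER_YEAR = 52        # implied by the slide's 0.01/week <-> 0.52/year
METERS_PER_FOOT = 0.3048   # exact by definition
BQ_PER_CURIE = 3.7e10      # exact by definition


def unit_a_to_unit_b(record):
    """Re-express a Unit A record (hypothetical field names) in Unit B's context."""
    return {
        "weapon": record["weapon"],
        "accident_rate_per_year": record["accident_rate_per_week"] * WEEKS_PER_YEAR,
        "exclusion_zone_m": record["exclusion_zone_ft"] * METERS_PER_FOOT,
        "radioactivity_bq": record["radioactivity_ci"] * BQ_PER_CURIE,
    }


unit_a = {
    "weapon": "A123",
    "accident_rate_per_week": 0.01,
    "exclusion_zone_ft": 2500,
    "radioactivity_ci": 1,
}
converted = unit_a_to_unit_b(unit_a)
print(converted)
```

Once converted, the Unit A values line up with Unit B's figures for the same weapon, so the two units report the same safety picture in different contexts.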

  11. Source Context Differences Source Context

  12. Scenario
      • A general wants to see a composite report on all four units.
      • Direct queries on all four units would return incomparable data.
      • Without mediation, unit B seems to be doing poorly.

  13. Standardization: often not a solution
      • Works in small systems.
      • Legitimate reasons for diversity (e.g., different needs) → multiple standards
        - Unit 1 uses accident rate per year
        - Unit 2 uses accident rate per month
      • Standards are costly to develop
        - DoD started data standardization in 1991; by 2000, it had standardized only ~1.2% of 1 million data elements*
      • Standards do evolve over time
        - Nuclear tests used the US Customary measurement standard; they are now moving toward the SI standard
      * Rosenthal, A., Seligman, L. and Renner, S. (2004) "From Semantic Integration to Semantics Management: Case Studies and a Way Forward", ACM SIGMOD Record, 33(4), 44-50.

  14. The Context Interchange Approach
      Components: shared ontologies, conversion libraries, context management administrator, context mediator (steps numbered 1 to 3 in the diagram).
      Concept: accident rate. Source context: per week. Receiver context: per year. Conversion f(): per week → per year.
      Receiver query: Select accidentRate From unitA
      Mediated query: Select accidentRate x 52 From unitA
      Transformation: 0.01 (source context) → 0.52 (receiver context)
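The query rewriting on this slide can be sketched as follows. This is a minimal illustration and not the COIN mediator itself: the context table, the receiver name `general`, and the function names are assumptions introduced here.

```python
# Hypothetical context declarations: the time basis each party uses for accidentRate.
CONTEXTS = {"unitA": "per_week", "general": "per_year"}

# How many reporting periods of each basis fit in one year.
PERIODS_PER_YEAR = {"per_week": 52, "per_month": 12, "per_year": 1}


def conversion_factor(source_basis, receiver_basis):
    """Factor that turns a rate in the source basis into the receiver basis."""
    return PERIODS_PER_YEAR[source_basis] / PERIODS_PER_YEAR[receiver_basis]


def mediate(column, source, receiver):
    """Rewrite the receiver's query so the source's values arrive in receiver context."""
    factor = conversion_factor(CONTEXTS[source], CONTEXTS[receiver])
    expr = column if factor == 1 else f"{column} * {factor:g}"
    return f"SELECT {expr} FROM {source}"


mediated = mediate("accidentRate", "unitA", "general")
print(mediated)
```

The receiver keeps writing `Select accidentRate From unitA`; the factor of 52 comes from comparing the two declared contexts rather than from anything hard-coded in the query.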

  15. Aggregated results in receiver context of Unit C

  16. Conclusion
      • Many different contexts are used to evaluate safety measurements within the military.
      • An aggregator is needed to gather and integrate data from the various sources.
      • Automatic context mediation plays a critical role
      • Context Interchange enables meaningful aggregation
      • For more information: http://context2.mit.edu/coin

  17. Another Example: Regional Comparison Shoppers (US, Sweden, France, UK)

  18. COIN Conceptual Model (Ontology)

  19. Ontology and Conversion Function
      Ontology terms shown: basic, temporalEntity, monetaryValue, price, taxRate, organization, format, currency, scaleFactor, kind
      Legend: is_a relationship, attribute, modifier
      Contexts:
        context_a: currency: ‘KRW’; scaleFactor: 1000; kind: base; format: yyyy.mm.dd
        context_b: currency: ‘TRL’; scaleFactor: 1e6; kind: base+tax; format: dd-mm-yyyy
        context_c: currency: ‘USD’; scaleFactor: 1; kind: base+tax+SH; format: mm/dd/yyyy
        context_d is_a context_b: scaleFactor: 1e3
        context_e is_a context_d: format: yyyy-mm-dd
        context_f is_a context_c: kind: base+tax
      Example source: src_turkey(Product, Vendor, QuoteDate, Price)
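Context inheritance on this slide can be read as "start from the parent's modifier values and override the ones the child restates". The sketch below encodes the six contexts that way; the dictionary layout and the `resolve` helper are assumptions for illustration, not COIN's actual representation.

```python
# Contexts from the slide; "is_a" names the parent context, if any.
CONTEXTS = {
    "context_a": {"modifiers": {"currency": "KRW", "scaleFactor": 1000,
                                "kind": "base", "format": "yyyy.mm.dd"}},
    "context_b": {"modifiers": {"currency": "TRL", "scaleFactor": 1e6,
                                "kind": "base+tax", "format": "dd-mm-yyyy"}},
    "context_c": {"modifiers": {"currency": "USD", "scaleFactor": 1,
                                "kind": "base+tax+SH", "format": "mm/dd/yyyy"}},
    "context_d": {"is_a": "context_b", "modifiers": {"scaleFactor": 1e3}},
    "context_e": {"is_a": "context_d", "modifiers": {"format": "yyyy-mm-dd"}},
    "context_f": {"is_a": "context_c", "modifiers": {"kind": "base+tax"}},
}


def resolve(name):
    """Return the full modifier set of a context, walking up the is_a chain."""
    ctx = CONTEXTS[name]
    parent = ctx.get("is_a")
    resolved = resolve(parent) if parent else {}
    resolved.update(ctx["modifiers"])  # child values override inherited ones
    return resolved


context_e = resolve("context_e")
print(context_e)
```

Under this reading, context_e inherits the currency and price kind of context_b, the scale factor of context_d, and supplies its own date format.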

  20. Demo – Same Context: no semantic differences; meaningful data returned.

  21. Compose only relevant conversions (b → e)
      (a) Select Vendor, Price From src_turkey Where Product="Samsung SyncMaster 173P";
          Conversion for scale factor
      (b) Select Vendor, QuoteDate, Price From src_turkey Where Product="Samsung SyncMaster 173P";
          Conversion for date format
          Conversion for scale factor
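A rough sketch of why query (a) needs one conversion and query (b) needs two: only modifiers that differ between source and receiver context, and that apply to a selected column, call for a conversion. The receiver values below are context_e with its inherited values written out, and the column-to-modifier mapping is an assumption made for illustration.

```python
# Source context_b and receiver context_e (inherited values written out).
SOURCE = {"currency": "TRL", "scaleFactor": 1e6,
          "kind": "base+tax", "format": "dd-mm-yyyy"}
RECEIVER = {"currency": "TRL", "scaleFactor": 1e3,
            "kind": "base+tax", "format": "yyyy-mm-dd"}

# Assumed mapping from src_turkey columns to the modifiers that affect them.
COLUMN_MODIFIERS = {
    "Vendor": [],
    "QuoteDate": ["format"],
    "Price": ["currency", "scaleFactor", "kind"],
}


def needed_conversions(columns):
    """List (column, modifier) pairs whose values differ between the two contexts."""
    needed = []
    for column in columns:
        for modifier in COLUMN_MODIFIERS[column]:
            if SOURCE[modifier] != RECEIVER[modifier]:
                needed.append((column, modifier))
    return needed


query_a = needed_conversions(["Vendor", "Price"])
query_b = needed_conversions(["Vendor", "QuoteDate", "Price"])
print(query_a)
print(query_b)
```

Currency and price kind agree between the two contexts, so no conversion is composed for them even though Price is selected.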

  22. Auto-reconciliation for auxiliary source (b → f). Conversion introduced because of context difference in auxiliary source.

  23. Detection and Explication (b → a)

  24. Mediated Query (b → a). Annotations on the query: date format for receiver; price definition (remove tax); scale factor; date format for auxiliary source olsen; currency.

  25. Interoperate: hard-wired approaches
      (a) BFS approach: brute force between pair-wise sources (diagram: sources 1 to 6, each connected to every other)
      (b) BFC approach: brute force between contexts (diagram: sources 1 to 6 grouped under the three contexts below)
      (c) Internal standard approach: adopting a standard (diagram: sources 1 to 6, each connected to the internal standard)
      Contexts:
        context_a: currency: ‘KRW’; scaleFactor: 1000; kind: base; format: yyyy-mm-dd
        context_b: currency: ‘TRL’; scaleFactor: 1e6; kind: base+tax; format: dd-mm-yyyy
        context_c: currency: ‘USD’; scaleFactor: 1; kind: base+tax+SH; format: mm/dd/yyyy
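The difference between the three hard-wired approaches can be illustrated by counting conversion programs. The counts below assume one program per ordered pair of parties, which is an assumption made here rather than a figure from the slides; the six sources and three contexts are taken from the diagram.

```python
def pairwise_conversions(n):
    """Directed conversions when each of n parties converts to every other one."""
    return n * (n - 1)


def via_standard_conversions(n):
    """Conversions when each of n sources converts to and from one internal standard."""
    return 2 * n


sources, contexts = 6, 3
bfs = pairwise_conversions(sources)       # (a) brute force between sources
bfc = pairwise_conversions(contexts)      # (b) brute force between contexts
its = via_standard_conversions(sources)   # (c) internal standard

print(f"BFS: {bfs}, BFC: {bfc}, internal standard: {its}")
```

Under these assumptions the pairwise count grows quadratically with the number of sources, which is the scaling concern the next slides address.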

  26. Flexibility and Scalability
      • Hard-wired approaches: not flexible. Need to update/add many conversion programs.
      • Why can other approaches not fully benefit from general-purpose conversions?
        - The decision whether to invoke the conversion is in the conversion program
      • COIN: flexible. Update the declarative knowledge base.

  27. How COIN Scales
      • Semantic differences cannot be standardized away
      • Must be flexible and scalable
      • Component conversions are defined for each modifier
      • Overall conversions are automatically composed by an abductive reasoning engine
      • Composition via symbolic equation solver and a shortest path algorithm
      • Inheritance enabled
      • COIN is a good solution
      • Modularization, declarativeness
      • Automatic composition of necessary conversions
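The shortest-path part of the composition step can be sketched with a breadth-first search over component conversions. This is only an illustration of the idea: the edge table is a hypothetical conversion library for one modifier (the time basis of a rate), and the symbolic equation solver and abductive engine are not modeled.

```python
from collections import deque

# Hypothetical component conversions between time bases of a rate.
EDGES = {
    ("per_week", "per_year"): lambda rate: rate * 52,
    ("per_year", "per_week"): lambda rate: rate / 52,
    ("per_year", "per_month"): lambda rate: rate / 12,
    ("per_month", "per_year"): lambda rate: rate * 12,
}


def shortest_path(start, goal):
    """Breadth-first search for the fewest component conversions from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for src, dst in EDGES:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


def compose(start, goal):
    """Return the path and a function that applies each conversion along it."""
    path = shortest_path(start, goal)
    if path is None:
        raise ValueError(f"no conversion from {start} to {goal}")

    def convert(value):
        for src, dst in zip(path, path[1:]):
            value = EDGES[(src, dst)](value)
        return value

    return path, convert


path, week_to_month = compose("per_week", "per_month")
print(path, week_to_month(0.01))
```

No per-week to per-month conversion is declared, yet one is available by chaining two declared components, so adding a new basis means adding component conversions rather than one program per pair.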

  28. The 1805 Overture In 1805, the Austrian and Russian Emperors agreed to join forces against Napoleon. The Russians promised that their forces would be in the field in Bavaria by Oct. 20. The Austrian staff planned its campaign based on that date in the Gregorian calendar. Russia, however, still used the ancient Julian calendar, which lagged 10 days behind. The calendar difference allowed Napoleon to surround Austrian General Mack's army at Ulm and force its surrender on Oct. 21, well before the Russian forces could reach him, ultimately setting the stage for Austerlitz. Source: David Chandler, The Campaigns of Napoleon, New York: MacMillan 1966, pg. 390.

  29. EXTRA SLIDES

  30. Yet Another Context Example (Basis for Demo)
      Context Mediation Services connect application users and systems to the sources; Wrapper Services front the web source.
      Datastream: Company Name DAIMLER-BENZ; Net Income 614,995; Sales 97,736,992
      WorldScope: Company Name DAIMLER-BENZ AG; Net Income 346,577; Sales 56,268,168
      Disclosure: Company Name DAIMLER BENZ CORP; Net Income 615,000,000; Sales 97,737,000,000
      OANDA Web Server (O&A DEM-USD exchange rate): 1.00 German Mark = 0.58 US Dollar as of 12/31/93

  31. Some Context Differences: Context Definitions
      Currency used: Disclosure: country of incorporation; Worldscope: USD; DataStream: country of incorporation
      Currency conversion: Disclosure: Money Amount, As_Of_Date; Worldscope: Money Amount, As_Of_Date; DataStream: Money Amount, As_Of_Date
      Currency symbols: Disclosure: 3 letters; Worldscope: 3 letters; DataStream: 2 letters
      Scale factor: Disclosure: 1; Worldscope: 1000; DataStream: 1000
      Company names: Disclosure: Disclosure names; Worldscope: Worldscope names; DataStream: DataStream names
      Date style: Disclosure: American with ‘/’ as separator; Worldscope: American with ‘/’ as separator; DataStream: European with ‘-’ as separator
      Olsen (OANDA) web source uses 3-letter currency symbols and European date style with ‘/’ as a separator.

  32. Domain Model
      [Diagram; legend: inheritance, attribute, modifier.]
      Terms shown: number, string, date, currencyType, exchangeRate, companyFinancials, companyName, countryName, company, currency, country, countryIncorp, officialCurrency, fromCur, toCur, txnDate, fyEnding, scaleFactor, format, curTypeSym, dateFmt
      • Some currency context possibilities:
        - Currency is stated explicitly as part of record
        - Currency not stated, but the same for all (e.g., US $)
        - Currency not stated or constant, but inferred by country

  33. COIN System Architecture
      [Diagram; connections between components are not recoverable from the transcript.]
      CLIENT PROCESSES: Web Client (cgi-scripts); ODBC-compliant Apps (e.g. Microsoft Excel); ODBC-Driver
      MEDIATOR PROCESSES: HTTPD-Daemon; SQL Compiler; Context Mediator; COIN Repository; Optimizer; Executioner; Data Store for Intermediate Results
      SERVER PROCESSES: WWW Gateway; Wrapper; HTTPD-Daemon; Web-site
      Flow labels: SQL Query; Datalog Query; Mediated Query; Optimized Query Plan; Results

  34. System Demonstration Single Source Queries with Mediation Q6. Scenario: Using Context Interchange, you can look at the Disclosure data using Datastream Context. Query: Find out from Disclosure what Net Income for DAIMLER-BENZ was. Use Datastream Context. Capabilities Demonstrated: Ability to perform Scale Factor Conversion, Date Format Conversion, Company Name Conversion.
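The three conversions named in Q6 can be sketched for a single record. The scale factors, date styles, and company names come from the earlier context slides; the record layout, the two-digit-year formats, and the as-of date are assumptions made for illustration.

```python
from datetime import datetime

# Company name as Disclosure writes it, mapped to the Datastream spelling.
NAME_MAP = {"DAIMLER BENZ CORP": "DAIMLER-BENZ"}

DISCLOSURE_SCALE = 1      # Disclosure reports amounts in single units
DATASTREAM_SCALE = 1000   # Datastream reports amounts in thousands


def disclosure_to_datastream(record):
    """Re-express a Disclosure record (hypothetical layout) in Datastream context."""
    # American style with '/' -> European style with '-' (two-digit year assumed).
    as_of = datetime.strptime(record["as_of_date"], "%m/%d/%y")
    return {
        "company_name": NAME_MAP[record["company_name"]],
        "net_income": record["net_income"] * DISCLOSURE_SCALE / DATASTREAM_SCALE,
        "as_of_date": as_of.strftime("%d-%m-%y"),
    }


disclosure_record = {
    "company_name": "DAIMLER BENZ CORP",
    "net_income": 615_000_000,
    "as_of_date": "12/31/93",  # illustrative date, not taken from the demo
}
mediated_record = disclosure_to_datastream(disclosure_record)
print(mediated_record)
```

No currency conversion appears because both sources report in the currency of the country of incorporation; the result of 615,000 (thousands) is close to the 614,995 that Datastream itself reports.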

  35. Demonstration @ context2.mit.edu Source Context

  36. Context Metadata (Partial)

  37. Conflict Detection and Mediation. Mediated query in Datalog, annotated with: date conversion, scale factor conversion, name conversion.

  38. Mediated SQL Query & Result. Mediated SQL query, annotated with: scale factor adjustment, date format conversion, name conversion. Final results: from Disclosure but in Datastream context.

  39. More Complex Example (4 sources: DB + Web; databases and a web source)
      select WorldcAF.TOTAL_ASSETS, DiscAF.NET_SALES, DiscAF.NET_INCOME,
             DStreamAF.TOTAL_EXTRAORD_ITEMS_PRE_TAX, quotes.Last
      from WorldcAF, DiscAF, DStreamAF, quotes
      where WorldcAF.COMPANY_NAME = "DAIMLER-BENZ AG"
        and DStreamAF.AS_OF_DATE = "01/05/94"
        and WorldcAF.COMPANY_NAME = DStreamAF.NAME
        and WorldcAF.COMPANY_NAME = DiscAF.COMPANY_NAME
        and WorldcAF.COMPANY_NAME = quotes.Cname;

  40. Conflict Table (1st part)

  41. Conflict Table (2nd part)

  42. Generated SQL (1st Part) select worldcaf.total_assets, discaf.net_sales, ((discaf.net_income*0.001)*olsen.rate), (dstreamaf2.total_extraord_items_pre_tax*olsen2.rate), quotes.Last from (select date1, 'European Style -', '01/05/94', 'American Style /' from datexform where format1='European Style -' and date2='01/05/94' and format2='American Style /') datexform, (select dt_names, 'DAIMLER-BENZ AG' from name_map_dt_ws where ws_names='DAIMLER-BENZ AG') name_map_dt_ws, (select ds_names, 'DAIMLER-BENZ AG' from name_map_ds_ws where ws_names='DAIMLER-BENZ AG') name_map_ds_ws, (select 'DAIMLER-BENZ AG', ticker, exc from ticker_lookup2 where comp_name='DAIMLER-BENZ AG') ticker_lookup2, (select 'DAIMLER-BENZ AG', latest_annual_financial_date, current_outstanding_shares, net_income, sales, total_assets, country_of_incorp from worldcaf where company_name='DAIMLER-BENZ AG') worldcaf, (select country, currency from currencytypes where currency <> 'USD') currencytypes, (select exchanged, 'USD', rate, date from olsen where expressed='USD') olsen, (select company_name, latest_annual_data, current_shares_outstanding, net_income, net_sales, total_assets, location_of_incorp from discaf) discaf,

  43. Generated SQL (Continued - Partial) (select as_of_date, name, total_sales, total_extraord_items_pre_tax, earned_for_ordinary, currency from dstreamaf) dstreamaf, (select as_of_date, name, total_sales, total_extraord_items_pre_tax, earned_for_ordinary, currency from dstreamaf) dstreamaf2, (select char3_currency, char2_currency from currency_map where char3_currency <> 'USD') currency_map, (select country, currency from currencytypes where currency <> 'USD') currencytypes2, (select exchanged, 'USD', rate, '01/05/94' from olsen where expressed='USD' and date='01/05/94') olsen2, (select Cname, Last from quotes) quotes where currencytypes.country = discaf.location_of_incorp and currencytypes.currency = olsen.exchanged and dstreamaf.currency = dstreamaf2.currency and dstreamaf2.currency = currency_map.char2_currency and olsen.date = discaf.latest_annual_data and currency_map.char3_currency = currencytypes2.currency and currencytypes2.currency = olsen2.exchanged and name_map_dt_ws.dt_names = dstreamaf2.name and name_map_ds_ws.ds_names = discaf.company_name and ticker_lookup2.ticker = quotes.Cname and datexform.date1 = dstreamaf2.as_of_date and currencytypes.currency <> 'USD' and currency_map.char3_currency <> 'USD' union select worldcaf2.total_assets, discaf2.net_sales, ((discaf2.net_income*0.001)*olsen3.rate), dstreamaf4.total_extraord_items_pre_tax, quotes2.Last from (select date1, 'European Style -', '01/05/94', 'American Style /' from datexform where format1='European Style -' and date2='01/05/94' and format2='American Style /') datexform2, (select dt_names, 'DAIMLER-BENZ AG' from name_map_dt_ws where ws_names='DAIMLER-BENZ AG') name_map_dt_ws2, (select ds_names, 'DAIMLER-BENZ AG' from name_map_ds_ws where ws_names='DAIMLER-BENZ AG') name_map_ds_ws2, (select 'DAIMLER-BENZ AG', ticker, exc from ticker_lookup2 where comp_name='DAIMLER-BENZ AG') ticker_lookup22, (select 'DAIMLER-BENZ AG', latest_annual_financial_date, current_outstanding_shares, net_income, sales, 
total_assets, country_of_incorp from worldcaf where company_name='DAIMLER-BENZ AG') worldcaf2, (select country, currency from currencytypes where currency <> 'USD') currencytypes3, (select exchanged, 'USD', rate, date from olsen where expressed='USD') olsen3, (select company_name, latest_annual_data, current_shares_outstanding, net_income, net_sales, total_assets, location_of_incorp from discaf) discaf2, (select as_of_date, name, total_sales, total_extraord_items_pre_tax, earned_for_ordinary, currency from dstreamaf) dstreamaf3, (select 'USD', char2_currency from currency_map where char3_currency='USD') currency_map2, etc

  44. Final Result

  45. Execution Trace (1st Part - Partials) Parallel Execution . . . Retrieving data From Web source

  46. Execution Trace (Continued - Partials) . . . Stock price returned From Web source Another Web source used (for currency conversion) . . .

  47. Appendix: Sample Applications
      • Airfare, Car Rental and Merged Travel
      • Weather
      • Global Price Comparison
      • Airfare Aggregation
      • Disaster Relief
      • TASC Financial Example
      • Web Services Demo
      • Corporate Householding

  48. Appendix: COIN Web-Wrapper Technology
      User or program (via SQL query): Select Edgar.Net_income From Edgar Where Edgar.Ticker=intc and Edgar.Form=10-Q
      Web Wrapper Generator: driven by a web page spec file*, with an HTML side (the web page) and an SQL side (the query and result)
      Data record returned: Ticker INTC; Net Income 1,983
      * Spec file contains: schema, navigation rules, and extraction rules.
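The wrapper idea (a spec file with a schema and extraction rules that lets a page be queried with SQL) can be sketched as below. This is not the COIN wrapper: the HTML snippet, the spec layout, and the regular expression are hypothetical, navigation rules and the Form=10-Q filter are omitted, and an in-memory SQLite table stands in for the SQL side.

```python
import re
import sqlite3

# Hypothetical page content standing in for a fetched filing page.
HTML = """
<table>
  <tr><td class="ticker">INTC</td><td class="net-income">1,983</td></tr>
</table>
"""

# Hypothetical spec: relation name, schema, and one extraction rule.
SPEC = {
    "table": "Edgar",
    "schema": ["Ticker", "Net_income"],
    "extract": re.compile(
        r'<td class="ticker">(\w+)</td><td class="net-income">([\d,]+)</td>'
    ),
}


def wrap(html, spec):
    """Apply the extraction rule to the page and expose the rows as an SQL table."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(spec["schema"])
    conn.execute(f"CREATE TABLE {spec['table']} ({columns})")
    rows = [
        (ticker, int(amount.replace(",", "")))
        for ticker, amount in spec["extract"].findall(html)
    ]
    conn.executemany(f"INSERT INTO {spec['table']} VALUES (?, ?)", rows)
    return conn


conn = wrap(HTML, SPEC)
result = conn.execute(
    "SELECT Edgar.Net_income FROM Edgar WHERE Edgar.Ticker = 'INTC'"
).fetchall()
print(result)
```

Because the wrapped page behaves like a table, the same SQL engine can join it with ordinary database tables, which is the property the Key COIN Technologies slide attributes to the web wrapper.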
