
HNC Data Alignment Research Direction

Richard Rohwer, Senior Principal Scientist, Advanced Technologies, HNC Software / Fair Isaac


Presentation Transcript


  1. HNC Data Alignment Research Direction. Richard Rohwer, Senior Principal Scientist, Advanced Technologies, HNC Software / Fair Isaac

  2. Cognition needs Semantics needs Massive Data
     • Theorem: Probability distributions are the UNIQUE logically consistent knowledge representation.
     • Semantics / Meaning = Association Statistics
     • Diagram labels: KNOWLEDGE, Tacit Knowledge, includes, Explicit Knowledge, Statistics, Information Organization, Reasoning, Statistics, Massive Data

  3. From massive data to machine cognition: The technical principles
     • Mathematical ingredients:
       • Association-Grounded Semantics (AGS): to capture meaning mathematically.
       • Semantically-Driven Segmentation (SDS): to extract the most meaningful patterns.
       • Distributional Alignment (DA): to compare meanings abstractly.
     • Semantically Enriched Reasoning Engine: to think in terms of meanings instead of symbols.

  4. Association-Grounded Semantics (AGS): Meaning = Usage
     Figure: groups of words with similar usage (figure label: Cables)
     • fro onto reaching acrs btwn beyond frm inside alg across via thru ovr around near between within through into over by from at
     • jun sept apr jul nov oct dec aug feb sep jan
     • bsb msj tng opv adm atm cpo bdo notal u b
     • captain mr gen msgt ltc tsgt cpt sgt ssgt capt maj lt
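The idea that meaning is usage can be illustrated with a minimal sketch, assuming a toy corpus and a one-word context window (both are illustrative choices, not taken from the slides). Each word is represented by the probability distribution of its neighboring words, and two words are compared with the Jensen-Shannon divergence. The function names `context_distributions` and `js_divergence` are hypothetical, and the slides do not say which statistics or divergence AGS actually uses.

```python
from collections import Counter, defaultdict
import math


def context_distributions(sentences, window=1):
    """Map each word to the normalized distribution of words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    dists = {}
    for word, ctr in counts.items():
        total = sum(ctr.values())
        dists[word] = {ctx: n / total for ctx, n in ctr.items()}
    return dists


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions; 0 = same usage, 1 = disjoint usage."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy corpus (an assumption for illustration only).
corpus = [
    "capt smith arrived in jan",
    "maj smith arrived in feb",
    "capt jones departed in jan",
    "maj jones departed in feb",
]
dists = context_distributions(corpus)
month_vs_month = js_divergence(dists["jan"], dists["feb"])
month_vs_rank = js_divergence(dists["jan"], dists["capt"])
print(month_vs_month, month_vs_rank)
```

In this toy corpus the two month abbreviations share all of their contexts while a month and a rank share none, so the months come out closer to each other than to the rank, which is the kind of grouping the figure on this slide shows.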

  5. Distributional Alignment (DA): Abstraction ~ Structural Commonality
     • Align semantic spaces by distribution of content.
       • No need to understand content.
     • Transport meaning between
       • Languages
       • Dialects
       • Cultures
     • Transport metaphorically between topics.
     • transLign algorithm:
       • No language knowledge.
       • No tie words.
       • No aligned corpora.
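Aligning by structure alone can be sketched on a toy scale. The slides do not describe how transLign works, so the following is not that algorithm: it is a brute-force stand-in that tries every one-to-one mapping between two small vocabularies and keeps the one under which the bigram statistics of the two corpora agree best. The function names and both corpora are hypothetical, and the target corpus is deliberately a re-labeled, reordered copy of the source so that an exact structural match exists; real corpora would match only approximately and would need a search that scales beyond a handful of words.

```python
from collections import Counter
from itertools import permutations


def bigram_counts(sentences):
    """Return (sorted vocabulary, Counter of ordered adjacent word pairs)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        counts.update(zip(tokens, tokens[1:]))
    vocab = sorted({word for pair in counts for word in pair})
    return vocab, counts


def mismatch(mapping, src_vocab, src_counts, tgt_counts):
    """Total absolute difference in bigram counts after relabeling source words via `mapping`."""
    return sum(
        abs(src_counts[(a, b)] - tgt_counts[(mapping[a], mapping[b])])
        for a in src_vocab
        for b in src_vocab
    )


def align_by_structure(src_sentences, tgt_sentences):
    """Exhaustively search one-to-one word mappings; feasible only for tiny vocabularies."""
    src_vocab, src_counts = bigram_counts(src_sentences)
    tgt_vocab, tgt_counts = bigram_counts(tgt_sentences)
    if len(src_vocab) != len(tgt_vocab):
        raise ValueError("this toy search assumes equally sized vocabularies")
    candidates = (dict(zip(src_vocab, perm)) for perm in permutations(tgt_vocab))
    return min(candidates, key=lambda m: mismatch(m, src_vocab, src_counts, tgt_counts))


# Source corpus and an invented "obfuscated" corpus with the same structure.
source = ["the cat sat", "the cat ran", "the dog sat", "a cat sat"]
target = ["o mip lor", "zu vab lor", "zu mip kel", "zu mip lor"]
mapping = align_by_structure(source, target)
print(mapping)
```

No word in `source` appears in `target`, and the search is given no dictionary or sentence pairing; the mapping is recovered only because each word occupies a distinct position in the co-occurrence structure.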

  6. Alignment: Terminology
     • Polysemy (Sense resolution): “bank note”, “river bank”, “bank”
     • Information Loss (Unequal expressive power)
     • Automation: AGS techniques do not require manually constructed resources, but can use them when available.
     • Good solutions from NIMD:
       • Entity Disambiguation (5.5% err vs. 13.5% err in KDD)
       • General terms
     • Diagram labels: What ‘cha call it?; AGS Semantic Space; Cable English; Foreign Newswire; RP English; Less Commonly Taught Language; Newswire English; Terror Cell Obfuscated Slang; Blog Dialects; Institutional Dialects; Professional Dialects; fluffy snow; Naïve Bayes

  7. Alignment: Schemata
     Figure: two schemata shown side by side, with these labels:
     • Natural Language Corpora (one per schema)
     • Semantic Alignment (at the Table name and Column name levels)
     • Table name, Column name, Instance (repeated for each schema)
     • Instance Statistics (Joined across schema)
     • Structural Alignment (between the two Schema Graphs)
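One way to read the "Instance Statistics" label is that columns from two schemata can be matched by comparing the distributions of the values they hold rather than their names. The sketch below assumes that reading; the tables, the column names, and the choice of total variation distance are all illustrative assumptions, and it covers only the instance-statistics part of the figure, not the semantic or structural alignment.

```python
from collections import Counter


def value_distribution(values):
    """Relative frequency of each distinct value in a column."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two sparse distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def match_columns(table_a, table_b):
    """For each column of table_a, pick the table_b column with the closest value distribution."""
    dists_b = {name: value_distribution(vals) for name, vals in table_b.items()}
    matches = {}
    for name_a, vals in table_a.items():
        p = value_distribution(vals)
        matches[name_a] = min(dists_b, key=lambda name_b: total_variation(p, dists_b[name_b]))
    return matches


# Two hypothetical tables whose column names carry no shared vocabulary.
personnel = {
    "rank": ["capt", "maj", "capt", "lt"],
    "month": ["jan", "feb", "jan", "mar"],
}
roster = {
    "f1": ["feb", "jan", "jan", "apr"],
    "f2": ["maj", "capt", "lt", "capt"],
}
column_matches = match_columns(personnel, roster)
print(column_matches)
```

This greedy matching can assign two source columns to the same target column; a fuller treatment would enforce a one-to-one assignment and combine the result with the name-level and graph-level evidence shown in the figure.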

  8. Alignment: Ontologies
     • More complex graph structure
       • Reflecting multiple (transitive) relations
       • is-a, part-of, reports-to, prerequisite-for, …
     • Implies more options for defining AGS statistics
       • More relations, more ways to define co-occurrence.
     • Big Picture issue:
       • Ontological structure makes general statements about instances of relationships within data.
       • So does AGS.
       • How are these related?
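The point that more relations give more ways to define co-occurrence can be made concrete with a small sketch. It assumes the ontology is stored as (subject, relation, object) triples and treats two entities as co-occurring when a chosen relation links them, either directly or through its transitive closure. The triples and function names are hypothetical; only the relation names `is-a` and `reports-to` come from the slide.

```python
from collections import defaultdict


def transitive_closure(pairs):
    """Given (source, target) pairs, return every (source, reachable target) pair."""
    successors = defaultdict(set)
    for source, target in pairs:
        successors[source].add(target)
    closure = set()
    for start in list(successors):
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if (start, node) not in closure:
                closure.add((start, node))
                # .get avoids inserting new keys into the defaultdict while iterating.
                stack.extend(successors.get(node, ()))
    return closure


def associations(triples, relation, transitive=False):
    """Map each subject to the set of entities it co-occurs with under one relation."""
    pairs = {(s, o) for s, r, o in triples if r == relation}
    if transitive:
        pairs = transitive_closure(pairs)
    related = defaultdict(set)
    for s, o in pairs:
        related[s].add(o)
    return dict(related)


# Hypothetical ontology fragment.
triples = [
    ("sgt", "is-a", "enlisted"),
    ("enlisted", "is-a", "soldier"),
    ("capt", "is-a", "officer"),
    ("officer", "is-a", "soldier"),
    ("sgt", "reports-to", "capt"),
]
direct_isa = associations(triples, "is-a")
closed_isa = associations(triples, "is-a", transitive=True)
reporting = associations(triples, "reports-to")
print(direct_isa["sgt"], closed_isa["sgt"], reporting["sgt"])
```

The same entity gets three different association sets depending on the relation and on whether transitivity is applied, so each choice would yield a different set of statistics to feed into an AGS-style representation.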
