

A Real-World Knowledge Engineering Application: The NeuroScholar Project

Gully APC Burns

K. M. Research Group University of Southern California


Structure of the presentation

  • Ideas & Concepts

  • Design

  • Implementation

  • Demonstration



I. Ideas & Concepts

In which we are reminded of what most people think knowledge is, how it is currently used (and misused) and how we might improve matters.


What does the word ‘Knowledge’ mean?

  • Main Entry: knowl·edge
    Pronunciation: 'nä-lij
    Function: noun
    Etymology: Middle English knowlege, from knowlechen to acknowledge, irregular from knowen
    Date: 14th century
    1 obsolete: COGNIZANCE
    2 a (1): the fact or condition of knowing something with familiarity gained through experience or association (2): acquaintance with or understanding of a science, art, or technique b (1): the fact or condition of being aware of something (2): the range of one's information or understanding <answered to the best of my knowledge> c: the circumstance or condition of apprehending truth or fact through reasoning: COGNITION d: the fact or condition of having information or of being learned <a man of unusual knowledge>
    3 archaic: SEXUAL INTERCOURSE
    4 a: the sum of what is known: the body of truth, information, and principles acquired by mankind b archaic: a branch of learning

[from http://www.m-w.com/]


The published literature

… is the end-product of research and as such forms the basis for human understanding of the subject

… is very valuable.

… is structured.

… is interpretable.

Image taken from U.S. Geological Survey Energy Resource Surveys Program


The published literature

… is large and unwieldy.

… has varying reliability.

… is inconsistent.

… is based on natural language.

… is difficult to automate.

… is terse.

… is qualitative.

… is 2-D.



The published literature

… is a valid target for attack with informatics-based methods.

This permits …

(a) Increased clarification through formalization

(b) large-scale data-handling capability

(c) analysis of existing data to examine organization



A semantic continuum

  • [Mike Uschold, Boeing Corp]

  • Implicit: shared human consensus

  • Informal (explicit): text descriptions

  • Formal (for humans): semantics hardwired; used at runtime

  • Formal (for machines): semantics processed and used at runtime

Annotations on the diagram:

  • The current status of ‘theory’ in Neuroscience

  • How we would like neuroscientists to think

  • Where we would like to work

  • Further to the right means:

  • Less ambiguity

  • More likely to have correct functionality

  • Better inter-operation (hopefully)

  • Less hardwiring

  • More robust to change

  • More difficult


What’s wrong with this picture? … from a neuroscientist’s point of view …

Number of structures = 500 x 2

Number of cell groups per structure = 10

Number of possible connections between cell groups = 10,000 x 10,000 = 10^8

Estimated number of connections between cell groups = 250,000

From Swanson (1998), “Brain Maps, Structure of the Rat Brain”, 2nd edition, Elsevier, Amsterdam.
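
The arithmetic on this slide can be checked in a few lines. This is only a sketch using the slide's own figures; the "fraction realized" value is derived here and does not appear in the original presentation.

```python
# Figures as given on the slide (attributed there to Swanson, 1998).
structures = 500 * 2                 # "500 x 2", as written on the slide
cell_groups_per_structure = 10
cell_groups = structures * cell_groups_per_structure

# Every ordered pair of cell groups is counted as a possible connection.
possible_connections = cell_groups * cell_groups
estimated_connections = 250_000

# Derived here (not on the slide): share of possible connections thought to exist.
fraction_realized = estimated_connections / possible_connections

print(f"cell groups:          {cell_groups:,}")
print(f"possible connections: {possible_connections:,}")
print(f"fraction realized:    {fraction_realized:.2%}")
```

Even at a fraction of a percent of the possible pairs, a quarter of a million connections is far more than a reader can track by hand, which is the point the slide is making.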


… it’s even worse than that …

  • Neuroscience is extremely multidisciplinary

  • Spatial Scales of Measurement: 10^1 – 10^-9 m

  • Temporal Scales of Measurement: 70 yrs (2.21 x 10^9 s) to 10^-3 s (not even including evolutionary time!)

  • Study occurs in a heterogeneous theoretical framework involving:

  • Anatomy, Physiology, Psychology, Ethology, Biochemistry (Molecular Biology, Genetics, Bioinformatics), Biophysics, Behavioural Ecology, Biology … to name a few…

  • All of these subjects are specialized, so it is hard to link work between disciplines and across levels


… & it’s even worse than that !!!

  • Neuroanatomical nomenclature is the closest thing that neuroscience has to a standardized framework…

  • In any given paper, the same name may be used for different structures, or different names may be used for the same structure.

  • e.g., ‘Globus Pallidus, pars medialis (GPm)’ is also called the ‘Entopeduncular Nucleus’ by others.

  • See the index of Swanson (1998), “Brain Maps, Structure of the Rat Brain”, 2nd edition, Elsevier, Amsterdam, for a list of synonyms according to one source.
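
One simple way to handle such synonyms is a lookup table that maps every known name to a single canonical label. The sketch below is illustrative only: the table, the function names and the choice of "GPm" as the canonical label are assumptions, and only the GPm / Entopeduncular Nucleus pair comes from the slide.

```python
# Hypothetical synonym table: each known name maps to one canonical label.
# Only the GPm / Entopeduncular Nucleus pair is taken from the slide.
SYNONYMS = {
    "globus pallidus, pars medialis": "GPm",
    "gpm": "GPm",
    "entopeduncular nucleus": "GPm",
}


def canonical_name(name):
    """Return the canonical label for a structure name, or the name unchanged if unknown."""
    key = " ".join(name.lower().split())  # normalize case and whitespace
    return SYNONYMS.get(key, name)


def same_structure(a, b):
    """True if two names resolve to the same canonical label."""
    return canonical_name(a) == canonical_name(b)


print(same_structure("Entopeduncular Nucleus", "Globus Pallidus, pars medialis"))
```

A single global table only covers the "different names, same structure" case. Because the same name may also denote different structures in different papers, a fuller version would probably need one table per source or atlas.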


We restrict the problem space to a specific soluble strategy

  • Describe a given phenomenon (e.g., the stress response).

  • Identify which populations of neurons are involved in the phenomenon (i.e., any neurons that turn on, turn off, change their firing, affect the phenomenon if messed with, etc.).

  • Represent how these populations of neurons are interconnected.

  • Represent the dynamic processes of these neurons that underlie the phenomenon.


A Construct: ‘A Knowledge Model’

  • = A personalized representation of an individual’s knowledge.

  • e.g., A review article is an example of a non-computational knowledge model


Another Construct: ‘Knowledge Landscape’

  • = A map of Knowledge Models (where each KM is timestamped)

  • e.g., A list of the best reviews of a given subject over time is an example of a non-computational knowledge landscape



II. Design

In which all of these high-falutin’ ideas are put into a logical design and it becomes clear that the design criteria of the NeuroScholar project distinguish it from pure research in computer science


Some design requirements

  • In order of importance

  • Powerful & enabling to neuroscientists in their everyday work

  • Easy to use! (i.e., free, multi-platform, one-click installation)

  • Knowledge acquisition / data collation is the rate limiting step

  • Open-source for future development as an academic project.


Knowledge Landscapes

NeuroScholar Screenshot- (dummy data)


Knowledge Landscapes

NeuroScholar Screenshot (dummy data), with labeled parts: ‘Knowledge Landscape’, ‘Data Collection’, ‘Fragments’, ‘Knowledge Model’, ‘Entities’, ‘Properties’, ‘Relations’, ‘Annotations’


Knowledge Models & examples

‘Data Collection’: A set of data fragments

e.g. a publication: Allen GV & DF Cechetto. (1993) J Comp Neurol 330:421-438.


Knowledge Models & examples

‘Fragments’: individual pieces of the literature

e.g. descriptions of experimental results. “… Moderate to light terminal labeling was present in the parvocellular portions of the paraventricular nucleus, anterior-hypothalamic nucleus, anterior portion of the lateral hypothalamic area (Figs. 2D, 3B), and in the central nucleus of the amygdala (Fig, 2D)….”

From Allen & Cechetto (1993)


Knowledge Models & examples

‘Entities’: Abstract data structures that capture the meaning of a set of fragments within the framework of the NeuroScholar system

e.g. neuronPopulation object

  • knowledge type = description

  • domain type = tract-tracing experiment

Labels shown in the diagram: brainVolumes, experimentalMethod, labeling, injectionSite, labeling


Knowledge Models & examples

‘Relations’: Rules that link two objects together.

Labels shown in the diagram: ZI, LHA


Knowledge Models & examples

‘Summaries’: Sets of objects and relations, explicitly selected and prioritized within system

Labels shown in the diagram: neuronPopulation1, neuronPopulation2


Knowledge Models & examples

‘Annotations’: Human-interpretable text to make contents of knowledge base understandable
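
The constructs introduced on these slides (fragments, entities with properties, relations, annotations, all gathered into a knowledge model) could be sketched as plain data classes. This is not the NeuroScholar schema: every class and field name below is an assumption made for illustration, and the relation between the two populations is a placeholder rather than a neuroanatomical claim.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """An individual piece of the literature, e.g. a quoted result."""
    source: str
    text: str


@dataclass
class Entity:
    """An abstract data structure capturing the meaning of a set of fragments."""
    name: str
    knowledge_type: str
    domain_type: str
    properties: dict = field(default_factory=dict)
    fragments: list = field(default_factory=list)


@dataclass
class Relation:
    """A rule that links two objects together."""
    kind: str
    source: Entity
    target: Entity


@dataclass
class Annotation:
    """Human-interpretable text attached to something in the model."""
    target: object
    text: str


@dataclass
class KnowledgeModel:
    owner: str
    entities: list = field(default_factory=list)
    relations: list = field(default_factory=list)
    annotations: list = field(default_factory=list)


fragment = Fragment(
    source="Allen & Cechetto (1993) J Comp Neurol 330:421-438",
    text="Moderate to light terminal labeling was present in the "
         "parvocellular portions of the paraventricular nucleus, ...",
)
pop1 = Entity("neuronPopulation1", "description", "tract-tracing experiment",
              properties={"labeling": "moderate to light"}, fragments=[fragment])
pop2 = Entity("neuronPopulation2", "description", "tract-tracing experiment")

# Placeholder relation kind, for illustration only.
link = Relation(kind="placeholderRelation", source=pop1, target=pop2)
note = Annotation(target=link, text="Illustrative link; not taken from the data.")

model = KnowledgeModel(owner="user A", entities=[pop1, pop2],
                       relations=[link], annotations=[note])
print(len(model.entities), "entities,", len(model.relations), "relation")
```

Keeping a reference from each entity back to its fragments preserves the trail from an interpreted claim to the text that supports it.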


Distributed Online Sources of Information

‘Fragments’

Local Implementation


Distributed Online Sources of Information

‘Fragments’

Users’ Spaces & Models

Centralized Published Knowledge Repository

Local Implementation


Distributed Online Sources of Information

‘Fragments’

Users’ Spaces & Models

‘Pending Review’


Distributed Online Sources of Information

‘Fragments’

Users’ Spaces & Models

P2P sharing

KnowledgeModelComparison


Knowledge Model Comparison

  • Given two users A & B, with Knowledge Models K_A & K_B being shared under the P2P model.

  • We want A to be able to run a program that automatically compares K_B to K_A so that the discrepancies and contradictions between the two models can be understood and reconciled.
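
A minimal sketch of such a comparison follows, assuming each knowledge model can be flattened into a dictionary from (source, relation, target) triples to an asserted value. This flattening, the function name and the example data are all hypothetical. It is a simplified stand-in, not the Java method-overloading approach on typed Relation objects that the implementation slides mention.

```python
def compare_models(ka, kb):
    """Compare two knowledge models given as {(source, relation, target): value} dicts.

    Returns triples found only in one model, plus triples present in both
    models but with conflicting values.
    """
    keys_a, keys_b = set(ka), set(kb)
    return {
        "only_in_A": sorted(keys_a - keys_b),
        "only_in_B": sorted(keys_b - keys_a),
        "contradictions": sorted(k for k in keys_a & keys_b if ka[k] != kb[k]),
    }


# Placeholder populations P1..P3; these are not real anatomical claims.
model_a = {
    ("P1", "connectsTo", "P2"): "present",
    ("P2", "connectsTo", "P3"): "present",
}
model_b = {
    ("P1", "connectsTo", "P2"): "absent",
    ("P3", "connectsTo", "P1"): "present",
}

report = compare_models(model_a, model_b)
for category, triples in report.items():
    print(category, triples)
```

Triples found in only one model are discrepancies to review, while triples with conflicting values are the contradictions that A and B would need to reconcile.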


What’s wrong with this picture? … from a computer scientist’s point of view …

  • Where is the formal logic?

It’s o.k. if we only export knowledge models to a formal logic-based representation rather than base our entire approach on it. Knowledge Acquisition is the rate-limiting step!


Knowledge Representation

  • Knowledge representation is a multidisciplinary subject that applies theories and techniques from three other fields:

  • Logic provides the formal structure and rules of inference.

  • Ontology defines the kinds of things that exist in the application domain.

  • Computation supports the applications that distinguish knowledge representation from pure philosophy…

  • Sowa (2000), Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA.


Knowledge Representation

  • … Without logic, a knowledge representation is vague, with no criteria for determining whether statements are redundant or contradictory. Without ontology, the terms and symbols are ill-defined, confused, and confusing. And without computable models, the logic and ontology cannot be implemented in computer programs. Knowledge representation is the application of logic and ontology to the task of constructing computable models for some domain.  

  • Sowa (2000), Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA.



III. Implementation

In which the design issues give way to more pressing concerns like: ‘how are we actually going to build this thing?’


Some implementation choices

  • Built under UML-based software engineering paradigm

    • The View-Primitive-Data-Model framework (‘VPDMf’)

  • Object Oriented Design

    • Unified Modeling Language (UML)

    • PerlOO

    • Java

  • Relational Databases

    • MySQL

    • Informix

  • Exporting Ontologies (via the VPDMf)

    • XML, RDF, F-Logic

  • Exporting Logic

    • Embedded within typed Relation objects within the OO knowledge model.

    • Use simple method overloading in Java to run Knowledge Model Comparison
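
As an illustration of the export step, a knowledge model could be serialized to XML with Python's standard library. The element and attribute names below are invented for this sketch and are not the VPDMf XML format, which the slides do not describe.

```python
import xml.etree.ElementTree as ET


def model_to_xml(owner, entities, relations):
    """Serialize a toy knowledge model to an XML string.

    entities:  {name: {property_name: value}}
    relations: iterable of (source, kind, target) tuples
    All element and attribute names are hypothetical.
    """
    root = ET.Element("knowledgeModel", {"owner": owner})

    entities_el = ET.SubElement(root, "entities")
    for name, props in entities.items():
        entity_el = ET.SubElement(entities_el, "entity", {"name": name})
        for key, value in props.items():
            ET.SubElement(entity_el, "property", {"name": key}).text = str(value)

    relations_el = ET.SubElement(root, "relations")
    for source, kind, target in relations:
        ET.SubElement(relations_el, "relation",
                      {"type": kind, "source": source, "target": target})

    return ET.tostring(root, encoding="unicode")


xml_text = model_to_xml(
    owner="user A",
    entities={
        "neuronPopulation1": {"domainType": "tract-tracing experiment"},
        "neuronPopulation2": {},
    },
    # Placeholder relation kind, for illustration only.
    relations=[("neuronPopulation1", "placeholderRelation", "neuronPopulation2")],
)
print(xml_text)
```

Exporting from the object model, rather than building the whole system on a formal representation, matches the earlier point that knowledge acquisition is the rate-limiting step.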


VPDMf System Builder

UML-based documentation

VPDMf specs (Data Model file & VPDMf XML files)

Forward Engineering

DBMS

Reverse Engineering

Final Working System

User Interface Component


Implementation Plan

VPDMf Admin App

Plugins

Plugins

VPDMf Client App

Client

Server

Review Database

Main Database

Local Database


Implementation Plan

VPDMf Admin App

Local Apps

Plugins

Plugins

VPDMf Client App

Client

Server

Review Database

Main Database

Local Database

VPDMf System Builder


Implementation Plan

VPDMf Admin App

Plugins

Plugins

VPDMf Client App

Client

Server

Review Database

Main Database

Demonstration

Local Database


Large scale organization of NeuroScholar’s schema

Data management of publication data

General knowledge management structures

Annotations, Justifications, Judgements

Experimental data, General histological data

Neuroanatomical tract tracing data

Final output of the system: the knowledge model

Components of the knowledge model specific to neuronal data

General data constructs used throughout the system


e.g., Views from ‘bibliography’


ViewLink

ViewDefinitionArticle

ViewDefinitionFragment


ViewLink


Basic Functionality: The ViewStateMachine & Forms


Additional Functionality: Specialized Form Controls & Plugins

  • The Article Robot Form Control

    Uses PubMed to retrieve citation information easily

  • The Fragmenter Plugin

    Allows delineation of fragments on pdf files

  • The AtlasMapper Plugin

    Allows delineation of regions on brain maps



IV. Demonstration

In which the truth is finally revealed


Acknowledgements

This work is funded by the National Library of Medicine (RO1-LM07061-01)

Thanks to

Arshad Khan

Shahram Ghandehanderazdeh

Cyrus Shahabi

Mark O’Neill

Larry Swanson

Alan Watts

Mihail Bota

Wei Cheng Chen

Shyam Kapadia

Shanshan Song

Ning Zhang

Yi-Shin Chen

