
L.A.S.I.

Linguistic Analysis for Subject Identification

Feasibility Presentation

Presented by: CS410 Red Group

November 12, 2012

Outline

  • Team Red Staff Chart
  • Introduction
  • Societal Problem
  • Case Study
  • Proposed Solution
  • Major Component Diagram
  • Algorithm
  • The Competition
  • Risk
  • Conclusion

Team Red Staff Chart

  • Scott Minter: Project Co-Leader, Software Specialist
  • Brittany Johnson: Project Co-Leader, Documentation Specialist
  • Dustin Patrick: Algorithm Specialist, Expert Liaison
  • Richard Owens: Documentation Specialist, Communication Specialist
  • Erik Rogers: Marketing Specialist, GUI Developer
  • Aluan Haddad: Algorithm Specialist, Software Specialist


Theme: a specific and distinctive quality, characteristic, or concern. [1]

[1] "Theme," Merriam-Webster


5 W’s & 1 H

  • Who
  • What
  • When
  • Where
  • Why
  • How

Bill’s stove was broken. He had been saying for months that he would go to the appliance store to buy a new one. He had some free time yesterday, so he drove to the store to buy a new stove.


The Theme from the 5 W’s & 1 H

Bill drove to the store yesterday to buy a new stove because his was broken.
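
As a rough illustration of how the six answers combine into one theme sentence, the sketch below stores them in a small record and joins them with a fixed template. The class name `FiveWsOneH`, the `theme` method, and the way each phrase is assigned to a question are assumptions made for this example; they are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class FiveWsOneH:
    """Answers to the six questions for one short passage (hypothetical type)."""
    who: str
    what: str
    when: str
    where: str
    why: str
    how: str

    def theme(self) -> str:
        # Fixed word order chosen for this example only; real text would
        # need grammar-aware sentence generation.
        return f"{self.who} {self.how} {self.where} {self.when} {self.what} {self.why}."


# Answers read off the stove passage by hand (an interpretation, not LASI output).
bill = FiveWsOneH(
    who="Bill",
    what="to buy a new stove",
    when="yesterday",
    where="to the store",
    why="because his was broken",
    how="drove",
)
print(bill.theme())
```

The point of the sketch is only that a theme can be treated as a structured answer to six questions; extracting those answers automatically is the hard part.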


Why are themes important?
  • Comprehension
  • Summarization
  • Assists in communication between people

Societal Problem

It is difficult for people to identify a common theme over a large set of documents in a timely, consistent, and objective manner.


How long does it take?
  • Finding a theme over multiple documents is a time-consuming process.
  • The average reading speed of an adult is 250 words per minute. [2]

[2] Thomas, "What Is the Average Reading Speed and the Best Rate of Reading?"
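
To make the time cost concrete, here is a minimal sketch of the arithmetic implied by the 250 words-per-minute figure. The corpus size (40 documents of 5,000 words each) is a made-up example, and the function name `reading_minutes` is an assumption for illustration.

```python
WORDS_PER_MINUTE = 250  # average adult reading speed cited on the slide


def reading_minutes(word_counts, wpm=WORDS_PER_MINUTE):
    """Return the minutes needed to read documents with the given word counts."""
    return sum(word_counts) / wpm


# Hypothetical corpus: 40 documents of 5,000 words each.
corpus = [5000] * 40
minutes = reading_minutes(corpus)
print(f"{minutes:.0f} minutes, or about {minutes / 60:.1f} hours")
```

This counts a single read-through only; it does not include the time needed to compare and relate the documents.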


Consistency and Objectivity
  • The criteria for evaluation may vary from person to person.
  • Large quantities of documents must be mentally digested, assessed, and interrelated.

Dr. Patrick Hester

“My research interests include multi-objective decision making under uncertainty, probabilistic and non-probabilistic uncertainty analysis, critical infrastructure protection, and decision making using modeling and simulation.” [3]

- Dr. Hester

Ph.D. from Vanderbilt University, 2007

Major: Risk and Reliability Engineering and Management

[3] Patrick Hester website


  • Dr. Hester is a systems analyst and researcher
    • He must:
      • Conduct extensive research
      • Quickly become familiar with client systems
      • Formulate concise, objective assessments
  • LASI will help with all of this

Assessment Improvement Design (A.I.D.)
  • A preliminary problem statement is identified from the documents.
  • The problem statement is then used to find Critical Operational Issues (COIs).
  • COIs are used to find Measures of Effectiveness (MOEs).
  • MOEs are used to find Measures of Performance (MOPs).

Current Method

[Flowchart; boxes listed in approximate process order]

  • Customer Contact
  • Situational Awareness Meeting
  • Decision: Will NCSOSE be needed? (yes / no)
  • Document Gathering Process
  • Document Analysis
  • Problem Statement Presentation
  • Decision: Is Customer satisfied? (yes / no)
  • Continue on to the rest of the A.I.D. Process
  • Client Goes Elsewhere


Our Proposed Solution
  • LASI is a linguistic analysis decision support tool used to help determine a common theme across multiple documents. It is our goal with LASI to:
    • accurately find themes
    • be system efficient
    • provide consistent results

What do we mean by “linguistic analysis”?

The contextual study of written works and how the words combine to form an overall meaning.


Linguistic analysis involves

Syntactic

  • Logical grammar
  • Statistical data
    • Alphabetical frequencies
    • Word counts
    • Parts of speech
  • Word dependencies

Semantic

  • Relating syntactic structures to language-independent meanings
  • Extracting meaning and conceptual arguments
  • Summarization
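
The statistical items on the syntactic side (alphabetical frequencies and word counts) are simple to compute. The sketch below is one possible way to gather them with Python's standard library; the function name `syntactic_statistics` and the tokenization rule are assumptions for illustration, not LASI's actual design.

```python
import re
from collections import Counter


def syntactic_statistics(text):
    """Return (letter_frequencies, word_counts) for a piece of text."""
    lowered = text.lower()
    # A word is a run of letters, optionally with one internal apostrophe.
    words = re.findall(r"[a-z]+(?:['’][a-z]+)?", lowered)
    letters = Counter(ch for ch in lowered if ch.isalpha())
    return letters, Counter(words)


sample = (
    "Bill's stove was broken. He drove to the store "
    "to buy a new stove."
)
letters, words = syntactic_statistics(sample)
print(words.most_common(3))    # most frequent words
print(letters.most_common(3))  # most frequent letters
```

Part-of-speech tagging and word dependencies need a real language model or tagger and are deliberately left out of this sketch.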

The Wills and Will Nots of LASI

What LASI Will Do

  • Analyze multiple documents to find common themes
  • Provide statistical data to help a user make a decision

What LASI Will Not Do

  • Provide a concise synopsis
  • Provide a single theme

Who Would This Appeal To?
  • Researchers
  • Consultants
  • Academics
  • Students

Benefits To The Customer
  • Time saving
  • Objective output
  • Consistent output
  • Cost saving solution

Before LASI

[Flowchart; boxes listed in approximate process order]

  • Customer Contact
  • Situational Awareness Meeting
  • Decision: Will NCSOSE be needed? (yes / no)
  • Document Gathering Process
  • Document Analysis
  • Problem Statement Presentation
  • Decision: Is the Customer satisfied? (yes / no)
  • Continue on to the rest of the A.I.D. Process
  • Client Goes Elsewhere


After LASI

[Flowchart; boxes listed in approximate process order]

  • Customer Contact
  • Situational Awareness Meeting
  • Decision: Will NCSOSE be needed? (yes / no)
  • Document Gathering Process
  • LASI-Aided Document Analysis
  • Problem Statement Presentation
  • Decision: Is the Customer satisfied? (yes / no)
  • Continue on to the rest of the A.I.D. Process
  • Client Goes Elsewhere


Major Functional Components

Hardware

  • High-end notebook PC (~$1500 USD)
    • Computation: quad-core CPU
    • Primary memory: 8.0 GB DDR3 RAM
    • Document storage: solid-state storage

Software

  • Algorithm: extrapolates the most likely congruence of themes and ideas across all documents in the input domain
  • User interface:
    • Multi-level views
    • Weighted phrase list
    • Detailed breakdown
    • Step-by-step justification


Linguistic Analysis Algorithm

Primary Analysis: Word Count and Syntactic Assessment

  • Traverse Document in Word-Wise Manner
  • Identify Corresponding Parts of Speech
  • Determine Frequency by Grammatical Role

Secondary Analysis: Associative Identification

  • Bind Pronouns to Nouns, Updating Frequency
  • Bind Adjectives to Nouns
  • Identify Potential Noun Phrases

Tertiary Analysis: Semantic Relationship Assessment

  • Identify Potential Synonyms
  • Assess Potential Subject-Object-Verb Relationships
  • Output List of Weighted Themes
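
A heavily simplified sketch of a few of these steps (word-wise traversal, part-of-speech lookup, pronoun and adjective binding, and a weighted output list) might look like the following. Everything here is an assumption made for illustration: the tiny hand-written `LEXICON` stands in for a real part-of-speech tagger, pronouns are bound to the most recent noun of a matching kind, and synonym and subject-object-verb analysis are omitted. It is not the Red Group's algorithm.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; a real system would use a trained tagger.
LEXICON = {
    "bill": "person", "stove": "thing", "store": "thing",
    "he": "person_pronoun", "it": "thing_pronoun",
    "new": "adjective", "broken": "adjective",
}


def weighted_themes(text):
    """Return ([(noun, weight), ...], {noun: adjectives}) for the text."""
    counts = Counter()
    bound_adjectives = {}                            # noun -> adjectives seen before it
    antecedent = {"person": None, "thing": None}     # latest noun of each kind
    pending = []                                     # adjectives awaiting a noun

    # Traverse the document word by word and look up each part of speech.
    for word in re.findall(r"[a-z]+", text.lower()):
        tag = LEXICON.get(word)
        if tag == "adjective":
            pending.append(word)
            continue
        if tag in antecedent:  # the word is a noun
            counts[word] += 1
            antecedent[tag] = word
            # Bind adjectives that directly precede this noun.
            bound_adjectives.setdefault(word, set()).update(pending)
        elif tag in ("person_pronoun", "thing_pronoun"):
            # Bind the pronoun to the latest matching noun, updating frequency.
            noun = antecedent[tag.split("_")[0]]
            if noun is not None:
                counts[noun] += 1
        pending = []  # any non-adjective word ends an adjective run

    # Normalize counts into weights that sum to 1.
    total = sum(counts.values())
    weights = [(noun, count / total) for noun, count in counts.most_common()]
    return weights, bound_adjectives


passage = (
    "Bill's stove was broken. He has been saying for months that he would "
    "go to the appliance store to buy a new one. He had some free time "
    "yesterday, so he drove to the store to buy a new stove."
)
weights, adjectives = weighted_themes(passage)
for noun, weight in weights:
    print(f"{noun}: {weight:.2f}")
```

Binding each pronoun to its antecedent before counting is what lets "Bill" outweigh "stove" and "store" here, even though the name appears only once in the passage.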

[Figure: Milestone diagram]


WordStat

ReadMe

AutoMap

Risk Matrix

Customer Risks

C1 -- Product Interest

C2 -- Maintenance

C3 -- Trust

Technical Risks

T1 -- System Limitations

T2 -- Scanned Text Recognition

T3 -- Jargon Recognition

T4 -- Illegal Character Handling


Customer Risks

C1. Product Interest

Probability 2 Impact 4

Mitigation: LASI offers unique functionality and user friendliness.

C2. Maintenance

Probability 3 Impact 2

Mitigation: LASI will be a free, open source application allowing the community to maintain and extend it over time.

C3. Trust

Probability 3 Impact 3

Mitigation: LASI will provide a step-by-step breakdown of output analysis and algorithm reasoning.


Technical Risks

T1. System Limitations

Probability 4 Impact 2

Mitigation: LASI will be designed from the ground up in native C++ for memory- and CPU-efficient code.

T2. Scanned Text Recognition

Probability 4 Impact 3

Mitigation: LASI will implement an optical character recognition algorithm to handle scanned text.


T3. Jargon Recognition

Probability 3 Impact 2

Mitigation: LASI will have domain specific dictionaries and feature intuitive contextual inference.

T4. Illegal Character Handling

Probability 4 Impact 2

Mitigation: LASI will provide contextual inference, synonym recognition, and statistical methods.
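
One common way to compare entries in a risk matrix is to multiply probability by impact. The slides do not say how the two ratings are combined, so the product used below is an assumption; the ratings themselves are copied from the customer and technical risk slides, and the function name `rank_by_exposure` is hypothetical.

```python
# (probability, impact) ratings copied from the risk slides.
RISKS = {
    "C1 Product Interest": (2, 4),
    "C2 Maintenance": (3, 2),
    "C3 Trust": (3, 3),
    "T1 System Limitations": (4, 2),
    "T2 Scanned Text Recognition": (4, 3),
    "T3 Jargon Recognition": (3, 2),
    "T4 Illegal Character Handling": (4, 2),
}


def rank_by_exposure(risks):
    """Sort risks by probability * impact, highest first (assumed scoring)."""
    scored = [(name, p * i) for name, (p, i) in risks.items()]
    # sorted() is stable, so ties keep the order in which the slides list them.
    return sorted(scored, key=lambda item: item[1], reverse=True)


for name, score in rank_by_exposure(RISKS):
    print(f"{score:>2}  {name}")
```

Under this assumed scoring, scanned text recognition (T2) ranks highest, followed by trust (C3).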


Conclusion
  • LASI is feasible.
  • LASI is a decision support tool, not a decision-making tool.
  • Success would have implications for a wide range of fields of study and professions.
  • For LASI to succeed, the output needs to be immediately usable and the interface user-friendly.

References
  • "Theme." Def. 1b. Merriam-Webster. N.p., n.d. Web. 19 Oct. 2012. <http://www.merriam-webster.com/dictionary/theme>.
  • Thomas, Mark. "What Is the Average Reading Speed and the Best Rate of Reading?" Web. 19 Oct. 2012. <http://www.healthguidance.org/entry/13263/1/What-Is-the-Average-Reading-Speed-and-the-Best-Rate-of-Reading.html>.
  • "Patrick Hester." Old Dominion University. N.p., n.d. Web. 24 Sept. 2012. <http://www.odu.edu/directory/people/p/pthester>.
  • Osinski, Stanislaw, and Dawid Weiss. Carrot2. 13 Aug. 2012. Web. 25 Sept. 2012. <http://project.carrot2.org>.
  • "WordStat." Provalis Research. Web. 24 Sept. 2012. <http://provalisresearch.com/products/content-analysis-software/>.
  • "ReadMe: Software for Automated Content Analysis." Web. 24 Sept. 2012. <http://gking.harvard.edu/node/4520/rbuild_documentation/readme.pdf>.
  • "AlchemyAPI Overview." AlchemyAPI. N.p., n.d. Web. 19 Oct. 2012. <http://www.alchemyapi.com/api/>.
  • "AutoMap: Project." N.p., n.d. Web. 19 Oct. 2012. <http://www.casos.cs.cmu.edu/projects/automap/>.
  • "CL Research Home Page." CL Research Home Page. N.p., n.d. Web. 19 Oct. 2012. <http://www.clres.com/>.
