Gap analysis data and information

GAP Analysis – Data and Information

Accurate Perception of Situation

  • Images/Video

    • State of the art: Features such as edges, filter outputs, and color. Weak general representations such as color histograms, and strong specialized techniques for tasks such as face recognition.

    • Current limitations: Cannot abstract images well; current image representations depend on the task.

    • Research areas: Abstract image representations/features that are useful for multiple tasks and insensitive to changes such as lighting.

  • Text/NLP

    • State of the art: Sharable lexical and syntactic processing. Fact-oriented information extraction (IE).

    • Current limitations: Software and knowledge are tied to narrow tasks/domains. Cannot handle modality, affect, non-literal language, etc. Limited scalability of NLP.

    • Research areas: Reusable semantic and discourse components. Going beyond fact/event-based IE (e.g., causal relations, opinions). Techniques that scale to web size.

  • Image/Video/Text

    • State of the art: Statistical models, usually bag-of-words-like.

    • Current limitations: Models do not handle structure well: sentence structure in NLP, proximity in IR, and spatial relationships, viewpoint, and scale changes in images/video.

    • Research areas: Statistical models that handle structure for NLP, IR, and images/video.

Comprehension of Existing Situation

  • Classification/regression and generative models

    • State of the art: Classification/regression. Generative models such as HMMs.

    • Current limitations: Classification/regression models do not exploit dependencies. Generative models are hard to design and often less effective.

    • Research areas: Learning representations that support reasoning, ontologies, and semantic interpretation.

  • Distributed IR

    • State of the art: Some database selection and merging algorithms.

    • Current limitations: Small distributed systems, mostly ad hoc. Classification of homogeneous text streams.

    • Research areas: Better database selection and merging algorithms. P2P systems. Combining information across text and structured databases.
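The limitation noted above, that bag-of-words-like models do not handle sentence structure, can be illustrated with a small sketch. The tokenizer (lowercasing and splitting on whitespace) and the example sentences are assumptions chosen for illustration, not part of the original slides.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Count lowercase whitespace-separated tokens, discarding word order."""
    return Counter(text.lower().split())

# Two hypothetical sentences with opposite meanings but the same words.
a = "dog bites man"
b = "man bites dog"

same_bag = bag_of_words(a) == bag_of_words(b)   # representations are identical
same_order = a.split() == b.split()             # token sequences differ

print(same_bag, same_order)
```

Because both sentences map to the same counts, any model that sees only the bag cannot tell them apart, which is the kind of gap that structure-aware statistical models would aim to close.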

GAP Analysis – Data and Information (continued)

  • User models

    • State of the art: Long-term models for e-commerce (e.g., recommender systems) but little for IR.

    • Current limitations: Current methods are restricted to high-quality, homogeneous data. Lack of good user models.

    • Research areas: Models of users. Interactive retrieval. Better use of implicit feedback. Privacy concerns.

  • Semi-supervised learning

    • State of the art: Supervised systems require lots of data.

    • Current limitations: Semi-supervised learning is not robust. Requires too much effort. Cannot be applied to tasks with limited data.

    • Research areas: Semi-supervised learning. Learning from small amounts of data.
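One common form of semi-supervised learning is self-training, where a model labels the unlabeled points it is most confident about and is then refit. The following is a minimal sketch under stated assumptions: one-dimensional features, a nearest-centroid classifier, and a hypothetical `min_margin` confidence threshold. None of these choices come from the slides.

```python
def centroids(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Return (label, margin): nearest centroid and its lead over the runner-up.

    Assumes at least two classes are present.
    """
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - cents[second]) - abs(x - cents[best])
    return best, margin

def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident points and refit the centroids."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        scored = [(x, predict(cents, x)) for x in pool]
        accepted = [(x, lab) for x, (lab, m) in scored if m >= min_margin]
        if not accepted:
            break  # nothing confident enough; stop early
        labeled.extend(accepted)
        taken = {x for x, _ in accepted}
        pool = [x for x in pool if x not in taken]
    return centroids(labeled), pool

# Two labeled points and five unlabeled ones (illustrative data).
model, leftover = self_train([(0.0, "neg"), (10.0, "pos")],
                             [1.0, 2.0, 8.0, 9.0, 5.0])
print(model, leftover)
```

The ambiguous point at 5.0 is never pseudo-labeled. With a looser threshold it would be, and a wrong pseudo-label would then shift the centroids, which is one way to picture the robustness concern raised above.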

Technical Challenge Areas: Data and Information Panel

  • Reduce human effort:

    • semi-supervised learning

  • Learning from small amounts of data

  • Performance Metrics

    • Evaluation and testbeds

    • How to evaluate complex processes/solutions

  • Beyond classification and regression:

    • E.g., Using geospatial data as input

    • E.g., learning semantic structures (ontologies)

  • Machine Learning lifecycle:

    • Context of models may change over time

    • Maintenance, update, formatting of data

  • Managing multiple learning models

    • E.g., portfolio of models: each model serves a different purpose

    • E.g., ensembles, model correlation, model disagreement
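As a rough illustration of the ensemble and model-disagreement items above, the sketch below combines three models by majority vote and measures how often pairs of models disagree. The model outputs and labels are hypothetical, and pairwise disagreement is only one of several possible measures.

```python
from collections import Counter
from itertools import combinations

def majority_vote(predictions):
    """Most common label among one item's predictions (ties: first seen wins)."""
    return Counter(predictions).most_common(1)[0][0]

def pairwise_disagreement(model_outputs):
    """Fraction of (model pair, item) combinations where the two models differ."""
    pairs = list(combinations(model_outputs, 2))
    n_items = len(model_outputs[0])
    differing = sum(a[i] != b[i] for a, b in pairs for i in range(n_items))
    return differing / (len(pairs) * n_items)

# Hypothetical predictions from three models on the same four items.
outputs = [
    ["spam", "ham",  "spam", "ham"],   # model A
    ["spam", "ham",  "ham",  "ham"],   # model B
    ["spam", "spam", "spam", "ham"],   # model C
]

votes = [majority_vote(item) for item in zip(*outputs)]
rate = pairwise_disagreement(outputs)
print(votes, rate)
```

A rate near zero suggests the models are highly correlated and add little to one another, while a high rate flags items or models that may deserve closer inspection.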


  • Trust in going from data to interpretation

    • E.g., pedigree/reliability of data source

    • Explaining conclusions

  • Modeling Spatial Relationships X

  • Better image/video features X

  • Incorporating user guidance

    • Defining search space

    • Complex search criteria

  • Beyond Syntactic Analysis X

    • NLP components for deep semantic interpretation

    • Non-factual NLP

  • Scalable NLP/information extraction X

  • Beyond bag-of-words techniques for IR X

    • Incorporating output of information extraction systems, text categorization systems
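One simple way to go beyond bag-of-words in IR is to take term proximity into account. The sketch below computes the shortest token span that contains every query term. The function name `min_window`, the whitespace tokenization, and the example documents are assumptions for illustration only.

```python
def min_window(tokens, query_terms):
    """Length of the shortest token span containing every query term, or None."""
    needed = set(query_terms)
    last_seen = {}          # most recent position of each query term
    best = None
    for i, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = i
            if len(last_seen) == len(needed):
                # Shortest span ending at i starts at the oldest "last seen" position.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best

query = ["information", "extraction"]
doc1 = "information extraction output improves retrieval".split()
doc2 = "information about oil extraction".split()

print(min_window(doc1, query), min_window(doc2, query))
```

Both documents contain each query term once, so a pure bag-of-words match scores them alike, whereas the window length separates the document where the terms form a phrase from the one where they are merely nearby.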