
Text Analytics Workshop

Tom Reamy
Chief Knowledge Architect

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com

Agenda
  • Introduction – Elements & Infrastructure Platform
    • Semantics not technology
    • Infrastructure not project
    • Value of Text Analytics
  • Evaluating Software
    • Two Phase Process
    • Designing the Team and Content Structures
  • Development – Taxonomy, Categorization, Faceted Metadata
  • Text Analytics Applications
    • Integration with Search and ECM
    • Platform for Information Applications
KAPS Group: General
  • Knowledge Architecture Professional Services
  • Virtual Company: Network of consultants – 8-10
  • Partners – SAS, SAP, Microsoft-FAST, Concept Searching, etc.
  • Consulting, Strategy, Knowledge architecture audit
  • Services:
    • Taxonomy/Text Analytics development, consulting, customization
    • Technology Consulting – Search, CMS, Portals, etc.
    • Evaluation of Enterprise Search, Text Analytics
    • Metadata standards and implementation
    • Knowledge Management: Collaboration, Expertise, e-learning
    • Applied Theory – Faceted taxonomies, complexity theory, natural categories
Introduction to Text Analytics: Semantic Infrastructure - Elements
  • Taxonomy – Thesauri, Controlled Vocabulary
  • Metadata – Standard (Dublin Core) and Facets
  • Basic Text Analytics
    • Categorization – Document Topics – Aboutness
    • Entity Extraction – noun phrases, feed facets
    • Summarization – beyond snippets
  • Advanced Text Analytics
    • Fact extraction – ontologies
    • Sentiment Analysis – good, bad, and ugly
  • What is in a Name – text analytics or ?
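One way to picture rule-based summarization that goes beyond search snippets is to score each sentence and keep the best ones. The sketch below uses simple word-frequency scoring, a common extractive technique chosen here for illustration and not one named in the slides. The stopword list and the `summarize` helper are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "for", "on", "that", "this", "with"}


def content_words(text):
    """Lowercased alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        # Sum the document-wide frequency of each distinct content word.
        return sum(freq[w] for w in set(content_words(sentence)))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

Summing frequencies favors longer sentences. A production rule set would likely normalize by length or weight sentences by position.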
Introduction to Text Analytics: Taxonomy
  • Thesauri, Controlled Vocabulary
    • Resources to build on
    • Indexing not categorization
  • Taxonomy
    • Foundation for Categorization
    • Browse – classification scheme
    • Formal – Is-Child-Of, Is-Part-Of
    • Large taxonomies - MeSH – indexing all topics
    • Small is better – for categorization and faceted navigation
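The formal Is-Child-Of relation can be pictured as a parent map. The sketch below is a minimal illustration. The node names and the `ancestors` and `depth` helpers are hypothetical and are not taken from MeSH or any taxonomy tool.

```python
# Hypothetical toy taxonomy: each node maps to its parent (Is-Child-Of).
# Node names are illustrative only.
PARENT = {
    "Knowledge Management": None,        # root node
    "Search": "Knowledge Management",
    "Enterprise Search": "Search",
    "Taxonomy": "Knowledge Management",
    "Faceted Taxonomy": "Taxonomy",
}


def ancestors(node):
    """Return the chain of parents from `node` up to the root."""
    chain = []
    parent = PARENT[node]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def depth(node):
    """Number of Is-Child-Of steps between `node` and the root."""
    return len(ancestors(node))
```

Measuring `depth` across all nodes is one quick way to check whether a taxonomy has stayed small and shallow enough for categorization and faceted navigation.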
Introduction to Text Analytics: Metadata
  • Metadata standards – Dublin Core - Mostly syntactic not semantic
    • Description – static or dynamic (summarization)
    • Semantic – keywords – very poor performance
  • Best Bets – high level categorization-search
    • Human judgments
  • Audience – mixed results
    • Role, function, expertise, information behaviors
  • Facets – classes of metadata
    • Standard - People, Organization, Document type-purpose
    • Specialized – methods, materials, products
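To make the split between standard metadata and facets concrete, the sketch below stores a few Dublin Core style elements next to a dictionary of facets and filters on facet values. The `Record` class, the `filter_by_facets` helper, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # A few Dublin Core elements (title, creator, date, type).
    title: str
    creator: str
    date: str
    doc_type: str                      # stands in for the Dublin Core "type" element
    # Facets: classes of metadata, each holding a set of values (hypothetical).
    facets: dict = field(default_factory=dict)


def filter_by_facets(records, **wanted):
    """Keep records whose facets contain every requested facet value."""
    return [
        r for r in records
        if all(value in r.facets.get(name, set()) for name, value in wanted.items())
    ]


# Made-up sample records for illustration only.
RECORDS = [
    Record("Q3 supplier review", "J. Smith", "2010-09-30", "Report",
           {"organization": {"Acme"}, "materials": {"steel"}}),
    Record("Welding methods guide", "A. Jones", "2009-01-15", "Manual",
           {"methods": {"welding"}, "materials": {"steel", "aluminum"}}),
]
```

Calling `filter_by_facets(RECORDS, materials="steel", methods="welding")` narrows the set one facet at a time, which is the behavior faceted navigation exposes to users.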
Introduction to Text Analytics: Text Analytics
  • Categorization
    • Multiple techniques – examples, terms, Boolean
    • Built on a taxonomy
  • Entity Extraction
    • Catalogs with variants, rule based dynamic
  • Summarization
    • Rules – find sentences in a document
  • Fact Extraction
    • Relationships of entities – people-organizations-activities
  • Sentiment Analysis
    • Rules – adjectives & adverbs not nouns
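Two of the techniques above, Boolean categorization and catalog-based entity extraction, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any vendor's rule language. The rule format, the category names, the catalog entries, and both helper functions are hypothetical.

```python
import re

# Hypothetical Boolean rules: a document gets a category when it contains
# ALL of the "all" terms, at least one "any" term (if any are listed),
# and NONE of the "none" terms.
RULES = {
    "Enterprise Search": {"all": {"search"}, "any": {"enterprise", "intranet"}, "none": {"job"}},
    "Taxonomy": {"all": {"taxonomy"}, "any": set(), "none": set()},
}

# Hypothetical entity catalog: canonical name -> known variants.
CATALOG = {
    "KAPS Group": ["KAPS Group", "KAPS"],
    "SAS": ["SAS", "SAS Institute"],
}


def categorize(text):
    """Return the categories whose Boolean rule matches the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = []
    for category, rule in RULES.items():
        if not rule["all"] <= words:
            continue                      # a required term is missing
        if rule["any"] and not (rule["any"] & words):
            continue                      # none of the optional terms present
        if rule["none"] & words:
            continue                      # an excluded term is present
        hits.append(category)
    return hits


def extract_entities(text):
    """Return canonical names for any catalog variant found in the text."""
    found = set()
    for canonical, variants in CATALOG.items():
        if any(re.search(r"\b" + re.escape(v) + r"\b", text) for v in variants):
            found.add(canonical)
    return sorted(found)
```

Mapping variants back to one canonical name is what lets extracted entities feed facets cleanly. Rule-based dynamic extraction would add patterns for names not yet in the catalog.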
Introduction to Text Analytics: Text Analytics
  • Why Text Analytics?
    • Enterprise search has failed to live up to its potential
    • Enterprise content management has failed to live up to its potential
    • Taxonomy has failed to live up to its potential
    • Adding metadata, especially keywords, has not worked
  • What is missing?
    • Intelligence – human level categorization, conceptualization
    • Infrastructure – Integrated solutions not technology, software
  • Text Analytics can be the foundation that (finally) drives success – search, content management, and much more
Text Analytics Platform: 4 Basic Contexts
  • Ideas – Content Structure
    • Language and Mind of your organization
    • Applications - exchange meaning, not data
  • People – Company Structure
    • Communities, Users
    • Central team - establish standards, facilitate
  • Activities – Business processes and procedures
  • Technology
    • CMS, Search, portals, taxonomy tools
    • Applications – BI, CI, Text Mining
Text Analytics Platform: The start and foundation – Knowledge Architecture Audit
  • Knowledge Map - Understand what you have, what you are, what you want
    • The foundation of the foundation
  • Contextual interviews, content analysis, surveys, focus groups, ethnographic studies
  • Category modeling – “Intertwingledness” – learning new categories is influenced by other, related categories
  • Natural level categories mapped to communities, activities
    • Novices prefer higher levels
    • Balance of informativeness and distinctiveness
  • Living, breathing, evolving foundation is the goal
Text Analytics Platform – Benefits: IDC White Paper
  • Time Wasted
    • Reformatting information - $5.7 million per 1,000 per year
    • Not finding information - $5.3 million per 1,000
    • Recreating content - $4.5 million per 1,000
  • Small Percent Gain = large savings
    • 1% - $10 million
    • 5% - $50 million
    • 10% - $100 million
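The arithmetic behind these figures can be sketched as follows. The three per-1,000 costs come from the slide. The `headcount` parameter is an assumption, since the slide does not say what the "per 1,000" counts, and the $1 billion base is only what the slide's own "1% - $10 million" line implies.

```python
# Annual cost per 1,000, in dollars, as listed on the slide.
COST_PER_1000 = {
    "reformatting information": 5_700_000,
    "not finding information": 5_300_000,
    "recreating content": 4_500_000,
}


def annual_cost(headcount):
    """Total yearly cost for `headcount` units (assumed unit for 'per 1,000')."""
    return sum(COST_PER_1000.values()) * headcount / 1000


def savings(base_cost, percent_gain):
    """Dollars saved when `base_cost` is reduced by `percent_gain` percent."""
    return base_cost * percent_gain / 100


if __name__ == "__main__":
    # The three listed costs add up to $15.5 million per 1,000 per year.
    print(annual_cost(1000))
    # A 1% gain worth $10 million implies a base cost of $1 billion.
    for pct in (1, 5, 10):
        print(pct, savings(1_000_000_000, pct))
```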
Text Analytics Platform – Benefits
  • Findability within and outside the enterprise
    • Savings per year - $millions
  • Rescue enterprise search and ECM projects
    • Add semantics to search
  • Clean up enterprise content
    • De-duplication and accurate categorization
  • Improve the quality of information access
    • Finding the right information can save millions
  • Build smarter applications
    • Social networking, locate expertise within the enterprise
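As an illustration of the content clean-up point, near-duplicate documents can be flagged by comparing overlapping word sequences. The sketch below uses word shingles and Jaccard similarity, a common approach chosen here for illustration and not one named in the slides. The function names, the shingle size, and the threshold are hypothetical.

```python
def shingles(text, k=3):
    """Set of k-word shingles (overlapping word sequences) in `text`."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5):
    """Return pairs of document ids whose shingle overlap reaches `threshold`."""
    ids = sorted(docs)
    sets = {i: shingles(docs[i]) for i in ids}
    return [
        (x, y)
        for n, x in enumerate(ids)
        for y in ids[n + 1:]
        if jaccard(sets[x], sets[y]) >= threshold
    ]
```

Comparing every pair is quadratic in the number of documents, so this form suits only small collections. Larger ones usually need an indexing or hashing step first.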
Text Analytics Platform – Benefits
  • Understand your customers
    • What they are talking about and how they feel about it
  • Empower your employees
    • Not only more time, but they work smarter
  • Understand your competitors
    • What they are working on, talking about
    • Combine unstructured content and rich data sources – more intelligent analysis
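A rough picture of how "how they feel about it" might be scored follows the earlier note that sentiment rules key on adjectives and adverbs rather than nouns. The lexicon, its weights, and the `sentiment_score` helper below are hypothetical, and real rule sets are far larger.

```python
import re

# Hypothetical sentiment lexicon of adjectives (not nouns).
POLARITY = {"great": 1, "helpful": 1, "fast": 1,
            "slow": -1, "confusing": -1, "terrible": -1}
NEGATORS = {"not", "never"}                    # flip the next lexicon word
INTENSIFIERS = {"very": 2, "extremely": 2}     # adverbs that scale the next lexicon word


def sentiment_score(text):
    """Sum signed, weighted polarities; positive totals mean positive sentiment."""
    score = 0
    weight, sign = 1, 1
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATORS:
            sign = -1
        elif word in INTENSIFIERS:
            weight = INTENSIFIERS[word]
        elif word in POLARITY:
            score += sign * weight * POLARITY[word]
            weight, sign = 1, 1                # modifiers apply to one word only
    return score
```

In this sketch a negator or intensifier carries forward to the next lexicon word however far away it is. A more careful rule set would limit that window.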
Text Analytics Platform – Dangers
  • Text Analytics as a software project
  • Not enough resources – to develop, to maintain-refine
  • Wrong resources – SMEs, IT, Library
    • Need all of the above and taxonomists+
  • Bad Design:
    • Start with bad taxonomy
    • Wrong taxonomy – too big or too flat
  • Bad Categorization / Entity Extraction
    • Needs the right kind of experience
Resources
  • Books
    • Women, Fire, and Dangerous Things – George Lakoff
    • Knowledge, Concepts, and Categories – Koen Lamberts and David Shanks
    • The Stuff of Thought – Steven Pinker
  • Web Sites
    • Text Analytics News - http://social.textanalyticsnews.com/index.php
    • Text Analytics Wiki - http://textanalytics.wikidot.com/
Resources
  • Blogs
    • SAS – Manya Mayes, Chief Strategist - http://blogs.sas.com/text-mining/
  • Web Sites
    • Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
    • Whitepaper – CM and Text Analytics - http://www.textanalyticsnews.com/usa/contentmanagementmeetstextanalytics.pdf

Questions?

Tom Reamy
tomr@kapsgroup.com

KAPS Group

Knowledge Architecture Professional Services

http://www.kapsgroup.com