
Content Analytics for Legacy Data Retention



Presentation Transcript


  1. The Dayhuff Group
  • The Dayhuff Group has a long history of providing enterprise content management solutions in a wide variety of industries.
  • In business since 1997
  • IBM Premier Business Partner
  • Software ValueNet Partner
  • Over 180 projects at 80 companies
  • 96% customer satisfaction rating

  2. The information lifecycle governance problem
  • 98%: Companies that cite defensible disposal as key result of governance programs
  • $3M: Average cost to collect, cull and review information per legal case¹
  • 17%: Amount of IT budget spent on storage³
  • 22%: Companies that can defensibly dispose today
  • 70%: Portion of information unnecessarily retained²
  • 44x: Projected information growth, 2009-2020⁴

  3. Watson and IBM ECM Today
  • Natural Language Processing (NLP) is the cornerstone for translating interactions between computers and human (natural) languages
  • Watson uses IBM Content Analytics to perform critical NLP functions
  • Unstructured Information Management Architecture (UIMA) is an open framework for processing text and building analytic solutions
  • Several IBM ECM products leverage UIMA text analytics processing:
    • IBM Content Analytics
    • OmniFind Enterprise Edition
    • IBM Classification Module
    • IBM eDiscovery Analyzer
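The UIMA idea described above can be illustrated with a small sketch: independent analysis engines each add typed annotations over a shared analysis structure for one document. This is a Python approximation of the concept only (UIMA itself is a Java framework); the class names, sample text and the tiny company dictionary are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    begin: int
    end: int
    type: str

@dataclass
class CAS:
    """Stand-in for UIMA's Common Analysis Structure: text plus annotations."""
    text: str
    annotations: list = field(default_factory=list)

def token_annotator(cas: CAS) -> None:
    """Mark each whitespace-delimited token with a Token annotation."""
    pos = 0
    for word in cas.text.split():
        begin = cas.text.index(word, pos)
        cas.annotations.append(Annotation(begin, begin + len(word), "Token"))
        pos = begin + len(word)

def company_annotator(cas: CAS) -> None:
    """Dictionary lookup for named entities (illustrative dictionary)."""
    for name in ("IBM", "Dayhuff"):
        start = cas.text.find(name)
        if start != -1:
            cas.annotations.append(Annotation(start, start + len(name), "Company"))

cas = CAS("IBM Content Analytics performs NLP functions")
for engine in (token_annotator, company_annotator):  # the "pipeline"
    engine(cas)

print([(a.type, cas.text[a.begin:a.end])
       for a in cas.annotations if a.type == "Company"])  # [('Company', 'IBM')]
```

Each annotator only reads the shared structure and appends annotations, which is what lets UIMA pipelines mix annotators from different products.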

  4. Going from raw information to rapid insight
  Uncover business insight through a unique visual-based approach:
  • Aggregate and extract from multiple sources … to form large text-based collections from multiple internal and external sources (and types), including ECM repositories, structured data, social media and more.
  • Organize, analyze and visualize … enterprise content (and data) by identifying trends, patterns, correlations, anomalies and business context from collections.
  • Search and explore to derive insight … from collections to confirm what is suspected or uncover something new without being forced to build models or deploy complex systems.

  5. Why Content Analytics?
  • #1 problem all accounts have: "don't know what content they have"
  • #2 problem: "uncontrollable storage cost"
  • Want to discover the value that may exist in their existing information / content resources
  • Want to know who, where, when & how to leverage their information / content assets
  • Believe they too can demonstrate a three-month return on investment
  • Non-threatening to IT
  • Low-cost investment
  • Content Analytics "Try & Buy"

  6. Content Analytics
  Content in the wild is separated into unnecessary and necessary information:
  • Dynamically analyze to know what you have: Aggregate, correlate, visualize and explore your enterprise information to make rapid decisions about business value, relevance and disposition.
  • Decommission what's unnecessary: Cut costs and reduce risk by eliminating obsolete, over-retained, duplicate, and irrelevant content – and the infrastructure that supports it.
  • Preserve and exploit the content that matters: Collect valued content to manage, trust and govern throughout its lifespan, using policies built on rule-based metadata, advanced contextual classification, and advanced content analytics (LEVEL 5, Transformational).

  7. How to decommission and preserve content
  1. Identify content sources to be assessed
  2. IT initial assessment to decommission irrelevant content
  3. LOB-specific assessments to decommission over-retained and obsolete content, and to collect and classify valued and obligated content
  4. System and application decommissioning by IT
  5. Periodic audits by IT and LOB keep content environments optimized

  8. Content Analytics and RIM
  Step 1: Identify content sources (legacy ECM, file shares, SharePoint)
  Step 2: Exploration
  • Examine sources and analyze content
  • Records manager uses the interface to explore & identify value-based content categories
  • Define policies expressing required actions (delete, move, copy, ...) based on categories
  Step 3: Archival and management of content
  • IT manager encodes policies into the content collection mechanism
  • Content identified by the exploration process is collected (with content classification) into the trusted ECM repository
  • Content collection is executed on an ongoing basis, as prescribed by policies
  • Supports selective content decommissioning: operate on a subset of content in the original source; identify & extract records across the enterprise
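The policy step in the workflow above maps an explored content category to a required action (delete, move, copy, ...). That mapping can be sketched as a simple lookup table; the category names, action set, and fallback behavior below are illustrative assumptions, not the product's actual policy schema.

```python
from enum import Enum

class Action(Enum):
    DELETE = "delete"
    MOVE = "move"
    COPY = "copy"

# Illustrative policy table: exploration category -> required action.
POLICIES = {
    "obsolete": Action.DELETE,
    "duplicate": Action.DELETE,
    "financial-record": Action.MOVE,  # move into the trusted ECM repository
    "reference": Action.COPY,
}

def apply_policy(doc_id: str, category: str) -> str:
    """Resolve the action for one document; unmatched content is surfaced for review."""
    action = POLICIES.get(category)
    if action is None:
        return f"{doc_id}: no policy, flag for review"
    return f"{doc_id}: {action.value}"

print(apply_policy("doc-001", "obsolete"))          # doc-001: delete
print(apply_policy("doc-002", "financial-record"))  # doc-002: move
```

Running collection "on an ongoing basis, as prescribed by policies" then amounts to re-evaluating this table against newly crawled content on a schedule.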

  9. Content Analytics and Dynamic ESI Collection
  Step 1: Identify content sources in the wild
  Step 2: Exploration
  • Examine sources and analyze content
  • IT or Legal user explores & identifies content relevant to the case
  • User determines content to be collected into a case set and invokes the collection process
  • Collection tool (embedded ICC) copies identified content into the multi-case evidentiary ECM repository
  • Supports both policy-based collections and dynamic, on-demand reactive collection requests
  Step 3: eDiscovery
  • Cull, hold, audit-track, export ESI with eDiscovery tools
  • Analytics-driven Early Case Assessment across all relevant evidentiary ESI

  10. Unlock valuable insight from content: What our clients are doing with Content Analytics

  11. Basic questions to consider regarding content
  Know your content (Do we know what we have and can we find it?)
  • Accelerate time to knowledge by providing greater accuracy and more complete business context with enterprise search
  • Dynamically analyze what you have, decommission the unnecessary, and preserve the content that matters with content assessment
  Trust your content (Is it properly managed and can we trust it?)
  • Manage and govern content in trusted repositories, not in suspect environments, enabling confidence in your content
  • Create and manage 360-degree trusted content views to enrich master data by connecting to enterprise content
  Leverage and exploit your content (What does it all mean and how can we benefit from it?)
  • Interactively discover content to derive unexpected business insights and take action with content analytics
  • Exploit content analytics insights by enriching BI and predictive analytics as well as tailoring for industry- and customer-specific scenarios

  12. Content Analytics for Legacy Data Retention Addresses and Delivers
  Two objectives:
  • Securely retain records requiring retention
  • Defensibly decommission duplicate and non-business information, and information that has satisfied its retention requirements
  One LARGE ROI:
  • Storage cost savings
  • ROI in less than 3 months
  • $44M in storage savings for one client

  13. How CA for Legacy Data Retention Delivers
  "... I need to dynamically collect electronically stored information (ESI) by knowing what I have, sorting out the case-relevant information, declaring records and bringing them under hold management, or decommissioning as necessary."
  Content Assessment enables content-based decision making for:
  • Decommissioning for cost savings – selected content or entire sources
  • Dynamic collection and records management for eDiscovery
  • Ongoing proactive information governance
  • Improving metadata & content organization
  • Reduced storage cost

  14. Content Analytics Admin: Content Sources
  • File systems
  • Content repositories
  • Databases
  • Email
  • Collaboration
  • Web content (web pages, portals)
  • Content integration (custom crawlers)
  • ...

  15. Content Analytics Admin: Parse and Index Content
  • Linguistic understanding of your content
  • Industry- and business-specific dictionaries
  • Understanding of named entities: people, places, companies
  • Integration with Classification Module
  • Deep concept analysis
  • Annotators specific to industry, business and specific uses:
    • Record types
    • Industry- and company-specific concepts
    • Business-specific concepts such as employee names, products, etc.
    • ...

  16. Content Analytics for Legacy Data Retention - How it works
  Source information: internal (ECM, files, DBMS, etc.) and external (social, news, etc.) sources feed the analyzed content (and data).
  Annotated example of a financial record to be declared: in the sentence "stocks rose Monday on comments from ...", parts of speech (noun, verb, prep phrase) are tagged and concepts are extracted: "stocks" → Trade, "rose" → Action, "Monday" → Day, "on comments from ..." → Reason.
  Automatic visualization supports interactive exploration and assessment.

  17. Content Analytics for Defensible Decommissioning – How it works
  Source information: internal (ECM, files, DBMS, etc.) and external (social, news, etc.) sources feed the analyzed content (and data).
  Annotated example of a non-record that is decommissionable: in the sentence "stocks on the plants require trimming ...", parts of speech are tagged and concepts are extracted: "stocks" → Element of a plant, "plants" → Vegetation, "require trimming" → Action/Reason.
  Automatic visualization supports interactive exploration and assessment.
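The two "stocks" sentences in slides 16 and 17 turn on contextual disambiguation: the same word is a financial record in one context and decommissionable non-record content in another. A minimal sketch of the idea, using invented context-cue word lists in place of the product's linguistic annotators and dictionaries:

```python
# Illustrative cue lists; a real deployment would use linguistic annotators,
# not hand-picked keywords.
FINANCE_CUES = {"rose", "fell", "shares", "market", "trading", "comments"}
BOTANY_CUES = {"plants", "trimming", "garden", "vegetation"}

def classify_stocks_sentence(sentence: str) -> str:
    """Score the sentence's context words to decide how 'stocks' is being used."""
    words = set(sentence.lower().replace(",", "").split())
    finance = len(words & FINANCE_CUES)
    botany = len(words & BOTANY_CUES)
    if finance > botany:
        return "financial record: declare and retain"
    if botany > finance:
        return "non-record: decommissionable"
    return "ambiguous: route for manual review"

print(classify_stocks_sentence("stocks rose Monday on comments from analysts"))
print(classify_stocks_sentence("stocks on the plants require trimming"))
```

The point of the slides is that this decision cannot be made from the keyword alone; some surrounding context, however it is modeled, has to break the tie.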

  18. Classification Process for Legacy Data Retention
  • Analyze: Collect the information & context needed to make an informed decision (declare vs. decommission)
  • Decide: Assess the collected information and select a category (declare vs. decommission), accurately & repeatably
  • Take action: Use the selected category to determine & initiate an appropriate response (declare vs. decommission)
  • Enforce: Ensure actions are taken consistently & correctly, creating a defensible process
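The Analyze / Decide / Take Action / Enforce cycle above can be sketched as a decision function plus an audit trail, where the audit trail is what makes the process defensible. The feature names and decision rules below are assumptions for illustration, not the product's logic.

```python
import datetime

AUDIT_LOG = []  # Enforce: every decision is recorded so the process is defensible

def decide(features: dict) -> str:
    """Decide: map analyzed features to declare vs. decommission (illustrative rules)."""
    if features.get("record_type"):  # matched a record series in the file plan
        return "declare"
    if features.get("duplicate") or features.get("obsolete"):
        return "decommission"
    return "review"  # nothing conclusive: keep a human in the loop

def take_action(doc_id: str, features: dict) -> str:
    """Take action on one document and append a timestamped audit entry."""
    decision = decide(features)
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    AUDIT_LOG.append((stamp, doc_id, decision))
    return decision

print(take_action("doc-17", {"record_type": "Finance-2000"}))  # declare
print(take_action("doc-18", {"duplicate": True}))              # decommission
```

Making the decision rules explicit and logging every outcome is what lets the same inputs produce the same disposition "accurately & repeatably".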

  19. Solution Overview: IBM ECM
  Components of Content Analytics for Legacy Data Retention: Content Analytics, Content Collectors, Enterprise Records, Classification Module, and Electronic Discovery.

  20. Demonstration of Content Analytics for Legacy Data Retention
  • Content Analytics for Legacy Data Retention is a solution to inventory and locate legacy data requiring retention or disposition.
  • The Dayhuff Group has created this solution using IBM Content Analytics tools to perform the heavy lifting of mining legacy information.
  • It allows data retention analysts to view and analyze their source content based on familiar concepts, such as how the content fits into record series within their file plan.

  21. Facets – Record Types
  This view exposes facets which categorize the content based on the business' record types within record series.
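A record-type facet view like the one described can be approximated by counting documents per record series. The sample documents and the "series/category" naming convention below are invented for illustration.

```python
from collections import Counter

# Sample analyzed documents (assumed shape: each carries a record_type path).
docs = [
    {"id": "d1", "record_type": "Finance-2000/accounts payable"},
    {"id": "d2", "record_type": "Finance-2000/payroll"},
    {"id": "d3", "record_type": "Service-4000/warranty administration"},
]

def facet_counts(docs: list, facet: str = "record_type") -> Counter:
    """Count documents per record series (the text before the '/')."""
    return Counter(d[facet].split("/")[0] for d in docs)

print(facet_counts(docs))  # Counter({'Finance-2000': 2, 'Service-4000': 1})
```

The facet counts are what the analyst "drills into": selecting a series narrows the document set to that slice before deciding on disposition.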

  22. Select documents to decommission or declare
  The analyst can graphically see content that is past its retention period and available for decommissioning.

  23. Flag as Past Retention Period
  Documents are flagged based on their retention requirements, identified using Content Analytics and the Record Type facets, then exported to Content Collector.
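Flagging documents as past retention reduces, conceptually, to comparing each document's age against the retention period for its record type. The schedule below is an assumption for illustration, not an actual retention schedule.

```python
from datetime import date

# Illustrative retention schedule: record series -> retention in years (assumed values).
RETENTION_YEARS = {"Finance-2000": 7, "Legal-5000": 10, "Service-4000": 3}

def past_retention(record_type: str, created: date, today: date) -> bool:
    """Return True when the document's retention period has elapsed."""
    years = RETENTION_YEARS.get(record_type)
    if years is None:
        return False  # unknown type: never auto-flag, route for review instead
    expiry = created.replace(year=created.year + years)
    return today >= expiry

print(past_retention("Service-4000", date(2008, 5, 1), date(2012, 6, 1)))  # True
print(past_retention("Finance-2000", date(2008, 5, 1), date(2012, 6, 1)))  # False
```

Defaulting unknown record types to "not flagged" is the conservative choice here: over-retention costs storage, but wrongly decommissioning a record is the unrecoverable error.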

  24. Collect to IER with Content Collector
  Based on analysis results in Content Analytics, documents are either decommissioned (unnecessary content) or archived to Enterprise Records under ECM records retention programs & policies.

  25. IBM Enterprise Records
  Documents are declared as records in Enterprise Records matching the Record Type facets analyzed in Content Analytics. The example file plan includes record series such as Finance - 2000, Service - 4000, Legal - 5000, and Sales & Marketing - 7000, with categories including marketing communication, insurance, accounts payable, treasury, budget & forecast, payroll, cash management, financial reporting, tax, accounts receivable, customs, general accounting, cost accounting, sales, dealer support, product management, market research, warranty administration, customer service, and quality assurance.

  26. Content Analytics for Legacy Data Retention
  Benefits:
  • Reduced risk
  • Increased productivity
  • Quantifiable and measurable results
  • Reduced cost
  • Defensible process for evaluation
  • Reduction of information through disposition
  • Reduction of duplicated and old information
