
Human Created Information and Big Data

Human Created Information and Big Data. From Big Content to Unified Information Access. Alejandro Quiroga, Director Sales, aquiroga@attivio.com, +1.617.480.6465. Rik Tamm-Daniels, VP Technology, Co-Founder. What is Big Data, really? The Data Universe Big Bang. Business Intelligence.



Presentation Transcript


  1. Human Created Information and Big Data • From Big Content to Unified Information Access • Alejandro Quiroga, Director Sales, aquiroga@attivio.com, +1.617.480.6465 • Rik Tamm-Daniels, VP Technology, Co-Founder

  2. What is Big Data, really?

  3. The Data Universe Big Bang • Business Intelligence • Enterprise Search

  4. Big Data – Machine Generated Information BIG DATA (machine generated information)

  5. Big Content – Human Generated Information BIG CONTENT (human generated information)

  6. The (more) Complete Data Universe

  7. Let’s look at Big Content a bit closer • Consider this single email:
     To: X Airline Service <csr@x.com>
     From: Joe Customer <jcust@xyz.com>
     Subject: My recent experience with your phone agent
     Date: 3-1-2013 01:35:00
     I travelled on your airline from San Francisco to Boston and had a terrible experience with your phone agent. I had recently been granted frequent flyer status on your airline matching my status on airline Y and when I checked in for my flight was able to select a window upgraded seat. However, when I arrived at the airport, I was in a middle seat. I called the customer help desk and was told that “no seats are guaranteed” and there was nothing that could be done for me. The agent spoke in a very rude tone, not the way I would expect you to treat your frequent flyers.

  8. What does this single email tell us? • The email from slide 7, annotated with what can be extracted from it: • Customer name • Customer email • Relevant airport locations and route • Sentiment • Competitor • Key terms for my industry
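The annotations on this slide can be approximated in a few lines of code. The following is a minimal sketch, not the presenters' or Attivio's implementation: it uses only the Python standard library, and the lookup tables (`KNOWN_CITIES`, `NEGATIVE_WORDS`) and the competitor pattern are hypothetical stand-ins for a real entity extractor and sentiment model. The email body is shortened from the slide.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Shortened copy of the email from the slide.
RAW_EMAIL = """\
To: X Airline Service <csr@x.com>
From: Joe Customer <jcust@xyz.com>
Subject: My recent experience with your phone agent

I travelled on your airline from San Francisco to Boston and had a
terrible experience with your phone agent. I had recently been granted
frequent flyer status on your airline matching my status on airline Y.
The agent spoke in a very rude tone.
"""

# Hypothetical lookup tables; a real system would use trained models
# or corporate vocabularies instead.
KNOWN_CITIES = ["Boston", "San Francisco"]
NEGATIVE_WORDS = {"terrible", "rude"}
COMPETITOR_PATTERN = re.compile(r"\bairline ([A-Z])\b")


def annotate(raw):
    """Pull customer, route, competitor and sentiment cues out of one email."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    body = msg.get_payload()
    # Keep cities in the order they are mentioned, so "from A to B" reads as a route.
    mentioned = [c for c in KNOWN_CITIES if c in body]
    route = sorted(mentioned, key=body.index)
    tokens = set(re.findall(r"[a-z']+", body.lower()))
    return {
        "customer_name": name,
        "customer_email": address,
        "route": route,
        "competitors": COMPETITOR_PATTERN.findall(body),
        "negative_terms": sorted(tokens & NEGATIVE_WORDS),
    }


RESULT = annotate(RAW_EMAIL)
```

Header fields come for free from the message format; everything extracted from the body depends on the quality of the vocabularies, which is where real text analytics does the heavy lifting.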

  9. What’s the business value of this single email? • Next time the customer calls in, the CSR view could look like this: • Customer-centric view • Operational glimpse of the “Voice of the Customer”

  10. What could the value of a lot of emails be? • Now we can answer new questions like: • “Which routes are driving negative customer experiences and why are they unhappy with those routes?”
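Once each email carries a route, a sentiment label, and key phrases, the question on this slide becomes a simple aggregation. A minimal sketch follows, assuming the emails have already been annotated; the records, the second route, and the field names are hypothetical illustration data, not figures from the presentation.

```python
from collections import Counter, defaultdict

# Hypothetical, already-annotated emails.
EMAILS = [
    {"route": ("San Francisco", "Boston"), "sentiment": "negative",
     "key_phrases": ["seat assignment", "rude agent"]},
    {"route": ("San Francisco", "Boston"), "sentiment": "negative",
     "key_phrases": ["seat assignment"]},
    {"route": ("San Francisco", "Boston"), "sentiment": "positive",
     "key_phrases": []},
    {"route": ("Chicago", "Denver"), "sentiment": "negative",
     "key_phrases": ["lost baggage"]},
    {"route": ("Chicago", "Denver"), "sentiment": "positive",
     "key_phrases": []},
    {"route": ("Chicago", "Denver"), "sentiment": "positive",
     "key_phrases": []},
]


def negative_routes(records, top_n=2):
    """Rank routes by number of negative emails and list the main reasons."""
    totals = Counter()
    negatives = Counter()
    reasons = defaultdict(Counter)
    for record in records:
        route = record["route"]
        totals[route] += 1
        if record["sentiment"] == "negative":
            negatives[route] += 1
            reasons[route].update(record["key_phrases"])
    return [
        {
            "route": route,
            "negative_share": count / totals[route],
            "top_reasons": [p for p, _ in reasons[route].most_common(top_n)],
        }
        for route, count in negatives.most_common()
    ]


REPORT = negative_routes(EMAILS)
```

The "which routes" half of the question comes from the counts, and the "why" half comes from the key phrases attached to the negative emails on each route.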

  11. Building Bridges from Big Content to Big Data • Leverage Text Analytics routines for interpreting content • Entity Extraction: detection of people, places, things • Key Phrase Detection: identifying topics and concepts • Classification: sentiment analysis, categorization • Lean on metadata layers • Corporate taxonomies (e.g. product hierarchies, sales territories, etc.) • Controlled vocabularies (e.g. Medical Subject Headings) • It’s all about tying the Content back to a Business Entity • For example: identifying that the email is a complaint about Product XYZ
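Tying content back to a business entity can be illustrated with a tiny taxonomy lookup. This is a rough sketch under stated assumptions: `PRODUCT_TAXONOMY`, its aliases, and the `COMPLAINT_CUES` word list are hypothetical, and keyword matching stands in for the entity extraction and classification routines the slide refers to.

```python
import re

# Hypothetical product hierarchy: canonical business entity -> aliases seen in text.
PRODUCT_TAXONOMY = {
    "Product XYZ": ["xyz", "model xyz"],
    "Product ABC": ["abc"],
}

# Hypothetical cue words standing in for a trained complaint classifier.
COMPLAINT_CUES = {"broken", "refund", "terrible", "rude", "disappointed"}


def tag_document(text):
    """Map free text to taxonomy products and a coarse category."""
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z0-9]+", lowered))
    products = sorted(
        product
        for product, aliases in PRODUCT_TAXONOMY.items()
        if any(re.search(r"\b" + re.escape(alias) + r"\b", lowered)
               for alias in aliases)
    )
    category = "complaint" if tokens & COMPLAINT_CUES else "other"
    return {"products": products, "category": category}


COMPLAINT = tag_document("My XYZ arrived broken and I want a refund.")
QUESTION = tag_document("Is the ABC available in blue?")
```

The output of `tag_document` is what lets an email be joined to structured records about the same product, which is the "bridge" the slide describes.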

  12. Feeding Analytics with Big Content • Pull signals derived from unstructured content into your analytics • Treat content like time series data • Use creation dates, modified dates, etc. • Dates/times mentioned within the content itself! • Analyze co-occurrences, proximity, etc. using distance of entities • Identify relationships between people, products, companies, etc. • Develop even better predictive models to alert the business to act!
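Two of the ideas on this slide, treating content as time series data and analyzing co-occurrences, can be sketched as follows. The documents and entity names are hypothetical, and co-occurrence is counted at the document level only; token-distance proximity, which the slide also mentions, is not covered here.

```python
from collections import Counter
from datetime import date
from itertools import combinations

# Hypothetical documents with a creation date and already-extracted entities.
DOCS = [
    {"created": date(2013, 1, 14), "entities": {"Product XYZ", "overheating"}},
    {"created": date(2013, 1, 30),
     "entities": {"Product XYZ", "overheating", "Supplier Q"}},
    {"created": date(2013, 2, 2), "entities": {"Product XYZ", "late delivery"}},
]


def monthly_mentions(documents, entity):
    """Count documents mentioning an entity per creation month (YYYY-MM)."""
    counts = Counter()
    for doc in documents:
        if entity in doc["entities"]:
            counts[doc["created"].strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


def cooccurrences(documents):
    """Count how often each pair of entities appears in the same document."""
    pairs = Counter()
    for doc in documents:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(doc["entities"]), 2):
            pairs[(a, b)] += 1
    return pairs


TREND = monthly_mentions(DOCS, "Product XYZ")
PAIRS = cooccurrences(DOCS)
```

A rising monthly count or a frequently co-occurring pair is the kind of derived signal that can then be fed into existing analytics or predictive models.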

  13. Leveraging Big Content across Verticals • Big Content is everywhere!

  14. Can you really use Big Data tools for Big Content? • Volume • Velocity • Variety

  15. Enterprise Text Analytics Spectrum Directed Discovery

  16. Enterprise Text Analytics in Practice • Text Analytics are Iterative • Iteration, iteration, iteration – the faster you can iterate, the greater the ROI

  17. Enterprise Text Analytics Best Practices

  18. Unified Information Access

  19. Attivio and Unified Information Access • Introduction to Unified Information Access (UIA) • “Unified information access platforms will emerge to knit together information silos across the enterprise, no matter the format or the content. They are capable of indexing and integrating large volumes of unstructured, semi-structured, and structured information into a unified environment for information discovery, analysis, and decision support.” (Worldwide Big Data Technology and Services 2012–2015 Forecast)

  20. What do we mean by Unified Information Access? Unified Information Access Unify Disconnected Information Systems

  21. Why UIA Matters: Return on Information Assets

  22. Attivio Applications Source: IDC

  23. Big Content Architectures: Big Content Big Data Architecture • Components shown: Big Content Platform, ADBMS, CEP, Hadoop • Vertical alignment implies directed data flow from bottom to top • Horizontal alignment implies bi-directional data sharing

  24. Big Content Architectures: Big Content Virtualization Architecture • Components shown: Data Virtualization Engine, Big Content Platform, Hadoop

  25. UIA Big Data Architectures • Unified Information Architecture • Unified Insight Architecture • Components shown: UIA, Big Content, CEP, Hadoop, EDW

  26. Demonstration • Oil and Gas Well Cost Analysis

  27. Case Study: Well File Management • Our portfolio of oil fields has high variability of productivity and cost • We have good analytics on well data, BUT the “other” 80% of information is unstructured (e.g. daily reports, geological surveys, incident reports) • How do we figure out the why?

  28. Case Study: Well File Management

  29. Case Study: Well File Management • Ad hoc access to everything • No data model changes • No IT telling you they don’t load that field • Correlate field feedback with systems data • Find “more like this” for top wells…

  30. Case Studies

  31. Case Study: Root Cause Analysis • Root cause categories are compared against the Business Units (BUs) • The heat map shows the number of incidents at the various BUs • Indonesia is clearly seen as a hotspot for several root cause categories.
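The data behind a heat map like the one described here is a cross-tabulation of root cause category against business unit. A minimal sketch, assuming each incident has already been assigned a category; the category names, the second business unit, and all counts are hypothetical and not taken from the case study.

```python
from collections import Counter

# Hypothetical incident records: (root cause category, business unit).
INCIDENTS = [
    ("Equipment Failure", "Indonesia"),
    ("Equipment Failure", "Indonesia"),
    ("Equipment Failure", "Norway"),
    ("Procedure Not Followed", "Indonesia"),
    ("Procedure Not Followed", "Indonesia"),
    ("Procedure Not Followed", "Indonesia"),
    ("Design Flaw", "Norway"),
]


def incident_matrix(records):
    """Build {category: {business_unit: incident_count}} for a heat map."""
    counts = Counter(records)
    categories = sorted({category for category, _ in records})
    units = sorted({unit for _, unit in records})
    return {c: {u: counts[(c, u)] for u in units} for c in categories}


def hotspots(matrix, threshold):
    """Count, per business unit, the categories at or above the threshold."""
    found = Counter()
    for row in matrix.values():
        for unit, count in row.items():
            if count >= threshold:
                found[unit] += 1
    return dict(found)


MATRIX = incident_matrix(INCIDENTS)
HOT = hotspots(MATRIX, threshold=2)
```

A business unit that exceeds the threshold in several categories at once is what shows up visually as a hotspot.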

  32. Case Study: Root Cause Analysis • Choosing a high-activity zone (Indonesia) provides the key phrases that are associated with that category. • The top key phrases that drive such incidents are shown in a treemap.

  33. Case Study: Root Cause Analysis • Selecting a key phrase (Lube Oil) ties the incident back to the Root Cause documents that are associated with that key phrase. • We are now able to see the root cause reports from Indonesia where Lube Oil had a significant impact.

  34. Case Study: Root Cause Analysis • We select the next tab to further analyze the Hazops associated with the root causes through common key phrases. • Selecting the “Drain Valve” key phrase from the combined results shows the associated regions and the study names. • The root causes tied to the Hazop are brought out as well.

  35. Case Study: Global Manufacturer • Problem: • Create new BI platform to proactively manage customer jet engine fleets at new level of breadth and depth of detail beyond just data trends • Greatly improve efficiency, customer satisfaction and repeat business • Why AIE? • Analyze Everything platform for complete agile BI: integrates, correlates and presents data and content, with no advance data modeling required: • Engine sensor data, generated in “Big Data” volumes • Service status data, quality metrics, CRM and other databases • Customer case management notes • Engine maintenance system notes by service technicians • Supports BI tools with native SQL support & ODBC/JDBC connectivity • Results: • BI pilot completed in just 5 weeks – “a new standard for BI time to market” • Managers analyze and discover new correlations between changes in engine KPIs, sensor data, recurring key phrases from service notes & more • New insights into root causes behind service issues – not just the numbers • “No data left behind…Time from ‘data to decision’ drastically reduced”

  36. Dashboards using TIBCO Spotfire contain a mix of in-memory tables loaded from AIE with on-demand detail drill down • Access from TIBCO Spotfire: Search API, SQL over ODBC/JDBC • Logical tables from the BI tool perspective; AIE does not store data in physical tables • Query/response workflows • Universal index: KPIs, CRM cases, emails, devices owned, customers, operational events • In addition to raw data, ingestion workflows pre-compute core KPIs • Textual data is enriched with text analytics (sentiment, key phrases, entity extraction) • Ingestion workflows • Content API • Data & content connectors • Sources: Operational Events, Device Generated Data, Maintenance Reports, Customer Service Email, Siebel CRM, Teradata, Complex Event Processing Engine
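The slide says BI tools query logical tables with SQL over ODBC/JDBC. The sketch below shows the kind of query this enables: joining a KPI table with key phrases extracted from text. It uses an in-memory SQLite database purely as a stand-in for the ODBC/JDBC endpoint, and the table names, column names, and values are all hypothetical; none of this reflects AIE's actual schema or API.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory database standing in for the SQL endpoint."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE device_kpis (device_id TEXT, vibration REAL);
        CREATE TABLE case_phrases (device_id TEXT, key_phrase TEXT);
        INSERT INTO device_kpis VALUES ('E1', 0.9), ('E2', 0.2), ('E3', 0.8);
        INSERT INTO case_phrases VALUES
            ('E1', 'oil leak'), ('E3', 'oil leak'), ('E2', 'routine check');
        """
    )
    return conn


# Which key phrases from service notes go together with high KPI readings?
QUERY = """
SELECT p.key_phrase,
       COUNT(*)         AS cases,
       AVG(k.vibration) AS avg_vibration
FROM case_phrases AS p
JOIN device_kpis AS k ON k.device_id = p.device_id
GROUP BY p.key_phrase
ORDER BY avg_vibration DESC
"""

ROWS = build_demo_connection().execute(QUERY).fetchall()
```

Because text-derived fields sit next to numeric KPIs in the same queryable tables, correlating the two needs nothing beyond an ordinary join.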

  37. Case Study: Non-Productive Time Analysis • Business Problem: High number of adverse events on rigs leading to large amount of Non-Productive Time (127% greater than acceptable) • Approach: • Identify both common and uncommon causes of NPT • Develop predictive solution for preventing production halts • Outcome: • Better understanding of the uncommon causes of NPT (and a realization they are the biggest contributors) • Better prediction of severity of NPT causes allowing for minimizing NPT overall

  38. Case Study: Non-Productive Time Analysis

  39. Case Study: Manufacturing System Analytics • Business Problem: Lack of visibility into the performance of all current and historical parts, components, and systems that have been deployed on customer equipment. • Approach: • Integrate Excel spreadsheets, emails and CRM system data • Entity extraction to help identify root cause and correlation of events related to parts that can expose potential trends in a proactive manner • Technology: • Attivio AIE providing text analytics and correlation of entities / key-phrases • TIBCO Spotfire for visualization and dashboarding • Outcome: • Improved understanding of product performance to optimize warranty and service plans. • Faster resolution of cases by field and customer service personnel

  40. Case Study: Manufacturing System Analytics • Field Application for data entry will enable: • Connecting to back office • Field-based data entry • File uploads • BI Dashboard will feature Navigation, Reporting, and Analysis using: • Search & Facet Filters • Key Word/Key Phrase • Interactive Charting • Correlation across data sources • Entity Extraction will detect Parts, Serial #s, Events, Locations, etc. • Content Analytics identifies and creates linkages across all data sources and data elements • Attivio’s connectors retrieve new and changed data from the sources, as often as desired

  41. Thank you! Questions?
