Capabilities Apollo and SQL Server Data Mining. Presented by Jeff Kaplan, Principal Client Services Paul Bradley, Ph.D., Principal Data Mining Technology 312.787.7376. Agenda. Apollo Overview Data Mining 101 Project REAL Case Study SQL Server 2005 Data Mining Demo Real-life Examples.


Capabilities Apollo and SQL Server Data Mining

Presented by

Jeff Kaplan, Principal Client Services

Paul Bradley, Ph.D., Principal Data Mining Technology


Agenda
  • Apollo Overview
  • Data Mining 101
  • Project REAL Case Study
  • SQL Server 2005 Data Mining Demo
  • Real-life Examples


Apollo Overview

Company Background
  • First company delivering true predictive analytic solutions
  • 10 plus years in data mining and data warehousing
  • Premier Partner for SQL Server 2005 Data Mining
  • Cater to a Wide Range of Businesses, Including Microsoft, Sprint, Wal-Mart, Barnes & Noble, Seattle Times, Knight Ridder
  • Variety of Industries
    • Retail and Consumer Goods
    • Media
    • Financial Services
    • Manufacturing
    • Public Services



Sales & Distribution



Market Research

  • Claim Analysis
  • Call Center Analytics
  • Data Warehousing
  • Dashboard Reporting
  • Inventory Forecasting
  • Sales Forecasting
  • Pricing Optimization
  • Next Best Offer
  • Market Basket Analysis
  • Recency & Frequency Modeling
  • Customer Acquisition
  • Campaign Targeting
  • Cross-sell/Up-sell
  • Customer Segmentation
  • Retention Modeling
  • Behavioral Targeting
  • Personalization
  • Correlation Analysis
  • Key Driver Analysis
  • Verbatim Summarization


Customer Targeting Models

  • Score Model Results
  • Join Customer Data Sources
  • Run Predictive Algorithms
  • Deliver Targeted Predictions



Customer Clustering Models


Predictive Models







Automate Predictions for Targeting, Forecasting, Detection, etc.


Dashboard & Ad-hoc Reporting




Measure Promotion Success


MS Data Mining

  • Fastest Growing BI Segment (IDC)
    • Data Mining Tools: $1.85B in 2006
    • Predictive Analytic projects yield a high median ROI of 145%
  • Uses
    • Marketing: Customer Acquisition and Targeting, Cross-Sell/Up-Sell
    • Retail: Inventory Forecasting, Price Optimization
    • Market Research: Driver Analysis, Verbatim Summarization
    • Operations: Call Center Analytics
    • Finance: Fraud Detection, Risk Models
  • Mainstream Emergence
    • E-commerce (e.g
    • Search (e.g.
    • Behavioral Advertising
  • SQL Server is in a Unique Position to Service Market Needs
Evolution of SQL Server Data Mining

SQL 2000: Enter the Game
  • Create industry standard
  • Target developer audience
  • V1.0 product with 2 algorithms

SQL 2005: Win Leadership
  • Continue standards and developer effort
  • Comprehensive feature set
  • Penetrate the Enterprise
  • Thought leadership
Value of Data Mining

[Figure: SQL Server 2005 chart with axes Business Knowledge and Relative Business Value, showing Reports (Static), Reports (Adhoc), and Data Mining.]

SQL Server 2005 BI Platform
  • Management Tools
  • Development Tools
  • Reporting Services
  • Analysis Services: OLAP & Data Mining
  • Integration Services
  • SQL Server Relational Engine
SQL Server 2005 BI Platform
  • Embed Data Mining: Development Tool Integration
    • Make Decisions Without Coding
    • Customized Logic Based on Client Data
    • Logic Updated by Model Reprocessing – Applications Do Not Need to be Re-Written, Re-Compiled, and Re-Deployed
  • Data Mining Key Points
    • Price Point to Achieve Market Penetration
    • Database Metaphors for Building, Managing, Utilizing Extracted Patterns and Trends
    • APIs for Embedding Data Mining Functionality into Applications
SQL Server 2005 Algorithms
  • Decision Trees
  • Time Series
  • Neural Net
  • Sequence Clustering
  • Naïve Bayes
  • Linear and Logistic Regression

Project REAL


Client Profile – Inventory Forecasting
  • Create a Reference Implementation of a BI System Using Real Retail Data.
  • Partners - Barnes & Noble, Microsoft, Scalability Experts, EMC, Unisys, Panorama, Apollo
  • Forecast Out-of-Stock for 5 Book Titles Across Entire Chain (800 Stores)
  • Predictive Models to Flag Items That Are Going to be Out-of-Stock
  • Model on 48 Weeks of Data, Predictions for Month of December
  • Models Predicted Out-of-Stock Occurrences > 90% Accuracy
  • Conservative Sales Opportunity for just 5 Titles: $6,800 per year
  • Extrapolate Across Millions of Titles - Million Dollar Sales Opportunity
Predictive Modeling Process






  • Each item belongs to a category.
  • For the category, create a set of store clusters predictive of sales in the category.
  • Identify the cluster to which the store belongs, for the category of that item.
  • Utilize sales data to predict item sales 2 weeks out.
Out-of-Stock Data Preparation Summary
  • Apollo Explored 3 Data Preparation Strategies
    • Strategy 1: Use Sales, On-Hand, On-Order History Data for All Stores in the Same Cluster
      • Build One Mining Structure per Cluster, For All Stores in that Cluster for Each Title
      • Build One Mining Model per Store, per Cluster for Each Title
      • Negative: Few OOS Examples per Store, Computation to Deploy One Mining Model per Store/Title Combination
    • Strategy 2: Use Sales, On-Hand, On-Order History for All Stores, Across All Clusters
      • Build One Mining Structure per Book, Use Cluster Membership of Store as Input Attribute
      • Positive: Optimizes OOS Examples per Title by Considering All Stores
      • Negative: Does Not Capture Derivative Sales Information
    • Strategy 3: Removed Negative of Strategy 2
      • Included Historical Week-on-Week Sales Derivative Information for Each Title
      • Increase the Information Content of the Source Data for Modeling
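As a rough illustration of the week-on-week sales derivative information, the sketch below computes a first difference of weekly sales and attaches it to each row. The deck does not define how the derivative was calculated, so the first-difference definition, the function names, and the `sales_change` field are assumptions.

```python
def week_on_week_changes(weekly_sales):
    """Return sales[t] - sales[t-1] for every week after the first."""
    return [curr - prev for prev, curr in zip(weekly_sales, weekly_sales[1:])]

def add_derivative_attribute(rows):
    """Attach the week-on-week change to every row after the first.

    rows must be ordered by week for a single store and title; the first
    week is dropped because it has no preceding week to compare against.
    """
    changes = week_on_week_changes([r["sales"] for r in rows])
    return [dict(r, sales_change=c) for r, c in zip(rows[1:], changes)]

# Hypothetical weekly rows for one store and one title.
rows = [
    {"week_id": 1, "sales": 3, "store_cluster": 2},
    {"week_id": 2, "sales": 5, "store_cluster": 2},
    {"week_id": 3, "sales": 4, "store_cluster": 2},
    {"week_id": 4, "sales": 8, "store_cluster": 2},
]
enriched = add_derivative_attribute(rows)
```

Keeping `store_cluster` on every row mirrors the second strategy, where cluster membership is an input attribute rather than a reason to build separate models.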
Creating Variables for Success
  • Using:
    • Sales and Inventory History from January 2004 to end of November 2004
    • Recommend Two (2) Years of Historical Data to Increase Model Training Accuracy
  • Key:
    • Store + Fiscal Year + WeekID
  • Predicted Variables
    • 1 Week Ahead OOS Boolean
    • 1 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)
    • 2 Week Ahead OOS Boolean
    • 2 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4+)
  • Input Attributes
    • Store Cluster Membership (Derived from Store Cluster Model)
    • Current Week Sales, On-Hand, On-Order
    • Preceding 1-5 Week Sales, On-Hand, On-Order
    • Sales Derivative Attributes
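A minimal sketch of how one modeling row could be assembled from this variable list, keyed by Store + Fiscal Year + WeekID. The field names, the `build_row` helper, and the rule that a week is out-of-stock when on-hand inventory is zero are all assumptions made for illustration. The sales bins are left out.

```python
def build_row(history, index, store, fiscal_year, n_lags=5):
    """Build one modeling row for the week at history[index].

    history is a list of weekly dicts (week_id, sales, on_hand, on_order),
    ordered oldest first. Returns None when there is not enough history
    behind the week or not enough future weeks to label it.
    """
    if index < n_lags or index + 2 >= len(history):
        return None
    current = history[index]
    row = {"key": (store, fiscal_year, current["week_id"])}
    # Lag 0 is the current week; lags 1-5 are the preceding weeks.
    for lag in range(0, n_lags + 1):
        week = history[index - lag]
        for measure in ("sales", "on_hand", "on_order"):
            row[f"{measure}_lag{lag}"] = week[measure]
    # Predicted variables: OOS flags 1 and 2 weeks ahead.
    # Assumption: a week counts as out-of-stock when on_hand is zero.
    for ahead in (1, 2):
        row[f"oos_{ahead}wk_ahead"] = history[index + ahead]["on_hand"] == 0
    return row

# Hypothetical eight-week history for one store and title.
history = [
    {"week_id": w, "sales": w, "on_hand": max(0, 8 - w), "on_order": 0}
    for w in range(1, 9)
]
row = build_row(history, index=5, store="store_1", fiscal_year=2004)
```

Store cluster membership and the sales derivative attributes would be added to the same row as further input columns.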
Model Training and Testing Scenarios
  • Purpose: Intelligence on Model Training Frequency
    • Scenario 1: Train Models Every 2 Weeks
      • Training Dataset: All Data Prior to Last 2 Fiscal Weeks in December 2004
      • Test Dataset: Last 2 Fiscal Weeks in December 2004
    • Scenario 2: Train Models Monthly
      • Training Dataset: All Data Prior to End of Fiscal November 2004
      • Test Dataset: Fiscal Month of December 2004
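Both scenarios amount to a time-ordered split at a cutoff week. The sketch below assumes a 52-week fiscal year in which fiscal December covers weeks 49 to 52; the deck does not give the fiscal calendar, so those week numbers and the `split_by_week` helper are hypothetical.

```python
def split_by_week(rows, first_test_week):
    """Train on every week before first_test_week, test on the rest."""
    train = [r for r in rows if r["week_id"] < first_test_week]
    test = [r for r in rows if r["week_id"] >= first_test_week]
    return train, test

# Hypothetical rows, one per fiscal week of a 52-week year.
rows = [{"week_id": w} for w in range(1, 53)]

# Scenario 1: hold out the last 2 fiscal weeks of December (assumed weeks 51-52).
train_1, test_1 = split_by_week(rows, first_test_week=51)

# Scenario 2: hold out all of fiscal December (assumed weeks 49-52).
train_2, test_2 = split_by_week(rows, first_test_week=49)
```

Splitting on time rather than at random keeps future weeks out of the training data, which is what makes the test accuracy a fair estimate of how a model retrained on that schedule would perform.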
Balancing Training Data
  • When Considering All Stores, Still Have Un-Balanced Datasets
    • [# Store/Week Combinations Where OOS is False] >> [# Store/Week Combinations Where OOS is True]
    • Common in Many Data Mining Applications
  • Training Datasets were Balanced
    • Sample Store/Week Combinations Where OOS is False to Obtain Equal Proportion of True/False Values
  • “Costs” of Predictive Errors are Equal
    • Requested by Client
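The balancing step can be sketched as random downsampling of the OOS-is-False rows until they match the number of OOS-is-True rows. The function name, the `oos` field, the fixed seed, and the 95/5 example split are assumptions; the deck only states that the False cases were sampled to reach equal proportions.

```python
import random

def balance_by_downsampling(rows, label="oos", seed=0):
    """Keep all True rows and an equally sized random sample of False rows."""
    positives = [r for r in rows if r[label]]
    negatives = [r for r in rows if not r[label]]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    keep = min(len(positives), len(negatives))
    return positives + rng.sample(negatives, keep)

# Hypothetical store/week rows: 95 in stock, 5 out of stock.
rows = [{"id": i, "oos": i < 5} for i in range(100)]
balanced = balance_by_downsampling(rows)
```

Only the training data would be balanced this way; a test set is normally left at its natural proportions so that measured accuracy reflects real conditions.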
Prediction Methods
  • Algorithm Selection
    • Microsoft Decision Trees for Predicting OOS Boolean flags
      • Consistently High Overall Accuracy
      • Straightforward Interpretation
  • Data Preparation
    • Scenario 2
    • Rebuild models monthly
      • Predictive Models are Contextual and Optimized for Behavior in the Coming Month
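To show why tree-style models are easy to interpret, here is a one-level decision tree (a decision stump) in plain Python. It is a deliberately simplified stand-in and not the Microsoft Decision Trees algorithm; the `on_hand` feature, the fixed rule direction, and the sample rows are hypothetical.

```python
def fit_stump(rows, feature, label):
    """Find the threshold on one numeric feature that best predicts the label.

    The learned rule is: predict True when feature <= threshold.
    Returns (threshold, training accuracy).
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({r[feature] for r in rows}):
        correct = sum((r[feature] <= threshold) == r[label] for r in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(rows)

# Hypothetical rows: current on-hand units and whether the title was OOS a week later.
rows = [
    {"on_hand": 0, "oos_next_week": True},
    {"on_hand": 1, "oos_next_week": True},
    {"on_hand": 2, "oos_next_week": True},
    {"on_hand": 5, "oos_next_week": False},
    {"on_hand": 8, "oos_next_week": False},
    {"on_hand": 12, "oos_next_week": False},
]
threshold, accuracy = fit_stump(rows, "on_hand", "oos_next_week")
```

The result reads as a single business rule ("flag the title when on-hand units are at or below the threshold"), which is the kind of straightforward interpretation a full decision tree extends across many attributes.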
Prediction Methods
  • Modeling Methodology Benefits
    • Scalability (Titles and Stores)
    • Saves 4x to 5x on Computational Cost when Rebuilding Models (versus Neural Networks)
      • 5 Minutes for All 5 Titles => 1 Minute per Title for All Stores
Inventory Prediction Results
  • 1 week and 2 week prediction accuracies
Sales Opportunity
  • Data Mining created revenue generating opportunity
  • Based on 5 titles for Jan 2004 - Dec 2004
    • (# of weeks OOS across all stores) x (Apollo Boolean Predicted Accuracy)
    • x (actual % of actual sales across all stores) x (retail price)
    • = Yearly Increase in Sales Opportunity using Apollo OOS Predictions

Sales bins produced $3.4K, $6.8K potential lift in sales
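The sales opportunity calculation is a straight product of four factors, sketched below. The parameter names are assumptions, the sales percentage is treated as a fraction between 0 and 1, and the input values are made up purely to show the arithmetic; they are not Project REAL figures.

```python
def yearly_sales_opportunity(oos_weeks, predicted_accuracy, sales_fraction, retail_price):
    """Multiply the four factors of the sales opportunity formula.

    oos_weeks          : number of weeks out-of-stock across all stores
    predicted_accuracy : accuracy of the OOS Boolean prediction (0 to 1)
    sales_fraction     : the sales percentage factor, as a fraction (0 to 1)
    retail_price       : retail price of the title
    """
    return oos_weeks * predicted_accuracy * sales_fraction * retail_price

# Hypothetical inputs, for illustration only.
opportunity = yearly_sales_opportunity(
    oos_weeks=1000,
    predicted_accuracy=0.9,
    sales_fraction=0.5,
    retail_price=20.0,
)
```

Because the factors multiply, a change in any one of them, such as prediction accuracy, scales the estimated opportunity proportionally.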



Client Profiles

Client Profile – Customer Acquisition
  • Decrease Subscriber Churn
  • Increase New Subscriptions
  • Segment Geo-Demographic and Attitudinal Behaviors for Subscribers and Non-Subscribers
  • Build Predictive Models to Identify Likely New Subscribers
  • Using Analysis to Deliver Targeted Marketing Campaigns for Acquisition
  • Increased Stop Saves by 2%
Client Profile – Cross-sell / Up-sell (Global Catalog Retailer)
  • Increase Average Purchase Size
  • Deploy Product Recommendations on their Website
  • Modeling Historical Sales to Determine Product Affinities
  • Incorporate Business Logic into Modeling Process (e.g. Same category recommendation)
  • Increase Average Shopping Cart Size
  • Increase Sales Lift
  • Data Mining Driven Product Recommendation Performed Better than Manual Recommendations
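One simple way to model product affinities from historical sales is to count how often two products appear in the same order, then apply a business rule such as same-category recommendation as a filter. The sketch below assumes that approach; the deck does not say which algorithm the client project used, and the orders, categories, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_counts(orders):
    """Count how many orders contain each unordered pair of products."""
    counts = Counter()
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts

def recommend(product, orders, category, top_n=3):
    """Rank co-purchased products, keeping only those in the same category."""
    scores = Counter()
    for (a, b), n in pair_counts(orders).items():
        if product == a:
            other = b
        elif product == b:
            other = a
        else:
            continue
        # Business rule: only recommend within the product's own category.
        if category[other] == category[product]:
            scores[other] += n
    return [p for p, _ in scores.most_common(top_n)]

# Hypothetical order history and category lookup.
orders = [
    ["tent", "stove", "boots"],
    ["tent", "stove"],
    ["tent", "novel"],
    ["tent", "novel"],
    ["tent", "novel"],
]
category = {"tent": "camping", "stove": "camping", "boots": "camping", "novel": "books"}
suggestions = recommend("tent", orders, category)
```

In this made-up data the novel is bought with the tent most often, yet the category rule removes it, which shows how business logic can override raw affinity.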
Client Profile – Customer Support Automation
  • Increase Visibility into Customer Service Center
  • Increase Speed of Customer Support
  • Utilizing Text Mining Engines to Automate Processing of Customer Support (Email, Web Inquiries, etc.)
  • Automating the Process of Rolling up Keywords into Concepts
  • Customer Support Center has the Ability to View Trends in Minutes versus Weeks
  • Improved Accuracy - Text Mining Engines Removed the Bias and Inaccuracies Often Occurring in Call Center Representative Notes and Tagging.
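Rolling keywords up into concepts can be pictured as a lookup from keyword to concept followed by a count per concept. This is a minimal sketch, not the text mining engine used in the project; the `CONCEPTS` table and the sample messages are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword-to-concept table.
CONCEPTS = {
    "refund": "billing",
    "invoice": "billing",
    "charge": "billing",
    "password": "account access",
    "login": "account access",
}

def concept_counts(messages):
    """Count how many messages mention each concept at least once."""
    counts = Counter()
    for message in messages:
        words = set(re.findall(r"[a-z]+", message.lower()))
        # A message counts once per concept, however many keywords it contains.
        counts.update({CONCEPTS[w] for w in words if w in CONCEPTS})
    return counts

# Hypothetical support inquiries.
messages = [
    "I need a refund for this invoice",
    "Cannot login, password reset fails",
    "Wrong charge on my card",
]
trend = concept_counts(messages)
```

Running the same count over each day's inquiries would give the kind of trend view the support center needs, without relying on manual tagging.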
Client Profile – Key Driver Analysis
  • Evaluate Customer Satisfaction Metrics
  • Increase Customer Satisfaction
  • Partnered with Apollo to Develop Market Research Database and Reporting
  • Developed Models to Identify “Key” Satisfaction Drivers
  • Successfully Identified Drivers to Increase Customer Satisfaction
  • Delivered Driver Recommendations to Field Operations - Insight into Action
  • Company Wide (sales, marketing, executive level) Visibility into Customer Satisfaction Metrics
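A basic form of key driver analysis ranks survey attributes by how strongly they correlate with overall satisfaction. The sketch below uses Pearson correlation as a stand-in; the deck does not state which models were built, and the attribute names and scores are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)

def rank_drivers(attributes, satisfaction):
    """Order attribute names by absolute correlation with satisfaction, strongest first."""
    scores = {name: pearson(values, satisfaction) for name, values in attributes.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

# Hypothetical survey responses from five customers.
satisfaction = [1, 2, 3, 4, 5]
attributes = {
    "store_cleanliness": [3, 1, 4, 2, 3],
    "wait_time": [5, 4, 3, 2, 2],
    "staff_courtesy": [1, 2, 3, 4, 5],
}
drivers = rank_drivers(attributes, satisfaction)
```

Ranking by absolute value keeps negative drivers such as wait time near the top, since reducing them can raise satisfaction as much as improving a positive driver. Correlation alone does not establish cause, so a ranking like this is a starting point for the recommendations rather than proof.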
Presented by

Jeff Kaplan

Principal Client Services

jeff@apollodatatech.com

312.787.7376