Speakers:

Prof Y V Hui, CityU

Dr H P Lo, CityU

Dr Sammy Yuen, CityU

Dr K W Cheng, SAS Institute

Mr Steven Parker, Standard Chartered

Knowledge Discovery Centre: CityU-SAS Partnership

The Art and Science of Data Mining

Y V Hui

City University of Hong Kong

The Driving Forces
  • Specialization and focus in business

- To satisfy the needs of customers

- To improve and develop specific business strategies and processes

- Personalization through mass customization

The Driving Forces
  • Challenges

- local and global competition

- distributed business operations

- product innovation

  • Technology development
  • Benefit, cost and risk on a product or customer basis

Data Mining
  • Also known as knowledge discovery in databases. Data mining digs out valuable information from large and messy data. (Computer scientist’s definition)
  • Data mining is a knowledge discovery process. It’s the integration of business knowledge, people, information, statistics and computing technology.

Data Mining is Hot
  • Ten Hottest Jobs, Time, 22 May 2000
  • Ten emerging areas of technology, MIT Technology Review, Jan/Feb 2001

Data Mining Philosophy
  • A powerful enabler of competitive advantage.
  • Data mining is driven from business knowledge.
  • Data mining is about enabling people to discover actionable information about their business.
  • Profit does not come from algorithms alone.


Scope of Data Mining
  • Management’s Decision World

- Business outlook

- Industry conditions

- Product offering

- Customer analysis

- Strategic options

- Competitive actions

- etc.

  • Interface

- Problem development and management

- Reporting and evaluations

  • Data Miner’s Analytical World

- Project design

- Data collection and preparation

- Model building

- Validation

Project Management
  • Cross-functional team
  • System architecture

Successful applications
  • Business transaction

- risks and opportunities

  • Customer relationship management

- personalization, target marketing

  • Electronic commerce & web

- web mining

Successful applications
  • Science & engineering
  • Health care
  • Multi-media
  • Others

Data Mining Process

  • Understanding of business
  • Problem identification

Understanding Your Business
  • Do we have a problem?

- What is the current situation? Are there any undesirable situations that need attention?

- Are there any conditions, processes, etc., that could be improved?

- Are any problems foreseeable that could affect the business?

- Are there any potential opportunities that the company may capitalize on?

A problem is a learning opportunity.

Understanding Your Problem
  • Operational or analytical
  • Conventional rules or knowledge discovery
  • Product based or customer based
  • Market research or data mining
  • Ownership of the information
  • Privacy
  • Added value

Data Mining Process

  • Understanding of business
  • Problem identification
  • Collecting relevant information

Collecting Relevant Information
  • Data Search
  • Data Collection
  • Data Preparation
  • Data Mining Database

Data Search
  • Exploring the problem space.

Don’t let the data drive the problem.

  • Measurement
  • Exploring the data sources

Data Collection
  • Data retrieval
  • Data audit
  • Data set assembly and data warehouse
  • Survey

Data Preparation
  • Data representation
  • Data exploration
  • Data normalization
  • Data transformation
  • Imputation of missing data
  • Data tuning
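
Two of the preparation steps listed above, imputation of missing data and data normalization, can be illustrated with a minimal sketch. The function names, the choice of mean imputation and min-max scaling, and the income figures are all assumptions made for illustration; the slides do not prescribe a particular method.

```python
from statistics import mean

def impute_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical incomes with one missing entry (assumption, not from the slides).
incomes = [20.0, None, 40.0, 60.0]
filled = impute_missing(incomes)
scaled = min_max_normalize(filled)
```

Mean imputation is only one option; whether it is appropriate depends on why the values are missing.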

Data Mining Database
  • Variable selection
  • Record selection
  • Data set partition
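
Data set partition can be sketched as a random split into training, validation and test sets. The 60/20/20 proportions, the fixed seed and the function name `partition` are assumptions chosen for illustration, not values given in the slides.

```python
import random

def partition(records, train=0.6, validate=0.2, seed=0):
    """Shuffle records and split them into training, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * validate)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

# Ten hypothetical record ids.
train_set, valid_set, test_set = partition(range(10))
```

The model is fitted on the training set, tuned on the validation set, and assessed once on the test set.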

Data Mining Process

  • Learning
  • Understanding of business
  • Problem identification
  • Collecting relevant information
  • Model building

Model Building
  • Model based vs non-model based

$(y_1, y_2, \ldots, y_p) = f(x_1, \ldots, x_q)$

Inputs: $x_1, \ldots, x_q$

Outputs: $y_1, \ldots, y_p$

Model Building
  • Parametric vs nonparametric

Model Building
  • Estimation vs trial and error
  • Directed vs undirected
  • Multidimensional analysis
  • Large data set vs small data set


Data Mining Algorithms
  • Online Analytical Processing

- SQL

- Query Tools

  • Discovery Driven Methods: Description

- Visualization

- Clustering

- Association

- Sequential Analysis

  • Discovery Driven Methods: Prediction

- Classification

- Regressions

- Decision Trees

- Neural Networks

Online Analytical Processing
  • Query and reporting

Example of SQL query:

How many credit-card customers who made purchases of over $1,000 on sporting goods in December have at least $20,000 of available credit?

  • Manual and validation driven
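
The example question above might be expressed as the following query, run here against an in-memory SQLite database. The table names, column names and sample rows are hypothetical, and the sketch assumes that "purchases of over $1,000" means a customer's total December spending on sporting goods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows, invented for illustration only.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, available_credit REAL);
CREATE TABLE purchases (customer_id INTEGER, category TEXT,
                        amount REAL, purchase_date TEXT);
INSERT INTO customers VALUES (1, 25000), (2, 15000), (3, 30000);
INSERT INTO purchases VALUES
    (1, 'sporting goods', 1200, '2000-12-05'),
    (2, 'sporting goods', 1500, '2000-12-10'),
    (3, 'sporting goods',  400, '2000-12-12'),
    (3, 'groceries',      2000, '2000-12-15');
""")

query = """
SELECT COUNT(*)
FROM customers c
WHERE c.available_credit >= 20000
  AND (SELECT COALESCE(SUM(p.amount), 0)
       FROM purchases p
       WHERE p.customer_id = c.id
         AND p.category = 'sporting goods'
         AND strftime('%m', p.purchase_date) = '12') > 1000
"""
count = conn.execute(query).fetchone()[0]
print(count)
```

With these sample rows only customer 1 satisfies both conditions. The analyst has to know what to ask in advance, which is what makes this approach manual and validation driven.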

Estimation and Prediction
  • Statistical models
  • Neural network

Example:

Housing price valuation model
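
As a rough illustration of a statistical valuation model, the sketch below fits a one-variable least-squares line of price on floor area. The data, the single predictor and the function name `fit_line` are assumptions; a real valuation model would use many more inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical flats: floor area (sq ft) and price (HK$ million).
areas = [400, 600, 800, 1000]
prices = [2.0, 3.0, 4.0, 5.0]

intercept, slope = fit_line(areas, prices)
predicted = intercept + slope * 700  # estimated price of a 700 sq ft flat
```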

Classification Algorithms
  • Statistical techniques
  • Neural networks
  • Genetic algorithms
  • Nearest neighbor method
  • Rule induction and decision tree

Example: Customer segmentation and buying behavior description
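
Of the techniques listed above, the nearest neighbor method is the simplest to sketch. The example below assigns a customer to the segment held by the majority of its k closest neighbors; the features (age, purchases per year), the segment labels and the value k = 3 are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (age, purchases per year) and their known segment.
customers = [
    ((25, 2), "occasional"), ((30, 3), "occasional"), ((28, 1), "occasional"),
    ((45, 12), "frequent"), ((50, 15), "frequent"), ((48, 10), "frequent"),
]

segment = knn_classify(customers, (47, 11))
```

In practice the features would be normalized first so that no single variable dominates the distance.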

Association Rules
  • Apriori algorithm

Example:

Market basket analysis, cross selling analysis
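
A compact sketch of the Apriori idea for market basket analysis follows: frequent itemsets are grown level by level, and a candidate is counted only if all of its smaller subsets are already frequent. The baskets and the minimum support count of 3 are invented for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least min_support baskets."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many baskets contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if every (k-1)-subset is frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=3)

# Confidence of the rule "bread -> milk": support(bread, milk) / support(bread).
confidence = frequent[frozenset({"bread", "milk"})] / frequent[frozenset({"bread"})]
```

A rule with high confidence, such as "customers who buy bread also buy milk", is a candidate for cross selling.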

Sequential Analysis
  • Count-all algorithm
  • Count-some algorithm

Example:

Attached mailing, add-on sales
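
The quantity that sequential analysis counts is the support of an ordered pattern: the share of customers whose purchase history contains the pattern's items in the same order. The sketch below shows only this support calculation, simplified to single items per purchase; it is not an implementation of the count-all or count-some algorithms, and the purchase histories are hypothetical.

```python
def contains_sequence(history, pattern):
    """True if the items of pattern occur in history in order (gaps allowed)."""
    remaining = iter(history)
    # Each membership test consumes the iterator up to the matching item,
    # so later items must appear after earlier ones.
    return all(item in remaining for item in pattern)

def sequence_support(histories, pattern):
    """Fraction of customer histories that contain the pattern."""
    return sum(contains_sequence(h, pattern) for h in histories) / len(histories)

# Hypothetical purchase histories, one list per customer, in time order.
histories = [
    ["printer", "paper", "ink"],
    ["printer", "ink"],
    ["ink", "printer"],
    ["paper", "printer", "cable", "ink"],
]
support = sequence_support(histories, ["printer", "ink"])
```

A pattern with high support, such as a printer followed later by ink, suggests when to send an attached mailing or offer an add-on sale.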

Algorithms Comparison
  • No single data mining algorithm outperforms all the others on every problem.

Try different algorithms and draw conclusions from the results. Use your business knowledge.

  • Neural networks do no better than statistical models when the underlying structure is known. However, neural networks detect hidden interactions and nonlinearity.

Use the prior information if available.

Algorithms Comparison
  • Data mining algorithms generally assume independent records and cannot handle dependent ones.

Use the prior information. Statistical models help.

  • Data tuning and dimension reduction enhance data mining before and after the analysis.

Statistical techniques help.

Data Mining Process

  • Learning
  • Understanding of business
  • Problem identification
  • Collecting relevant data
  • Model building
  • Business strategy and evaluation
  • Action

Trends that Affect Data Mining
  • Data trends

- data explosion

- data types

Trends that Affect Data Mining
  • Hardware trends

- memory

- processing speed

- storage

Trends that Affect Data Mining
  • Network trends

- network connectivity

- distributed databases

  • Wireless communication

Trends that Affect Data Mining
  • Scientific computing trends

- theory, experiment and simulation

Trends that Affect Data Mining
  • Business trends

- total quality management

- customer relationship management

- business process reengineering

- enterprise resource planning

- supply chain management

- business intelligence and knowledge management

- e-business and m-business

Trends that Affect Data Mining
  • Privacy and Security

Pot of Gold
  • The benefits of knowing one’s business and customers become so critical that technologies are coming together to support data mining.
  • Data mining is not cybernetic magic that will turn your data into gold. It is the process and result of knowledge production, knowledge discovery and knowledge management.
