Query Optimization over Web Services

Presentation Transcript
Query Optimization over Web Services

Utkarsh Srivastava

Jennifer Widom

Kamesh Munagala

Rajeev Motwani

Performance Numbers

[Figure: Relative Contribution to Research. Percent Contribution (0 to 100) of Student and Advisor versus Time in Program (0 to 5 years), with "This Work" marked on the plot]

Future Directions (sample)
  • Web services with monetary cost
  • Web services with unstable response times (QoS guarantees?)
  • Multiple web services for same data
  • Caching web-service query results
  • More expressive queries, also workflows
  • Web service profiling and statistics-tracking
First Steps in Big Problem

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]

Web Services
  • Standardized way of sharing data and functionality
  • Description and discovery
  • Communication

[Diagram: Data and Functionality are exposed as Web Services to Users/Clients; labels: WSDL, UDDI; SOAP]

Example Web Services

WS1 (Reuters): Stock symbol → Company info

WS2 (NASDAQ): Stock symbol → Stock activity

Querying Across Web Services

Query: Get info about all companies with high-activity stock

[Diagram: User/Client sends Query and receives Results]

WS1 (Reuters): Stock symbol → Company info

WS2 (NASDAQ): Stock symbol → Stock activity

  • Easy
  • Transparent
  • Efficient
  • Etc.

Same Basic Goal as Traditional DBMS

[Diagram: User/Client sends Query through a Declarative Interface to a Database Management System over Data, and receives Results]

  • Easy
  • Transparent
  • Efficient
  • Etc.
Web Service Management System

[Diagram: User/Client sends Query to the Web Service Management System, which accesses WS1 (Reuters) and WS2 (NASDAQ), and receives Results]

  • Easy
  • Transparent
  • Efficient
  • Etc.
WSMS Architecture

[Diagram: Client sends Query + input data to the WSMS through a Declarative Interface and receives Results; the WSMS issues WS Invocations to WS1, WS2, …, WSn]

WSMS components:
  • Metadata Component: Schema mapper, Web service registration
  • Query Processing Component: Plan selection, Plan execution
  • Profiling and Statistics Component: Statistics tracker, Response-time profiler

Running Example
  • Credit card company wants to send offers to people with:
    • credit rating > 600, and
    • payment history = “good” on prior credit card
  • Company has at its disposal:

L : List of potential recipients (identified by SSN)

WS1 : SSN → credit rating

WS2 : SSN → cc number(s)

WS3 : cc number → payment history

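Plan 1

[Diagram: Query plan executed by the WSMS on behalf of the Client, in order:]

L(SSN)
→ WS1 (SSN → cr), returns SSN,cr
→ Filter on cr, keep SSN
→ WS2 (SSN → ccn), returns SSN,ccn
→ WS3 (ccn → ph), returns SSN,ccn,ph
→ Filter on ph, keep SSN
→ Client

Note: Pipelined processing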
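The pipelined flow of Plan 1 can be sketched with Python generators, where each tuple moves to the next stage as soon as it is produced. This is only an illustration: the three service functions and their data are hypothetical in-memory stand-ins for WS1, WS2 and WS3, and lazy generators approximate pipelining without modeling concurrent web service calls.

```python
# Hypothetical in-memory stand-ins for the three web services (made-up data).
CREDIT_RATING = {"111": 720, "222": 580, "333": 650}              # WS1: SSN -> cr
CARDS = {"111": ["c1", "c2"], "222": ["c3"], "333": ["c4"]}       # WS2: SSN -> ccn(s)
HISTORY = {"c1": "good", "c2": "bad", "c3": "good", "c4": "bad"}  # WS3: ccn -> ph


def ws1(ssns):
    for ssn in ssns:
        yield ssn, CREDIT_RATING[ssn]


def filter_cr(rows):
    # Filter on cr, keep SSN
    for ssn, cr in rows:
        if cr > 600:
            yield ssn


def ws2(ssns):
    # One output tuple per credit card, so this stage can produce
    # more tuples than it receives.
    for ssn in ssns:
        for ccn in CARDS[ssn]:
            yield ssn, ccn


def ws3(rows):
    for ssn, ccn in rows:
        yield ssn, ccn, HISTORY[ccn]


def filter_ph(rows):
    # Filter on ph, keep SSN
    for ssn, _ccn, ph in rows:
        if ph == "good":
            yield ssn


def plan1(ssns):
    # Generators are lazy, so the stages are chained without materializing
    # intermediate results.
    return filter_ph(ws3(ws2(filter_cr(ws1(ssns)))))


result = list(plan1(["111", "222", "333"]))
print(result)
```

With the made-up data above, only SSN "111" has both a rating above 600 and a card with a "good" payment history.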

Simple Representation of Plan 1

L → WS1 (SSN → cr) → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results

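Plan 2

[Diagram: L(SSN) is sent by the WSMS along two branches whose outputs are joined:]

Branch 1: SSN → WS1 (SSN → cr), returns SSN,cr → Filter on cr, keep SSN
Branch 2: SSN → WS2 (SSN → ccn), returns SSN,ccn → WS3 (ccn → ph), returns SSN,ccn,ph → Filter on ph, keep SSN

Join of the SSN outputs of both branches → Client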

Simple Representation of Plan 2

L → WS1 (SSN → cr) → Results
L → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results

Quiz

Which plan is better?

Plan 1: L → WS1 → WS2 → WS3 → Results

Plan 2: L → WS1 → Results, in parallel with L → WS2 → WS3 → Results

  • Cost metric: steady-state throughput
  • Assume join is “free”

Plan 1 is never worse

Query Optimization Primer
  • Possible query plans: P1, …, Pn
  • Data/access statistics: S
  • Execution cost metric: cost(Pi, S)
  • GOAL: Find least-cost plan
Queries and Plans
  • “Select-Project-Join” queries over input data L and set of web services WS1, …, WSn
  • Precedence constraints
    • Output of WSi may be needed as input for WSj
    • Ex: WS2: SSN → ccn and WS3: ccn → ph
  • Precedence DAG defines space of query plans
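To make the effect of a precedence constraint concrete, the following sketch enumerates the linear orders of the running example's three services that respect "WS2 before WS3". It covers linear orders only; the full plan space also contains non-linear plans such as Plan 2. The `(before, after)` pair encoding is an assumption made for illustration.

```python
from itertools import permutations

# Precedence constraints as (before, after) pairs; WS2 must precede WS3
# because WS3 needs the ccn produced by WS2.
services = ["WS1", "WS2", "WS3"]
precedence = [("WS2", "WS3")]


def respects(order, constraints):
    """True if every 'before' service appears earlier than its 'after' service."""
    position = {ws: i for i, ws in enumerate(order)}
    return all(position[a] < position[b] for a, b in constraints)


valid_orders = [o for o in permutations(services) if respects(o, precedence)]
for order in valid_orders:
    print(" -> ".join(order))
```

Of the six possible orders, three satisfy the constraint.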
Query Optimization Primer
  • Possible query plans: P1, …, Pn
  • Data/access statistics: S
  • Execution cost metric: cost(Pi, S)
  • GOAL: Find least-cost plan
Statistics
  • Web service response times
  • Web service selectivities

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]

Statistics: Response Times
  • ri: per-tuple response time of WSi from client

[Diagram: Client sends SSN to WS1 (SSN → cr) and receives cr; the response time is r1]

  • ri ≈ 1/throughput, can be reduced by batching, parallel calls (for batching, see paper)
  • Assume independent response times within query plans

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]

Statistics: Selectivities
  • si: selectivity of WSi
  • Average # output tuples per input tuple to WSi, including post-filtering in query plan

WS1: SSN → cr, filter cr > 600

If 90% of SSNs have cr > 600 then s1 = 0.9

WS2: SSN → ccn

If on average each SSN has 2 credit cards then s2 = 2.0

  • Assume independent selectivities within query plans

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]
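As a small worked example of the selectivity definition, the sketch below computes s1 and s2 from per-input output counts and combines them under the independence assumption. The counts are hypothetical and chosen to reproduce the slide's values (s1 = 0.9, s2 = 2.0); the function name is illustrative.

```python
def selectivity(outputs_per_input):
    """Average number of output tuples per input tuple (after post-filtering)."""
    return sum(outputs_per_input) / len(outputs_per_input)


# WS1 with filter cr > 600: 9 of 10 hypothetical SSNs pass, each yielding one tuple.
s1 = selectivity([1] * 9 + [0] * 1)

# WS2: every hypothetical SSN has exactly two credit cards.
s2 = selectivity([2] * 10)

# Under the independence assumption, a service placed after both WS1 and WS2
# sees s1 * s2 tuples per original input tuple (this can exceed 1).
tuples_per_input = s1 * s2
print(s1, s2, tuples_per_input)
```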

Query Optimization Primer
  • Possible query plans: P1, …, Pn
  • Data/access statistics: S
  • Execution cost metric: cost(Pi, S)
  • GOAL: Find least-cost plan
Bottleneck Cost Metric

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]

Bottleneck Cost Metric

[Illustration: Conference Lunch Buffet with Dish 1, Dish 2, Dish 3, Dish 4 in sequence]

Average per-tuple processing time = response time of slowest (bottleneck) stage in pipeline

Note: selectivities = 1 in this example
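The buffet intuition can be checked with a short simulation of a pipeline in which each stage handles one tuple at a time and all selectivities are 1. The stage times are hypothetical; the point is that, for many tuples, the average time per tuple approaches the time of the slowest stage.

```python
def pipeline_makespan(stage_times, n_tuples):
    """Time for n_tuples to pass through all stages, one tuple per stage at a time."""
    finish = [0.0] * len(stage_times)  # finish[j]: when stage j finished its latest tuple
    for _ in range(n_tuples):
        ready = 0.0  # when the current tuple leaves the previous stage
        for j, t in enumerate(stage_times):
            start = max(ready, finish[j])  # wait for the tuple and for the stage
            finish[j] = start + t
            ready = finish[j]
    return finish[-1]


stage_times = [1, 3, 2, 1]  # hypothetical per-tuple times for Dish 1..4
n = 1000
avg = pipeline_makespan(stage_times, n) / n
print(avg, max(stage_times))
```

The gap between the average and the bottleneck time comes only from filling the pipeline, so it shrinks as the number of tuples grows.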

Cost Equation for Plan P
  • $R_i(P)$: Predecessors of $WS_i$ in plan $P$
  • Fraction of input tuples seen by $WS_i$ = $\prod_{j \in R_i(P)} s_j$
  • $WS_i$ response time per input tuple = $\left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i$
  • Bottleneck cost metric:

$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

(assumes WSMS processing is not the bottleneck)
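The bottleneck formula translates directly into a few lines of Python. In this sketch a plan is represented as a mapping from each service to its full predecessor set R_i(P), which is a representation chosen for illustration. The response times and s3 are hypothetical; only s1 = 0.9 and s2 = 2.0 come from the slides.

```python
from math import prod


def bottleneck_cost(predecessors, r, s):
    """cost(P) = max over i of (product of s_j for j in R_i(P)) * r_i.

    predecessors[i] is R_i(P): every service that precedes WS_i in plan P.
    """
    return max(prod(s[j] for j in predecessors[i]) * r[i] for i in predecessors)


# Hypothetical statistics; only s1 = 0.9 and s2 = 2.0 come from the slides.
r = {"WS1": 2.0, "WS2": 1.0, "WS3": 3.0}
s = {"WS1": 0.9, "WS2": 2.0, "WS3": 0.5}

# Plan 1: L -> WS1 -> WS2 -> WS3
plan1 = {"WS1": set(), "WS2": {"WS1"}, "WS3": {"WS1", "WS2"}}
cost1 = bottleneck_cost(plan1, r, s)
print(cost1)
```

With these numbers WS3 is the bottleneck: it sees 0.9 × 2.0 = 1.8 tuples per input tuple at 3.0 time units each.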

Contrast with Sum Cost Metric

$\mathrm{cost}(P) = \sum_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

  • Stream filter ordering
  • Expensive predicate placement

[Illustration: “Polite” Lunch Buffet with Dish 1, Dish 2, Dish 3, Dish 4]
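The two metrics can prefer different plans. The sketch below evaluates both for the two linear orders of a pair of services; the statistics are hypothetical and picked so that the sum metric and the bottleneck (max) metric disagree.

```python
def stage_costs(order, r, s):
    """Per-input-tuple work at each service of a linear plan."""
    costs, fraction = [], 1.0
    for ws in order:
        costs.append(fraction * r[ws])
        fraction *= s[ws]
    return costs


def sum_cost(order, r, s):
    return sum(stage_costs(order, r, s))


def bottleneck_cost(order, r, s):
    return max(stage_costs(order, r, s))


# Hypothetical statistics for two selective services without precedence constraints.
r = {"A": 1.0, "B": 2.0}
s = {"A": 0.9, "B": 0.1}

for order in (["A", "B"], ["B", "A"]):
    print(order, sum_cost(order, r, s), bottleneck_cost(order, r, s))
```

Here the sum metric favors running the highly selective B first, while the bottleneck metric favors running the faster A first.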

Problem Statement
  • Input:
    • Web services WS1, …, WSn
    • Response times r1, …, rn
    • Selectivities s1, …, sn
    • Precedence constraints among web services
  • Output:
    • Web services arranged into a plan P
    • P respects all precedence constraints
    • cost(P) is minimized
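For very small inputs, the problem statement can be explored by exhaustive search. The sketch below is a naive baseline, not the algorithm from the paper: it only considers linear plans (general plans may be non-linear), and all statistics except s1 = 0.9, s2 = 2.0 and the WS2-before-WS3 constraint are hypothetical.

```python
from itertools import permutations


def linear_cost(order, r, s):
    """Bottleneck cost of a linear plan."""
    cost, fraction = 0.0, 1.0
    for ws in order:
        cost = max(cost, fraction * r[ws])
        fraction *= s[ws]
    return cost


def best_linear_plan(r, s, precedence):
    """Exhaustive search over linear orders that respect (before, after) pairs."""
    best = None
    for order in permutations(r):
        pos = {ws: i for i, ws in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in precedence):
            c = linear_cost(order, r, s)
            if best is None or c < best[0]:
                best = (c, order)
    return best


# Hypothetical inputs; only s1 = 0.9, s2 = 2.0 and WS2-before-WS3 come from the slides.
r = {"WS1": 2.0, "WS2": 1.0, "WS3": 3.0}
s = {"WS1": 0.9, "WS2": 2.0, "WS3": 0.5}
cost, order = best_linear_plan(r, s, [("WS2", "WS3")])
print(order, cost)
```

The search visits n! orders, so it is useful only as a reference point for checking faster methods on tiny instances.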
No Precedence Constraints
  • All selectivities ≤ 1
    • Theorem: Optimal to order linearly by ri (selectivities irrelevant)
  • General case (optimal):

[Diagram: “selective” web services ordered by response-time, followed by “proliferative” web services, then join at WSMS, then Results]
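The theorem for the all-selective case can be sanity-checked numerically by comparing the response-time ordering with the best of all linear orders. The sketch reads "order by ri" as increasing response time, which is an assumption since the slide does not state the direction, and it uses randomly generated hypothetical statistics.

```python
from itertools import permutations
import random


def linear_cost(order, r, s):
    """Bottleneck cost of a linear plan over service indices."""
    cost, fraction = 0.0, 1.0
    for i in order:
        cost = max(cost, fraction * r[i])
        fraction *= s[i]
    return cost


random.seed(0)
n = 6
r = [random.uniform(1, 10) for _ in range(n)]     # hypothetical response times
s = [random.uniform(0.1, 1.0) for _ in range(n)]  # all selectivities <= 1

by_response_time = sorted(range(n), key=lambda i: r[i])
sorted_cost = linear_cost(by_response_time, r, s)
best = min(linear_cost(p, r, s) for p in permutations(range(n)))
print(sorted_cost, best)
```

A single random instance is not a proof, but the two printed costs should agree up to floating-point rounding.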

With Precedence Constraints

$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

With Precedence Constraints

[Figure: Percent Contribution (0 to 100) of Student and Advisor versus Time in Program (0 to 5 years)]

$\mathrm{cost}(P) = \sum_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

  • Sum cost metric
    • Hard to even obtain a factor O(n) of optimal
With Precedence Constraints

[Figure: Percent Contribution (0 to 100) of Student and Advisor versus Time in Program (0 to 5 years)]

$\mathrm{cost}(P) = \max_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

  • Bottleneck (max) cost metric
    • Surprisingly, optimal solution in polynomial time
    • $O(n^5)$ algorithm in paper
      • Add one WS at a time to the plan
      • WS chosen by solving a linear program
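Example Revisited

Plan 1: L → WS1 (SSN → cr) → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results

Plan 2: L → WS1 (SSN → cr) → Results, in parallel with L → WS2 (SSN → ccn) → WS3 (ccn → ph) → Results

$\max_{1 \le i \le n} \left( \left(\prod_{j \in R_i(P)} s_j\right) \cdot r_i \right)$

[Annotations: “Selective”; “Proliferative”; “Precedence constraint” between WS2 and WS3]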
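Applying the bottleneck formula to both plans gives cost(Plan 1) = max(r1, s1·r2, s1·s2·r3) and cost(Plan 2) = max(r1, r2, s2·r3), so Plan 1 cannot be worse whenever s1 ≤ 1. The sketch below checks this on randomly generated, hypothetical statistics; it assumes WS1 with its filter is selective and treats the join at the WSMS as free.

```python
import random


def cost_plan1(r, s):
    # L -> WS1 -> WS2 -> WS3: each service sees what its predecessors let through.
    return max(r[0], s[0] * r[1], s[0] * s[1] * r[2])


def cost_plan2(r, s):
    # L -> WS1 in parallel with L -> WS2 -> WS3; the join at the WSMS is "free".
    return max(r[0], r[1], s[1] * r[2])


random.seed(1)
never_worse = True
for _ in range(10_000):
    r = [random.uniform(0.1, 10) for _ in range(3)]  # hypothetical response times
    s = [random.uniform(0.0, 1.0),   # WS1 with its filter: selective
         random.uniform(0.0, 5.0),   # WS2: may be proliferative
         random.uniform(0.0, 1.0)]   # WS3 with its filter (last in both plans, unused)
    if cost_plan1(r, s) > cost_plan2(r, s):
        never_worse = False
print(never_worse)
```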

Implementation
  • Built prototype WSMS query processor
    • Optimizer and execution engine
    • Assumes schema issues resolved, statistics provided
    • Written in Java and uses Apache Axis (open-source SOAP implementation)
    • Experiments (see paper) validate analytical results
Isn’t Problem the Same as … ?
  • Web Service composition
    • Targeted for workflow-oriented applications
    • No provably optimal strategies
  • Parallel/distributed query optimization
    • Freedom to place query operators
    • Much larger space of execution plans
  • Data integration, mediators
    • For general sources of data
    • Optimization of total resource consumption
Future Directions (sample)
  • Web services with monetary cost
  • Web services with unstable response times (QoS guarantees?)
  • Multiple web services for same data
  • Caching web-service query results
  • More expressive queries, also workflows
  • Web service profiling and statistics-tracking
Conclusion

[Diagram labels: "Our contribution"; "New Query Optimization Problem"]

Conclusion

[Diagram labels: "New Query Optimization Problem"; "Our contribution"]

Questions?

[Figure: Percent Contribution (0 to 100) of Student and Advisor versus Time in Program (0 to 5 years)]