
Routing of XML and XPath Queries in Data Dissemination Networks

Guoli Li, Shuang Hou

Hans-Arno Jacobsen

Middleware Systems Research Group

University of Toronto

ICDCS 2008 @ Beijing China

Agenda
  • Motivation
  • Advertisement-based routing
  • Covering
  • Evaluation
  • Conclusions

Motivation

  • Data sources: publish XML data
  • Data users: register XPath queries
  • The data dissemination network: deliver matching results to a large and dynamically changing group of users

[Figure: Content-based Data Dissemination, with labels XML, Queries, and Results]

Publish/Subscribe
  • Matching of XMLs and XPaths [ICDE’06]
  • Matching of Advertisements and XPaths
  • Exploring relations among XPaths

[Figure: a Publisher and two Subscribers, with message labels Advertisement (DTD), Publication (XML), and Subscription (XPath)]

Covering-based Routing

[Figure: network diagram with nodes numbered 1 to 6]

Language Model
  • Advertisement: generated from DTDs
      • Non-recursive advertisement
        • e.g., A = /t1/t2/t3…/tn-1/tn
      • Recursive advertisement
        • Simple A = A1(A2)+A3
        • Series A = A1(A2)+A3(A4)+A5
        • Embedded A = A1(A2(A3)+A4)+A5
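The recursive forms above read like regular expressions over element paths, where (…)+ marks a sub-path that may repeat one or more times. A minimal sketch of that reading, assuming advertisements are written as strings such as /a(/b/c)+/d and that element names contain no regex metacharacters. The function name and the sample paths are illustrative and do not come from the slides:

```python
import re

def generates(advertisement, path):
    """Return True if `path` is one of the paths the advertisement describes.

    Assumption: the advertisement string is already valid regex syntax
    except for '*', which stands for exactly one element of any name.
    """
    pattern = advertisement.replace("*", "[^/]+")
    return re.fullmatch(pattern, path) is not None

# Simple:   A = A1(A2)+A3
simple = "/a(/b/c)+/d"
# Series:   A = A1(A2)+A3(A4)+A5
series = "/a(/b)+/c(/d)+/e"
# Embedded: A = A1(A2(A3)+A4)+A5
embedded = "/a(/b(/c)+/d)+/e"

print(generates(simple, "/a/b/c/b/c/d"))
print(generates(series, "/a/b/b/c/d/e"))
print(generates(embedded, "/a/b/c/c/d/b/c/d/e"))
```

A non-recursive advertisement is the special case with no (…)+ group, so the same check covers it.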

DTD:

<?xml encoding="UTF-8"?>
<!ELEMENT personnel (person)+>
<!ELEMENT person (name,email*,url*,link?)>
<!ATTLIST person id ID #REQUIRED>
<!ELEMENT name ((family,given)|(given,family))>
<!ELEMENT family (#PCDATA)>
<!ELEMENT given (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT url EMPTY>
<!ATTLIST url href CDATA 'http://'>
<!ELEMENT link EMPTY>
<!ATTLIST link manager IDREF #IMPLIED>
… …

Advertisements:

/personnel/person
/personnel/person/name
/personnel/person/name/family
/personnel/person/name/given
/personnel/person/email
/personnel/person/url
/personnel/person/link
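One way to picture how non-recursive advertisements could be generated from a DTD is to walk the element declarations from the root and emit one path per reachable element. The sketch below assumes this reading; it only looks at ELEMENT declarations (attribute lists do not change the paths), ignores occurrence operators, and rejects recursive DTDs. The helper names are illustrative, not from the slides:

```python
import re

DTD = """
<!ELEMENT personnel (person)+>
<!ELEMENT person (name,email*,url*,link?)>
<!ELEMENT name ((family,given)|(given,family))>
<!ELEMENT family (#PCDATA)>
<!ELEMENT given (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT url EMPTY>
<!ELEMENT link EMPTY>
"""

def child_map(dtd):
    """Map each element name to its child element names, in declaration order."""
    children = {}
    for name, model in re.findall(r"<!ELEMENT\s+(\S+)\s+([^>]*)>", dtd):
        seen = []
        for token in re.findall(r"#?[A-Za-z_][\w.-]*", model):
            if token.startswith("#") or token in ("EMPTY", "ANY"):
                continue  # #PCDATA, EMPTY and ANY are not child elements
            if token not in seen:
                seen.append(token)
        children[name] = seen
    return children

def advertisements(children, root):
    """List one absolute path per element reachable below `root`."""
    paths = []

    def walk(path):
        for child in children.get(path[-1], []):
            if child in path:
                raise ValueError("recursive DTD: not handled in this sketch")
            paths.append("/" + "/".join(path + [child]))
            walk(path + [child])

    walk([root])
    return paths

for adv in advertisements(child_map(DTD), "personnel"):
    print(adv)
```

For the personnel DTD this prints the same seven paths listed on the slide, in the same order.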

Language Model
  • Subscription: XPaths
    • Absolute
      • e.g., /c/d/*/e
    • Relative
      • e.g., c/d/*/e
    • Descendant operators
      • e.g., c//e/*/c


Advertisement-based Routing

A1: /a/b/*/e
A2: /b/e
A3: /a/b/d
A4: /a/b/e
… …

[Figure: a Broker and a Subscription (S), with labels P(A) and P(S)]
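The idea of advertisement-based routing can be sketched as a broker that remembers which neighbor each advertisement came from and forwards a subscription only toward neighbors with an overlapping advertisement. The sketch below is illustrative only: the Broker class, the neighbor names n1 to n4, and the simplified overlap test (absolute XPEs without //, compared step by step from the root) are assumptions, not the authors' implementation:

```python
def overlaps(advertisement, subscription):
    """Simplified overlap test for absolute paths without '//'.

    Assumption: the subscription overlaps if it is no longer than the
    advertisement and every step is equal or one side is '*'.
    """
    a = advertisement.strip("/").split("/")
    s = subscription.strip("/").split("/")
    if len(s) > len(a):
        return False
    return all(x == y or "*" in (x, y) for x, y in zip(a, s))

class Broker:
    def __init__(self):
        # neighbor id -> advertisements received from that neighbor
        self.adverts = {}

    def on_advertisement(self, neighbor, advertisement):
        self.adverts.setdefault(neighbor, []).append(advertisement)

    def route_subscription(self, subscription, from_neighbor):
        """Return the neighbors the subscription should be forwarded to."""
        return sorted(
            n for n, advs in self.adverts.items()
            if n != from_neighbor
            and any(overlaps(a, subscription) for a in advs)
        )

broker = Broker()
broker.on_advertisement("n1", "/a/b/*/e")   # A1
broker.on_advertisement("n2", "/b/e")       # A2
broker.on_advertisement("n3", "/a/b/d")     # A3
broker.on_advertisement("n3", "/a/b/e")     # A4

print(broker.route_subscription("/a/b/e", from_neighbor="n4"))
print(broker.route_subscription("/b/e", from_neighbor="n4"))
```

Subscriptions that overlap no advertisement are not forwarded at all, which is what keeps them out of routing tables on paths that lead to no matching publisher.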

Overlapping Algorithms
  • Basic case:
      • A = /a/b/c/*/b/c/*/b/e
      • S = /a/b/c/*/b/e
  • Other cases:
      • e.g., S = /a/b//c/*/b//e

[Figure: Next Table, with A and S repeated across several rows]
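The repeated rows of A and S suggest sliding S along A and comparing step by step. The sketch below does this with a naive comparison at every offset; it does not use the Next Table shown on the slide, and it assumes that S may align at any offset (that is, S is read as a relative XPE) and that a wildcard on either side is compatible with any element:

```python
def steps(xpe):
    """Split a path such as '/a /b /c' or 'a/b/c' into element names."""
    return [s.strip() for s in xpe.split("/") if s.strip()]

def compatible(x, y):
    return x == y or x == "*" or y == "*"

def overlap_offsets(advertisement, subscription):
    """Return every offset in A at which S lines up step by step."""
    a, s = steps(advertisement), steps(subscription)
    return [
        k for k in range(len(a) - len(s) + 1)
        if all(compatible(a[k + i], s[i]) for i in range(len(s)))
    ]

A = "/a /b /c /* /b /c /* /b /e"
S = "a /b /c /* /b /e"   # the slide's S, read here as a relative XPE

print(overlap_offsets(A, S))
```

With these inputs the only alignment is at offset 3, where the first step of S lines up with the first wildcard of A. Anchoring S at offset 0 instead fails at the sixth step (c against e).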

Subscription Tree
  • Subscriptions are maintained in a hierarchical tree
  • A child has more than one parent
  • Siblings may intersect
  • If a publication does not match a node, it does not match any of the descendants

[Figure: subscription tree under ROOT with nodes /a, /*/b, /b, d/a, /a/c, /a/*/d, /a/b, /b/e, /b/d, /a/c/d, /a/b/d, /b/e/c/f, /b/d/a, and a "pointer" label]
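The pruning property (a publication that does not match a node cannot match its descendants) can be sketched with a small covering tree. This is a simplified illustration, assuming absolute XPEs without //, a prefix-style covering test, and a single parent per node; the extra pointers for nodes with more than one parent are left out. All names are illustrative:

```python
class Node:
    def __init__(self, xpe=None):
        self.xpe = xpe          # None marks the ROOT node
        self.children = []

def steps(xpe):
    return xpe.strip("/").split("/")

def covers(s1, s2):
    """s1 covers s2 if s1 is no longer and each step of s1 is '*' or equal."""
    a, b = steps(s1), steps(s2)
    return len(a) <= len(b) and all(x in ("*", y) for x, y in zip(a, b))

def insert(node, xpe):
    """Place `xpe` below the deepest node that covers it."""
    for child in node.children:
        if covers(child.xpe, xpe):
            return insert(child, xpe)
    new = Node(xpe)
    # Existing siblings covered by the new XPE become its children.
    new.children = [c for c in node.children if covers(xpe, c.xpe)]
    node.children = [c for c in node.children if not covers(xpe, c.xpe)]
    node.children.append(new)
    return new

def match(node, path, found=None):
    """Collect matching XPEs, skipping the subtree of any non-matching node."""
    found = [] if found is None else found
    for child in node.children:
        if covers(child.xpe, path):
            found.append(child.xpe)
            match(child, path, found)
    return found

root = Node()
for xpe in ["/a/b/d", "/a", "/a/b", "/b", "/b/e"]:
    insert(root, xpe)

print([c.xpe for c in root.children])
print(match(root, "/a/b/d/x"))
```

In the example the whole /b subtree is skipped after a single comparison, because /b itself does not match the publication path.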

Tree Maintenance
  • Insert
  • Delete

Covering Algorithms
  • Similar to Adv-Sub overlapping algorithms
  • Absolute simple XPEs
  • Relative simple XPEs
  • XPEs with // operator
    • e.g., S1 = /*/a//e/c, S2 = /a/a/*//c/e/c/d

[Figure: S1 split at // into /* /a and /e /c; S2 split at // into /a /a /* and c /e /c /d]
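A covering test for XPEs with // can be approximated by brute force: build sample paths that S2 matches and check that S1 matches every one of them. The sketch below is not the algorithm from the slides. It assumes prefix-style matching on a single element path, replaces each * in S2 by a fresh placeholder name, and expands each // in S2 into 0 to max_gap placeholder steps; if max_gap is chosen too small the answer may be wrong, so treat it as an illustration only:

```python
import re
from itertools import product

def parse(xpe):
    """Split an XPE into (axis, name) steps; spaces are ignored."""
    xpe = xpe.replace(" ", "")
    steps = [(axis or "/", name)
             for axis, name in re.findall(r"(//|/)?([^/]+)", xpe)]
    if not xpe.startswith("/"):
        steps[0] = ("//", steps[0][1])   # relative: may start at any depth
    return steps

def matches(steps, path):
    def go(i, j):
        if i == len(steps):
            return True
        if j >= len(path):
            return False
        axis, name = steps[i]
        if name in ("*", path[j]) and go(i + 1, j + 1):
            return True
        return axis == "//" and go(i, j + 1)
    return go(0, 0)

def covers(s1, s2, max_gap=None):
    """Bounded check that every sample path of s2 is matched by s1."""
    p1, p2 = parse(s1), parse(s2)
    if max_gap is None:
        max_gap = len(p1) + 1            # assumed bound, not proven here
    gaps = [i for i, (axis, _) in enumerate(p2) if axis == "//"]
    for fill in product(range(max_gap + 1), repeat=len(gaps)):
        extra = dict(zip(gaps, fill))
        path = []
        for i, (_, name) in enumerate(p2):
            path += ["#"] * extra.get(i, 0)          # elements skipped by //
            path.append("#" if name == "*" else name)  # '#' = fresh name
        if not matches(p1, path):
            return False
    return True

S1 = "/* /a //e /c"
S2 = "/a /a /* //c /e /c /d"
print(covers(S1, S2))
print(covers(S2, S1))
```

Under these assumptions S1 covers S2: the leading /*/a of S1 matches /a/a, and the //e/c part is found inside c/e/c/d however many elements the // in S2 skips.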

Merging Rules
  • Rules
    • XPEs with one difference (e.g., element, op)
      • e.g., S1 = /a/*/c/d, S2 = /a/*/c/e, S = /a/*/c/*
    • XPEs with different sub-XPEs
      • e.g., S1 = … … XPE1 … …, S2 = … … XPE2 … …, S = … … // … …
  • Merge degree

[Figure: regions labeled P(S), P(S1), and P(S2)]
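Both rules can be sketched for absolute XPEs without //. The first replaces a single differing element with *; the second keeps the common prefix and suffix and replaces the differing middle with //. This is an illustration under those assumptions (a differing operator, as opposed to a differing element, is not handled), and the second example pair is hypothetical:

```python
def steps(xpe):
    return xpe.strip("/").split("/")

def merge_one_difference(s1, s2):
    """Merge two XPEs of equal length that differ in exactly one element."""
    a, b = steps(s1), steps(s2)
    if len(a) != len(b):
        return None
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None
    merged = list(a)
    merged[diff[0]] = "*"
    return "/" + "/".join(merged)

def merge_sub_xpes(s1, s2):
    """Merge two XPEs that share a prefix and a suffix but differ in between."""
    a, b = steps(s1), steps(s2)
    shortest = min(len(a), len(b))
    p = 0
    while p < shortest and a[p] == b[p]:
        p += 1                                   # length of common prefix
    q = 0
    while q < shortest - p and a[-1 - q] == b[-1 - q]:
        q += 1                                   # length of common suffix
    if p == 0 or q == 0:
        return None
    return "/" + "/".join(a[:p]) + "//" + "/".join(a[len(a) - q:])

print(merge_one_difference("/a/*/c/d", "/a/*/c/e"))
print(merge_sub_xpes("/a/b/c/e", "/a/x/y/e"))   # hypothetical pair
```

The merged XPE matches more publications than S1 and S2 together, so a merge trades a smaller routing table for possible false positives downstream.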

Evaluation
  • Setup
    • Implemented in C++
    • Overlay with 127 content-based routers
    • Cluster (each node: 1.86 GHz, 4G) vs. PlanetLab
    • Workloads are generated from two DTDs: NITF and PSD
  • Metrics
    • Number of subscriptions per router
    • Network traffic
    • XPE processing time
    • Notification delay

[Figure: Routing Table Size (two slides)]

[Figure: Network Traffic]

[Figure: Process Time]

[Figure: Notification Delay (PSD)]

[Figure: Notification Delay (NITF)]

Related Work
  • Locating data sources in large distributed systems [Galanis et al. 2003]
    • DHT based approach
    • Data summary
  • Query aggregation for scalable data dissemination [Chan et al. 2002]
    • Equivalence between the original query set and the aggregated set
  • ONYX [Diao et al. 2004]
    • Deliver part of the XML documents
    • Share common prefixes among queries using NFA
  • XTreeNet [Fenner et al. 2005]
    • Unify the pub/sub model and the query/response model
    • Avoid repeatedly matching at each hop

Conclusions
  • Investigate advertisement-based routing for XML data dissemination networks
  • Propose a novel data structure to maintain covering & merging relationships among XPEs.
  • Perform experimental evaluation on a 127 broker overlay to demonstrate the approach
    • Reduce routing table size by up to 90%
    • Improve routing latency by roughly 85%
  • Future work
    • Extend to tree patterns
    • Share common prefixes among XPEs in overlapping and covering algorithms

Q & A

Thank You!

  • Contact
    • gli@cs.toronto.edu
    • jacobsen@eecg.toronto.edu
  • Middleware systems research group, University of Toronto
    • www.msrg.eecg.toronto.edu

[Figure: Process Time. x-axis: Number of Subscriptions (500 to 5000); y-axis: Time (ms), 0 to 140]

[Figure: Notification Delay (NITF)]

[Figure: Notification Delay (PSD). x-axis: Number of Hops (2 to 6); y-axis: Notification Delay (ms), 0 to 16]

[Figure: False Positives]

Conclusions
  • Investigate advertisement-based routing for XML data dissemination networks
  • Present algorithms to determine the covering relations among arbitrary XPEs
  • Propose a novel data structure to maintain covering & merging relationships among XPEs.
  • Explore rules to merge similar XPEs in order to further reduce the routing table size
  • Perform experimental evaluation on a 127 broker overlay to demonstrate the approach
    • Reduce routing table size by up to 90%
    • Improve routing latency by roughly 85%
