Hyper-Scaling Xrootd Clustering

Andrew Hanushevsky

Stanford Linear Accelerator Center

Stanford University

29-September-2005

http://xrootd.slac.stanford.edu

ROOT 2005 Users Workshop, CERN

September 28-30, 2005

Outline
  • Xrootd Single Server Scaling
  • Hyper-Scaling via Clustering
    • Architecture
    • Performance
  • Configuring Clusters
    • Detailed relationships
    • Example configuration
    • Adding fault-tolerance
  • Conclusion

Latency Per Request (xrootd)

Capacity vs Load (xrootd)

xrootd Server Scaling
  • Linear scaling relative to load
    • Allows deterministic sizing of a server (see the sizing sketch after this list)
      • Disk
      • NIC
      • Network Fabric
      • CPU
      • Memory
  • Performance tied directly to hardware cost
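
Because single-server throughput scales linearly with load, sizing a deployment reduces to simple arithmetic. A minimal sketch of such a calculation, using hypothetical numbers rather than the measurements from this talk:

// Hypothetical sizing sketch: with linear scaling, the server count is simply
// the aggregate client demand divided by the tightest per-server resource limit.
#include <algorithm>
#include <cmath>
#include <cstdio>

int main() {
    const double aggregateDemandMBs = 2000.0;  // assumed total client load (MB/s)
    const double nicLimitMBs        = 100.0;   // assumed per-server NIC ceiling (MB/s)
    const double diskLimitMBs       = 120.0;   // assumed per-server disk ceiling (MB/s)

    const double perServerMBs = std::min(nicLimitMBs, diskLimitMBs);
    const int servers = static_cast<int>(std::ceil(aggregateDemandMBs / perServerMBs));

    std::printf("Servers needed: %d\n", servers);  // 2000 / 100 -> 20
    return 0;
}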

Hyper-Scaling
  • xrootd servers can be clustered
    • Increase access points and available data
      • Complete scaling
    • Allow for automatic failover
      • Comprehensive fault-tolerance
  • The trick is to do so in a way that
    • Cluster overhead (human & non-human) scales linearly
      • Allows deterministic sizing of cluster
    • Cluster size is not artificially limited
    • I/O performance is not affected

Basic Cluster Architecture
  • Software crossbar switch
    • Allows point-to-point connections
      • Client and data server
    • I/O performance not compromised
      • Assuming switch overhead can be amortized
  • Scale interconnections by stacking switches
    • Virtually unlimited connection points
      • Switch overhead must be very low

Single Level Switch

Diagram: the client sends "open file X" to the redirector (head node). The redirector asks data servers A, B, and C "Who has file X?"; C answers "I have", so the redirector caches the file location, replies "go to C", and the client opens file X directly on C. A second open of file X is answered straight from the cached location.

Client sees all servers as xrootd data servers
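
From the client's point of view the redirection is transparent: the file is simply opened through the redirector's address. A minimal ROOT-side sketch, using a hypothetical redirector host and file path (not from this talk):

// ROOT macro sketch: open a file through an xrootd redirector.
// The host name and path below are hypothetical; the redirector
// transparently sends the client on to the data server holding the file.
{
   TFile *f = TFile::Open("root://redirector.example.org//store/data/run42.root");
   if (f && !f->IsZombie()) {
      f->ls();     // contents are now served directly by the chosen data server
      f->Close();
   }
}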

Two Level Switch

Diagram: the redirector (head node) now fronts both data servers and a supervisor (sub-redirector, node C in the picture), which in turn fronts its own data servers (D, E, F). The client asks the redirector to open file X; the redirector broadcasts "Who has file X?" to its members, and the supervisor repeats the question to its servers. F answers "I have", the supervisor reports "I have" upward, and the client is redirected first to the supervisor ("go to C") and then on to F ("go to F"), where it opens file X.

Client sees all servers as xrootd data servers

Making Clusters Efficient
  • Cell size, structure, & search protocol are critical
    • Cell Size is 64
      • Limits direct inter-chatter to 64 entities
      • Compresses incoming information by up to a factor of 64
      • Can use very efficient 64-bit logical operations (see the sketch after this list)
    • Hierarchical structures usually most efficient
      • Cells arranged in a B-Tree (i.e., B64-Tree)
      • Scales as 64^h (where h is the tree height)
        • Client needs h-1 hops to find one of 64^h servers (2 hops for 262,144 servers)
        • Number of responses is bounded at each level of the tree
    • Search is a directed broadcast query/rarely respond protocol
      • Provably best scheme if less than 50% of servers have the wanted file
        • Generally true if number of files >> cluster capacity
        • Cluster protocol becomes more efficient as cluster size increases
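
The choice of 64 lines up with the machine word size, which is what makes the location bookkeeping cheap. A hypothetical illustration of the idea (not the actual olbd code): a cell manager can record which of its 64 members hold a file in a single 64-bit mask, so merging reports and answering queries are single logical operations.

// Hypothetical sketch of the 64-entity cell idea: one bit per cell member.
#include <cstdint>
#include <cstdio>

int main() {
    std::uint64_t haveFileX = 0;   // location mask kept by a cell manager

    haveFileX |= (1ULL << 2);      // member 2 reports "I have file X"
    haveFileX |= (1ULL << 17);     // member 17 reports "I have file X"

    // "Does anyone in this cell have file X?" is a single test, and the
    // answer is all the manager needs to report to the level above it.
    if (haveFileX != 0)
        std::printf("some member has file X (mask = %#llx)\n",
                    static_cast<unsigned long long>(haveFileX));

    // Checking one particular member is a shift and an AND.
    const bool member17 = (haveFileX >> 17) & 1ULL;
    std::printf("member 17 has it: %s\n", member17 ? "yes" : "no");
    return 0;
}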

Cluster Scale Management
  • Massive clusters must be self-managing
    • Scales as 64^n where n is the height of the tree
      • Scales very quickly (64^2 = 4,096, 64^3 = 262,144)
      • Well beyond direct human management capabilities
    • Therefore clusters self-organize
      • Single configuration file for all nodes
      • Uses a minimal spanning tree algorithm
        • 280 nodes self-cluster in about 7 seconds
        • 890 nodes self-cluster in about 56 seconds
      • Most overhead is in wait time to prevent thrashing

Clustering Impact
  • Redirection overhead must be amortized
    • This is a deterministic process for xrootd
      • All I/O is via point-to-point connections
      • Can trivially use single-server performance data
    • Clustering overhead is non-trivial
      • 100-200 µs additional for an open call
      • Not good for very small files or short “open” times
    • However, compatible with HEP access patterns (see the arithmetic sketch below)
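
A rough arithmetic sketch of the amortization argument (the file lifetimes below are assumed, not measured): the extra 100-200 µs per open only matters when the file stays open for a comparably short time.

// Hypothetical amortization check: redirect overhead vs. how long the file is used.
#include <cstdio>

int main() {
    const double redirectOverheadSec = 200e-6;               // ~200 us extra on open
    const double openDurationsSec[]  = {0.001, 1.0, 600.0};  // assumed file lifetimes

    for (double t : openDurationsSec)
        std::printf("file open for %8.3f s -> redirect overhead = %.4f%%\n",
                    t, 100.0 * redirectOverheadSec / (t + redirectOverheadSec));

    // A file read for minutes (typical of HEP analysis) makes the 200 us negligible;
    // a millisecond-lived open would be dominated by it.
    return 0;
}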

Detailed Cluster Architecture

A cell is 1-to-64 entities (servers or cells) clustered around a cell manager. The cellular process is self-regulating and creates a B-64 tree.

Diagram: cell managers (M) arranged in a B-64 tree below the head node.

The Internal Details

(Component legend on the slide: xrootd, ofs, odc, olb.)

  • Data network (xrootd): redirectors steer clients to data; data servers provide the data.
  • Control network (olbd): managers, supervisors & servers exchange resource info and file locations.

Diagram: data clients talk to the xrootd daemons of the redirectors and data servers over the data network, while the redirectors' olbd (M = manager) and the data servers' olbd (S = server) are linked over the control (ctl) network.

Schema Configuration

(Directive prefixes: xrootd, ofs, odc, olb.)

Redirectors (Head Node):

olb.role manager
olb.port port
olb.allow hostpat
ofs.redirect remote
odc.manager host port

Supervisors (sub-redirector):

olb.role supervisor
olb.subscribe host port
olb.allow hostpat
ofs.redirect remote
ofs.redirect target

Data Servers (end-node):

olb.role server
olb.subscribe host port
ofs.redirect target

Example: SLAC Configuration

Diagram: client machines contact the redirector alias kanrdr-a (hosts kanrdr01 and kanrdr02) and are redirected to the data servers kan01, kan02, kan03, kan04, ... kanxx. The slide notes that some details are hidden here.

Configuration File

The same file is used on every node; the if/else selects each host's role at start-up (hosts that match kanrdr-a+ become the redirector/manager, every other host becomes a data server).

if kanrdr-a+
    olb.role manager
    olb.port 3121
    olb.allow host kan*.slac.stanford.edu
    ofs.redirect remote
    odc.manager kanrdr-a+ 3121
else
    olb.role server
    olb.subscribe kanrdr-a+ 3121
    ofs.redirect target
fi

Potential Simplification?

The configuration file as it is today:

if kanrdr-a+
    olb.role manager
    olb.port 3121
    olb.allow host kan*.slac.stanford.edu
    ofs.redirect remote
    odc.manager kanrdr-a+ 3121
else
    olb.role server
    olb.subscribe kanrdr-a+ 3121
    ofs.redirect target
fi

A possible simplification:

olb.port 3121
all.role manager if kanrdr-a+
all.role server if !kanrdr-a+
all.subscribe kanrdr-a+
olb.allow host kan*.slac.stanford.edu

Is the simplification really better? We’re not sure, what do you think?

Adding Fault Tolerance

  • Manager (head node): fully replicated.
  • Supervisor (intermediate node): hot spares.
  • Data server (leaf node): data replication, restaging, and proxy search*.

*xrootd has built-in proxy support today; discriminating proxies will be available in a near-future release.

Conclusion
  • High performance data access systems achievable
    • The devil is in the details
  • High performance and clustering are synergistic
    • Allows unique performance, usability, scalability, and recoverability characteristics
  • Such systems produce novel software architectures
    • Challenges
      • Creating applications that capitalize on such systems
    • Opportunities
      • Fast low cost access to huge amounts of data to speed discovery

Acknowledgements
  • Fabrizio Furano, INFN Padova
    • Client-side design & development
  • Principal Collaborators
    • Alvise Dorigo (INFN), Peter Elmer (BaBar), Derek Feichtinger (CERN), Geri Ganis (CERN), Guenter Kickinger (CERN), Andreas Peters (CERN), Fons Rademakers (CERN), Gregory Sharp (Cornell), Bill Weeks (SLAC)
  • Deployment Teams
    • FZK, DE; IN2P3, FR; INFN Padova, IT; CNAF Bologna, IT; RAL, UK; STAR/BNL, US; CLEO/Cornell, US; SLAC, US
  • US Department of Energy
    • Contract DE-AC02-76SF00515 with Stanford University
