Evaluation of Data and Request Distribution Policies in Clustered Servers Adnan Khaleel and A. L. Narasimha Reddy Texas A&M University adnan,firstname.lastname@example.org
Introduction • Internet use has skyrocketed • 74MB/month in ‘92, several gigabytes/hour today • Trend can be expected to grow in coming years • Increasing load has placed burdens on hardware and software beyond their original designs
Introduction (cont’d) • Clustered Servers are viable solutions
Issues in Clustered Servers • Need to present a single server image • DNS aliasing, magic routers, etc. • Multiplicity in Back-End Servers: • How should data be organized on the back-ends? • How should incoming requests be distributed amongst the back-end servers?
Issues in Clustered Servers (cont’d) • Data Organization • Disk Mirroring • Identical data maintained on all back-end servers • Every machine able to service requests without having to access files on other machines. • Several redundant machines present, good system reliability • Disadvantages • Inefficient use of disk space • Data cached on several nodes simultaneously
Issues in Clustered Servers (cont’d) • Data Organization (cont’d) • Disk Striping • Borrowed from Network File Servers • Entire data space divided over all the back-end servers • Portion of file may reside on several machines • Improve reliability through parity protection • For large file accesses, automatic load distribution • Better access times
Issues in Clustered Servers (cont’d) • Locality • Taking advantage of files already cached in a back-end server’s memory • For a clustered server system • Requests accessing the same data should be sent to the same set of servers
Issues in Clustered Servers (cont’d) • Distribution vs. Locality? • Load balanced system • Distribute requests evenly among back-end servers • Improve hit-rate and response time • Maximize locality • Current studies focus only on one aspect and ignore the other
Request Distribution Schemes • Round Robin Request Distribution
Request Distribution Schemes (cont’d) • Round Robin Request Distribution (cont’d) • Requests distributed in a sequential manner • Results in ideal distribution • Does not take server loading into account • Weighted Round Robin • Two Tier Round Robin • Cache Hits purely coincidental
Request Distribution Schemes (cont’d) • Round Robin Request Distribution (cont’d) • Every back-end server has to cache the entire content of the Server • Unnecessary duplication of files in cache • Inefficient use of cache space • Back-ends may see different queuing times due to uneven hit rates
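The round-robin family above can be sketched briefly; the server names and weights here are hypothetical, and weighted round robin simply repeats a server in the rotation in proportion to its capacity:

```python
from itertools import cycle

# Minimal sketch of weighted round-robin dispatch: each back-end appears
# in the rotation in proportion to its weight, so a weight-2 server
# receives twice the requests of a weight-1 server. Plain round robin is
# the special case where all weights are 1.
def weighted_round_robin(servers, weights):
    """Yield server names cyclically; higher weight -> more requests."""
    rotation = [s for s, w in zip(servers, weights) for _ in range(w)]
    return cycle(rotation)

dispatch = weighted_round_robin(["be0", "be1", "be2"], [2, 1, 1])
order = [next(dispatch) for _ in range(8)]
# "be0" serves half the requests, "be1" and "be2" a quarter each
```

Note that no request property (file or client) influences the choice, which is why any cache hits under this policy are purely coincidental.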
Request Distribution Schemes (cont’d) • File Based Request Distribution
Request Distribution Schemes (cont’d) • File Based Request Distribution (cont’d) • Locality based distribution • Partition file-space and assign a partition to each back-end server • Advantages • Does not suffer from duplicated data on cache • Based on access patterns, can yield high hit rates
Request Distribution Schemes (cont’d) • File Based Request Distribution (cont’d) • Disadvantages • How to determine the file-space partitioning? • Difficult to partition so that requests load the back-ends evenly • Dependent on client access patterns, so no one partitioning scheme can satisfy all cases • Some files will always be requested more than others • Locality is of primary concern, distribution is ignored • Hope is that the partitioning achieves the distribution
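One simple way to realize the file-space partitioning above is to hash the requested path; this is an illustrative sketch, not the paper's exact scheme:

```python
import hashlib

# Hypothetical static file-space partition: hash the requested path to
# pick a back-end, so repeat requests for the same file always land on
# the same server (good locality, cache-friendly).
def assign_backend(path: str, n_servers: int) -> int:
    digest = hashlib.md5(path.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_servers
```

The load balance depends entirely on which files clients happen to request: a few hot files can overload one back-end while others sit idle, which is exactly the distribution problem the slide describes.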
Request Distribution Schemes (cont’d) • Client Based Request Distribution
Request Distribution Schemes (cont’d) • Client Based Request Distribution (cont’d) • Also locality based • Partition client-space and assign a partition to each back-end server • Advantages and disadvantages similar to file-based • Difficult to find ideal partitioning scheme • Ignores distribution
Request Distribution Schemes (cont’d) • Client Based Request Distribution (cont’d) • Slightly modified from the DNS scheme used in the Internet • Allows flexibility in client-server mapping • TTL set during first resolution • On expiration, client expected to re-resolve name • Possibly different TTLs could be used for different workload characteristics • However, clients ignore the TTL • Hence a STATIC scheme
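The static client-space partition can be sketched by hashing the client address (an illustrative assumption, analogous to the file-based hash):

```python
import ipaddress

# Hypothetical static client-space partition: map each client address to
# a fixed back-end, so all of one client's requests go to the same
# server (locality), with no regard for current server load.
def client_backend(client_ip: str, n_servers: int) -> int:
    return int(ipaddress.ip_address(client_ip)) % n_servers
```

Because the mapping never changes once assigned (clients ignore the TTL), a burst of activity from clients in one partition cannot be shifted to idle servers, which is the distribution weakness noted above.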
Request Distribution Schemes (cont’d) • Locality Aware Request Distribution (LARD) • Broadly based on file-based scheme • Addresses the issue of load balancing • Each file assigned a dynamic set of servers instead of just one server
Request Distribution Schemes (cont’d) • LARD (cont’d) • Technique • On first request for a file, assign least loaded back-end • On subsequent requests for the same file • Determine Max/Min loaded servers in assigned set • If (Max loaded server > High Threshold OR a server exists in cluster with load < Low Threshold ) then add the new least loaded server to set and assign to service request • Else assign Min loaded server in set to service request • If any server in set inactive > time T, remove from set
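The LARD technique above can be sketched as follows; the threshold values and load metric (queue length) are illustrative assumptions, and the inactivity-timeout step is omitted for brevity:

```python
# Sketch of the LARD-style assignment described above. T_HIGH/T_LOW are
# assumed threshold values; loads[i] is server i's current queue length.
T_HIGH, T_LOW = 10, 2

server_set = {}   # file path -> set of back-end indices assigned to it

def lard_assign(path, loads):
    """Return the back-end chosen to service a request for `path`."""
    if path not in server_set:
        # first request for this file: assign the least loaded back-end
        target = min(range(len(loads)), key=lambda s: loads[s])
        server_set[path] = {target}
        return target
    assigned = server_set[path]
    max_s = max(assigned, key=lambda s: loads[s])
    min_s = min(assigned, key=lambda s: loads[s])
    cluster_min = min(range(len(loads)), key=lambda s: loads[s])
    if loads[max_s] > T_HIGH or loads[cluster_min] < T_LOW:
        # grow the set with the cluster's least loaded server
        assigned.add(cluster_min)
        return cluster_min
    return min_s
```

The `server_set` dictionary is what the front-end must maintain per file, which makes concrete the memory and processing cost listed as a disadvantage on the next slide.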
Request Distribution Schemes (cont’d) • LARD (cont’d) • File-space partitioning done on the fly • Disadvantages • Large amounts of processing needs to be performed by the front-end • Large amount of memory needed to maintain information on each individual file • Possible bottleneck as system is scaled
Request Distribution Schemes (cont’d) • Dynamic Client Based Request Distribution • Based on the premise that file reuse among clients is high • Complete ignorance of server loads • Propose a modification to the static client based distribution to make it actively modify distribution based on back-end loads.
Request Distribution Schemes (cont’d) • Dynamic Client Based (cont’d) • Use of time-to-live (TTL) for server mappings within cluster - TTL is continuously variable • In heavily loaded systems • RR type distribution preferable as queue times predominate • TTL values should be small • In lightly loaded systems • TTL values should be large in order to maximize benefits of locality
Request Distribution Schemes (cont’d) • Dynamic Client Based (cont’d) • On TTL expiration, assign client partition to the least loaded back-end server in cluster • If more than one server has the same low load - choose randomly from that set • Allows server using an IPRP type protocol to redirect client to another server if it aids load balancing • Unlike DNS, clients cannot avoid this mechanism • Hence - Dynamic
Request Distribution Schemes (cont’d) • Dynamic Client Based (cont’d) • Trend in server load essential to determine if TTL is to be increased or decreased • Need to average out the requests to smooth out transient activity • Moving Window Averaging Scheme • Only requests that come within the window period actively contribute towards load calculation
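A minimal sketch of the moving-window load average and the TTL adaptation above; the window length, TTL bounds, and halving/doubling step are assumptions, not values from the study:

```python
from collections import deque

WINDOW = 10.0            # seconds of request history kept (assumed)
TTL_MIN, TTL_MAX = 1.0, 60.0   # assumed TTL bounds, in seconds

class LoadWindow:
    """Moving-window averaging: only requests arriving within the last
    WINDOW seconds contribute to the load estimate, smoothing transients."""
    def __init__(self):
        self.arrivals = deque()          # request timestamps

    def record(self, t):
        self.arrivals.append(t)
        while self.arrivals and self.arrivals[0] < t - WINDOW:
            self.arrivals.popleft()      # expire requests outside window

    def load(self):
        return len(self.arrivals) / WINDOW   # requests/sec in window

def adjust_ttl(ttl, prev_load, cur_load):
    """Heavier load -> shorter TTL (RR-like behaviour, queue times dominate);
    lighter load -> longer TTL (maximize locality benefits)."""
    if cur_load > prev_load:
        return max(TTL_MIN, ttl / 2)
    return min(TTL_MAX, ttl * 2)
```

The direction of the load trend, not its absolute value, drives the TTL, matching the slide's point that the trend is what determines whether TTL should grow or shrink.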
Simulation Model • Trace-driven simulation model • Based on CSIM • Modelled on an IBM OS/2 server for various hardware parameters • Several parameters could be modified • # of servers, memory size, CPU capacity in MIPS (50), disk access times, network communication time/packet, data organization - disk mirror or stripe
Simulation Model (cont’d) • In disk mirror and disk striping, data cached at request servicing nodes • In disk striping, data is also cached at disk-end nodes
Simulation Model (cont’d) • Traces • Representative of two arenas where clustered servers are currently used • World Wide Web (WWW) Servers • Network File (NFS) Servers
Simulation Model (cont’d) • WEB Trace • ClarkNet WWW Server - ISP for the Metro Baltimore - Washington DC area • Collected over a period of two weeks • Original trace had 3 million records • Weeded out non-HTTP-related records like CGI, ftp • Resulting trace had 1.4 million records • Over 90,000 clients • Over 24,000 files that had a total occupancy of slightly under 100 MBytes
Simulation Model (cont’d) • WEB Trace (cont’d) • Records had timestamps with 1 second resolution • Did not accurately represent real manner of request arrivals • Requests that arrived in the same second were augmented with a randomly generated microsecond extension
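The timestamp augmentation described above might be sketched like this (a plausible reconstruction, not the authors' actual preprocessing code):

```python
import random

# Requests sharing the same 1-second timestamp are spread out by adding
# a randomly generated microsecond-resolution offset, better modelling
# the real manner of request arrivals.
def spread_timestamps(seconds):
    """seconds: list of integer arrival times (1 s resolution) from the trace."""
    out = [t + random.randrange(1_000_000) / 1_000_000 for t in seconds]
    return sorted(out)
```

Sorting after augmentation keeps the arrival stream monotonic so the simulator can replay it in order.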
Simulation Model (cont’d) • NFS Trace • Obtained from Auspex file server at UC Berkeley • Consists of post client-cache misses • Collected over a period of one week • Had 231 clients, over 68,000 files that had a total occupancy of 1,292 MBytes
Simulation Model (cont’d) • NFS Trace (cont’d) • Original trace had a large amount of backup data at night and over weekends, only daytime records used in simulation • Records had timestamps with microsecond resolution • Cache allowed to WARM-UP prior to any measurements being made
Results - Effects of Memory Size • NFS Trace, Disk Stripe • Increased memory = increased cache space Response time for 4 back-end servers.
Results - Effects of Memory Size • NFS trace, Disk Stripe • FB better at extracting locality • RR hits are purely probabilistic Cache-hit ratio for 4 back-end servers.
Results - Effects of Memory Size • WEB trace, Disk Stripe • WEB trace has a smaller working set • Increase in memory has less of an effect Response time for 4 back-end servers.
Results - Effects of Memory Size • WEB trace, Disk Stripe • Extremely high hit rates, even at 32 MBytes • FB able to extract maximum locality • Distribution scheme has less of an effect on response time • Load distribution was acceptable for all schemes; best: RR, worst: FB Cache hit rates for 4 back-end system.
Results - Effects of Memory Size • WEB Trace, Disk Mirror • Very similar to DS • With smaller memory, hit rates slightly lower as no disk-end caching Disk stripe vs. disk mirror.
Results - Scalability Performance • NFS trace, Disk Stripe • RR shows least benefit • Due to probabilistic cache hits Number of servers on response time (128MB memory).
Results - Scalability Performance • NFS Trace, Disk Stripe • ROUND ROBIN • Drop in hit rates with more servers • Less “probabilistic” locality Cache hit rate vs. memory size and number of back-end servers.
Results - Scalability Performance • NFS Trace, Disk Mirror • RR performance worsens with more servers • All other schemes perform similar to Disk Striping Number of servers on response time (128MB).
Results - Scalability Performance • NFS Trace, Disk Mirror • For RR, lower hit rates with more servers - higher response time • For RR, disk-end caching offers better hit rates in disk striping than in disk mirror Cache hit rates for RR under disk striping vs. mirroring (128MB).
Results - Effects of Memory Size • NFS trace, Disk Mirror • Similar effect of more memory • Stagnation of hit rates in FB, DM does better than DS due to caching of data at disk end • RR exhibits better hit rates with DS than DM, greater variety of files in cache Cache hit rates with disk mirror and disk striping.
Results - Disk Stripe vs. Disk Mirror • Implicit distribution of load in disk striping produces low disk queues Queueing time in disk stripe and disk mirror. NFS trace with a 4 back-end system used.
Conclusion & Future Work • RR: ideal distribution, but poor response times due to the probabilistic nature of cache hits • File-based was the best at extracting locality, but completely ignored server loads, giving poor load distribution • LARD: similar to FB but with better load distribution • For the WEB trace, cache hit rates were so high that distribution did not play a role in determining response time
Conclusion & Future Work • Dynamic CB addressed the problem of server load ignorance of static CB, better distribution in NFS trace, better hit rates in WEB Trace • Disk Striping distributed requests over several servers, relieved disk queues but increased server queues • In the process of evaluating a flexible caching approach with Round Robin distribution that can exploit the file-based caching methodology • Throughput comparisons of various policies • Impact of faster processors • Impact of Dynamically generated web page content