Gamma DBMS (Part 2): Failure Management Query Processing

Shahram Ghandeharizadeh

Computer Science Department

University of Southern California

Failure Management Techniques
  • Teradata’s Interleaved Declustering
    • A partitioned table has a primary and a backup copy.
    • The primary copy is constructed using one of the partitioning techniques.
    • The secondary copy is constructed by:
      • Dividing the nodes into clusters (here, of size 4), and
      • Partitioning each primary fragment (e.g., R0 on node 0) across the remaining nodes of its cluster (nodes 1, 2, and 3), realizing backup subfragments r0.0, r0.1, and r0.2.
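The placement just described can be sketched as follows. This is a toy model, not Teradata's implementation; the node numbering and subfragment naming are assumptions for illustration:

```python
# Sketch of interleaved declustering with cluster size 4: primary fragment
# R_i lives on node i, and its backup is split into three subfragments
# r_i.0, r_i.1, r_i.2 spread over the other nodes of the same cluster.

CLUSTER_SIZE = 4

def backup_placement(primary_node, cluster_size=CLUSTER_SIZE):
    """Return (subfragment, node) pairs for primary fragment R_primary_node."""
    cluster_start = (primary_node // cluster_size) * cluster_size
    placement = []
    for j, offset in enumerate(range(1, cluster_size)):
        # Walk around the cluster, skipping the primary node itself.
        node = cluster_start + (primary_node - cluster_start + offset) % cluster_size
        placement.append((f"r{primary_node}.{j}", node))
    return placement

# Fragment R0 (primary on node 0) is backed up on nodes 1, 2, and 3:
print(backup_placement(0))   # [('r0.0', 1), ('r0.1', 2), ('r0.2', 3)]
```

Note how a node in the second cluster (say node 5) wraps around within its own cluster of nodes 4-7, never outside it.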
Teradata’s Interleaved Declustering
  • When a node (say node 1) fails, its backup subfragments process the requests directed to the primary copy of R1.
    • The three backup subfragments r1.0, r1.1, and r1.2 reside on the other nodes of the cluster.
  • Note that the load of R1 is distributed across the remaining nodes of the cluster.
Teradata’s Interleaved Declustering
  • MTTR involves:
    • Replacing the failed node with a new one.
    • Reconstructing the primary copy of the fragment assigned to the failed node, R1.
      • By reading r1.2, r1.0, and r1.1 from Nodes 0, 2, and 3.
    • Reconstructing the backup fragments assigned to the failed node: r0.0, r2.2, and r3.1.
Teradata’s Interleaved Declustering
  • When does data become unavailable?
    • When a second node in a cluster fails before the first failed node in that cluster is repaired.
    • Note that the full story is a bit more complex than the discussion here.
Teradata’s Interleaved Declustering
  • What is the advantage of making the cluster size equal to 8?
    • Better distribution of the workload across the surviving nodes in the presence of a failure.
  • What is the disadvantage of making the cluster size equal to 8?
    • Higher likelihood of data becoming unavailable, because more pairs of nodes can jointly hold a fragment and its backups.
  • Tradeoff between load-balancing (in the presence of a failure) and data availability.
Gamma’s Chained Declustering
  • Nodes are divided into disjoint groups called relation clusters.
  • A relation is assigned to one relation cluster and its records are declustered across the nodes of that relation cluster using a partitioning strategy (Range, Hash).
  • Given a primary fragment Ri, its backup copy is assigned to node (i+1) mod M (M is the number of nodes in the relation cluster).
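The backup assignment rule is simple enough to state in a couple of lines. A minimal sketch, with the relation-cluster size M = 8 assumed for illustration:

```python
# Chained declustering: primary fragment R_i resides on node i; its backup
# copy r_i resides on node (i + 1) mod M, where M is the number of nodes
# in the relation cluster.

M = 8  # relation-cluster size (assumed for this example)

def backup_node(i, m=M):
    """Node holding the backup copy of primary fragment R_i."""
    return (i + 1) % m

print(backup_node(0))  # 1
print(backup_node(7))  # 0 -- the chain wraps around
```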
Gamma’s Chained Declustering
  • During normal mode of operation:
    • Read requests are directed to the fragments of the primary copy,
    • Write requests update both primary and backup copies.
Gamma’s Chained Declustering
  • In the presence of failure:
    • Both primary and backup fragments are used for read operations,
      • Objective: Balance the load and avoid bottlenecks!
    • Write requests update both primary and backup copies.
  • Note:
    • The load of R1 (on node 1) is pushed to node 2 in its entirety.
    • In turn, a fraction of the read requests directed to each node is pushed along the chain to its neighbor, so node 1’s failure translates into roughly a 1/8 load increase on each remaining node.
Gamma’s Chained Declustering
  • MTTR involves:
    • Replace node 1 with a new node,
    • Reconstruct R1 (from r1 on node 2) on node 1,
    • Reconstruct backup copy of R0 (i.e., r0) on node 1.
  • Note:
    • Once Node 1 becomes operational, primary copies are used to process read requests.
Gamma’s Chained Declustering
  • Not every pair of node failures in a relation cluster results in data unavailability.
  • Data becomes unavailable only when two adjacent nodes in the chain fail.
Gamma’s Chained Declustering
  • Re-assignment of active fragments incurs neither disk I/O nor data movement.
Join
  • Hash-join
  • A data-flow execution paradigm
Example Join of Emp and Dept

[Figure: Emp join Dept on the dno attribute]

Hash-Join: 1 Node
  • Join of Tables A and B using attribute j (A.j = B.j) consists of two phases:
    • Build phase: Build a main-memory hash table on Table A using the join attribute j, e.g., build a hash table on the Dept table using dno as the key of the hash table.
    • Probe phase: Scan Table B one record at a time and use its attribute j to probe the hash table constructed on Table A, e.g., probe the hash table using the rows of the Emp table.
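The two phases can be sketched in a few lines of Python. The table contents below are made up for illustration:

```python
# In-memory hash join of Dept and Emp on dno: build a hash table on the
# smaller table (Dept), then probe it with each row of Emp.

dept = [(1, "Toy"), (2, "Shoe")]                  # (dno, dname)
emp  = [("Alice", 1), ("Bob", 2), ("Carol", 1)]   # (name, dno)

# Build phase: hash table on Dept keyed by the join attribute dno.
hash_table = {}
for dno, dname in dept:
    hash_table.setdefault(dno, []).append(dname)

# Probe phase: scan Emp one record at a time and look up its dno.
result = [(name, dno, dname)
          for name, dno in emp
          for dname in hash_table.get(dno, [])]

print(result)
# [('Alice', 1, 'Toy'), ('Bob', 2, 'Shoe'), ('Carol', 1, 'Toy')]
```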
Hash-Join: Build
  • Read rows of Dept table one at a time and place in a main-memory hash table.

[Figure: seven hash buckets keyed by dno % 7]

Hash-Join: Probe
  • Read rows of Emp table and probe the hash table.


Hash-Join: Probe
  • Read rows of Emp table, probe the hash table, and produce a result row when a match is found.


Hash-Join: Probe
  • The probe phase terminates when all rows of the Emp table have been processed!


Hash-Join
  • Key challenge: The table used to build the hash table does not fit in main memory!
  • A divide-and-conquer approach:
    • Use the inner table (Dept) to construct n memory buckets where each bucket is a hash table.
    • Every time memory is exhausted, spill a fixed number of buckets to disk.
    • The build phase terminates with a set of in-memory buckets and a set of disk-resident buckets.
    • Read the outer relation (Emp) and probe the in-memory buckets for joining records. Records that map onto the disk-resident buckets are streamed and stored to disk.
    • Discard the in-memory buckets to free memory space.
    • While disk-resident buckets of the inner relation exist:
      • Read as many (say i) of the disk-resident buckets of the inner relation into memory as possible.
      • Read the corresponding buckets of the outer relation (Emp) to probe the in-memory buckets for joining records.
      • Discard the in-memory buckets to free memory space.
      • Delete the i buckets of the inner and outer relations.
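The divide-and-conquer idea above can be sketched in simplified form: partition both tables into n buckets on the join attribute, then join matching bucket pairs one at a time, so that only one inner bucket's hash table must fit in memory at once. (The real hybrid hash join keeps some inner buckets memory-resident and spills the rest to disk; that bookkeeping is elided here, and the bucket count and table contents are assumptions.)

```python
# Simplified bucket-at-a-time hash join: rows that can join always hash to
# the same bucket index, so each bucket pair can be joined independently.

N_BUCKETS = 3

def partition(rows, key, n=N_BUCKETS):
    """Spread rows across n buckets by hashing the join attribute."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[key(row) % n].append(row)
    return buckets

def bucket_hash_join(dept, emp):
    dept_buckets = partition(dept, key=lambda r: r[0])   # (dno, dname)
    emp_buckets  = partition(emp,  key=lambda r: r[1])   # (name, dno)
    out = []
    for db, eb in zip(dept_buckets, emp_buckets):
        table = {}
        for dno, dname in db:            # build on the inner bucket
            table.setdefault(dno, []).append(dname)
        for name, dno in eb:             # probe with the outer bucket
            for dname in table.get(dno, []):
                out.append((name, dno, dname))
    return out

print(bucket_hash_join([(1, "Toy"), (2, "Shoe")],
                       [("Alice", 1), ("Bob", 2)]))
# [('Alice', 1, 'Toy'), ('Bob', 2, 'Shoe')]
```

Spilling a bucket to disk and reading it back later does not change the result, because a Dept row and an Emp row can only join if they land in buckets with the same index.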
Hash-Join: Build
  • Two buckets of the Dept table: one in memory, the second disk-resident.


Hash-Join: Probe
  • Read the Emp table and probe the hash table for joining records when dno = 1. Rows with dno = 2 are streamed to disk.


Hash-Join: Probe
  • The rows of the Emp table with dno = 1 probe the hash table and produce 3 joining records.


Hash-Join: While loop
  • Read the disk-resident bucket of Dept into memory.


Hash-Join: While loop
  • Read the disk-resident bucket of Dept into memory.
  • Probe it with the disk-resident bucket of the Emp table to produce the remaining two joining records.


Parallelism and Hash-Join
  • Each node may perform hash-join independently when:
    • The join attribute is the declustering attribute of the tables participating in the join operation.
    • The participating tables are declustered across the same number of nodes using the same declustering strategy.
    • (Even when these conditions hold, the system may still re-partition the tables, as in the next bullet, to use more aggregate memory than is available on the nodes across which the tables are currently declustered.)
  • Otherwise, the data must be re-partitioned to perform the join operation correctly.
  • Show an example!
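A sketch of the re-partitioning step: every node routes each of its local rows to the node determined by hashing the join attribute, so matching Emp and Dept rows end up on the same node and each node can then run a local hash join independently. The node count and initial row layout are assumptions for illustration:

```python
# Re-partitioning before a parallel hash join: route each row by a hash of
# its join attribute so joining rows are co-located on one node.

NUM_NODES = 4

def route(join_value, num_nodes=NUM_NODES):
    """Destination node for a row with this join-attribute value."""
    return hash(join_value) % num_nodes

def repartition(rows_per_node, key, num_nodes=NUM_NODES):
    """Return the rows each node holds after the shuffle."""
    after = [[] for _ in range(num_nodes)]
    for local_rows in rows_per_node:
        for row in local_rows:
            after[route(key(row))].append(row)
    return after

# Dept rows start on node 0, Emp rows on node 1:
dept_after = repartition([[(1, "Toy"), (2, "Shoe")], [], [], []],
                         key=lambda r: r[0])   # (dno, dname)
emp_after  = repartition([[], [("Alice", 1), ("Bob", 2)], [], []],
                         key=lambda r: r[1])   # (name, dno)

# Rows that join with each other now reside on the same node.
```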
Parallelism and Hash-Join (Cont…)
  • R join S where R is the inner table.
Data Flow Execution Paradigm
  • Retrieve all those Employees working for the toy department:

SELECT *
FROM Dept d, Emp e
WHERE d.dno = e.dno AND d.dname = 'Toy'

Data Flow Execution Paradigm
  • Producer/Consumer relationship where consumers are activated in advance of the producers.
Data Flow Execution Paradigm
  • A “Split Table” contains routing information for the records.
  • The consumers must be set up before the producers are activated.