Gamma DBMS (Part 2): Failure Management & Query Processing

Shahram Ghandeharizadeh

Computer Science Department

University of Southern California


Failure Management Techniques

  • Teradata’s Interleaved Declustering

    • A partitioned table has a primary and a backup copy.

    • The primary copy is constructed using one of the partitioning techniques.

    • The secondary copy is constructed by:

      • Dividing the nodes into clusters (cluster size is 4),

      • Partitioning a primary fragment (e.g., R0) across the remaining nodes of its cluster (nodes 1, 2, and 3), realizing backup subfragments r0.0, r0.1, and r0.2. (A placement sketch follows this list.)
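
A minimal placement sketch (Python, my own illustration) may make the layout concrete. It assumes clusters of size 4 and assigns each fragment's backup subfragments to the other nodes of its cluster; the exact interleaving of subfragment indices in Teradata differs, so the names are only indicative.

    def interleaved_placement(num_nodes, cluster_size=4):
        # Fragment Ri lives on node i; its backup is split into
        # cluster_size - 1 subfragments ri.j on the cluster's other nodes.
        placement = {}
        for i in range(num_nodes):
            start = (i // cluster_size) * cluster_size
            others = [start + k for k in range(cluster_size) if start + k != i]
            placement[f"R{i}"] = {
                "primary": i,
                "backups": {f"r{i}.{j}": node for j, node in enumerate(others)},
            }
        return placement

    # 8 nodes form two clusters of 4; R0's backup pieces land on nodes 1, 2, 3.
    print(interleaved_placement(8)["R0"])
    # {'primary': 0, 'backups': {'r0.0': 1, 'r0.1': 2, 'r0.2': 3}}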


Teradata’s Interleaved Declustering

  • When a node (say node 1) fails, the backup fragments of its primary copy R1 take over the requests directed to R1.

    • These are the three backup subfragments r1.2, r1.0, and r1.1 on nodes 0, 2, and 3.

  • Note that the load of R1 is distributed across the remaining nodes of the cluster.


Teradata’s Interleaved Declustering

  • Repair (whose duration determines MTTR) involves:

    • Replacing the failed node with a new one.

    • Reconstructing the primary copy of the fragment assigned to the failed node, R1.

      • By reading r1.2, r1.0, and r1.1 from nodes 0, 2, and 3. (A reconstruction sketch follows this list.)

    • Reconstructing the backup fragments assigned to the failed node: r0.0, r2.2, and r3.1.
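
As an illustration (my own sketch, not from the deck), rebuilding the failed node's primary fragment amounts to reading its backup subfragments from the surviving cluster nodes and concatenating their rows:

    def rebuild_primary(surviving, i):
        # surviving maps each live cluster node to the fragments it stores.
        rows = []
        for node, fragments in surviving.items():
            for name, data in fragments.items():
                if name.startswith(f"r{i}."):   # a backup piece of R{i}
                    rows.extend(data)
        return rows

    # Hypothetical rows: r1.2 on node 0, r1.0 on node 2, r1.1 on node 3.
    surviving = {0: {"r1.2": ["row c"]},
                 2: {"r1.0": ["row a"]},
                 3: {"r1.1": ["row b"]}}
    print(rebuild_primary(surviving, i=1))      # ['row c', 'row a', 'row b']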


Teradata’s Interleaved Declustering

  • When does data become unavailable?

    • When a second node in a cluster fails prior to the repair of the first failed node in that cluster.

    • Note that the full condition is a bit more complex than this discussion suggests.


Teradata s interleaved declustering4
Teradata’s Interleaved Declustering

  • What is the advantage of making the cluster size equal to 8?


Teradata s interleaved declustering5
Teradata’s Interleaved Declustering

  • What is the advantage of making the cluster size equal to 8?

    • Better distribution of the workload across the nodes in the presence of a failure.


Teradata s interleaved declustering6
Teradata’s Interleaved Declustering

  • What is the advantage of making the cluster size equal to 8?

    • Better distribution of the workload across the nodes in the presence of a failure.

  • What is the dis-advantage of making the cluster size equal to 8?


Teradata s interleaved declustering7
Teradata’s Interleaved Declustering

  • What is the advantage of making the cluster size equal to 8?

    • Better distribution of the workload across the nodes in the presence of a failure.

  • What is the dis-advantage of making the cluster size equal to 8?

    • Higher likelihood of data becoming unavailable.


Teradata s interleaved declustering8
Teradata’s Interleaved Declustering

  • What is the advantage of making the cluster size equal to 8?

    • Better distribution of the workload across the nodes in the presence of a failure.

  • What is the disadvantage of making the cluster size equal to 8?

    • Higher likelihood of data becoming unavailable.

  • There is a tradeoff between load balancing (in the presence of a failure) and data availability; the sketch below quantifies it.
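
A back-of-the-envelope sketch of the tradeoff, assuming 32 nodes, independent failures, and a uniformly random second failure (the numbers are illustrative, not from the deck):

    def extra_load_per_survivor(c):
        # A failed node's read load is spread over the c-1 cluster survivors.
        return 1 / (c - 1)

    def p_second_failure_fatal(c, total_nodes):
        # Probability that a second random failure lands in the same cluster.
        return (c - 1) / (total_nodes - 1)

    for c in (4, 8):
        print(f"cluster size {c}: "
              f"extra load {extra_load_per_survivor(c):.2f}, "
              f"P(unavailable on 2nd failure) {p_second_failure_fatal(c, 32):.2f}")
    # cluster size 4: extra load 0.33, P(unavailable on 2nd failure) 0.10
    # cluster size 8: extra load 0.14, P(unavailable on 2nd failure) 0.23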


Gamma’s Chained Declustering

  • Nodes are divided into disjoint groups called relation clusters.

  • A relation is assigned to one relation cluster and its records are declustered across the nodes of that relation cluster using a partitioning strategy (Range, Hash).

  • Given a primary fragment Ri, its backup copy ri is assigned to node (i+1) mod M, where M is the number of nodes in the relation cluster (see the sketch below).
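
A minimal sketch of this placement rule:

    def chained_placement(M):
        # Primary fragment Ri on node i; its backup ri on the next node.
        return {f"R{i}": {"primary": i, "backup": (i + 1) % M}
                for i in range(M)}

    # With M = 8, R1 sits on node 1 and its backup r1 on node 2, so data
    # becomes unavailable only if two adjacent nodes fail before repair.
    print(chained_placement(8)["R1"])   # {'primary': 1, 'backup': 2}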


Gamma’s Chained Declustering

  • During normal mode of operation:

    • Read requests are directed to the fragments of the primary copy,

    • Write requests update both primary and backup copies.


Gamma’s Chained Declustering

  • In the presence of failure:

    • Both primary and backup fragments are used for read operations,

      • Objective: Balance the load and avoid bottlenecks!

    • Write requests update both primary and backup copies.

  • Note:

    • Load of R1 (on node 1) is pushed to node 2 in its entirety.

    • A fraction of the read requests directed to each node is pushed to its neighbor along the chain, yielding roughly a 1/8 load increase per surviving node attributable to node 1’s failure (see the sketch below).
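
A sketch of the rebalancing idea for an 8-node relation cluster (my own illustration; the exact fractions depend on the cluster size):

    M = 8                           # nodes in the relation cluster
    failed = 1
    survivors = [n for n in range(M) if n != failed]
    target = M / (M - 1)            # equal share per surviving node

    load = {n: 1.0 for n in survivors}
    load[(failed + 1) % M] += 1.0   # node 2 absorbs all reads of R1 via r1

    # Each node shifts part of its primary reads to its backup on the next
    # node, propagating the excess along the chain until loads equalize.
    for n in [(failed + k) % M for k in range(1, M - 1)]:
        excess = load[n] - target
        load[n] -= excess
        load[(n + 1) % M] += excess

    print({n: round(v, 2) for n, v in load.items()})
    # every surviving node ends at the same load, ~1.14 of normal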


Gamma’s Chained Declustering

  • Repair (whose duration determines MTTR) involves:

    • Replace node 1 with a new node,

    • Reconstruct R1 (from r1 on node 2) on node 1,

    • Reconstruct backup copy of R0 (i.e., r0) on node 1.

  • Note:

    • Once Node 1 becomes operational, primary copies are used to process read requests.


Gamma’s Chained Declustering

  • Not every pair of node failures in a relation cluster results in data unavailability.

  • Two adjacent nodes must fail for data to become unavailable.


Gamma’s Chained Declustering

  • Re-assignment of active fragments incurs neither disk I/O nor data movement.


Join

  • Hash-join

  • A data-flow execution paradigm


Example Join of Emp and Dept

Emp join Dept (using dno)

[Slide figure: sample Dept and Emp tables.]


Hash-Join: 1 Node

  • A join of tables A and B using attribute j (A.j = B.j) consists of two phases:

    • Build phase: build a main-memory hash table on table A using the join attribute j, e.g., build a hash table on the Dept rows (the Toy department) using dno as the key of the hash table.

    • Probe phase: scan table B one record at a time and use its attribute j to probe the hash table constructed on table A, e.g., probe the hash table using the rows of the Emp table. (A single-node sketch follows this list.)
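
A minimal single-node sketch of the two phases (the rows are made up for illustration):

    dept = [(1, "Toy"), (2, "Shoe")]                  # (dno, dname)
    emp  = [("Alice", 1), ("Bob", 2), ("Carol", 1)]   # (name, dno)

    # Build phase: hash the Dept rows on the join attribute dno.
    hash_table = {}
    for dno, dname in dept:
        hash_table.setdefault(dno, []).append(dname)

    # Probe phase: scan Emp one record at a time, probe on dno, and emit
    # a result row for every match.
    for name, dno in emp:
        for dname in hash_table.get(dno, []):
            print(name, dno, dname)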


Hash-Join: Build

  • Read the rows of the Dept table one at a time and place them in a main-memory hash table.

[Slide figure: a hash table whose buckets are addressed by dno % 7.]


Hash-Join: Probe

  • Read the rows of the Emp table and probe the hash table.


Hash-Join: Probe

  • Read the rows of the Emp table, probe the hash table, and produce a result when a match is found.


Hash-Join: Probe

  • The join terminates once all rows of the Emp table have been processed!


Hash-Join

  • Key challenge: the table used to build the hash table may not fit in main memory!

  • A divide-and-conquer approach (a code sketch follows this list):

    • Use the inner table (Dept) to construct n buckets in memory, where each bucket is a hash table.

    • Every time memory is exhausted, spill a fixed number of buckets to disk.

    • The build phase terminates with a set of in-memory buckets and a set of disk-resident buckets.

    • Read the outer relation (Emp) and probe the in-memory buckets for joining records. Records that map onto disk-resident buckets are streamed and stored to disk.

    • Discard the in-memory buckets to free memory space.

    • While disk-resident buckets of the inner relation exist:

      • Read as many (say i) of the disk-resident buckets of the inner relation into memory as possible.

      • Read the corresponding buckets of the outer relation (Emp) and probe the in-memory buckets for joining records.

      • Discard the in-memory buckets to free memory space.

      • Delete the i processed buckets of the inner and outer relations.
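
A simplified sketch of this bucket-spilling approach, with Python lists standing in for disk files (rows, bucket counts, and names are illustrative):

    def hybrid_hash_join(dept, emp, n_buckets=2, in_memory=(0,)):
        # Build: partition Dept on dno; keep some buckets in memory and
        # "spill" the rest to simulated disk.
        mem, disk_inner = {}, {}
        for dno, dname in dept:
            b = dno % n_buckets
            if b in in_memory:
                mem.setdefault(b, {}).setdefault(dno, []).append(dname)
            else:
                disk_inner.setdefault(b, []).append((dno, dname))

        # Probe: Emp rows probe the in-memory buckets; rows hashing to a
        # spilled bucket are themselves written to disk for later.
        results, disk_outer = [], {}
        for name, dno in emp:
            b = dno % n_buckets
            if b in mem:
                results += [(name, dno, d) for d in mem[b].get(dno, [])]
            else:
                disk_outer.setdefault(b, []).append((name, dno))

        # While loop: reload each spilled inner bucket, probe it with the
        # corresponding spilled outer bucket, then discard both.
        for b, rows in disk_inner.items():
            table = {}
            for dno, dname in rows:
                table.setdefault(dno, []).append(dname)
            for name, dno in disk_outer.pop(b, []):
                results += [(name, dno, d) for d in table.get(dno, [])]
        return results

    dept = [(1, "Toy"), (2, "Shoe")]
    emp = [("Alice", 1), ("Bob", 2), ("Carol", 1)]
    print(hybrid_hash_join(dept, emp))
    # [('Bob', 2, 'Shoe'), ('Alice', 1, 'Toy'), ('Carol', 1, 'Toy')]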


Hash-Join: Build

  • Two buckets of the Dept table: one in memory, the second disk-resident.


Hash-Join: Probe

  • Read the Emp table and probe the hash table for joining records when dno = 1. Rows with dno = 2 are streamed to disk.


Hash-Join: Probe

  • The rows of the Emp table with dno = 1 probe the hash table and produce three joining records.


Hash-Join: While loop

  • Read the disk-resident bucket of Dept into memory.

  • Probe it with the disk-resident bucket of the Emp table to produce the remaining two joining records.


Parallelism and Hash-Join

  • Each node may perform hash-join independently when:

    • The join attribute is the declustering attribute of the tables participating in the join operation.

    • The participating tables are declustered across the same number of nodes using the same declustering strategy.

    • The system may still re-partition the tables (see the next bullet) when a larger set of nodes offers more aggregate memory than the nodes the tables are currently declustered across.

  • Otherwise, the data must be re-partitioned to perform the join operation correctly.

  • Show an example! (See the re-partitioning sketch below.)
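
A sketch of the re-partitioning step (illustrative, not Gamma's actual code): every node rehashes its local rows of both tables on the join attribute dno, so rows with equal dno values meet on the same node, which then runs the single-node hash join locally.

    def repartition(local_rows, key_index, num_nodes):
        # Split one node's local rows into num_nodes outgoing streams.
        out = {n: [] for n in range(num_nodes)}
        for row in local_rows:
            out[hash(row[key_index]) % num_nodes].append(row)
        return out

    # Node 0's local Emp rows, keyed on dno (index 1), sent to 4 join nodes.
    local_emp = [("Alice", 1), ("Bob", 2), ("Carol", 9)]
    print(repartition(local_emp, key_index=1, num_nodes=4))
    # {0: [], 1: [('Alice', 1), ('Carol', 9)], 2: [('Bob', 2)], 3: []}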


Parallelism and Hash-Join (Cont…)

  • R join S where R is the inner table.


Data Flow Execution Paradigm

  • Retrieve all those Employees working for the toy department:

    SELECT *

    FROM Dept d, Emp e

    WHERE d.dno = e.dno AND d.dname = 'Toy'


Data Flow Execution Paradigm

  • Producer/Consumer relationship where consumers are activated in advance of the producers.


Data Flow Execution Paradigm

  • A “Split Table” contains routing information for the records.

  • The consumers must be set up before the producers can be activated. (A routing sketch follows.)
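
A minimal sketch of a split table (names are illustrative): each producer consults it to route every output record to the consumer that should receive it.

    def make_split_table(num_consumers):
        # Route on the join attribute so equal dno values reach the same
        # consumer, mirroring the re-partitioning sketch above.
        return lambda record: hash(record["dno"]) % num_consumers

    route = make_split_table(num_consumers=2)
    for rec in ({"name": "Alice", "dno": 1}, {"name": "Bob", "dno": 2}):
        print(rec["name"], "-> consumer", route(rec))
    # Consumers are set up first so each producer can stream records to
    # them as soon as it starts scanning.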

