1 / 61

How to Model and Implement a Hierarchy in SQL Server

How to Model and Implement a Hierarchy in SQL Server. AD-318-S. Louis Davidson (drsql.org) drsql@hotmail.com. Please silence cell phones. Explore Everything PASS Has to Offer . Free SQL Server and BI Web Events . Regional Event. Free 1-day Training Events. This is Community.

long
Download Presentation

How to Model and Implement a Hierarchy in SQL Server

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. How to Model and Implement a Hierarchy in SQL Server AD-318-S Louis Davidson (drsql.org) drsql@hotmail.com

  2. Please silence cell phones

  3. Explore Everything PASS Has to Offer Free SQL Server and BI Web Events Regional Event Free 1-day Training Events This is Community Business Analytics Training Local User Groups Around the World Session Recordings PASS Newsletter Free Online Technical Training

  4. Session Evaluations ways to access Your feedback is important and valuable. Submit by 5pm Friday Oct. 18 toWIN prizes Go to passsummit/evals Download the GuideBook App and search: PASS Summit 2013 Follow the QR code link displayed on session signage throughout the conference venue and in the program guide

  5. Who am I? • Been in IT for over 19 years • Microsoft MVP For 10 Years • Corporate Data Architect • Written five books on database design • Ok, so they were all versions of the same book. They at least had slightly different titles each time

  6. Hierarchies

  7. Agenda

  8. Hierarchies Wood with Tape Screw and Tape Screw Piece of Wood Tape • Trees - Single Parent Hierarchies • Graphs – Multi Parent Hierarchies • Note: Graphs can be complex to deal with as a whole, but often you can deal with them as a set of trees

  9. Cycles in Hierarchies Grandparent Parent Child • “I’m my own grandpa” syndrome • Must be understood or can cause infinite loop in processing • Generally disallowed in trees • Generally handled in graphs

  10. Hierarchy Uses • Trees • Species • Jurisdictions • “Simple” Organizational Charts (Or at least the base manager-employee part of the organization) • Directory folders • Graph • Bill of materials • Complex Organization Chart (all those dotted lines!) • Genealogies • Biological (Typically with limit cardinality of parents to 2 ) • Family Tree – (Sky is the limit)

  11. Modeling a hierarchy • Typically, there is only one way to model a hierarchy • One row is related to another row to indicate the relationship • The variation will be in how you implement that relationship • No matter how you implement things, the logical view to the user will remain the same • Modeling choices include: • Include the hierarchy structure with user data or not? • How many hierarchies need to be represented? • How many parents can a node have? • Can the same node have the same parent more than once for the same or different purpose?

  12. Variations on a Theme Two simple one parent “trees” (Getting closer to a developer reporting hierarchy Graph, allowing Employee to have multiple reports but only one for a Relationship Type Graph, allowing Employee to have multiple people they report to for the same reason (REALITY!) Simple One Parent Tree in External Structure Simple one parent “tree”

  13. Implementation of a Hierarchy • “There is more than one way to shave a dog” • None of which are pleasant for the dog or the shaver • And the doctor who orders it only asks for a bald dog • Hierarchies are not at all natural to manipulate/query using relational code • And the natural, recursive processing of a node at a time is horribly difficult and slow in relational code • So, multiple methods of processing them have arisen through the years • The topic (much like the topic of how cruel it is to shave a dog), inspires religious-like arguments • I find all of the implementation possibilities fascinating, so I set out to do an overview of them all…

  14. Working with Trees - Background Node recursion Relational Recursion

  15. Tree Processing Algorithms • There are several methods for processing trees in SQL • We will cover • Fixed Levels • Adjacency List • HierarchyId • Path Technique • Nested Sets • Kimball Helper Table • Without giving away too much, pretty much all of the methods have some use…

  16. Preconceived Notions Which method/algorithm do you expect to be fastest? • Fixed Levels • Adjacency List • HierarchyId • Path Technique • Nested Sets • Kimball Helper Table

  17. Coding for trees • Manipulation: • Creating a new node • Moving/Reparenting a node • Deleting a node (without children) • Note: No tree algorithms allow for “simple” SQL solutions to all of these problems • Usage • Getting the children of a node • Getting the parent of a node • Aggregating along the tree • We will have demos of all of these operations…

  18. Reparenting Example Dragging along all of it’s child nodes along with it Starting with: Perhaps ending with:

  19. Implementing a tree – Fixed Levels CREATE TABLE CompanyHierarchy( Company varchar(100) NULL, Headquarters varchar(100) NULL, Branch varchar(100) NULL, PRIMARY KEY (Company, Headquarters, Branch)) Very limited, but very fast and easy to work with I will not demo this structure today because it’s use is both extremely obvious and limited

  20. Implementing a tree – Adjacency List Every row includes the key value of the parent in the row Parent-less rows have no parent value Code is the most complex to write (though not as inefficient as it might seem) CREATE TABLE CompanyHierarchy( Organization varchar(100) NOT NULL PRIMARY KEY,ParentOrganizationvarchar(100) NULL REFERENCES CompanyHierarchy (Organization), Name varchar(100) NOT NULL)

  21. Adjacency List – Adding a Node New Node

  22. Simply set the parent and done!

  23. Implementing a tree – Path Method Every row includes a representation of the path to their parent Processing makes use of like and string processing (I have seen a case that used fixed length binary values) Limitation on path size for string manipulation/indexing CREATE TABLE CompanyHierarchy(OrganizationIdint NOT NULL PRIMARY KEY, Name varchar(100) NOT NULL, Path varchar(900))

  24. Path Method Adding a Node New Node

  25. New Id = 9

  26. Plus the New Id Path from the parent

  27. Implementing a tree – Path Method Every row includes a representation of the path to their parent Processing makes use of like and string processing (I have seen a case that used fixed length binary values) Limitation on path size for string manipulation/indexing CREATE TABLE CompanyHierarchy(OrganizationIdint NOT NULL PRIMARY KEY, Name varchar(100) NOT NULL, Path varchar(900))

  28. Implementing a tree – HierarchyId Somewhat unnatural method to the typical SQL Programmer Similar to the Path Method, and has some of the same limitations when moving around nodes Node path does not use data natural to the table, but rather positional locationing CREATE TABLE CompanyHierarchy(OrganizationIdint NOT NULL PRIMARY KEY, Name varchar(100) NOT NULL,OrgNodehierarchyIdnot null)

  29. Implementing a tree – Nested Sets Query processing is done using range queries Structure is quite slow to maintain due to fragile structure Can produce excellent performance for queries CREATE TABLE CompanyHierarchy( Organization varchar(100) NOT NULL PRIMARY KEY, Name varchar(100) NOT NULL, Left int NOT NULL, Right int NOT NULL)

  30. Nested Sets – Adding a Node New Node

  31. Updating Right Values

  32. And the One Left value right of the new node

  33. Renumber, leaving gap for child

  34. The New Node

  35. Set the New Node’s Left/Right

  36. Implementing a tree – Nested Sets Query processing is done using range queries Structure is quite slow to maintain due to fragile structure Can produce excellent performance for queries CREATE TABLE CompanyHierarchy( Organization varchar(100) NOT NULL PRIMARY KEY, Name varchar(100) NOT NULL, Left int NOT NULL, Right int NOT NULL)

  37. Implementing a tree – Kimball Helper Developed initially for data warehousing since data is modified all at once with a fixed cost Basically explodes the hierarchy into a table that turns all hierarchy manipulations into a relational query Maintenance can be slightly costly, but using the data is extremely fast

  38. Implementing a tree – Kimball Helper For the rows in yellow, expands to the table shown:

  39. Performance Examples and Limitations The following tests were run multiple times, and the results were taken from one such run. Clearly the results are not scientific, and done with random data. However, they very much match my expectations from my research. Load times were captured loading one row at a time. Test machine was a: Samsung Series 9, Sandy Bridge i5, 1.6Ghz Dual Core (Hyperthreaded), 4GB RAM, 128 GB SSD Note: All load times include time to load 5 transactions per node

  40. Performance Example Explanation For each performance test (which I will show the code later), I ran three query sets on each data set: Load the tree (until my computer couldn’t do it before PASS) Fetch all children from the root node Aggregate data for all children at all levels

  41. Performance Comparisons 157

  42. Performance Comparisons

  43. Performance Comparisons 14618 46203

  44. Performance Comparisons

  45. Performance Comparisons

  46. Performance Comparisons Note: HierarchyId dropped as it took over an hour to return results

  47. Performance Comparisons 175

  48. Performance Comparisons

  49. Performance Comparisons 1320000 3899720

More Related