1 / 53

Extreme Querying With analytics

Extreme Querying With analytics. blah blah NOT LIABLE blah blah blah, I NEVER SAID THAT blah blah READ THE DOCUMENTATION blah blah blah NO PROMISES blah I GET PAID BY THE WORD blah blah. DISCLAIMER. Read my blog at HTTP://BLOG.SYDORACLE.COM.

ganya
Download Presentation

Extreme Querying With analytics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Extreme Querying With analytics

  2. blah blah NOT LIABLE blah blah blah, I NEVER SAID THAT blah blah READ THE DOCUMENTATION blah blahblahNO PROMISES blah I GET PAID BY THE WORD blah blah DISCLAIMER Read my blog at HTTP://BLOG.SYDORACLE.COM

  3. PartitionS and windows

  4. A row in a result set was all alone…

  5. It can get chaotic

  6. Aggregate functions are the basis of many Analytics All the standard aggregates (MIN, MAX, COUNT, SUM, etc) can be used with analytic clauses. AgGREGATE FUNCTIONS

  7. Min / Max (with added KEEP) KEEP means keep the column value for the highest ranked record. NEW AgGREGATE FUNCTIONS

  8. *The most populous city Which of their cities has the most potential slaves?

  9. Cities data SYDNEY and X both have a population of 2 million

  10. Results MIN or MAX only makes a difference if there are multiple entries of the same ORDER BY rank

  11. Min / Max (with added KEEP) • Collect • Create an collection of all the individual values • A list of large cities … AgGREGATE FUNCTIONS

  12. Results

  13. Min / Max (with added KEEP) • Collect • XMLAgg (in four steps) • Collect the column(s) into an XML document AgGREGATE FUNCTIONS

  14. 1. Record to xml fragment

  15. 2. 'Parent' element for fragment

  16. 3. Aggregate xml fragments

  17. 4. top-level parent for aggregation

  18. Min / Max (with added KEEP) • Collect • XMLAGG • ListAgg • 11g function to create a single VARCHAR2 value from a collection of individual VARCHAR2s AgGREGATE FUNCTIONS

  19. LISTAGG - 11g

  20. Wrap the aggregate around a CASE statement to give more aggregation possibilities. SELECT SUM(case when state='VIC' then pop end) vic_pop, SUM(case when state='NSW' then pop end) nsw_pop FROM cities; Aggregate functions with CASE

  21. (at last) ANALYTIC FUNCTIONS

  22. Dense Rank / Rank / Row Number AgGREGATE FUNCTIONS for analytics

  23. Analytic Functions Smithers, Bring me a list of our highest paid employees … and the poisoned donuts.

  24. select name, wage, sector, row_number() over (partition by sector order by wage desc) rn, rank() over (partition by sector order by wage desc) rnk, dense_rank() over (partition by sector order by wage desc) drnk from emp order by sector, wage desc; AgGREGATE FUNCTIONS for analytics

  25. AgGREGATE FUNCTIONS for analytics

  26. SUM - WITH ANALYTICS

  27. Using ROW_NUMBER with other analytics can confuse… select name, wage, cum_wage from (select name, wage, sum(wage) over (order by wage desc) cwage, row_number() over (order by wage desc) rn from emp where sector = '7G') where rn < 3 NAME WAGE CUM_WAGE Homer 2OO 2OO Lenny 1OO 4OO WARNING

  28. SOLUTION : Switch WINDOWING TO ROWS

  29. Dense Rank / Rank / Row Number • NTILE • The "Snobs" and "Yobs" function • Ignore the outliers and extremes • Or ignore the 'huddled masses' AgGREGATE FUNCTIONS for analytics

  30. Normally distributed data

  31. NTILE Exclude the most common 90% Focus on the most common 10%

  32. Dense Rank / Rank / Row Number • NTILE • Lag / Lead • Look around for the previous or next row AgGREGATE FUNCTIONS for analytics

  33. MONTH AMOUNT PREV_AMT PERC January 340 February 340 340 .00 March 150 340 -55.88 April 130 150 -13.33 May 170 130 30.77 June 210 170 23.53 July 350 210 66.67 August 270 350 -22.86 September 380 270 40.74 Lag - percentage change over time

  34. MON AMOUNT PREV_AMT ---------- ---------- ---------- January 340 February 340 340 March 150 340 April 130 150 May 170 130 June 170 July 350 170 August 270 350 September 380 270 LAG - IGNORE NULLS (11g)

  35. Dense Rank / Rank / Row Number • Percent Rank • Lag / Lead • First / Last • Look further ahead or behind AgGREGATE FUNCTIONS for analytics

  36. select to_char(period,'Month') mon, amount, first_value(amount) over (partition bytrunc(period,'Q') order by period) prev_amt from sales order by period FIRST_VALUE AND PARTITION BY

  37. MON AMOUNT PREV_AMT ---------- ---------- ---------- January 340 340 February 340 340 March 150 340 April 130 130 May 170 130 June 210 130 July 350 350 August 270 350 September 380 350 RESULTS

  38. Rarely needed in practice Partition By and Order By normally enough WINDOW clause

  39. If you omit the PARTITION clause, especially with in-line views , the results can be BAD Beware the missing partition

  40. Multiple lines for multiple orders

  41. Regular query with analytic

  42. Incorrect query In the inline view, the SUM analytic applies to ALL the Orders in the table.

  43. Include a partition by clause

  44. (if we have time) ADVANCED GROUPINGS

  45. Rollup Grouping sets Cube Totals and subtotals

  46. ROLLUP, ROLLUP

  47. Totals and subtotals

  48. Rollup • Cube CUBE allows combinations of columns to be totaled Totals and subtotals

  49. Cube - example

  50. Rollup • Cube • Grouping sets • Perform grouping across multiple columns • Without the lower level totals of CUBE Totals and subtotals

More Related