1 / 7

OLAP Functions

OLAP Functions. Order-Dependent Aggregates and Windows in SQL: SQL:2003---same as SQL:1999. Extended Aggregation: OLAP Functions. Rank, row_number and other functions based on explicit order

lhumble
Download Presentation

OLAP Functions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. OLAP Functions Order-Dependent Aggregates and Windows in SQL: SQL:2003---same as SQL:1999

  2. Extended Aggregation: OLAP Functions • Rank, row_number and other functions based on explicit order • Many new statistical functions: see, e.g., DB2 UDB's High-Function Business Intelligence in e-business---Red Book http://www.redbooks.ibm.com/redbooks/SG246546.html • VAR(X): variance, STDDEV(X): its square root • COV(X,Y): covariance is the measure of the linear association between two variables. • CORR(X,Y): correlation is normalized covariance. +1 max correlation, 0 none, -1 inverse • Regression functions: Y = a X + b • REGR_SLOPE (X) -> a • REGR_INTERCEPT (X) -> b More: Chi-squared testing, sampling

  3. OLAP Functions • Windows on aggregates: • Physical windows: based on the number of rows in the window • Logical windows: based on the value span of the window.

  4. Ranking is done in conjunction with an order by specification. Suppose we are given a relation student-marks(Name, marks) which stores the marks obtained by each student. The following query gives the rank of each student. Select Name, rank ( ) (order by (marks) desc) as s-rank,dense_rank ( ) (order by (marks) desc) as d-rank from student-marks order by s-rank Name Marks Rank DenseRank Tom 8 1 1 Jeff 7 2 2 Mary 7 2 2 Alex 6 4 3 Ranking

  5. Partition By • Ranking can be done within partition of the data. The following query then gives the rank of students within each section: student-marks(student-id, marks); student- section(student-id, section) select student-id, section,rank( ) over (partition by section order by marks desc) as sec-rankfrom student-marks, student-sectionwhere student-marks.student-id = student-section.student-idorder by section, sec-rank • dense_rank(): no holes after ties • row_number(): provide row numbering given a specific partitioning and ordering of rows.

  6. For a given constant n, the ranking the function ntile(n) takes the tuples in each partition in the specified order, and divides them into n buckets with equal numbers of tuples. For instance, we sort employees by salary, and use ntile(3) to find which range (bottom third, middle third, or top third) each employee is in, and compute the total salary earned by employees in each range: select threetile, sum(salary)from (select salary, ntile(3) over (order by (salary) as threetilefrom employee) as sgroup by threetile SQL permits the user to specify where they should occur by using nulls first or nulls last, for instance select student-id, rank ( ) over (order by marks desc nulls last) as s-rankfrom student-marks Ranking (Cont.)

  7. An example of window query is that, given sales values for each date, calculates for each date the average of the sales on that day, the previous day, and the next day; such moving average queries are used to smooth out random variations. In contrast to group by, the same tuple can exist in multiple windows. Suppose we are given a relation transaction(account-number, date-time, value), select account-number, date-time,sum(value) over (partition by account-number order by date-timerange unbounded preceding) This query returns a new sum for each new tuple—cumulative sum! An actual window can also be specified: e.g. range 10 rows preceding range between 10 rows preceding and 4 following range 30 minutes preceding range interval 30 minutes preceding and current row Aggregates on Windows

More Related