## Gentle dive into analytic functions

A small example to illustrate using analytic functions to solve a (silly) query requirement.

Oracle introduced analytic functions in Release 2 of 8i. These functions greatly aide in solving some common query scenarios using straight SQL, that previously demanded procedural processing. Read Introduction to 8i Analytic Functions to get an idea on what these are.

Naturally, you will have situations where you have to write SQL for not so sensible business (or database design) situations. I will try to post more such samples. Here’s one to ponder.

Background: A race has many legs. Each race is a collection of sequentially arranged legs, with sequence running from 1, without any gap. That is, if a race has 4 legs, sequence numbers will be 1, 2, 3 and 4; in that order.

Problem: Given a set of records that have this information, without any id for a race, you want to find out the starting and entry points for each race, the total distance and the number of legs. The definition on at what record does each race begin is indicated by a sequence# of 1. A race ends at the record just before a record with sequence#1.

Again, let us not worry about the de-merits of the database design. I have changed the business situation and used races as an analogy.

Here’s the SQL script that explains the solution. Output of the script is below.

``` /* A race has many legs. Each race is a collection of sequentially arranged legs, with sequence running from 1, without any gap. */ set echo off set termout off create table t_(     seq number(2) not null,     strt varchar2(5) not null,     stop  varchar2(5) not null,     distance number(3) not null ); insert into t_ values(1, 'A', 'B', 10); insert into t_ values(2, 'C', 'D', 20); insert into t_ values(3, 'E', 'F', 15); insert into t_ values(1, 'AA', 'BB', 5); insert into t_ values(2, 'CC', 'DD', 1); insert into t_ values(3, 'EE', 'FF', 4); insert into t_ values(4, 'GG', 'HH', 8); insert into t_ values(1, 'X', 'Y', 25); set termout on set echo on /* So, in the data below, each race begins at seq=1 and ends just before next 1. */ select * from t_; /* In each group get the start and last stop and the sum of distances and the number of legs. Desired output for the above data is: A  F  45 3 AA FF 18 4 X  Y  25 1 */ -- STEP I: -- The pressing problem here is that we don’t have any "race_id" or anything -- else to group each segment. Using Analytic functions, that can be -- solved along with our assumption that sequences will have no gaps. -- Make up groups using rownum. It becomes easy to keep looking back using -- LAG function. select     lag(rownum, seq-1) over (order by 1) grp, seq, strt, stop, distance from t_; select     -- STEP II:     -- Now that we have groups from STEP I, use it as     -- a partition to get the values.     first_value(strt) over (partition by grp) strt,     last_value(stop) over (partition by grp) stop,     sum(distance) over (partition by grp) distance,     count(*) over (partition by grp) legs,     last_value(seq) over (partition by grp) legs_method_2     from (     -- STEP I:     select         lag(rownum, seq-1) over (order by 1) grp,     seq, strt, stop, distance     from t_ ); /* Putting it all together... */ select     -- STEP III: we don’t want duplicates. Hence distinct.     distinct     -- STEP II:     -- Now that we have groups from STEP I, use it as     -- a partition to get the values.     first_value(strt) over (partition by grp) strt,     last_value(stop) over (partition by grp) stop,     sum(distance) over (partition by grp) distance,     count(*) over (partition by grp) legs,     last_value(seq) over (partition by grp) legs_method_2     from (     -- STEP I:     select         lag(rownum, seq-1) over (order by 1) grp,     seq, strt, stop, distance     from t_ ); set termout off drop table t_; set termout on set echo off ```

The spool from the script illustrates the output too.

scott@vbabu>
DOC>A race has many legs. Each race is a collection of sequentially
DOC>arranged legs, with sequence running from 1, without any gap.
DOC>*/

scott@vbabu>/*
DOC>So, in the data below, each race begins at seq=1 and ends just before next 1.
DOC>*/
scott@vbabu>select * from t_;

SEQ STRT  STOP    DISTANCE
---------- ----- ----- ----------
1 A     B             10
2 C     D             20
3 E     F             15
1 AA    BB             5
2 CC    DD             1
3 EE    FF             4
4 GG    HH             8
1 X     Y             25

8 rows selected.

scott@vbabu>/*
DOC>In each group get the start and last stop and the sum of distances and
DOC>the number of legs. Desired output for the above data is:
DOC>A  F  45 3
DOC>AA FF 18 4
DOC>X  Y  25 1
DOC>*/
scott@vbabu>
scott@vbabu>-- STEP I:
scott@vbabu>-- The pressing problem here is that we don’t have any "race_id" or anything
scott@vbabu>-- else to group each segment. Using Analytic functions, that can be
scott@vbabu>-- solved along with our assumption that sequences will have no gaps.
scott@vbabu>-- Make up groups using rownum. It becomes easy to keep looking back using
scott@vbabu>-- LAG function.
scott@vbabu>select
2  lag(rownum, seq-1) over (order by 1) grp,
3  seq, strt, stop, distance
4  from t_;

GRP        SEQ STRT  STOP    DISTANCE
---------- ---------- ----- ----- ----------
1          1 A     B             10
1          2 C     D             20
1          3 E     F             15
4          1 AA    BB             5
4          2 CC    DD             1
4          3 EE    FF             4
4          4 GG    HH             8
8          1 X     Y             25

8 rows selected.

scott@vbabu>
scott@vbabu>select
2      -- STEP II:
3      -- Now that we have groups from STEP I, use it as
4      -- a partition to get the values.
5      first_value(strt) over (partition by grp) strt,
6      last_value(stop) over (partition by grp) stop,
7      sum(distance) over (partition by grp) distance,
8      count(*) over (partition by grp) legs,
9      last_value(seq) over (partition by grp) legs_method_2
10  from (
11      -- STEP I:
12      select
13      lag(rownum, seq-1) over (order by 1) grp,
14      seq, strt, stop, distance
15      from t_
16  );

STRT  STOP    DISTANCE       LEGS LEGS_METHOD_2
----- ----- ---------- ---------- -------------
A     F             45          3             3
A     F             45          3             3
A     F             45          3             3
AA    HH            18          4             4
AA    HH            18          4             4
AA    HH            18          4             4
AA    HH            18          4             4
X     Y             25          1             1

8 rows selected.

scott@vbabu>
scott@vbabu>
scott@vbabu>/*
DOC>Putting it all together...
DOC>*/
scott@vbabu>select
2      -- STEP III: we don’t want duplicates. Hence distinct.
3      distinct
4      -- STEP II:
5      -- Now that we have groups from STEP I, use it as
6      -- a partition to get the values.
7      first_value(strt) over (partition by grp) strt,
8      last_value(stop) over (partition by grp) stop,
9      sum(distance) over (partition by grp) distance,
10      count(*) over (partition by grp) legs,
11      last_value(seq) over (partition by grp) legs_method_2
12  from (
13      -- STEP I:
14      select
15      lag(rownum, seq-1) over (order by 1) grp,
16      seq, strt, stop, distance
17      from t_
18  );

STRT  STOP    DISTANCE       LEGS LEGS_METHOD_2
----- ----- ---------- ---------- -------------
A     F             45          3             3
AA    HH            18          4             4
X     Y             25          1             1

3 rows selected.

scott@vbabu>
scott@vbabu>drop table t_;

Table dropped.

1. good job,good examples,good description

Posted by: jametong on September 18, 2004 01:38 PM
//-->