Related Entries

Useful SQL*Plus headers
Dino eats frog
Oracle with Active Record
When is he free?
Pivoting for dates - summary

« Signs
» When is he free?

Gentle dive into analytic functions

A small example to illustrate using analytic functions to solve a (silly) query requirement.

Oracle introduced analytic functions in Release 2 of 8i. These functions greatly aide in solving some common query scenarios using straight SQL, that previously demanded procedural processing. Read Introduction to 8i Analytic Functions to get an idea on what these are.

Naturally, you will have situations where you have to write SQL for not so sensible business (or database design) situations. I will try to post more such samples. Here’s one to ponder.

Background: A race has many legs. Each race is a collection of sequentially arranged legs, with sequence running from 1, without any gap. That is, if a race has 4 legs, sequence numbers will be 1, 2, 3 and 4; in that order.

Problem: Given a set of records that have this information, without any id for a race, you want to find out the starting and entry points for each race, the total distance and the number of legs. The definition on at what record does each race begin is indicated by a sequence# of 1. A race ends at the record just before a record with sequence#1.

Again, let us not worry about the de-merits of the database design. I have changed the business situation and used races as an analogy.

Here’s the SQL script that explains the solution. Output of the script is below.

/*
A race has many legs. Each race is a collection of sequentially
arranged legs, with sequence running from 1, without any gap.
*/

set echo off
set termout off

create table t_(
    seq number(2) not null,
    strt varchar2(5) not null,
    stop  varchar2(5) not null,
    distance number(3) not null
);

insert into t_ values(1, 'A', 'B', 10);
insert into t_ values(2, 'C', 'D', 20);
insert into t_ values(3, 'E', 'F', 15);
insert into t_ values(1, 'AA', 'BB', 5);
insert into t_ values(2, 'CC', 'DD', 1);
insert into t_ values(3, 'EE', 'FF', 4);
insert into t_ values(4, 'GG', 'HH', 8);
insert into t_ values(1, 'X', 'Y', 25);

set termout on
set echo on
/*
So, in the data below, each race begins at seq=1 and ends just before next 1.
*/
select * from t_;


/*
In each group get the start and last stop and the sum of distances and
the number of legs. Desired output for the above data is:
A  F  45 3
AA FF 18 4
X  Y  25 1
*/

-- STEP I:
-- The pressing problem here is that we don’t have any "race_id" or anything
-- else to group each segment. Using Analytic functions, that can be
-- solved along with our assumption that sequences will have no gaps.
-- Make up groups using rownum. It becomes easy to keep looking back using
-- LAG function.
select    
lag(rownum, seq-1) over (order by 1) grp,
seq, strt, stop, distance
from t_;

select
    -- STEP II:
    -- Now that we have groups from STEP I, use it as
    -- a partition to get the values.
    first_value(strt) over (partition by grp) strt,
    last_value(stop) over (partition by grp) stop,
    sum(distance) over (partition by grp) distance,
    count(*) over (partition by grp) legs,
    last_value(seq) over (partition by grp) legs_method_2    
from (
    -- STEP I:
    select    
    lag(rownum, seq-1) over (order by 1) grp,
    seq, strt, stop, distance
    from t_
);


/*
Putting it all together...
*/
select
    -- STEP III: we don’t want duplicates. Hence distinct.
    distinct
    -- STEP II:
    -- Now that we have groups from STEP I, use it as
    -- a partition to get the values.
    first_value(strt) over (partition by grp) strt,
    last_value(stop) over (partition by grp) stop,
    sum(distance) over (partition by grp) distance,
    count(*) over (partition by grp) legs,
    last_value(seq) over (partition by grp) legs_method_2    
from (
    -- STEP I:
    select    
    lag(rownum, seq-1) over (order by 1) grp,
    seq, strt, stop, distance
    from t_
);

set termout off
drop table t_;
set termout on
set echo off

The spool from the script illustrates the output too.

scott@vbabu>
DOC>A race has many legs. Each race is a collection of sequentially
DOC>arranged legs, with sequence running from 1, without any gap.
DOC>*/

scott@vbabu>/*
DOC>So, in the data below, each race begins at seq=1 and ends just before next 1.
DOC>*/
scott@vbabu>select * from t_;

       SEQ STRT  STOP    DISTANCE
---------- ----- ----- ----------
         1 A     B             10
         2 C     D             20
         3 E     F             15
         1 AA    BB             5
         2 CC    DD             1
         3 EE    FF             4
         4 GG    HH             8
         1 X     Y             25

8 rows selected.

scott@vbabu>/*
DOC>In each group get the start and last stop and the sum of distances and
DOC>the number of legs. Desired output for the above data is:
DOC>A  F  45 3
DOC>AA FF 18 4
DOC>X  Y  25 1
DOC>*/
scott@vbabu>
scott@vbabu>-- STEP I:
scott@vbabu>-- The pressing problem here is that we don’t have any "race_id" or anything
scott@vbabu>-- else to group each segment. Using Analytic functions, that can be
scott@vbabu>-- solved along with our assumption that sequences will have no gaps.
scott@vbabu>-- Make up groups using rownum. It becomes easy to keep looking back using
scott@vbabu>-- LAG function.
scott@vbabu>select
  2  lag(rownum, seq-1) over (order by 1) grp,
  3  seq, strt, stop, distance
  4  from t_;

       GRP        SEQ STRT  STOP    DISTANCE
---------- ---------- ----- ----- ----------
         1          1 A     B             10
         1          2 C     D             20
         1          3 E     F             15
         4          1 AA    BB             5
         4          2 CC    DD             1
         4          3 EE    FF             4
         4          4 GG    HH             8
         8          1 X     Y             25

8 rows selected.

scott@vbabu>
scott@vbabu>select
  2      -- STEP II:
  3      -- Now that we have groups from STEP I, use it as
  4      -- a partition to get the values.
  5      first_value(strt) over (partition by grp) strt,
  6      last_value(stop) over (partition by grp) stop,
  7      sum(distance) over (partition by grp) distance,
  8      count(*) over (partition by grp) legs,
  9      last_value(seq) over (partition by grp) legs_method_2
 10  from (
 11      -- STEP I:
 12      select
 13      lag(rownum, seq-1) over (order by 1) grp,
 14      seq, strt, stop, distance
 15      from t_
 16  );

STRT  STOP    DISTANCE       LEGS LEGS_METHOD_2
----- ----- ---------- ---------- -------------
A     F             45          3             3
A     F             45          3             3
A     F             45          3             3
AA    HH            18          4             4
AA    HH            18          4             4
AA    HH            18          4             4
AA    HH            18          4             4
X     Y             25          1             1

8 rows selected.

scott@vbabu>
scott@vbabu>
scott@vbabu>/*
DOC>Putting it all together...
DOC>*/
scott@vbabu>select
  2      -- STEP III: we don’t want duplicates. Hence distinct.
  3      distinct
  4      -- STEP II:
  5      -- Now that we have groups from STEP I, use it as
  6      -- a partition to get the values.
  7      first_value(strt) over (partition by grp) strt,
  8      last_value(stop) over (partition by grp) stop,
  9      sum(distance) over (partition by grp) distance,
 10      count(*) over (partition by grp) legs,
 11      last_value(seq) over (partition by grp) legs_method_2
 12  from (
 13      -- STEP I:
 14      select
 15      lag(rownum, seq-1) over (order by 1) grp,
 16      seq, strt, stop, distance
 17      from t_
 18  );

STRT  STOP    DISTANCE       LEGS LEGS_METHOD_2
----- ----- ---------- ---------- -------------
A     F             45          3             3
AA    HH            18          4             4
X     Y             25          1             1

3 rows selected.

scott@vbabu>
scott@vbabu>drop table t_;

Table dropped.

  1. good job,good examples,good description

    Posted by: jametong on September 18, 2004 01:38 PM
//-->