Top-N Queries in Oracle – SQL at the sharp-end of the season

It’s coming up to that time of year once more. Games are running out, permutations are being permutated, bottoms are beginning to squeak.

The question on everyone’s lips ( well, Simon’s actually) is, who will fill the Play-Off places in the Conference ?

For those of you not acquainted with the lower echelons of English Professional Football ( or Soccer, if you really must), The Conference is the fifth tier of English football and the pinnacle of non-League. Each year, the winners gain automatic promotion to the promised land of the Football League with those finishing second to fifth playing-off for the right to join them.

All of which gives me the perfect opportunity to forsake my old habits and bring my SQL bang up-to-date.

The (league) Table

Often, I’ll find myself trying to get details of the latest record in a table, using the synthetic key :

SELECT some_data
FROM some_table
WHERE id = ( SELECT  MAX(id) FROM some_table)
/

Top of the pile or RANK and file

Obviously, that's all well and good when you're dealing with a unique value, but what happens in the hurly burly of the league run-in ….let's start with some test data :


CREATE TABLE league_table (
    team VARCHAR2(30),
    played NUMBER(2),
    goal_diff NUMBER(3),
    points NUMBER(3))
/

DECLARE
    PROCEDURE ins( p_team VARCHAR2, p_pld NUMBER, p_gd NUMBER, p_pts NUMBER) IS
        BEGIN
            INSERT INTO league_table( team, played, goal_diff, points)
            VALUES( p_team, p_pld, p_gd, p_pts);
        END ins;
BEGIN
    ins('HISTON', 38, -32, 27);
    ins('EASTBOURNE B', 39, -35, 29);
    ins('FOREST GREEN', 40, -21, 38);
    ins('HAYES', 40, -30, 39);
    ins('ALTRINCHAM', 40, -35, 40);
    ins('SOUTHPORT', 39, -8, 41);
    ins('TAMWORTH', 40, -16, 43);
    ins('BARROW', 39, -12, 43);
    ins('CAMBRIDGE', 40, -7, 44);
    ins('GATESHEAD', 37,3,49);
    ins('MANSFIELD', 36,3,49);
    ins('KETTERING', 40,-7,50);
    ins('BATH CITY', 38,-2,52);
    ins('RUSHDEN', 38,7,53);
    ins('GRIMSBY', 38,9,53);
    ins('NEWPORT', 39,5,54);
    ins('DARLINGTON', 37,15,55);
    ins('YORK', 38,5,61);
    ins('FLEETWOOD', 40,16,63);
    ins('KIDDERMINSTER', 39, 15, 65);
    ins('WREXHAM', 40,14,70);
    ins('LUTON', 37, 36, 70);
    ins('WIMBLEDON', 40, 27, 74);
    ins('CRAWLEY', 38, 50, 87);
END;
/

And now, let's see whose top...

SELECT team, played, points, goal_diff 
FROM league_table t1 
WHERE t1.points = ( 
	SELECT MAX(t2.points) 
	FROM league_table t2) 
/

OK, at the moment, we'll only get one record back ( Crawley are running away with the league this year), but let's indulge Simon's fantasy for a moment ( no, not that one !) :

UPDATE league_table
SET points = 87
WHERE team = 'LUTON'
/

Now when we re-run the query we get :

TEAM			 PLAYED     POINTS  GOAL_DIFF
-------------------- ---------- ---------- ----------
LUTON			     37 	87	   36
CRAWLEY 		     38 	87	   50

Now, in the olden days, you'd have to try something like this to get the correct order ( when teams are level on points, the one with the highest goal difference is top) :

SELECT team, played, points, goal_diff 
FROM ( 
    SELECT team, played, points, goal_diff 
    FROM league_table 
    ORDER BY points DESC, goal_diff DESC) 
WHERE rownum = 1 
/

Not exactly a thrilling prospect. You need to make sure that the rownum reference happens outside of the sub-query as otherwise, the rownum will get applied before the sort and who knows what result you'll get. Instead, let's try that swanky new RANK analytical function :

SELECT team, played, points, goal_diff 
FROM 
    ( 
        SELECT team, played, points, goal_diff, 
            RANK() OVER (ORDER BY points desc, 
                goal_diff DESC) as position 
        FROM league_ table 
    ) 
WHERE position = 1 
/

OK, you're thinking, it all looks very flash, but what's wrong with sticking with good old rownum ?
Well, to answer that one, we need to shatter Simon's illusions once more and look at the grim reality of who is actually in the play-offs. Having rolled back our update, Luton are now back on 70 points. Now, let's see if we can persuade rownum to tell us who is in the play-off picture...

SELECT team, played, points, goal_diff 
FROM ( 
    SELECT team, played, points, goal_diff 
    FROM league_table 
    ORDER BY points DESC, goal_diff DESC) 
WHERE rownum BETWEEN 2 AND 5 
/

no rows selected

Er...apparently rownum hasn't kept up-to-date with the comparatively modern concept of play-offs in the Conference.
In order to calm Simon's nerves a little, I think we need to go back to RANK :


SELECT team, played, points, goal_diff 
FROM ( 
    SELECT team, played, points, goal_diff, 
        RANK() OVER( ORDER BY points DESC, goal_diff DESC) as position 
    FROM league_table) 
WHERE position BETWEEN 2 AND 5 
/

TEAM			 PLAYED     POINTS  GOAL_DIFF
-------------------- ---------- ---------- ----------
WIMBLEDON		     40 	74	   27
LUTON			     37 	70	   36
WREXHAM 		     40 	70	   14
KIDDERMINSTER		     39 	65	   15

We'll they're not top, but at least Luton look fairly solid for a play-off place.

About these ads

One thought on “Top-N Queries in Oracle – SQL at the sharp-end of the season

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s