Oracle Batch Job Logging – a framework for domestic harmony

Like most men, I have a standard of tidiness and cleanliness that I think of as “Bloke Clean”.
Deb’s standards are rather higher ( she would say normal). The difference can occasionally be a source of tension.
“Maybe I’ll just run off with that Steven Feuerstein bloke !”, she may have said during one of our discussions about the state of the study.
“Oh really ?”, I might retort, “and what’s he got that I haven’t ?”,
“Money, fame, talent and his own framework…not to mention a cleaner, I’ll bet”.
“Well…at least I have more hair”, I might say, disconcerted by her surprisingly comprehensive knowledge of someone who, it must be said, is not exactly famous outside of the wonderful world of Oracle.
“Not by much.” would probably have been the devastating reply.
Predictably, a compromise has now been reached…the upshot of which is that I’ve just spent the afternoon becoming intoxicated by the fumes from various cleaning products…and the study is now gleaming and all the papers filed away…and we’re getting a cleaner.
As for the money, fame and talent…well, I’ll just have to make do with the framework for now.
Truth be told, calling it a framework is overstating things a bit. But hey, it does give me an excuse to come up with a (possibly) amusing name.

The problem to be addressed

Imagine that we have an application that has various batch jobs running overnight. These jobs may be initiated by a Scheduled Job to call a database packaged procedure or even ( and yes, this does still happen), a user hitting a button on a screen before leaving for the night.
Once started, our overnight job will run to completion…or crash…or suddenly run very slowly without warning.

Of course, there may well be times when you are up at 3am and can see what is happening to a problematic job in real-time.
However, some of us need our beauty sleep. What we need is something that we can look at to see :

  • what is happening in real-time
  • how long each component of the job took to run
  • and optionally, a trace file to see why a job might be misbehaving

To solve this particular problem, I present to you the Monitoring Oracle Runs OverNight framework (MORON).
Yes, I know a recursive acronym would be far more clever and geeky…but simple things please simple minds.

The Batch Process

The Cleaning Fairy Job runs overnight. It does a lookup on a table to see which rooms in the house need cleaning and then performs various tasks in each room.

The table looks like this :

CREATE TABLE rooms(
    room_name VARCHAR2(30),
    clean_interval NUMBER(1),
    last_cleaned DATE)
/


INSERT INTO rooms(room_name, clean_interval, last_cleaned)
VALUES('KITCHEN', 1, NULL)
/

INSERT INTO rooms(room_name, clean_interval, last_cleaned)
VALUES('LOUNGE', 2, NULL)
/

INSERT INTO rooms(room_name, clean_interval, last_cleaned)
VALUES('BATHROOM', 3, TRUNC(SYSDATE -3))
/

INSERT INTO rooms(room_name, clean_interval, last_cleaned)
VALUES('BEDROOM', 5, NULL)
/

COMMIT;

Or, if you prefer…

SQL> SELECT *
  2  FROM rooms;

ROOM_NAME                      CLEAN_INTERVAL LAST_CLEA
------------------------------ -------------- ---------
KITCHEN                                     1
LOUNGE                                      2
BATHROOM                                    3 01-OCT-12
BEDROOM                                     5

SQL>

The cleaning fairy is invoked every evening by a call to the following package :

CREATE OR REPLACE PACKAGE cleaning_fairy_pkg AS
    PROCEDURE clean_house_pr;
END cleaning_fairy_pkg;
/

CREATE OR REPLACE PACKAGE BODY cleaning_fairy_pkg AS

PROCEDURE clean_sink_pr( i_room IN rooms.room_name%TYPE) IS
BEGIN
    --
    -- Nice and quick...
    --
    DBMS_LOCK.SLEEP(60);
END clean_sink_pr;

PROCEDURE clean_bath_pr( i_room IN rooms.room_name%TYPE) IS
BEGIN
    --
    -- Takes a bit longer 
    --
    DBMS_LOCK.SLEEP(300);
END clean_bath_pr;

PROCEDURE hoover_pr( i_room IN rooms.room_name%TYPE) IS
BEGIN
    --
    -- Depending on which room...
    --
    IF i_room = 'BEDROOM' THEN
        DBMS_LOCK.SLEEP(60);
    ELSE
        --
        -- Lounge is a bit bigger
        --
        DBMS_LOCK.SLEEP(120);
    END IF;
END hoover_pr;

PROCEDURE change_sheets_pr( i_room IN rooms.room_name%TYPE) IS
BEGIN
    --
    -- have you spilt something on this ?
    --
    DBMS_LOCK.SLEEP(300);
END change_sheets_pr;

PROCEDURE wash_floor_pr( i_room IN rooms.room_name%TYPE) IS
BEGIN
    --
    -- Ah, that pine fresh smell !
    --
    DBMS_LOCK.SLEEP(500);
END wash_floor_pr;

PROCEDURE clean_house_pr IS
--
-- This is the main entry-point for the package.
--
BEGIN
    FOR r_room IN (
        SELECT room_name
        FROM rooms
        WHERE TRUNC(NVL( last_cleaned, SYSDATE - (clean_interval + 1))) < 
            TRUNC(SYSDATE - clean_interval)
        ORDER BY 1)
    LOOP
        --
        -- For each room, call the task routine to carry out the tasks
        --
        IF r_room.room_name = 'BATHROOM' THEN
            clean_sink_pr( i_room => r_room.room_name);
            clean_bath_pr( i_room => r_room.room_name);
        ELSIF r_room.room_name = 'BEDROOM' THEN
            hoover_pr( i_room => r_room.room_name);
            change_sheets_pr( i_room => r_room.room_name);
        ELSIF r_room.room_name = 'KITCHEN' THEN
            wash_floor_pr( i_room => r_room.room_name);
            clean_sink_pr( i_room => r_room.room_name);
        ELSIF r_room.room_name = 'LOUNGE' THEN
            hoover_pr( i_room => r_room.room_name);
        END IF;
    END LOOP;
END clean_house_pr;
END cleaning_fairy_pkg;
/

As you can see, the package won’t run the same set of procedures every evening – remember, certain people think the Kitchen should be cleaned EVERY DAY, whilst the bedroom can go five days between visits from the Cleaning Fairy.
So, what we need now is a framework that we can apply to this, or any other batch process to record run information…

The MORON Framework

That last heading sounds like it should be the title of a Will Ferrell spoof of the Bourne Identity. I’m beginning to regret this acronym.
Anyway, now to introduce the components of the framework.

The tables

First, we need a table to determine which jobs we are going to log and a means of toggling both logging and tracing for each job.
Remember, we also want to be able to identify the session that the job is running in ( just in case we have any insomniac DBAs lying around), as well as an easy way to identify any trace file we generate…

CREATE TABLE moron_programs(
    module_name VARCHAR2(64),
    tracefile_identifier VARCHAR2(100) NOT NULL,
    log_flag VARCHAR2(1) DEFAULT 'N' NOT NULL ,
    trace_flag VARCHAR2(1) DEFAULT 'N' NOT NULL,
    CONSTRAINT mp_mod_name_pk PRIMARY KEY( module_name))
/

COMMENT ON COLUMN moron_programs.module_name IS
    'The name of the Job we are monitoring - column definition matches v$session.module.'
/

COMMENT ON COLUMN moron_programs.tracefile_identifier IS
    'The name to be prefixed to the tracefile generated in the USER_DUMP_DEST if tracing is enabled for this module_name.'
/

COMMENT ON COLUMN moron_programs.log_flag IS
    'If Y(es) then record progress of the run in the RUN_LOGS table. Default is N(o)'
/

COMMENT ON COLUMN moron_programs.trace_flag IS 
    'If Y(es) then trace the session. Default is N(o)'
/

Next, we need a table to hold details of which sub-processes the job has run; when they were run; and how long they ran for :

CREATE TABLE moron_logs(
    log_id NUMBER PRIMARY KEY,
    log_date TIMESTAMP NOT NULL,
    username VARCHAR2(30) NOT NULL,
    mp_module_name VARCHAR2(64) NOT NULL,
    action VARCHAR2(32) NOT NULL,
    start_time NUMBER NOT NULL,
    end_time NUMBER)
/

COMMENT ON COLUMN moron_logs.log_id IS
    'Synthetic Key for table entries. Populated using sequence MORON_LOG_ID_SEQ.'
/

COMMENT ON COLUMN moron_logs.log_date IS
    'Timestamp for this record.'
/

COMMENT ON COLUMN moron_logs.username IS
    'The user running this process.'
/

COMMENT ON COLUMN moron_logs.mp_module_name IS
    'Foreign Key to MORON_PROGRAMS.MODULE_NAME'
/

COMMENT ON COLUMN moron_logs.action IS
    'Description of the process ( usually a sub-program name). Column definition matches V$SESSION.ACTION'
/

COMMENT ON COLUMN moron_logs.start_time IS
    'The start time as per the output of DBMS_UTILITY.GET_TIME'
/

COMMENT ON COLUMN moron_logs.start_time IS
    'The end time as per the output of DBMS_UTILITY.GET_TIME'
/

And the sequence …

CREATE SEQUENCE moron_log_id_seq
    START WITH 1
    INCREMENT BY 1
    NOCYCLE
/

The package

Now we come to the heart of the monitoring framework. We need a PL/SQL package to do the following :

  • Record the start of the run and start tracing the session if required
  • For each sub-process, update the V$SESSION module and action columns
  • Record the start and end of each run in the RUN_LOGS table
  • Stop tracing when finished
  • As an added bonus – include a function to return the duration of each record logged in hours, minutes and seconds
CREATE OR REPLACE PACKAGE moron_pkg AS
    --
    -- Global variables used in the package
    --
    g_rl_id moron_logs.log_id%TYPE;
    g_module_name VARCHAR2(32);
    g_logging VARCHAR2(1) := 'N';
    g_trace VARCHAR2(1) := 'N';
    g_trace_on BOOLEAN := FALSE;

    PROCEDURE start_trace_pr( i_identifier VARCHAR2);
    PROCEDURE stop_trace_pr;
    PROCEDURE set_module_pr( i_module_name VARCHAR2);
    PROCEDURE set_action_pr( i_action_name VARCHAR2, i_start BOOLEAN);
    FUNCTION get_formatted_time_fn( i_start PLS_INTEGER, i_end PLS_INTEGER)
        RETURN VARCHAR2;
END moron_pkg;
/

CREATE OR REPLACE PACKAGE BODY moron_pkg AS

PROCEDURE start_trace_pr( i_identifier VARCHAR2) IS
--
-- Called from set_module_pr if the module in question has the 
-- trace flag set.
--
    l_cmd VARCHAR2(2000);
    PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    EXECUTE IMMEDIATE 'ALTER SESSION SET TIMED_STATISTICS=TRUE';
    l_cmd := 'ALTER SESSION SET TRACEFILE_IDENTIFIER='||i_identifier;
    EXECUTE IMMEDIATE l_cmd;
    EXECUTE IMMEDIATE 'ALTER SESSION SET SQL_TRACE=TRUE';
END start_trace_pr;

PROCEDURE stop_trace_pr IS
--
-- Check to see if we are tracing. If so, then stop.
-- Then unset the module and action for the session.
--
    PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    IF g_trace_on THEN
        EXECUTE IMMEDIATE 'ALTER SESSION SET SQL_TRACE=FALSE';
        g_trace_on := FALSE;
    END IF;
    --
    -- Unset the module name and action
    --
    DBMS_APPLICATION_INFO.SET_MODULE(
        module_name => NULL, action_name => NULL);
END stop_trace_pr;

PROCEDURE set_module_pr( i_module_name VARCHAR2) IS
--
-- Called at the start of a monitoring session to set the module
-- in V$SESSION.
--
    l_identifier moron_programs.tracefile_identifier%TYPE;
    CURSOR c_mprog( cp_module VARCHAR2) IS
        SELECT NVL(log_flag, 'N'), NVL( trace_flag, 'N'), tracefile_identifier
        FROM moron_programs
        WHERE module_name = UPPER(cp_module);
BEGIN
    OPEN c_mprog( i_module_name);
    FETCH c_mprog INTO g_logging, g_trace, l_identifier;
    IF c_mprog%NOTFOUND THEN
        --
        -- no logging or tracing required
        --
        g_logging := 'N';
        g_trace := 'N';
    END IF;
    CLOSE c_mprog;
    DBMS_APPLICATION_INFO.SET_MODULE(
        module_name => UPPER(i_module_name), action_name => NULL);
    --
    -- Set the module name package variable for reference later in the run
    --
    g_module_name := UPPER(i_module_name);
    IF g_trace = 'Y' AND NOT g_trace_on THEN
        --
        -- we need to trace the session
        --
        start_trace_pr( i_identifier => NVL(l_identifier, i_module_name));
        g_trace_on := TRUE;
    END IF;
END set_module_pr;

PROCEDURE set_action_pr( i_action_name VARCHAR2, i_start BOOLEAN) IS
--
-- Called at the start and end of each step.
-- When starting, add a record to the log table.
-- When ending, update the previously added log table record with the
-- end time.
--
    PRAGMA AUTONOMOUS_TRANSACTION;
    l_time NUMBER;
BEGIN
    --
    -- Check to see if we're logging
    --
    IF g_logging = 'N' THEN
        RETURN;
    END IF;
    --
    -- get the start time
    --
    l_time := DBMS_UTILITY.GET_TIME;
    IF i_start THEN
        --
        -- This is the beginning of this action, not the end so
        -- update the action column in V$SESSION
        --
        DBMS_APPLICATION_INFO.SET_ACTION( action_name => i_action_name);
        --
        -- Get the id we're using - this is the 11g way.
        -- If you're running 10g then you'll need to use
        -- SELECT moron_log_id_seq.NEXTVAL INTO g_rl_id FROM dual;
        --
        g_rl_id := moron_log_id_seq.NEXTVAL;
        --
        -- Now insert a record into the RUN_LOGS table
        --
        INSERT INTO moron_logs(
            log_id, log_date, username,
            mp_module_name, action, start_time)
        VALUES(
            g_rl_id, SYSDATE, USER,
            g_module_name, i_action_name, l_time);
    ELSE
        --
        -- This is an update to an existing action - i.e. just applying an end time
        --
        UPDATE moron_logs
        SET end_time = l_time
        WHERE log_id = g_rl_id;
        --
        -- Cleardown the Package variable and the V$SESSION action as this record is now complete
        --
        DBMS_APPLICATION_INFO.SET_MODULE(module_name => g_module_name, action_name => NULL);
        g_rl_id := NULL;
    END IF;
    COMMIT;
END set_action_pr;

FUNCTION get_formatted_time_fn( i_start PLS_INTEGER, i_end PLS_INTEGER)
--
-- Function to return the difference between two DBMS_UTLITY.GET_TIME
-- values in format HH:MI:SS.SS
--
    RETURN VARCHAR2 IS
    l_elapsed PLS_INTEGER;
    l_huns PLS_INTEGER;
    l_secs PLS_INTEGER;
    l_mins PLS_INTEGER;
    l_hrs PLS_INTEGER;
    l_formatted_time VARCHAR2(20);
BEGIN
    l_elapsed := i_end - i_start;
    l_huns := MOD( l_elapsed, 100);
    l_secs := FLOOR(MOD( l_elapsed, 6000)/ 100);
    l_mins := FLOOR(FLOOR( MOD( l_elapsed , 360000))/6000);
    l_hrs := FLOOR(l_elapsed / 360000);
    l_formatted_time := LPAD(l_hrs, 2, '0')||':'||LPAD( l_mins, 2, '0')||':'||LPAD( l_secs, 2, '0')||'.'||LPAD( l_huns,2,'0');
    RETURN l_formatted_time;
END get_formatted_time_fn;
END moron_pkg;
/

Implementing the monitoring

Now to get our cleaning fairy package to utilise the framework.
First of all, we need to insert a record for it in the MORON_PROGRAMS table. To start with we want to have both logging and tracing turned on :

INSERT INTO moron_programs(
    module_name, tracefile_identifier, log_flag, trace_flag)
VALUES(
    'CLEANING_FAIRY_PKG', 'cleaning_fairy', 'Y', 'Y')
/
COMMIT;

Next, we need to add calls to the framework from the program we want to monitor.
To start the monitoring process :

moron_pkg.set_module_pr( i_module_name => module_to_monitor);

…where module_to_monitor is the name of the module.
The procedure will then set the value of v$session.module.
If tracing is required, then the procedure will enable tracing in the session.

When starting a step in the module, we need to call :

moron_pkg.set_action_pr( i_action_name => the_action, i_start => TRUE);

…where the_action is the name of the action we’re monitoring.
This call will only do anything if the log_flag is set to ‘Y’ in MORON_PROGRAMS.
If it is then the following will happen :

  • a record is inserted into the MORON_LOGS table which includes the current DBMS_UTILITY.GET_TIME value
  • the v$session.action value is set

Once the action is completed, the MORON_LOGS record is updated with the end time. This is done with the call :

moron_pkg.set_action_pr( i_action_name => the_action, i_start => FALSE);

After all of the actions in the module have been completed, a final call to the framework is made to stop tracing and cleardown the v$session values :

moron_pkg.stop_trace_pr;

Remember, the actions taken by the monitoring package are dictated by the values set in the MORON_PROGRAMS table for the module we’re running.

Now, CLEANING_FAIRY_PK has a single entry-point – the CLEAN_HOUSE_PR procedure. Therefore, we just need to add calls to our logging package there …

...
PROCEDURE clean_house_pr IS
--
-- This is the main entry-point for the package.
--
BEGIN
    moron_pkg.set_module_pr( i_module_name => 'CLEANING_FAIRY_PKG');
    FOR r_room IN (
        SELECT room_name
        FROM rooms
        WHERE TRUNC(NVL( last_cleaned, SYSDATE - (clean_interval + 1))) < 
            TRUNC(SYSDATE - clean_interval)
        ORDER BY 1)
    LOOP
        --
        -- For each room, call the task routine to carry out the tasks
        --
        IF r_room.room_name = 'BATHROOM' THEN
            moron_pkg.set_action_pr('CLEAN_SINK_PR - BATHROOM', TRUE);
            clean_sink_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('CLEAN_SINK_PR - BATHROOM', FALSE);                     
            moron_pkg.set_action_pr('CLEAN_BATH_PR', TRUE); 
            clean_bath_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('CLEAN_BATH_PR', FALSE);        
        ELSIF r_room.room_name = 'BEDROOM' THEN
            moron_pkg.set_action_pr('HOOVER_PR - BEDROOM', TRUE);
            hoover_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('HOOVER_PR - BEDROOM', FALSE);
            moron_pkg.set_action_pr('CHANGE_SHEETS_PR', TRUE);
            change_sheets_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('CHANGE_SHEETS_PR', FALSE);
        ELSIF r_room.room_name = 'KITCHEN' THEN
            moron_pkg.set_action_pr('WASH_FLOOR_PR', TRUE);
            wash_floor_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('WASH_FLOOR_PR', FALSE);
            moron_pkg.set_action_pr('CLEAN_SINK_PR - KITCHEN', TRUE);
            clean_sink_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('CLEAN_SINK_PR - KITCHEN', FALSE);
        ELSIF r_room.room_name = 'LOUNGE' THEN
            moron_pkg.set_action_pr('HOOVER_PR - LOUNGE', TRUE);
            hoover_pr( i_room => r_room.room_name);
            moron_pkg.set_action_pr('HOOVER_PR - LOUNGE', FALSE);
        END IF;
    END LOOP;
    moron_pkg.stop_trace_pr;
END clean_house_pr;
...

The test run

Here we go :

SELECT userenv('sessionid') FROM dual;

USERENV('
---------
100105

SQL> 

BEGIN
    cleaning_fairy_pkg.clean_house_pr;
END;
/

First of all, let’s see what we can find in V$SESSION :

SELECT audsid, module, action
FROM v$session
WHERE audsid = 100105
/

MODULE			       ACTION 
------------------------------ ------------------------------ 
CLEANING_FAIRY_PKG	       CLEAN_BATH_PR 

SQL> 

At this point, I’ve been called away to do a little light dusting. Coming back, I can see how we’re getting on by looking at the RUN_LOGS table :

SELECT TO_CHAR(log_date, 'HH24:MI') "Time Started",
  action,
  moron_pkg.get_formatted_time_fn( start_time, end_time) Duration
FROM moron_logs
WHERE TRUNC(log_date) = TRUNC(SYSDATE)
AND mp_module_name = 'CLEANING_FAIRY_PKG'
ORDER BY log_id;

Time Start ACTION			  DURATION 
---------- ------------------------------ -------------------- 
21:28	   CLEAN_SINK_PR - BATHROOM	  00:01:00.10 
21:29	   CLEAN_BATH_PR		  00:05:00.04 
21:34	   HOOVER_PR - BEDROOM		  00:01:00.02 
21:35	   CHANGE_SHEETS_PR		  ::. 

SQL> 

Once the run is completed, if I want to review exactly what happened, I can run the same query :

Time Started	ACTION			       DURATION 
--------------- ------------------------------ ------------------------------ 
21:28		CLEAN_SINK_PR - BATHROOM       00:01:00.10 
21:29		CLEAN_BATH_PR		       00:05:00.04 
21:34		HOOVER_PR - BEDROOM	       00:01:00.02 
21:35		CHANGE_SHEETS_PR	       00:05:00.03 
21:40		WASH_FLOOR_PR		       00:08:20.04 
21:48		CLEAN_SINK_PR - KITCHEN        00:01:00.01 
21:49		HOOVER_PR - LOUNGE	       00:02:00.02 

7 rows selected. 

SQL> 

If I want to take a more in-depth look, I can skip over to the USER_DUMP_DEST and find my trace file. First, find out where the USER_DUMP_DEST is in SQL*Plus :

SQL> show parameter user_dump_dest 

NAME				     TYPE	 VALUE 
------------------------------------ ----------- ------------------------------ 
user_dump_dest			     string	 /u01/app/oracle/diag/rdbms/xe/ 
						 XE/trace 
SQL>

Next, find the trace file :

cd /u01/app/oracle/diag/rdbms/xe/XE/trace
ls -l *CLEANING_FAIRY.trc

-rw-r----- 1 oracle dba 70550 Oct  8 22:03 XE_ora_2933_CLEANING_FAIRY.trc 

OK, so that’s the framework sorted. All I need now is for the lottery program to finally come up trumps. What’s that dear ?
Yes, I’d love to go and tidy up the shed whilst I’m waiting.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s