Caching in – adventures in Oracle Tuning

Given the job of tuning a problem statement, I’ll usually try to work on it on a lightly used database. More importantly, where practical, I’ll execute the statement twice, one after the other, and use the response time of the second run as my starting point – the timing that my tuning changes need to improve upon.

My reason for doing this is simple – I want to filter out the effects of database caching on performance of this individual statement.

Obviously, this approach tends to work best for relatively short running queries. If you’re lumbered with a 12-hour Batch behemoth, then this isn’t going to be an appealing plan of attack.
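In practice, it's as unsophisticated as it sounds. Taking a hypothetical problem query against an equally hypothetical MY_BIG_TABLE, the drill in SQL*Plus is simply :

set timing on
--
-- First run - this primes the caches
--
SELECT owner, table_name
FROM my_big_table
WHERE last_updated > SYSDATE - 1
/
--
-- Second run - use this elapsed time as the figure to beat
--
SELECT owner, table_name
FROM my_big_table
WHERE last_updated > SYSDATE - 1
/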

When Oracle first executes a statement, it will store it in the Library Cache. This is why the first execution will cause a hard parse whilst the second and subsequent runs will benefit from a soft parse.
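If you want to see this happening, one quick sanity check ( assuming you have access to the V$ views) is to run the statement a couple of times and then take a look at V$SQL – the number of loads into the Library Cache should stay at one while the parse and execution counts climb. The LIKE filter below is just a placeholder for whatever your problem statement happens to look like :

SELECT sql_id, loads, parse_calls, executions
FROM v$sql
WHERE sql_text LIKE 'SELECT /* problem query */%'
/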

In terms of a single execution of a single statement, however, the caching of data blocks in the buffer cache is likely to have a far more significant effect on response time.
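Again, you don't have to take my word for it – your own session's statistics will show the split between reads from disk and reads from the cache ( once more, this assumes SELECT access on the V$ views) :

SELECT sn.name, ms.value
FROM v$mystat ms, v$statname sn
WHERE sn.statistic# = ms.statistic#
AND sn.name IN ('physical reads', 'consistent gets', 'db block gets')
/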

A colleague recently questioned this approach. By serendipitous chance, we were looking into the relative merits of Bulk Collect versus Cursor For Loops at the time.

A Quick History Lesson

Back when Oracle 8i was shiny and new there emerged the exciting new technique of Bulk Binding. Simply put, this new technique facilitated the processing of result sets within PL/SQL rather than the row-by-row processing offered by the traditional Cursor For Loop. The upshot was that expensive context switches between the SQL and PL/SQL engines were minimized and runtimes were consequently much faster.

Overnight, the trusty Cursor For Loop was now an outmoded pariah. All the cool kids were Bulk Collecting.
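For anyone who managed to miss the party, the basic pattern looks something like this – a minimal sketch, with a LIMIT clause thrown in to stop a large result set from eating all of the available memory :

DECLARE
    TYPE typ_source IS TABLE OF all_source%ROWTYPE;
    tbl_source typ_source;
    CURSOR c_source IS SELECT * FROM all_source;
BEGIN
    OPEN c_source;
    LOOP
        -- grab the rows in batches of 1000 rather than one at a time
        FETCH c_source BULK COLLECT INTO tbl_source LIMIT 1000;
        EXIT WHEN tbl_source.COUNT = 0;
        --
        -- process the current batch here
        --
    END LOOP;
    CLOSE c_source;
END;
/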

Fast forward a few years and the picture becomes rather more confusing.
On the one hand, the 10g R2 documentation seems to be recommending the use of Bulk Collect rather than the Cursor For Loop. On the other, the Cursor For Loop has apparently been optimised to operate more like a bulk bind.

Meanwhile, back in the present…

In the meantime, various Oracle gurus are still lining up to extoll the virtues of Bulk Collect.
Incidentally, I think I should make it clear at this point that I don’t regard myself as a guru ( and if you’re a regular reader of this blog, neither will you). Apart from anything else, the pay doesn’t seem that great and, frankly, I could never see myself being comfortable in paper pants.
But enough of this professional navel-gazing, back to the point.

The point is this. If you really want to know the truth of the matter for your application, in your environment, you need to suck it and see.

What I’ve done here is to take one of the many test cases kicking around and amended it to suit my environment.
This example is designed to demonstrate that Bulk Collect is several times faster than a Cursor For Loop.

NOTE - I’ve used 10g XE ( 10.2.0.1.0) to run these examples. Refinements to how a Cursor For Loop operates mean that you’re likely to get different results on 11g, and also on later releases of 10g.
Anyway, onto the code :

set serveroutput on
DECLARE
    TYPE typ_source IS TABLE OF all_source%ROWTYPE;
    tbl_source typ_source := typ_source();
    l_start PLS_INTEGER;
    l_end PLS_INTEGER;
BEGIN
    --
    -- First use a cursor for loop
    --
    l_start := DBMS_UTILITY.GET_TIME;
    FOR r_source IN (SELECT * FROM all_source) LOOP
        tbl_source.EXTEND;
        tbl_source( tbl_source.LAST) := r_source;
    END LOOP;
    l_end := DBMS_UTILITY.GET_TIME;
    DBMS_OUTPUT.PUT_LINE('Conventional (' || tbl_source.COUNT||')'
        ||TO_CHAR( (l_end - l_start)/100) ||' seconds.');
    --
    -- Now try again, this time using BULK COLLECT
    --
    l_start := DBMS_UTILITY.GET_TIME;
    SELECT * BULK COLLECT INTO tbl_source
    FROM all_source;
    l_end := DBMS_UTILITY.GET_TIME;
    DBMS_OUTPUT.PUT_LINE('Bulk Collect (' || tbl_source.COUNT||')'
        ||TO_CHAR( (l_end - l_start)/100) ||' seconds.');
END;
/

Let's run it :

SQL> @cfl_bc.sql
Conventional (154220)3.32 seconds.
Bulk Collect (154220).64 seconds.
PL/SQL procedure successfully completed.

Yep, that looks sort of OK. Five times faster. Look, I'll run it again, just to prove it wasn't a fluke...

SQL> @cfl_bc.sql
Conventional (154220)1.06 seconds.
Bulk Collect (154220).61 seconds.
PL/SQL procedure successfully completed.

Um....have the tuning fairies sneaked in between runs to make the Cursor For Loop several times more efficient ?
OK, let's try that again, only this time, we'll run the Bulk Collect first.
Before we do that we need to clear the caches. The simplest way to do that is just to re-start the database. Obviously, it's a good idea to make sure either that you're the only one on this database or that no-one else is going to get upset if you suddenly bounce it.
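Incidentally, if bouncing the instance isn't an option, 10g does at least offer a couple of alternatives, although flushing these caches on a database that other people are using will make you just as unpopular ( and note that neither clears the file system cache at the OS level, so it's not quite the same as a restart) :

-- needs to be run as a suitably privileged user ( e.g. SYS)
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH SHARED_POOL;

Anyway, once the caches are cleared, here's the re-ordered version :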

set serveroutput on
DECLARE
    TYPE typ_source IS TABLE OF all_source%ROWTYPE;
    tbl_source typ_source := typ_source();
    l_start PLS_INTEGER;
    l_end PLS_INTEGER;
BEGIN
    --
    -- BULK COLLECT first
    --
    l_start := DBMS_UTILITY.GET_TIME;
    SELECT * BULK COLLECT INTO tbl_source
    FROM all_source;
    l_end := DBMS_UTILITY.GET_TIME;
    DBMS_OUTPUT.PUT_LINE('Bulk Collect (' || tbl_source.COUNT||')'
        ||TO_CHAR( (l_end - l_start)/100) ||' seconds.');
    --
    -- Now use a cursor for loop
    --
    l_start := DBMS_UTILITY.GET_TIME;
    FOR r_source IN (SELECT * FROM all_source) LOOP
        tbl_source.EXTEND;
        tbl_source( tbl_source.LAST) := r_source;
    END LOOP;
    l_end := DBMS_UTILITY.GET_TIME;
    DBMS_OUTPUT.PUT_LINE('Conventional (' || tbl_source.COUNT||')'
        ||TO_CHAR( (l_end - l_start)/100) ||' seconds.');
END;
/

And we get :

SQL> @bc_cfl.sql
Bulk Collect (154220)1.02 seconds.
Conventional (154220).93 seconds.
PL/SQL procedure successfully completed.

Hmm, looks like the tuning fairy bunked off for a fag when the Bulk Collect was running. The Cursor For Loop is still whizzing along though.
At this point, we could use tkprof to find out exactly what was going on.... but I'm going to do something quick and dirty ( no, not that). I'm going to take the query that's at the heart of both PL/SQL blocks and run it in isolation, using autotrace.
Wandering off the point for a moment...any Gods of the Oracle Tuning world who happen to have stumbled across this post will probably have their heads in their hands about now. After all, what's wrong with tkprof ? Well, nothing. As a penance, I promise to cover the relative merits of tkprof and autotrace in a separate post. In the meantime, I just want to illustrate what's happening in as simple a way as possible, and autotrace requires minimal messing about in this instance.
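Just for the record ( and very much untested here), the tkprof route would be something along these lines – trace file names and locations will vary from system to system :

-- in the session running the query...
ALTER SESSION SET tracefile_identifier = 'cfl_bc';
ALTER SESSION SET sql_trace = TRUE;

SELECT *
FROM all_source
/

ALTER SESSION SET sql_trace = FALSE;

-- ...then, on the server, format the trace file from user_dump_dest
-- ( the file name below is just an example)
-- tkprof xe_ora_1234_cfl_bc.trc cfl_bc_trace.txt sys=no

Anyway, back to the quick and dirty approach :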

spool autotrace.lis
set lines 132
set autotrace traceonly
set timing on
SELECT *
FROM all_source
/
SELECT *
FROM all_source
/
spool off

I've edited the output here as we're not really interested in the explain plan …

(First execution)
154220 rows selected.
Elapsed: 00:00:07.65
Execution Plan
...
Statistics
----------------------------------------------------------                                                                          
        801  recursive calls                                                                                                        
          0  db block gets                                                                                                          
      40829  consistent gets                                                                                                        
       9516  physical reads                                                                                                         
          0  redo size                                                                                                              
   43171911  bytes sent via SQL*Net to client                                                                                       
     113475  bytes received via SQL*Net from client                                                                                 
      10283  SQL*Net roundtrips to/from client                                                                                      
        826  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
     154220  rows processed                                                                                                         

(Second execution)
154220 rows selected.
Elapsed: 00:00:07.46
Execution Plan
...
Statistics
----------------------------------------------------------                                                                          
          0  recursive calls                                                                                                        
          0  db block gets                                                                                                          
      40634  consistent gets                                                                                                        
          0  physical reads                                                                                                         
          0  redo size                                                                                                              
   43171911  bytes sent via SQL*Net to client                                                                                       
     113475  bytes received via SQL*Net from client                                                                                 
      10283  SQL*Net roundtrips to/from client                                                                                      
        798  sorts (memory)                                                                                                         
          0  sorts (disk)                                                                                                           
     154220  rows processed                                                                                                         
SQL> spool off

As we can see from the Statistics section of the autotrace output, the first ( slower) run has 9516 physical reads. The second run has none. Same query, different number of Physical Reads. Why ?
Well, the second run is benefiting from the fact that the first run has identified all of the required data blocks and loaded them into the buffer cache. Therefore, the second run just has to look in the cache – i.e. in physical memory. Obviously, this is much faster than all that physical I/O, and this is reflected in the response time.
Incidentally, the fact that there are no recursive calls second time around is indicative of a soft parse ( i.e. the statement was already stored in the library cache). This will also have a marginal positive impact on the response time of an individual statement.

The moral of the story ? Simply this – if it's practical, run the query twice before you start tuning.
Oh, and always check that any "guru" assertion is true for you - for your application on your version of your Oracle database on your operating system on your server.
NOTE – Of course, all of this does not mean that Bulk Collect and Cursor For Loops always deliver the same performance...but that's for another post.
