Oracle External Tables and the External Modify Clause

I like to think that I’m not completely useless in the kitchen. A pinch of this, a dash of that and a glug of what you fancy ( which may or may not make it’s way into whatever I’m cooking) and the result is usually edible at least.
That said, the combination of precise quantities of substances by means of closely controlled chemical reactions is more Deb’s forte.
The result is usually delicious. Being traditionalists in our house, we do like to follow the time-honoured bake-off format and have a judging session of the finished article. We think of it as the Great British Cake Scoff.
However satisfying the act of culinary creation may be, there are times when you just want something you need to stick in the microwave for 2 minutes.
Which brings us to the matter of Oracle External Tables.

When they first arrived, External Tables provided an easy way to load data from a file directly into the database without all that messing about with SQL*Loader.
Of course, there were some limitations. If you wanted to point an external table at a file, you’d have to issue an alter table statement to set it ( and possibly the directory as well).
This meant that External Table access had to be serialized to ensure that it was pointed at the correct file until you were finished with it.
If you find yourself switching between Oracle versions, it’s worth remembering that, these days, things are a little different, thanks to the EXTERNAL MODIFY, which arrived in 12c.

What I’ll be looking at here is whether External tables can now be used concurrently in different sessions, accessing different files.

I’ll also explore the EXTERNAL MODIFY clause’s aversion to bind variables and how we might work around this securely in PL/SQL.

The Application

The examples that follow were run on an Oracle Developer Day Virtual Box instance running Oracle 19.3.

We have a directory…

create or replace directory recipies_dir as '/u01/app/recipies';

…which contains some text files…

ls -1
debs_lemon_drizzle_cake.txt
mikes_beans_on_toast.txt

…and a simple external table to read files…

create table nom_nom_xt 
(
    line number,
    text varchar2(4000)
)
organization external
(
    type oracle_loader
    default directory upload_files
    access parameters 
    (
        records delimited by newline
        nologfile
        nobadfile
        nodiscardfile
        fields terminated by '~'
        missing field values are null
        (
            line recnum,
            text char(4000)
        )
    ) 
    location('')
)
reject limit unlimited
/

You’ll notice that I’ve specified the default directory as this is mandatory. However the location (i.e. the target file) is currently null.
Now, in the olden days, we’d have to issue a DDL statement to set the location before we could look at a file.
Since 12c however, we have the EXTERNAL MODIFY clause, so we can do this directly in a query :

select text   
from nom_nom_xt 
    external modify 
    ( 
        default directory recipies_dir 
        location('mikes_beans_on_toast.txt')
    )
/


Alternatively…

select text
from nom_nom_xt 
    external modify
    ( 
        location(recipies_dir:'debs_lemon_drizzle_cake.txt')
    )
/

After running these statments, we can see that the EXTERNAL MODIFY clause has had no effect on the table definition itself :

select directory_name, location
from user_external_locations
where table_name = 'NOM_NOM_XT';

Concurrency

Looking at the EXTERNAL MODIFY clause, it would seem that External Tables should now behave like Global Temporary Tables in that, whilst their structure is permanent, the data they contain is session specific.

Let’s put that to the test.
First of all, I’m going to take advantage of the fact I’m on Linux ( Oracle Linux Server 7.6 since you ask) and generate a text file from /usr/share/dict/words – a file that contains a list of words.

In the recipies directory on the os :

for i in {1..100}; do cat /usr/share/dict/words >>alphabet_soup.txt; done
cat alphabet_soup.txt >alphabetty_spaghetti.txt

I now have two rather chunky text files :

ls -lh alphabet*
-rw-r--r--. 1 oracle oinstall 473M May 27 14:25 alphabet_soup.txt
-rw-r--r--. 1 oracle oinstall 473M May 27 14:26 alphabetty_spaghetti.txt

…containing just under 48 million lines each…

cat alphabet_soup.txt |wc -l
47982800

Using the single external table, I’m going to load each file into a separate table in separate sessions.

The script for session 1 is :

set worksheetname Soup
column start_time format a10
column end_time format a10

-- Check that this is a different session from "session 2"
select sys_context('userenv', 'sessionid') from dual;

-- Give me time to switch sessions and start the other script
exec dbms_session.sleep(2);

select to_char(sysdate, 'HH24:MI:SS') as start_time from dual;

set timing on
create table alphabet_soup as
select *
from nom_nom_xt external modify( default directory recipies_dir location('alphabet_soup.txt'));

set timing off

select to_char(sysdate, 'HH24:MI:SS') as end_time from dual;

select count(*) from alphabet_soup;

In session 2 :

set worksheetname Spaghetti
column start_time format a10
column end_time format a10

-- Check that this is a different session from "session 1"
select sys_context('userenv', 'sessionid') from dual;

select to_char(sysdate, 'HH24:MI:SS') as start_time from dual;

set timing on
create table alphabetty_spaghetti as
select *
from nom_nom_xt external modify( default directory recipies_dir location('alphabetty_spaghetti.txt'));

set timing off
select to_char(sysdate, 'HH24:MI:SS') as end_time from dual;

select count(*) from alphabetty_spaghetti;

Note – the set worksheetname command is SQLDeveloper specific.

The results are…

Session 1 (Soup)

SYS_CONTEXT('USERENV','SESSIONID')                                                                                                                                                                                                                              
------------------------------------
490941


PL/SQL procedure successfully completed.


START_TIME
----------
14:45:08


Table ALPHABET_SOUP created.

Elapsed: 00:01:06.199

END_TIME
----------
14:46:15


COUNT(*)
----------
47982800

Session 2 (Spaghetti)

SYS_CONTEXT('USERENV','SESSIONID')                                                                                                                                                                                                                              
-----------------------------------
490942


START_TIME
----------
14:45:09


Table ALPHABETTY_SPAGHETTI created.

Elapsed: 00:01:08.043

END_TIME
----------
14:46:17


COUNT(*)
----------
47982800

As we can see, the elapsed time is almost identical in both sessions. More importantly, both sessions’ CTAS statements finished within a couple of seconds of each other.
Therefore, we can conclude that both sessions accessed the External Table in parallel.

Whilst this does represent a considerable advance in the utility of External Tables, there is something of a catch when it comes to using them to access files via SQL*Plus…

Persuading EXTERNAL MODIFY to eat it’s greens

A common use case for External Tables tends to be ETL processing. In such circumstances, the name of the file being loaded is likely to change frequently and so needs to be specified at runtime.
It’s also not unusual to have an External Table that you want to use on more than one directory ( e.g. as a log file viewer).
On the face of it, the EXTERNAL MODIFY clause should present no barrier to use in PL/SQL :

clear screen
set serverout on 
begin
    for r_line in 
    (
        select text
        from nom_nom_xt external modify ( default directory recipies_dir location ('debs_lemon_drizzle_cake.txt') )
    )
    loop
        dbms_output.put_line(r_line.text);
    end loop;
end;
/

Whilst this works with no problems, look what happens when we try to use a variable to specify the filename :

clear screen
set serverout on 
declare
    v_file varchar2(4000) := 'debs_lemon_drizzle_cake.txt';
begin
    for r_line in 
    (
        select text
        from nom_nom_xt 
            external modify 
            ( 
                default directory recipies_dir 
                location (v_file) 
            )
    )
    loop
        dbms_output.put_line(r_line.text);
    end loop;
end;
/
ORA-06550: line 7, column 90: 
PL/SQL: ORA-00905: missing keyword

Specifying the directory in a variable doesn’t work either :

declare
    v_dir varchar2(4000) := 'recipies_dir';
begin
    for r_line in 
    (
        select text
        from nom_nom_xt 
            external modify 
            ( 
                default directory v_dir 
                location ('debs_lemon_drizzle_cake.txt') 
            )
    )
    loop
        dbms_output.put_line(r_line.text);
    end loop;
end;
/

ORA-06564: object V_DIR does not exist

Just in case you’re tempted to solve this by doing something simple like :

clear screen
set serverout on 
declare
    v_dir all_directories.directory_name%type := 'recipies_dir';
    v_file varchar2(100) := 'mikes_beans_on_toast.txt';
    v_stmnt clob := 
        q'[
            select text
            from nom_nom_xt external modify( default directory <dir> location('<file>'))
        ]';
    v_rc sys_refcursor;
    v_text varchar2(4000);
begin
    v_stmnt := replace(replace(v_stmnt, '<dir>', v_dir), '<file>', v_file);
    open v_rc for v_stmnt;
    loop
        fetch v_rc into v_text;
        exit when v_rc%notfound;
        dbms_output.put_line(v_text);
    end loop;
    close v_rc;
end;
/

You should be aware that this approach is vulnerable to SQL Injection.
I know that it’s become fashionable in recent years for “Security” to be invoked as a reason for all kinds of – often questionable – restrictions on the hard-pressed Software Engineer.
So, just in case you’re sceptical about this, here’s a quick demo :

clear screen
set serverout on 
declare
    v_dir varchar2(500) := 
        q'[recipies_dir location ('mikes_beans_on_toast.txt')) union all select username||' '||account_status||' '||authentication_type from dba_users --]';
    v_file varchar2(100) := 'mikes_beans_on_toast.txt';
    v_stmnt varchar2(4000) := q'[select text from nom_nom_xt external modify (default directory <dir> location ('<file>'))]';
    v_rc sys_refcursor;
    v_text varchar2(4000);
begin
    v_stmnt := replace(replace(v_stmnt, '<dir>', v_dir), '<file>', v_file);
    open v_rc for v_stmnt;
    loop
        fetch v_rc into v_text;
        exit when v_rc%notfound;
        dbms_output.put_line(v_text);
    end loop;
    close v_rc;
end;
/

bread
baked beans
butter
SYS OPEN PASSWORD
SYSTEM OPEN PASSWORD
XS$NULL EXPIRED & LOCKED PASSWORD
HR OPEN PASSWORD

...snip...

PL/SQL procedure successfully completed.

If we resort to Dynamic SQL, we can pass the filename into the query as a bind variable :

set serverout on
declare
    v_file varchar2(4000) := 'debs_lemon_drizzle_cake.txt';
    v_stmnt clob := 
        'select text from nom_nom_xt external modify( default directory recipies_dir location (:v_file))';
    v_rc sys_refcursor;
    v_text varchar2(4000);
    v_rtn number;
begin
    open v_rc for v_stmnt using v_file;
    loop
        fetch v_rc into v_text;
        exit when v_rc%notfound;
        dbms_output.put_line( v_text);
        
    end loop;
    close v_rc;
end;
/

…or, if you prefer…

clear screen
set serverout on
declare
    v_file varchar2(120) := 'debs_lemon_drizzle_cake.txt';
    v_stmnt clob := q'[select text from nom_nom_xt external modify( default directory recipies_dir location (:b1))]';
    v_rc sys_refcursor;
    v_text varchar2(4000);
    
    v_curid number;
    v_rtn number;
begin
    v_curid := dbms_sql.open_cursor;
    dbms_sql.parse(v_curid, v_stmnt, dbms_sql.native);
    
    dbms_sql.bind_variable(v_curid, 'b1', v_file);
    
    v_rtn := dbms_sql.execute(v_curid);
    
    v_rc := dbms_sql.to_refcursor(v_curid);
    loop
        fetch v_rc into v_text;
        exit when v_rc%notfound;
        dbms_output.put_line( v_text);
    end loop;
    close v_rc;
end;
/

225g unsalted butter
225g caster sugar
4 free-range eggs
225g self-raising flour
1 unwaxed lemon
85g icing sugar


PL/SQL procedure successfully completed.

However, Oracle remains rather recalcitrant when you try doing the same with the default directory.

RTFM RTOB ( Read the Oracle Base article) !

After a number of “glugs” from a bottle of something rather expensive whilst trawling through the Oracle Documentation for some clues, I happened to look at the Oracle Base article on this topic which notes that you cannot use bind variables when specifying the Default Directory.

One possible workaround would be to create one external table for each directory that you want to look at.

Alternatively, we can sanitize the incoming value for the DEFAULT DIRECTORY before we drop it into our query.

To this end, DBMS_ASSERT is not going to be much help.
The SQL_OBJECT_NAME function does not recognize Directory Objects…

select dbms_assert.sql_object_name('recipies_dir') from dual;

ORA-44002: invalid object name

… and the SIMPLE_SQL_NAME function will allow pretty much anything if it’s quoted…

select dbms_assert.simple_sql_name(
q'["recipies_dir location('mikes_beans_on_toast.txt')) union all select username from dba_users --"]')
from dual;

DBMS_ASSERT.SIMPLE_SQL_NAME(Q'["RECIPIES_DIRLOCATION('MIKES_BEANS_ON_TOAST.TXT'))UNIONALLSELECTUSERNAMEFROMDBA_USERS--"]')
---------------------------------------------------------------------------
"recipies_dir location('mikes_beans_on_toast.txt')) union all select username from dba_users --"

Time then, to unplug the microwave and cook up something home-made…

I’m running on 19c so I know that :

Additionally, I’m going to assume that we’re following Oracle’s recommendation that quoted identifiers are not used for database object names (including Directories). You can find that pearl of wisdom in the page linked above.

Finally, I want to make sure that a user only accesses a valid directory object on which they have appropriate permissions.

Something like this should get us most of the way :

set serverout on
clear screen
declare
    v_dir varchar2(4000) := 'recipies_dir';
    v_file varchar2(4000) := 'debs_lemon_drizzle_cake.txt';
    v_stmnt clob := 
        'select text from nom_nom_xt external modify( default directory <dir> location (:v_file))';
    v_rc sys_refcursor;
    v_text varchar2(4000);
    v_rtn number;
    
    v_placeholder pls_integer;
    v_found_dir boolean;
    cursor c_valid_dir is
        select null
        from all_directories 
        where directory_name = upper(v_dir);
begin
    if length( v_dir) > 128 then
        raise_application_error(-20101, 'Directory Identifier is too long');
    end if;
    -- Assume allowable characters are alphanumeric and underscore. Reject if it contains anything else
    if regexp_instr(replace(v_dir, '_'), '[[:punct:]]|[[:space:]]') > 0 then
        raise_application_error(-20110, 'Directory Name contains wacky characters');
    end if;    
    open c_valid_dir;
    fetch c_valid_dir into v_placeholder;
    v_found_dir := c_valid_dir%found;
    close c_valid_dir;
    if v_found_dir = false then
        raise_application_error(-20120, 'Directory not found');
    end if;    
    v_stmnt := replace(v_stmnt, '<dir>', v_dir);
    open v_rc for v_stmnt using v_file;
    loop
        fetch v_rc into v_text;
        exit when v_rc%notfound;
        dbms_output.put_line( v_text);
        
    end loop;
    close v_rc;
end;
/

We can now convert this into an Invoker’s rights package, that should restrict access to directories visible by the calling user :

create or replace package peckish
    authid current_user
as
    type t_nom is table of varchar2(4000);
    
    procedure validate_directory(i_dir in varchar2);

    function recipe( i_dir in varchar2, i_file in varchar2)
        return t_nom pipelined;
        
end peckish;        
/

create or replace package body peckish as

    procedure validate_directory( i_dir in varchar2)
    is
        v_placeholder pls_integer;
        v_found_dir boolean;
        cursor c_valid_dir is
            select null
            from all_directories 
            where directory_name = upper(i_dir);
    begin
        if length( i_dir) > 128 then
            raise_application_error(-20101, 'Directory Identifier is too long');
        end if;
    
        if regexp_instr(replace(i_dir, '_'), '[[:punct:]]|[[:space:]]') > 0 then
            raise_application_error(-20110, 'Directory Name contains wacky characters');
        end if;    
        open c_valid_dir;
        fetch c_valid_dir into v_placeholder;
        v_found_dir := c_valid_dir%found;
        close c_valid_dir;
        if v_found_dir = false then
            raise_application_error(-20120, 'Directory not found');
        end if;    
    end validate_directory;
    
    function recipe( i_dir in varchar2, i_file in varchar2)
        return t_nom pipelined
    is
        v_nom nom_nom_xt%rowtype;
        v_stmnt clob := 
            'select line, text from nom_nom_xt external modify( default directory <dir> location (:v_file))';
        v_rc sys_refcursor;
        v_text varchar2(4000);
       
    begin
        validate_directory(i_dir);
        v_stmnt := replace(v_stmnt, '<dir>', i_dir);            
        open v_rc for v_stmnt using i_file;
        loop
            fetch v_rc into v_nom.line, v_nom.text;
            exit when v_rc%notfound;
            pipe row( v_nom);
        end loop;
        close v_rc;
    end recipe;
end peckish;
/

Let’s run some tests :

select line as line_no, text as ingredient 
from table(peckish.recipe('recipies_dir', 'debs_lemon_drizzle_cake.txt'))
/


LINE_NO INGREDIENT
---------- ----------------------------------------
1 225g unsalted butter
2 225g caster sugar
3 4 free-range eggs
4 225g self-raising flour
5 1 unwaxed lemon
6 85g icing sugar

6 rows selected.

select text as ingredient
from table (
peckish.recipe(
'this_is_a_very_long_identifier_to_check_that_the_length_restriction_works_as_expected._Is_that_128_characters_yet_?_Apparently_not_Oh_well_lets_keep_going_for_a_bit',
'mikes_beans_on_toast.txt'
))
/

ORA-20101: Directory Identifier is too long



select text as ingredient
from table(
peckish.recipe
(
q'["recipies_dir location('mikes_beans_on_toast.txt')) union all select username from dba_users --"]',
'debs_lemon_drizzle_cake.txt'
))
/

ORA-20110: Directory Name contains wacky characters


select text as ingredient
from table (peckish.recipe('super_secret_dir', 'mikes_beans_on_toast.txt'))
/

ORA-20120: Directory not found

All of which has left me feeling rather in the mood for a snack. I wonder if there’s any of that cake left ?

Author: mikesmithers

Back in 1993, I discovered that I could get paid money for doing fun stuff with computers. Over the years, I've specialised in Oracle Databases as a developer, a DBA and sometimes, an architect. It's my evil alter-ego - The Antikyte - who writes a blog about my various technical adventures. Yes, that is his Death Star parked in the Disabled Bay. I currently live in the South-West of England with Deb, my long-suffering wife.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.