Back in the mists of time, when Broadband was a way of describing a group of fat blokes with guitars, PL/SQL blinked it’s way into the world. It’s purpose was ( and largely remains) to provide the facility to apply 3GL program structures to SQL from within the database ( hence – Procedural Language / SQL).
As an integral part of the Oracle RDBMS, most PL/SQL I/O activities are on database tables. The ability to read and write OS files didn’t arrive until much later.
Meanwhile, back in the present, things are somewhat better on the File Handling front. So, if you just have to generate that flat-file and would rather not muck about with a pre-compiler (or a Java Stored Procedure), PL/SQL will do the job.
The process is pretty simple :
- Tell the database about a directory you want to read to / write from
- Grant the appropriate privileges to allow access to the directory from within the database
- Use the UTL_FILE Oracle supplied package to do the File I/O stuff.
Create a Directory in Oracle
The Directory object in the database is really just a mapping. You tell Oracle what you want to call the directory and what physical location it maps to on the OS.
The user you create the Directory as in Oracle must have the CREATE ANY DIRECTORY privilege.
Another important point to note, whoever you’re connected to the database as, it is the oracle user that will be doing the OS stuff, so you need to make sure that oracle has permissions to the directory you choose to dump your files.
First to create the output directory on the OS :
sudo su – oracle pwd /usr/lib/oracle/xe mkdir db_out_files
Now we just need to tell the database what to call the directory. Back to the database then :
CREATE OR REPLACE DIRECTORY myfiles AS '/usr/lib/oracle/xe/db_out_files/' /
Obviously, it’s quite a good idea to make sure that the directory exists on the OS. Apart from anything else, Oracle won’t check this when the Directory object is created in the database so the first you’ll know that there’s something amiss is when your PL/SQL falls over in a heap.
Permissions quirk on 10g XE
If you do happen to be running on 10gXE, the permissions on UTL_FILE may be set up slightly differently to other Oracle versions ( 10.2 and 11.2 on Linux are the other two I’ve looked at).
To check to see who has permissions on UTL_FILE
SELECT grantee FROM dba_tab_privs WHERE table_name = 'UTL_FILE' AND privilege = 'EXECUTE';
If you want to grant this, you need to connect as SYS AS SYSDBA ( sysoper won’t do) :
connect sys/pwd@db as sysdba grant execute on utl_file to mike;
Privileges to access the directory from the database
To access the Directory, you must either have been granted privileges on it, or have the CREATE ANY DICTIONARY privilege.
To grant privileges on our new directory to hr ( for example) :
GRANT READ, WRITE ON DIRECTORY myfiles TO hr /
Writing a File
In this example, we’re going to create a csv file containing some data from the hr.employees table.
The file handling code itself should be pretty familiar if you’ve ever used something like C.
set serveroutput on size 1000000 -- -- Script to generate csv of Employees name and email addresses -- to the MYFILES directory ( /usr/lib/oracle/xe/db_out_files/) -- DECLARE l_file_handle UTL_FILE.FILE_TYPE; l_buffer VARCHAR2(100); l_count PLS_INTEGER := 0; BEGIN -- -- Using reference notation so you can see the parameter names -- -- Note - you don't have to specify max_linesize - defaults to 256 -- This parameter does dictate the maximum length of the buffer -- when calling UTL_FILE.PUT_LINE. -- l_file_handle := UTL_FILE.FOPEN( location => 'MYFILES', filename => 'emp_em.csv', open_mode => 'W', max_linesize => 100); -- -- Now we've opened the file, write a header row. Using positional notation -- to illustrate the similarity with the C fprintf function -- - fprintf( file_pointer, "string"); -- UTL_FILE.PUT_LINE(l_file_handle, 'First Name'||','||'Last Name'||','||'email'); -- -- Now loop through the employees table and get all the data -- FOR r_emp IN ( SELECT first_name, last_name, email FROM hr.employees) LOOP -- -- Format the data into comma separated values -- l_buffer := r_emp.first_name||','||r_emp.last_name||','||r_emp.email; -- -- Back to passing by reference cos it's just better -- UTL_FILE.PUT_LINE( file => l_file_handle, buffer => l_buffer); -- -- Keep a count of the number of records -- l_count := l_count + 1; END LOOP; UTL_FILE.FCLOSE(l_file_handle); DBMS_OUTPUT.PUT_LINE( l_count||' records written to emp_em.csv.'); END; /
One point to note – if you happen to have multiple files open and you want to close them all at the same time, you can use
If you now check your output directory, you should find emp_mail.csv which contains all of your comma separated records ( plus one row for the header, don’t forget) :
mikes@mikes-laptop:~$ cd/usr/lib/oracle/xe/db_out_file mikes@mikes-laptop:/usr/lib/oracle/xe/db_out_files$ cat emp_em.csv| wc -l 108 mikes@mikes-laptop:/usr/lib/oracle/xe/db_out_files$
Reading a file
This is a fairly similar process to writing. The file has to be in a Directory that Oracle knows about and to which you have privileges on in the database.
To take a completely random example, let’s try uploading a file containing comma separated values into a table containing a list of people and e-mail addresses.
Oh look, we happen to have a file kicking around at the moment that we can use for just that purpose.
CREATE TABLE spam_targets( first_name VARCHAR2(20), last_name VARCHAR2(20), email VARCHAR2(25)) /
At this point, I should state the assumptions I’m making here are that we know what the file format is and the order in which the columns appear in the file.
Let’s make marketing happy …. I’ve left in some DBMS_OUTPUT statements so we can see what’s happening on the command line, but you definitely don’t need them for the actual file I/O stuff.
set serveroutput on size 1000000 -- -- Script to read the emp_em.csv file and insert the contents -- into the SPAM_TARGETS table -- DECLARE l_fp UTL_FILE.FILE_TYPE; l_dir VARCHAR2(30) := 'MYFILES'; l_fname VARCHAR2(30) := 'emp_em.csv'; l_buffer VARCHAR2(100); -- -- Variable for doing the substring thing -- l_start PLS_INTEGER := 1; l_pos PLS_INTEGER; -- -- PL/SQL table to hold the individual data items in each row -- TYPE typ_cols IS TABLE OF VARCHAR2(30) INDEX BY PLS_INTEGER; tbl_cols typ_cols; l_col_no PLS_INTEGER; l_count PLS_INTEGER := 0; BEGIN -- -- Open the file, but this time we want to READ from it -- l_fp := UTL_FILE.FOPEN( location => l_dir, filename => l_fname, OPEN_MODE => 'R'); IF NOT UTL_FILE.IS_OPEN(l_fp) THEN DBMS_OUTPUT.PUT_LINE('Unable to open file.'); RETURN; END IF; -- -- Need to discard the first line because it's got the header record. -- UTL_FILE.GET_LINE(l_fp, l_buffer); DBMS_OUTPUT.PUT_LINE('Got the header : '); DBMS_OUTPUT.PUT_LINE(l_buffer); -- -- Now loop through the rest of the file and insert each record into -- the table -- LOOP BEGIN -- -- Need this nested block -- because we'll get a NO_DATA_FOUND error when we've -- read past the last line of the file and we need to -- handle it here. -- UTL_FILE.GET_LINE( l_fp, l_buffer); IF l_buffer IS NULL THEN -- -- End of the file so exit the loop -- EXIT; END IF; -- -- Split the line back into it's component data items -- l_col_no := 1; l_start := 1; WHILE INSTR( l_buffer, ',', 1,l_col_no) != 0 LOOP -- -- Loop through the line to get all the values -- Don't really need a PL/SQL table for 3 columns, but if there's -- a lot more it may well save you some typing -- l_pos := INSTR( l_buffer, ',', 1, l_col_no); tbl_cols( l_col_no) := SUBSTR( l_buffer, l_start, l_pos - l_start); l_start := l_pos + 1; l_col_no := l_col_no + 1; END LOOP; -- -- Get the last value on the line ( it won't have a ',' at the end) -- l_start := l_pos + 1; tbl_cols(l_col_no) := SUBSTR( l_buffer, l_start); -- -- Now insert the record -- INSERT INTO spam_targets( first_name, last_name, email) VALUES( tbl_cols(1), tbl_cols(2), tbl_cols(3)); l_count := l_count + 1; DBMS_OUTPUT.PUT_LINE('Insert done - '||l_count); EXCEPTION WHEN NO_DATA_FOUND THEN -- -- We've read past the end of the file -- EXIT; END; END LOOP; UTL_FILE.FCLOSE( l_fp); END; /
If we now check the table ( in the same session – remember, we haven’t commited yet) :
SQL> SELECT COUNT(*) FROM spam_targets; COUNT(*) ---------- 107 SQL> commit; Commit complete. SQL>
Often, when loading data from a file into a database, the process is to load it into a holding table and then do any processing/validation to load it into the application tables themselves.
If this is the case, then it’s usually quite simple to define everything as a VARCHAR2 and do the conversion to dates etc. once the data is safely in the database. Incidentally, there are some alternative methods of uploading the data here, if you require something a bit more tailored to the volume of data you’re processing.
Look, it could’ve been worse – we could have uploaded the phone numbers and subjected innocent people to a call from a war-dialler masquerading as Stephen Hawking. “For the last time, no, I’m not interested in owning my own, Limited Edition Higgs-Boson thankyou very much!”