New Dog, Old Tricks – how to save yourself some typing with sed

We have a new addition to our household –

Teddy


Cute and fluffy he may be, but he’s got to earn his keep. He can start making himself useful by helping me with this post.

It begins one Friday afternoon when an urgent request lands on my desk with a large splat.

The requirement is that some CSV files be uploaded into the Oracle 11g Database serving the UAT environment to facilitate some testing.
There are around 20 files, each with a slightly different set of attributes.
The files are currently sitting on the Red Hat Linux server hosting the database.
I have sufficient OS permissions on the server to move them to a directory that has a corresponding database object in the UAT instance.
Nevertheless, the thought of having to knock out 20-odd external tables to read these files might leave me feeling a bit like this…


Fortunately, a certain Lee E. McMahon had the foresight to predict the potential risk to my weekend and wrote the Stream Editor (sed) program.

The File

The file I'm using as an example, emps.dat, is sitting on the database server, and its contents are :

employee_id,first_name,last_name,email,salary,commission%
145,"John","Russell","JRUSSEL",14000,0.4
146,"Karen","Partners","KPARTNER",13500,0.3
147,"Alberto","Errazuriz","AERRAZUR",12000,0.3
148,"Gerald","Cambrault","GCAMBRAU",11000,0.3
149,"Eleni","Zlotkey","EZLOTKEY",10500,0.2
150,"Peter","Tucker","PTUCKER",10000,0.3
151,"David","Bernstein","DBERNSTE",9500,0.25
152,"Peter","Hall","PHALL",9000,0.25
153,"Christopher","Olsen","COLSEN",8000,0.2
154,"Nanette","Cambrault","NCAMBRAU",7500,0.2
155,"Oliver","Tuvault","OTUVAULT",7000,0.15
156,"Janette","King","JKING",10000,0.35
157,"Patrick","Sully","PSULLY",9500,0.35
158,"Allan","McEwen","AMCEWEN",9000,0.35
159,"Lindsey","Smith","LSMITH",8000,0.3
160,"Louise","Doran","LDORAN",7500,0.3
161,"Sarath","Sewall","SSEWALL",7000,0.25
162,"Clara","Vishney","CVISHNEY",10500,0.25
163,"Danielle","Greene","DGREENE",9500,0.15
164,"Mattea","Marvins","MMARVINS",7200,0.1
165,"David","Lee","DLEE",6800,0.1
166,"Sundar","Ande","SANDE",6400,0.1
167,"Amit","Banda","ABANDA",6200,0.1
168,"Lisa","Ozer","LOZER",11500,0.25
169,"Harrison","Bloom","HBLOOM",10000,0.2
170,"Tayler","Fox","TFOX",9600,0.2
171,"William","Smith","WSMITH",7400,0.15
172,"Elizabeth","Bates","EBATES",7300,0.15
173,"Sundita","Kumar","SKUMAR",6100,0.1
174,"Ellen","Abel","EABEL",11000,0.3
175,"Alyssa","Hutton","AHUTTON",8800,0.25
176,"Jonathon","Taylor","JTAYLOR",8600,0.2
177,"Jack","Livingston","JLIVINGS",8400,0.2
179,"Charles","Johnson","CJOHNSON",6200,0.1

The database object for this directory is :

select directory_name
from dba_directories
where directory_path = '/u01/app/oracle/myfiles'
/

DIRECTORY_NAME
------------------------------
MYFILES

Building the External Table DDL

As you can see from the file, external table columns can be taken from the header record, with a couple of tweaks :

  1. all columns will be defined as varchar2(4000)
  2. the “%” in “commission%” needs to be replaced with “_pct” to make the column name legal in Oracle

First up then, we get sed to replace each of the commas in the header row…

head -1 emps.dat | sed 's/,/ varchar2(4000),/g'

employee_id varchar2(4000),first_name varchar2(4000),last_name varchar2(4000),email varchar2(4000),salary varchar2(4000),commission%

The s command means "substitute string1 with string2".
The g flag means "apply the substitution to every match on the line, not just the first".
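The difference the g flag makes is easy to demonstrate with a quick echo:

```shell
# without g: only the first comma on the line is replaced
echo "a,b,c" | sed 's/,/|/'

# with g: every comma on the line is replaced
echo "a,b,c" | sed 's/,/|/g'
```

The first command prints `a|b,c`, the second `a|b|c`.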

We can use the same technique to replace the “%” signs…

$ head -1 emps.dat|sed s/%/_pct/g
employee_id,first_name,last_name,email,salary,commission_pct
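The two substitutions can also be chained in a single sed invocation using -e, which is what the script below does. A quick sketch with an echoed sample rather than the file:

```shell
# apply both edits in one pass: expand the commas, then fix the "%"
echo 'salary,commission%' | sed -e 's/,/ varchar2(4000),/g' -e 's/%/_pct/g'
```

This prints `salary varchar2(4000),commission_pct`.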

This means that we now have the basis for a simple shell script to do the External Table creation legwork for us…

#!/bin/bash
# Script to generate an external table based on the .dat file supplied as $1
# (bash rather than sh, since the script relies on echo -e / echo -n)
baseFname=$(basename -s .dat "$1")
tsuff=_xt
tname=$baseFname$tsuff
fname=$tname.sql
echo "create table $tname (" >"$fname"
echo -n "$(head -1 "$1")" | sed -e 's/,/ varchar2(4000),\n/g' -e 's/%/_pct/g' >>"$fname"
echo " varchar2(4000))" >>"$fname"
echo "organization external ( type oracle_loader" >>"$fname"
echo "default directory MYFILES access parameters (" >>"$fname"
echo -e "\t records delimited by newline" >>"$fname"
echo -e "\t\t badfile '$baseFname.bad'" >>"$fname"
echo -e "\t\t logfile '$baseFname.log'" >>"$fname"
echo -e "\t\t skip 1" >>"$fname"
echo -e "\t\t fields terminated by ',' (" >>"$fname"
echo -ne "\t\t\t" >>"$fname"
echo -n "$(head -1 "$1")" | sed -e 's/,/ char(4000),\n/g' -e 's/%/_pct/g' >>"$fname"
echo -e " char(4000)" >>"$fname"
echo -e "\t\t)" >>"$fname"
echo -e "\t ) location ( '$1' )" >>"$fname"
echo -e ") reject limit unlimited;" >>"$fname"

Running this for our file…

. ./gentxt.sh emps.dat

… gives us a complete External Table definition in a file called emps_xt.sql …

create table emps_xt (
employee_id varchar2(4000),
first_name varchar2(4000),
last_name varchar2(4000),
email varchar2(4000),
salary varchar2(4000),
commission_pct varchar2(4000))
organization external ( type oracle_loader
default directory MYFILES access parameters (
	 records delimited by newline
		 badfile 'emps.bad'
		 logfile 'emps.log'
		 skip 1
		 fields terminated by ',' (
			employee_id char(4000),
first_name char(4000),
last_name char(4000),
email char(4000),
salary char(4000),
commission_pct char(4000)
		)
	 ) location ( 'emps.dat' )
) reject limit unlimited;

OK, the formatting could use some work. However, the SQL itself is valid…

SQL> @emps_xt.sql

Table created.

With the external table in place, we can now upload the data from the file…
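For example, assuming a permanent EMPLOYEES table with matching columns already exists (the target table name here is hypothetical), the load is a simple insert…select from the external table :

```
insert into employees
    ( employee_id, first_name, last_name, email, salary, commission_pct)
select employee_id, first_name, last_name, email, salary, commission_pct
from emps_xt
/
```

Oracle will implicitly convert the varchar2 external table columns to the target column types, although explicit conversions (and stripping the quotes from the quoted fields) may be needed depending on the target table's definition.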

Hours of drudgery have been avoided, which is just as well because someone gets a bit grumpy when they don’t get their walkies !
