So, the World Cup is in full swing.
Now that the lesser teams have fallen by the wayside (England), we can get on with enjoying a feast of footie.
As well as a glut of goals, the current tournament has given us a salutary reminder of the importance of diet for elite athletes.
After predictably (and brilliantly) destroying England single-handedly, Luis Suárez found himself a bit peckish and nipped out for an Italian. Now the whole world seems to be commenting on his eating habits.
Like Luis, you may find yourself thinking that you’ve bitten off more than you can chew when confronted by DBMS_DATAPUMP.
The documentation does offer some help…to an extent. However, the whole thing can seem a bit fiddly, especially if you’re used to the more traditional command-line interface for Datapump.
What follows is a tour through DBMS_DATAPUMP based on my own (sometimes painful) experience, broken down into bite-sized chunks.
Much of the functionality to filter object types and even data is common to both Exports and Imports.
So, the approach I’ve taken is to cover the Export first, with a view to finally producing a Full Database export.
I’ve then used the Import process against this to demonstrate some of the package’s filtering capabilities.
So, what’s on the menu today ?
- Privileges required to run a DBMS_DATAPUMP job from your current schema and for the whole database
- Running a consistent export
- Running datapump jobs in the background
- Monitoring running jobs
- Importing from one schema to another
- Specifying the types of objects to include in Exports and Imports
- Specifying subsets of data
- DDL only Jobs
- How to Kill a Datapump Job
The full code examples have all been written and tested on Oracle XE 11gR2.
I’ve tried to maximise the use of in-line hard-coded values and minimise the number of variables in an attempt to make the code easier to follow.
Also, in these examples I’ve made use of the default DATA_PUMP_DIR directory object, but you can use any directory object to which you have the appropriate privileges.
For dessert, there are a couple of other DBMS_DATAPUMP features that I have found useful, which are specific to Enterprise Edition (in one case, with the Partitioning Option) :
- Including specific table partitions
- Parallel processing
So, a fair bit to get through then. I hope you have an appetite…
Privileges required for using DBMS_DATAPUMP
Obviously, the first thing you need is execute privileges on DBMS_DATAPUMP. By default, this is granted to PUBLIC, although the more security conscious DBAs will have instituted more rigorous controls over just who gets access to this package.
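If your user hasn’t got it, the grant itself is a one-liner, run by a suitably privileged user (granting to HR here purely as an example) :

grant execute on sys.dbms_datapump to hr
/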
The other thing you’ll need is READ and WRITE privileges on a database Directory Object.
DATA_PUMP_DIR
The DATA_PUMP_DIR directory object is created by default (but any old Directory Object will do)…
select directory_path
from dba_directories
where directory_name = 'DATA_PUMP_DIR'
/

DIRECTORY_PATH
--------------------------------------------------------------------------------
/u01/app/oracle/admin/XE/dpdump/
The HR user, for example, would need to be granted…
grant read, write on directory data_pump_dir to hr
/
That’s pretty much it. With just these privileges (along with those for creating/altering the relevant object types), you can run DBMS_DATAPUMP to export/import objects and data in your current schema.
Once again, assuming we’re connected as HR :
set serveroutput on size unlimited
declare
    --
    -- Just export the current schema (HR)
    --
    l_dph number; -- The datapump job handle
    l_status varchar2(4000); -- terminating status of the job
begin
    -- create the job...
    l_dph := dbms_datapump.open
    (
        operation => 'EXPORT',
        job_mode => 'SCHEMA', -- schema level export will use the current user by default
        job_name => 'HR_SCHEMA_EXPORT' -- appropriate job name
    );
    -- Specify the name and location of the export dump file we want to create...
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'hr_export.dmp',
        directory => 'DATA_PUMP_DIR', -- can use any database directory object
        filetype => dbms_datapump.ku$_file_type_dump_file,
        reusefile => 1 -- if this file already exists, overwrite it
    );
    -- ...and a log file to track the progress of the export
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'hr_export.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    -- Kick off the export
    dbms_datapump.start_job( handle => l_dph);
    -- ...and wait for the job to complete
    dbms_datapump.wait_for_job( handle => l_dph, job_state => l_status);
    dbms_output.put_line('Job done - status '||l_status);
end;
/
It’s probably worth noting that, as a rule, a DBMS_DATAPUMP job consists of four main steps.
First DBMS_DATAPUMP.OPEN creates a kind of container to hold the definition of the job.
Next, you specify any particular requirements for the job. This must always include reference to a dump file.
Then, use DBMS_DATAPUMP.START_JOB to kick-off the job you’ve defined.
Finally, you can either wait for the job to finish (as in this case), or leave it to run in the background. We’ll come onto background execution in a bit.
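Boiled down to those four steps, the skeleton of pretty much every script in this post looks something like this (the names here are placeholders rather than anything you need to use) :

declare
    l_dph number; -- the job handle
    l_status varchar2(4000);
begin
    -- 1. create the job "container"
    l_dph := dbms_datapump.open( operation => 'EXPORT', job_mode => 'SCHEMA', job_name => 'MY_JOB');
    -- 2. specify the job's requirements - at the very least, a dump file
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_job.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file
    );
    -- 3. kick the job off
    dbms_datapump.start_job( handle => l_dph);
    -- 4. either hang around for it to finish...
    dbms_datapump.wait_for_job( handle => l_dph, job_state => l_status);
    -- ...or detach and leave it running in the background
    -- dbms_datapump.detach( handle => l_dph);
end;
/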
Roles required for Full Export/Import
If you want the God-Like powers to export/import the entire database then you will need the roles :
- DATAPUMP_EXP_FULL_DATABASE
- DATAPUMP_IMP_FULL_DATABASE
These roles are granted to the DBA role and the SYS user by default.
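If you’re not using a DBA account, a suitably privileged user can grant them directly (MIKE here being just an example user) :

grant datapump_exp_full_database, datapump_imp_full_database to mike
/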
Incidentally, the fact that roles play quite a significant part in DBMS_DATAPUMP privileges means that, if you do intend to wrap this functionality in a package, it would probably be wise to create it with Invoker Rights – e.g.
create or replace package my_datapump_package
    authid current_user
as
...
Now that’s all sorted…
set serveroutput on size unlimited
declare
    --
    -- Do a no-nonsense full export
    --
    l_dph number;
    l_status varchar2(4000);
begin
    l_dph := dbms_datapump.open
    (
        operation => 'EXPORT',
        job_mode => 'FULL',
        job_name => 'MIKE_FULL_EXPORT'
    );
    -- Specify the dump file we're going to create
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file,
        reusefile => 1
    );
    -- ... and a log file
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    -- Now press the big red shiny button...
    dbms_datapump.start_job( handle => l_dph);
    -- wait around for a bit...
    dbms_datapump.wait_for_job( handle => l_dph, job_state => l_status);
    dbms_output.put_line('Job done - status '||l_status);
end;
/
Pretty much the same as with the schema export we’ve already done. However, this time, we’ve specified the job_mode as FULL rather than SCHEMA.
With this script saved as full_exp_simple.sql…
SQL> @full_exp_simple.sql
Job done - status COMPLETED

PL/SQL procedure successfully completed.

SQL>
Depending on the size of your database (and the capacity of the server you’re running on), this may take a while.
When it’s all finished, you can wander over to the DATA_PUMP_DIR on the OS and you’ll see the fruit of your labours…
cd /u01/app/oracle/admin/XE/dpdump
ls -l my_full_export.*
-rw-r----- 1 oracle dba 321515520 Jun 18 19:55 my_full_export.dmp
-rw-r--r-- 1 oracle dba     84957 Jun 18 19:55 my_full_export.log
If we look at the logfile, we can see it starts like this :
Starting "MIKE"."MIKE_FULL_EXPORT":
Estimate in progress using BLOCKS method...
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 349.9 MB
Processing object type DATABASE_EXPORT/TABLESPACE
...
… and ends with …
. . exported "SYSTEM"."SQLPLUS_PRODUCT_PROFILE"    0 KB    0 rows
Master table "MIKE"."MIKE_FULL_EXPORT" successfully loaded/unloaded
******************************************************************************
Dump file set for MIKE.MIKE_FULL_EXPORT is:
  /u01/app/oracle/admin/XE/dpdump/my_full_export.dmp
Job "MIKE"."MIKE_FULL_EXPORT" successfully completed at 19:55:33
We now have an export. If, like me, you’ve just run this from a database where you know that there are no other active sessions, then you should have a nice consistent export. Unfortunately, this isn’t always the case…
Making your Export Read Consistent…and other tweaks
To address this issue of consistency, we’ll need to change our script a bit. Whilst we’re at it, it would also be nice to have some record in the logfile of when the job started so we can work out how long it actually ran for.
Finally, we’d like to be able to run the export in the background so we don’t have to hang around waiting for it.
Consistent = Y – the DATAPUMP equivalent
Those of you old enough to remember the original export utility will recall that you could ensure that the data in an export was referentially consistent by the simple expedient of specifying consistent = Y in the exp command.
The equivalent in DBMS_DATAPUMP is to specify a value for FLASHBACK_SCN.
NOTE – in order for this to work, your database must be running in ARCHIVELOG mode. This is especially relevant if you’re playing along on XE, which runs in NOARCHIVELOG by default.
To check the current status of archiving on the database :
select log_mode
from v$database
/

LOG_MODE
------------
ARCHIVELOG
If the query comes back NOARCHIVELOG then you need to enable archiving.
To do this, you need to connect as SYSDBA, shutdown and then mount the database, before starting archiving and then re-opening the database.
These steps can be achieved as follows once connected to the database as sysdba :
shutdown immediate;
startup mount;
alter database archivelog
/
alter database open
/
NOTE – these steps are intended for when you’re messing around in your own database (e.g. XE on a laptop, like I am here). If you’re in a controlled environment, then you need to get your friendly neighbourhood DBA to do the honours.
Assuming the database is now in archivelog mode, the next thing we need to do is find an SCN that we can pass to DBMS_DATAPUMP to use as the reference point for what we mean by consistent.
The SCN – System Change Number – is incremented every time a transaction completes in the database.
A quick demonstration is probably in order….
SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    2774525

SQL> create table snacks( snack_name varchar2(30), description varchar2(4000));

Table created.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    2774562

SQL> insert into snacks( snack_name, description)
  2  values( 'CHIELLINI', 'Lite bite')
  3  /

1 row created.

SQL> commit;

Commit complete.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    2774581

SQL> drop table snacks purge;

Table dropped.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    2774608

SQL>
The SCN increments by more than one between the statements in this session because transactions are also being completed by the Oracle background processes on my database.
Anyway, if we do want to find the current SCN, or even an SCN for a given time and date, we can simply use the SQL TIMESTAMP_TO_SCN function :
select timestamp_to_scn( systimestamp)
from dual;

TIMESTAMP_TO_SCN(SYSTIMESTAMP)
------------------------------
                       2647783

SQL>
In order to pass this information to DBMS_DATAPUMP, we need to use the SET_PARAMETERS procedure.
This should do the trick :
...
dbms_datapump.set_parameter
(
    handle => the_datapump_job_handle,
    name => 'FLASHBACK_SCN',
    value => timestamp_to_scn( systimestamp)
);
...
Adding a message to the logfile
Next on our to-do list to improve our export is a message in the logfile recording when the export job started.
Oh look, there’s a package member called LOG_ENTRY. I wonder if that will help …
...
dbms_datapump.log_entry
(
    handle => the_datapump_job_handle,
    message => 'Job starting at '||to_char(sysdate, 'HH24:MI:SS')
);
...
Incidentally, we could include the SCN we’re specifying in the message as well ( although it would need to be captured in a variable before use in both the SET_PARAMETER call above and the LOG_ENTRY call).
One point to note – any entry in the logfile that results from a call to this procedure always starts with “;;;”.
Running the job in the background
Just in case you don’t fancy the idea of hanging around waiting for the job to finish, you can replace the call to WAIT_FOR_JOB with this…
...
dbms_datapump.detach( handle => the_datapump_job_handle);
...
The New, Improved Datapump Export Script
If we now apply all of these changes to our original script, it should look something like ….
set serveroutput on size unlimited
declare
    l_dph number;
    l_scn number; -- The SCN from which the export will run
begin
    -- setup the job
    l_dph := dbms_datapump.open
    (
        operation => 'EXPORT',
        job_mode => 'FULL',
        job_name => 'MIKE_FULL_EXPORT'
    );
    -- define the files
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file,
        reusefile => 1
    );
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    --
    -- Specify the SCN number to make sure that the export
    -- is a consistent copy
    --
    l_scn := timestamp_to_scn(systimestamp);
    dbms_datapump.set_parameter
    (
        handle => l_dph,
        name => 'FLASHBACK_SCN',
        value => l_scn
    );
    -- log the start time of the job
    dbms_datapump.log_entry
    (
        handle => l_dph,
        message => 'Job starting at '||to_char(sysdate, 'HH24:MI:SS')||' for SCN '||l_scn
    );
    -- start the job
    dbms_datapump.start_job( handle => l_dph);
    -- ...and leave it to run
    dbms_datapump.detach( handle => l_dph);
end;
/
Monitoring the job
The USER_DATAPUMP_JOBS view (there is also a DBA_ version available) contains details of any currently defined datapump jobs.
SQL> select job_name, state, attached_sessions
  2  from user_datapump_jobs;

JOB_NAME                       STATE                          ATTACHED_SESSIONS
------------------------------ ------------------------------ -----------------
MIKE_FULL_EXPORT               EXECUTING                                      1

SQL>
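If you’re after a rough idea of overall progress rather than the current object, Datapump jobs also show up in V$SESSION_LONGOPS while they’re running. The OPNAME should match the job name – that’s an assumption worth verifying on your version – so something like this should give a percentage complete :

select opname, target_desc, sofar, totalwork,
    round(( sofar / totalwork) * 100) as pct_done
from v$session_longops
where opname = 'MIKE_FULL_EXPORT'
and totalwork > 0
and sofar != totalwork
/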
Additionally, during the export, Datapump creates a master table in the job owner’s schema with the same name as the job (it’s cleaned up automatically when the job completes).
In this table, you can see which object (if any) it’s currently working on…
select object_schema, object_name, work_item,
    to_char(last_update, 'hh24:mi:ss') last_updated
from mike_full_export
where state = 'EXECUTING'
and object_schema is not null
and object_name is not null
/
There is an example of how to get “real-time” updates in the Oracle Documentation.
Unfortunately, it relies on DBMS_OUTPUT so the messages don’t get echoed to the screen until after the job is completed.
The simplest way to find out what’s happening right now is to check the logfile.
On Linux, for example, you could simply do this :
cd /u01/app/oracle/admin/XE/dpdump tail -f my_full_export.log
If you haven’t got command line access to the database server, or simply prefer to keep everything in the database, then you could just create an external table based on the logfile. After all, you already have the required privileges on the directory …
create table datapump_log_xt
(
    line number,
    text varchar2(4000)
)
organization external
(
    type oracle_loader
    default directory data_pump_dir
    access parameters
    (
        records delimited by newline
        nologfile
        fields terminated by whitespace
        (
            line recnum,
            text position(1:4000)
        )
    )
    location('')
)
reject limit unlimited
/
If we want to check progress, we can simply “tail” the file like this :
alter table datapump_log_xt location ('my_full_export.log')
/

select text
from datapump_log_xt
where line > ( select max(line) - 10 from datapump_log_xt)
order by line;

TEXT
--------------------------------------------------------------------------------
Processing object type DATABASE_EXPORT/CONTEXT
Processing object type DATABASE_EXPORT/SCHEMA/PUBLIC_SYNONYM/SYNONYM
Processing object type DATABASE_EXPORT/SCHEMA/SYNONYM
Processing object type DATABASE_EXPORT/SCHEMA/TYPE/TYPE_SPEC
Processing object type DATABASE_EXPORT/SCHEMA/TYPE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type DATABASE_EXPORT/SYSTEM_PROCOBJACT/PRE_SYSTEM_ACTIONS/PROCACT_SYSTEM
Processing object type DATABASE_EXPORT/SYSTEM_PROCOBJACT/PROCOBJ
Processing object type DATABASE_EXPORT/SYSTEM_PROCOBJACT/POST_SYSTEM_ACTIONS/PROCACT_SYSTEM
Processing object type DATABASE_EXPORT/SCHEMA/PROCACT_SCHEMA
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE

10 rows selected.

SQL>
So far, we’ve looked exclusively at Exporting everything either from a given schema or an entire database.
Datapump also allows you to filter the data. Generally speaking, these filters apply whether you are importing or exporting.
Importing data from one schema to another
One of the many reasons you may want to take a DataPump export is to refresh a development environment.
It’s possible that the schema you’re refreshing on the Development database has a different name from the one on your Production environment. Let’s say, for the sake of argument, that we have a HR_DEV user in our development environment…
create user hr_dev identified by pwd
/

grant connect, resource, unlimited tablespace,
    create view, create sequence, create session,
    alter session, create synonym, create table,
    create procedure
    to hr_dev
/

alter user hr_dev
    default tablespace users
    temporary tablespace temp
/
Fortunately, importing the HR objects in the export file into the HR_DEV user is fairly simple :
...
dbms_datapump.metadata_remap
(
    handle => the_datapump_job_handle,
    name => 'REMAP_SCHEMA',
    old_value => 'HR',
    value => 'HR_DEV'
);
...
Specifying schemas and object types
At this point, let’s say that we only want to import the HR schema. The rest of the objects in the export file aren’t really relevant to us.
Furthermore, let’s assume we only want to import Tables and Sequences as we’ll re-create all of our PL/SQL stored program units, views etc from source code files.
Filtering by schema
Allow me to introduce the METADATA_FILTER procedure :
...
dbms_datapump.metadata_filter
(
    handle => the_datapump_job_handle,
    name => 'SCHEMA_EXPR',
    value => q'[= 'HR']'
);
...
Once again, not as painful as it might appear. However, it’s here that we start to see the benefit of the quote delimiter syntax (q'[…]').
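If you haven’t come across quote delimiters before, they simply save you having to double up any single quotes inside a string. A quick, purely illustrative check from SQL*Plus shows that the two forms produce exactly the same literal :

select q'[= 'HR']' as with_q_quote,
    '= ''HR''' as without_q_quote
from dual
/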
Filtering by object types
Here’s where things get a bit more interesting.
Remember the export logfile. At the beginning, there were a number of entries like :
Processing object type DATABASE_EXPORT/TABLESPACE
Processing object type DATABASE_EXPORT/PROFILE
Processing object type DATABASE_EXPORT/SYS_USER/USER
Processing object type DATABASE_EXPORT/SCHEMA/USER
...
These paths are the basis for how datapump determines what to export.
Fortunately they are stored in some publicly available views :
- DATABASE_EXPORT_OBJECTS
- SCHEMA_EXPORT_OBJECTS
- TABLE_EXPORT_OBJECTS
We’re doing a SCHEMA import, so we can check that the relevant object paths are available to us :
select object_path, comments
from schema_export_objects
where object_path in ('TABLE', 'SEQUENCE')
/

OBJECT_PATH          COMMENTS
-------------------- --------------------------------------------------------------------------------
SEQUENCE             Sequences in the selected schemas and their dependent grants and audits
TABLE                Tables in the selected schemas and their dependent objects
The Path Map looks to be a flattened hierarchy ( possibly an XML representation). The point here is that, by specifying a node in this hierarchy, you can persuade DBMS_DATAPUMP to do your bidding.
As both TABLE and SEQUENCE are nodes in the object_path, we should be able to use that here…
...
dbms_datapump.metadata_filter
(
    handle => the_datapump_job_handle,
    name => 'INCLUDE_PATH_EXPR',
    value => q'[IN ('TABLE', 'SEQUENCE')]'
);
...
This will give us everything at and below the TABLE and SEQUENCE nodes. Therefore, you will also get the INDEXES for the tables… as well as the TRIGGERS.
select object_path, comments
from schema_export_objects
where object_path like '%TABLE%TRIGGER%'
/

OBJECT_PATH                    COMMENTS
------------------------------ ------------------------------
SCHEMA_EXPORT/TABLE/TRIGGER    Triggers
TABLE/TRIGGER                  Triggers
Remember, we don’t want any PL/SQL program units, so we need to filter these out. Fortunately, calls to the METADATA_FILTER procedure seem to be additive. As TRIGGER appears below TABLE in the hierarchy, we can filter triggers out with an additional call to the procedure :
...
dbms_datapump.metadata_filter
(
    handle => the_datapump_job_handle,
    name => 'EXCLUDE_PATH_EXPR',
    value => q'[= 'TRIGGER']'
);
...
The finished HR Import Script
Here it is then, a script to Import Tables and Sequences from HR to HR_DEV :
set serveroutput on size unlimited
declare
    --
    -- Import of HR objects to HR_DEV
    -- Only importing Tables and Sequences
    --
    l_dph number;
    l_predicate varchar2(4000);
begin
    -- Setup the job "context" as usual
    l_dph := dbms_datapump.open
    (
        operation => 'IMPORT',
        job_mode => 'SCHEMA',
        job_name => 'IMP_HR_DEV'
    );
    -- We only want objects in the HR schema
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'SCHEMA_EXPR',
        value => q'[= 'HR']'
    );
    -- and then only Sequences and Tables
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'INCLUDE_PATH_EXPR',
        value => q'[IN ('TABLE', 'SEQUENCE')]'
    );
    -- ...but no triggers...
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'EXCLUDE_PATH_EXPR',
        value => q'[= 'TRIGGER']'
    );
    -- and we want to import these objects into the HR_DEV schema
    dbms_datapump.metadata_remap
    (
        handle => l_dph,
        name => 'REMAP_SCHEMA',
        old_value => 'HR',
        value => 'HR_DEV'
    );
    --
    -- If we find a table that already exists in the target
    -- schema then overwrite it..
    --
    dbms_datapump.set_parameter
    (
        handle => l_dph,
        name => 'TABLE_EXISTS_ACTION',
        value => 'REPLACE'
    );
    -- Now point to the export dump file to take this from
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file
    );
    -- ...and make sure we log what's going on...
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'hr_dev_imp.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    -- log the start time...
    dbms_datapump.log_entry
    (
        handle => l_dph,
        message => 'Job starting at '||to_char(sysdate, 'HH24:MI:SS')
    );
    -- start the job...
    dbms_datapump.start_job( handle => l_dph);
    -- and detach...
    dbms_datapump.detach(handle => l_dph);
end;
/
Now to give it a whirl ( saved as hr_dev_imp.sql)….
SQL> @hr_dev_imp.sql

PL/SQL procedure successfully completed.

SQL>
Looking at the logfile, all appears well…
;;; Job starting at 18:15:29
Master table "MIKE"."IMP_HR_DEV" successfully loaded/unloaded
Starting "MIKE"."IMP_HR_DEV":
Processing object type DATABASE_EXPORT/SCHEMA/SEQUENCE/SEQUENCE
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
. . imported "HR_DEV"."COUNTRIES"          6.367 KB      25 rows
. . imported "HR_DEV"."DEPARTMENTS"        7.007 KB      27 rows
. . imported "HR_DEV"."EMPLOYEES"          16.80 KB     107 rows
. . imported "HR_DEV"."JOBS"               6.992 KB      19 rows
. . imported "HR_DEV"."JOB_HISTORY"        7.054 KB      10 rows
. . imported "HR_DEV"."LOCATIONS"          8.273 KB      23 rows
. . imported "HR_DEV"."REGIONS"            5.476 KB       4 rows
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/INDEX/INDEX
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/CONSTRAINT/CONSTRAINT
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/COMMENT
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/STATISTICS/TABLE_STATISTICS
Job "MIKE"."IMP_HR_DEV" successfully completed at 18:16:15
If we now connect as HR_DEV and check the objects we have in our schema :
select object_name, object_type
from user_objects
order by object_type, object_name
/

OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------
COUNTRY_C_ID_PK                INDEX
DEPT_ID_PK                     INDEX
DEPT_LOCATION_IX               INDEX
EMP_DEPARTMENT_IX              INDEX
EMP_EMAIL_UK                   INDEX
EMP_EMP_ID_PK                  INDEX
EMP_JOB_IX                     INDEX
EMP_MANAGER_IX                 INDEX
EMP_NAME_IX                    INDEX
JHIST_DEPARTMENT_IX            INDEX
JHIST_EMPLOYEE_IX              INDEX
JHIST_EMP_ID_ST_DATE_PK        INDEX
JHIST_JOB_IX                   INDEX
JOB_ID_PK                      INDEX
LOC_CITY_IX                    INDEX
LOC_COUNTRY_IX                 INDEX
LOC_ID_PK                      INDEX
LOC_STATE_PROVINCE_IX          INDEX
REG_ID_PK                      INDEX
DEPARTMENTS_SEQ                SEQUENCE
EMPLOYEES_SEQ                  SEQUENCE
LOCATIONS_SEQ                  SEQUENCE
COUNTRIES                      TABLE
DEPARTMENTS                    TABLE
EMPLOYEES                      TABLE
JOBS                           TABLE
JOB_HISTORY                    TABLE
LOCATIONS                      TABLE
REGIONS                        TABLE

29 rows selected.

SQL>
Sequences are special
There is one thing to be aware of with importing sequences.
Most database objects (procedures, packages, triggers, views, etc.) can be overwritten by using CREATE OR REPLACE.
As with tables, you cannot do this with sequences.
Unlike tables, however, DBMS_DATAPUMP does not have an option to re-create existing sequences.
This means that, if we were to refresh the HR_DEV schema again using the same script, we’d be likely to run into a bit of a problem, as you can see from this log file entry :
...
Processing object type DATABASE_EXPORT/SCHEMA/SEQUENCE/SEQUENCE
ORA-31684: Object type SEQUENCE:"HR_DEV"."LOCATIONS_SEQ" already exists
ORA-31684: Object type SEQUENCE:"HR_DEV"."DEPARTMENTS_SEQ" already exists
ORA-31684: Object type SEQUENCE:"HR_DEV"."EMPLOYEES_SEQ" already exists
...
With this in mind, it’s probably a good idea to drop any sequences prior to refreshing with an import…
set serveroutput on size unlimited
begin
    for r_seq in
    (
        select sequence_name
        from dba_sequences
        where sequence_owner = 'HR_DEV'
    )
    loop
        --
        -- check the sequence name is "clean"
        --
        if regexp_instr( replace( r_seq.sequence_name, '_'), '[[:punct:]]|[[:space:]]') > 0 then
            raise_application_error( -20000, 'Sequence name contains dodgy characters.');
        end if;
        -- drop the sequence
        execute immediate 'drop sequence hr_dev.'||r_seq.sequence_name;
        dbms_output.put_line('Sequence '||r_seq.sequence_name||' dropped.');
    end loop;
end;
/
If we run this we’ll get…
SQL> @drop_sequences.sql
Sequence DEPARTMENTS_SEQ dropped.
Sequence EMPLOYEES_SEQ dropped.
Sequence LOCATIONS_SEQ dropped.

PL/SQL procedure successfully completed.

SQL>
Specifying a subset of data
Sometimes you don’t want to export/import everything. You might just want a few tables with a subset of data.
In our HR_DEV environment we want to focus on the EMPLOYEES table. We don’t want all of the rows – let’s just have one department…
Specifying which tables to use
Once again, METADATA_FILTER can be used here :
dbms_datapump.metadata_filter
(
    handle => the_datapump_job_handle,
    name => 'NAME_LIST',
    value => q'['EMPLOYEES']',
    object_path => 'TABLE'
);
NAME_LIST tells Datapump to filter on object names rather than object type paths.
Specifying the object type path as TABLE means that datapump will only look for a table called EMPLOYEES and not any other type of object with the same name.
Getting a subset of data
Here, we need to use the DATA_FILTER procedure. Unlike its METADATA counterpart, you need to provide a syntactically correct predicate for it to work…
...
dbms_datapump.data_filter
(
    handle => the_datapump_job_handle,
    name => 'SUBQUERY',
    value => 'where department_id = 20',
    table_name => 'EMPLOYEES',
    schema_name => 'HR'
);
...
The Data Sub-set Import Script
Before running this, I’ve taken the precaution of dropping all of the objects from the HR_DEV schema as I don’t want to run into any pesky constraint errors…
set serveroutput on size unlimited
declare
    --
    -- import Subset of employees into HR_DEV ( having cleared down the schema first)
    --
    l_dph number;
begin
    l_dph := dbms_datapump.open
    (
        operation => 'IMPORT',
        job_mode => 'SCHEMA',
        job_name => 'IMP_EMPLOYEES'
    );
    -- We only want objects in the HR schema
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'SCHEMA_EXPR',
        value => q'[= 'HR']'
    );
    -- only TABLES...
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'INCLUDE_PATH_EXPR',
        value => q'[= 'TABLE']'
    );
    -- and then only the EMPLOYEES table
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'NAME_LIST',
        value => q'['EMPLOYEES']',
        object_path => 'TABLE'
    );
    -- without any triggers or ref constraints
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'EXCLUDE_PATH_EXPR',
        value => q'[IN ('TRIGGER', 'REF_CONSTRAINT')]'
    );
    -- subset of EMPLOYEES
    dbms_datapump.data_filter
    (
        handle => l_dph,
        name => 'SUBQUERY',
        value => 'where department_id = 20',
        table_name => 'EMPLOYEES',
        schema_name => 'HR'
    );
    dbms_datapump.metadata_remap
    (
        handle => l_dph,
        name => 'REMAP_SCHEMA',
        old_value => 'HR',
        value => 'HR_DEV'
    );
    --
    -- If we find a table that already exists in the target
    -- schema then overwrite it..
    --
    dbms_datapump.set_parameter
    (
        handle => l_dph,
        name => 'TABLE_EXISTS_ACTION',
        value => 'REPLACE'
    );
    --
    -- Now point to the export dump file to take this from
    --
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file
    );
    --
    -- ...and make sure we log what's going on...
    --
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'hr_dev_employees_imp.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    -- log the start time...
    dbms_datapump.log_entry
    (
        handle => l_dph,
        message => 'Job starting at '||to_char(sysdate, 'HH24:MI:SS')
    );
    --
    -- start the job...
    --
    dbms_datapump.start_job( handle => l_dph);
    --
    -- and detach...
    --
    dbms_datapump.detach(handle => l_dph);
end;
/
Notice that I’ve also specifically excluded the REF_CONSTRAINTS and TRIGGERS from the import.
Run this and we get the following output in the logfile :
;;; Job starting at 18:35:49
Master table "MIKE"."IMP_EMPLOYEES" successfully loaded/unloaded
Starting "MIKE"."IMP_EMPLOYEES":
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
. . imported "HR_DEV"."EMPLOYEES"          16.80 KB  2 out of 107 rows
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/INDEX/INDEX
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/CONSTRAINT/CONSTRAINT
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/COMMENT
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/STATISTICS/TABLE_STATISTICS
Job "MIKE"."IMP_EMPLOYEES" successfully completed at 18:36:33
We can see that only the EMPLOYEES table and its associated indexes have been imported :
select object_name, object_type
from user_objects
order by object_type, object_name
/

OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------
EMP_DEPARTMENT_IX              INDEX
EMP_EMAIL_UK                   INDEX
EMP_EMP_ID_PK                  INDEX
EMP_JOB_IX                     INDEX
EMP_MANAGER_IX                 INDEX
EMP_NAME_IX                    INDEX
EMPLOYEES                      TABLE

7 rows selected.

SQL>
…and no Foreign Key constraints :
SQL> select table_name, constraint_name, constraint_type
  2  from user_constraints;

TABLE_NAME                     CONSTRAINT_NAME                C
------------------------------ ------------------------------ -
EMPLOYEES                      EMP_LAST_NAME_NN               C
EMPLOYEES                      EMP_EMAIL_NN                   C
EMPLOYEES                      EMP_HIRE_DATE_NN               C
EMPLOYEES                      EMP_JOB_NN                     C
EMPLOYEES                      EMP_EMP_ID_PK                  P
EMPLOYEES                      EMP_EMAIL_UK                   U
EMPLOYEES                      EMP_SALARY_MIN                 C

7 rows selected.

SQL>
and only those EMPLOYEES in DEPARTMENT_ID 20…
select last_name, department_id
from employees
/

LAST_NAME                 DEPARTMENT_ID
------------------------- -------------
Hartstein                            20
Fay                                  20

SQL>
DDL_ONLY Datapump operations
Once again the DATA_FILTER procedure comes in here. However, this time, the call is a bit different :
...
dbms_datapump.data_filter
(
    handle => the_datapump_job_handle,
    name => 'INCLUDE_ROWS',
    value => 0
);
...
Here the value is effectively a boolean – 0 = false i.e. don’t include rows. The default is 1 – do include rows.
Incidentally, this time I’ve dropped the user HR_DEV altogether before importing as it will be re-created by the Import if it does not already exist.
This next script will import all of the DDL from HR to HR_DEV, but will not import any data…
set serveroutput on size unlimited
declare
    --
    -- Import of HR objects to HR_DEV
    -- This time all DDL, but no data
    --
    l_dph number;
begin
    l_dph := dbms_datapump.open
    (
        operation => 'IMPORT',
        job_mode => 'SCHEMA',
        job_name => 'IMP_DDL_HR_DEV'
    );
    -- We only want objects in the HR schema
    dbms_datapump.metadata_filter
    (
        handle => l_dph,
        name => 'SCHEMA_EXPR',
        value => q'[= 'HR']'
    );
    -- but no data
    dbms_datapump.data_filter
    (
        handle => l_dph,
        name => 'INCLUDE_ROWS',
        value => 0
    );
    --
    -- and we want to import these objects into the HR_DEV schema
    --
    dbms_datapump.metadata_remap
    (
        handle => l_dph,
        name => 'REMAP_SCHEMA',
        old_value => 'HR',
        value => 'HR_DEV'
    );
    -- If we find a table that already exists in the target
    -- schema then overwrite it..
    dbms_datapump.set_parameter
    (
        handle => l_dph,
        name => 'TABLE_EXISTS_ACTION',
        value => 'REPLACE'
    );
    -- Now point to the export dump file to take this from
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'my_full_export.dmp',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_dump_file
    );
    -- ...and make sure we log what's going on...
    dbms_datapump.add_file
    (
        handle => l_dph,
        filename => 'hr_dev_ddl_imp.log',
        directory => 'DATA_PUMP_DIR',
        filetype => dbms_datapump.ku$_file_type_log_file,
        reusefile => 1
    );
    -- log the start time...
    dbms_datapump.log_entry
    (
        handle => l_dph,
        message => 'Job starting at '||to_char(sysdate, 'HH24:MI:SS')
    );
    -- start the job...
    dbms_datapump.start_job( handle => l_dph);
    -- and detach...
    dbms_datapump.detach(handle => l_dph);
end;
/
After running this we find that the HR_DEV user has been created. However, to connect as HR_DEV you will need to use the password of the HR user, which is carried across in the export.
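If that’s not what you want, it’s simple enough to give HR_DEV its own password once the import has finished (the password here is obviously just a placeholder) :

alter user hr_dev identified by a_new_password
/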
We can see now that all of the HR objects have been imported into HR_DEV :
select object_name, object_type
from user_objects
order by object_type, object_name
/

OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------
ADD_JOB_HISTORY                PROCEDURE
COUNTRIES                      TABLE
COUNTRY_C_ID_PK                INDEX
DEPARTMENTS                    TABLE
DEPARTMENTS_SEQ                SEQUENCE
DEPT_ID_PK                     INDEX
DEPT_LOCATION_IX               INDEX
EMPLOYEES                      TABLE
EMPLOYEES_SEQ                  SEQUENCE
EMP_DEPARTMENT_IX              INDEX
EMP_DETAILS_VIEW               VIEW
EMP_EMAIL_UK                   INDEX
EMP_EMP_ID_PK                  INDEX
EMP_JOB_IX                     INDEX
EMP_MANAGER_IX                 INDEX
EMP_NAME_IX                    INDEX
JHIST_DEPARTMENT_IX            INDEX
JHIST_EMPLOYEE_IX              INDEX
JHIST_EMP_ID_ST_DATE_PK        INDEX
JHIST_JOB_IX                   INDEX
JOBS                           TABLE
JOB_HISTORY                    TABLE
JOB_ID_PK                      INDEX
LOCATIONS                      TABLE
LOCATIONS_SEQ                  SEQUENCE
LOC_CITY_IX                    INDEX
LOC_COUNTRY_IX                 INDEX
LOC_ID_PK                      INDEX
LOC_STATE_PROVINCE_IX          INDEX
REGIONS                        TABLE
REG_ID_PK                      INDEX
SECURE_DML                     PROCEDURE
SECURE_EMPLOYEES               TRIGGER
UPDATE_JOB_HISTORY             TRIGGER
USER_FK_TREE_VW                VIEW

35 rows selected.

SQL>
…and just to prove that we’ve imported no data at all…
SQL> select count(*) from regions;

  COUNT(*)
----------
         0

SQL> select count(*) from locations;

  COUNT(*)
----------
         0

SQL> select count(*) from departments;

  COUNT(*)
----------
         0

SQL> select count(*) from jobs;

  COUNT(*)
----------
         0

SQL> select count(*) from job_history;

  COUNT(*)
----------
         0

SQL> select count(*) from employees;

  COUNT(*)
----------
         0

SQL> select count(*) from countries;

  COUNT(*)
----------
         0

SQL>
Killing a runaway job
There will be times when things don’t quite work as expected. This is especially true if you’re playing around with Datapump parameter settings.
Just occasionally when testing the code in this post I’ve come across some variation on the theme of …
ERROR at line 1:
ORA-31634: job already exists
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 1137
ORA-06512: at "SYS.DBMS_DATAPUMP", line 5283
ORA-06512: at line 11
Yep, I’ve fixed my runtime error, re-executed the script and got this insistent little message.
The first thing to do when you get this (especially if you’re running in the same session in which you started the original job) is to log out and log back in again.
The second thing to do is to check to see if the job is still hanging about…
select job_name, state
from user_datapump_jobs
/
If so, then you can attach to the job and stop it.
If I’m connected as the job owner and I’m only running the one job, the following script should do the trick…
set serveroutput on size unlimited
declare
    l_dph number;
    l_job_name user_datapump_jobs.job_name%type;
begin
    select job_name
    into l_job_name
    from user_datapump_jobs;

    dbms_output.put_line('Killing '||l_job_name);
    l_dph := dbms_datapump.attach( job_name => l_job_name, job_owner => user);
    dbms_datapump.stop_job( l_dph);
    dbms_output.put_line('Job '||l_job_name||' terminated.');
exception
    when no_data_found then
        dbms_output.put_line('Job has already terminated.');
end;
/
This should work almost every time. On the off-chance it doesn’t, and the job no longer appears in the USER_DATAPUMP_JOBS view, then it’s possible that the master table created as part of the job has not been tidied up.
Remember, the master table is created in the job owner’s schema with the same name as the datapump job itself.
So, if I were to hit this problem whilst running the job IMP_HR_DEV, I could check :
select 1
from user_tables
where table_name = 'IMP_HR_DEV'
/
If the table does still exist and there is no longer an entry in the USER_DATAPUMP_JOBS view, we can simply drop it :
drop table imp_hr_dev
/
This should finally persuade datapump that the job is not in fact running.
A couple of Enterprise Edition Features
For those of you on a richer diet, there are a couple of further morsels which may well be worthy of consideration.
If you want to speed up your Datapump job by making use of the CPU cores available you could try :
...
dbms_datapump.set_parallel
(
    handle => the_datapump_job_handle,
    degree => 32
);
...
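One thing to bear in mind: for the extra workers to achieve much on an export, they need more than one dump file to write to. The usual approach is to include the %U substitution variable in the dump file name so that Datapump generates one file per worker – something along these lines (degree and file name are just examples) :

...
dbms_datapump.add_file
(
    handle => the_datapump_job_handle,
    filename => 'my_full_export%U.dmp',
    directory => 'DATA_PUMP_DIR',
    filetype => dbms_datapump.ku$_file_type_dump_file,
    reusefile => 1
);
dbms_datapump.set_parallel
(
    handle => the_datapump_job_handle,
    degree => 4
);
...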
If you’ve really pushed the boat out and have the partitioning option, you can tell datapump to take only specific partitions for tables.
For example, say you have a number of tables that are partitioned by date ranges.
The tables may be partitioned by quarter-year, with the partitions following a consistent naming convention (e.g. Q12014, Q22014 etc).
If we only want to export/import the latest partition for each of these tables (say, Q22014), we can do something like this :
...
for r_part_tabs in
(
    select table_name
    from user_tab_partitions
    where partition_name = 'Q22014'
)
loop
    dbms_datapump.data_filter
    (
        handle => the_datapump_job_handle,
        name => 'PARTITION_EXPR',
        value => q'[= 'Q22014']',
        table_name => r_part_tabs.table_name
    );
end loop;
...
Despite having gorged ourselves on this feast of DBMS_DATAPUMP delights, it’s fair to say that I’ve by no means covered everything.
Doubtless you will want to do things with this API that I’ve not covered here. Indeed, you’ll probably also find better ways of implementing some of the same functionality.
In the meantime, I’m off for a digestif.
Bon appétit.
How do you exclude the jobs?
Below is the SQL – my source schema has rows in this view and I need to skip them for the export. Thanks.
select owner from dba_scheduler_jobs
Dennis,
If you look in database_export_objects, there are some paths that refer to JOBS.
So, assuming you want to exclude all of the scheduler jobs from a schema export, something like this might do the job :
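The exact paths can vary between versions, so it’s worth checking SCHEMA_EXPORT_OBJECTS first – and note that DBMS_SCHEDULER objects travel under the PROCOBJ path rather than JOB, so you may need to exclude that as well. As a rough sketch :

...
dbms_datapump.metadata_filter
(
    handle => the_datapump_job_handle,
    name => 'EXCLUDE_PATH_EXPR',
    value => q'[LIKE '%JOB%']'
);
...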
HTH,
Mike