Friday, 13 January 2012


I was tasked to move an entire small database from 1 server to another and finish the task ASAP. The reason was the site will be down until the task is completed. A quick test of exporting, coping files to remote server then importing had the following numbers:
Export expdp with parallel = 16, generating 1 file only: 13 mins
Copying 12 GB to remote server: 10 mins
Import with parallel = 16: 45 mins
Total Time: 1 hour and 8 minutes.
Way too much for a downtime in our environment.


Here are the commands we initially used:

expdp username/password directory=dump_dir dumpfile=full.dmp logfile=full.log full=y parallel=16
impdp username/password directory=dump_dir dumpfile=full.dmp logfile=full.log parallel=16
In our attempt to tune this process, we introduced exporting in parallel to multiple files and the results were amazing
expdp username/password directory=dump_dir filesize=1Gdumpfile=full%U.dmp logfile=fulllog parallel=16
Execution time: about 2 minutes
We figured the import will be improve significantly as well, but we were a bit disappointed

impdp username/password directory=dump_dir dumpfile=full%U.dmp logfile=full.log parallel=16
Execution time: about 30 minutes. Better, but not as expected improvement.
We noticed that during import operation, the process takes a lot of time during creating indexes, and the first thing came to our mind is to increase the pga_aggregate_target to big number.
After increasing pga_aggregate_target to 6 GB, performance of import came down to 14 minutes.
Now total time of entire operation: 26 minutes down from 1:08 hours, not bad
 
We can still improve by copying files in parallel between servers, but this wasn’t needed.

(added Jun 26, 2011)
Tip: This is another tip that will improve importing time. When importing using Data Pump, exclude statistics and then use dbms_stats package to calculate statistics in parallel after Data Pump finishes importing. Importing using Data Pump is done serially (one segment after another, not in parallel) even when you use the parallel option, Data Pump will still have 1 thread only calculating statistics.
Here is an example:
impdp username/password EXCLUDE=STATISTICS other options …

No comments: