Friday 11 June 2010

How to stop running RMAN jobs in OEM Grid Control

Life became very easy after Oracle's invention of OEM Grid Control. At least, that is what Oracle promised us when they invented it. A couple of months ago one of my colleagues asked me to schedule backup jobs. In the past I wrote very nice OS scripts to make backups. But with OEM Grid Control having been available for quite some time, I thought: let's try making backups using OEM Grid Control.

 
 

And yes, it works fine. Grid Control generates quite interesting RMAN scripts, and you can schedule them. At a glance you can see all your backup jobs and their status in the job activity list, including whether they ran successfully. For script kiddies OEM Grid Control is bad news, because it writes the scripts for you. But if you like to be a wizard kiddy, you will feel like you are in heaven.

 
 

But after a while, life with OEM Grid Control turned out to be not so nice. After a number of proper backup runs, I found a backup job that remained in status Running. The next day a new backup job for the same database was started automatically, but this job immediately got status "1 problems". It turned out that the job was not started because, as Grid Control put it: "An execution in one of the previous runs of the job was still running."

 
 

So I thought: let's stop the running job. So I did. Then Grid Control told me: "The job execution was stopped successfully. Currently running steps will not be stopped." So I thought: life is easy again. But that turned out not to be true. The running job got status "Stop Pending" and stayed in that status.

 
 

So I thought: let's kill the running step. So I did. But then Grid Control said: "The step was not killed because it has already completed." Yet the job remained in status Running.

 
 

So I thought: let's delete the job (what else can you do if OEM Grid Control refuses to listen?). So I did, but then OEM Grid Control said: "The specified job, job run or execution is still active. It must finish running, or be stopped before it can be deleted. Filter on status 'Active' to see active executions."

 
 

Doom scenarios such as "I will never be able to make a backup of this database again" came to my mind. But finally Oracle Support sent me this script:

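DECLARE
  jguid RAW(16);  -- internal ID of the job in the repository
BEGIN
  -- Look up the job ID by job name and job owner
  SELECT job_id
  INTO   jguid
  FROM   mgmt_job
  WHERE  job_name  = '<name of your job>'
  AND    job_owner = '<owner of your job>';

  -- Stop all executions of this job (call as supplied by Oracle Support)
  mgmt_job_engine.stop_all_executions_with_id(jguid, TRUE);
  COMMIT;
END;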

You have to run this script as user SYSMAN on the OEM Grid Control repository database. You can find the name and owner of the job in the job activity list. Using this script I was able to solve my problem: the running job was deleted.
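If you would rather look up the job in the repository than in the job activity list, a query along these lines should show the candidates. It is only a sketch: it relies on the same mgmt_job table and columns that the script above uses, and the '%BACKUP%' pattern is a made-up example that you have to adapt to your own job names.

```sql
-- Run as SYSMAN on the Grid Control repository database.
-- Lists jobs whose name matches a pattern, so that the exact
-- job_name and job_owner can be copied into the script above.
-- The pattern '%BACKUP%' is a hypothetical example; adapt it.
SELECT job_id, job_name, job_owner
FROM   mgmt_job
WHERE  UPPER(job_name) LIKE '%BACKUP%'
ORDER  BY job_owner, job_name;
```

The query only reads from the repository, so it is safe to run it a few times until the pattern matches exactly the job you are after.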

However, I had to schedule a new backup job for this database afterwards, because this script deletes all runs of the job, including its future occurrences.

So if you ever run into a similar problem, you can solve it by running this script and then scheduling a new job.
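A slightly more defensive variant of the script is sketched below. The call to mgmt_job_engine is exactly the one from Oracle Support; the only addition is error handling around the lookup, since a SELECT ... INTO in PL/SQL raises NO_DATA_FOUND when no row matches and TOO_MANY_ROWS when more than one row matches. Whether mgmt_job can actually hold several rows for one name and owner is an assumption I have not verified, and the error numbers -20001 and -20002 are arbitrary picks from the user-defined range.

```sql
DECLARE
  jguid RAW(16);
BEGIN
  -- Look up the job ID first, and fail with a readable message
  -- if the name/owner pair does not identify exactly one job.
  BEGIN
    SELECT job_id
    INTO   jguid
    FROM   mgmt_job
    WHERE  job_name  = '<name of your job>'
    AND    job_owner = '<owner of your job>';
  EXCEPTION
    WHEN NO_DATA_FOUND THEN
      RAISE_APPLICATION_ERROR(-20001,
        'No job found with that name and owner; nothing was stopped.');
    WHEN TOO_MANY_ROWS THEN
      RAISE_APPLICATION_ERROR(-20002,
        'More than one job matches; nothing was stopped.');
  END;

  -- Same call as in the original script from Oracle Support.
  mgmt_job_engine.stop_all_executions_with_id(jguid, TRUE);
  COMMIT;
END;
/
```

The trailing slash is what makes SQL*Plus execute the block; other tools may not need it. As with the original script, run this as SYSMAN on the repository database.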

Now life is easy again using OEM Grid Control, but not as easy as they promised us at first.

It is still not clear to me why a job sometimes remains in status Running.

2 comments:

Pavel said...

Thanks, this helped me!

That Compost Guy said...

We have found that you can do a Stop on the job that's running, then a Force Stop on it and it will stop that job and leave any scheduled jobs in the queue.