Friday 21 December 2012

Archived REDO Logs Generated at Half the Size of the REDO Logs in an 11.2 Database


Recently my customer upgraded from 9.2.0.8 to 11.2.0.3 and noticed that the archived logs being generated were only 100M in size (half the size of the REDO logs), and also that checkpointing (redo switching) was happening every 2-3 minutes even with LOW ACTIVITY.
>> The 11.2 setup was:
- SGA size: 38 GB
- CPU_COUNT = 16
- LOG_BUFFER = 80MB
- Redo log size = 200M
- MTTR target = 300 (5 mins)
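To confirm the symptom before digging into the cause, a quick check like the one below (a sketch using the standard v$log and v$archived_log views) compares the online redo log size with the sizes of the recently generated archived logs:

-- Online redo log size (should show 200M here)
SELECT group#, bytes/1024/1024 AS size_mb FROM v$log;

-- Archived logs generated in the last day;
-- blocks * block_size is the redo actually written before each switch
SELECT sequence#,
       ROUND(blocks * block_size / 1024 / 1024) AS arch_mb,
       completion_time
FROM   v$archived_log
WHERE  completion_time > SYSDATE - 1
ORDER  BY sequence#;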
Explanation >> Archived redolog is (significant) smaller than the redologfile. [ID 1356604.1]
Cause
There are 2 possible causes for this:
1. Documented and designed behaviour, due to explicitly forcing an archive creation before the redo log file is full:
* SQL> alter system switch logfile;
* SQL> alter system archive log current;
* RMAN> backup archivelog all;
* RMAN> backup database plus archivelog;
Explanation :
The archive logs do not have to be even in size. This was decided a very long time ago,
when blank padding the archive logs was stopped, for a very good reason – in order to save disk space.
* The log switch does not occur when a redo log file is 100% full. There is an internal algorithm
that determines the log switch moment. This also has a very good reason – doing the log switch
at the last moment could incur performance problems (for various reasons, out of the scope of this note).
As a result, after the log switch occurs, the archivers are copying only the actual information from the
redo log files. Since the redo logs are not 100% full after the log switch and the archive logs are
not blank padded after the copy operation has finished, this results in uneven, smaller files than
the original redo log files.
There are a number of factors which combine to determine the log
switch frequency. These are the most relevant factors in this case:
a) RDBMS parameter LOG_BUFFER
If this is not explicitly set by the DBA then we use a default;
at instance startup the RDBMS  calculates the number of shared redo
strands as ncpus/16, and the size of each strand is 128Kb * ncpus
(where ncpus is the number of CPUs in the system). The log buffer
size is the number of strands multiplied by the strand size.
The calculated or specified size is rounded up to a multiple of the granule size
of a memory segment in the SGA. For 11.2:
- SGA size >= 128GB: granule size is 512MB
- 64GB <= SGA size < 128GB: granule size is 256MB
- 32GB <= SGA size < 64GB: granule size is 128MB
- 16GB <= SGA size < 32GB: granule size is 64MB
- 8GB <= SGA size < 16GB: granule size is 32MB
- 1GB <= SGA size < 8GB: granule size is 16MB
- SGA size < 1GB: granule size is 4MB
There are some minimums and maximums enforced.
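To cross-check the granule size actually in use and the parameters feeding this calculation, something along these lines can be run (a sketch against the standard v$sgainfo and v$parameter views):

-- Granule size actually in use by this instance
SELECT name, bytes/1024/1024 AS mb
FROM   v$sgainfo
WHERE  name = 'Granule Size';

-- Log buffer and CPU count, the inputs to the strand calculation above
SELECT name, value
FROM   v$parameter
WHERE  name IN ('log_buffer', 'cpu_count');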
b) System load
Initially only one redo strand is used, ie the number of “active”
redo strands is 1, and all the processes copy their redo into
that one strand. When/if there is contention for that strand then
the number of active redo strands is raised to 2. As contention
for the active strands increases, the number of active strands
increases. The maximum possible number of active redo strands is
the number of strands initially allocated in the log buffer.
(This feature is called “dynamic strands”, and there is a hidden
parameter to disable it which then allows processes to use all
the strands from the outset).
The related hidden parameter is: _log_parallelism_dynamic
See unpublished Note 457966.1 about _log_parallelism_dynamic and _log_parallelism_max.
c) Log file size
This is the logfile size decided by the DBA when the logfiles are created.
d) The logfile space reservation algorithm
When the RDBMS switches into a new online redo logfile, all the
log buffer redo strand memory is “mapped” to the logfile space.
If the logfile is larger than the log buffer then each strand
will map/reserve its strand size worth of logfile space, and the
remaining logfile space (the “log residue”) is still available.
If the logfile is smaller than the log buffer, then the whole
logfile space is divided/mapped/reserved equally among all the
strands, and there is no unreserved space (ie no log residue).
When any process fills a strand such that all the reserved
underlying logfile space for that strand is used, AND there is
no log residue, then a log switch is scheduled.
Example: with 128 CPUs, the RDBMS allocates a log_buffer of size 128MB containing 8 shared strands of 16MB each.
It may be a bit larger than 128MB as it rounds up to an SGA granule boundary.
The logfiles are 100Mb, so when the RDBMS switches into a
new online redo logfile each strand reserves 100Mb/8 = 25600 blocks
and there is no log residue. If there is low system load, only one
of the redo strands will be active/used and when 25600 blocks of
that strand are filled then a log switch will be scheduled – the created
archive logs have a size around 25600 blocks.
>>> 100MB / 8 = 12.5 MB = 12800 KB = 13107200 bytes; 13107200 / 512 bytes (redo block size) = 25600 blocks
With everything else staying the same (128 cpu’s and low load),
using a larger logfile would not really reduce the amount of
unfilled space when the log switches are requested, but it would
make that unfilled space less significant as a percentage of the
total logfile space, eg
1)) - With a 100MB logfile, the log switch happens with 7 x 12.5MB of logfile space unfilled (ie the logfile is only about 12.5% full when the log switch is requested).
>>> Reason: each of the 8 redo strands reserves 12.5 MB (100MB / 8) of logfile space, and since each strand is 16MB in size there is no spare space left in the redo log file (16MB * 8 strands = 128MB, which is greater than the 100MB redo log file size).
>> This means that as soon as 12.5 MB of space (25600 blocks) is used up in the first active strand (even though the other strands are not active), the log file has to be switched,
>> the reason being that redo strand space is pre-allocated in the redo log file, so once the first active strand is filled up no more space can be used, because the other strands (even though not active) have already reserved it.
Hence we have 'UNUSED' space in the 100 MB redo log of 100 - 12.5 = 87.5 MB, but as the active strand cannot grow any further we switch the logfile. Hence we get a smaller
archive log even though the redo log is only about 12.5% used.
2))     – with a 1Gb logfile, the log switch would happen with 7 x 16Mb
logfile space unfilled (ie the logfile is 90% full when the
log switch is requested)
>>> Reason: in this case, with 8 strands of 16 MB each, we only reserve 16MB * 8 = 128 MB in the redo log file, so we still have 1GB - 128 MB = 896 MB free.
So when the only active strand gets filled up (to its maximum size of 16MB), there is still 896 MB free in the redo log file that is not reserved by any other redo strand.
So this active strand can grow into the available space (1024 MB - 16 MB * 7 unused strands = 912 MB), and in effect the redo log can fill to 912 MB before the logfile switches,
>>> which is about 90% of 1 GB.
With a high CPU_COUNT, a low load and a redo log file size smaller than
the redo log buffer, you may see small archived log files because log switches
happen at about 1/8 of the defined log file size.
This is because CPU_COUNT defines the number of redo strands (ncpus/16).
With a low load only a single strand may be used. With a redo log file size smaller
than the redo log buffer, the log file space is divided over the available strands.
When, for instance, only a single active strand is used, a log switch can already occur
when that strand is filled.
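Putting it together, the inputs that drive this behaviour on a given system can be gathered with a query along these lines (a sketch; the approx_strands column simply applies the cpu_count/16 rule from the note above):

-- Inputs to the log switch behaviour described above:
-- logfile size, log buffer size and cpu_count (shared strands ~= cpu_count/16)
SELECT (SELECT MAX(bytes)/1024/1024 FROM v$log)                      AS logfile_mb,
       (SELECT TO_NUMBER(value)/1024/1024 FROM v$parameter
         WHERE name = 'log_buffer')                                  AS log_buffer_mb,
       (SELECT TO_NUMBER(value) FROM v$parameter
         WHERE name = 'cpu_count')                                   AS cpu_count,
       (SELECT GREATEST(1, FLOOR(TO_NUMBER(value)/16)) FROM v$parameter
         WHERE name = 'cpu_count')                                   AS approx_strands
FROM   dual;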

Friday 14 December 2012

Resolving Gaps in Data Guard Apply Using Incremental RMAN Backup




Content from http://arup.blogspot.co.uk/2009/12/resolving-gaps-in-data-guard-apply.html

Recently, we had a glitch on a Data Guard (physical standby database) on infrastructure. This is not a critical database, so the monitoring was relatively lax, and the fact that it is handled by an outsourcer does not help either. In any case, the laxness resulted in a failure remaining undetected for quite some time; it was eventually discovered only when the customer complained. This standby database is usually opened for read-only access from time to time. This time, however, the customer saw that the data was significantly out of sync with the primary and raised a red flag. Unfortunately, by this time it had become a rather political issue.

Since the DBA in charge couldn't resolve the problem, I was called in. In this post, I will describe the issue and how it was resolved. In summary, there are two parts to the problem:

(1) What happened
(2) How to fix it

What Happened

Let’s look at the first question – what caused the standby to lag behind. First, I looked for the current SCN numbers of the primary and standby databases. On the primary:

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
1447102

On the standby:

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
1301571

Clearly there is a difference. But this by itself does not indicate a problem, since the standby is expected to lag behind the primary (this is an asynchronous, non-real-time apply setup). The real question is how much it is lagging in terms of wall clock time. To find that out, I used the scn_to_timestamp function to translate the SCN to a timestamp:

SQL> select scn_to_timestamp(1447102) from dual;

SCN_TO_TIMESTAMP(1447102)
-------------------------------
18-DEC-09 08.54.28.000000000 AM

I ran the same query to find the timestamp associated with the SCN of the standby database as well (note that I ran it on the primary database, since it would fail on the standby in mounted mode):

SQL> select scn_to_timestamp(1301571) from dual;

SCN_TO_TIMESTAMP(1301571)
-------------------------------
15-DEC-09 07.19.27.000000000 PM

This shows that the standby is lagging by two and a half days! The data at this point is not just stale; it must be rotten.

The next question is why it would be lagging so far back in the past. This is a 10.2 database, where the FAL server should automatically resolve any gaps in archived logs. Something must have happened that caused the FAL (fetch archived log) process to fail. To get that answer, I first checked the alert log of the standby instance. I found these lines that showed the issue clearly:


Fri Dec 18 06:12:26 2009
Waiting for all non-current ORLs to be archived...
Media Recovery Waiting for thread 1 sequence 700
Fetching gap sequence in thread 1, gap sequence 700-700
… 
Fri Dec 18 06:13:27 2009
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 700-700
DBID 846390698 branch 697108460
FAL[client]: All defined FAL servers have been attempted.

Going back in the alert log, I found these lines:

Tue Dec 15 17:16:15 2009
Fetching gap sequence in thread 1, gap sequence 700-700
Error 12514 received logging on to the standby
FAL[client, MRP0]: Error 12514 connecting to DEL1 for fetching gap sequence
Tue Dec 15 17:16:15 2009
Errors in file /opt/oracle/admin/DEL2/bdump/del2_mrp0_18308.trc:
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
Tue Dec 15 17:16:45 2009
Error 12514 received logging on to the standby
FAL[client, MRP0]: Error 12514 connecting to DEL1 for fetching gap sequence
Tue Dec 15 17:16:45 2009
Errors in file /opt/oracle/admin/DEL2/bdump/del2_mrp0_18308.trc:
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

This clearly showed the issue. On December 15th at 17:16:15, the Managed Recovery Process encountered an error while receiving the log information from the primary. The error was ORA-12514 “TNS:listener does not currently know of service requested in connect descriptor”. This is usually the case when the TNS connect string is incorrectly specified. The primary is called DEL1 and there is a connect string called DEL1 in the standby server.

The connect string works well. Actually, right now there is no issue with the standby getting the archived logs, so the connect string is fine - now. The standby is receiving log information from the primary. There must have been some temporary hiccup causing that specific archived log not to travel to the standby. If that log was somehow skipped (it could have been an intermittent problem), then it should have been picked up by the FAL process later on; but that never happened. Since sequence# 700 was not applied, none of the logs received later - 701, 702 and so on - were applied either. This has caused the standby to lag behind since that time.
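As a side note, periodically checking the archive destinations on the primary would likely have surfaced the ORA-12514 transport error much earlier; a minimal sketch, assuming dest_id 2 is the standby destination:

-- Run on the primary: state and last error of the archive destinations
-- (dest_id 2 being the standby destination is an assumption; adjust as needed)
SELECT dest_id, dest_name, status, error
FROM   v$archive_dest
WHERE  dest_id IN (1, 2);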

So, the fundamental question was why FAL did not fetch the archived log sequence# 700 from the primary. To get to that, I looked into the alert log of the primary instance. The following lines were of interest:


Tue Dec 15 19:19:58 2009
Thread 1 advanced to log sequence 701 (LGWR switch)
Current log# 2 seq# 701 mem# 0: /u01/oradata/DEL1/onlinelog/o1_mf_2_5bhbkg92_.log
Tue Dec 15 19:20:29 2009Errors in file /opt/oracle/product/10gR2/db1/admin/DEL1/bdump/del1_arc1_14469.trc:
ORA-00308: cannot open archived log '/u01/oraback/1_700_697108460.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
Tue Dec 15 19:20:29 2009
FAL[server, ARC1]: FAL archive failed, see trace file.
Tue Dec 15 19:20:29 2009
Errors in file /opt/oracle/product/10gR2/db1/admin/DEL1/bdump/del1_arc1_14469.trc:
ORA-16055: FAL request rejected
ARCH: FAL archive failed.
Archiver continuing
Tue Dec 15 19:20:29 2009
ORACLE Instance DEL1 - Archival Error. Archiver continuing.

These lines showed everything clearly. The issue was:

ORA-00308: cannot open archived log '/u01/oraback/1_700_697108460.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory


The archived log simply was not available. The process could not see the file and couldn’t get it across to the standby site.

Upon further investigation I found that the DBA had removed some archived logs to make room in the filesystem, without realizing that this had also removed the most recent one, which was yet to be transmitted to the remote site. The mystery surrounding why FAL did not get that log was finally cleared.

Solution

Now that I knew the cause, the focus was on the resolution. If the archived log sequence# 700 had still been available on the primary, I could have easily copied it over to the standby, registered the log file and let the managed recovery process pick it up. But unfortunately, the file was gone and I couldn't just recreate it. Until that logfile was applied, the recovery would not move forward. So, what were my options?

One option is of course to recreate the standby - possible, but not feasible considering the time required. The other option is to apply an incremental backup of the primary taken from that SCN. That's the key - the backup must be from a specific SCN. I have described the process since it is not very obvious. The following shows the step-by-step approach for resolving this problem. I have shown where the actions must be performed - [Standby] or [Primary].

1. [Standby] Stop the managed standby apply process:

SQL> alter database recover managed standby database cancel;

Database altered.

2. [Standby] Shutdown the standby database

3. [Primary] On the primary, take an incremental backup from the SCN number where the standby has been stuck:

RMAN> run { 
2> allocate channel c1 type disk format '/u01/oraback/%U.rmb'; 
3> backup incremental from scn 1301571 database;
4> }


using target database control file instead of recovery catalog
allocated channel: c1
channel c1: sid=139 devtype=DISK

Starting backup at 18-DEC-09
channel c1: starting full datafile backupset
channel c1: specifying datafile(s) in backupset
input datafile fno=00001 name=/u01/oradata/DEL1/datafile/o1_mf_system_5bhbh59c_.dbf
… 
piece handle=/u01/oraback/06l16u1q_1_1.rmb tag=TAG20091218T083619 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:06
Finished backup at 18-DEC-09
released channel: c1

4. [Primary] On the primary, create a new standby controlfile:

SQL> alter database create standby controlfile as '/u01/oraback/DEL1_standby.ctl';

Database altered.

5. [Primary] Copy these files to standby host:

oracle@oradba1 /u01/oraback# scp *.rmb *.ctl oracle@oradba2:/u01/oraback
oracle@oradba2's password:
06l16u1q_1_1.rmb 100% 43MB 10.7MB/s 00:04
DEL1_standby.ctl 100% 43MB 10.7MB/s 00:04 

6. [Standby] Bring up the instance in nomount mode:

SQL> startup nomount

7. [Standby] Check the location of the controlfile:

SQL> show parameter control_files

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_files string /u01/oradata/standby_cntfile.ctl

8. [Standby] Replace the controlfile with the one you just created on the primary:

9. $ cp /u01/oraback/DEL1_standby.ctl /u01/oradata/standby_cntfile.ctl
10.[Standby] Mount the standby database:

SQL> alter database mount standby database;
11.[Standby] RMAN does not know about these files yet; so you must let it know – by a process called cataloging. Catalog these files:

$ rman target=/

Recovery Manager: Release 10.2.0.4.0 - Production on Fri Dec 18 06:44:25 2009

Copyright (c) 1982, 2007, Oracle. All rights reserved.

connected to target database: DEL1 (DBID=846390698, not open)
RMAN> catalog start with '/u01/oraback';

using target database control file instead of recovery catalog
searching for all files that match the pattern /u01/oraback

List of Files Unknown to the Database
=====================================
File Name: /u01/oraback/DEL1_standby.ctl
File Name: /u01/oraback/06l16u1q_1_1.rmb

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /u01/oraback/DEL1_standby.ctl
File Name: /u01/oraback/06l16u1q_1_1.rmb

12. [Standby] Recover the database using these files:

RMAN> recover database;

Starting recover at 18-DEC-09
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/oradata/DEL2/datafile/o1_mf_system_5lptww3f_.dbf
...
channel ORA_DISK_1: reading from backup piece /u01/oraback/05l16u03_1_1.rmb
channel ORA_DISK_1: restored backup piece 1
piece handle=/u01/oraback/05l16u03_1_1.rmb tag=TAG20091218T083619
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07

starting media recovery

archive log thread 1 sequence 8012 is already on disk as file /u01/oradata/1_8012_697108460.dbf
archive log thread 1 sequence 8013 is already on disk as file /u01/oradata/1_8013_697108460.dbf
… 

13. After some time, the recovery fails with the message:

archive log filename=/u01/oradata/1_8008_697108460.dbf thread=1 sequence=8009
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 12/18/2009 06:53:02
RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile '/u01/oradata/1_8008_697108460.dbf'
ORA-00310: archived log contains sequence 8008; sequence 8009 required
ORA-00334: archived log: '/u01/oradata/1_8008_697108460.dbf'

This happens because we have come to the last of the archived logs available on disk. The expected archived log with sequence# 8009 has not been generated yet.

14. [Standby] At this point, exit RMAN and start the managed recovery process:

SQL> alter database recover managed standby database disconnect from session;

Database altered.

15. Check the SCNs on the primary and the standby:

[Standby] SQL> select current_scn from v$database;

CURRENT_SCN
-----------
1447474
[Primary] SQL> select current_scn from v$database;

CURRENT_SCN
-----------
1447478
Now they are very close to each other. The standby has now caught up.
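Besides comparing SCNs, the apply progress can be cross-checked on the standby with something like the following (a sketch using v$archived_log; the APPLIED column is maintained on the standby):

-- Run on the standby: highest sequence received vs highest sequence applied
SELECT thread#,
       MAX(sequence#)                                    AS last_received,
       MAX(CASE WHEN applied = 'YES' THEN sequence# END) AS last_applied
FROM   v$archived_log
GROUP  BY thread#;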

Restart Active Data Guard


Re-enable the log apply service (by restarting Data Guard)


[sysdba@dataguard:stbydb-/cluster/home/oracle]$ sid

SIDs on this node:
  +ASM
  stbydb

ORACLE_SID = [stbydb] ?
The Oracle base remains unchanged with value /cluster/app/grid

ORACLE_SID=stbydb
ORACLE_HOME=/cluster/app/grid/product/11.2.0/db_1

[sysdba@dataguard:stbydb-/cluster/home/oracle]$ hems

SQL*Plus: Release 11.2.0.2.0 Production on Fri Dec 14 11:40:29 2012
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options


SYS@stbydb >>  alter database recover managed standby database cancel;

Database altered.

SYS@stbydb >> shutdown immediate;
ORA-01013: user requested cancel of current operation

SYS@stbydb >> shutdown abort;
ORACLE instance shut down.

SYS@stbydb >> startup mount;
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 5.3848E+10 bytes
Fixed Size                  2239232 bytes
Variable Size            2.5904E+10 bytes
Database Buffers         2.7783E+10 bytes
Redo Buffers              158928896 bytes
Database mounted.
SYS@stbydb >> alter database open read only;

Database altered.

SYS@stbydb >> select open_mode from v$database;

OPEN_MODE
--------------------
READ ONLY

SYS@stbydb >> alter database recover managed standby database using current logfile disconnect;

Database altered.

SYS@stbydb >>


Copy archive log from file system to +ASM diskgroup on Standby database


Issue:

Because of a hanging ARC process on the production database, the standby database skipped applying some of the archive logs. As a result, subsequent archive logs were not being applied, and the standby database reported gap sequence errors in the alert log.
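Before cataloging anything, the exact range of missing sequences can be confirmed on the standby with a query like this (a sketch using the standard v$archive_gap view):

-- Run on the standby: the range of log sequences the apply is waiting for
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;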

The temporary solution for this issue is to register the archive logs manually through RMAN on the standby database so that they can be applied.

The permanent solution was to kill the hung ARC process on the production server so that a new archiver process is spawned.

The steps below show how to register archive logs manually on the standby database with RMAN.



Solution:

catalog archivelog '/cluster/home/oracle/o1_mf_1_245440_8dp02lsr_.arc';

Or run the command below to copy it into ASM:
RMAN> copy archivelog '/cluster/home/oracle/o1_mf_1_245440_8dp02lsr_.arc' to '+ARCHIVELOGS';

CATALOG START WITH '/cluster/home/oracle/archivelog';



 Steps:



[sysdba@dbguard:standbydb-/cluster/home/oracle]$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Fri Dec 14 12:12:21 2012

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: LIVEDB (DBID=123232323)

RMAN> catalog archivelog '/cluster/home/oracle/o1_mf_1_245440_8dp02lsr_.arc';

using target database control file instead of recovery catalog
cataloged archived log
archived log file name=/cluster/home/oracle/o1_mf_1_245440_8dp02lsr_.arc RECID=44233 STAMP=802008765

RMAN> CATALOG START WITH '/cluster/home/oracle/archivelog';

searching for all files that match the pattern /cluster/home/oracle/archivelog

List of Files Unknown to the Database
=====================================
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245536_8dp4sgtp_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245533_8dp4okwt_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245534_8dp4pvoq_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245531_8dp4lwm5_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245529_8dp4b2cy_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245532_8dp4n6sk_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245537_8dp4xztn_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245535_8dp4r4cd_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245530_8dp4fmdv_.arc

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245536_8dp4sgtp_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245533_8dp4okwt_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245534_8dp4pvoq_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245531_8dp4lwm5_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245529_8dp4b2cy_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245532_8dp4n6sk_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245537_8dp4xztn_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245535_8dp4r4cd_.arc
File Name: /cluster/home/oracle/archivelog/o1_mf_1_245530_8dp4fmdv_.arc

RMAN>


Alert.log output:

Fetching gap sequence in thread 1, gap sequence 245529-245537
Fri Dec 14 12:19:45 2012
Fetching gap sequence in thread 1, gap sequence 245529-245537
Fri Dec 14 12:19:56 2012
Fetching gap sequence in thread 1, gap sequence 245529-245537
Fri Dec 14 12:20:05 2012
RFS[1]: No standby redo logfiles available for thread 1
RFS[1]: Opened log for thread 1 sequence 245563 dbid -88056572 branch 766161863
Fri Dec 14 12:20:05 2012
Archived Log entry 44250 added for thread 1 sequence 245562 ID 0xfac03e01 dest 1:
Archived Log entry 44251 added for thread 1 sequence 245562 ID 0xfac03e01 dest 2:
Archived Log entry 44252 added for thread 1 sequence 245562 ID 0xfac03e01 dest 10:
Fri Dec 14 12:20:06 2012
Fetching gap sequence in thread 1, gap sequence 245529-245537
Fri Dec 14 12:20:16 2012
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245529_8dp4b2cy_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245530_8dp4fmdv_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245531_8dp4lwm5_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245532_8dp4n6sk_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245533_8dp4okwt_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245534_8dp4pvoq_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245535_8dp4r4cd_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245536_8dp4sgtp_.arc
Media Recovery Log /cluster/home/oracle/archivelog/o1_mf_1_245537_8dp4xztn_.arc
Media Recovery Log +ARCHIVELOGS/stbydb/archivelog/2012_12_14/thread_1_seq_245538.15434.802007853
Media Recovery Log +ARCHIVELOGS/stbydb/archivelog/2012_12_14/thread_1_seq_245539.1565

ASM cp Command to Remote Disk Group



The ASM cp command allows you to copy files between Oracle ASM disk groups and between a disk group and the operating system. The Oracle ASM disk groups can be local or remote.
Syntax
cp source_file [rem_connect_str:]target_file
cp [rem_connect_str:]source_file target_file
The rem_connect_str is in the form user@host[.port_number].SID, with a default port number of 1521. cp cannot copy files between two remote instances; the local Oracle ASM instance must be either the source or the target of the operation.
In Oracle training material, Oracle New Features for Administrators, the authors stated that "the format of the copied files is portable between Little-Endian and Big-Endian systems". The Oracle documentation (Oracle Automatic Storage Management Administrator's Guide 11g Release 2 (11.2), E18951-02) does not address the issue of copying between systems of different endianness. I tried to copy from Solaris 10 to Linux (see the example below) with 11gR2 ASM, but it failed with errors.
Example
Local is a Solaris 10 system with 11.2.0.2 ASM instance.
1) copying to another Solaris 10 — Success
ASMCMD>  cp thread_1_seq_4.5135.729122487 sys@sanfords.1521.+ASM1:+recovery/remove.me
Enter password: ******
copying +recovery/testdb2/archivelog/2010_09_07/thread_1_seq_4.5135.729122487 -> sanfords:+recovery/remove.me
2) Copying to Linux (same ASM version) — Failed
ASMCMD> cp thread_1_seq_4.5135.729122487 sys@xh0729.us.oracle.com.1521.+ASM2:+recoverydest/remove.me
Enter password: *******
copying +recovery/testdb2/archivelog/2010_09_07/thread_1_seq_4.5135.729122487 -> xh0729.us.oracle.com:+recoverydest/remove.me
Argument "normal" isn't numeric in numeric ge (>=) at /u01/app/oracle/product/11.2.0/grid_11202/lib/asmcmdshare.pm line 1676, line 10.
errors
ASMCMD-08016: copy source->'+recovery/testdb2/archivelog/2010_09_07/thread_1_seq_4.5135.729122487' and target->'+recoverydest/remove.me' failed
ORA-17627: ORA-12577: Message 12577 not found;  product=RDBMS; facility=ORA
ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 410
ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)

Friday 7 December 2012

ORA-01102: cannot mount database in EXCLUSIVE mode


Error happened when trying to open the database:

SQL> startup
ORACLE instance started.

Total System Global Area 267227136 bytes
Fixed Size 2212496 bytes
Variable Size 205524336 bytes
Database Buffers 54525952 bytes
Redo Buffers 4964352 bytes
ORA-01102: cannot mount database in EXCLUSIVE mode


In the alert.log:
sculkget: lock held by PID: 12359
ORA-09968: unable to lock file
Linux-x86_64 Error: 11: Resource temporarily unavailable
Additional information: 12359
ORA-1102 signalled during: ALTER DATABASE MOUNT...
At the OS level:

[oracle@localhost trace]$ ps -ef | grep pmon 

oracle 10222 1 0 10:27 ? 00:00:00 ora_pmon_ORCL
oracle 2522 1 0 Nov09 ? 00:00:06 ora_pmon_orcl
Use:

kill -9 10222 2522

Then
1- After that, set the ORACLE_SID again: export ORACLE_SID= (watch out, it is case sensitive)
2- sqlplus / as sysdba
3- startup;
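Once the stale processes are gone and the instance has been restarted, a quick sanity check like this (a sketch) confirms which instance is actually up:

-- Confirm which instance is now running and in what state
SELECT instance_name, host_name, status, startup_time
FROM   v$instance;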

Wednesday 5 December 2012



[sysdba@proddb:proddb-/cluster/home/oracle]$ sid

SIDs on this node:
  +ASM
  proddb

ORACLE_SID = [proddb] ?
The Oracle base remains unchanged with value /cluster/app/grid

ORACLE_SID=proddb
ORACLE_HOME=/cluster/app/grid/product/11.2.0/db_1

[sysdba@proddb:proddb-/cluster/home/oracle]$ adrci

ADRCI: Release 11.2.0.2.0 - Production on Wed Dec 5 10:26:08 2012

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/cluster/app/grid"
adrci> show home
ADR Homes:
diag/asm/+asm/+ASM
diag/clients/user_sysdba/host_2521857129_80
diag/rdbms/proddb/proddb
diag/tnslsnr/proddb/listener
adrci> set homepath diag/rdbms/proddb/proddb
adrci> show problem

ADR Home = /cluster/app/grid/diag/rdbms/proddb/proddb:
*************************************************************************
PROBLEM_ID           PROBLEM_KEY                                                 LAST_INCIDENT        LASTINC_TIME
-------------------- ----------------------------------------------------------- -------------------- ----------------------------------------
1                    ORA 603                                                     28041                2012-01-12 18:10:07.605000 +00:00
2                    ORA 600 [729]                                               53146                2012-04-26 23:16:44.468000 +01:00
3                    ORA 445                                                     72816                2012-10-27 04:37:58.085000 +01:00
6                    ORA 600 [17090]                                             77732                2012-11-26 22:13:57.472000 +00:00
5                    ORA 600                                                     77733                2012-11-26 22:14:46.972000 +00:00
7                    ORA 600 [ORA-00600: internal error code, arguments: [17090] 77734                2012-11-26 22:14:48.634000 +00:00
8                    ORA 600 [ORA-31671: Worker process DW00 had an unhandled ex 77668                2012-11-26 22:14:50.081000 +00:00
4                    ORA 7445 [skgpgetinfo()+162]                                84821                2012-12-04 17:00:03.936000 +00:00
8 rows fetched

adrci> show incident -p "problem_key='ORA 7445 [skgpgetinfo()+162]'"

ADR Home = /cluster/app/grid/diag/rdbms/proddb/proddb:
*************************************************************************
INCIDENT_ID          PROBLEM_KEY                                                 CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
80820                ORA 7445 [skgpgetinfo()+162]                                2012-11-20 08:00:05.351000 +00:00
84821                ORA 7445 [skgpgetinfo()+162]                                2012-12-04 17:00:03.936000 +00:00
2 rows fetched

adrci> ips pack incident 84821 in /tmp
Generated package 1 in file /tmp/ORA7445sk_20121205104140_COM_1.zip, mode complete
adrci>


If at this step the ADRCI utility complains that the incident was flood-controlled and that no package can be generated for it, then instead of choosing the most recent incident to be packaged, choose the first incident that occurred after an instance startup.

Examples of IPS PACK include:


ips pack problem 100 in /tmp
--Generates the package for the problem id 100 in /tmp

ips pack incident 6439 in /tmp
--Generates the package for the incident id 6439 in /tmp

ips pack problemkey "ORA 1578"
--Generates the package for the problem with the problem_key 'ORA 1578'

ips pack seconds 8
--Generates the package with the incidents occurred in last 8 seconds.

ips pack time '2007-05-01 10:00:00.00' to '2007-05-01 23:00:00.00'
--Generates the package with the incidents occurred between the times '2007-05-01 10:00:00.00' and '2007-05-01 23:00:00.00'


Upload the generated ZIP file to the Service Request you created on the My Oracle Support website. This file will include all the trace files, the instance alert log and other diagnostic information for the critical error. Make sure that the ZIP file is uploaded WITHOUT any post-processing (such as tar).
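For reference, the ADR locations that adrci works with can also be cross-checked from SQL*Plus (a sketch using the v$diag_info view available in 11g):

-- ADR base, ADR home and trace/alert locations as seen from inside the database
SELECT name, value
FROM   v$diag_info
WHERE  name IN ('ADR Base', 'ADR Home', 'Diag Trace', 'Diag Alert');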

Video Demo:

Database 11g: Quick Steps to Package and Send Critical Error Diagnostic Information to Support [Video] [ID 443529.1]


NOTE:422893.1 - 11g Understanding Automatic Diagnostic Repository.
NOTE:738732.1 - ADR Different Methods to Create IPS Package
NOTE:778.1 - Multimedia Content Reference