Friday, 7 February 2014

Troubleshooting 11.2 Clusterware Node Evictions

Troubleshooting 11.2 Clusterware Node Evictions (Note 1050693.1)

Starting 11.2.0.2, a node eviction may not actually reboot the machine.  This is called a rebootless restart.

To identify which process initiates a reboot, you need to review below are important files

  • Clusterware alert log in /log/alertnodename
  • The cssdagent log(s) in /log//agent/ohasd/oracssdagent_root
  • The cssdmonitor log(s) in /log//agent/ohasd/oracssdmonitor_root
  • The ocssd log(s) in /log//cssd
  • The lastgasp log(s) in /etc/oracle/lastgasp or /var/opt/oracle/lastgasp
  • IPD/OS or OS Watcher data.  IPD/OS is an old name for the Cluster Health Monitor.  The names can be used interchaneably although Oracle now calls the tool Cluster Health Monitor
  • 'opatch lsinventory -detail' output for the GRID home
  • Message files /var/log/message
Common Causes of eviction:

OCSSD Eviction: 1) Network failure or latencies issue between nodes.  It takes 30 consecutive missed checkins to cause a node eviction.  2)  Problem writing / reading the voting disk  3) A member kill escallation like the LMON process may request CSS to remove an instance from the cluster via the instance eviction mechanisim.  If this times out, it could escalate to a node evict.

CSSDAGENT or CSSDMONITOR Eviction:  1) OS Scheduler problem as a result of OS is locked upor execsive amounts of load on the server such as CPU utilization is as high as 100% 2) CSS process is hung 3) Oracle bug

Tuesday, 26 November 2013

Intra-Block(In Exadata Cell) versus Inter-Block row chaining


Oracle database can store only 255 columns in a single row-piece. Anything beyond 255 columns will be stored as a separate row, chained to the main one. This kind of chaining is different from what we know as migrated/chained rows which happen either because of lack of space in current block which causes rows to migrate when updated leaving behind their new address or due to row size more than the block size itself. Both these types of chaining are inter-block row chaining.

intra-block row chaining (as i mentioned above happens due to number of columns exceeding 255) does not cause as much a performance issue as inter-block row chaining. This is because all the row contents could still be found within the same block. When there is a full table scan, DB can easily do scattered reads over this table with number of blocks spanning up to the value for parameter db_file_multiblock_read_count. For scanning rows which are chained to another block, as is the case with migrated/chained rows, DB has to do sequential reads with one block read at a time. This summarily degrades query performance.

Unlike Inter-block row chaining, Intra-block row chaining cannot be detected using ANALYZE TABLE LIST CHAINED ROWS command. However, there is a way to see to what extent the rows are chained. Look out for statistic " table fech continued row". As you access the table with 255+ columns, this value will keep increasing.

 select name, value from v$sysstat where name like 'table fetch %';


Note that if all the columns beyond 255 columns have NULL values, the row will not be chained.

Thursday, 19 September 2013

Recover Dropped view's Defination


Unfortunately One of our developer had dropped one important view from the schema. And we were not taking datapump backup for that schemas. Yes we are taking the RMAN full backup but don't want to go that route if I could find easily.

Here was the solution ( assuming database have all that configuration required for "AS OF TIMESTAMP - FLASHBACK" clause )

select * from ( select view_name , text from dba_views where owner='MGR_KX' )
as of timestamp sysdate - 1 ;

and we got required result with view definition.

Cheers!!

Tuesday, 6 August 2013

Duplicate From Active Database Errors : ORA-17629 and ORA-17627: ORA-12154: Tns:Could Not Resolve

The duplicate database from active failed with the following errors : 

ORA-17629: Cannot connect to the remote database server
ORA-17627: ORA-12154: TNS:could not resolve the connect identifier specified
ORA-17629: Cannot connect to the remote database server


Cause 1:
For Active Duplication you need to connect to the target and Auxiliary DB using net service name ,even if you are running RMAN at the Auxiliary DB.
The net service name to connect to the Auxiliary DB should also be configured at the target Database.

In the tnsnames.ora on the machine of the TARGET database, there are 2 services created to connect to auxiliary database that have the same "service name" but "different" configuration.

Possible Cause 2:
Mismatch of Parameters DEFAULT_DOMAIN and DB_DOMAIN

db_domain=PROD.world         # init.ora
default_domain=PROD.WORLD    # sqlnet.ora

# DEFAULT_DOMAIN defined on sqlnet.ora is 'case' sensitive and when defined must match
   DB_DOMAIN initialization parameter

Solution:

1. To be sure that the tnsnames.ora file of each machine (machine where is the source database and machine where is the auxiliary database) has the service name to connect to auxiliary database.

2. Verify that there are not duplicate service_name with different configuration in the tnsnames.ora file.

Monday, 5 August 2013

ORA-12528: TNS:listener: all appropriate instances are blocking new connections

If you try to duplicate database using rman and get following error:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00554: initialization of internal recovery manager package failed
RMAN-04006: error from auxiliary database: ORA-12528: TNS:listener: all appropriate instances are blocking new connections

Most probably your auxiliary database (in nomount state) is the only instance works on current ORACLE_HOME. Each instance is registered in listener by PMON process, which starts only when database is at least mounted. If there is nothing registered in listener before, instance in blocked.

You can do 2 things:
1. Modify listener.ora like this:

SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = AUXDB)
(ORACLE_HOME = /app/oracle/product/10.2.0/)
(SID_NAME = AUXDB)
)
)
OR
2. Modify tnsnames.ora (10g only!) :

AUXDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = AUXDB)(UR=A)
)
)

Thursday, 25 July 2013

Complete checklist for out-of-place manual upgrade from previous 11.2.0.N version to the latest 11.2.0.N patchset




Tuesday, 23 July 2013

Oracle GoldenGate database Schema Profile check script for Oracle DB

Oracle GoldenGate database Schema Profile check script for Oracle DB (Doc ID 1296168.1)


PURPOSE

This script is intended to query the database by schema to identify current configuration and identify any unsupported data types or types that may need special considerations for Oracle GoldenGate in an Oracle environment. This is the Oracle database profile script.  Added check for deferred constraints. Deferred constraints may cause ADD TRANDATA to select the wrong column for logging. Use KEYCOLS for tables with deferred constraints.

REQUIREMENTS

This script is to be run in sqlplus by a user that has DBA privliges.

CONFIGURING

Download or Save script file as full-schemaCheckOracle_12072011.sql

INSTRUCTIONS

Log into sqlplus as sysdba
run the script -
SQL> @full-schemaCheckOracle_12072011.sql
Review output.