Troubleshooting 11.2 Clusterware Node Evictions (Note 1050693.1)
Starting with 11.2.0.2, a node eviction may not actually reboot the machine; Clusterware may instead restart only its own stack on the evicted node. This is called a rebootless restart.
OCSSD Eviction:
1) Network failure or latency between nodes. It takes 30 consecutive missed checkins to cause a node eviction.
2) Problems writing to or reading from the voting disk.
3) A member kill escalation. For example, the LMON process may ask CSS to remove an instance from the cluster through the instance eviction mechanism. If this request times out, it can escalate to a node eviction.
CSSDAGENT or CSSDMONITOR Eviction:
1) An OS scheduler problem, for example because the OS is locked up or the server is under excessive load (such as CPU utilization as high as 100%).
2) The CSS process is hung.
3) An Oracle bug.
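The "30 consecutive missed checkins" rule listed under OCSSD Eviction can be illustrated with a small counter. This is only a toy model of the rule as stated here, not how ocssd is implemented. The function name `eviction_point` and the boolean checkin stream are hypothetical, and the interval between checkins is left out because the text does not give it.

```python
from typing import Iterable, Optional

# From the text: 30 consecutive missed checkins cause a node eviction.
MISSED_CHECKIN_LIMIT = 30


def eviction_point(checkins: Iterable[bool],
                   limit: int = MISSED_CHECKIN_LIMIT) -> Optional[int]:
    """Return the 0-based index of the checkin at which the limit is reached.

    `checkins` is a hypothetical stream: True means the checkin arrived,
    False means it was missed. Returns None if the limit is never reached.
    """
    consecutive_missed = 0
    for index, received in enumerate(checkins):
        if received:
            # Any successful checkin resets the run of misses.
            consecutive_missed = 0
        else:
            consecutive_missed += 1
            if consecutive_missed >= limit:
                return index
    return None
```

The point of the sketch is that the misses must be consecutive: 29 misses, one successful checkin, and 29 more misses never reach the limit.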
To identify which process initiated a reboot, review the following files:
- Clusterware alert log in /log/ alertnodename
- The cssdagent log(s) in /log/ /agent/ohasd/oracssdagent_root
- The cssdmonitor log(s) in /log/ /agent/ohasd/oracssdmonitor_root
- The ocssd log(s) in /log/ /cssd
- The lastgasp log(s) in /etc/oracle/lastgasp or /var/opt/oracle/lastgasp
- IPD/OS or OS Watcher data. IPD/OS is the old name for Cluster Health Monitor. The names can be used interchangeably, although Oracle now calls the tool Cluster Health Monitor.
- 'opatch lsinventory -detail' output for the GRID home
- Messages file: /var/log/messages
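When gathering the files listed above, it can help to see which ones exist on a node and which were written most recently before the reboot. The sketch below is one possible way to do that. The helper name `inventory` is hypothetical. Only the two lastgasp directories are hard-coded, because they are the locations the list spells out in full; every other path has to be passed in by the caller for their own environment.

```python
import os
from typing import Iterable, List, Optional, Tuple

# The two lastgasp locations given in full in the list above.
LASTGASP_DIRS = ["/etc/oracle/lastgasp", "/var/opt/oracle/lastgasp"]


def inventory(paths: Iterable[str]) -> List[Tuple[str, Optional[float]]]:
    """Return (path, mtime) pairs, most recently modified first.

    Paths that do not exist or cannot be read are listed last with None.
    """
    found = []
    missing = []
    for path in paths:
        try:
            found.append((path, os.path.getmtime(path)))
        except OSError:
            missing.append((path, None))
    # Newest first, so the last files written before the reboot lead the list.
    found.sort(key=lambda item: item[1], reverse=True)
    return found + missing
```

A call such as `inventory(LASTGASP_DIRS + extra_paths)`, where `extra_paths` is a caller-supplied list of the alert, agent, and ocssd log locations, gives a quick picture of what is available before the files are reviewed by hand.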