Tuesday 19 February 2013

Troubleshooting ORA-03135: connection lost contact issue

Environment: Oracle database 10.2.0.3 RAC 64bit, websphere application server on RHEL4.5 use JDBC driver,Juniper firewalls (failover)
Problem: application server got error "java.sql.SQLException: ORA-03135: connection lost contact" and connection timeout from database
Objective: to fix the connection lost issue

Steps:
1. search the error message to find out the root cause in Oracle support and google:

Troubleshooting ORA-3135 Connection Lost Contact [ID 787354.1]
Resolving Problems with Connection Idle Timeout With Firewall [ID 257650.1]
Achive Log shipments failing in environment with Juniper firewall [ID 1075432.1]
Logs are not shipped to the physical standby database [ID 1130523.1]
ORA-03135: connetion lost contact while shipping from Primary server to standby server [ID 739522.1]

According to Oracle doc ID 787354.1, this ORA-3135 might be caused by one of the following:

(1) Transparent Application Failover (TAF)
(2) Idle Connection Timeout
(3) Firewall Manipulating  Sqlnet Data.

Otherwise, it might need further analysis.
Also, according to ORA-3135 connection lost contact tips: http://www.dba-oracle.com/t_ora_03135_connection_lost_contact.htm, it might be due to customized profile.

sqlplus / as sysdba
sql> set line 32000
sql> select profile, resource_name, limit
from dba_profiles where resource_name IN ('IDLE_TIME', 'CONNECT_TIME');

note: some commands:
sql> create profile myprof limit connect_time 30;


From application side, to avoid this idle connection timeout, refer to -   Putting a firewall between your AppServer and DBMS -  http://www.websphere-world.com/modules.php?name=News&file=article&sid=522to

2. check firewall idle connection timeout and ALG settings
Checked Juniper firewall SQL ALG setting and idle connection timeout

# check alg
# check service ssh
# check service SQL*Net V2         

And some articles:
a. What is Idle timeouts when any service set - J-Net Community - http://forums.juniper.net/t5/ScreenOS-Firewalls-NOT-SRX/What-is-Idle-timeouts-when-Any-service-set/td-p/163

b. Viewing list of ALGs and disabling an ALG differs on screenos versions - http://kb.juniper.net/InfoCenter/index?page=content&id=KB13509
c. Issues with SQL database applications when traffic traverses SRX - http://kb.juniper.net/InfoCenter/index?page=content&id=KB21550
d. Increasing the session idle timeout of a particular service - http://kb.juniper.net/InfoCenter/index?page=content&id=KB4652&actp=LIST
e. How do I verify the NCP settings? (for SQL ALG) - http://kb.juniper.net/InfoCenter/index?page=content&id=KB9283
f. [SRX]what conditions are required for SQL ALG? - http://kb.juniper.net/InfoCenter/index?page=content&id=KB22418&cat=JUNOS&actp=LIST
g. Users losing connections when the IVE Active/Passive Cluster Fails over - http://kb.juniper.net/InfoCenter/index?page=content&id=KB8539&cat=SSL_VPN&actp=LIST

3. We have disabled SQL ALG and application server seemed busy at the time of connection lost.
Our monitoring shows Firewall actually failed over to another one at the time of connection lost and switched back, so the application server got "ORA-03135: connection lost contact" error twice and need to restart application.

Wednesday 13 February 2013

Listener won't register database

First method - Try with starting/stopping listener and register database manually.

lsnrctl start
lsnrctl stop
lsnrctl start

sql > alter system register;

if this is not work,

Second method - get the listener information and set it to LOCAL_LISTENER parameter.

[grid@v-demo:-/home/grid]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 13-JUN-2013 15:36:13

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=v-demo.abcd.co.uk)(PORT=1522)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                13-JUN-2013 15:29:01
Uptime                    0 days 0 hr. 7 min. 12 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /demo/demo1/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /demo/demo1/app/grid/diag/tnslsnr/v-demo/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=v-demo.abcd.co.uk)(PORT=1522)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1522)))
The listener supports no services
The command completed successfully



SYS@live11 >> alter system set local_listener='(ADDRESS=(PROTOCOL=tcp)(HOST=v-demo.abcd.co.uk)(PORT=1522))';

System altered.



[grid@v-demo:-/home/grid]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 13-JUN-2013 15:37:17

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=v-demo.abcd.co.uk)(PORT=1522)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                13-JUN-2013 15:29:01
Uptime                    0 days 0 hr. 8 min. 16 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /demo/demo1/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /demo/demo1/app/grid/diag/tnslsnr/v-demo/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=v-demo.abcd.co.uk)(PORT=1522)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1522)))
Services Summary...
Service "live11" has 1 instance(s).
  Instance "live11", status READY, has 1 handler(s) for this service...
Service "hrlive11XDB" has 1 instance(s).
  Instance "live11", status READY, has 1 handler(s) for this service...
The command completed successfully


Job done!


Monday 4 February 2013

ORA-12547: TNS:lost contact

Very simple issue but took some amount of time in troubleshooting so thought about posting it here. May be it proves to be useful for someone.
Scenario was: Oracle is installed from “oracle” user and all runs well. There is a new OS user “test1″ that also needs to use sqlplus. So granted the necessary permissions on ORACLE_HOME to test1. Tried to connect sqlplus scott/tiger@DB and yes it works. But while trying sqlplus scott/tiger it throws:
1
2
3
4
5
6
7
8
9
10
11
12
$ sqlplus scott/tiger
SQL*Plus: Release 10.2.0.5.0 - Production on Wed May 18 09:32:35 2011
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
ERROR:
ORA-12547: TNS:lost contact
Enter user-name: ^C
$
Did a lot of troubleshooting including checking tnsnames.ora, sqlnet.ora, listener.ora and so on. Nothing was hitting my mind so finally raised an SR. And it has to do with the permissions of the $ORACLE_HOME/bin/oracle binary. The permissions of oracle executable should be rwsr-s–x or 6751 but they were not. See below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
$ id
uid=241(test1) gid=202(users) groups=1(staff),13(dba)
$
$ cd $ORACLE_HOME/bin
$ ls -ltr oracle
-rwxr-xr-x    1 oracle   dba       136803483 Mar 16 20:32 oracle
$
$ chmod 6751 oracle
$ ls -ltr oracle
-rwsr-s--x    1 oracle   dba       136803483 Mar 16 20:32 oracle
$
$ sqlplus scott/tiger
SQL*Plus: Release 10.2.0.5.0 - Production on Wed May 18 10:23:27 2011
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> show user
USER is "SCOTT"
SQL>