srvctl start database fails with ORA-03113, ORA-27300, ORA-27301, ORA-27302, ORA-27303

Yesterday I was trying to start the database using SRVCTL on one of our new environments. I was able to start the individual instances using SQL*Plus, but SRVCTL was giving trouble: it could start one of the two instances, but not both, failing with the following error:


[oracle@uat-srv-fin2 dbs] Ora:TESTDB $ srvctl start database -d TESTDB
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/uat_srv-fin1/agent/crsd/oraagent_oracle//oraagent_oracle.log".
CRS-2674: Start of 'ora.testdb.db' on 'uat-srv-fin1' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy.

After checking all RAC-related init parameters and making sure everything was in order there, I had a look at the DB alert log to investigate further, and found something useful:


Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_dia0_28019_base_1.trc:
ORA-27506: IPC error connecting to a port
ORA-27300: OS system dependent operation:proto mismatch failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcon
ORA-27303: additional information: Protocol of this IPC does not match remote (192.168.52.42). SKGXP IPC libraries must be the same version. [local: RDS,remote: UDP]
Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_lmon_28020.trc:
ORA-27550: Target ID protocol check failed. tid vers=1, type=1, remote instance number=2, local instance number=1
LMON (ospid: 28020): terminating the instance due to error 481
System state dump requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_diag_27999_20160815112931.trc
Dumping diagnostic data in directory=[cdmp_20160815112932], requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
Instance terminated by LMON, pid = 28020

So the messages were pointing to a mismatch in how the Oracle binaries were linked for the interconnect: RDS versus UDP.

As per the Metalink note: ORA-27303: SKGXP IPC libraries must be the same version. [local: RDS,remote: UDP] on Exadata (Doc ID 1574772.1)

The RAC instances could not communicate with each other: the IPC error was due to a mismatch in the cluster interconnect protocol in use on the two nodes. Node 2 was using UDP while node 1 was using RDS, and because of this mismatch the second instance could not come up. Exadata does not support mixing the UDP and RDS protocols across nodes.
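Before relinking anything, it is worth confirming which protocol each node's binaries are actually linked with. On 11.2 the skgxpinfo utility ships with the database home and reports exactly this; a quick check, run as the oracle user on each node, looks like:

$ $ORACLE_HOME/bin/skgxpinfo
udp

The two nodes should report different values here, matching the [local: RDS, remote: UDP] detail in the alert log above.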

As a solution, I performed the following steps to resolve the issue:

1. Shut down all instances running from the ORACLE_HOME that was linked with the UDP protocol, then relink the binaries with RDS. Execute as the oracle user:
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle
2. Start the instances using SRVCTL (see the sanity check below).
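As a sanity check before step 2 (not part of the original MOS steps), skgxpinfo can be run again on the relinked home; both nodes should now report the same protocol:

$ $ORACLE_HOME/bin/skgxpinfo
rds

The alert log also records the protocol at instance startup, with a line similar to "cluster interconnect IPC version: Oracle RDS/IP (generic)". And if you ever need to relink back to UDP, the corresponding make target is ipc_g (the generic protocol), used the same way as ipc_rds above.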

Now I was able to start the database using SRVCTL without any issues:


[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl start database -d TESTDB
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl status database -d TESTDB
Instance TESTDB1 is running on node uat_srv-fin1
Instance TESTDB2 is running on node uat_srv-fin2

Hope you will find this post useful :-)

Cheers

Regards,

Adityanath

 


ORA-27054: NFS file system where the file is created or resides is not mounted with correct options

For the last two days, I had been getting the “ORA-27054: NFS file system not mounted with correct options” error while running RMAN backups to a Sun ZFS Storage Appliance.

The error was observed particularly while taking controlfile backups; RMAN datafile or archivelog backups that did not include the controlfile ran fine.


RMAN-08132: WARNING: cannot update recovery area reclaimable file list
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch01 channel at 02/15/2016 16:02:44
ORA-01580: error creating control backup file /mnt/bkp1/TEST1/TEST1_CTRL_snapcf.ctl
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 3
Additional information: 1

After searching on MOS, I found the following note, which explains the mount options recommended for configuring the NFS mounts:

Sun ZFS Storage Appliance: Oracle Database 11g R2 NFS Mount Point Recommendations (Doc ID 1567137.1)

After discussing this with the Solaris team, we found that the backup mount point was already mounted with all the recommended parameters.
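For reference, this is roughly how the live options can be compared against the note; the appliance and export names below are illustrative, and the option list is the typical recommendation quoted from memory, so verify the exact values in Doc ID 1567137.1 for your platform:

$ mount | grep /mnt/bkp1
zfs-appliance:/export/bkp1 on /mnt/bkp1 type nfs (rw,bg,hard,nointr,rsize=1048576,wsize=1048576,tcp,vers=3,timeo=600,actimeo=0)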

So I tried to search for more documents on MOS and found this one:

Oracle ZFS Storage: FAQ: Exadata RMAN Backup with The Oracle ZFS Storage Appliance (Doc ID 1354980.1)

As per the note: “DNFS is strongly recommended when protecting an Exadata with the ZFS Storage Appliance. It is required to achieve the published backup and restore rates.”

You can confirm whether dNFS is enabled by running the following query on the database:

select * from v$dnfs_servers;

In my case it was not enabled (the query returned no rows). dNFS has to be enabled on each database node with the following command:

$ make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

dcli may be used to enable Oracle Direct NFS on all of the database nodes simultaneously:

$ dcli -l oracle -g /home/oracle/dbs_group make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

The database instance should be restarted after enabling Oracle Direct NFS.
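Once the instance is back up, you can verify that the dNFS ODM library was actually picked up: the alert log should show an ODM banner near the top of the startup messages (the exact version string varies by release), and v$dnfs_servers should return a row per NFS server once the database touches the mount:

Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0

SQL> select svrname, dirname from v$dnfs_servers;

Here svrname and dirname would be your ZFS appliance and its export path.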

After enabling dNFS, my backups started running without any issues.

Done!

Hope you will find this post useful 🙂

Cheers

Regards,

Adityanath

Start of clusterware fails with CLSU-00100, CLSU-00101, CLSU-00103, CLSU-00104

A few days back, I was working on an issue where Clusterware was not coming up on a UAT environment. The ora.crsd daemon was OFFLINE.

I tried to start the ora.crsd daemon using the command “crsctl start res ora.crsd -init”, but it failed with the following errors:

[Screenshot: crsctl output showing CLSU-00100, CLSU-00101 (Disc quota exceeded), CLSU-00103 and CLSU-00104 errors]

I tried searching for the error on Metalink and Google, but didn’t get anything useful.

As per the error, it had something to do with disk quota: CLSU-00101: Operating System error message: Disc quota exceeded.

After checking on the server, I found that /u01 (the Oracle/Grid binary location) was 100% full.
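For reference, a minimal sketch of how the usage can be checked and space reclaimed; the ADR home below is illustrative, and the purge age should follow your own retention policy:

$ df -h /u01
$ du -sm /u01/app/* | sort -n
$ adrci exec="set home diag/rdbms/testdb/TESTDB1; purge -age 10080"

(df -h shows the full filesystem, du -sm points at the biggest consumers, typically old trace and log files, and the adrci purge removes ADR contents older than 10080 minutes, i.e. 7 days.)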

After clearing space on /u01, I was able to start ora.crsd without any issues.

Hope you will find this post useful 🙂

Cheers

Regards,

Adityanath