Yesterday I was trying to start database services using SRVCTL in one of our new environments. I could start the individual database instances using sqlplus, but SRVCTL was giving issues: it could start one of the two instances, but not both, and failed with the following error:
[oracle@uat-srv-fin2 dbs] Ora:TESTDB $ srvctl start database -d TESTDB
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/18.104.22.168/grid/log/uat_srv-fin1/agent/crsd/oraagent_oracle//oraagent_oracle.log".
CRS-2674: Start of 'ora.testdb.db' on 'uat-srv-fin1' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy.
After checking all the RAC-related init parameters and making sure they were correct, I looked at the database alert log to investigate further, and found something useful:
Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_dia0_28019_base_1.trc:
ORA-27506: IPC error connecting to a port
ORA-27300: OS system dependent operation:proto mismatch failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcon
ORA-27303: additional information: Protocol of this IPC does not match remote (192.168.52.42). SKGXP IPC libraries must be the same version. [local: RDS,remote: UDP]
Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_lmon_28020.trc:
ORA-27550: Target ID protocol check failed. tid vers=1, type=1, remote instance number=2, local instance number=1
LMON (ospid: 28020): terminating the instance due to error 481
System state dump requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_diag_27999_20160815112931.trc
Dumping diagnostic data in directory=[cdmp_20160815112932], requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
Instance terminated by LMON, pid = 28020
So the messages were pointing to how the Oracle binaries were linked for IPC: RDS on one side, UDP on the other.
The RAC instances could not communicate with each other. The IPC error is caused by a mismatch in the cluster interconnect protocol used by the two nodes: node 1 was using RDS while node 2 was still using UDP. Because of this mismatch, only one of the two instances could stay up at a time. Exadata does not support mixing UDP and RDS.
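To spot this kind of mismatch quickly, the protocol pair can be pulled straight out of the ORA-27303 message. The snippet below is only a sketch: `ipc_mismatch` is a hypothetical helper name, and the alert log path in the usage comment is an assumption based on the trace directory shown above.

```shell
# Hypothetical helper: read alert log text on standard input and print the
# local/remote IPC protocols from any "[local: X,remote: Y]" marker found.
ipc_mismatch() {
  sed -n 's/.*\[local: *\([A-Za-z]*\), *remote: *\([A-Za-z]*\)].*/local=\1 remote=\2/p'
}

# Usage (path is an assumption, adjust to your environment):
# ipc_mismatch < /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/alert_TESTDB1.log
```

Lines without the marker produce no output, so an empty result simply means no protocol mismatch was logged.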
To resolve the issue, I relinked the Oracle binary with RDS in the ORACLE_HOME that was still using UDP:
1. Shut down all the instances running from the ORACLE_HOME that was configured for the instance using UDP.
2. As the oracle user, relink the binary with RDS:
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle
3. Start the instance using srvctl.
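After relinking, it is worth confirming which protocol the home is now linked with before starting anything. As far as I know, recent Oracle homes (11.2 and later) ship a small `skgxpinfo` utility under `$ORACLE_HOME/bin` that prints `rds` or `udp`; treat that as an assumption to verify on your own version. The wrapper name `linked_ipc_protocol` below is hypothetical.

```shell
# Hypothetical wrapper: print the IPC protocol an Oracle home is linked with.
# Takes the home as an optional argument, defaulting to $ORACLE_HOME.
linked_ipc_protocol() {
  home="${1:-$ORACLE_HOME}"
  if [ -x "$home/bin/skgxpinfo" ]; then
    "$home/bin/skgxpinfo"
  else
    # Utility missing (older release or wrong home): report and fail.
    echo "skgxpinfo not found under $home/bin" >&2
    return 1
  fi
}

# Usage: run on every node and compare the results.
# linked_ipc_protocol "$ORACLE_HOME"
```

If the nodes report different values, the homes are still linked differently and the instances will hit the same IPC error again.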
After that, I was able to start the database using SRVCTL without any issues:
[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl start database -d TESTDB
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl status database -d TESTDB
Instance TESTDB1 is running on node uat_srv-fin1
Instance TESTDB2 is running on node uat_srv-fin2
I hope you find this post useful :-)