./roothas.sh -postpatch OR root.sh failing with CLSRSC-400: A system reboot is required to continue installing.

Recently I was doing a fresh Grid Infrastructure (GI) 12.2 install on one of our UAT boxes, where I ran into a strange issue.

Both “root.sh” and “./roothas.sh -postpatch” were exiting with the below error/warning:

CLSRSC-400: A system reboot is required to continue installing.


test-server01:/u01/app/12.2.0.1/grid/bin # cd $ORACLE_HOME/crs/install
test-server01:/u01/app/12.2.0.1/grid/crs/install # ./roothas.sh -postpatch
Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-22-25PM.log
2019/05/27 14:22:30 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'
2019/05/27 14:23:18 CLSRSC-400: A system reboot is required to continue installing.

The instruction given by the above warning is simple: reboot the machine and retry. I asked the server admin to reboot the machine, but the subsequent rerun of the command failed with the same error.


test-server01:/u01/app/12.2.0.1/grid/crs/install # ./roothas.sh -postpatch
Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-37-02PM.log
2019/05/27 14:37:07 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'
2019/05/27 14:37:52 CLSRSC-400: A system reboot is required to continue installing.

I checked the associated log file for more details: /u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-37-02PM.log


> ACFS-9428: Failed to load ADVM/ACFS drivers. A system reboot is recommended.
> ACFS-9310: ADVM/ACFS installation failed.
> ACFS-9178: Return code = USM_REBOOT_RECOMMENDED
2019-05-27 14:37:41: ACFS drivers cannot be installed, and reboot may resolve this
2019-05-27 14:37:52: Command output:
> CLSRSC-400: A system reboot is required to continue installing.
>End Command output
2019-05-27 14:37:52: CLSRSC-400: A system reboot is required to continue installing.

So this was definitely due to an issue with the ACFS drivers.
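Before changing anything, the driver state can be checked directly with the acfsdriverstate utility under the same grid home (paths are from this install; run as root or as the grid owner):

/u01/app/12.2.0.1/grid/bin/acfsdriverstate supported    # is ADVM/ACFS supported on the running kernel?
/u01/app/12.2.0.1/grid/bin/acfsdriverstate installed    # are the drivers installed?
/u01/app/12.2.0.1/grid/bin/acfsdriverstate loaded       # are the drivers currently loaded?

If the running kernel itself is not supported, a reboot alone will not change the outcome.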

I found the below MOS documents related to the issue, but none of them exactly matched my situation or operating system.

While Manually Installing a Patch ‘rootcrs.sh -patch’ Fails with – CLSRSC-400: A system reboot is required to continue installing. (Doc ID 2360097.1)

ALERT: root.sh Fails With “CLSRSC-400” While Installing GI 12.2.0.1 on RHEL or OL with RedHat Compatible Kernel (RHCK) 7.3 (Doc ID 2284463.1)

In our environment we don’t use the ACFS file system, so this was not a real problem in my case, and we can always install the ACFS drivers explicitly if we need them in the future.
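For reference, if ACFS is ever needed later, the drivers can be installed on demand with acfsroot from the same grid home; a minimal sketch, run as root (I did not need this here):

/u01/app/12.2.0.1/grid/bin/acfsroot version    # driver version this home would install
/u01/app/12.2.0.1/grid/bin/acfsroot install    # install the ADVM/ACFS drivers for the running kernel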

After reading all the logs, I found that /u01/app/12.2.0.1/grid/lib/acfstoolsdriver.sh was being called for the ACFS driver installation.
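Since this wrapper script is part of the grid home, it is worth keeping an untouched copy before editing it; a minimal sketch (path from this install):

cp -p /u01/app/12.2.0.1/grid/lib/acfstoolsdriver.sh /u01/app/12.2.0.1/grid/lib/acfstoolsdriver.sh.orig    # keep the untouched copy
vi /u01/app/12.2.0.1/grid/lib/acfstoolsdriver.sh                                                          # apply the change shown below

The original can be restored from the .orig copy once the installation is complete.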

I changed the following code in that script from

# Now run command with all arguments!
exec ${RUNTHIS} $@

to

# Now run command with all arguments!
#exec ${RUNTHIS} $@
exit 0

After changing the above code, “./roothas.sh -postpatch” completed without errors or warnings, and I was able to complete the GI installation successfully.
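To confirm the stack was healthy afterwards, a quick sanity check with standard crsctl commands (this is an Oracle Restart, i.e. single-node, configuration):

crsctl check has     # Oracle High Availability Services should report online
crsctl stat res -t   # configured resources and their current state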

Note: This workaround is only applicable when ACFS is not being used in the environment, so implement it with the forewarning that there is an implied risk which one must accept 😉

Hope you will find this post useful.

Cheers

Regards,
Adityanath



srvctl start database fails with ORA-03113, ORA-27300, ORA-27301, ORA-27302, ORA-27303.

Yesterday I was trying to start database services using SRVCTL on one of our new environments. I was able to start the individual database instances using SQL*Plus, but SRVCTL was giving trouble: I could start one of the two instances with it, but not both. It failed with the following error:


[oracle@uat-srv-fin2 dbs] Ora:TESTDB $ srvctl start database -d TESTDB
PRCR-1079 : Failed to start resource ora.testdb.db
CRS-5017: The resource action "ora.testdb.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0.4/grid/log/uat_srv-fin1/agent/crsd/oraagent_oracle//oraagent_oracle.log".
CRS-2674: Start of 'ora.testdb.db' on 'uat-srv-fin1' failed
CRS-2632: There are no more servers to try to place resource 'ora.testdb.db' on that would satisfy its placement policy.

After checking all the RAC-related init parameters and making sure everything was right there, I had a look at the DB alert log to investigate further, and I found something useful:


Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_dia0_28019_base_1.trc:
ORA-27506: IPC error connecting to a port
ORA-27300: OS system dependent operation:proto mismatch failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcon
ORA-27303: additional information: Protocol of this IPC does not match remote (192.168.52.42). SKGXP IPC libraries must be the same version. [local: RDS,remote: UDP]
Errors in file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_lmon_28020.trc:
ORA-27550: Target ID protocol check failed. tid vers=1, type=1, remote instance number=2, local instance number=1
LMON (ospid: 28020): terminating the instance due to error 481
System state dump requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/testdb/TESTDB1/trace/TESTDB1_diag_27999_20160815112931.trc
Dumping diagnostic data in directory=[cdmp_20160815112932], requested by (instance=1, osid=28020 (LMON)), summary=[abnormal instance termination].
Instance terminated by LMON, pid = 28020

So the messages were pointing to the RDS/UDP linking of the Oracle binaries.

As per MOS note ORA-27303: SKGXP IPC libraries must be the same version. [local: RDS,remote: UDP] on Exadata (Doc ID 1574772.1):

The RAC instances could not communicate with each other. The IPC error was due to a mismatch in the cluster interconnect protocol in use on the two nodes: node 2 was using UDP while node 1 was using RDS, and because of this mismatch the instance on one node would terminate at startup. Exadata does not support mixing UDP and RDS.
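You can see which protocol a given home is linked with using the skgxpinfo utility shipped with the database home (available in 11.2 and later); a quick check, run as the oracle user on each node:

$ORACLE_HOME/bin/skgxpinfo    # prints the IPC protocol the oracle binary is linked with, e.g. 'rds' or 'udp'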

To resolve the issue, I performed the following steps (a consolidated sketch follows after the list):

1. Shut down all the instances running from the ORACLE_HOME that was linked with the UDP protocol.
– Execute as the oracle user:
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle
2. Start up the instance using srvctl.
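
Putting the two steps together, a minimal sketch of the full sequence; the database and instance names are taken from this environment, and I am assuming, per the alert-log message above, that TESTDB2 was the instance running from the UDP-linked home:

srvctl stop instance -d TESTDB -i TESTDB2    # stop whatever runs from the UDP-linked home
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ipc_rds ioracle         # relink the oracle binary against the RDS IPC library
srvctl start database -d TESTDB              # both instances should now start cleanly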

Now I was able to start the database using SRVCTL without any issues:


[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl start database -d TESTDB
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $
[oracle@uat-srv-fin2 ~] Ora:TESTDB $ srvctl status database -d TESTDB
Instance TESTDB1 is running on node uat_srv-fin1
Instance TESTDB2 is running on node uat_srv-fin2

Hope you will find this post useful :-)

Cheers

Regards,

Adityanath


ORA-27054: NFS file system where the file is created or resides is not mounted with correct options

For the last two days, I had been getting the “ORA-27054: NFS file system not mounted with correct options” error while running RMAN backups to a Sun ZFS Storage Appliance.

The error was observed particularly while taking controlfile backups; RMAN datafile and archivelog backups that did not include the controlfile were running fine.


RMAN-08132: WARNING: cannot update recovery area reclaimable file list
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch01 channel at 02/15/2016 16:02:44
ORA-01580: error creating control backup file /mnt/bkp1/TEST1/TEST1_CTRL_snapcf.ctl
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 3
Additional information: 1

After searching MOS, I found the following note, which explains the different mount options recommended for configuring the NFS mounts:

Sun ZFS Storage Appliance: Oracle Database 11g R2 NFS Mount Point Recommendations (Doc ID 1567137.1)

After discussing this with the Solaris team, we found that the backup mount point was already mounted with all the recommended parameters.
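For reference, the effective mount options can also be verified directly from the database host; a quick check, assuming a Linux host and the mount point from the RMAN error above:

mount | grep /mnt/bkp1    # options the share is currently mounted with
nfsstat -m                # per-mount NFS options as seen by the NFS client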

So I searched MOS for more documents and found this one:

Oracle ZFS Storage: FAQ: Exadata RMAN Backup with The Oracle ZFS Storage Appliance (Doc ID 1354980.1)

As per the note, “DNFS is strongly recommended when protecting an Exadata with the ZFS Storage Appliance. It is required to achieve the published backup and restore rates”.

You can confirm that DNFS is enabled by running the following query on the database:

select * from v$dnfs_servers;

In my case it was not enabled. DNFS can be enabled on each database node with the following command:

$ make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

dcli may be used to enable Oracle Direct NFS on all of the database nodes simultaneously:

$ dcli -l oracle -g /home/oracle/dbs_group make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

The database instance should be restarted after enabling Oracle Direct NFS.
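
Putting it together for a single node, a minimal sketch; the database name is a placeholder, and the bounce can equally be done from SQL*Plus:

srvctl stop database -d MYDB                           # MYDB is a placeholder database name
make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on    # relink so the Direct NFS client library is used
srvctl start database -d MYDB
sqlplus -s / as sysdba <<'EOF'
select svrname, dirname from v$dnfs_servers;
EOF

Note that v$dnfs_servers only shows rows once the instance has actually performed I/O over Direct NFS, so an empty result immediately after startup is not necessarily a problem.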

After enabling DNFS and bouncing the instances, my backups started running without any issues.

Done!

Hope you will find this post useful 🙂

Cheers

Regards,

Adityanath

Start of clusterware fails with – CLSU-00100, CLSU-00101, CLSU-00103, CLSU-00104

A few days back I was working on an issue where the Clusterware was not coming up on a UAT environment. The ora.crsd daemon was OFFLINE.

I tried to start the ora.crsd daemon using the command “crsctl start res ora.crsd -init”, but it failed with the following error:

[Screenshot: crsctl output showing errors CLSU-00100, CLSU-00101 (Operating System error message: Disc quota exceeded), CLSU-00103 and CLSU-00104]

I tried searching for the error on MOS and Google but didn’t find anything useful.

As per the error, it had something to do with the disk quota: CLSU-00101: Operating System error message: Disc quota exceeded.

After checking on the server, I found that /u01 (the Oracle/Grid binary location) was 100% full.
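A quick way to confirm this and to find what is consuming the space (standard Linux commands assumed; the usual suspects are audit files, trace files and listener logs):

df -h /u01                                 # confirm the file system is full
du -sh /u01/app/* 2>/dev/null | sort -h    # largest top-level consumers under /u01/app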

After clearing space on /u01, I was able to start ora.crsd without any issues.
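To confirm everything was back to normal, the init resources can be checked again with a standard crsctl command:

crsctl stat res -t -init    # ora.crsd should now report ONLINE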

Hope you will find this post useful 🙂

Cheers

Regards,

Adityanath