OEM agent version 13c installation on AIX fails with OUI-10039: Unable to access the inventory /u01/app/oraInventory on this system.

Hello Readers,

A few days ago, I was installing OEM agent version 13.3.0.0.0 on my DEV box running AIX 7.2, using the silent method with agentDeploy.sh.
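For reference, the silent deployment I used looked roughly like the sketch below; the staging path, agent base directory and response file name are placeholders from my environment, not something mandated by agentDeploy.sh:

# run as the agent install owner; AGENT_BASE_DIR and RESPONSE_FILE are standard agentDeploy.sh arguments
cd /u01/software/agent13c
./agentDeploy.sh AGENT_BASE_DIR=/u01/app/oracle/agent13c RESPONSE_FILE=/u01/software/agent13c/agent.rsp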

I had previously installed it successfully on multiple machines without any issues, but this one failed with the error below:

java.io.IOException: OUI-10039:Unable to access the inventory /u01/app/oraInventory on this system. Please ensure you have the proper permissions to read/write/search the inventory.

I checked for permission issues on /u01/app/oraInventory, only to find that the directory did not exist at all, so the error itself was expected. But why was Oracle searching for the inventory at the wrong location in the first place?

When I dug into the associated log file, I found more details about the error.


2019-10-07 12:20:48,465 WARNING [34] oracle.sysman.oii.oiip.oiipg.OiipgPropertyLoader - The inventory pointer location /var/opt/oracle/oraInst.loc is either not readable or does not exist
2019-10-07 12:20:48,475 INFO [34] oracle.sysman.nextgen.utils.NextGenInventoryUtil - Setting default inventory location to: '/u01/app/oraInventory'
2019-10-07 12:20:48,475 WARNING [34] oracle.sysman.oii.oiip.oiipg.OiipgPropertyLoader - The inventory pointer location /var/opt/oracle/oraInst.loc is either not readable or does not exist
2019-10-07 12:20:48,475 WARNING [34] oracle.sysman.oii.oiip.oiipg.OiipgPropertyLoader - The inventory pointer location /var/opt/oracle/oraInst.loc is either not readable or does not exist
2019-10-07 12:20:48,477 SEVERE [34] oracle.sysman.oii.oiii.OiiiInstallAreaControl - OUI-10039:Unable to access the inventory /u01/app/oraInventory on this system. Please ensure you have the proper permissions to read/write/search the inventory.
2019-10-07 12:20:48,477 SEVERE [34] oracle.sysman.nextgen.impl.NextGenInstallerImpl - java.io.IOException: OUI-10039:Unable to access the inventory /u01/app/oraInventory on this system. Please ensure you have the proper permissions to read/write/search the inventory.

So basically, Oracle checks for oraInst.loc under /var/opt/oracle and, if it does not find one, sets the default inventory location to '/u01/app/oraInventory'.

I feel this is something of a bug in the agent software, as the location of oraInst.loc on AIX is '/etc', not '/var/opt/oracle'.

That leaves two questions: first, how to resolve this, and second, why my other installations succeeded.

The other installations succeeded because Oracle did find oraInventory at its default location. So whenever oraInventory is located at '/u01/app/oraInventory', you won't face this issue.

Now, how to resolve it: you can simply create a symbolic link to oraInst.loc under '/var/opt/oracle'.

Steps are given below:


Log in as root:
1. mkdir -p /var/opt/oracle/
2. cd /var/opt/oracle/
3. ln -s /etc/oraInst.loc oraInst.loc
4. ls -lrt oraInst.loc

Once you perform the above steps, you will be able to install the OEM agent successfully.

Hope you will find this post very useful. 🙂

Cheers

Regards,
Adityanath



./roothas.sh -postpatch OR root.sh failing with CLSRSC-400: A system reboot is required to continue installing.

Recently I was doing a fresh Grid Infrastructure (GI) 12.2 install on one of our UAT boxes, where I ran into a strange issue.

Both "root.sh" and "./roothas.sh -postpatch" were exiting with the error/warning below:

CLSRSC-400: A system reboot is required to continue installing.


test-server01:/u01/app/12.2.0.1/grid/bin # cd $ORACLE_HOME/crs/install
test-server01:/u01/app/12.2.0.1/grid/crs/install # ./roothas.sh -postpatch
Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-22-25PM.log
2019/05/27 14:22:30 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'
2019/05/27 14:23:18 CLSRSC-400: A system reboot is required to continue installing.

The instruction given by the warning was simple: reboot the machine and retry. I asked the server admin to reboot the machine, but a subsequent rerun of the command failed with the same error.


test-server01:/u01/app/12.2.0.1/grid/crs/install # ./roothas.sh -postpatch
Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params
The log of current session can be found at:
/u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-37-02PM.log
2019/05/27 14:37:07 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'
2019/05/27 14:37:52 CLSRSC-400: A system reboot is required to continue installing.

I checked the associated log file to get more details: /u01/app/crsdata/test-server01/crsconfig/hapatch_2019-05-27_02-37-02PM.log


> ACFS-9428: Failed to load ADVM/ACFS drivers. A system reboot is recommended.
> ACFS-9310: ADVM/ACFS installation failed.
> ACFS-9178: Return code = USM_REBOOT_RECOMMENDED
2019-05-27 14:37:41: ACFS drivers cannot be installed, and reboot may resolve this
2019-05-27 14:37:52: Command output:
> CLSRSC-400: A system reboot is required to continue installing.
>End Command output
2019-05-27 14:37:52: CLSRSC-400: A system reboot is required to continue installing.

So this was clearly an issue with the ACFS drivers.
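To see exactly where the drivers stood, the acfsdriverstate utility shipped in the GI home can be used; a quick check run as root (output omitted here):

/u01/app/12.2.0.1/grid/bin/acfsdriverstate supported
/u01/app/12.2.0.1/grid/bin/acfsdriverstate installed
/u01/app/12.2.0.1/grid/bin/acfsdriverstate loaded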

I found the MOS documents below related to the issue, but none of them exactly matched my situation or operating system.

While Manually Installing a Patch ‘rootcrs.sh -patch’ Fails with – CLSRSC-400: A system reboot is required to continue installing. (Doc ID 2360097.1)

ALERT: root.sh Fails With “CLSRSC-400” While Installing GI 12.2.0.1 on RHEL or OL with RedHat Compatible Kernel (RHCK) 7.3 (Doc ID 2284463.1)

In our environment we don't use the ACFS file system, so this was not a real problem in my case, and we can always install the ACFS drivers explicitly if we need them in the future.
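For the record, if ACFS is ever required later, the drivers can be installed explicitly with acfsroot from the GI home; shown here only as a pointer, I did not need to run it:

# as root, only if/when ACFS is actually needed in the future
/u01/app/12.2.0.1/grid/bin/acfsroot install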

After reading all the logs, I found that /u01/app/12.2.0.1/grid/lib/acfstoolsdriver.sh was being called for the ACFS driver installation.
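Before touching the script, I would keep a backup copy so the change can easily be reverted once the ACFS drivers are actually wanted:

# as root: back up the original script before commenting out the exec line
cd /u01/app/12.2.0.1/grid/lib
cp -p acfstoolsdriver.sh acfstoolsdriver.sh.orig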

I changed the following code in that script from

# Now run command with all arguments!
exec ${RUNTHIS} $@

to

# Now run command with all arguments!
#exec ${RUNTHIS} $@
exit 0

After changing the code, "./roothas.sh -postpatch" completed without errors or warnings and I was able to complete the GI installation successfully.

Note: This workaround is only applicable when ACFS is not being used in the environment, and it carries implied risk that one must accept before implementing it. 😉

Hope you will find this post very useful.

Cheers

Regards,
Adityanath


DST_UPGRADE_STATE = DATAPUMP(1) causing an issue during Oracle DB upgrade.

Yesterday I was busy upgrading my UAT database from 12.1.0.2 to 12.2.0.1. As a prerequisite, I ran preupgrade.jar from the 12.1 RDBMS home, and it gave me the warning below:


-- CHECK/FIXUP name: pending_dst_session
--
-- The call to run_fixup below will test whether
-- the following issue originally identified by
-- the preupgrade tool is still present
-- and if so, it will attempt to perform the action
-- necessary to resolve it.
--
-- ORIGINAL PREUPGRADE ISSUE:
-- + Complete any pending DST update operation before starting the database
-- upgrade.
--
-- There is an unfinished DST update operation in the database. It's
-- current state is: DATAPUMP(1)
--
-- There must not be any Daylight Savings Time (DST) update operations
-- pending in the database before starting the upgrade process.
-- Refer to My Oracle Support Note 1509653.1 for more information.
--
fixup_result := dbms_preup.run_fixup('pending_dst_session');

I queried DATABASE_PROPERTIES to get the current DST_UPGRADE_STATE; the output was as follows:


SQL> SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;
PROPERTY_NAME            VALUE
------------------------ ------------
DST_PRIMARY_TT_VERSION   18
DST_SECONDARY_TT_VERSION 14
DST_UPGRADE_STATE        DATAPUMP(1)

I followed the MOS notes below to resolve the issue, but none of them worked.

Updating the RDBMS DST version in 12c Release 1 (12.1.0.1 and up) using DBMS_DST (Doc ID 1509653.1)
How To Cleanup Orphaned DataPump Jobs In DBA_DATAPUMP_JOBS ? (Doc ID 336014.1)
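In practice those notes boil down to looking for (and cleaning up) orphaned Data Pump jobs; a minimal check along those lines, driven from the shell via SQL*Plus (output not shown):

sqlplus -s / as sysdba <<'EOF'
-- look for leftover/orphaned Data Pump jobs (per Doc ID 336014.1)
set linesize 200 pagesize 100
col owner_name format a20
col job_name format a30
SELECT owner_name, job_name, operation, state, attached_sessions
FROM dba_datapump_jobs;
exit
EOF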

I then decided to ignore this warning and proceeded with the DB upgrade, which completed successfully. But I got stuck once again at the step of upgrading the DST version from 18 to 26.

It would not allow me to upgrade the DST version from 18 to 26 because DST_UPGRADE_STATE was not NONE.

Then I googled it and found the steps below to resolve it (a scripted version of them follows the list):


1. ALTER SESSION SET EVENTS '30090 TRACE NAME CONTEXT FOREVER, LEVEL 32';
2. exec dbms_dst.unload_secondary;
3. ALTER SESSION SET EVENTS '30090 TRACE NAME CONTEXT FOREVER, OFF';
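Note that the 30090 event is set at session level, so all three steps must run in the same session; a minimal sketch of how I would script that via SQL*Plus:

sqlplus -s / as sysdba <<'EOF'
-- keep all three steps in one session: the event is session-scoped
ALTER SESSION SET EVENTS '30090 TRACE NAME CONTEXT FOREVER, LEVEL 32';
exec dbms_dst.unload_secondary;
ALTER SESSION SET EVENTS '30090 TRACE NAME CONTEXT FOREVER, OFF';
exit
EOF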


I checked DST_UPGRADE_STATE after implementing the steps.


SQL> SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

PROPERTY_NAME            VALUE
------------------------ ------------
DST_PRIMARY_TT_VERSION   18
DST_SECONDARY_TT_VERSION 14
DST_UPGRADE_STATE        NONE

Now my database was ready for the DST upgrade, as DST_UPGRADE_STATE was NONE. 🙂

Hope you will find this post very useful.

Cheers

Regards,
Adityanath

INVALID JServer JAVA Virtual Machine in Oracle RDBMS Database 12.1.0.2.

Recently I was busy upgrading our DEV database from 12.1 to 12.2 and found that the JServer JAVA Virtual Machine registry component was in an INVALID state.


COMP_NAME                           COMP_ID  VERSION      STATUS
----------------------------------- -------- ------------ -----------
Oracle Application Express          APEX     4.2.5.00.08  VALID
OWB                                 OWB      11.2.0.3.0   VALID
OLAP Catalog                        AMD      11.2.0.4.0   OPTION OFF
Spatial                             SDO      12.1.0.2.0   VALID
Oracle Multimedia                   ORDIM    12.1.0.2.0   VALID
Oracle XML Database                 XDB      12.1.0.2.0   VALID
Oracle Text                         CONTEXT  12.1.0.2.0   VALID
Oracle Workspace Manager            OWM      12.1.0.2.0   VALID
Oracle Database Catalog Views       CATALOG  12.1.0.2.0   VALID
Oracle Database Packages and Types  CATPROC  12.1.0.2.0   VALID
JServer JAVA Virtual Machine        JAVAVM   12.1.0.2.0   INVALID ====> Issue
Oracle XDK                          XML      12.1.0.2.0   VALID
Oracle Database Java Packages       CATJAVA  12.1.0.2.0   VALID
OLAP Analytic Workspace             APS      12.1.0.2.0   VALID
Oracle OLAP API                     XOQ      12.1.0.2.0   VALID
Oracle Real Application Clusters    RAC      12.1.0.2.0   OPTION OFF
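For reference, the component listing above comes out of DBA_REGISTRY; a minimal sketch of the query I would use to produce it (the SQL*Plus formatting is my own):

sqlplus -s / as sysdba <<'EOF'
-- list registry components with their version and status
set linesize 200 pagesize 100
col comp_name format a36
col comp_id format a8
col version format a12
SELECT comp_name, comp_id, version, status
FROM dba_registry
ORDER BY comp_name;
exit
EOF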

As a prerequisite of the upgrade, I had to rectify this before attempting it.

As a first step, I tried running utlrp.sql, but the component was still in an INVALID state.

I also checked the status of all objects in the database with object_type like '%JAVA%'.


SYS@TESTDB:TESTDB> select owner, status, count(*) from all_objects where object_type like '%JAVA%' group by owner, status;

OWNER    STATUS     COUNT(*)
-------- -------- ----------
SYS      VALID         29238
MDSYS    VALID           650
ORDSYS   VALID          2589
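As an additional smoke test, a simple Java-backed call such as dbms_java.longname shows whether the JVM is usable at all; if the JVM is badly broken this will typically raise an error (a minimal sketch, output not shown):

sqlplus -s / as sysdba <<'EOF'
-- dbms_java.longname is implemented in Java, so this exercises the database JVM
SELECT dbms_java.longname('TEST') FROM dual;
exit
EOF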

So the only option I had left was to reinstall the JVM (JAVAVM component) inside the database.

Please find below the steps for the same:

  1. alter system set java_jit_enabled = FALSE;
  2. alter system set "_system_trig_enabled"=FALSE;
  3. alter system set job_queue_processes=0;
  4. create or replace java system;
  5. alter system set java_jit_enabled = true;
  6. alter system set "_system_trig_enabled"=TRUE;
  7. alter system set JOB_QUEUE_PROCESSES=1000;
  8. @?/rdbms/admin/utlrp.sql

After applying the above steps, JServer JAVA Virtual Machine became VALID. 🙂


COMP_NAME                           COMP_ID  VERSION      STATUS
----------------------------------- -------- ------------ -----------
Oracle Application Express          APEX     4.2.5.00.08  VALID
OWB                                 OWB      11.2.0.3.0   VALID
OLAP Catalog                        AMD      11.2.0.4.0   OPTION OFF
Spatial                             SDO      12.1.0.2.0   VALID
Oracle Multimedia                   ORDIM    12.1.0.2.0   VALID
Oracle XML Database                 XDB      12.1.0.2.0   VALID
Oracle Text                         CONTEXT  12.1.0.2.0   VALID
Oracle Workspace Manager            OWM      12.1.0.2.0   VALID
Oracle Database Catalog Views       CATALOG  12.1.0.2.0   VALID
Oracle Database Packages and Types  CATPROC  12.1.0.2.0   VALID
JServer JAVA Virtual Machine        JAVAVM   12.1.0.2.0   VALID ====> Fixed
Oracle XDK                          XML      12.1.0.2.0   VALID
Oracle Database Java Packages       CATJAVA  12.1.0.2.0   VALID
OLAP Analytic Workspace             APS      12.1.0.2.0   VALID
Oracle OLAP API                     XOQ      12.1.0.2.0   VALID
Oracle Real Application Clusters    RAC      12.1.0.2.0   OPTION OFF

Hope you will find this post very useful. 🙂

Cheers

Regards,
Adityanath

ORA-06598: insufficient INHERIT PRIVILEGES privilege

A few days ago I observed that, all of a sudden, one of the application-related cron jobs had started failing with the following error:

ORA-06598: insufficient INHERIT PRIVILEGES privilege

This job was intended to drop temporary tables in the application schema. We had written a shell script in which the SYS user executes a procedure owned by the application schema.
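For context, the cron job was essentially a thin wrapper like the sketch below; the SID, ORACLE_HOME and procedure name here are hypothetical placeholders, not the real ones:

#!/bin/sh
# hypothetical cron wrapper: SYS executes a cleanup procedure owned by the application schema
export ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1
export ORACLE_SID=TESTDB
export PATH=$ORACLE_HOME/bin:$PATH
sqlplus -s / as sysdba <<'EOF'
-- appschema.drop_temp_tables is a placeholder for the real procedure
exec appschema.drop_temp_tables;
exit
EOF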

The only thing that had changed at the DB end was that the database had been upgraded from 11g to 12c.

After investigating the error further, I found it was due to a new 12c security feature.

Before Oracle Database 12c, an invoker's rights PL/SQL unit (a procedure or package created with AUTHID CURRENT_USER) always ran with the privileges of its invoker. If the invoker had higher privileges than the unit's owner, the code could perform operations unintended by, or forbidden to, its owner. Here we can see the security gap.

For example, user A creates a package and users with higher privileges, such as SYS, execute it. User A knows that SYS uses this package regularly, so user A could replace its contents with malicious code at any time and do anything in the database, knowing the code will be run by SYS sooner or later.

In 12c this behavior can be controlled using the INHERIT PRIVILEGES and INHERIT ANY PRIVILEGES privileges.

See the following link for more details:

INHERIT PRIVILEGES and INHERIT ANY PRIVILEGES Privileges

As of Oracle Database 12c, an invoker's rights PL/SQL unit can run with the privileges of its invoker only if its owner has either the INHERIT PRIVILEGES privilege on the invoker or the INHERIT ANY PRIVILEGES privilege.

I was able to resolve the issue by issuing the command below:


SQL> grant inherit privileges on user sys to <application schema>;

Grant succeeded.
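To confirm the grant afterwards, INHERIT PRIVILEGES grants show up in DBA_TAB_PRIVS; a minimal check (output omitted):

sqlplus -s / as sysdba <<'EOF'
-- who has been granted INHERIT PRIVILEGES on the SYS user
col grantee format a30
SELECT grantee, table_name, privilege
FROM dba_tab_privs
WHERE privilege = 'INHERIT PRIVILEGES'
AND table_name = 'SYS';
exit
EOF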

Hope you will find this post very useful. 🙂

Cheers

Regards,
Adityanath

CELL-02630: There is a communication error between Management Server and Cell Server caused by a mismatch of security keys. Check that both servers have access to and use the same $OSSCONF/cellmskey.ora file.

I have a customer (IHAC) on Exadata image version 11.2.3.3.0.131014.1 who faced the issue below.

Any command run from CellCLI was failing with error CELL-02630:

CELL-02630: There is a communication error between Management Server and Cell Server caused by a mismatch of security keys. Check that both servers have access to and use the same $OSSCONF/cellmskey.ora file.

If you read the error description, it refers to a communication error between the CELLSRV and MS processes caused by a mismatch in security keys.

I checked the current status of the cell services, and all were up and running.


[root@test01celadm01 ~]# service celld status
rsStatus: running
msStatus: running
cellsrvStatus: running

MS always creates the key file cellmskey.ora on startup if it does not exist, but in our case it was not present. (I am not sure whether someone deleted it manually.)
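If you want to verify this on a cell, you can simply look for the file under the path the error message refers to; note that $OSSCONF is the cell software's own environment variable and may not be set in your interactive shell, so substitute the actual configuration directory if needed:

# on the affected cell, as root (substitute the real path if $OSSCONF is not set in your shell)
ls -l "$OSSCONF"/cellmskey.ora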

I asked the customer to restart the MS process and check whether that helped. After restarting the MS process, CellCLI commands started working as expected. 🙂


CellCLI> alter cell restart services ms

Restarting MS services...
The RESTART of MS services was successful.

CellCLI> list celldisk
CD_00_test01celadm01 normal
CD_01_test01celadm01 normal
CD_02_test01celadm01 normal
CD_03_test01celadm01 normal
CD_04_test01celadm01 normal
CD_05_test01celadm01 normal
CD_06_test01celadm01 normal
CD_07_test01celadm01 normal
CD_08_test01celadm01 normal
CD_09_test01celadm01 normal
CD_10_test01celadm01 normal
CD_11_test01celadm01 normal
FD_00_test01celadm01 normal
FD_01_test01celadm01 normal
FD_02_test01celadm01 normal
FD_03_test01celadm01 normal
FD_04_test01celadm01 normal
FD_05_test01celadm01 normal
FD_06_test01celadm01 normal
FD_07_test01celadm01 normal
FD_08_test01celadm01 normal
FD_09_test01celadm01 normal
FD_10_test01celadm01 normal
FD_11_test01celadm01 normal
FD_12_test01celadm01 normal
FD_13_test01celadm01 normal
FD_14_test01celadm01 normal
FD_15_test01celadm01 normal

Hope you will find this post very useful. 🙂

Cheers

Regards,
Adityanath

New Exadata install getting Warning: Flash Cache size is not consistent for all storage nodes in the cluster.

Recently my customer faced the following issue: after completing an X7-2 Exadata install, the flash cache on one of the cell nodes was showing a different size from the other cells.

Everything went well with the OneCommand install until step 15, which ended with this warning:

Warning:Flash Cache size is not consistent for all storage nodes in the cluster. Flash Cache on [celadm06.test.local] does not match with the Flash Cache size on the cell celadm01.test.local in cluser /u01/app/12.2.0.1/grid

We checked the flash cache size on all cells using dcli:


[root@celadm01 linux-x64]# dcli -g cell_group -l root cellcli -e "list flashcache detail" | grep size
celadm01: size: 23.28692626953125T
celadm02: size: 23.28692626953125T
celadm03: size: 23.28692626953125T
celadm04: size: 23.28692626953125T
celadm05: size: 23.28692626953125T
celadm06: size: 23.28680419921875T ==================> Smaller flashcache than other cells
celadm07: size: 23.28692626953125T

All Flash disks were in a normal state and there was no hardware failure reported.
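For completeness, the flash disk status can be confirmed across all cells with dcli, along the same lines as the size check above; a minimal sketch run from the node where cell_group is maintained:

# dcli -g cell_group -l root cellcli -e "list physicaldisk attributes name, diskType, status" | grep -i flash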

After investigating further through the sundiag report, I found the mismatch below.


name: FD_00_celadm06
comment: 
creationTime: 2018-07-22T14:11:18+00:00
deviceName: /dev/md310
devicePartition: /dev/md310
diskType: FlashDisk
errorCount: 0
freeSpace: 0 =================================================>>>>>>>>>>>>>>>>>>>>>>>>>> freeSpace is 0
id: ***********
physicalDisk: ***********
size: 5.8218994140625T
status: normal

name: FD_01_celadm06
comment: 
creationTime: 2018-07-22T14:11:18+00:00
deviceName: /dev/md304
devicePartition: /dev/md304
diskType: FlashDisk
errorCount: 0
freeSpace: 0 =================================================>>>>>>>>>>>>>>>>>>>>>>>>>> freeSpace is 0
id: ***********
physicalDisk: ***********
size: 5.8218994140625T
status: normal

name: FD_02_celadm06
comment: 
creationTime: 2018-07-22T14:11:18+00:00
deviceName: /dev/md305
devicePartition: /dev/md305
diskType: FlashDisk
errorCount: 0
freeSpace: 0 =================================================>>>>>>>>>>>>>>>>>>>>>>>>>> freeSpace is 0
id: ***********
physicalDisk: ***********
size: 5.8218994140625T
status: normal

name: FD_03_celadm06
comment: 
creationTime: 2018-07-23T19:31:59+00:00
deviceName: /dev/md306
devicePartition: /dev/md306
diskType: FlashDisk
errorCount: 0
freeSpace: 160M =================================================>>>>>>>>>>>>>>>>>>>>>>> freeSpace 160M is not released
id: ***********
physicalDisk: ***********
size: 5.8218994140625T
status: normal

So I found the culprit. 🙂 The mismatch in flash cache size was caused by freeSpace not being released on one of the flash disks (FD_03_celadm06), as we can see in the logs.

I asked the customer to recreate the flash cache using the following procedure:


1) Check to make sure at least one mirror copy of the extents is available.

CellCLI> list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
- If reporting 'YES', continue to step 2

2) Manually flush the flashcache:
# cellcli -e alter flashcache all flush

In a second window, check the status of the flash cache flush.
The following command should return "working" for each flash disk on each cell while the cache is being flushed and "completed" when it is finished.
# cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD

3) Drop Flashlog:
# cellcli -e drop flashlog all

4) Drop flashcache:
# cellcli -e drop flashcache all

5) Recreate flashlog:
# cellcli -e create flashlog all

6) Recreate flashcache:
# cellcli -e create flashcache all

7) Finally, check the flash cache size to see whether it is now correct:
# cellcli -e list flashcache detail | grep size


The issue was resolved after dropping and recreating the flash log and flash cache on that particular cell node. 🙂
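Once the flash log and flash cache were recreated, the same dcli check from earlier can be rerun to confirm that every cell now reports the same flash cache size:

# dcli -g cell_group -l root cellcli -e "list flashcache detail" | grep size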

Hope you will find this post very useful. 🙂

Cheers

Regards,
Adityanath