Oracle9i RAC Install Tips for RedHat LINUX AS 2.1
Private Networks
The number of private networks between RAC nodes must be established and their respective functions clearly defined.
These private networks must be configured, tuned, and verified as operational.
The names of the hosts attached to these networks must be uniformly published to every RAC node.
Shared Storage
Three major approaches exist for providing the shared storage needed by Oracle9i RAC.
Each of these approaches has advantages and disadvantages that can be compared using the following attributes:
Ease of use – file systems and network attached storage offer advantages over raw devices because database files can be more easily moved, renamed and resized.
Ease of replication – underlying hardware that hosts network attached and direct attached storage may have underlying mirroring and snapshot capabilities that allow transparent data replication.
Greater design flexibility – the OS may impose limits on the total number of raw devices and on the number of partitions per disk drive, limits that file systems and network attached storage avoid.
Improved performance – the lower number of software layers between Oracle processes and directly attached disk storage allows raw devices and certain cluster file systems to offer the highest level of overall database performance.
OS services
OS configuration for RedHat AS 2.1
Public and private interconnect
It is first necessary to allocate NICs for the public and private networks, each with a static IP address rather than DHCP.
The addresses for the one or more private networks can be assigned from the following network numbers, provided that the private networks using these numbers will NOT be connected to the Internet:
Class A networks: 10.0.0.0, netmask 255.0.0.0
Class B networks: 172.16.0.0 to 172.31.0.0, netmask 255.255.0.0
Class C networks: 192.168.0.0 to 192.168.255.0, netmask 255.255.255.0
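As an illustration only (the interface name and addresses are placeholders; Red Hat keeps per-interface settings under /etc/sysconfig/network-scripts), a static configuration for a private interconnect NIC might look like the following in /etc/sysconfig/network-scripts/ifcfg-eth1:
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.1.1
NETMASK=255.255.255.0
ONBOOT=yes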
A final step is to publish all public and private host names to all machines in the RAC. This can be done manually by adding entries to the /etc/hosts file or by using a network service such as DNS. For smaller networks, /etc/hosts is recommended.
# public hostnames
172.31.149.14 tiger.fooxyz.com tiger
172.31.149.15 lion.fooxyz.com lion
# private hostnames
192.168.1.1 tigeri.fooxyz.com tigeri
192.168.1.2 lioni.fooxyz.com lioni
# hostnames used for accessing Netapp filer
10.0.1.11 tigerii.fooxyz.com tigerii
10.0.1.12 lionii.fooxyz.com lionii
# public and private netapp filer hostnames
172.31.148.226 pluto.fooxyz.com
10.0.1.10 pluto_e3 pluto_private
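Once the entries are in place, basic connectivity over each network can be checked from every node; for example, using the sample names above (substitute the names actually configured):
$ ping -c 2 lioni
$ ping -c 2 lionii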
Shared Storage Configuration
Persistent binding – some Fibre Channel adapters use an underlying communication protocol that introduces a race condition in how devices report their identity, which can cause storage devices to be renamed after a reboot. Check the manufacturer's documentation for how to avoid this.
Configuring shared raw devices (disk partitioning and raw device binding)
To partition disks, use fdisk or Disk Druid. For raw device binding, edit /etc/sysconfig/rawdevices on every RAC node.
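As an illustration only, assuming a shared disk that appears as /dev/sdb has been partitioned for the database files (actual device names, partition counts, and raw device numbers depend on the storage layout), the bindings might look like:
/dev/raw/raw1 /dev/sdb1
/dev/raw/raw2 /dev/sdb2
/dev/raw/raw3 /dev/sdb3
After editing the file, restart the rawdevices service so the new bindings take effect:
# /sbin/service rawdevices restart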
Common mistakes:
- Forgetting to make sure that the kernel on every RAC node has loaded a partition table from the shared disks that includes all the partitions that will be mapped to shared raw devices (use cat /proc/partitions).
- Incorrect or incomplete mapping of raw devices to partitions
- Creating partitions of incorrect sizes
None of these problems occur under OCFS. If the expected partitions are not present, reboot the affected node. The last two problems can be diagnosed by checking the partition sizes and raw device bindings (# fdisk -l and # raw -qa).
Oracle cluster file system
The OCFS was developed by Oracle to simplify the management of RAC database data files. The installation of OCFS requires that a private network already be configured.
# /usr/sbin/load_ocfs
# /sbin/mount -a -t ocfs
The OCFS partitions to be mounted are listed in /etc/fstab with entries of the form:
<partition device name> <mount point name> ocfs uid=1001,gid=100
The numbers following the uid and gid options correspond to the user ID of the oracle user and the group ID of the dba group; verify that these values are correct for any RAC deployment where OCFS will be used. As with network attached storage, it is useful to reboot the RAC nodes at this point.
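The correct values can be read with the id command, and a concrete fstab entry (the device name and mount point shown here are placeholders only) might look like:
$ id oracle
/dev/sde1 /u02/oradata ocfs uid=1001,gid=100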
Kernel parameter configuration
The following settings are for RH AS 2.1 and must be applied on every node. Save them in a script such as /etc/init.d/rhas_ossetup.sh, the name referenced by the symbolic links created below:
#!/bin/sh
echo "65536" > /proc/sys/fs/file-max
echo "2147483648" > /proc/sys/kernel/shmmax
echo "4096" > /proc/sys/kernel/shmmni
echo "2097152" > /proc/sys/kernel/shmall
echo "1024 65000" > /proc/sys/net/ipv4/ip_local_port_range
ulimit -u 16384
echo "100 32000 100 100" > /proc/sys/kernel/sem
ulimit -n 65536
# cd /etc/rc3.d
# ln -s ../init.d/rhas_ossetup.sh S77rhas_ossetup
# cd ../rc5.d
# ln -s ../init.d/rhas_ossetup.sh S77rhas_ossetup
# /sbin/swapon -s
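After the next reboot, the settings can be read back as a quick sanity check (the values should match the script above); for example:
# cat /proc/sys/kernel/shmmax
# cat /proc/sys/fs/file-max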
Finally, OUI can install RAC in one of two ways: into a single shared ORACLE_HOME directory, or into a separate ORACLE_HOME on the local disks of each node.
The Oracle Cluster File System can be used to store datafiles and redo log files, but it cannot be used to hold a single shared ORACLE_HOME directory.
In either case, the oracle user must be able to run commands via rsh and copy files with rcp on all other nodes without being prompted for a password.
# chkconfig --list rlogin
# chkconfig --list rsh
# chkconfig rlogin on
# chkconfig rsh on
On each node, /etc/hosts.equiv (or the oracle user's ~/.rhosts file) must then contain an entry of the following form for every node in the cluster:
<node private hostname> oracle
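For example, using the private hostnames from the sample /etc/hosts above (substitute the names actually configured):
tigeri oracle
lioni oracle
Passwordless operation can then be checked from each node with a command such as:
$ rsh lioni hostname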
Cluster Manager installation and configuration
A key feature of the Oracle9i Cluster Manager for LINUX is an associated agent that monitors system health and resets a RAC node when that node hangs.
In place of the watchdog daemon, a new LINUX kernel module called the hangcheck-timer periodically verifies that the system task scheduler is functioning correctly and resets the node immediately when the system hangs. It has two parameters: hangcheck-tick, the interval between checks, and hangcheck-margin, the maximum delay tolerated before the node is reset.
The following watchdog and cluster manager parameters will permit coexistence of the hangcheck-timer and watchdog for Oracle 9.2.0.1 RAC deployments:
watchdogd: -d /dev/null -l 0 -m <softdog soft_margin setting>
oracm: /a:0
These settings should be inserted into the ocmargs.ora file or whatever script or configuration file is used to start the OCM on all the RAC nodes.
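For illustration only (in a typical 9.2 installation the file is $ORACLE_HOME/oracm/admin/ocmargs.ora, and the soft_margin value shown is a placeholder that must match the softdog module setting), an ocmargs.ora reflecting these settings might contain:
watchdogd -d /dev/null -l 0 -m 60
oracm /a:0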
Using the hangcheck-timer module
The use of the hangcheck-timer module requires coordination between the hangcheck-tick and hangcheck-margin settings and the MissCount parameter of the Cluster Manager. The MissCount parameter of the Cluster Manager (in cmcfg.ora) must be set according to the following formula:
MissCount > hangcheck-tick + hangcheck-margin
When the hangcheck-timer module is used with its default settings, MissCount must therefore be at least 210 seconds.
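As a sketch, assuming the commonly documented defaults of 30 seconds for hangcheck_tick and 180 seconds for hangcheck_margin (which add up to the 210-second floor above), the module could be loaded with a command along these lines; confirm the exact values against the Oracle documentation for the kernel in use:
# /sbin/insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180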
Installing and starting the Oracle Cluster manager
Make sure that the lists of public and private node names correspond exactly. The Cluster Manager requires a log directory that is not properly created or replicated across the RAC nodes during OUI use. Likewise, several log directories used by other Oracle tools and services will not be replicated properly to the other nodes in the cluster during the installation of the RAC database software if they are not created before installation begins.
The following procedure creates the log directories needed by the Cluster Manager, the SQL*Net listener, the Oracle Intelligent Agent, and the RAC database. The following command creates the Cluster Manager log directory:
$ mkdir -p $ORACLE_HOME/oracm/log
These two commands will create the necessary directories for the SQL*Net listener:
$ mkdir -p $ORACLE_HOME/network/log
$ mkdir -p $ORACLE_HOME/network/trace
The directories created by these commands will be used by database instances:
$ mkdir -p $ORACLE_HOME/rdbms/log
$ mkdir -p $ORACLE_HOME/rdbms/audit
Finally, these commands create directories used by the OIA:
$ mkdir -p $ORACLE_HOME/network/agent/log
$ mkdir -p $ORACLE_HOME/network/agent/reco
It is now possible to start the OCM and proceed with the RAC installation.
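In a typical 9.2 layout the Cluster Manager is started as root with its startup script (the path below assumes the default installation location):
# $ORACLE_HOME/oracm/bin/ocmstart.sh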
Oracle 9.2.0.1 with RAC installation
The root.sh script must be run on all nodes of the cluster. If the script is not present on all nodes, a host equivalence problem exists within the cluster. Exit the installation, correct the problem, and begin again.
Normally, the Cluster Node Selection screen appears immediately after the Welcome screen when RAC is being installed. If the File Locations screen appears instead, the Oracle Cluster Manager is not functioning properly. Abort the installation and verify that the Cluster Manager is running:
# ps ax | grep oracm
Depending on whether any Cluster Manager processes are found, restart the Cluster Manager or repair its configuration. Log information for the Cluster Manager can be found in the $ORACLE_HOME/oracm/log directory.
Use of a single, shared ORACLE_HOME directory by an entire cluster requires that additional configuration steps be taken during the pause provided to run the root.sh script. These steps are necessary because each RAC database instance in the cluster expects to have sole access to several configuration and log directories:
$ORACLE_HOME/network/admin
$ORACLE_HOME/network/agent
$ORACLE_HOME/network/log
$ORACLE_HOME/rdbms/dbs
This can be done in the following way, illustrated here for $ORACLE_HOME/network/admin (the same procedure applies to the other directories listed above):
1. On one node, copy the directory once for each RAC node, then remove the original:
cp -R $ORACLE_HOME/network/admin \
$ORACLE_HOME/network/admin_node1
cp -R $ORACLE_HOME/network/admin \
$ORACLE_HOME/network/admin_node2
rm -rf $ORACLE_HOME/network/admin
2. Create an identically named symbolic link on the local disks of each RAC node that points to the directory created for that node in the previous step. The following can be used on node1:
ln -s $ORACLE_HOME/network/admin_node1 \
/var/opt/oracle/links/netadmin
Likewise, a similar command is issued for node 2:
ln -s $ORACLE_HOME/network/admin_node2 \
/var/opt/oracle/links/netadmin
3. Finally, on each node recreate $ORACLE_HOME/network/admin as a symbolic link pointing to the local link:
ln -s /var/opt/oracle/links/netadmin \
$ORACLE_HOME/network/admin
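The result can be verified on each node (an optional check) by listing the link itself:
$ ls -ld $ORACLE_HOME/network/admin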
After the Cluster Configuration Assistant is done, go to a terminal window, log on to a RAC node as oracle, and check the status of the Oracle Global Services daemon (gsd):
$ gsdctl stat
GSD is running on the local node
If the Cluster Configuration Assistant is unable to configure and start the Oracle gsd on the cluster, stop and cancel all remaining configuration tools and exit from OUI. The gsd configuration can then be initialized manually from one node:
$ srvconfig -init
and the gsd started by running the following on all nodes:
$ gsdctl start
The remaining configuration tools can be run manually. The Oracle Net Configuration Assistant is started with:
$ $ORACLE_HOME/bin/netca
Next, the Intelligent Agent can be started by performing the following steps:
$ cd $ORACLE_HOME/network/agent
$ rm -f *.q services.ora dbsnmp.ver
If one or more databases using the same ORACLE_HOME directory have been registered with an OEM repository, delete the databases from any repositories where they are registered, and then run the two commands above.
$ agentctl start
Finally, the Database Configuration Assistant can be run to create the RAC database:
$ $ORACLE_HOME/bin/dbca -DatafileDestination \
<absolute path of shared data file destination>
Additional manual configuration
$ srvctl status database -d <database name>
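Other day-to-day operations follow the same pattern; for example (shown as an illustration of standard srvctl usage):
$ srvctl start database -d <database name>
$ srvctl stop database -d <database name>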