Hardware implementation of Oracle 10g RAC can be found here.
This document describes how to implement cost-efficient Oracle Real Application Clusters using only common PC hardware and the Linux operating system. In a Real Application Clusters environment, each node must be able to access all the data stored in the database. While the traditional approach requires expensive storage subsystems (such as network disk arrays) to provide this functionality, this solution lets you build a scalable and highly available database system using only common Intel PCs connected to an Ethernet network.
In this solution, the standard shared disk subsystem is replaced by a native Linux technology, the Network Block Device (NBD), which maps remote files to local block devices (e.g. /dev/nb0) over a TCP/IP network. One computer (not necessarily a Linux machine) thus serves as data storage for all cluster nodes (Linux machines) in place of an expensive disk array.
[Figure: Typical configuration]
[Figure: Simple NBD configuration]
gunzip nbd-2.0.tar.gz
tar xfv nbd-2.0.tar

This installation does not contain the necessary header files, which are usually included in a Linux distribution. Make a new directory for the Linux kernel NBD driver headers

mkdir nbd/linux

and copy nbd.h from your Linux kernel source into this new nbd/linux directory.

cd nbd
./configure
gcc -O2 -I. -lsocket -lnsl -o nbd-server nbd-server.c

Remember to link the socket and nsl libraries (on Solaris only).
This is nbd-server version 2.0
Usage: port file_to_export [size][kKmM] [-r] [-m] [-c] [-a timeout_sec]
        -r read only
        -m multiple file
        -c copy on write
        -a maximum idle seconds, terminates when idle time exceeded
        if port is set to 0, stdin is used (for running from inetd)
        if file_to_export contains '%s', it is substituted with IP
        address of machine trying to connect

The nbd-server binary is the only file needed on the server side.
cp nbd-server /usr/local/sbin/

Finally, you can create an /etc/nbd_server.allow file that lists the IP addresses allowed to connect to the NBD server.
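As an illustration of the allow file mentioned above, here is a minimal sketch. It assumes the file holds one client IP address per line. The helper name write_allow_file and the scratch path under /tmp are made up for this example, and 10.1.1.1 and 10.1.1.2 are the data-network addresses used for the two cluster nodes later in this document.

```shell
#!/bin/sh
# Hypothetical helper (not part of the NBD distribution):
# writes one client IP address per line, assuming that is the expected format.
write_allow_file() {
    file=$1
    shift
    : > "$file"                 # create or truncate the target file
    for ip in "$@"; do
        echo "$ip" >> "$file"
    done
}

# Scratch path for a trial run; the real file would be /etc/nbd_server.allow.
ALLOW_FILE=/tmp/nbd_server.allow.example

# Data-network addresses of the two cluster nodes.
write_allow_file "$ALLOW_FILE" 10.1.1.1 10.1.1.2
cat "$ALLOW_FILE"
```

Once the contents look right, the same call can be pointed at /etc/nbd_server.allow on the storage server (as root).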
mkfile <size>m <path_to_data_disk>/<tbs_name>_raw

Recommended sizes and names:
tablespace/file           size    datafile name
SYSTEM                    300M    system_raw
USERS                     30M     users_raw
TEMP                      110M    temp_raw
UNDOTBS (per instance)    210M    undo_<n>_raw
INDX                      30M     indx_raw
TOOLS                     20M     tools_raw
controlfile1              120M    controlfile_1_raw
controlfile2              120M    controlfile_2_raw
redo logs (2 per inst)    130M    redo<n>_<x>_raw
spfile                    5M      spfile_raw
srvmconfig                110M    srvctl_raw
node monitor              10M     nm_raw

Example:
mkfile 300m /orac/system_raw
mkfile 30m /orac/users_raw
mkfile 110m /orac/temp_raw
mkfile 210m /orac/undo_1_raw
mkfile 210m /orac/undo_2_raw
mkfile 30m /orac/indx_raw
mkfile 30m /orac/tools_raw
mkfile 120m /orac/controlfile_1_raw
mkfile 120m /orac/controlfile_2_raw
mkfile 130m /orac/redo1_1_raw
mkfile 130m /orac/redo1_2_raw
mkfile 130m /orac/redo2_1_raw
mkfile 130m /orac/redo2_2_raw
mkfile 5m /orac/spfile_raw
mkfile 110m /orac/srvctl_raw
mkfile 10m /orac/nm_raw
mkfile 20m /orac/drsys_1_raw
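The individual mkfile commands above can also be generated from a list. The following is a minimal sketch that reuses the sizes and names from the example. It only prints the commands (a dry run), and the function name print_mkfile_commands is made up for illustration.

```shell
#!/bin/sh
# Directory that holds the shared files on the storage server (from the example).
DATA_DIR=/orac

# One "size name" pair per line, copied from the example commands above.
FILES="300m system_raw
30m users_raw
110m temp_raw
210m undo_1_raw
210m undo_2_raw
30m indx_raw
30m tools_raw
120m controlfile_1_raw
120m controlfile_2_raw
130m redo1_1_raw
130m redo1_2_raw
130m redo2_1_raw
130m redo2_2_raw
5m spfile_raw
110m srvctl_raw
10m nm_raw
20m drsys_1_raw"

# Hypothetical helper: prints one mkfile command per entry, creates nothing.
print_mkfile_commands() {
    echo "$FILES" | while read -r size name; do
        echo "mkfile $size $DATA_DIR/$name"
    done
}

print_mkfile_commands
```

Piping the output to sh on the storage server (print_mkfile_commands | sh) would run the commands instead of just listing them.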
nbd-server <port> <filename>
/usr/sbin/nbd-server 4101 /orac/system_raw &
/usr/sbin/nbd-server 4102 /orac/users_raw &
/usr/sbin/nbd-server 4103 /orac/temp_raw &
/usr/sbin/nbd-server 4104 /orac/undo_1_raw &
/usr/sbin/nbd-server 4105 /orac/undo_2_raw &
/usr/sbin/nbd-server 4106 /orac/indx_raw &
/usr/sbin/nbd-server 4107 /orac/tools_raw &
/usr/sbin/nbd-server 4108 /orac/controlfile_1_raw &
/usr/sbin/nbd-server 4109 /orac/controlfile_2_raw &
/usr/sbin/nbd-server 4110 /orac/redo1_1_raw &
/usr/sbin/nbd-server 4111 /orac/redo1_2_raw &
/usr/sbin/nbd-server 4112 /orac/redo2_1_raw &
/usr/sbin/nbd-server 4113 /orac/redo2_2_raw &
/usr/sbin/nbd-server 4114 /orac/spfile_raw &
/usr/sbin/nbd-server 4115 /orac/srvctl_raw &
/usr/sbin/nbd-server 4116 /orac/nm_raw &
/usr/sbin/nbd-server 4117 /orac/drsys_1_raw &

Now the NBD server should be running and waiting for a client connection.
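Since each exported file gets the next port number, the start commands above can be derived from an ordered list of file names. This is a minimal sketch under that assumption. The function name print_server_commands is hypothetical, and the script only prints the commands rather than starting any server.

```shell
#!/bin/sh
NBD_SERVER=/usr/sbin/nbd-server
DATA_DIR=/orac
BASE_PORT=4101   # first port used in the example above

# File names in the same order as in the example, so the ports match.
NAMES="system_raw users_raw temp_raw undo_1_raw undo_2_raw indx_raw
tools_raw controlfile_1_raw controlfile_2_raw redo1_1_raw redo1_2_raw
redo2_1_raw redo2_2_raw spfile_raw srvctl_raw nm_raw drsys_1_raw"

# Hypothetical helper: prints one nbd-server command per file,
# assigning consecutive ports starting at BASE_PORT.
print_server_commands() {
    port=$BASE_PORT
    for name in $NAMES; do
        echo "$NBD_SERVER $port $DATA_DIR/$name &"
        port=$((port + 1))
    done
}

print_server_commands
```

Keeping the list in one place makes it harder for the port numbers on the server and the device numbers on the clients to drift apart.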
# External (real and public) interface (eth0)
147.251.48.197 rac1
# Second node connected via public interface
147.251.48.198 rac2
# Cluster interconnect (private) interface (eth1)
10.0.0.1 rac1-int
# Second node connected via private interface
10.0.0.2 rac2-int
# Shared data subsystem (NBD) interface (eth2)
10.1.1.1 rac1-data
# Second node will use 10.1.1.2 address but it is unreachable via
# this interface.
# NBD server address
10.1.1.100 rac-data

Configure your local network interfaces (using the ifconfig command) to match these /etc/hosts settings.
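As a rough sketch of the interface setup, the script below prints the ifconfig commands for one node using the addresses from the /etc/hosts example. The netmask 255.255.255.0 is an assumption (the original does not state one), and the function name print_ifconfig_commands is made up for illustration.

```shell
#!/bin/sh
# Assumption: the original gives no netmask, so adjust this to your network.
NETMASK=255.255.255.0

# Hypothetical helper: prints the ifconfig commands for node 1 (rac1) or node 2 (rac2).
print_ifconfig_commands() {
    case "$1" in
        1) pub=147.251.48.197; priv=10.0.0.1; data=10.1.1.1 ;;
        2) pub=147.251.48.198; priv=10.0.0.2; data=10.1.1.2 ;;
        *) echo "usage: print_ifconfig_commands 1|2" >&2; return 1 ;;
    esac
    echo "ifconfig eth0 $pub netmask $NETMASK up"    # public interface
    echo "ifconfig eth1 $priv netmask $NETMASK up"   # cluster interconnect
    echo "ifconfig eth2 $data netmask $NETMASK up"   # NBD data network
}

print_ifconfig_commands 1
```

The printed commands are meant to be reviewed and then run as root on the matching node.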
modprobe nbd

To start the NBD client, you must specify the server, its port, and the local Linux device assigned to it.
/usr/sbin/nbd-client <data server> <port> /dev/nb<n>

Example:
/usr/sbin/nbd-client rac-data 4101 /dev/nb1
/usr/sbin/nbd-client rac-data 4102 /dev/nb2
/usr/sbin/nbd-client rac-data 4103 /dev/nb3
/usr/sbin/nbd-client rac-data 4104 /dev/nb4
/usr/sbin/nbd-client rac-data 4105 /dev/nb5
/usr/sbin/nbd-client rac-data 4106 /dev/nb6
/usr/sbin/nbd-client rac-data 4107 /dev/nb7
/usr/sbin/nbd-client rac-data 4108 /dev/nb8
/usr/sbin/nbd-client rac-data 4109 /dev/nb9
/usr/sbin/nbd-client rac-data 4110 /dev/nb10
/usr/sbin/nbd-client rac-data 4111 /dev/nb11
/usr/sbin/nbd-client rac-data 4112 /dev/nb12
/usr/sbin/nbd-client rac-data 4113 /dev/nb13
/usr/sbin/nbd-client rac-data 4114 /dev/nb14
/usr/sbin/nbd-client rac-data 4115 /dev/nb15
/usr/sbin/nbd-client rac-data 4116 /dev/nb16
/usr/sbin/nbd-client rac-data 4117 /dev/nb17

The block devices should now be configured, and you should be able to access the remote data. Oracle Real Application Clusters also needs raw access to the shared disk subsystem, so the raw devices must be mapped to the block devices. This can be done with the standard raw command:
/usr/bin/raw /dev/raw/raw<n> /dev/nb<n>

Example:
/usr/bin/raw /dev/raw/raw1 /dev/nb1
/usr/bin/raw /dev/raw/raw2 /dev/nb2
/usr/bin/raw /dev/raw/raw3 /dev/nb3
/usr/bin/raw /dev/raw/raw4 /dev/nb4
/usr/bin/raw /dev/raw/raw5 /dev/nb5
/usr/bin/raw /dev/raw/raw6 /dev/nb6
/usr/bin/raw /dev/raw/raw7 /dev/nb7
/usr/bin/raw /dev/raw/raw8 /dev/nb8
/usr/bin/raw /dev/raw/raw9 /dev/nb9
/usr/bin/raw /dev/raw/raw10 /dev/nb10
/usr/bin/raw /dev/raw/raw11 /dev/nb11
/usr/bin/raw /dev/raw/raw12 /dev/nb12
/usr/bin/raw /dev/raw/raw13 /dev/nb13
/usr/bin/raw /dev/raw/raw14 /dev/nb14
/usr/bin/raw /dev/raw/raw15 /dev/nb15
/usr/bin/raw /dev/raw/raw16 /dev/nb16
/usr/bin/raw /dev/raw/raw17 /dev/nb17

To make these devices accessible to the Oracle user as well, run the following permission commands (assuming oracle is the name of the user who runs the Oracle database server):
chmod 600 /dev/nb*
chmod 600 /dev/raw/raw*
chown oracle:dba /dev/nb*
chown oracle:dba /dev/raw/raw*

It is also recommended to create local aliases (Linux symbolic links) on the client for every raw device.
ln -s /dev/raw/raw<n> /orac/<database name>/<raw file alias>

Example:
ln -s /dev/raw/raw1 /orac/system_raw
ln -s /dev/raw/raw2 /orac/users_raw
ln -s /dev/raw/raw3 /orac/temp_raw
ln -s /dev/raw/raw4 /orac/undo_1_raw
ln -s /dev/raw/raw5 /orac/undo_2_raw
ln -s /dev/raw/raw6 /orac/indx_raw
ln -s /dev/raw/raw7 /orac/tools_raw
ln -s /dev/raw/raw8 /orac/controlfile_1_raw
ln -s /dev/raw/raw9 /orac/controlfile_2_raw
ln -s /dev/raw/raw10 /orac/redo1_1_raw
ln -s /dev/raw/raw11 /orac/redo1_2_raw
ln -s /dev/raw/raw12 /orac/redo2_1_raw
ln -s /dev/raw/raw13 /orac/redo2_2_raw
ln -s /dev/raw/raw14 /orac/spfile_raw
ln -s /dev/raw/raw15 /orac/srvctl_raw
ln -s /dev/raw/raw16 /orac/nm_raw

Finally, create an ASCII configuration file that identifies the raw devices (it is used by the Database Creation Assistant).

<tablespace or datafile>=<path to raw device>

Example:
cat > /orac/datafiles.conf <<EOF
system=/orac/system_raw
users=/orac/users_raw
temp=/orac/temp_raw
undotbs1=/orac/undo_1_raw
undotbs2=/orac/undo_2_raw
indx=/orac/indx_raw
tools=/orac/tools_raw
control1=/orac/controlfile_1_raw
control2=/orac/controlfile_2_raw
redo1_1=/orac/redo1_1_raw
redo1_2=/orac/redo1_2_raw
redo2_1=/orac/redo2_1_raw
redo2_2=/orac/redo2_2_raw
spfile=/orac/spfile_raw
srvconfig_loc=/orac/srvctl_raw
EOF

Don't forget to set the environment variable DBCA_RAW_CONFIG to this file name so that the Database Creation Assistant can find this configuration.
export DBCA_RAW_CONFIG=/orac/datafiles.conf
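The client-side steps above (nbd-client, raw, ln) use the same device number n for each file, with port 4100 + n on the server. A minimal sketch of generating them from one ordered list follows. The function name print_client_commands is hypothetical, the script only prints the commands, and unlike the ln example above it also emits a link for drsys_1_raw.

```shell
#!/bin/sh
DATA_SERVER=rac-data   # NBD server name from /etc/hosts
BASE_PORT=4101         # port of the first exported file
LINK_DIR=/orac         # directory for the symbolic links

# Same order as on the server, so device n maps to port BASE_PORT + n - 1.
NAMES="system_raw users_raw temp_raw undo_1_raw undo_2_raw indx_raw
tools_raw controlfile_1_raw controlfile_2_raw redo1_1_raw redo1_2_raw
redo2_1_raw redo2_2_raw spfile_raw srvctl_raw nm_raw drsys_1_raw"

# Hypothetical helper: prints the attach, raw-binding and link commands per file.
print_client_commands() {
    n=1
    for name in $NAMES; do
        port=$((BASE_PORT + n - 1))
        echo "/usr/sbin/nbd-client $DATA_SERVER $port /dev/nb$n"
        echo "/usr/bin/raw /dev/raw/raw$n /dev/nb$n"
        echo "ln -s /dev/raw/raw$n $LINK_DIR/$name"
        n=$((n + 1))
    done
}

print_client_commands
```

The output can be reviewed and then run as root on each cluster node, followed by the permission commands shown earlier.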
modprobe softdog soft_margin=60 soft_noboot=1
Miroslav Kripac,
Masaryk University Brno
Comments, suggestions and questions are welcome and can be sent to
kripac@fi.muni.cz