Wednesday, 16 December 2009

Oracle RAC and TAF to Guarantee availability

One of the most exciting new features in Oracle Database is Real Application Clusters (RAC). The Oracle RAC solution delivers 24/7 database availability, performance, and scalability. Cache Fusion is the key memory feature that enables Oracle RAC performance, and the new Transparent Application Failover (TAF) is what applications use to sync up with Oracle RAC availability. This article explores the cooperation between Oracle RAC, Cache Fusion, and TAF and offers insights into the architecture and use of these tools for continuous availability and infinite scalability.

Oracle RAC Architecture

Oracle has long recognized that a clustered environment is the best protection against hardware and software failure. In a clustered environment, many Oracle instances exist on separate servers, each with direct connectivity to a single Oracle database. Should any single server or instance fail, processing continues on the surviving servers.

Cache Fusion and Oracle RAC

The introduction of the Cache Fusion shared RAM cache for multiple Oracle instances is a breakthrough in clustered solutions. Oracle RAC fully implements Cache Fusion, which both provides high performance and enables continuous cluster availability. The high-availability capability of Oracle RAC is almost unfathomable. It's estimated that in a 12-computer configuration, any application running on Oracle RAC will not experience a catastrophic failure for well over 100,000 years.

Cache Fusion technology changes the internal configuration of the Oracle system global area (SGA). Cache Fusion moves the RAM data buffers from local RAM storage into a shared RAM area accessible by all Oracle instances.

Beyond high performance and high availability, Oracle RAC offers significant benefits as a scalability tool. Whenever the processing load becomes excessive in an existing Oracle RAC cluster, you can add additional processors—each with its own Oracle instance—to the Oracle RAC configuration. This allows companies to start small and scale infinitely as processing demands increase.

Oracle RAC and Hardware Failover

To detect a node failure, the Cluster Manager uses a background process—Global Enqueue Service Monitor (LMON)—to monitor the health of the cluster. When a node fails, the Cluster Manager reports the change in the cluster's membership to Global Cache Services (GCS) and Global Enqueue Service (GES). These services are then remastered based on the current membership of the cluster.

To successfully remaster the cluster services, Oracle RAC keeps track of all resources and resource states on each node and then uses this information to restart these resources on a backup node.

These processes also manage the state of in-flight transactions and work with TAF to either restart or resume the transactions on the new node. Now let's see how Oracle RAC and TAF work together to ensure that a server failure does not cause an unplanned service interruption.

Using Transparent Application Failover

After an Oracle RAC node crashes—usually from a hardware failure—all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were "in flight" at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, allowing a failed node to resume processing on another server without interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.

For an application to use TAF, it must use failover-aware API calls from the Oracle Call Interface (OCI). Inside OCI are TAF callback routines that can be used to make any application failover-aware.

While the concept of failover is simple, providing an apparent instant failover can be extremely complex, because there are many ways to restart in-flight transactions. The TAF architecture offers the ability to restart transactions at either the transaction (SELECT) or session level:

  • SELECT failover. With SELECT failover, Oracle Net keeps track of all SELECTstatements issued during the transaction, tracking how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to another Oracle RAC node and re-executes the SELECT statements, repositioning the cursors so the client can continue fetching rows as if nothing has happened. The SELECT failover approach is best for data warehouse systems that perform complex and time-consuming transactions.
  • SESSION failover. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to another Oracle RAC node; any work in progress is lost. SESSION failover is ideal for online transaction processing (OLTP) systems, where transactions are small.

Oracle TAF also offers choices on how to restart a failed transaction. The Oracle DBA may choose one of the following failover methods:

  • BASIC failover. In this approach, the application connects to a backup node only after the primary connection fails. This approach has low overhead, but the end user experiences a delay while the new connection is created.
  • PRECONNECT failover. In this approach, the application simultaneously connects to both a primary and a backup node. This offers faster failover, because a pre-spawned connection is ready to use. But the extra connection adds everyday overhead by duplicating connections.

Currently, TAF will fail over standard SQL SELECT statements that have been caught during a node crash in an in-flight transaction failure. In the current release of TAF, however, TAF must restart some types of transactions from the beginning of the transaction.

The following types of transactions do not automatically fail over and must be restarted by TAF:

  • Transactional statements. Transactions involving INSERT, UPDATE, or DELETEstatements are not supported by TAF.
  • ALTER SESSION statements. ALTER SESSION and SQL*Plus SETstatements do not fail over.
  • The following do not fail over and cannot be restarted:
  • Temporary objects. Transactions using temporary segments in the TEMP tablespace and global temporary tables do not fail over.
  • PL/SQL package states. PL/SQL package states are lost during failover.

Using Oracle RAC and TAF Together

The continuous availability features of Oracle RAC and TAF come together when these products cooperate in restarting failed transactions. Let's take a closer look at how this works.

Within each connected Oracle Net client, tnsnames.ora file parameters define the failover types and methods for that client. The parameters direct Oracle RAC and TAF on how to restart any transactions that may be in-flight during a hardware failure on the node.

It is important to note that TAF failover control is external to the Oracle RAC cluster, and each Oracle Net client may have unique failover types and methods, depending on processing requirements. The following is a client tnsnames.ora file entry for a node, including its current TAF failover parameters:

 
 

bubba.world =

(DESCRIPTION_LIST =

(FAILOVER = true)

(LOAD_BALANCE = true)

(DESCRIPTION =

(ADDRESS =

(PROTOCOL = TCP)

(HOST = redneck)(PORT = 1521))

(CONNECT_DATA =

(SERVICE_NAME = bubba)

(SERVER = dedicated)

(FAILOVER_MODE =

(BACKUP=cletus)

(TYPE=select)

(METHOD=preconnect)

(RETRIES=20)

(DELAY=3)

)

)

)

 
 

The failover_mode section of the tnsnames.ora file lists the parameters and their values:

BACKUP=cletus. This names the backup node that will take over failed connections when a node crashes. In this example, the primary server is bubba, and TAF will reconnect failed transactions to the cletus instance in case of server failure.

TYPE=select. This tells TAF to restart all in-flight transactions from the beginning of the transaction (and not to track cursor states within each transaction).

METHOD=preconnect. This directs TAF to create two connections at transaction startup time: one to the primary bubba database and a backup connection to the cletus database. In case of instance failure, the cletus database will be ready to resume the failed transaction.

RETRIES=20. This directs TAF to retry a failover connection up to 20 times.

DELAY=3. This tells TAF to wait three seconds between connection retries.

Remember, you must set these TAF parameters in every tnsnames.ora file on every Oracle Net client that needs transparent failover.

Putting It All Together

An Oracle Net client can be a single PC or a huge application server. In the architectures of giant Oracle RAC systems, each application server has a customized tnsnames.ora file that governs the failover method for all connections that are routed to that application server.

Watching TAF in Action

The transparency of TAF operation is a tremendous advantage to application users, but DBAs need to quickly see what has happened and where failover traffic is going, and they need to be able to get the status of failover transactions. To provide this capability, the Oracle data dictionary has several new columns in the V$SESSION view that give the current status of failover transactions.

The following query calls the new FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of the V$SESSION view. Be sure to note that the query is restricted to nonsystem sessions, because Oracle data definition language (DDL) and data manipulation language (DML) are not recoverable with TAF.

 
 

select

username,

sid,

serial#,

failover_type,

failover_method,

failed_over

from

v$session

where

username not in ('SYS','SYSTEM',

'PERFSTAT')

and

failed_over = 'YES';

 
 

You can run this script against the backup node after an instance failure to see those transactions that have been reconnected with TAF. Remember, TAF will quickly redirect transactions, so you'll only see entries for a short period of time immediately after the failover.  A backup node can have a variety of concurrent failover transactions, because the tnsnames.ora file on each Oracle Net client specifies the backup node, the failover type, and the failover method.

Conclusion

Oracle RAC, TAF, and Cache Fusion work together to guarantee continuous availability and infinite scalability. To summarize, here's a short description of each component:

Oracle RAC. The clustering component of Oracle that allows the creation of multiple, independent Oracle instances, all sharing a single database.

Cache Fusion. The shared RAM component of Oracle RAC that provides fast interchange of Oracle data blocks between SGA regions.

TAF. The failover method implemented on the Oracle Net client to restart in-flight transactions when a node crashes.

No comments: