Thursday, 3 June 2010

ORA-04030


 

Applies to:

Oracle Server - Enterprise Edition - Version: 8.1.5.0 to 10.2.0.4 - Release: 8.1.5 to 10.2
Information in this document applies to any platform.

Purpose

This article is intended to

Common Bugs

Notes:
Backport possibilities are only to indicate technical backport or patchset exception may be possible.
Actual availability of backports/PSEs are subject to backport policies as per
Note 209768.1 Pub Database, FMW, and OCS Software Error Correction Support Policy

Bug

Reported

Fixed

Notes

Details

Bug 3130972

Versions < 10.1.0.2

9.2.0.6/10.1.0.2.0

Backports possible

The realfree allocator on Unix systems imposes a cap at 1Gb of memory per process. This fix relaxes that limit as
in some cases it is desirable to allow a process to use over 1Gb of private memory. If the limit is reached an
ORA-4030 occurs even though the system may have plenty of free memory available and can map that memory into the
process address space.
(The realfree allocator is used on Unix systems where
PGA_AGGREGATE_TARGET is in use)
Workaround:
Configure the system to allow less than 1Gb of memory per process to avoid ORA-4030 errors.

Bug 3565920

Versions < 10.1

9.2.0.8 and higher

Backports no longer

If the shared pool is resized then subsequeunt queries against views based on
X$KSMSP, such as V$SHARED_POOL_RESERVED, result in a memory leak in the cursor work heap which then fails with ORA-4030.

Bug 4475206

Versions < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

The PGA memory consumed becomes significantly higher than the value of the
parameter pga_aggregate_target if a query has a very long chain of
hash-join operations. This chain must be right-deep, ie. the build is
performed on a base table and the probe is produced by an hash-join sub-tree.
This is not really a memory leak but excess memory use compared to what
should be used.

Bug 4625938

Versions < 10.2

10.2

10.1.x

A memory leak involving 'peihstdep' and 'PEIDEF' can
occur when TABLE functions are used and JDBC connection pooling is enabled.
Workaround:
Disable connection pooling.

Bug 5118748

Versions < 10.2.0.3

10.2.0.3/11.1.0.6

Backports possible

ORA 4030 or a memory leak can occur when using a lot of collections in PLSQL. Heapdumps of the "koh-kghu call " heap include multiple chunks of type "pmucp2upkl korfp

Bug 5220562

Versions < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

An insert select DML session's process size may increases / ORA-4030
(with sort sub heap chunks) when there is concurrent DDL on the
partitioned tables / objects involved in the DML.
Workaround:
Avoid concurrent execution of the DML and DDL, if possible.

Bug 5389854

Versions < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

Execution of a procedure using bulk insert with save exceptions will
consume large amounts of PGA memory during execution. If run with
extremely large number of rows or for a long period of time this can
lead to ORA-4030.
The memory shows up on the "callheap,DARWIN" subheap as "koh-kghu call" memory

Bug 5391505

Versions < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

PGA memory may keep on increasing during query parsing and can reach a large
amount (sometimes even over 1G) when OR expansion occurs.
Ultimately ORA-4030 may be encountered as memory runs out.
The memory shows as "perm" space in the "kxs-heap-c" subheap.
Workaround:
alter session set "_no_or_expansion" = true

Bug 5464834

Versions < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

ORA-4030 (kxs-heap-c,temporary memory) can occur when using EXPDP

Bug 5866410

Versions < 11

11.1.0.6

Backports possible

Bulk insert in PLSQL can consume a large amount of PGA memory
which can lead to ORA-4030 errors.
A heapdump will show lot of free memory in the free lists which
is not used but instead fresh allocations are made.
Workaround:
Chunk the FORALL loop. Do a hybrid of FOR & FORALL so that the
bulk_rowcount arrays doesnt grow abnormally large

Bug 5947623

Versions >= 10.2.0.1 but < 11.1.0.7

10.2.0.4/11.1.0.7

Backport possible

it is possible for a query to allocate too much memory executing a hash join
over large volumes of data with few distinct join key values.
The impact on 64-bit systems is greater.
This is not really a memory leak as the fix only makes the query to spill to
disk earlier.
Workaround:
set "_pga_max_size"

Bug 6011182

Versions >= 10.2.0.1 but < 10.2.0.4

10.2.0.4

Backport possible

High elapsed time and high memory consumption during parse
can occur for queries with large numbers of query blocks.
If you see high elapsed times and/or memory consumption during parse for a
query, and the query has a large number of query blocks (eg many views,
subqueries or UNION ALL branches) you may be hitting this bug.
For error ORA-04030
The path of the leak is: top call heap -> call heap -> typecheck
largest memory allocations w/comments:
"logdef: qcopCre", "kkoFroAnn: qksf", "frodef: qksfroC"

Bug 6052169

Versions < 11

11.1.0.6

Backports possible

Slave memory grows unbounded and finally fails with ORA-4030.
A heapdump of the memory shows 'Statement Alloc' string.

Bug 6061892

Versions >= 10.2.0.1 but < 10.2.0.4

10.2.0.4/11.1.0.6

Backports possible

It is possible to get error ORA-4030 and ORA-21780 when an application
that drops and/or recreates many plsql packages or top-level
Noteocedures or functions which are used in calls from SQL to PL/SQL.
The leak is found in the heaps:
pga heap -> session heap -> PLS non-lib hp.
Most of the hunks on heap 'PLS non-lib hp' are PEIDEF or peihstdep
Note:
This fix introduces the problem described in bug 6139254

Bug 6408917

Versions < 11

11.1.0.6

Backports possible

Excessive PGA memory consumption can be seen when using ref cursors returned
from Java stored procedures. Ultimately this can lead to out of memory
errors such as:
ORA-29532: java call terminated by uncaught java exception: java.lang.outofmemory

Bug 6414844

Versions < 11.2

11.2

Backports possible to 10.1

memory may be wasted in a subheap using kghu memory allocation.

A heapdump will show free chunks in each extent with a "free" chunk which is
not quite large enough to satisfy a request. This is not really a memory
leak but inefficient use of memory in certain circumstances.
eg:
EXTENT 0 addr=0xb3810008
Chunk b3810010 sz= 16272 free " "  Wasted space
Chunk b3813fa0 sz= 16416 freeable "koh-kghu call "
Chunk b3817fc0 sz= 16416 freeable "koh-kghu call " ds=0xb7a1daf0
Chunk b381bfe0 sz= 16416 freeable "koh-kghu call " ds=0xb7a1daf0





Questions and Answers

What is an ORA-4030?

Basically, an ORA-4030 indicates a limit has been reached with respect to the Oracle process private memory allocation.
Typical causes:

* OS Memory limit reached such as physical memory and/or swap/virtual paging
* OS limits reached such as kernel or user shell limits that limit overall, user level or process level memory usage
* OS limit on PGA memory size due to SGA attach address
Note 262540.1 Relocate SGABEG on 64-bit Oracle
* Oracle internal limit example Bug 3130972
* Application design causing limits to be reached
* Bug – space leaks, heap leaks

What is difference between 4030 and 4031?

An ORA-4030 error is an error in the PGA; Memory limitation related to an Oracle rdbms server process is reached.
This includes the database background processes and oracle shadow processes that are spawned directly by the database or the listener.

An ORA-4031 error is an error in the SGA; Memory limitation in SGA component such as shared pool,large pool, java pool, streams pool is reached.

What are the contents of Program Global Area memory?

A Program Global Area (PGA) is a memory region that contains data and control information for a server process. It is a non-shared memory area created by Oracle when a server process is started. Access to it is exclusive to that server process and is read and written only by Oracle code acting on behalf of it. The total PGA memory allocated by each server process attached to an Oracle instance is also referred to as the aggregated PGA memory allocated by the instance.

Components of the PGA vary depending on how the database is configured. The PGA is comprised of four memory areas

* User Session Memory

User session memory is the memory allocated to hold a session's variables (logon information) and other information related to the session. For a shared server, the session memory is shared and not private

* Private SQL Areas - contains data such as bind information and runtime buffers

A private SQL area contains data such as bind information and runtime memory structures. Each session that issues a SQL statement has a private SQL area. Each user that submits the same SQL statement has his or her own private SQL area that uses a single shared SQL area. Thus, many private SQL areas can be associated with the same shared SQL area.

The private SQL area of a cursor is itself divided into two areas whose lifetimes are different:

* The persistent area, which contains, for example, bind information. It is freed only when the cursor is closed.
* The run-time area, which is freed when the execution is terminated.

The location of a private SQL area depends on the type of connection established for a session. If a session is connected through a Dedicated Server, private SQL areas are located in the server process's PGA. However, if a session is connected through a Shared Server, part of the private SQL area (specifically, the peristent area) is kept in the SGA.

The runtime area contains information used while the SQL statement is being executed. The size of the runtime area depends on the type and complexity of the SQL statement being executed and on the sizes of the rows that are processed by the statement. In general, the runtime area is somewhat smaller for INSERT, UPDATE, and DELETE statements than it is for SELECT statements, particularly when the SELECT statement requires a sort.

Oracle creates the runtime area as the first step of an execute request. For INSERT, UPDATE, and DELETE statements, Oracle frees the runtime area after the statement has been executed. For queries, Oracle frees the runtime area only after all rows are fetched or the query is canceled.

The location of a private SQL area depends on the type of connection established for a session. If a session is connected via a dedicated server, private SQL areas are located in the user's PGA. However, if a session is connected via the multi-threaded server, the persistent areas and, for SELECT statements, the runtime areas, are kept in the SGA.

* SQL Work Areas

The application developer of an Oracle precompiler program or OCI program can explicitly open cursors, or handles to specific private SQL areas, and use them as a named resource throughout the execution of the program. Recursive cursors that Oracle issues implicitly for some SQL statements also use shared SQL areas.

The management of private SQL areas is the responsibility of the user process. The allocation and deallocation of private SQL areas depends largely on which application tool you are using, although the number of private SQL areas that a user process can allocate is always limited by the initialization parameter OPEN_CURSORS. The default value of this parameter is 50.

A private SQL area continues to exist until the corresponding cursor is closed or the statement handle is freed. Although Oracle frees the runtime area after the statement completes, the persistent area remains waiting. Application developers close all open cursors that will not be used again to free the persistent area and to minimize the amount of memory required for users of the application.

* Cursor and SQL Areas

For complex queries (for example, decision-support queries), a big portion of the runtime area is dedicated to work areas allocated by memory-intensive operators such as the following:

* Sort-based operators (order by, group-by, rollup, window function)
* Hash-join
* Bitmap merge
* Bitmap create

For example, a sort operator uses a work area (sometimes called the sort area) to perform the in-memory sort of a set of rows. Similarly, a hash-join operator uses a work area (also called the hash area) to build a hash table from its left input. If the amount of data to be processed by these two operators does not fit into a work area, then the input data is divided into smaller pieces. This allows some data pieces to be processed in memory while the rest are spilled to temporary disk storage to be processed later. Although bitmap operators do not spill to disk when their associated work area is too small, their complexity is inversely proportional to the size of their work area. Thus, these operators run faster with larger work area.

In shared server mode, portions of the UGA are stored in the following:

* Large pool if configured.
* Shared pool if large pool is not configured.

Note that as of 10g, the work areas controlled by PGA_AGGREGATE_TARGET are located in the pga for shared servers. On 9i shared server work areas are defined by the *_area_size parameters and located in the sga.

Why do I see processes growing larger than the PGA_AGGREGATE_TARGET?

Parameter PGA_AGGREGATE_TARGET does not set a hard limit on pga size. It is only a target value used to dynamically size the process work areas.
It also does not affect other areas of the pga that are allowed to grow beyond this limit. See below for explanation.

Can you control the size of a process?

As a database administrator you can control the size of a process within limitations.
For example, you can configure certain database initialization parameters that affect the size of a process work area.
If you choose, you can manually control the maximum size of SQL work areas by setting the following parameters: SORT_AREA_SIZE, HASH_AREA_SIZE, BITMAP_MERGE_AREA_SIZE and CREATE_BITMAP_AREA_SIZE. Setting these parameters is difficult, because the maximum work area size is ideally selected from the data input size and the total number of work areas active in the system. These two factors vary a lot from one work area to another and from one time to another. Thus, the various *_AREA_SIZE parameters are hard to tune under the best of circumstances.

With 9i and higher, you can automatically and globally manage the size of SQL work areas by specifying the total size dedicated to PGA memory for the Oracle instance by setting the initialization parameter PGA_AGGREGATE_TARGET. The specified number (for example, 2G) is a global target for the Oracle instance, and Oracle tries to ensure that the total amount of PGA memory allocated across all database server processes never exceeds this target. The key word in this statement is "tries". It is possible that PGA memory will grow beyond the "target".

The initialization parameter WORKAREA_SIZE_POLICY is a session- and system-level parameter that can be set to MANUAL or AUTO. The default is AUTO. You can switch back and forth between automatic and manual management of the work areas by setting PGA_AGGREGATE_TARGET and changing the WORKAREA_SIZE_POLICY parameter.

With PGA_AGGREGATE_TARGET, sizing of work areas for all dedicated sessions is automatic and all *_AREA_SIZE parameters are ignored for these sessions. Beginning with version 10 the PGA_AGGREGATE_TARGET is also applicable to shared server sessions.

At any given time, the total amount of PGA memory available to active work areas on the instance is automatically derived from the parameter PGA_AGGREGATE_TARGET. This amount is set to the value of PGA_AGGREGATE_TARGET minus the PGA memory allocated by other components of the system (for example, PGA memory allocated by sessions). The resulting PGA memory is then allotted to individual active work areas based on their specific memory requirement.

Keep in mind that the PGA_AGGREGATE_TARGET does not limit the amount of pga memory usage. It is only a target and is used to dynamically size the process work areas. It does not affect other areas of the pga that are allowed to grow beyond this limit.

There are certain areas of pga that cannot be controlled by initialization parameters. Such areas include pl/sql memory collections such as pl/sql tables and varrays.
Depending on the programming code and amount of data being handled these areas can grow very large (up to 20G internal limit on 10) and can consume large amounts of memory. This memory growth can be controlled by good programming practices. As an example, use LIMIT clause with BULK COLLECT.

See
http://www.oracle.com/technology/oramag/oracle/08-mar/o28plsql.html



Additionally, programming mistakes can also lead to excessive memory usage.
For example a recursive or infinite loop can use up memory. An errorstack will show this code as follows:

----- PL/SQL Call Stack -----
object line object
handle number name
70000003d2d6198 17 package body SCOTT.REGISTRY_PKG
70000003d2d6198 133 package body SCOTT.REGISTRY_PKG
70000003d2d6198 155 package body SCOTT.REGISTRY_PKG
70000003d2d6198 31 package body SCOTT.REGISTRY_PKG
70000003d2d6198 133 package body SCOTT.REGISTRY_PKG
70000003d2d6198 155 package body SCOTT.REGISTRY_PKG
PL/SQL call stack truncated after 1024 bytes.

Refer to following notes for more details on Automatic PGA Memory Management

Note 223730.1 Automatic PGA Memory Managment in 9i and 10g

Note 443746.1 Automatic Memory Management(AMM) on 11g

Can you limit the size of a process?

You can take steps to control the size of a process as discussed above.
However, from within the database framework you cannot place a hard limit on the size of a process by setting any initialization parameters or database configuration.
You can limit the size of a process from the OS side by setting kernel limits or user shell limits.
However, this leads to the ORA-4030 and will cause transaction rollback.


What information needs to be gathered to diagnose?

Resolving memory issues is typically an iterative process requiring gathering of information over a period of time to identify possible trends that typically leads to further data gathering and diagnosis.

As part of initial step some basic questions should be answered, because a memory issue is usually triggered by some change.
When did problem start?
What changes have been made to system, including OS, applications, database installation?
Can the problem be isolated to any particular query, job or application?
If problem can be isolated to particular code, does problem reproduce on other systems?
Does problem code work when run through local connection?

Then, need to gather data from the database and OS.

RDA - This will provide basic OS information that can be reviewed for memory/swap configuration, as well as database configuration.
Alert log - may list errors, trace files and non-default parameters for startup.
Related trace files - initial trace files are typically of not much use for memory issues; need to setup event for proper heapdump as discussed below.
Run database monitoring queries - the queries provided later in this note can be used to identify a particular process growing large or total pga and sga usage per instance.
Run OS monitoring queries - OSwatcher is a good start to see OS information related to memory,swap and relative process sizes
Get heapdumps - methods discussed below on how to gather



Why does my code give ORA-4030 when run through listener connection, but not local connection?

On most unix/linux OS, for local connections, the shadow processes spawned on the database server inherit the ulimits of the OS user that starts the database.
For processes spawned through a listener connection, it is the ulimits of the OS user that starts the listener that are inherited.

Need to verify that memory related resource limits as listed by ulimit -a are set large enough to prevent the ORA-4030.
See below discussion on ulimits.

When PAM authentication is being used on linux, the limits as set by pam may be an issue if starting the db or listener.
See Note 261220.1 Ulimit Command Unusable on RHEL 3.0

What to look at in RDA?

Need to review the ulimits for the OS oracle user that is starting the database and listener.
Typically an RDA may not be run as oracle user, but as root. Need to distinguish if that is the case and then ask for appropriate oracle user ulimits. On most OS, for local connections, the shadow processes spawned on the database server inherit the ulimits of the OS user that starts the database.
For processes spawned through a listener connection, it is the ulimits of the OS user that starts the listener that are inherited.

Look at physical ram and swap/paging configuration.
Depending on the OS, some information may show the free memory and paging used. Look in both the Overview->System Information link
and the Performance -> Overview link for information related to memory. This may include information from various OS commands/utilities such as swap,free,top,vmstat
For some OS, may need to request specific OS related information or try using the OSwatcher utility.

What kernel or shell limits need to be checked?

While Oracle support cannot address all cases due to the numerous OS releases and versions, can make the following general recommendations for typical platforms.

Check the ulimits of OS user starting the database and listener.
In particular we are interested in the data (d) and memory limits (m or v).
The stack (s) is typically not an issue unless set too high on a 32 bit system which can prevent database startup.

typically, all soft settings can be shown with
>ulimit -a

for soft limits of individual resource limit use S
example for data segment size limit
>ulimit -dS

for hard limits of individual resource limit use H
>ulimit -dH

Note that specifying unlimited may actually correspond to a numeric value, although unlimited is listed in the ulimit display.
May need to consult OS documentation or vendor to confirm what actual values may be. If in doubt can always set to a numeric value rather than unlimited.

Additionally, if on RAC, how are database/listener started?
Is lsrnctl used directly to start listener or using srvctl or CRS?

The difference relates back to the ulimits of user starting the listener, srvctl
or the root user limits starting the CRS.

Need to review the limits for data and memory of root user.
Note 758131.1 SOME Client Connection FAIL WITH ORA-4030 ERROR when
connecting using RAC
Note 753516.1 The difference between using srvctl vs using sqlplus for
start/stop one or more database nodes

For further information refer to Note 188149.1 How to Display and Change UNIX Process Resource Limits
or specific OS documentation.

For common kernel limits on various OS see Note 169706.1 Oracle® Database on AIX®,HP-UX®,Linux®,Mac OS® X,
Solaris���,Tru64 Unix® Operating Systems Installation and Configuration Requirements Quick Reference (8.0.5 to 11.1)

On Solaris,AIX and linux, it is the ulimit settings that determine process memory related resource limits. No kernel settings are used to control the process memory limits.


HP
still uses kernel parameters such as following that set hard limits
maxtsiz(_64bit) / maxdsiz(_64bit) / maxssiz(_64bit)

TRU64
make use of  kernel limits
/sbin/sysconfig -q proc
per-proc-data-size/max-per-proc-data-size
per-proc-address-space/max-per-proc-address-space
per-proc-stack-size/max-per-proc-stack-size
vm_rss_maxpercent

In some rare cases have seen some misleading process spawning or memory issues occurred due to other kernel settings that limited number of processes such as a low setting of
NPROC.  If such limits are set low and there are a large number of processes on the server consider increasing this.

How to monitor pga usage from within the database?

Queries can be run against the instance to show pga memory usage.
Depending on the time frame of the problem occurrence these queries need to be run at different time intervals to review for trends.

The queries are best run through a cron job, Oracle scheduler job or on MSwindows using the task scheduler.
An example unix shell script is contained in attachment memleak.sh that can be used to kick off the sql script at specified intervals. The script can be used to run pga_2010script1.sql containing pga memory queries.

For example start meamleak.sh collecting at 20 second increments

>./memleak.sh 20 > memleak.out


Monitor the pga usage for all processes related to an instance from v$sesstat.
Look at trends of individual processes growing in size.

REM v$sesstat pga/uga memory size
select p.spid, s.sid, substr(n.name,1,25) memory, s.value as Bytes from v$sesstat s, v$statname n, v$process p, v$session vs
where s.statistic# = n.statistic#
/* this query currently looks at both uga and pga, if only one of these is desired modify the like clause to pga or uga */
and n.name like '%ga memory%'
and s.sid=vs.sid
and vs.paddr=p.addr
/* --remove comment delimiters to view only certain sizes, i.e. over 10Mbytes
and s.value > 10000000 */
order by s.value asc;


List Largest process.

/* Do Not eliminate all background process because certain background processes do need to be monitored at times */
select pid,spid,substr(username,1,20) "USER" ,program,PGA_USED_MEM,PGA_ALLOC_MEM,PGA_FREEABLE_MEM,PGA_MAX_MEM
from v$process
where pga_alloc_mem=(select max(pga_alloc_mem) from v$process
where program not like '%LGWR%');


Summation of pga based on v$process

REM allocated includes free PGA memory not yet released to the operating system by the server process
select sum(pga_alloc_mem)/1024/1024 as "Mbytes allocated", sum(PGA_USED_MEM)/1024/1024 as "Mbytes used" from v$process;


Summation of pga memory based on v$sesstat

select sum(value)/1024/1024 as Mbytes from v$sesstat s, v$statname n
where
n.STATISTIC# = s.STATISTIC# and
n.name = 'session pga memory';


PGA stats from v$pgastat

select substr(name,1,30), value, unit from v$pgastat;



List all processes including pga size from v$process
Outer join will show if any defunct processes exist without associated session.

set linesize 120
set pagesize 120
column spid heading 'OSpid' format a8
column pid heading 'Orapid' format 999999
column sid heading 'Sess id' format 99999
column serial# heading 'Serial#' format 999999
column status heading 'Status' format a8
column pga_alloc_mem heading 'PGA alloc' format 99,999,999,999
column pga_used_mem heading 'PGA used' format 99,999,999,999
column username heading 'oracleuser' format a12
column osuser heading 'OS user' format a12
column program heading 'Program' format a20

SELECT
p.spid,
p.pid,
s.sid,
s.serial#,
s.status,
p.pga_alloc_mem,
p.PGA_USED_MEM,
s.username,
s.osuser,
s.program
FROM
v$process p,
v$session s
WHERE s.paddr ( + ) = p.addr
and p.BACKGROUND is null /* Remove if need to monitor background processes */
Order by p.pga_alloc_mem desc;


Summation of pga and sga gives a value of total memory usage by oracle instance
--look at total memory used by instance SGA and PGA

select sum(bytes)/1024/1024 as Mbytes from
(select value as bytes from v$sga
union all
select value as bytes from
v$sesstat s,
v$statname n
where
n.STATISTIC# = s.STATISTIC# and
n.name = 'session pga memory'
);

 
 

How to monitor memory usage from the OS on unix/linux?

Memory usage on the OS side needs to be monitored.
A good method on unix/linux boxes is to setup OSwatcher per note:
Note.301137.1 Ext/Pub OS Watcher User Guide

OS Watcher (OSW) is a collection of UNIX shell scripts intended to collect and archive operating system and network metrics to aid support in diagnosing performance issues. OSW operates as a set of background processes on the server and gathers OS data on a regular basis, invoking such Unix utilities as top, vmstat, netstat and iostat.

Depending on the issue, set the rate data is gathered to reasonable value
If problem quickly reproduces may want low values such as 5 minutes or even lower.
If problem is unknown or long running may want to set to 30 minutes or higher.

Other methods may include customer scripts or 3rd party utilities that monitor OS memory usage.

How to monitor memory usage from the OS on MSwindows?

Run perfmon utility to monitor memory usage.
This tool shows the memory used by the Oracle process more accurately than the information in Task Manager.

Method to do this varies on the windows version, but typical method is as follows:
If you go to the Control Panel (under Start, Settings) and click on Adminstrative Tools you can startup the Performance monitor.
You can also start up a Windows command window and type 'Perfmon'.

This tool shows the memory used by the Oracle process more accurately than the information in Task Manager.
Under the console tree, select Performance Logs and Alerts. Right click on 'Counter Logs' and select 'New Log Settings' to start a new log.
Click on the Add Counters button to add new counters to the log.
Click on 'Process' under 'Performance Object'.
Select counters, 'Private Bytes', 'Thread Count', 'Handles', and 'Virtual Bytes' in the 'Select counters from list' window.
You need to select these counters for the process, 'Oracle' and 'Total' located in the 'Select instance from list' window. Click on the 'Add' button.
Also, select Performance object 'Memory' and select counter 'Available MBytes'. This will track
overall free memory available on the system.

Set the appropriate sample data interval. If problem occurs over a very long period of time such as several hours, then may set the interval to a few minutes.
If problem occurs relatively quickly on order of a few minutes, then set interval to a few seconds.
These logs can be uploaded to oracle support for review and can be imported into MSExcel for graphing to review for trends.
If the log is directly saved as a comma delimited file it can be imported directly. If saved as a binary file, it must first be opened with the permon tool and then saved as a comma delimited file for import into MSExcel.

Monitor memory usage on the 'Virtual Bytes' counter. 'Virtual Bytes' shows the actual "committed memory" that the process is allocated.

Why do we still get ORA-4030 or ORA-12500 on MSwindows 32 bit database after adding more ram?

In a windows 32 bit system the addressable memory is limited to 4G for a process. This translates to about 1.8G of actual memory usage that an oracle process can address including the SGA and PGA usage. This can be increased to about 3G with use of the /3GB switch configured in the boot.ini file.
So even if you have over 4G of ram, a single instance cannot address any more memory above this 4G. This includes the total memory usage of an SGA and the sum of all associated PGAs (threads associated with the oracle process).
An exception to this is Address Windowing Extensions (AWE).
AWE allows a process to address memory above the 4G limit with use of /PAE switch. Oracle uses the AWE to allow usage of additional buffer cache above the 4G limit
with setting of database parameter USE_INDIRECT_DATA_BUFFERS=TRUE.
The AWE window is allocated in the lower 4G addressable memory space and by default is 1G. AWE size can be explicitly set in registry to reduce this size below 1G subject to the calculation as determined by
Note 225349.1.
The threads attached to the oracle process(the PGAs) and other SGA components cannot make use of this AWE and thus cannot make use of any memory above the 4G. So need to consider that even if you setup the /3GB switch and use AWE it may reduce addressable memory available to the PGA.
See Note 225349.1 Implementing Address Windowing Extensions (AWE) or VLM on Windows Platforms that discusses configuration of /3GB and /PAE.

Also, note that with the 3GB switch set it can impact the amount of nonpaged pool space available to the OS. This space is used for async i/o and in an i/o intensive system could lead to i/o errors. The /USERVA switch may need to be set to increase available nonpaged pool space. The default value is 3070.
May need to set this lower to 3030 or lower such as 2090.

How to create a heapdump?

Three methods of gathering a heapdump are shown below depending on problem occurrence.

Method A. Set event in the pfile if general cause of error is unknown
Method B. Set event at session level if problem can be easily reproduced from session
Method C. Attach using oradebug to specific process

Method A
Set event at instance level to fire for any occurrence of the ORA-4030.
The following will set in memory.

ALTER SYSTEM SET EVENTS '4030 trace name heapdump level 536870917;name errorstack level 3';

To turn off these events:
ALTER SYSTEM SET EVENTS '4030 trace name heapdump off;name errorstack off';

If instance will be restarted, need to set in spfile to persist across startup.

ALTER SYSTEM SET EVENT='4030 trace name heapdump level 536870917;name errorstack level 3' scope=spfile;
or
place in pfile immediately after any other events
event= '4030 trace name heapdump level 536870917;name errorstack level 3'

Note that if on a RAC system need to also specify the instance or all instances with sid= in the alter statement.

Specify SID = '*' if you want Oracle Database to change the value of the parameter for all instances.

Specify SID = 'sid' if you want Oracle Database to change the value of the parameter only for the instance sid. This setting takes precedence over previous and subsequent ALTER SYSTEM SET statements that specify SID = '*'.

To remove just these events from spfile:
ALTER SYSTEM SET EVENT='4030 trace name heapdump off;name errorstack off' scope=spfile;

Method B
Set event only at session level.
This will set event at session level and create trace file when error occurs in that session only.

ALTER SESSION SET EVENTS '4030 trace name heapdump level 536870917;name errorstack level 3';

Method C
Can attach to a particular process using oradebug to get a heapdump.
This is useful and required if the process grows large, but does not cause an error.

Reproduce the problem, and monitor process memory growth.

Start a new sqlplus session to attach with oradebug

SQL>sqlplus /nolog
SQL>conn / as sysdba
SQL> alter system set max_dump_file_size=unlimited;
SQL> connect sys as sysdba
SQL> oradebug setospid <OSpid from query above for unix systems>

If on a MSwindows platform you will need to attach using the setorapid instead of setospid
SQL> oradebug setorapid <Orapid from query above>

SQL> oradebug unlimit
SQL> oradebug dump errorstack 10
SQL> oradebug dump heapdump 536870917
SQL> oradebug tracefile_name (shows the path and filename information)
SQL> oradebug close_trace (closes the trace file)
SQL>exit

What heapdump level to gather?

@Internal see note for heapdump level definitions

With an ORA-4030 or process memory growth we are interested in the following heaps:

                Level
top PGA         1 --on release 10 and higher includes top call heap
top UGA        4
Current call     8
User call         16

As of release 9.2.0.5 and higher, also interested in generating the five largest subheaps for each subheap with additional level 536870912

Typical level to gather would be 536870917
On releases lower than 9.2.0.5 will not be able to use this; use level 5

See high amount of 'perm' allocations. When to set 10235 event?

In some cases the heapdump may show large allocations of type 'perm'.
On releases 9.2.0.5 and higher can set event 10235 level 65536 in order to further diagnose what these perm allocations are.
However, this event should only be set under direction from Oracle Support.

This level causes comments to be included with each perm allocation where possible.
The comments show as "cprm" chunks under the "perm" chunks in the PERMANENT section of the heapdump.

IMPORTANT: Do NOT set this event at session level. It can cause ORA-600 dumps.
Set at instance level only.

ALTER SYSTEM SET EVENTS '10235 trace name context forever, level 65536';

To turn off:
ALTER SYSTEM SET EVENTS '10235 trace name context off';

If instance will be restarted, need to set in spfile to persist across startup.

ALTER SYSTEM SET EVENT='10235 trace name context forever, level 65536' scope=spfile;
or
place in pfile immediately after any other events
event = '10235 trace name context forever, level 65536'

Note that if on a RAC system need to also specify the instance or all instances with sid= in the alter statement.

To turn off:
ALTER SYSTEM SET EVENT='10235 trace name context off' scope=spfile;



Configurations leading to excessive memory usage problems

High swap usage on Solaris
-------------------------------
DISM may be in use.

Solaris Dynamic Intimate Shared Memory (DISM) provides shared memory that is dynamically resizable. DISM means that applications can respond to changes in memory availability by dynamically increasing or
decreasing the size of optimized shared memory segments. Oracle uses DISM for its dynamic System Global Area (SGA) capability.

DISM is available on Solaris 9 and higher.

Oracle database release 9i and higher uses DISM depending on certain database parameter settings.
Oracle 9i and higher, if SGA_MAX_SIZE > SGA_TARGET (or sum of sga compenents) then DISM is used.
On 11G, DISM is also used if MEMORY_TARGET or MEMORY_MAX_TARGET is set.
Otherwise, ISM is used if DISM is not enabled.

There are two issues to be aware of when DISM is being used:

First, DISM (unlike ISM), requires a swap reservation for all pages, whether or not they're allocated.
This basically means need to have at least as much swap as the SGA size for instances that
are using DISM. An ORA-4030 could result on systems with inadequate swap space.

Second, additional issue with DISM per Sun bug 6559612.
If the ora_dism daemon fails to start, or dies for any reason, that can cause excessive memory/swap space usage to occur relative to the amount of async IO activity. ORA-4030 errors may occur and you could also see other errors in the database alert log such as following:

ORA-17500: ODM err:ODM ERROR V-41-4-1-253-12 Not enough space

To determine if DISM is being used for an Oracle instance, you can look on the OS side with ps tool.
>ps -aef | grep dism

If dism is being used it will show as a process in format ora_dism_$ORACLE_SID

To disable use of DISM can simply unset SGA_MAX_SIZE, MEMORY_TARGET, MEMORY_MAX_TARGET
 or set sga_max_size=sga_target
or allow sga_max_size to default to sum of the sga components.


ZFS file system
---------------
Oracle processes such as exp utility writing to a ZFS file system appear to use all of system memory.
The ZFS adaptive replacement cache (ARC) tries to use most of a system's available memory to cache file system data.
The default is to use all of physical memory except 1 Gbyte. As memory pressure increases, the ARC relinquishes memory."
See the ZFS Best Practices Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_Administration_Considerations

or latest Solaris 10 documentation for configuring zfs_arc_max to limit cache memory.


Mounting Oracle_home location in concurrent I/O mode
----------------------------------------------------
Having an ORACLE_HOME on a filesystem mounted with "cio" option is not supported.
Such a configuration will cause, installation, relinking and other unexpected problems.
One such problem is high memory use for oracle background processes
CIO creates private memory allocations for the executable files in ORACLE_HOME that cannot be shared or reused.
This leads to "orphans" memory segments that are not shown from the database side (v$process),
but the OS shows more and more memory growth--particularly for background processes.


Unused free memory on linux system with hugepages configured
------------------------------------------------------------
If SGA size > HugePages size setting, then hugepages cannot be used.
This results in free memory never being used.
The sizing of hugepages should be just greater than the sga size in order for hugepages to be
implemented and to minimize wasted memory.

Verify if hugepages are being used.

>grep Huge /proc/meminfo

HugePages_Total: 1350
HugePages_Free: 5
Hugepagesize: 2048 kB

HugePages_Free should be much lower than HugePages_Total. If equivalent, then hugepages may not be
used because the SGA > hugepages.

Example:

sum of sga components or SGA_MAX_SIZE = 2648M

Need to configure at least that much space for hugepages.
Size for 2700M with a 2K hugepage size == 2764800Kb / 2048kb = 1350 pages required.
If you were to set the SGA larger than this the hugepages could not be used. You would need to
adjust the number of hugepages higher.

See Note 361323.1 HugePages on Linux for further information regarding hugepages.


Excessive memory/swap on AIX when large pages not configured correctly
----------------------------------------------------------------------------
If database parameter LOCK_SGA=true then large pages will be used.
If not configured properly, that can lead to excessive memory/swap usage.
Configure large pages for the SGA per
Note 372157.1 How to enable Large Page Feature on AIX-Based Systems
The utility svmon -G will show large page size=16 MB in use if properly configured.
Note, also that on 10.2.0.4 large pages are not used properly due to unpublished Bug 7226548.
See Note:740603.110.2.0.4 not using large pages on AIX.





References

NOTE:199746.1 - How to Resolve ORA-4030 Errors on UNIX
NOTE:233869.1 - Diagnosing and Resolving ORA-4030 errors

No comments: