Tuesday 26 May 2015

Selectivity Calculation in SQL Queries



This article explains how the cost-based optimizer (CBO) determines the
selectivity for various types of predicate.



Selectivity
~~~~~~~~~~~
Selectivity is a measure of the proportion of a row source retrieved by
application of a particular predicate or combination of predicates.

Within the Oracle kernel it is expressed as a value between 0 and 1.
The closer the value is to 0 the more selective the predicate is.
Selectivity is only used by the CBO.

Basic Selectivity formula:
~~~~~~~~~~~~~~~~~~~~~~~~~~

                Number of records satisfying a condition
Selectivity  =  -----------------------------------------
                Total number of records

In the optimizer, selectivity is used to compare the usefulness of various
predicates in combination with base object costs.

Knowing the proportion of the total data set that a column predicate returns
is very helpful in determining actual access costs.

By default, column selectivity is based on the high and low values and the
number of values in the column with an assumption of even distribution of
data between these two points.

Histogram data can give better selectivity estimates for unevenly distributed
column data. There is more discussion regarding Histograms later.

Selectivity is also used to define the cardinality of a particular row source
once predicates have been applied. Cardinality is the expected number of rows
that will be retrieved from a row source. Cardinality is useful in determining
nested loop join and sort costs. Application of selectivity to the original
cardinality of the row source will produce the expected (computed) cardinality
for the row source.
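As a quick cross-check, the equality-predicate estimate can be reproduced from the dictionary statistics. This is a sketch only; T1 and C1 are hypothetical objects assumed to have freshly gathered statistics and no histogram:

-- Sketch only: T1 / C1 are hypothetical objects with gathered statistics.
-- For an equality predicate on C1 with no histogram, the computed
-- cardinality is roughly NUM_ROWS * (1 / NUM_DISTINCT).
select t.num_rows,
       c.num_distinct,
       round(t.num_rows / c.num_distinct) as computed_cardinality
from   user_tables             t,
       user_tab_col_statistics c
where  t.table_name  = 'T1'
and    c.table_name  = 'T1'
and    c.column_name = 'C1';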


Glossary of Terms:
~~~~~~~~~~~~~~~~~~

NDV          Number of Distinct Values
Cardinality  Number of rows
Selectivity  Proportion of a dataset returned by a particular predicate (or
             group of predicates)

In the following illustrations there are two tables, T1 and T2, with columns c1 and c2 respectively.

Selectivities:
~~~~~~~~~~~~~~
Without histograms
~~~~~~~~~~~~~~~~~~
c1 = '4076'              1/NDV
c1 > '4076'              (High - Value) / (High - Low)
c1 >= '4076'             (High - Value) / (High - Low) + 1/NDV
c1 like '4076'           1/NDV
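As a worked sketch with purely hypothetical statistics (low value 1, high value 10000, NDV 1000), the formulas above give 0.001 for the equality predicate and roughly 0.1 for c1 > 9000:

-- Hypothetical column statistics: low = 1, high = 10000, NDV = 1000
select 1/1000                                 as eq_sel,   -- c1 = <value>
       (10000 - 9000) / (10000 - 1)           as gt_sel,   -- c1 > 9000
       (10000 - 9000) / (10000 - 1) + 1/1000  as ge_sel    -- c1 >= 9000
from   dual;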

Join selectivity
~~~~~~~~~~~~~~~~

The selectivity of a join is defined as the selectivity of the most selective
join column adjusted by the proportion of not null values in each join column.


 Sel = 1/max[NDV(t1.c1), NDV(t2.c2)] *
       ( (Card t1 - # t1.c1 NULLs) / Card t1 ) *
       ( (Card t2 - # t2.c2 NULLs) / Card t2 )
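A sketch that plugs the dictionary statistics into this formula, assuming the hypothetical tables T1 and T2 (with gathered statistics) are joined on t1.c1 = t2.c2:

-- Sketch only: T1.C1 and T2.C2 are the hypothetical join columns.
select 1 / greatest(c1.num_distinct, c2.num_distinct)
         * ( (t1.num_rows - c1.num_nulls) / t1.num_rows )
         * ( (t2.num_rows - c2.num_nulls) / t2.num_rows ) as join_selectivity
from   user_tables t1, user_tables t2,
       user_tab_col_statistics c1, user_tab_col_statistics c2
where  t1.table_name = 'T1' and c1.table_name = 'T1' and c1.column_name = 'C1'
and    t2.table_name = 'T2' and c2.table_name = 'T2' and c2.column_name = 'C2';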

Bind Variable selectivity
~~~~~~~~~~~~~~~~~~~~~~~~~

Bind variables present a special case because the optimizer has no idea what
the bind variable value is prior to query optimization. This does not present
a problem with equality predicates since a uniform distribution of data is
assumed and the selectivity is taken as 1/NDV for the column. However for
range predicates it presents a major issue because the optimizer does not
know where the range starts or stops. Because of this the optimizer has to
make some assumptions as follows:

c1 =    :bind1           1/NDV
c1 >    :bind1           Default of 5%
c1 >=   :bind1           Default of 5%
c1 like :bind1           Default of 25%

For more information on bind variables see Note:70075.1

Selectivity With Histograms
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Histograms provide additional selectivity information for columns whose data
distribution is non-uniform. A histogram stores information about ranges of
column values; each range is stored in a single row and is often called a
'bucket'. There are two different methods for storing histograms in Oracle.
If there is a small number of distinct column values (i.e. fewer than the
number of buckets), each column value and its count are stored (a frequency
histogram). Otherwise a series of endpoints is stored (a height-balanced
histogram) to enable a more accurate selectivity to be determined.

The first method allows exact figures to be used. With inexact histograms,
however, the terms popular and non-popular value are introduced and are used
to help determine selectivity: a popular value spans multiple endpoints,
whereas a non-popular value does not.
See Note:72539.1 for more information on histograms.

Exact histograms
~~~~~~~~~~~~~~~~
c1 = '4076'         count of value '4076' / Total Number of Rows
c1 > '4076'         count of values > '4076' / Total Number of Rows

Inexact histograms
~~~~~~~~~~~~~~~~~~
col = popular value     # popular buckets / # buckets
col = non-popular value Density
col > value             # buckets > value / # buckets
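To see which kind of histogram (if any) a column has and to inspect its buckets, the dictionary can be queried; a sketch assuming the hypothetical table T1 and column C1:

-- Optionally gather a histogram first (hypothetical object names):
--   exec dbms_stats.gather_table_stats(user, 'T1', method_opt => 'for columns c1 size 254')

select histogram, num_buckets, num_distinct, density
from   user_tab_col_statistics
where  table_name = 'T1' and column_name = 'C1';

select endpoint_number, endpoint_value
from   user_tab_histograms
where  table_name = 'T1' and column_name = 'C1'
order  by endpoint_number;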


Rules for combining selectivity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let P1 and P2 be two distinct predicates of query Q. Assuming the predicates
are independent:

 P1 AND P2
       S(P1&P2) = S(P1) * S(P2)
 P1 OR P2
       S(P1|P2) = S(P1) + S(P2) - [S(P1) * S(P2)]
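A worked sketch with hypothetical selectivities S(P1) = 0.1 and S(P2) = 0.02:

-- P1 AND P2 = 0.1 * 0.02              = 0.002
-- P1 OR  P2 = 0.1 + 0.02 - 0.1 * 0.02 = 0.118
select 0.1 * 0.02                as and_selectivity,
       0.1 + 0.02 - (0.1 * 0.02) as or_selectivity
from   dual;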

Index Selectivity for concatenated indexes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting with 10.2, when a concatenated index, with all its columns having
equality predicates, is used as an access path, the optimizer uses 1/NDK as
the selectivity (where NDK is the number of distinct keys in the index).

On 9.2.0.7 and 9.2.0.8 this may be enabled with event 38060 level 1.
On 10.1.0.4 and 10.1.0.5 this may be enabled with event 38059 level 1.
On 10.2 adjustments will be made to the selectivity to account for nulls
in the index keys. This also occurs on 10.1.0.5 (with event 38059).
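The NDK figure the optimizer uses here is visible as DISTINCT_KEYS in the index statistics; a sketch with a hypothetical concatenated index named T1_IX1:

-- 1 / DISTINCT_KEYS is the selectivity used when all index columns
-- have equality predicates (T1_IX1 is a hypothetical index name).
select index_name, distinct_keys,
       1 / distinct_keys as index_selectivity
from   user_indexes
where  index_name = 'T1_IX1';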


Join cardinality
~~~~~~~~~~~~~~~~

 Card(Pj) = Card(T1) * Card(T2) * Sel(Pj)
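Worked sketch with hypothetical figures: Card(T1) = 10,000, Card(T2) = 2,000 and Sel(Pj) = 1/500 give an expected join cardinality of 40,000 rows:

-- Hypothetical inputs: Card(T1) = 10000, Card(T2) = 2000, Sel(Pj) = 1/500
select 10000 * 2000 * (1/500) as join_cardinality from dual;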




Reference:  https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=334064325845090&parent=DOCUMENT&sourceId=740052.1&id=68992.1&_afrWindowMode=0&_adf.ctrl-state=rjetmzhiv_144

Sunday 19 April 2015

Understanding and repairing ASM disk layout using kfed


Kfed parameters

  • aun - Allocation Unit (AU) number to read from. Default is AU0, or the very beginning of the ASM disk.
  • aus - AU size. Default is 1048576 (1MB). Specify the aus when reading from a disk group with non-default AU size.
  • blkn - block number to read. Default is block 0, or the very first block of the AU.
  • dev - ASM disk or device name. Note that the keyword dev can be omitted, but the ASM disk name is mandatory.
Read ASM disk header block from  AU[0]
[root@grac41 Desktop]# kfed read  /dev/asm_test_1G_disk1 | egrep 'name|size|type'
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD<-- ASM disk header
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9   <-- ASM disk name
kfdhdb.grpname:                    TEST ; 0x048: length=4   <-- ASM DG name
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9   <-- ASM Failgroup
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200     <-- Disk sector size   
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000     <-- ASM block size
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000 <-- AU size : 1 Mbyte 
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff <-- ASM disk size : 1 GByte  

Check ASM block types for the first 2 AUs 
AU[0] :
[root@grac41 Desktop]# kfed find /dev/asm_test_1G_disk1
Block 0 has type 1
Block 1 has type 2
Block 2 has type 3
Block 3 has type 3
Block 4 has type 3
Block 5 has type 3
Block 6 has type 3
Block 7 has type 3
Block 8 has type 3
Block 9 has type 3
Block 10 has type 3
..
Block 252 has type 3
Block 253 has type 3
Block 254 has type 3
Block 255 has type 3

AU[1] :
[root@grac41 Desktop]#  kfed find /dev/asm_test_1G_disk1 aun=1
Block 256 has type 17
Block 257 has type 17
Block 258 has type 13
Block 259 has type 18
Block 260 has type 13
..
Block 508 has type 13
Block 509 has type 13
Block 510 has type 1
Block 511 has type 19

Summary :
--> ASM disk sector size is 512 bytes; ASM metadata block size is 4096 bytes
    AU size = 1 MByte --> 1048576 / 4096 = 256 blocks per AU ( blocks 0 - 255 )
    Block 0 ( AU[0] blkn=0 ) and block 510 ( AU[1] blkn=254 ) both hold an ASM disk header ( == type 1 )

Run the kfed command below if you are interested in a certain ASM block type ( use the output of kfed find above to locate the block number for that type )
[root@grac41 Desktop]#  kfed read  /dev/asm_test_1G_disk1 aun=1 blkn=255  | egrep 'type'
kfbh.type:                           19 ; 0x002: KFBTYP_HBEAT

Some ASM block types ( kfbh.type values collected from kfed read output of various blocks, e.g.: )
[root@grac41 Desktop]# kfed read  /dev/asm_test_1G_disk1 aun=0 blkn=0  | egrep 'type' 
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.type:                            5 ; 0x002: KFBTYP_LISTHEAD
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
kfbh.type:                           18 ; 0x002: KFBTYP_PST_DTA
kfbh.type:                           19 ; 0x002: KFBTYP_HBEAT

Repair ASM disk header block in AU[0] with kfed repair

In ASM versions 11.1.0.7 and later, the ASM disk header block is backed up in the second-to-last ASM metadata block of allocation unit 1. With a 1 MByte AU and 4096-byte metadata blocks this is aun=1 blkn=254, i.e. absolute block 510.
 
Verify the ASM disk header block in AU[0] and its backup in AU[1]
AU[0] :
[root@grac41 Desktop]# kfed read  /dev/asm_test_1G_disk1 aun=0 blkn=0 | egrep 'name|size|type'
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9
kfdhdb.grpname:                    TEST ; 0x048: length=4
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff

AU[1] : 
[root@grac41 Desktop]# kfed read  /dev/asm_test_1G_disk1 aun=1 blkn=254  | egrep 'name|size|type'
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9
kfdhdb.grpname:                    TEST ; 0x048: length=4
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff

Erase Disk header block in first AU ( aun=0 blkn=0 )
# dd if=/dev/zero of=/dev/asm_test_1G_disk1  bs=4096 count=1

Verify ASM disk header
# kfed read /dev/asm_test_1G_disk1 aun=0 blkn=0
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
--> Corrupted ASM disk header detected in AU [0]
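As an additional cross-check from the ASM instance, the disk's header status can be inspected; after zeroing the header it typically no longer shows as MEMBER (a sketch, run in the ASM instance; the path matches the test disk used above):

-- HEADER_STATUS of a wiped header usually shows CANDIDATE or UNKNOWN
select path, name, header_status, mount_status, state
from   v$asm_disk
where  path = '/dev/asm_test_1G_disk1';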

Repair disk header in AU[0] with kfed 
[grid@grac41 ASM]$ kfed repair  /dev/asm_test_1G_disk1 
[grid@grac41 ASM]$ kfed read /dev/asm_test_1G_disk1 aun=0 blkn=0
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9
kfdhdb.grpname:                    TEST ; 0x048: length=4
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff
--> kfed repair worked - Disk header restored 

Can kfed repair the backup disk header block stored in the 2nd AU?


Delete disk header block in AU[1]
First use dd to figure out whether we are reading the correct block
[grid@grac41 ASM]$ dd if=/dev/asm_test_1G_disk1 of=block1 bs=4096 count=1 skip=510 ; strings block1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000464628 s, 8.8 MB/s
ORCLDISK
TEST_0000
TEST
TEST_0000
--> looks like an ASM disk header - go ahead and erase that block

[grid@grac41 ASM]$  dd if=/dev/zero of=/dev/asm_test_1G_disk1  bs=4096 count=1  seek=510 
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00644028 s, 636 kB/s

Verify ASM disk header block in AU[1]
[grid@grac41 ASM]$ kfed read /dev/asm_test_1G_disk1 aun=1 blkn=254
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
--> Corrupted ASM disk header detected

[grid@grac41 ASM]$ kfed repair  /dev/asm_test_1G_disk1 
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]
--> kfed repair doesn't work for the backup header block in AU[1]

Repair the backup block with dd ( copy the valid disk header from block 0 to block 510 )
[grid@grac41 ASM]$ dd if=/dev/asm_test_1G_disk1 of=/dev/asm_test_1G_disk1 bs=4096 count=1 seek=510
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0306682 s, 134 kB/s
[grid@grac41 ASM]$ kfed read /dev/asm_test_1G_disk1 aun=0 blkn=0
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9
kfdhdb.grpname:                    TEST ; 0x048: length=4
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff

# kfed read /dev/asm_test_1G_disk1 aun=1 blkn=254
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:               TEST_0000 ; 0x028: length=9
kfdhdb.grpname:                    TEST ; 0x048: length=4
kfdhdb.fgname:                TEST_0000 ; 0x068: length=9
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.dsksize:                    1023 ; 0x0c4: 0x000003ff

Summary:
kfed repair can restore the ASM disk header in AU[0] from its backup, but to fix the backup block itself ( AU[1] blkn=254, absolute block 510 ) you need to use dd.

Reference

http://www.hhutzler.de/blog/using-kfeddd-to-repair-asm-disk-header-corruption/



Startup sequence of 11gR2 Clusterware

Short summary of the startup sequence: INIT spawns init.ohasd (with respawn) which in turn starts the OHASD process (Oracle High Availability Services Daemon).  This daemon spawns 4 processes.

Level 1: OHASD Spawns:
  • cssdagent - Agent responsible for spawning CSSD.
  • orarootagent - Agent responsible for managing all root owned ohasd resources.
  • oraagent - Agent responsible for managing all oracle owned ohasd resources.
  • cssdmonitor - Monitors CSSD and node health (along with the cssdagent).
Level 2: OHASD rootagent spawns:
  • CRSD - Primary daemon responsible for managing cluster resources.
  • CTSSD - Cluster Time Synchronization Services Daemon
  • Diskmon
  • ACFS (ASM Cluster File System) Drivers 
Level 2: OHASD oraagent spawns:
  • MDNSD - Multicast DNS daemon, used for DNS lookup and service discovery within the cluster
  • GIPCD - Used for inter-process and inter-node communication
  • GPNPD - Grid Plug & Play Profile Daemon
  • EVMD - Event Monitor Daemon
  • ASM - Resource for monitoring ASM instances
Level 3: CRSD spawns:
  • orarootagent - Agent responsible for managing all root owned crsd resources.
  • oraagent - Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
  • Network resource - To monitor the public network
  • SCAN VIP(s) - Single Client Access Name Virtual IPs
  • Node VIPs - One per node
  • ACFS Registry - For mounting ASM Cluster File System
  • GNS VIP (optional) - VIP for GNS
Level 4: CRSD oraagent spawns:
  • ASM Resource - ASM instance(s) resource
  • Diskgroup - Used for managing/monitoring ASM diskgroups.  
  • DB Resource - Used for monitoring and managing the DB and instances
  • SCAN Listener - Listener for single client access name, listening on SCAN VIP
  • Listener - Node listener listening on the Node VIP
  • Services - Used for monitoring and managing services
  • ONS - Oracle Notification Service
  • eONS - Enhanced Oracle Notification Service
  • GSD - For 9i backward compatibility
  • GNS (optional) - Grid Naming Service - Performs name resolution


Saturday 14 March 2015

Oracle Wait events

buffer busy waits

The buffer busy waits event occurs when a session wants to access a data block in the buffer cache that is currently in use by some other session. The other session is either reading the same data block into the buffer cache from the datafile, or it is modifying the one in the buffer cache.
In order to guarantee that the reader session has a coherent image of the block with either all of the changes or none of the changes, the session modifying the block marks the block header with a flag letting other sessions know that a change is taking place and to wait until the complete change is applied.
Prior to Oracle Database 10g Release 1, the buffer busy waits event was posted by the session when it had to wait for another session to read the same data block into the buffer cache. Starting with Oracle Database 10g Release 1, such waits are posted as the read by other session event, and the buffer busy waits event denotes waits by a session for a data block change to be completed by some other session.
Oracle Database 10g Release 1 has another event titled buffer busy. Do not confuse this event with buffer busy waits. The buffer busy event is posted by sessions accessing cached metadata in a database using Automatic Storage Management (ASM).
Although the view V$WAITSTAT is not a component of the Oracle Wait Interface, it provides valuable wait statistics for each class of buffer. The most common buffer classes that encounter buffer busy waits are data blocks, segment headers, undo blocks, and undo headers.
The following example shows a sample output from querying V$WAITSTAT view:
 select *
from   v$waitstat
where  count > 0;


CLASS                   COUNT       TIME
------------------ ---------- ----------
data block            4170082    1668098
segment header            116         98
undo header               916       1134
undo block               2087       1681

Wait Parameters

Wait parameters for buffer busy waits are described here:
  • P1 From Oracle8 Database onwards, P1 shows the absolute file number where the data block in question resides.
  • P2 The actual block number the processes need to access.
  • P3 Prior to Oracle Database 10g Release 1, this is a number indicating the reason for the wait. Oracle posts this event from multiple places in the kernel code with different reason codes. The value of the reason code depends on the Oracle release; that is, the reason codes changed from pre-Oracle8 Database to Oracle9i Database. Oracle Database 10g Release 1 does not use reason codes anymore, and P3 identifies the block class (the same block classes that appear in V$WAITSTAT). Chapter 6 has more details about how to interpret this information.
For Oracle releases prior to Oracle Database 10g Release 1, Table 3-1 lists the reason codes and their descriptions. The reason code in parentheses applies to Oracle releases 8.1.5 and below.
Wait time: 1 second (100 centiseconds)
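With the file# (P1) and block# (P2) from V$SESSION_WAIT, the segment being contended for can usually be identified from the extent map; a sketch (substitute the actual P1/P2 values):

-- &P1 = absolute file number, &P2 = block number taken from the wait
select owner, segment_name, segment_type
from   dba_extents
where  file_id = &P1
and    &P2 between block_id and block_id + blocks - 1;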

control file parallel write

The control file parallel write event occurs when the session waits for the completion of the write requests to all of the control files. The server process issues these write requests in parallel. Starting with Oracle 8.0.5, the CKPT process writes the checkpoint position in the online redo logs to the control files every three seconds. Oracle uses this information during database recovery operation. Also, when you perform DML operations using either the NOLOGGING or UNRECOVERABLE option, Oracle records the unrecoverable SCN in the control files. The Recovery Manager (RMAN) records backup and recovery information in control files.
There is no blocking session for control file parallel write. The session is blocked waiting on the OS and its I/O subsystem to complete the write to all control files. The session performing the write to the control files will be holding the CF enqueue so other sessions may be waiting on this enqueue. If systemwide waits for this wait event are significant, this indicates either numerous writes to the control file, or slow performance of writes to the control files.

Wait Parameters

Wait parameters for control file parallel write are described here:
  • P1 Number of control files the server process is writing to
  • P2 Total number of blocks to write to the control files
  • P3 Number of I/O requests

Wait Time

The actual elapsed time to complete all I/O requests.

db file parallel read

Contrary to what the name suggests, the db file parallel read event is not related to any parallel operation—neither parallel DML nor parallel query. This event occurs during the database recovery operation when database blocks that need changes as a part of recovery are read in parallel from the datafiles. This event also occurs when a process reads multiple noncontiguous single blocks from one or more datafiles.

Wait Parameters

Wait parameters for db file parallel read are described here:
  • P1 Number of files to read from
  • P2 Total number of blocks to read
  • P3 Total number of I/O requests (the same as P2 since multiblock read is not used)

Wait Time

No timeouts. The session waits until all of the I/Os are completed.

db file parallel write

Contrary to what the name suggests, the db file parallel write event is not related to any parallel DML operation. This event belongs to the DBWR process, as it is the only process that writes the dirty blocks to the datafiles. The blocker is the operating system I/O subsystem. This can also have an impact on the I/O subsystem in that the writes may impact read times of sessions reading from the same disks.
DBWR compiles a set of dirty blocks into a “write batch”. It issues multiple I/O requests to write the write batch to the datafiles and waits on this event until the I/O requests are completed. However, when using asynchronous I/O, DBWR does not wait for the whole batch write to complete, it waits only for a percentage of the batch to complete before pushing the free buffers back onto the LRU chain so that they can be used. It may also issue more write requests.

Wait Parameters

Wait parameters for db file parallel write are described here:
  • P1 Number of files to write to
  • P2 Total number of blocks to write
  • P3 From Oracle9i Release 9.2 onward, P3 shows the timeout value in centiseconds to wait for the I/O completion; prior to this release, P3 indicates the total number of I/O requests, which is the same as P2 (blocks).

Wait Time

No timeouts. The session waits until all the I/Os are completed.

db file scattered read

The db file scattered read event is posted by the session when it issues an I/O request to read multiple data blocks. The blocks read from the datafiles are scattered into the buffer cache. These blocks need not remain contiguous in the buffer cache. The event typically occurs during full table scans or index fast full scans. The initialization parameter, DB_FILE_MULTIBLOCK_READ_COUNT determines the maximum number of data blocks to read.
Waiting on datafile I/O completion is normal in any Oracle database. The presence of this wait event does not necessarily indicate a performance problem. However, if the time spent waiting for multiblock reads is significant compared to other waits, you must investigate the reason for it.
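To see which segments are driving the multiblock reads, segment-level statistics (available from Oracle 9.2 onward) can be checked; a sketch:

-- Top 10 segments by physical reads (full scans usually dominate here)
select *
from  ( select owner, object_name, object_type, value as physical_reads
        from   v$segment_statistics
        where  statistic_name = 'physical reads'
        order  by value desc )
where  rownum <= 10;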

Wait Parameters

Wait parameters for db file scattered read are described here:
  • P1 File number to read the blocks from
  • P2 Starting block number to begin reading
  • P3 Number of blocks to read

Wait Time

No timeouts. The session waits until all of the I/Os are completed to read specified number of blocks.

db file sequential read

The db file sequential read wait event occurs when the process waits for an I/O completion for a sequential read. The name is a bit misleading, suggesting a multiblock operation, but this is a single block read operation. The event gets posted when reading from an index, reading rollback or undo segments, accessing a table by rowid, rebuilding control files, dumping datafile headers, or reading datafile header blocks.
Waiting on datafile I/O completion is normal in any Oracle Database. The presence of this wait event does not necessarily indicate a performance problem. However, if the time spent waiting for single block reads is significant compared to other waits, you must investigate the reason for it.

Wait Parameters

Wait parameters for db file sequential read are described here:
  • P1 File number to read the data block from
  • P2 Starting block number to read
  • P3 1 in most cases, but for temporary segments can be more than 1

Wait Time

No timeouts. The session waits until the I/O is completed to read the block.
Note 
In a paper titled “Why are Oracle’s Read Events ‘Named Backwards’?” Jeff Holt explains how the events db file sequential read and db file scattered read got their names. Basically, the db file sequential read happens when the buffer cache memory locations that receive data from disk are contiguous. In the case of db file scattered read those are not guaranteed to be contiguous. The paper is available at www.hotsos.com.

db file single write

The db file single write event is posted by DBWR. It occurs when Oracle is updating datafile headers, typically during a checkpoint. You may notice this event when your database has an inordinate number of database files.

Wait Parameters

Wait parameters for db file single write are described here:
  • P1 File number to write to
  • P2 Starting block number to write to
  • P3 The number of blocks to write, typically 1

Wait Time

No timeouts. Actual time it takes to complete the I/O operation.

direct path read

The direct path read event occurs when Oracle is reading data blocks directly into the session’s PGA instead of the buffer cache in the SGA. Direct reads may be performed in synchronous I/O or asynchronous I/O mode, depending on the hardware platform and the value of the initialization parameter, DISK_ASYNCH_IO. Direct read I/O is normally used while accessing the temporary segments that reside on the disks. These operations include sorts, parallel queries, and hash joins.
The number of waits and time waited for this event are somewhat misleading. If the asynchronous I/O is not available, the session waits till the I/O completes. But these are not counted as waits at the time the I/O request is issued. The session posts a direct path read wait event when accessing the data after the completion of the I/O request. In this case, the wait time will be negligibly small.
If the asynchronous I/O is available and in use, then the session may issue multiple direct path read requests and continue to process the blocks that are already cached in the PGA. The session will register direct path read wait event only when it cannot continue processing because the required block has not been read into the buffer. Therefore, the number of read requests may not be the same as the number of waits. Due to these anomalies, it is unlikely that you will see this wait event reported in V$SYSTEM_EVENT and V$SESSION_EVENT views.
Starting from Oracle Release 8.1.7 there is a separate direct path read (lob) event for reading LOB segments.

Wait Parameters

Wait parameters for direct path read are described here:
  • P1 Absolute file number to read from
  • P2 Starting block number to read from
  • P3 Number of blocks to read

Wait Time

No timeouts. Actual time until the outstanding I/O request completes.

direct path write

The direct path write wait event is just an opposite operation to that of direct path read. Oracle writes buffers from the session’s PGA to the datafiles. A session can issue multiple write requests and continue processing. The OS handles the I/O operation. If the session needs to know if the I/O operation was completed, it will wait on direct path write event.
The direct path write operation is normally used when writing to temporary segments, in direct data loads (inserts with APPEND hint, or CTAS), or in parallel DML operations.
As with the direct path read event, the number of waits and time waited for this event can be misleading when asynchronous I/O is in use.
Starting from Oracle 8.1.7 there is a separate direct path write (lob) event for writing to uncached LOB segments.

Wait Parameters

Wait parameters for direct path write are described here:
  • P1 Absolute file number to write to
  • P2 Starting block number to write from
  • P3 Number of blocks to write

Wait Time

No timeouts. Actual time until the outstanding I/O request completes.

enqueue

An enqueue is a shared memory structure used by Oracle to serialize access to the database resources. The process must acquire the enqueue lock on the resource to access it. The process will wait on this event if the request to acquire the enqueue is not successful because some other session is holding a lock on the resource in an incompatible mode. The processes wait in queue for their turn to acquire the requested enqueue. A simple example of such an enqueue wait is a session waiting to update a row when some other session has updated the row and not yet committed (or rolled back) its transaction and has a lock on it in an exclusive mode.
There are various types of enqueue to serialize access to various resources, uniquely identified by a two-character enqueue name. For example:
  • ST Enqueue for Space Management Transaction
  • SQ Enqueue for Sequence Numbers
  • TX Enqueue for a Transaction
    Note 
    In Oracle Database 10g Release 1, each enqueue type is represented by its own wait event, which makes it much easier to understand exactly what type of enqueue the session is waiting for. Please refer to Appendix B for a complete list of these enqueue waits.

Wait Parameters

Wait parameters for enqueue are described here:
  • P1 Enqueue name and mode requested by the waiting process. This information is encoded in ASCII format. The following SQL statement shows how you can find out the enqueue name and mode requested by the waiting process:
    col Name format a4
    select sid,
           chr(bitand(p1, -16777216)/16777215) ||
           chr(bitand(p1,16711680)/65535) "Name",
           (bitand(p1, 65535)) "Mode"
    from   v$session_wait
    where  event = 'enqueue';
    
    
           SID Name       Mode
    ---------- ---- ----------
            64 TX            6
  • P2 Resource identifier ID1 for the requested lock, same as V$LOCK.ID1
  • P3 Resource identifier ID2 for the requested lock, same as V$LOCK.ID2
The values for resource identifiers ID1 and ID2 are dependent on the enqueue name.

Wait Time

The wait time is dependent on enqueue name, but in most cases Oracle waits for up to three seconds or until the enqueue resource becomes available, whichever occurs first. When the wait event times out, Oracle will check that the session holding the lock is still alive and, if so, wait again.

free buffer waits

The free buffer waits event occurs when the session cannot find free buffers in the database buffer cache to read in data blocks or to build a consistent read (CR) image of a data block. This could mean either the database buffer cache is too small, or the dirty blocks in the buffer cache are not getting written to the disk fast enough. The process will signal DBWR to free up dirty buffers but will wait on this event.

Wait Parameters

Wait parameters for free buffer waits are described here:
  • P1 File number from which Oracle is reading the block 
  • P2 Block number from the file that Oracle wants to read into a buffer
  • P3 Not used prior to Oracle Database 10g Release 1; in this release it shows the SET_ID# for the LRU and LRUW lists in the buffer cache

Wait Time

Oracle will wait up to one second for free buffers to become available and then try to find a free buffer again.

latch free

The latch free wait occurs when the process waits to acquire a latch that is currently held by other process. Like enqueue, Oracle uses latches to protect data structures. One process at a time can either modify or inspect the data structure after acquiring the latch. Other processes needing access to the data structure must wait till they acquire the latch. Unlike enqueue, processes requesting latch do not have to wait in a queue. If the request to acquire a latch fails, the process simply waits for a short time and requests the latch again. The short wait time is called “spin”. If the latch is not acquired after one or more spin iterations, the process sleeps for a short time and tries to acquire the latch again, sleeping for successively longer periods until the latch is obtained. 
The most common latches you need to know are cache buffer chains, library cache, and shared pool. These and other latches are discussed in detail in Chapter 6.

Wait Parameters

Wait parameters for latch free are described here:
  • P1 Address of the latch for which the process is waiting 
  • P2 Number of the latch, same as V$LATCHNAME.LATCH#. To find out the latch name waited on, you can use the following SQL statement:
    select * 
    from   v$latchname
    where  latch# = &p2_value;
  • P3 Number of tries; a counter showing the number of attempts the process made to acquire the latch

Wait Time

The wait time for this event increases exponentially. It does not include the time the process spent spinning on the latch.
In Oracle Database 10g Release 1, most latches have their own wait events. Table 3-2 lists the wait events associated with latches.
Table 3-2: Latch Events in Oracle Database 10g
latch: cache buffer handles
latch: cache buffers chains
latch: cache buffers lru chain
latch: checkpoint queue latch
latch: enqueue hash chains
latch: gcs resource hash
latch: ges resource hash list
latch: In memory undo latch
latch: KCL gc element parent latch
latch: library cache
latch: library cache lock
latch: library cache pin
latch: messages
latch: object queue header heap
latch: object queue header operation
latch: parallel query alloc buffer
latch: redo allocation
latch: redo copy
latch: redo writing
latch: row cache objects
latch: session allocation
latch: shared pool
latch: undo global data
latch: virtual circuit queues

library cache pin

The library cache pin wait event is associated with library cache concurrency. It occurs when the session tries to pin an object in the library cache to modify or examine it. The session must acquire a pin to make sure that the object is not updated by other sessions at the same time. Oracle posts this event when sessions are compiling or parsing PL/SQL procedures and views.
What actions to take to reduce these waits depends heavily on the blocking scenario that is occurring. A common problem scenario is the use of dynamic SQL from within a PL/SQL procedure, where the PL/SQL code is recompiled and the dynamic SQL calls something that depends on the calling procedure. If there is general widespread waiting, the shared pool may need tuning. If there is a blocking scenario, the following SQL can be used to show the sessions that are holding and/or requesting pins on the object whose handle address is given in P1 of the wait:
 select s.sid, kglpnmod "Mode", kglpnreq "Req"
from   x$kglpn p, v$session s
where  p.kglpnuse=s.saddr
and    kglpnhdl='&P1RAW' ;

Wait Parameters

Wait parameters for library cache pin are described here:
  • P1 Address of the object being examined or loaded
  • P2 Address of the load lock
  • P3 Contains the mode plus the namespace (mode indicates which data pieces of the object are to be loaded; namespace is the object namespace as displayed in V$DB_OBJECT_CACHE view)

Wait Time

For the PMON process it is one second; for all others it is three seconds.

library cache lock

The library cache lock event is also associated with library cache concurrency. A session must acquire a library cache lock on an object handle to prevent other sessions from accessing it at the same time, or to maintain a dependency for a long time, or to locate an object in the library cache.

Wait Parameters

Wait parameters for library cache lock are described here:
  • P1 Address of the object being examined or loaded
  • P2 Address of the load lock
  • P3 Contains the mode plus the namespace (mode indicates which data pieces of the object are to be loaded; namespace is the object namespace as displayed in V$DB_OBJECT_CACHE view)

Wait Time

For the PMON process it is one second; for all others it is three seconds.

log buffer space

The log buffer space wait occurs when the session has to wait for space to become available in the log buffer to write new information. The LGWR process periodically writes to redo log files from the log buffer and makes those log buffers available for reuse. This wait indicates that the application is generating redo information faster than LGWR process can write it to the redo files. Either the log buffer is too small, or redo log files are on disks with I/O contention.
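Two system statistics give a quick indication of log buffer pressure; a sketch:

-- Steadily growing values suggest the log buffer is too small or that
-- LGWR cannot keep up (redo log I/O contention).
select name, value
from   v$sysstat
where  name in ('redo buffer allocation retries', 'redo log space requests');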

Wait Parameters

Wait parameters are not used for log buffer space.

Wait Time

Normally one second, but five seconds if the session has to wait for a log file switch to complete.

log file parallel write

The log file parallel write wait occurs when the session waits for LGWR process to write redo from log buffer to all the log members of the redo log group. This event is typically posted by LGWR process. The LGWR process writes to the active log file members in parallel only if the asynchronous I/O is in use. Otherwise, it writes to each active log file member sequentially.
The waits on this event usually indicate slow disk devices or contention where the redo logs are located.

Wait Parameters

Wait parameters for log file parallel write are described here:
  • P1 Number of log files to write to
  • P2 Number of OS blocks to write to
  • P3 Number of I/O requests

Wait Time

Actual elapsed time it takes to complete all I/Os. Although the log files are written to in parallel, the write is not complete till the last I/O operation is complete.

log file sequential read

The log file sequential read wait occurs when the process waits for blocks to be read from the online redo log files. The ARCH process encounters this wait while reading from the redo log files.

Wait Parameters

Wait parameters for log file sequential read are described here:
  • P1 Relative sequence number of the redo log file within the redo log group
  • P2 Block number to start reading from
  • P3 Number of OS blocks to read starting from P2 value

Wait Time

Actual elapsed time it takes to complete the I/O request to read.

log file switch (archiving needed)

The log file switch wait indicates that the ARCH process is not keeping up with LGWR process writing to redo log files. When operating the database in archive log mode, the LGWR process cannot overwrite or switch to the redo log file until the ARCH process has archived it by copying it to the archived log file destination. A failed write to the archive log file destination may stop the archiving process. Such an error will be reported in the alert log file.

Wait Parameters

Wait parameters are not used for log file switch (archiving needed).

Wait Time

One second

log file switch (checkpoint incomplete)

The log file switch wait indicates that the process is waiting for the log file switch to complete, but the log file switch is not possible because the checkpoint process for that log file has not completed. You may see this event when the redo log files are sized too small.

Wait Parameters

Wait parameters are not used for log file switch (checkpoint incomplete).

Wait Time

One second

log file switch completion

This wait event occurs when the process is waiting for log file switch to complete.

Wait Parameters

Wait parameters are not used for log file switch completion.

Wait Time

One second

log file sync

When a user session completes a transaction, either by a commit or a rollback, the session’s redo information must be written to the redo logs by LGWR process before the session can continue processing. The process waits on this event while LGWR process completes the I/O to the redo log file.
If a session continues to wait on the same buffer#, the SEQ# column of V$SESSION_WAIT view should increment every second. If not, then the local session has a problem with wait event timeouts. If the SEQ# column is incrementing, the blocking process is the LGWR process. Check to see what LGWR process is waiting on because it may be stuck.
Tune the LGWR process to get good throughput to disk; for example, do not put redo logs on RAID-5 disk arrays. If there are lots of short-duration transactions, see if it is possible to batch transactions together so there are fewer distinct COMMIT operations. Each COMMIT must be confirmed, meaning the relevant redo must be written to disk. Although commits can be piggybacked by Oracle, reducing the overall number of commits by batching transactions can have a very beneficial effect.
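Comparing the average wait of log file sync with log file parallel write helps to decide whether the time is spent in LGWR I/O or elsewhere; a sketch:

-- AVERAGE_WAIT is reported in centiseconds
select event, total_waits, time_waited, average_wait
from   v$system_event
where  event in ('log file sync', 'log file parallel write');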

Wait Parameters

Wait parameters for log file sync are described here:
  • P1 The number of the buffer in the log buffer that needs to be synchronized
  • P2 Not used
  • P3 Not used

Wait Time

One second

SQL*Net message from client

This wait event is posted by the session when it is waiting for a message from the client to arrive. Generally, this means that the session is sitting idle. Excessive wait time on this event in batch programs that do not interact with an end user at a keyboard may indicate some inefficiency in the application code or in the network layer. However, the database performance is not degraded by high wait times for this wait event, because this event clearly indicates that the perceived database performance problem is actually not a database problem.

Wait Parameters

Wait parameters for SQL*Net message from client are described here:
  • P1 Prior to Oracle8i release, the value in this parameter was not of much use. Since Oracle8i, the P1RAW column contains an ASCII value to show what type of network driver is in use by the client connections; for example, bequeath, and TCP.
  • P2 The number of bytes received by the session from the client—generally one, even though the received packet will contain more than 1 byte.
  • P3 Not used.

Wait Time

The actual time it takes for the client message to arrive since the last message the session sent to the client.

SQL*Net message to client

This wait event is posted by the session when it is sending a message to the client. The client process may be too busy to accept the delivery of the message, causing the server session to wait, or the network latency delays may be causing the message delivery to take longer.

Wait Parameters

Wait parameters for SQL*Net message to client are described here:
  • P1 Prior to Oracle8i Database, the value in this parameter was not of much use. Since Oracle8i, the P1RAW column contains an ASCII value to show what type of network driver is in use by the client connections, for example, bequeath and TCP.
  • P2 Number of bytes sent to client. This is generally one even though the sent packet will contain more than 1 byte.
  • P3 Not used.

Wait Time

Actual elapsed time it takes to complete the message delivery to the client.


Reference: Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning by Richmond Shee, Kirtikumar Deshpande, and K. Gopalakrishnan