Surachart Opun's Blog: exadata

Showing posts with label exadata. Show all posts

Thursday, August 11, 2011

INSERT with PARALLEL hint + force direct-path INSERT

I posted What wrong? (INSERT) PARALLEL hint - It not compress. I know because it's not use "direct path write".
How Hybrid Columnar Compression Works When Rows are Inserted/Updated. [ID 1332853.1]
The hybrid columnar compression (HCC) works with Direct path loads only like ALTER TABLE MOVE, IMPDP with DIRECT option or DIRECT path inserts. HCC is meant for those type of data which do not get updated or not updated very often. If you update rows in HCC table, The entire compression unit is locked. The rows that are updated may be moved to a lower compression level such as No compression/ OLTP compression

But many friends told me, it should work with compression. I tested by using "ALTER SESSION force PARALLEL DML", To make sure it uses parallel with DML(direct-path INSERT).

Case 1:

SQL> insert /*+ parallel */ into tb_b t select * from dba_objects where rownum <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> select sum(g.bytes / 1024) sum,
g.segment_name,
g.tablespace_name
from dba_segments g
where g.segment_name = 'TB_B'
group by tablespace_name, segment_name; 2 3 4 5 6

SUM SEGMENT_NAME TABLESPACE_NAME
---------- --------------------------------------------------------------------------------- ------------------------------
1024 TB_B TBS_TEST_EXA

Case 2: use "ALTER SESSION { ENABLE | FORCE } PARALLEL DML" before insert

SQL> ALTER SESSION force PARALLEL DML;

Session altered.

SQL> insert /*+ parallel */ into tb_b t select * from dba_objects where rownum <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> select sum(g.bytes / 1024) sum,
g.segment_name,
g.tablespace_name
from dba_segments g
where g.segment_name = 'TB_B'
group by tablespace_name, segment_name;

SUM SEGMENT_NAME TABLESPACE_NAME
---------- --------------------------------------------------------------------------------- ------------------------------
64 TB_B TBS_TEST_EXA

trace 10046 and review trace file:

PARSING IN CURSOR #47620816211296 len=198 dep=1 uid=0 oct=3 lid=0 tim=1313054941154387 hv=2190895844 ad='c3af40060' sqlid='b91n32219cur4'
select ilevel, bestsortcol, tinsize, toutsize, analyzer from (select * from compression$ where obj#=:1 and ulevel=:2 order by mtime desc) where rownum < 2
END OF STMT

WAIT #47620816212296: nam='direct path write' ela= 312 file number=530 first dba=556 block cnt=8 obj#=19 tim=1313054941193775
WAIT #47620816212296: nam='enq: FB - contention' ela= 134 name|mode=1178730502 tablespace #=536 dba=2222981680 obj#=-1 tim=1313054941210382
WAIT #47620816212296: nam='gc current multi block request' ela= 138 file#=530 block#=567 id#=33554433 obj#=55937 tim=1313054941210600
WAIT #47620816212296: nam='asynch descriptor resize' ela= 1 outstanding #aio=0 current aio limit=203 new aio limit=201 obj#=19 tim=1313054941211131
EXEC #47620816212296:c=266959,e=3487089,p=0,cr=810,cu=118,mis=1,r=10000,dep=0,og=1,plh=1346303798,tim=1313054941211293
STAT #47620816212296 id=1 cnt=1 pid=0 pos=1 obj=0 op='PX COORDINATOR (cr=803 pr=0 pw=8 time=3478516 us)'

It used "direct path write" with parallel and DATA was Compressed!!!

What wrong? (INSERT) PARALLEL hint - It not compress

I saw some post about (INSERT) parallel hint not work with "compress for query high " on communities.oracle.com, So I tested it...
*** Oracle 11.2.0.2 and Image version: 11.2.2.2.0.101206.2 ***
On table no compress:

SQL> create table tb_a tablespace TBS_TEST_EXA as select * from dba_objects where 1=0;

Table created.

SQL> insert /*+ parallel */ into tb_a t select * from dba_objects;

17808 rows created.

SQL> select sum(g.bytes / 1024) sum,
g.segment_name,
g.tablespace_name
from dba_segments g
where g.segment_name = 'TB_A'
group by tablespace_name, segment_name;

SUM SEGMENT_NAME TABLESPACE_NAME
---------- --------------------------------------------------------------------------------- ------------------------------
2048 TB_A TBS_TEST_EXA

On compress for query high:

SQL> create table tb_b tablespace TBS_TEST_EXA compress for query high as select * from dba_objects where 1=0;

Table created.

Case A: Use PARALLEL hint

SQL> ALTER session SET EVENTS '10046 trace name context forever, level 12';
SQL> insert /*+ parallel(t) */ into tb_b t select * from dba_objects;
SQL> ALTER session SET EVENTS '10046 trace name context off';

In trace file: I don't see about compression. just see:

WAIT #47502448530384: nam='asynch descriptor resize' ela= 2 outstanding #aio=0 current aio limit=219 new aio limit=154 obj#=19 tim=1313001654410274
EXEC #47502448530384:c=101984,e=113787,p=0,cr=1579,cu=2514,mis=0,r=17807,dep=0,og=1,plh=3173891493,tim=1313001654410582
STAT #47502448530384 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL (cr=1579 pr=0 pw=0 time=113756 us)'

Check segment size:

SQL> select sum(g.bytes / 1024) sum,
g.segment_name,
g.tablespace_name
from dba_segments g
where g.segment_name = 'TB_B'
group by tablespace_name, segment_name;

SUM SEGMENT_NAME TABLESPACE_NAME
---------- --------------------------------------------------------------------------------- ------------------------------
2048 TB_B TBS_TEST_EXA

Case B: Use APPEND hint

SQL> ALTER session SET EVENTS '10046 trace name context forever, level 12';
SQL> insert /*+ append */ into tb_b t select * from dba_objects;
SQL> ALTER session SET EVENTS '10046 trace name context off';

In trace file:

PARSING IN CURSOR #47502447102008 len=198 dep=1 uid=0 oct=3 lid=0 tim=1313001730563910 hv=2190895844 ad='da3536290' sqlid='b91n32219cur4'
select ilevel, bestsortcol, tinsize, toutsize, analyzer from (select * from compression$ where obj#=:1 and ulevel=:2 order by mtime desc) where rownum < 2
END OF STMT

WAIT #47502448530384: nam='asynch descriptor resize' ela= 1 outstanding #aio=0 current aio limit=223 new aio limit=158 obj#=19 tim=1313001730671372
WAIT #47502448530384: nam='direct path write' ela= 380 file number=530 first dba=642 block cnt=15 obj#=19 tim=1313001730683122
EXEC #47502448530384:c=118983,e=121768,p=0,cr=781,cu=55,mis=0,r=17807,dep=0,og=1,plh=332911567,tim=1313001730683230
STAT #47502448530384 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD AS SELECT (cr=781 pr=0 pw=15 time=121705 us)'

Check segment size:

SQL> select sum(g.bytes / 1024) sum,
g.segment_name,
g.tablespace_name
from dba_segments g
where g.segment_name = 'TB_B'
group by tablespace_name, segment_name;

SUM SEGMENT_NAME TABLESPACE_NAME
---------- --------------------------------------------------------------------------------- ------------------------------
192 TB_B TBS_TEST_EXA

Oops! What wrong?

Saturday, June 04, 2011

Monitoring Exadata Cell using some Views

When we have Exadata, we might ask some questions, what dynamic V$ views we can use for monitoring?
V$CELL - to display Exadata Cell Identification.

SQL> desc v$cell
Name Null? Type
----------------------------------------- -------- ----------------------------
CELL_PATH VARCHAR2(400)
CELL_HASHVAL NUMBER

CELL_PATH - IP addresses of the cell.
CELL_HASHVAL - A numeric hash value for the cell.

SQL> column CELL_PATH format a20
SQL> select * from v$cell;

CELL_PATH CELL_HASHVAL
-------------------- ------------
192.168.101.5 2520626383
192.168.101.6 4087315787
192.168.101.7 2611738764
192.168.101.8 1901900577
192.168.101.9 1557764708
192.168.101.10 3737599385
192.168.101.11 3126672640

We can use CELL_HASHVAL column in V$CELL view join in V$SESSION_WAIT and V$ACTIVE_SESSION_HISTORY views (P1)

SQL> select a.sid ,'Event: ' ||event || ' on '|| b.cell_path ||' '|| P3 || ' '|| P3TEXT from V$SESSION_WAIT a, V$CELL b where b.cell_hashval = a.p1 and a.event like 'cell%';

SID 'EVENT:'||EVENT||'ON'||B.CELL_PATH||''||P3||''||P3TEXT
---------- ---------------------------------------------------------------------
3603 Event: cell single block physical read on 192.168.101.5 8192 bytes
1719 Event: cell single block physical read on 192.168.101.5 8192 bytes
209 Event: cell single block physical read on 192.168.101.5 8192 bytes

V$CELL_STATE - to display Exadata Cell Performance Statistics

SQL> desc V$CELL_STATE
Name Null? Type
----------------------------------------- -------- ----------------------------
CELL_NAME VARCHAR2(1024)
STATISTICS_TYPE VARCHAR2(15)
OBJECT_NAME VARCHAR2(1024)
STATISTICS_VALUE CLOB

CELL_NAME - IP addresses of the cell.
STATISTICS_TYPE - statistics type, such as thread_stats.
OBJECT_NAME - key for a specific statistics type, such as the thread ID if STATISTICS_TYPE is thread_stats.
STATISTICS_VALUE - statistic values in an XML document with attribute and value pairs.

SQL> SPOOL /tmp/cell_state_test.log
SQL> SET PAGESIZE 10000 LONG 500000
SQL> SELECT cell_name, statistics_type, object_name, XMLTYPE(statistics_value) FROM v$cell_state;
SQL> SPOOL OFF

V$CELL_THREAD_HISTORY - to display the samples that Cell Server takes of the threads in the cells visible to the database client.

SQL> desc V$CELL_THREAD_HISTORY
Name Null? Type
----------------------------------------- -------- ----------------------------
CELL_NAME VARCHAR2(1024)
SNAPSHOT_ID NUMBER
SNAPSHOT_TIME DATE
THREAD_ID NUMBER
JOB_TYPE VARCHAR2(32)
WAIT_STATE VARCHAR2(32)
WAIT_OBJECT_NAME VARCHAR2(32)
SQL_ID VARCHAR2(13)
DATABASE_ID NUMBER
INSTANCE_ID NUMBER
SESSION_ID NUMBER
SESSION_SERIAL_NUM NUMBER

CELL_NAME - IP addresses of the cell.
SNAPSHOT_ID - ID of the snapshot(NUMBER).
SNAPSHOT_TIME - date and time of the snapshot.
THREAD_ID - thread ID (NUMBER).
JOB_TYPE - job that the thread was running when the snapshot was taken.
WAIT_STATE - A unique state that identifies the location of the wait, if any exists.
WAIT_OBJECT_NAME - Object being waited on, if any exists. For example, the mutex name.
SQL_ID - The identifier of the SQL command that the client was processing for the job that is running.
DATABASE_ID - ID of the database (NUMBER).
INSTANCE_ID - ID of the instance (NUMBER).
SESSION_ID - ID of the session (NUMBER).
SESSION_SERIAL_NUM - session serial number (NUMBER).

SQL> select cell_name, snapshot_id, job_type, wait_state, sql_id, session_id, session_serial_num from V$CELL_THREAD_HISTORY where sql_id ='ab3swhv5g138y';

CELL_NAME SNAPSHOT_ID JOB_TYPE WAIT_STATE SQL_ID SESSION_ID SESSION_SERIAL_NUM
------------------------------ ----------- -------------------------------- -------------------------------- ------------- ---------- ------------------
192.168.101.9 7530062 CacheGet waiting_for_SKGXP_send ab3swhv5g138y 1892 2965

V$CELL_REQUEST_TOTALS - to diaplay a historical view of the types and frequencies of the requests being run by a cell.

SQL> desc V$CELL_REQUEST_TOTALS
Name Null? Type
----------------------------------------- -------- ----------------------------
CELL_NAME VARCHAR2(1024)
SNAPSHOT_ID NUMBER
SNAPSHOT_TIME DATE
STATISTICS_NAME VARCHAR2(32)
STATISTICS_VALUE NUMBER

CELL_NAME - IP addresses of the cell.
SNAPSHOT_ID -ID of the snapshot(NUMBER).
SNAPSHOT_TIME - date and time of the snapshot.
STATISTICS_NAME - name of the statistic.
STATISTICS_VALUE - value of the statistic.

SQL> select * from V$CELL_REQUEST_TOTALS where snapshot_id=7530062;

CELL_NAME SNAPSHOT_ID SNAPSHOT_ STATISTICS_NAME STATISTICS_VALUE
------------------------------ ----------- --------- -------------------------------- ----------------
192.168.101.9 7530062 04-JUN-11 CacheGet Jobs 1
192.168.101.9 7530062 04-JUN-11 CachePut Jobs 0
192.168.101.9 7530062 04-JUN-11 Predicate Disk Read Jobs 0
192.168.101.9 7530062 04-JUN-11 Process Ioctl Jobs 4
192.168.101.9 7530062 04-JUN-11 Smart IO Total Request Size 0

V$SYSSTAT - this is view what we often use for monitoring database and be able to use monitoring Exadata.

SQL> SELECT name, value/1024/1024 MB from v$sysstat a WHERE
a.name = 'physical read total bytes' OR
a.name = 'physical write total bytes' OR
a.name = 'cell physical IO interconnect bytes' OR
a.name = 'cell physical IO bytes eligible for predicate offload' OR
a.name = 'cell physical IO bytes saved during optimized file creation' OR
a.name = 'cell physical IO bytes saved during optimized RMAN file restore' OR
a.name = 'cell IO uncompressed bytes' OR
a.name = 'cell physical IO interconnect bytes returned by smart scan' OR
a.name = 'cell physical IO bytes saved by storage index';

NAME MB
---------------------------------------------------------------- ----------
physical read total bytes 1586343.81
physical write total bytes 7141430.75
cell physical IO interconnect bytes 18947204.6
cell physical IO bytes saved during optimized file creation 700946.953
cell physical IO bytes saved during optimized RMAN file restore 0
cell physical IO bytes eligible for predicate offload 1105701.38
cell physical IO bytes saved by storage index 365.953125
cell physical IO interconnect bytes returned by smart scan 5990.22793
cell IO uncompressed bytes 403520.695

Key Statistics on V$SYSSTAT view about Exadata:
cell flash cache read hits - The number of read requests that were a cache hit on exadata flash cache.
cell IO uncompressed bytes - The total size of uncompressed data that is processed on the cell. For scan on hybrid-columnar-compressed tables, this statistic is the size of data after decompression.
cell physical IO interconnect bytes returned by smart scan - The number of bytes that are returned by the cell for Smart Scan only, and does not include bytes for other database I/O.
cell physical IO bytes saved by storage index - The number of bytes saved by storage index.
cell physical IO bytes eligible for predicate offload - The total number of I/O bytes processed with physical disks when processing was offloaded to the cell.
cell physical IO bytes pushed back due to excessive CPU - The number of I/O bytes sent back to the database server for processing due to CPU usage on Exadata Cell.
cell physical IO bytes saved during optimized file creation - The number of I/O bytes saved by the database host by offloading the file creation operation to cells. This statistic shows the Exadata Cell benefit due to optimized file creation operations.
cell physical IO bytes saved during optimized RMAN file restore - The number of I/O bytes saved by the database host by offloading the RMAN file restore operation to cells. This statistic shows the Exadata Cell benefit due to optimized RMAN file restore operations.
cell physical IO interconnect bytes - The number of I/O bytes exchanged over the interconnection between the database host and cells.
physical read requests optimized - Total number of read requests satisfied either by using Exadata Smart Flash Cache or storage index.
physical read total bytes - Total amount of I/O bytes for reads processed with physical disks. This includes when processing was offloaded to the cell and when processing was not offloaded.
physical read total bytes optimized - Total number of bytes read from Exadata Smart Flash Cache or storage index.
physical write total bytes - Total amount of I/O bytes for writes processed with physical disks. This includes when processing was offloaded to the cell and when processing was not offloaded.

Other Views:
V$BACKUP_DATAFILE - contains columns relevant to Exadata Cell during Oracle Recovery Manager (RMAN) incremental backups.
V$SEGMENT_STATISTICS - segment-level statistics can be used to detect specific objects what are performing optimized reads from the cell. The "optimized physical read" statistic name is the number of read requests for an objects that were read from Exadata Smart Flash Cache or from the storage index.

SQL> SELECT object_name, value FROM V$segment_statistics WHERE statistic_name='optimized physical reads' AND value>1000 ORDER BY value;

OBJECT_NAME VALUE
------------------------------ ----------
TB_TEST 63153

V$SQL - lists statistics on shared SQL areas. List columns:
* PHYSICAL_READ_BYTES
* PHYSICAL_WRITE_BYTES
* IO_INTERCONNECT_BYTES
* IO_CELL_OFFLOAD_ELIGIBLE_BYTES
* IO_CELL_UNCOMPRESSED_BYTES
* IO_CELL_OFFLOAD_RETURNED_BYTES
* OPTIMIZED_PHY_READ_REQUESTS
We can see these columns in V$SQLAREA/V$SQLAREA_PLAN_HASH/V$SQLSTATS/V$SQLSTATS_PLAN_HASH views also.

SQL> select sql_id, physical_read_bytes, physical_write_bytes, io_interconnect_bytes, io_cell_offload_eligible_bytes, io_cell_uncompressed_bytes, io_cell_offload_returned_bytes, optimized_phy_read_requests from v$sql where sql_id='a28xja6zsgnb7';

SQL_ID PHYSICAL_READ_BYTES PHYSICAL_WRITE_BYTES IO_INTERCONNECT_BYTES IO_CELL_OFFLOAD_ELIGIBLE_BYTES IO_CELL_UNCOMPRESSED_BYTES IO_CELL_OFFLOAD_RETURNED_BYTES OPTIMIZED_PHY_READ_REQUESTS
------------- ------------------- -------------------- --------------------- ------------------------------ -------------------------- ------------------------------ ---------------------------
a28xja6zsgnb7 2184994816 0 66946880 2184912896 2185003008 66864960 3

SQL> SELECT sql_text,
io_cell_offload_eligible_bytes/1024/1024 cell_offload_eligible_mb,
io_cell_uncompressed_bytes/1024/1024 io_uncompressed_mb,
io_interconnect_bytes/1024/1024 io_interconnect_mb,
io_cell_offload_returned_bytes/1024/1024 cell_return_bytes_mb,
(physical_read_bytes + physical_write_bytes)/1024/1024 io_disk_mb
FROM v$sql WHERE
sql_id='a28xja6zsgnb7';

SQL_TEXT CELL_OFFLOAD_ELIGIBLE_MB IO_UNCOMPRESSED_MB IO_INTERCONNECT_MB CELL_RETURN_BYTES_MB IO_DISK_MB
------------------------------ ------------------------ ------------------ ------------------ -------------------- ----------
SELECT * from TB_TEST 2083.69531 2083.78125 63.84552 63.767395 2083.77344

Monday, May 30, 2011

InfiniBand switch - relocated Subnet Manager Master to another switch

After I posted some about sm_priority is not set to recommended value of 5 on infiniband switch. Thank You everyone for comment, documents and ideas.

I ignored about this warning. why?
Exadata Document:

Exadata Database Machine Full Racks and Oracle Exadata Database Machine X2-2 Half Racks have three Sun Datacenter InfiniBand Switch 36 switches. The switch at rack unit 1 (U1) is referred to as the spine switch. The switches at rack unit 20 (U20) and rack unit 24 (U24) in Oracle Exadata Database Machine X2-2 racks, or unit 21(U21) and rack unit 23 (U23) in Oracle Exadata Database Machine X2-8 Full Racks are referred to as leaf switches. The spine switch is the Subnet Manager Master for the InfiniBand subnet. It has priority 8.

Sun Datacenter InfiniBand Switch 36 Topic Set

By setting a Subnet Manager to a higher priority than another Subnet Manager, it becomes the primary (or Master) Subnet Manager.

Infiniband Switch (01) = spine switch (spine switch is the Subnet Manager Master) - So, It should have a higher priority and Exadata Document (The spine switch is the Subnet Manager Master for the InfiniBand subnet. It has priority 8).

However, I found something wrong, maybe I tested many thing on Exadata.

# getmaster
20110530 11:06:00 OpenSM Master on Switch : 0x0021286ccca9a0a0 ports 36 Sun DCS 36 QDR switch exasw-ib2 enhanced port 0 lid 4 lmc 0

So, Relocated Subnet Manager Master to another switch. remote to leaf switch (exasw-ib2 ) and then disable/enable SM.

# ssh exasw-ib2

# disablesm
Stopping IB Subnet Manager.. [ OK ]

# enablesm
Starting IB Subnet Manager. [ OK ]

# getmaster
20110530 11:08:31 OpenSM Master on Switch : 0x0021286cd635a0a0 ports 36 Sun DCS 36 QDR switch exasw-ib1 enhanced port 0 lid 1 lmc 0.

It's relocated to 01 (Infiniband Switch).

Thursday, May 26, 2011

exachk - WARNING!!! sm_priority is not set to recommended value of 5 on infiniband switch

I used exachk and found WARNING => sm_priority is not set to recommended value of 5 on infiniband switch exasw-ib1
So, I checked them by CheckSWProfile.sh script.

# ./CheckSWProfile.sh -I exasw-ib1,exasw-ib2,exasw-ib3
Checking if switch exasw-ib1 is pingable...
Checking if switch exasw-ib2 is pingable...
Checking if switch exasw-ib3 is pingable...
Use the default password for all switches? (y/n) [n]: y
[ERROR] OpenSM configurations mismatch for switch exasw-ib1
Found: controlled_handover=TRUE log_max_size=8 polling_retry_number=5 routing_engine=ftree sminfo_polling_timeout=1000 sm_priority=8
Required: controlled_handover=TRUE log_max_size=8 polling_retry_number=5 routing_engine=ftree sminfo_polling_timeout=1000 sm_priority=5

then I changed sm_priority (8 to 5) in /etc/opensm/opensm.conf on infiniband switch (exasw-ib1)

# vi /etc/opensm/opensm.conf

#Begin /etc/opensm/opensm.conf
.
.
# SM priority used for deciding who is the master
# Range goes from 0 (lowest priority) to 15 (highest).
sm_priority 5
.
.
.
# End /etc/opensm/opensm.conf

After changed ... restart Subnet Manager on infiniband switch

# /etc/init.d/opensmd

Usage: opensmd {start|stop|restart|status}

# /etc/init.d/opensmd restart
Stopping IB Subnet Manager.. [ OK ]
Starting IB Subnet Manager. [ OK ]

back to database server again... and used CheckSWProfile.sh script.

# ./CheckSWProfile.sh -I exasw-ib1,exasw-ib2,exasw-ib3
Checking if switch exasw-ib1 is pingable...
Checking if switch exasw-ib2 is pingable...
Checking if switch exasw-ib3 is pingable...
Use the default password for all switches? (y/n) [n]: y
[INFO] SUCCESS All switches have correct software and firmware version:
SWVer: 1.1.3-2
[INFO] SUCCESS All switches have correct opensm configuration:
controlled_handover=TRUE log_max_size=8 polling_retry_number=5 routing_engine=ftree sminfo_polling_timeout=1000 sm_priority=5

Saturday, April 02, 2011

How to display content attributes on Exadata Storage?

On Exadata, After We created flashcache on exadata storage. How to know it work!!! Because on database, we don't change anything.
On Database: we can check...

SQL> select name,value from v$sysstat where name in ('physical read total IO requests','cell flash cache read hits');

NAME VALUE
---------------------------------------------------------------- ----------
physical read total IO requests 17467680
cell flash cache read hits 108435

By the way, On Exadata Storage: We can use "LIST FLASHCACHECONTENT" command to display content attributes on Exadata Storage.

CellCLI> help list flashcachecontent

Usage: LIST FLASHCACHECONTENT [<filters>] [<attribute_list>] [DETAIL]

Purpose: Displays specified attributes for flash cache entries.

Arguments:
<filters>: An expression which determines the entries to be displayed.
<attribute_list>: The attributes that are to be displayed.
ATTRIBUTES {ALL | attr1 [, attr2]... }

Options:
[DETAIL]: Formats the display as an attribute on each line, with
an attribute descriptor preceding each value.

Examples:
LIST FLASHCACHECONTENT DETAIL

So, I would like to test some... about " LIST FLASHCACHECONTENT" command

CellCLI> list flashcachecontent detail
cachedKeepSize: 0
cachedSize: 131072
dbID: 2080757153
dbUniqueName: DB
hitCount: 0
missCount: 2
objectNumber: 8
tableSpaceNumber: 0

cachedKeepSize: 0
cachedSize: 65536
dbID: 2080757153
dbUniqueName: DB
hitCount: 0
missCount: 3
objectNumber: 29
tableSpaceNumber: 0

cachedKeepSize: 0
cachedSize: 229376
dbID: 2080757153
dbUniqueName: DB
hitCount: 6
missCount: 4
objectNumber: 36
tableSpaceNumber: 0

However, We can display specific object:
On Database:

SQL> alter table SALES storage (cell_flash_cache keep);

Table altered.

SQL> select * from sales;

SQL> select object_id from user_objects where object_name='SALES';

OBJECT_ID
----------
25076

On Storage: Check from objectNumber=25076

CellCLI> list flashcachecontent where objectnumber=25076 detail
cachedKeepSize: 153632768
cachedSize: 153632768
dbID: 2080757153
dbUniqueName: DB
hitCount: 9
hoursToExpiration: 24
missCount: 154
objectNumber: 25076
tableSpaceNumber: 673

What did we see? database name/tablespace number/object number/cached size/...

SQL> select name from v$database;

NAME
---------
DB

SQL> select name from ts$ where ts#=673;

NAME
------------------------------
EXA_TEST

Achieving Extreme Performance with Oracle Exadata

I posted about How to start about Exadata Learning? A book is a good way for learning. Last week, I got Achieving Extreme Performance with Oracle Exadata book from Amazon.
Achieving Extreme Performance with Oracle Exadata book has 2 PARTS (10 Chapters):
- PART I: Features and Foundations
- PART II: Best Practices
This book on PART (I) introduces us to learn about Oracle 11g Enterprise Edition Features. On Oracle Exadata Database Machine uses Oracle Database 11gR2, Partitioning, Parallel, Compression, Resource Manager, ASM and RAC (Grid Infrastructure). So we should know about them before maintain Exadata. On part (I) explains about Exadata Software Features and Exdata Machine Platform HW Components also.

On PART (II) was written about Best Practices. Example: Managing the Exadata Database Machine/HA and Backup Strategies/Exadata with DW and OLTP/...
We will learn more idea/sample for Exadata management.

However, we can learn more on Oracle Library and Oracle Support.

Sunday, March 20, 2011

What should we contain Online Redo Log Files in Flash Disk?

I asked about How to use Flash Disk (Exadata)? on oracle-l - I think this may have benefit for someone. However, We should know about Flash Disk benefits on Oracle Database. and I have some question: What should we contain Online Redo Log Files in Flash Disk?

Online redo log files record all changes made to Oracle Database data files and control file. When a change is made to the database, Oracle Database generates a redo record in the redo buffer. Log Writer (LGWR) writes the contents of the redo buffer to the online redo log. LGWR performs sequential writes of redo to disk.

That's a concept what we know, and LGWR can perform fast sequential writes of redo to disk, the database improves performance.

A solid-state drive (SSD) is a data storage device that uses solid-state memory to store persistent data with the intention of providing access in the same manner of a traditional block i/o hard disk drive. SSD has write speeds lower than its read speeds and sequential writes isn't better faster than Hard Disk.

So, I think It's not good idea to contain Online Redo Log Files in Flash Disk. How do you think?

Reference:
http://www.freelists.org/post/oracle-l/How-do-you-use-FlashDisk
http://www.emc.com/collateral/hardware/white-papers/h5967-leveraging-clariion-cx4-oracle-deploy-wp.pdf

Friday, March 04, 2011

ORA-15067 - When drop disks in ASM DiskGroup Normal redundancy

I have ASM Diskgroup and redundancy type is Normal ans tested to drop all disks from one Cell Server:

SQL> select NAME, TOTAL_MB, FREE_MB, STATE, TYPE, REQUIRED_MIRROR_FREE_MB, USABLE_FILE_MB from v$asm_diskgroup where name='TEST_DG';

NAME TOTAL_MB FREE_MB STATE TYPE REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB
---------------------------------------- ---------- ---------- ----------- ------ ----------------------- --------------
TEST_DG 32768 18008 MOUNTED NORMAL 2978 7515

SQL> select path, header_status, os_mb,free_mb, failgroup from v$asm_disk where group_number = (select group_number from v$asm_diskgroup where name='TEST_DG') order by failgroup;

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP
---------------------------------------- ------------ ---------- ---------- ----------
o/192.168.99.5/TEST_DG_FD_03_EXACEL01 MEMBER 1024 564 EXACEL01
.
.
o/192.168.99.6/TEST_DG_FD_12_EXACEL02 MEMBER 1024 552 EXACEL02
.
.

SQL> select * from gv$asm_operation;

no rows selected

SQL> alter diskgroup TEST_DG drop disks in failgroup EXACEL01 rebalance power 11 NOWAIT;
alter diskgroup TEST_DG drop disks in failgroup EXACEL01 rebalance power 11 NOWAIT
*
ERROR at line 1:
ORA-15067: command or option incompatible with diskgroup redundancy

Found ORA-15067 and found out it.

ORA-15067: command or option incompatible with diskgroup redundancy
Cause: An attempt was made to use a feature which is not permitted by the diskgroup's redundancy policy. Common examples are forcibly dropping a disk from an EXTERNAL REDUNDANCY diskgroup, using the FAILGROUP clauses with an EXTERNAL REDUNDANCY diskgroup, or using invalid template attributes.
Action: Omit the option from the command

Not sure about error and idea to test then. So, I just added new ASM Disk and tried to drop again.

SQL> alter diskgroup TEST_DG add disk 'o/*/TEST_DG*EXACEL03' rebalance power 11 NOWAIT;

Diskgroup altered.

SQL> select * from gv$asm_operation;

INST_ID GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
---------- ------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- -----------
1 4 REBAL RUN 11 11 1931 2051 8646 0

SQL> select * from gv$asm_operation;

no rows selected

SQL> select NAME, TOTAL_MB, FREE_MB, STATE, TYPE, REQUIRED_MIRROR_FREE_MB, USABLE_FILE_MB from v$asm_diskgroup where name='TEST_DG';

NAME TOTAL_MB FREE_MB STATE TYPE REQUIRED_MIRROR_FREE_MB USABLE_FILE_MB
---------------------------------------- ---------- ---------- ----------- ------ ----------------------- --------------
TEST_DG 49152 34184 MOUNTED NORMAL 5004 14590

SQL> select path, header_status, os_mb,free_mb, failgroup from v$asm_disk where group_number = (select group_number from v$asm_diskgroup where name='TEST_DG') order by failgroup;

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP
---------------------------------------- ------------ ---------- ---------- ----------
o/192.168.99.5/TEST_DG_FD_14_EXACEL01 MEMBER 1024 696 EXACEL01
.
.
o/192.168.99.6/TEST_DG_FD_12_EXACEL02 MEMBER 1024 708 EXACEL02
.
.
o/192.168.99.7/TEST_DG_FD_10_EXACEL03 MEMBER 1024 712 EXACEL03
.
.

SQL> alter diskgroup TEST_DG drop disks in failgroup EXACEL01 rebalance power 11 NOWAIT;

Diskgroup altered.

SQL> select * from gv$asm_operation;

no rows selected

SQL> select path, header_status, os_mb,free_mb, failgroup from v$asm_disk where group_number = (select group_number from v$asm_diskgroup where name='TEST_DG') order by failgroup;

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP
---------------------------------------- ------------ ---------- ---------- ----------
o/192.168.99.6/TEST_DG_FD_12_EXACEL02 MEMBER 1024 616 EXACEL02
.
.
o/192.168.99.7/TEST_DG_FD_09_EXACEL03 MEMBER 1024 584 EXACEL03
.
.

SQL> select path, header_status, os_mb,free_mb, failgroup, mount_status from v$asm_disk where path like '%TEST_DG%' and failgroup='EXACEL01';

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP MOUNT_S
---------------------------------------- ------------ ---------- ---------- ---------- -------
o/192.168.99.5/TEST_DG_FD_10_EXACEL01 FORMER 1024 0 EXACEL01 CLOSED
.
.
.

MOUNT_STATUS should not be “CACHED” for DROPPED Disks.

I dropped all Disks from one cell server. I'm curious. My ASM DiskGroup is Normal redundancy type, could not drop!!! had to add disks and tired again.

Thursday, March 03, 2011

Learn Sun InfiniBand Switch

On Exadata X2-2, Oracle use "Sun InfiniBand Switch".
How to learn about it? We can find out Sun Datacenter InfiniBand Switch 36 Documentation and learn from them.
- On Operating System on it:

# cat /etc/redhat-release
CentOS release 5.2 (Final)

# uname -r
2.6.27.13-nm2

- Check the status of the management controller, power supplies, fans, and switch chip:

# showunhealthy
OK - No unhealthy sensors

# env_test
Environment test started:
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.30 V
Measured 3.3V Standby = 3.39 V
Measured 12V = 12.06 V
Measured 5V = 5.00 V
Measured VBAT = 3.11 V
Measured 2.5V = 2.50 V
Measured 1.8V = 1.80 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present
PSU 1 present
PSU test returned OK
Starting Temperature test:
Back temperature 32.5
Front temperature 33.1
SP temperature 37.1
Switch temperature 52, maxtemperature 58
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 11627
Fan 2 running at rpm 11627
Fan 3 running at rpm 11847
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting onboard ibdevice test:
Switch OK
All Internal ibdevices OK
onboard ibdevice test returned OK
Environment test PASSED

and other samples:

# showtemps
Back temperature 32.5
Front temperature 33.1
SP temperature 36.1
Switch temperature 52, maxtemperature 58
All temperatures OK

# checkpower
PSU 0 present status: OK
PSU 1 present status: OK

# getfanspeed
Fan 0 not present
Fan 1 rpm 11847
Fan 2 rpm 11736
Fan 3 rpm 11847
Fan 4 not present

- Check the link status of the InfiniBand cables:

# listlinkup
Connector 0A Not present
Connector 1A Not present
Connector 2A Not present
Connector 3A Not present
Connector 4A Not present
Connector 5A Not present
Connector 6A Not present
Connector 7A Not present
Connector 8A Present <-> Switch Port 31 is up (Enabled)
Connector 9A Present <-> Switch Port 14 is up (Enabled)
Connector 10A Present <-> Switch Port 16 is up (Enabled)
Connector 11A Present <-> Switch Port 18 is up (Enabled)
Connector 12A Not present
Connector 13A Present <-> Switch Port 09 is up (Enabled)
Connector 14A Present <-> Switch Port 07 is up (Enabled)
Connector 15A Present <-> Switch Port 05 is up (Enabled)
Connector 16A Present <-> Switch Port 03 is up (Enabled)
Connector 17A Present <-> Switch Port 01 is up (Enabled)
Connector 0B Not present
Connector 1B Not present
Connector 2B Not present
Connector 3B Not present
Connector 4B Not present
Connector 5B Not present
Connector 6B Not present
Connector 7B Not present
Connector 8B Present <-> Switch Port 32 is up (Enabled)
Connector 9B Present <-> Switch Port 13 is up (Enabled)
Connector 10B Present <-> Switch Port 15 is up (Enabled)
Connector 11B Present <-> Switch Port 17 is up (Enabled)
Connector 12B Present <-> Switch Port 12 is up (Enabled)
Connector 13B Present <-> Switch Port 10 is up (Enabled)
Connector 14B Present <-> Switch Port 08 is up (Enabled)
Connector 15B Present <-> Switch Port 06 is up (Enabled)
Connector 16B Present <-> Switch Port 04 is up (Enabled)
Connector 17B Present <-> Switch Port 02 is up (Enabled)

*** "Not present" status is no cable attached or the link at either end of the cable is down.
and check status port

# getportstatus 31
Port status for connector 8A Switch Port 31
Adminstate:......................Enabled
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkSpeedActive:.................10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps

- Verify the InfiniBand Fabric
ibdiagnet - Performs diagnostics upon the InfiniBand fabric and reports status.
ibnetdiscover - Discovers and displays the InfiniBand fabric topology and connections.
ibcheckerrors - Checks the entire InfiniBand fabric for errors.

# ibcheckerrors
## Summary: 14 nodes checked, 0 bad nodes found
## 62 ports checked, 0 ports have errors beyond threshold

Other Links:
Sun Datacenter InfiniBand Switch 72 Documentation

ORA-15041 - drop disks in ASM diskgroup

Test to drop disk in ASM Disk Group and found error: ORA-15041

SQL> alter diskgroup TEST_DG drop disks in failgroup EXACEL02 rebalance power 11 NOWAIT;

Diskgroup altered.

SQL> select * from gv$asm_operation;

INST_ID GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES ERROR_CODE
---------- ------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- ----------
1 4 REBAL ERRS 11 ORA-15041

SQL> select path, header_status, os_mb,free_mb, failgroup, mount_status from v$asm_disk where path like '%TEST_DG%' and failgroup='EXACEL02';

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP MOUNT_S
---------------------------------------- ------------ ---------- ---------- ---------- -------
.
.
.
o/192.168.99.6/TEST_DG_FD_03_EXACEL02 FORMER 1024 0 EXACEL02 CLOSED
o/192.168.99.6/TEST_DG_FD_00_EXACEL02 MEMBER 1024 1012 EXACEL02 CACHED

MOUNT_STATUS should not be “CACHED” for DROPPED Disks.

ORA-15041: diskgroup space exhausted
Cause: The diskgroup ran out of space.
Action: Add more disks to the diskgroup, or delete some existing files.

Idea: Have to remove some files in this ASM disk group or add new disk to this ASM disk group, (Actually we can check "required_mirror_free_mb" in V$ASM_DISKGROUP before drop Disk). After we fixed it (this issue: I resized some database files):

SQL> select * from gv$asm_operation;

no rows selected

SQL> select path, header_status, os_mb,free_mb, failgroup, mount_status from v$asm_disk where path like '%TEST_DG%' and failgroup='EXACEL02';

PATH HEADER_STATU OS_MB FREE_MB FAILGROUP MOUNT_S
---------------------------------------- ------------ ---------- ---------- ---------- -------
.
.
.
o/192.168.99.6/TEST_DG_FD_03_EXACEL02 FORMER 1024 0 EXACEL02 CLOSED
o/192.168.99.6/TEST_DG_FD_15_EXACEL02 FORMER 1024 0 EXACEL02 CLOSED

Other Post: ORA-15041 - ASM Disk Group Unbalanced

Wednesday, March 02, 2011

test Smart Scan with CELL_OFFLOAD_PROCESSING parameter

Read Exadata Part I: Smart Scan (by UWE HESSE), hope to see more about Smart Scan Performance and then test... this idea use CELL_OFFLOAD_PROCESSING parameter.
CELL_OFFLOAD_PROCESSING initialization parameter enables SQL processing offload to Exadata Cell (default value is TRUE).
*** "sales" table has 20000000 rows ***
Test count(*) data:

SQL> alter session set cell_offload_processing=FALSE;

Session altered.

SQL> select count(*) from sales;

COUNT(*)
----------
20000000

Elapsed: 00:00:08.01

SQL> alter session set cell_offload_processing=TRUE;

Session altered.

SQL> select count(*) from sales;

COUNT(*)
----------
20000000

Elapsed: 00:00:00.46

Test to query data (no index):

SQL> alter session set cell_offload_processing=FALSE;

Session altered.

SQL> select * from sales where id=4731;

ID FLAG PRODUCT CHANNEL_ID CUST_ID AMOUNT_SOLD ORDER_DAT SHIP_DATE
---------- ---------- ------------------------- ---------- ---------- ----------- --------- ---------
4731 4732 Oracle Enterprise Edition 1 731 5000 27-SEP-96 28-SEP-96

Elapsed: 00:00:08.15

SQL> explain plan for select * from sales where id=4732;

Explained.

SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 781590677
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2090 | 224K| 51614 (1)| 00:10:20 |
|* 1 | TABLE ACCESS STORAGE FULL| SALES | 2090 | 224K| 51614 (1)| 00:10:20 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=4732)
Note
-----
- dynamic sampling used for this statement (level=2)

Spent time 08.15 seconds.

SQL> alter session set cell_offload_processing=TRUE;

Session altered.

SQL> select * from sales where id=4730;

ID FLAG PRODUCT CHANNEL_ID CUST_ID AMOUNT_SOLD ORDER_DAT SHIP_DATE
---------- ---------- ------------------------- ---------- ---------- ----------- --------- ---------
4730 4731 Oracle Enterprise Edition 0 730 5000 26-SEP-96 27-SEP-96

Elapsed: 00:00:00.29

SQL> explain plan for select * from sales where id=4733;

Explained.

SQL> SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 781590677
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2090 | 224K| 51614 (1)| 00:10:20 |
|* 1 | TABLE ACCESS STORAGE FULL| SALES | 2090 | 224K| 51614 (1)| 00:10:20 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - storage("ID"=4733)
filter("ID"=4733)
Note
-----
- dynamic sampling used for this statement (level=2)

Spent time 00.29 seconds.

How do we know Oracle use SQL processing offload to Exadata Cell ?
- On Explain plan: we will see storage("ID"=...) but can not tell that query used SMART SCAN... have to trace event 10046 !!!
However, if "cell_offload_plan_display"=NEVER.
SQL Processing offload to Exadata Cell does not display on explain plan.

What operations benefit from Smart Scan?
- Full scan of a heap table
- Fast full scan of a B-Tree or bitmap index

Which function can we be offload to the Exadata?
- Check from query
select * from v$sqlfn_metadata where offloadable='YES';

Tuesday, March 01, 2011

What is ipmitool?

What is IPMI? The Intelligent Platform Management Interface (IPMI) specification is an Intel-led standard that defines a set of common interfaces to a computer system that system administrators can use to monitor system health and manage the system.
IPMI provides monitoring, management voltage, temperature, fans, power supplies, hardware inventory information and etc.

What is ipmitool? impitool is a utility for managing and configuring devices that support the Intelligent Platform Management Interface (IPMI) version 1.5 and version 2.0 specifications.

On this article, I just posted samples to use "ipmitool" on Sun Server.

# # ipmitool -help
ipmitool version 1.8.10.3

Example: "ipmitool" command

# ipmitool lan print
Set in Progress : Set Complete
Auth Type Support : NONE MD2 MD5 PASSWORD
Auth Type Enable : Callback : MD2 MD5 PASSWORD
: User : MD2 MD5 PASSWORD
: Operator : MD2 MD5 PASSWORD
: Admin : MD2 MD5 PASSWORD
: OEM :
IP Address Source : Static Address
IP Address :
Subnet Mask :
MAC Address :
SNMP Community String : public
IP Header : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control : ARP Responses Disabled, Gratuitous ARP Disabled
Gratituous ARP Intrvl : 5.0 seconds
Default Gateway IP :
Default Gateway MAC : 00:00:00:00:00:00
Backup Gateway IP : 0.0.0.0
Backup Gateway MAC : 00:00:00:00:00:00
802.1q VLAN ID : Disabled
802.1q VLAN Priority : 0
RMCP+ Cipher Suites : 2,3,0
Cipher Suite Priv Max : XXXXXXXXXXXXXXX
: X=Cipher Suite Unused
: c=CALLBACK
: u=USER
: o=OPERATOR
: a=ADMIN
: O=OEM

# ipmitool chassis status
System Power : on
Power Overload : false
Power Interlock : inactive
Main Power Fault : false
Power Control Fault : false
Power Restore Policy : previous
Last Power Event :
Chassis Intrusion : inactive
Front-Panel Lockout : inactive
Drive Fault : false
Cooling/Fan Fault : false

# ipmitool power status
Chassis Power is on

# ipmitool sensor list

# ipmitool sensor list | grep degree
ESMR/T_AMB | na | degrees C | na | -10.000 | -5.000 | 0.000 | 50.000 | 55.000 | 60.000
T_OUT0 | 32.000 | degrees C | ok | na | na | na | 55.000 | 60.000 | 65.000
T_OUT1 | 34.000 | degrees C | ok | na | na | na | 55.000 | 60.000 | 65.000
T_OUT2 | 30.000 | degrees C | ok | na | na | na | 55.000 | 60.000 | 65.000
T_AMB | 20.000 | degrees C | ok | 5.000 | na | na | na | na | na

# ipmitool sunoem cli "show /SYS/T_AMB value"
Connected. Use ^D to exit.
-> show /SYS/T_AMB value

/SYS/T_AMB
Properties:
value = 20.000 degree C

-> Session closed
Disconnected

Print Product Serial Number:

# ipmitool sunoem cli "show /SYS product_serial_number"
Connected. Use ^D to exit.
-> show /SYS product_serial_number

/SYS
Properties:
product_serial_number = < SERIAL NUMBER >

-> Session closed
Disconnected

Print about BIOS:

# ipmitool sunoem cli "show /SYS/MB/BIOS"

Print System Event Log:

# ipmitool sel list

Or use Execute SP CLI:

# ipmitool sunoem cli
Connected. Use ^D to exit.
-> show /SYS/T_AMB value
show /SYS/T_AMB value

/SYS/T_AMB
Properties:
value = 19.000 degree C

ipmitool - utility for controlling IPMI-enabled devices.
IPMI improves server management and helps reduce costs. IPMI helps lower the overall costs of server management by enabling customers to save time, maximize IT resources and potentially manage multivendor environments in the same way.

Learn More:
IPMItool Homepage
Intelligent Platform Management Interface Specification
OpenIPMI Homepage
FreeIPMI Homepage

Sunday, February 27, 2011

How to use Exadata Smart Flash Cache?

I posted about Oracle 11gR2 Flash Cache. I had to use both db_flash_cache_file and db_flash_cache_size parameters for Database Smart Flash Cache(this feature is supported on Exadata Environment). After I read Exadata Part IV: Flash Cache (by UWE HESSE). I interested about Exadata Smart Flash Cache. How to use Exadata Smart Flash Cache? - So... I tested:

On Storage:

CellCLI> list flashcache detail
< Nothing >

CellCLI> create flashcache all size=100g
Flash cache exacel01_FLASHCACHE successfully created

CellCLI> list FLASHCACHECONTENT
< Nothing >

CellCLI> list flashcache detail
name: exacel01_FLASHCACHE
cellDisk:
creationTime: 2011-02-27T01:38:12+07:00
degradedCelldisks:
effectiveCacheSize: 100G
id:
size: 100G
status: normal

On Database:

SQL> create table tb_test as select * from all_objects;

Table created.

CELL_FLASH_CACHE storage clause attribute allows the override of the automatic caching policy in Exadata Smart Flash Cache when the COMPATIBLE parameter is greater than or equal to 11.2.0.0. CELL_FLASH_CACHE storage_clause option:
- NONE
- DEFAULT
- KEEP

On Database:

SQL> alter table tb_test storage (cell_flash_cache keep);

Table altered.

SQL> select * from tb_test;

SQL> select name, value from v$sysstat where name in ('physical read total IO requests','cell flash cache read hits');

NAME VALUE
---------------------------------------------------------------- ----------
physical read total IO requests 586219
cell flash cache read hits 86

On Storage:

CellCLI> list FLASHCACHECONTENT
2080757153 0 40
2080757153 0 227
2080757153 0 288
2080757153 0 290
2080757153 0 384
2080757153 1 6548
2080757153 1 6550
2080757153 1 15926
2080757153 1 16513
2080757153 1 16991
2080757153 1 17170

OK... just created flashcache!!! and used CELL_FLASH_CACHE storage_clause option on Database.

read more(Exadata Smart Flash Cache and the Sun Oracle Database Machine)
***

Saturday, February 26, 2011

Reliable Datagram Sockets (RDS)

Reliable Datagram Sockets (RDS) is a reliable-socket off-load driver and inter-processor communication (IPC) protocol with low overhead, low-latency, high-bandwidth. RDS enables enhanced application performance and cluster scalability.
***Contributor: www.openfabrics.org (Particularly, Oracle)***
Reference:
http://www.openfabrics.org
http://oss.oracle.com/projects/rds/

RDS protocol provides reliable datagram services multiplexing UDP packets over InfiniBand connection improving performance to Oracle RAC. It provides high performance cluster interconnect for Oracle RAC. (Interest Link)

How to know RDS is used on RAC?

SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.

On Oracle RAC (no RDS) - In trace file:

SSKGXPT 0x6700190 flags SSKGXPT_READPENDING socket no 7 IP 192.168.99.1 UDP 48798
context timestamp 0

On Oracle RAC (RDS) - In trace file:

SKGXP:[2b4d5065d400.97]{ctx}: SSKGXPT 0x2b4d506b1d98 flags 0x0 sockno 10 IP 192.168.99.1 RDS 26387 lerr 0
SKGXP:[47611061326848.97]{ctx}: SKGXPGPID Internet address 192.168.99.1 RDS port number 26387

Or... check in alert log file:

Cluster communication is configured to use the following interface(s) for this instance
192.168.99.1
cluster interconnect IPC version:Oracle RDS/IP (generic)
IPC Vendor 1 proto 3

If database cluster (RAC) not use RDS - rebuild RAC IPC library for RDS (use "oracle" user and stop what instances use this ORACLE_HOME before):

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_rds ioracle

If need to revert back RAC to use UDP instead of RDS:

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ipc_g ioracle

Note: some command-line on Linux (rds-tools git repo):

rds-info - display information from the RDS kernel module
rds-ping - test reachability of remote node over RDS
rds-stress - send messages between processes over RDS sockets

Example:

# rds-info -n

RDS Connections:
LocalAddr RemoteAddr NextTX NextRX Flg
192.168.99.1 192.168.99.1 117066 6 --C
192.168.99.1 192.168.99.3 1167415 1060920 --C
192.168.99.1 192.168.99.2 1104151 1018883 --C
192.168.99.1 192.168.99.5 106692 62660 --C
192.168.99.1 192.168.99.4 1129073 990390 --C
192.168.99.1 192.168.99.7 341345 413031 --C
192.168.99.1 192.168.99.6 530757 603414 --C
192.168.99.1 192.168.99.9 108975 120114 --C
192.168.99.1 192.168.99.8 491816 560933 --C
192.168.99.1 192.168.99.11 270190 283060 --C
192.168.99.1 192.168.99.10 578831 588658 --C

# rds-ping 192.168.99.2
1: 29 usec
2: 28 usec
3: 31 usec
4: 33 usec

server01
# rds-stress
waiting for incoming connection on 0.0.0.0:4000

then on server02:
# rds-stress -s 192.168.99.1 -p 4000 -t 1 -d 1 -D 1024000
connecting to 192.168.99.1:4000
negotiated options, tasks will start in 2 seconds
Starting up....
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu %
1 1252 1252 2647.90 1250670.54 1250670.54 28.47 764.87 -1.00
1 1251 1251 2643.31 1247501.25 1249497.26 29.39 764.85 -1.00
1 1247 1247 2637.50 1246756.98 1244758.98 29.56 767.66 -1.00
1 1248 1248 2639.52 1246709.66 1246709.66 30.07 766.60 -1.00
1 1247 1248 2638.61 1245780.38 1246779.40 29.93 767.10 -1.00
^C

Check on server01
# rds-stress
waiting for incoming connection on 0.0.0.0:4000
accepted connection from 192.168.99.2:46507 on 192.168.99.1:4000
negotiated options, tasks will start in 2 seconds
Starting up....
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu %
1 1252 1252 2648.12 1250774.24 1250774.24 31.86 761.95 -1.00
1 1250 1250 2643.83 1248746.26 1248746.26 31.46 763.10 -1.00
1 1245 1245 2635.89 1244996.27 1244996.27 32.84 766.34 -1.00
1 1247 1247 2640.09 1246980.05 1246980.05 33.06 765.58 -1.00
1 1247 1246 2639.09 1247006.24 1246006.23 32.69 765.56 -1.00
---------------------------------------------
1 1247 1247 2641.41 1247701.42 1247501.50 32.39 764.81 -1.00 (average)

On Exadata, Oracle uses RDS (Reliable Datagram Sockets) V3. Oracle has developed it and we know in name the Zero Data loss UDP (ZDP) protocol.

Thursday, February 24, 2011

Distributed Shell for Oracle Storage

"dcli" script executes commands on multiple cells in parallel threads. The cells are referenced by their domain name or ip address. Local files can be copied to cells and executed on cells.
If we would like to know more about setting up "dcli". we can find out on My Oracle Support - Setting up DCLI Utility on Oracle DB host [ID 787205.1]

This is shell utility on Exadata servers, It seems to use "SSH login without password" idea (push ssh key to cell's authorized_keys file) on "root" user or ...
Example:

# cat all_group
exadb01
exadb02
exadb03
exadb04
exacel01
exacel02
exacel03
exacel04
exacel05
exacel06
exacel07

# dcli -g all_group -l root "hostname"
exadb01: exadb01.domain
exadb02: exadb02.domain
exadb03: exadb03.domain
exadb04: exadb04.domain
exacel01: exacel01.domain
exacel02: exacel02.domain
exacel03: exacel03.domain
exacel04: exacel04.domain
exacel05: exacel05.domain
exacel06: exacel06.domain
exacel07: exacel07.domain

# cat dbs_group
exadb01
exadb02
exadb03
exadb04

# dcli -g dbs_group -l root "vmstat 1 2| tail -1"
exadb01: 0 0 0 93954080 84384 3399876 0 0 0 0 1352 10532 0 0 100 0 0
exadb02: 0 0 0 93701856 98800 3552584 0 0 0 0 1282 10636 0 0 100 0 0
exadb03: 0 0 0 92977872 98836 4266000 0 0 0 0 1275 10504 0 0 100 0 0
exadb04: 0 0 0 93257040 85704 3739464 0 0 0 0 1258 10277 0 0 100 0 0

If want to use "vmstat" with "dcli", use option:
--vmstat=VMSTATOPS vmstat command options
Example:

# dcli -l root -g dbs_group --vmstat="-a 1 2"
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
00:54:49: r b swpd free inact active si so bi bo in cs us sy id wa st
exadb01: 0 0 0 73336200 20123384 4346176 0 0 0 8 3 4 0 0 100 0 0
exadb02: 0 0 0 92061888 2797188 3472388 0 0 0 17 4 2 0 0 100 0 0
exadb03: 0 0 0 91947248 2783296 3597512 0 0 0 16 4 6 0 0 100 0 0
exadb04: 1 0 0 91658088 2993800 3664716 0 0 0 4 3 2 0 0 100 0 0
Minimum: 0 0 0 73336200 2783296 3472388 0 0 0 4 3 2 0 0 100 0 0
Maximum: 1 0 0 92061888 20123384 4346176 0 0 0 17 4 6 0 0 100 0 0
Average: 0 0 0 87250856 7174417 3770198 0 0 0 11 3 3 0 0 100 0 0
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
00:54:50: r b swpd free inact active si so bi bo in cs us sy id wa st
exadb01: 0 0 0 73334824 20123416 4346228 0 0 0 0 1109 10562 1 0 99 0 0
exadb02: 1 0 0 92062160 2797184 3472456 0 0 0 8 1104 9977 0 0 100 0 0
exadb03: 0 0 0 91947776 2783292 3597516 0 0 0 4 1112 10077 0 0 100 0 0
exadb04: 0 0 0 91657872 2993804 3664720 0 0 0 16 1108 10109 0 0 100 0 0
Minimum: 0 0 0 73334824 2783292 3472456 0 0 0 0 1104 9977 0 0 99 0 0
Maximum: 1 0 0 92062160 20123416 4346228 0 0 0 16 1112 10562 1 0 100 0 0
Average: 0 0 0 87250658 7174424 3770230 0 0 0 7 1108 10181 0 0 99 0 0

This is easy to executes commands on multiple cells. But I'm afraid "SSH login without password" idea for "root" user or ... on all cells(servers). How do you think?

Wednesday, February 23, 2011

How to start about Exadata Learning?

The foundation of the Exadata family of products is the Oracle Exadata Database Machine (Database Machine). The Database Machine is a complete and fully integrated database system that includes all the components to quickly and easily deploy any enterprise database delivering the best performance. The Exadata Storage Server (Exadata storage or Exadata cells) is used as the storage for the Oracle Database in the Database Machine and is used to grow existing Database Machine deployments.

However, I don't want to explain what is EXADATA? I'm just curious some idea: How to start about Exadata Learning?
I think someone who work as Oracle DBA need to know this answer. sure!!! Me too. If You have Exadata, It may easy for learning -)

Idea 1: Search "exadata" word on My Oracle Support, I guaratee it's useful.

Master Note for Oracle Database Machine and Exadata Storage Server [ID 1187674.1]
Exadata V2 Starter Kit [ID 1244344.1]

However, Database System on Exadata is Oracle RAC!!! we should know Oracle RAC also.

Master Note for Real Application Clusters (RAC) Oracle Clusterware and Oracle Grid Infrastructure [ID 1096952.1]

Idea 2: If no My Oracle Support, we can check on Oracle Learning Library (Exadata and Database Machine Version 2 Series)

Idea 3: Read from Blogs:
- The Oracle Instructor (by UWE HESSE)
- Kerry Osborne’s Oracle Blog
- Kevin Closson’s Oracle Blog
- Structured Data (by Greg Rahn)
- Frits Hoogland Weblog
- ocpdba oracle weblog
- Pythian Blog
- ETC ...

*** watch some Webcast also ***

Idea 4: Read on Oracle Forums

Idea 5: Buy Exadata books (check on Oracle Press ,APRESS or Amazon)

Idea 6: Study from Oracle University

Idea 7: How about your idea? I hope to hear some idea from you -)

Wednesday, September 16, 2009

Exadata Version 2

Oracle Unveils Exadata Version 2: The First Database Machine for OLTP

The Exadata Database Machine Version 2, made by Sun and Oracle, is the world’s fastest machine for both data warehousing and online transaction processing (OLTP).

Exadata Version 2 is available 4 models:

- full rack (8 database servers and 14 storage servers)

- half-rack (4 database servers and 7 storage servers)

- quarter-rack (2 database servers and 3 storage servers)

- a basic system (1 database server and 1 storage server)

http://www.oracle.com/corporate/pricing/exadata-pricelist.pdf

Surachart Opun's Blog

Thursday, August 11, 2011

INSERT with PARALLEL hint + force direct-path INSERT

What wrong? (INSERT) PARALLEL hint - It not compress

Saturday, June 04, 2011

Monitoring Exadata Cell using some Views

Monday, May 30, 2011

InfiniBand switch - relocated Subnet Manager Master to another switch

Thursday, May 26, 2011

exachk - WARNING!!! sm_priority is not set to recommended value of 5 on infiniband switch

Saturday, April 02, 2011

How to display content attributes on Exadata Storage?

Achieving Extreme Performance with Oracle Exadata

Sunday, March 20, 2011

What should we contain Online Redo Log Files in Flash Disk?

Friday, March 04, 2011

ORA-15067 - When drop disks in ASM DiskGroup Normal redundancy

Thursday, March 03, 2011

Learn Sun InfiniBand Switch

ORA-15041 - drop disks in ASM diskgroup

Wednesday, March 02, 2011

test Smart Scan with CELL_OFFLOAD_PROCESSING parameter

Tuesday, March 01, 2011

What is ipmitool?

Sunday, February 27, 2011

How to use Exadata Smart Flash Cache?

Saturday, February 26, 2011

Reliable Datagram Sockets (RDS)

Thursday, February 24, 2011

Distributed Shell for Oracle Storage

Wednesday, February 23, 2011

How to start about Exadata Learning?

Wednesday, September 16, 2009

Exadata Version 2

Search This Blog

Blog Archives

Thursday, August 11, 2011

Saturday, June 04, 2011

Monday, May 30, 2011

Thursday, May 26, 2011

Saturday, April 02, 2011

Sunday, March 20, 2011

Friday, March 04, 2011

Thursday, March 03, 2011

Wednesday, March 02, 2011

Tuesday, March 01, 2011

Sunday, February 27, 2011

Saturday, February 26, 2011

Thursday, February 24, 2011

Wednesday, February 23, 2011

Wednesday, September 16, 2009

Search This Blog

Feed Me!

Blog Archives