Tuesday, January 04, 2011

oradebug Man ^ ^

This topic is not mean special like movie topic or something else. I'm just curious on my DBA style, maybe it is wrong style, or the situation made me to do like that. When the Oracle Database has the issue and need to investigate... we will check from waited events and etc. If we have Enterprise Manager, that's easy way to help.
If not, we may find session id, that we interest... and then use dbms_support or dbms_monitor to trace.
But I often get notify "Please check database. we find some... some database process using high %CPU". This question makes me to use "oradebug" to help.
ORADEBUG command in SQL*Plus is tool for troubleshooting and diagnosing performance problems.
Example:
$ top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2904 oracle 16 0 1301m 137m 29m R 66.6 10.7 0:03.05 oracleorcl (LOCAL=NO)
2882 oracle 16 0 1201m 77m 68m S 12.9 6.0 0:04.28 ora_j001_orcl
2517 oracle -2 0 1190m 2904 2808 S 2.1 0.2 0:17.87 ora_vktm_orcl
After know OS pid (setospid = Set OS pid of process to debug)
SQL> oradebug setospid 2904
Oracle pid: 31, Unix process pid: 2904, image: oracle@linuxtest01
SQL> oradebug TRACEFILE_NAME
/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2904.trc
SQL> oradebug unlimit
Statement processed.

SQL> oradebug event 10046 trace name context forever, level 12;
Statement processed.
Then review /oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2904.trc file and investigate the issue, maybe we may use "tkprof" (To analyze SQL trace files), If that issue occur from some SQL statements.
But some issue can not, we can investigate from wait in trace file. Example:
$ tail -f /oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2904.trc
*** 2011-01-04 22:02:27.772
WAIT #5171700: nam='resmgr:cpu quantum' ela= 195641 location=3 consumer group id=12422 =0 obj#=-1 tim=1294153347772181
WAIT #5171700: nam='resmgr:cpu quantum' ela= 56196 location=3 consumer group id=12422 =0 obj#=-1 tim=1294153348141281
!!! If need to trace off
SQL> oradebug event 10046 trace name context off;
Statement processed.
If that oracle process uses high CPU%, and don't want to kill it yet. we may use "oradebug" to suspend it.
$ ps -e -o pcpu,user,pid,args | sort -k 1 | grep 2904
62.8 oracle 2904 oracleorcl (LOCAL=NO)

SQL> oradebug suspend
Statement processed.

$ ps -e -o pcpu,user,pid,args | sort -k 1 | grep 2904
6.0 oracle 2904 oracleorcl (LOCAL=NO)
However, I don't want to write more... I just post idea when someone told me "We find some... some database process using high %CPU"

How about you? Just wish to hear from you!!!

Remark:
SQL> oradebug help
HELP [command] Describe one or all commands
SETMYPID Debug current process
SETOSPID <ospid> Set OS pid of process to debug
SETORAPID <orapid> ['force'] Set Oracle pid of process to debug
SETORAPNAME <orapname> Set Oracle process name to debug
SHORT_STACK Get abridged OS stack
CURRENT_SQL Get current SQL
DUMP <dump_name> <lvl> [addr] Invoke named dump
DUMPSGA [bytes] Dump fixed SGA
DUMPLIST Print a list of available dumps
EVENT <text> Set trace event in process
SESSION_EVENT <text> Set trace event in session
DUMPVAR <p|s|uga> <name> [level] Print/dump a fixed PGA/SGA/UGA variable
DUMPTYPE <address> <type> <count> Print/dump an address with type info
SETVAR <p|s|uga> <name> <value> Modify a fixed PGA/SGA/UGA variable
PEEK <addr> <len> [level] Print/Dump memory
POKE <addr> <len> <value> Modify memory
WAKEUP <orapid> Wake up Oracle process
SUSPEND Suspend execution
RESUME Resume execution
FLUSH Flush pending writes to trace file
CLOSE_TRACE Close trace file
TRACEFILE_NAME Get name of trace file
LKDEBUG Invoke global enqueue service debugger
NSDBX Invoke CGS name-service debugger
-G <Inst-List | def | all> Parallel oradebug command prefix
-R <Inst-List | def | all> Parallel oradebug prefix (return output
SETINST <instance# .. | all> Set instance list in double quotes
SGATOFILE <SGA dump dir> Dump SGA to file; dirname in double quotes
DMPCOWSGA <SGA dump dir> Dump & map SGA as COW; dirname in double quotes
MAPCOWSGA <SGA dump dir> Map SGA as COW; dirname in double quotes
HANGANALYZE [level] [syslevel] Analyze system hang
FFBEGIN Flash Freeze the Instance
FFDEREGISTER FF deregister instance from cluster
FFTERMINST Call exit and terminate instance
FFRESUMEINST Resume the flash frozen instance
FFSTATUS Flash freeze status of instance
SKDSTTPCS <ifname> <ofname> Helps translate PCs to names
WATCH <address> <len> <self|exist|all|target> Watch a region of memory
DELETE <local|global|target> watchpoint <id> Delete a watchpoint
SHOW <local|global|target> watchpoints Show watchpoints
DIRECT_ACCESS <set/enable/disable command | select query> Fixed table access
CORE Dump core without crashing process
IPC Dump ipc information
UNLIMIT Unlimit the size of the trace file
PROCSTAT Dump process statistics
CALL [-t count] <func> [arg1]...[argn] Invoke function with arguments

2 comments:

Coskan Gundogar said...

Miladin Modrakovic shared the best reference so far along with Tanels documentation on oradebug

http://www.evdbt.com/Oradebug_Modrakovic.pdf

Surachart Opun said...

Yes, this is a great (un)document.
ORADEBUG - UNDOCUMENTED ORACLE UTILITY By Miladin Modrakovic
http://www.evdbt.com/Oradebug_Modrakovic.pdf