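Contact: mobile/WeChat (+86 17813235971), QQ (107644445)
Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; I reserve the right to pursue legal liability otherwise.]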
This is a database failure from a few months ago that I re-analyzed today on Windows. The database fails to open with ORA-600 [6711].
C:\Users\XFF>SQLPLUS / AS SYSDBA

SQL*Plus: Release 12.1.0.2.0 Production on Sun Jul 14 16:17:32 2024

Copyright (c) 1982, 2014, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='d:/pfile.txt'
ORACLE instance started.

Total System Global Area 6442450944 bytes
Fixed Size 6205768 bytes
Variable Size 1493175992 bytes
Database Buffers 4932501504 bytes
Redo Buffers 10567680 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389],
[0], [], [], [], [], [], [], []
Process ID: 44144
Session ID: 67 Serial number: 39084
From experience, this error is ORA-600 [6711] "Cluster Key Chain corruption", which means the problem is most likely caused by a damaged cluster-related object.
Trace the startup process:
PARSING IN CURSOR #17695456 len=189 dep=4 tim=233428646426 hv=186852205 ad='7ffda1eea168' sqlid='2tkw12w5k68vd'
select user#,password,datats#,tempts#,type#,defrole,resource$, ptime,decode(defschclass,NULL,'DEFAULT_CONSUMER_GROUP',defschclass),spare1,spare4,ext_username,spare2 from user$ where name=:1
END OF STMT
PARSE #17695456:c=0,e=168,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=0,tim=233428646426
BINDS #17695456:
 Bind#0
  oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=871 siz=32 off=0
  kxsbbbfp=010b2df0 bln=32 avl=03 flg=05
  value="SYS"
EXEC #17695456:c=0,e=418,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=4,plh=1457651150,tim=233428646901
WAIT #17695456: nam='db file sequential read' ela= 126 file#=1 block#=417 blocks=1 obj#=46 tim=233428647046
FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069
STAT #17695456 id=1 cnt=1 pid=0 pos=1 obj=22 op='TABLE ACCESS BY INDEX ROWID USER$ (cr=2 pr=1 pw=0 time=151 us cost=1 size=139 card=1)'
STAT #17695456 id=2 cnt=1 pid=1 pos=1 obj=46 op='INDEX UNIQUE SCAN I_USER1 (cr=1 pr=1 pw=0 time=149 us)'
CLOSE #17695456:c=0,e=2,dep=4,type=0,tim=233428647111
Incident 2601 created, dump file: C:\APP\XFF\diag\rdbms\ecp\ecp\incident\incdir_2601\ecp_ora_40516_i2601.trc
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []

FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536
=====================
PARSE ERROR #387363008:len=50 dep=1 uid=0 oct=3 lid=0 tim=233431176680 err=600
select cost from resource_cost$ where resource#=:1
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
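This operation triggered a recursive query. The failing FETCH #15289752 in the trace above carries plh=3312420081, the same plan hash value as the following recursive query against histgrm$, which suggests this is the statement that hit the error: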
PARSING IN CURSOR #387319440 len=151 dep=5 lid=0 tim=233428641503 hv=2507062328 ad='7ffd9ffa23a8' sqlid='7u49y06aqxg1s'
select /*+ rule */ bucket, endpoint, col#, epvalue, epvalue_raw, ep_repeat_count from histgrm$ where obj#=:1 and intcol#=:2 and row#=:3 order by bucket
END OF STMT
PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503
BINDS #387319440:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=72 off=0
  kxsbbbfp=00eb2be0 bln=22 avl=02 flg=05
  value=22
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=00eb2bf8 bln=22 avl=02 flg=01
  value=2
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=0 off=48
  kxsbbbfp=00eb2c10 bln=22 avl=01 flg=01
  value=0
EXEC #387319440:c=0,e=105,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641652
WAIT #387319440: nam='db file sequential read' ela= 124 file#=1 block#=45660 blocks=1 obj#=66 tim=233428641792
FETCH #387319440:c=0,e=173,p=1,cr=3,cu=0,mis=0,r=20,dep=5,og=3,plh=3312420081,tim=233428641834
STAT #387319440 id=1 cnt=20 pid=0 pos=1 obj=0 op='SORT ORDER BY (cr=3 pr=1 pw=0 time=169 us cost=0 size=0 card=0)'
STAT #387319440 id=2 cnt=20 pid=1 pos=1 obj=66 op='TABLE ACCESS CLUSTER HISTGRM$ (cr=3 pr=1 pw=0 time=148 us)'
STAT #387319440 id=3 cnt=1 pid=2 pos=1 obj=65 op='INDEX UNIQUE SCAN I_OBJ#_INTCOL# (cr=2 pr=0 pw=0 time=2 us)'
CLOSE #387319440:c=0,e=36,dep=5,type=3,tim=233428641886
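One way to tie a failing FETCH back to its SQL text in a long 10046-style trace is to group cursor numbers by their `plh` (plan hash value) field. The following is a minimal sketch, not the author's method: the helper name `cursors_by_plh` and the three sample lines (copied from the traces above) are chosen for illustration, and a shared plan hash only suggests, rather than proves, that two cursors ran the same statement.

```python
import re

# Three lines copied from the traces above; a real trace file would be read from disk.
TRACE = """\
PARSE #387319440:c=0,e=11,p=0,cr=0,cu=0,mis=0,r=0,dep=5,og=3,plh=3312420081,tim=233428641503
FETCH #17695456:c=0,e=153,p=1,cr=2,cu=0,mis=0,r=1,dep=4,og=4,plh=1457651150,tim=233428647069
FETCH #15289752:c=2062500,e=2544215,p=13,cr=65626,cu=28,mis=0,r=0,dep=3,og=3,plh=3312420081,tim=233431176536
"""

# Matches PARSE/EXEC/FETCH lines and captures the cursor number and plan hash.
LINE_RE = re.compile(r"^(PARSE|EXEC|FETCH) #(\d+):.*?\bplh=(\d+)", re.M)


def cursors_by_plh(text):
    """Map each plan hash value to the set of cursor numbers that used it."""
    groups = {}
    for _op, cursor, plh in LINE_RE.findall(text):
        groups.setdefault(int(plh), set()).add(int(cursor))
    return groups


if __name__ == "__main__":
    for plh, cursors in sorted(cursors_by_plh(TRACE).items()):
        print(plh, sorted(cursors))
```

In this sample, the failing cursor #15289752 lands in the same group as #387319440, the cursor whose text is the histgrm$ query.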
Check the corresponding incident trace file:
[TOC00000]
Jump to table of contents
Dump continued from file: C:\APP\XFF\diag\rdbms\ecp\ecp\trace\ecp_ora_40516.trc
[TOC00001]
ORA-00600: internal error code, arguments: [6711], [4436379], [1], [4436389], [0], [], [], [], [], [], [], []
[TOC00001-END]
[TOC00002]
========= Dump for incident 2601 (ORA 600 [6711]) ========
[TOC00003]
----- Beginning of Customized Incident Dump(s) -----
kdsDumpState: cdb: 0 dspdb: 0 type: 3
*** ENTER: kds state dump ***
 row 0x0043b1a5.28 continuation at: 0x0043b1a5.0 file# 1 block# 242085 slot 0 (dscnt: 0)
KDSTABN_GET: 1 ..... ntab: 2
curSlot: 0 ..... nrows: 40
Dumping kcb descriptor:
kcbds 0x0000000017100DF0 : tsn 0, rdba 0x0043b1a5, afn 1, objd 64, cls 1, tidflg 0x0 0x0 0x0
 dsflg 0x00100000, dsflg2 0x00004000, lobid 00000000:00000000, cnt 0, addr 0x00007FFD55D1C014
 dx 0x0000000000000000
 env [0x0000000017178C7C]: (scn: 0x0000.54290647 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000660)
kcb_dw_scan_dumpctx: not in DW scan
kdsgrp1_dump
 database not fully open
*** EXIT: kds state dump ***
----- End of Customized Incident Dump(s) -----
[TOC00003-END]
Dumping the block at that rdba confirms that the object id on the block is 64, which matches the objd reported in the trace:
DUL> rdba 0x0043b1a5
 rdba : 0x0043b1a5=4436389
 rfile# : 1
 block# : 242085
DUL> dump datafile 1 block 242085 header
Block Header:
block type=0x06 (table/index/cluster segment data block)
block format=0xa2 (oracle 10)
block rdba=0x0043b1a5 (file#=1, block#=242085)
scn=0x0000.438d4a86, seq=1, tail=0x4a860601
block checksum value=0xd591=54673, flag=6
Data Block Header Dump:
 Object id on Block? Y
 seg/obj: 0x40=64 csc: 0x00.438d4a80 itc: 2 flg: - typ: 1 (data)
 fsl: 0 fnx: 0x0 ver: 0x01
 Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0002.01f.00014b92 0x00c01897.6e20.07 C--- 0 scn 0x0000.438c5fca
0x02 0x000a.01a.0011bb8e 0x00c0292c.0317.42 --U- 22 fsc 0x0000.438d4a86
Data Block Dump:
================
flag=0x0 --------
ntab=2
nrow=41
frre=23
fsbo=0x68
ffeo=0xb90
avsp=0x1ce1
tosp=0x1ce1
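The rdba conversion that DUL performs above can be reproduced by hand. The sketch below assumes the usual smallfile-tablespace layout, where the upper 10 bits of the 32-bit rdba hold the relative file number and the lower 22 bits hold the block number; `decode_rdba` is a name chosen here for illustration. Treating the first ORA-600 [6711] argument (4436379) as an rdba as well is an assumption, not something the trace states.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (rfile#, block#).

    Assumes a smallfile tablespace: 10 bits of file number, 22 bits of block number.
    """
    rfile = rdba >> 22          # upper 10 bits
    block = rdba & 0x3FFFFF     # lower 22 bits
    return rfile, block


if __name__ == "__main__":
    # 4436389 is the rdba confirmed by the trace and the DUL output;
    # 4436379 is the first ORA-600 argument, assumed here to be an rdba too.
    for value in (4436389, 4436379):
        rfile, block = decode_rdba(value)
        print(f"{value} = 0x{value:08x} -> rfile# {rfile}, block# {block}")
```

For 0x0043b1a5 this gives rfile# 1 and block# 242085, matching the DUL output.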
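To find out which object this id belongs to, unload obj$ with DUL.
This confirms that the object is the cluster C_OBJ#_INTCOL#, whose table is HISTGRM$ (the table that stores histogram data for optimizer statistics). With that established, the fix is fairly straightforward: bypass access to this object while opening the database, then repair the table.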