ASM data scrubbing for fixing ORA-15196: invalid ASM block header

Today I’ve stumbled upon on ORA-15196: invalid ASM block header during making backup through RMAN.

[2016-01-26 06:53:58,046] INFO: [RMAN] RMAN-00571: ===========================================================
[2016-01-26 06:53:58,046] INFO: [RMAN] RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
[2016-01-26 06:53:58,046] INFO: [RMAN] RMAN-00571: ===========================================================
[2016-01-26 06:53:58,046] INFO: [RMAN]
[2016-01-26 06:53:58,046] INFO: [RMAN] RMAN-03009: failure of backup command on ch1 channel at 01/26/2016 06:53:50
[2016-01-26 06:53:58,046] INFO: [RMAN] ORA-01122: ???? ???? ?????? 7 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,046] INFO: [RMAN] ORA-01110: ???? ?????? 7: '+DATA/B/DATAFILE/owapub.285.902008683'
[2016-01-26 06:53:58,046] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,047] INFO: [RMAN] RMAN-03009: failure of backup command on ch1 channel at 01/26/2016 06:53:00
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01122: ???? ???? ?????? 10 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01110: ???? ?????? 10: '+DATA/B/DATAFILE/olap.290.902008687'
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,047] INFO: [RMAN] RMAN-03009: failure of backup command on ch1 channel at 01/26/2016 06:53:00
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01122: ???? ???? ?????? 9 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01110: ???? ?????? 9: '+DATA/B/DATAFILE/odm.289.902008685'
[2016-01-26 06:53:58,047] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,047] INFO: [RMAN] RMAN-03009: failure of backup command on ch4 channel at 01/26/2016 06:52:52
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01122: ???? ???? ?????? 6 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01110: ???? ?????? 6: '+DATA/B/DATAFILE/ctxd.288.902008683'
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,048] INFO: [RMAN] RMAN-03009: failure of backup command on ch1 channel at 01/26/2016 06:53:00
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-19566: ???????? ????? 0 ???????????? ?????? ??? ????? +DATA/B/DATAFILE/system.282.902008669
[2016-01-26 06:53:58,048] INFO: [RMAN] RMAN-03009: failure of backup command on ch2 channel at 01/26/2016 06:50:55
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01122: ???? ???? ?????? 5 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01110: ???? ?????? 5: '+DATA/B/DATAFILE/system.286.902008683'
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,048] INFO: [RMAN] RMAN-03009: failure of backup command on ch4 channel at 01/26/2016 06:50:55
[2016-01-26 06:53:58,048] INFO: [RMAN] ORA-19566: ???????? ????? 0 ???????????? ?????? ??? ????? +DATA/B/DATAFILE/system.283.902008669
[2016-01-26 06:53:58,048] INFO: [RMAN] RMAN-03009: failure of backup command on ch4 channel at 01/26/2016 06:50:47
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01122: ???? ???? ?????? 3 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01110: ???? ?????? 3: '+DATA/B/DATAFILE/system.284.902008669'
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,049] INFO: [RMAN] RMAN-03009: failure of backup command on ch2 channel at 01/26/2016 06:50:54
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-19566: ???????? ????? 0 ???????????? ?????? ??? ????? +DATA/B/DATAFILE/system.281.902008669
[2016-01-26 06:53:58,049] INFO: [RMAN] RMAN-03009: failure of backup command on ch2 channel at 01/26/2016 06:43:28
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01122: ???? ???? ?????? 8 ?? ?????? ???????? ??? ???????????
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01110: ???? ?????? 8: '+DATA/B/DATAFILE/apps_ts_queues.287.902008683'
[2016-01-26 06:53:58,049] INFO: [RMAN] ORA-01210: ????????? ????? ?????? ????????? ?? ???????????? ????????
[2016-01-26 06:53:58,066] INFO: [RMAN]
[2016-01-26 06:53:58,066] INFO: [RMAN] RMAN>
[2016-01-26 06:53:58,067] INFO: [RMAN]
[2016-01-26 06:53:58,067] INFO: [RMAN] Recovery Manager complete.

I’ve decided to re-cataloged datafile copy and have tried to restore those files again, but got the same problem.

RMAN> restore datafile 7,10,9,6,5,8;

Starting restore at 26 jan 2016 11:41
using channel ORA_DISK_1

channel ORA_DISK_1: restoring datafile 00007
input datafile copy RECID=454566 STAMP=902144368 file name=/opt/backup/db_rolling_f_backup_data_D_I-2598577434_TS-OWAPUB_FNO-7_prqqdddb
destination for restore of datafile 00007: +DATA/B/DATAFILE/owapub.285.902008683
channel ORA_DISK_1: copied datafile copy of datafile 00007
output file name=+DATA/B/DATAFILE/owapub.285.902008683 RECID=0 STAMP=0
channel ORA_DISK_1: restoring datafile 00010
input datafile copy RECID=454565 STAMP=902144344 file name=/opt/backup/db_rolling_f_backup_data_D_I-2598577434_TS-OLAP_FNO-10_pbqqddbr
destination for restore of datafile 00010: +DATA/B/DATAFILE/olap.290.902008687
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00601: fatal error in recovery manager
RMAN-03004: fatal error during execution of command
ORA-01092: ORACLE instance terminated. Disconnection forced
ORACLE error from target database:
ORA-03135: connection lost contact
Process ID: 1064
Session ID: 2326 Serial number: 5651

Errors in the ASM alert log.

 language=sql
WARNING: cache read a corrupt block: group=2(DATA) fn=286 indblk=0 disk=1 (DISK2) incarn=3916093435 au=2645 blk=0 count=7
Tue Jan 26 13:27:18 2016
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
NOTE: a corrupted block from group DATA was dumped to /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) fn=286 indblk=0 disk=1 (DISK2) incarn=3916093435 au=2645 blk=0 count=1
Tue Jan 26 13:27:18 2016
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
WARNING: cache read (retry) a corrupt block: group=2(DATA) fn=286 indblk=0 disk=10 (DISK11) incarn=3916093431 au=2652 blk=0 count=1
Tue Jan 26 13:27:18 2016
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
NOTE: a corrupted block from group DATA was dumped to /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc
WARNING: cache read (retry) a corrupt block: group=2(DATA) fn=286 indblk=0 disk=10 (DISK11) incarn=3916093431 au=2652 blk=0 count=1
Tue Jan 26 13:27:18 2016
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_13607.trc:
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
ERROR: cache failed to read group=2(DATA) fn=286 indblk=0 from disk(s): 1(DISK2) 10(DISK11)
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [286] [2147483648] [32 != 1]

And inside the trace file.

SQL> alter diskgroup DATA dismount force /* ASM SERVER:1705649721 */
Tue Jan 26 11:41:34 2016
NOTE: client OEBS:BCKP1H_OEBS:balancebckp1h dismounting group 2 (DATA)
Tue Jan 26 11:41:34 2016
NOTE: client +ASM1:asmvol dismounting group 2 (DATA)
NOTE: client +ASM1:asmvol deregistered
Tue Jan 26 11:41:35 2016
ERROR: ORA-15130 in COD recovery for diskgroup 2/0x65aa2639 (DATA)
ERROR: ORA-15130 thrown in RBAL for group number 2
Tue Jan 26 11:41:35 2016
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_24165.trc:
ORA-15130: diskgroup "DATA" is being dismounted
Tue Jan 26 11:41:36 2016
NOTE: AMDU dump of disk group DATA initiated at /opt/oracle/base/diag/asm/+asm/+ASM1/trace
Errors in file /opt/oracle/base/diag/asm/+asm/+ASM1/trace/+ASM1_ora_17158.trc  (incident=6697):
ORA-15335: ASM metadata corruption detected in disk group 'DATA'
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DISK2" in group "DATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
Incident details in: /opt/oracle/base/diag/asm/+asm/+ASM1/incident/incdir_6697/+ASM1_ora_17158_i6697.trc
Tue Jan 26 11:41:37 2016
-------------------------------------------------------------------------------
Trace Bucket Dump End: KSI
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:29297] [endian_kfbh] [290] [2147483648] [0 != 1]
----- Abridged Call Stack Trace -----
ksedsts()+244<-kfcReadBlk()+1931<-kfcReadBuffer()+3292<-kfcGet0Priv()+9313<-kfcGet1Priv()+1697<-kfcGetPriv()+882<-kffLoadInDirXPs()+1944<-kffSendMap()+2614<-kffFaultXPMap()+2582<-kfnsFaultXPMap()+1099<-kfnDispatchInt()+663<-kfnDispatch()+67<-opiodr()+1165<-ttcpip()+2699
<-opitsk()+1734<-opiino()+945<-opiodr()+1165<-opidrv()+587<-sou2o()+145<-opimai_real()+154<-ssthrdmain()+412<-main()+236<-__libc_start_main()+253
----- End of Abridged Call Stack Trace -----
NOTE: process _user17158_+asm1 (17158) initiating offline of disk 11.3916093264 (DISK12) with mask 0x7e in group 2 (DATA) with client assisting
NOTE: process _user17158_+asm1 (17158) initiating offline of disk 1.3916093267 (DISK2) with mask 0x7e in group 2 (DATA) with client assisting
NOTE: call GMON to check too many offlines in grp 2
NOTE: checking PST: grp = 2
kfdp_updateDsk(): callcnt 21 grp 2
NOTE: checking PST for grp 2 done.
WARNING: GMON returns TRUE for too many offlines for group 2 set flags local 1
NOTE: OFFLINING flag (0x5011) for disk DISK12 set only locally.
NOTE: OFFLINING flag (0x5011) for disk DISK2 set only locally.
NOTE: initiating PST update: grp 2 (DATA), dsk = 11/0xe96ad750, mask = 0x6a, op = clear
NOTE: initiating PST update: grp 2 (DATA), dsk = 1/0xe96ad753, mask = 0x6a, op = clear
kfdp_updateDsk(): callcnt 22 grp 2
DDE rules only execution for: ORA 15066

Ok, lets try to use new 12c feature scrubbing.

SQL> alter diskgroup data scrub repair power high wait
Tue Jan 26 13:50:52 2016
SQL> alter diskgroup data scrub repair power high wait
Tue Jan 26 13:50:52 2016
NOTE: Start scrubbing diskgroup DATA
NOTE: Waiting for scrubbing to finish
Tue Jan 26 13:50:52 2016
SCRB started with pid=34, OS id=9174
Tue Jan 26 13:50:52 2016
SUCCESS: alter diskgroup data scrub disk disk2 repair power high
Tue Jan 26 13:50:59 2016
NOTE: Corrupted block 65928 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_2_seq_304812.264.902008353
Tue Jan 26 13:51:00 2016
NOTE: Scrubbing block 65928 in file 264.902008353 in slave
Tue Jan 26 13:51:00 2016
NOTE: Corrupted block 92552 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_2_seq_304808.265.902008353
Tue Jan 26 13:51:00 2016
NOTE: Successfully scrubbed block 65928 in file 264.902008353
Tue Jan 26 13:51:01 2016
NOTE: Scrubbing block 92552 in file 265.902008353 in slave
NOTE: Successfully scrubbed block 92552 in file 265.902008353
Tue Jan 26 13:51:05 2016
NOTE: Corrupted block 315760 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_1_seq_291844.259.902008351
Tue Jan 26 13:51:06 2016
NOTE: Scrubbing block 315760 in file 259.902008351 in slave
NOTE: Successfully scrubbed block 315760 in file 259.902008351
Tue Jan 26 13:51:11 2016
NOTE: Corrupted block 315768 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_1_seq_291844.259.902008351
Tue Jan 26 13:51:12 2016
NOTE: Scrubbing block 315768 in file 259.902008351 in slave
NOTE: Successfully scrubbed block 315768 in file 259.902008351
Tue Jan 26 13:51:23 2016
NOTE: Corrupted block 1239440 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_2_seq_304811.266.902008355
Tue Jan 26 13:51:24 2016
NOTE: Scrubbing block 1239440 in file 266.902008355 in slave
NOTE: Successfully scrubbed block 1239440 in file 266.902008355
Tue Jan 26 13:51:36 2016
NOTE: Corrupted block 1216896 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_1_seq_291844.259.902008351
Tue Jan 26 13:51:37 2016
NOTE: Scrubbing block 1216896 in file 259.902008351 in slave
NOTE: Successfully scrubbed block 1216896 in file 259.902008351
Tue Jan 26 13:51:42 2016
NOTE: Corrupted block 1866128 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_2_seq_304811.266.902008355
Tue Jan 26 13:51:43 2016
NOTE: Scrubbing block 1866128 in file 266.902008355 in slave
NOTE: Successfully scrubbed block 1866128 in file 266.902008355
Tue Jan 26 13:51:54 2016
NOTE: Corrupted block 1911168 found in file +DATA/B/ARCHIVELOG/2016_01_24/thread_1_seq_291844.259.902008351
Tue Jan 26 13:51:55 2016
NOTE: Scrubbing block 1911168 in file 259.902008351 in slave
NOTE: Successfully scrubbed block 1911168 in file 259.902008351
...
...
...
Tue Jan 26 15:35:54 2016
NOTE: Corrupted block 128 found in file +DATA/B/DATAFILE/SYSTEM.281.902008669
Tue Jan 26 15:35:55 2016
NOTE: Scrubbing block 128 in file 281.902008669 in slave
NOTE: Successfully scrubbed block 128 in file 281.902008669
Tue Jan 26 15:36:14 2016
NOTE: Corrupted block 129 found in file +DATA/B/DATAFILE/SYSTEM.281.902008669
Tue Jan 26 15:36:15 2016
NOTE: Scrubbing block 129 in file 281.902008669 in slave
NOTE: Successfully scrubbed block 129 in file 281.902008669

And if we have look into trace file +ASM1_scv4_17458.trc we can notice what’s going on

HARD VIOLATION(S) DETECTED!
CORRUPTED BLOCK INFORMATION:
  memory address: 0x7f2d8536f000
  block size: 8192
  block number: 171 [0xab]
  block offset in this write request of 1 block(s): 0
  file type: datafile copy (type value: 12)
REQUESTED CHECK(S):
  Magic number  Block size  Block number  Checksum  Head-tail match
FAILED CHECK(S):
  BLOCK SIZE CHECK FAILED
   FOUND BLOCK SIZE 512 (ENCODED AS 1) AT OFFSET 0x1 IN BUFFER
   BUT BLOCK SIZE IS EXPECTED TO BE 8192 (ENCODED AS 5) ACCORDING TO IO TAG
  BLOCK NUMBER CHECK FAILED
   FOUND BLOCK NO 4194105 [0x3fff39] ENCODED IN DBA 150994745 [0x8ffff39] AT OFFSET 0x4 IN BUFFER
   BUT BLOCK NO IS EXPECTED TO BE 171 [0xab] ACCORDING TO IO TAG
  HEAD-TAIL MATCH CHECK FAILED
   FOUND TAIL CONSISTENCY VALUE 0x4e495f50 at OFFSET 0x1ffc IN BUFFER
   BUT CONSISTENCY VALUE COMPUTED FROM HEADER IS 0x9c7323a

There are a lot of files which has been repaired. But not all is went well.

WARNING: cache read a corrupt block: group=1(DATA) fn=290 indblk=0 disk=11 (DISK12) incarn=3931425692 au=4012 blk=0 count=1
...
WARNING: cache read (retry) a corrupt block: group=1(DATA) fn=290 indblk=0 disk=11 (DISK12) incarn=3931425692 au=4012 blk=0 count=1
...
WARNING: cache read (retry) a corrupt block: group=1(DATA) fn=290 indblk=0 disk=1 (DISK2) incarn=3931425695 au=4014 blk=0 count=1
...
WARNING: cache read (retry) a corrupt block: group=1(DATA) fn=290 indblk=0 disk=1 (DISK2) incarn=3931425695 au=4014 blk=0 count=1
...
ERROR: disk 1(DISK2) in group 1(DATA) cannot be offlined because all disks [1(DISK2), 11(DISK12)] with mirrored data would be offline.

In my case, both versions of block, original and mirrored has been corrupted.

I’ve payed attention for this

FAILED CHECK(S):
  BLOCK SIZE CHECK FAILED
   FOUND BLOCK SIZE 512 (ENCODED AS 1) AT OFFSET 0x1 IN BUFFER
   BUT BLOCK SIZE IS EXPECTED TO BE 8192 (ENCODED AS 5) ACCORDING TO IO TAG

To be continued


 

Advertisements

5 thoughts on “ASM data scrubbing for fixing ORA-15196: invalid ASM block header

  1. Игорь Усольцев

    Maxim, were these ORA-15196 related to the ASM diskgroup with disks bigger than 2Tb?

    Reply
    1. mczimm Post author

      Yes Игорь. The problem has appeared on diskgroup with disks bigger than 2Tb. But I have another system with the same software version, and PSU number, and there are disks bigger than 2Tb too, and there hasn’t appeared this problem yet.

      Reply
  2. mczimm Post author

    Игорь, of course I’m considered it. I’ve installed PSU5 which resolved this problem. Please read this document Doc ID 2065748.1

    Reply
  3. Pingback: ORA-15196: invalid ASM block header. Continued investigation. | Yet Another Oracle Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s