Voting hack

A few days ago I ran into an interesting problem with my voting files. After restarting the cluster software on one of my nodes, I couldn't bring the node back online because the voting files had been lost.

2013-12-27 14:16:01.118:
[cssd(25186)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /opt/oracle/product/grid/11.2.0.4/log/server1/cssd/ocssd.log
2013-12-27 14:16:16.175:
[cssd(25186)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /opt/oracle/product/grid/11.2.0.4/log/server1/cssd/ocssd.log
2013-12-27 14:16:31.229:
[cssd(25186)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /opt/oracle/product/grid/11.2.0.4/log/server1/cssd/ocssd.log
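Each CRS-1714 message points at a more detailed log file. As a small sketch (the helper name `crs_detail_logs` is my own invention, not an Oracle tool), the following shell function reads alert-log text on stdin and prints every distinct detail log it mentions, so you know where to look next:

```shell
# Hypothetical helper (not part of Oracle's tooling): read alert-log text on
# stdin and print each distinct "Details at (...) in <path>" log file once.
crs_detail_logs() {
    sed -n 's/.*Details at (:[A-Z0-9]*:) in \(.*\)$/\1/p' | sort -u
}

# Example usage; the alert log location is an assumption, adjust it to your install:
# crs_detail_logs < /path/to/your/alert.log
```

For the messages above it would print the single ocssd.log path, however many times the message repeats.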

I had no idea what was happening. In one of my “screen” consoles I still had the output log from the root.sh script, and there I could find information about my voting files:

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4256: Updating the profile
Successful addition of voting disk 0ad3c76d429d4f0fbf6c31ec5966aa72.
Successful addition of voting disk 7b53c5a105a64fe9bf7fb50e1164da95.
Successful addition of voting disk 372a10ce47c34f32bf83365e28c6ef06.
Successfully replaced voting disk group with +ASM_GRP.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   0ad3c76d429d4f0fbf6c31ec5966aa72 (ORCL:DISK_09) [ASM_GRP]
 2. ONLINE   7b53c5a105a64fe9bf7fb50e1164da95 (ORCL:DISK_10) [ASM_GRP]
 3. ONLINE   372a10ce47c34f32bf83365e28c6ef06 (ORCL:DISK_11) [ASM_GRP]
Located 3 voting disk(s).
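Since the File Universal Ids in this listing turn out to matter later, it can help to pull them out of a saved log rather than copy them by hand. A minimal sketch, assuming the table layout shown above (the function name `list_voting_files` is hypothetical):

```shell
# Hypothetical helper: read a saved root.sh log or the output of
# `crsctl query css votedisk` on stdin and print one line per voting file:
#   <File Universal Id> <disk name> <disk group>
list_voting_files() {
    awk '$1 ~ /^[0-9]+\.$/ && $3 ~ /^[0-9a-f]+$/ && length($3) == 32 {
        disk = $4; gsub(/\(|\)/, "", disk)   # strip the surrounding ( )
        dg = $5;   gsub(/\[|\]/, "", dg)     # strip the surrounding [ ]
        print $3, disk, dg
    }'
}

# Example usage (the log file name is an assumption):
# list_voting_files < rootsh_output.log
```

Header, separator, and "Located N voting disk(s)." lines are skipped because their first field is not a row number.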

Obviously we can use that information. Let's try to recover the OCR and the voting files. First of all I need to recover my OCR information, so I start the stack in exclusive mode on this node.

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl start crs -excl

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/ocrconfig -showbackup
PROT-24: Auto backups for the Oracle Cluster Registry are not available

server1     2013/12/27 13:51:46     /opt/oracle/product/grid/11.2.0.4/cdata/ASM_GRP/backup_20131227_135146.ocr

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/ocrconfig -add +OCRVTG_ASM_GRP

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2948
         Available space (kbytes) :     259172
         ID                       :  345172360
         Device/File Name         :     +ASM_GRP
                                    Device/File integrity check succeeded
         Device/File Name         : +OCRVTG_ASM_GRP
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded


[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/ocrconfig -delete +ASM_GRP

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2948
         Available space (kbytes) :     259172
         ID                       :  345172360
         Device/File Name         : +OCRVTG_ASM_GRP
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded


         Logical corruption check succeeded

OK, we've restored our OCR; it now lives in +OCRVTG_ASM_GRP. Next we need to restore, or recreate, our voting files.

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl replace votedisk +OCRVTG_ASM_GRP
Failed to create voting files on disk group OCRVTG_ASM_GRP.
Change to configuration failed, but was successfully rolled back.
CRS-4000: Command Replace failed, or completed with errors.

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl delete css votedisk 0ad3c76d429d4f0fbf6c31ec5966aa72 force
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl add css votedisk +OCRVTG_ASM_GRP
CRS-4671: This command is not supported for ASM diskgroups.
CRS-4000: Command Add failed, or completed with errors.

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl add css votedisk /tmp/vote.dat force
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl query css votedisk
Located 0 voting disk(s).

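Wow! I can't add or replace voting files and, more interestingly, “crsctl” tells me I don't have any voting files at all. I spent some time on this problem: closely investigating the log files, reading the documentation, and so on. Finally I executed this command, passing the File Universal Id of one of the old voting files (taken from the root.sh log) ahead of the disk group.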

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl replace votedisk 0ad3c76d429d4f0fbf6c31ec5966aa72 +OCRVTG_ASM_GRP
Now formatting voting disk: 0ad3c76d429d4f0fbf6c31ec5966aa72.
Unable to apply correct permissions to new voting file 0ad3c76d429d4f0fbf6c31ec5966aa72.
CLSU-00100: Operating System function: scrsctl_vdiskperms 2 failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: chown
CLSU-00104: additional error information: failed to chown 0ad3c76d429d4f0fbf6c31ec5966aa72
Change to configuration failed, but was successfully rolled back.
CRS-4256: Updating the profile
Segmentation fault (core dumped)

As you can see, the process died with a segmentation fault. But after that, the magic happened: I was able to add the voting files.

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl replace votedisk +OCRVTG_ASM_GRP
CRS-4256: Updating the profile
Successful addition of voting disk 36c2aa2204934f3abf37e6003638d25e.
Successful addition of voting disk a020dc1c0d994f5cbfaa9359cdd9851e.
Successful addition of voting disk 6988e475a3e84f5abfa44635c5c2df9f.
Successfully replaced voting disk group with +ocrvtg_ASM_GRP.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

[root@server1 ~]# /opt/oracle/product/grid/11.2.0.4/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   36c2aa2204934f3abf37e6003638d25e (ORCL:VOTE_ASM_GRP_01) [OCRVTG_ASM_GRP]
 2. ONLINE   a020dc1c0d994f5cbfaa9359cdd9851e (ORCL:VOTE_ASM_GRP_02) [OCRVTG_ASM_GRP]
 3. ONLINE   6988e475a3e84f5abfa44635c5c2df9f (ORCL:VOTE_ASM_GRP_03) [OCRVTG_ASM_GRP]
Located 3 voting disk(s).
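If you want to check this state from a script instead of by eye, here is a rough sketch. It assumes the output format shown above, and the function name `votedisks_ok` is hypothetical. It succeeds only when at least one voting file is located and every located file is ONLINE:

```shell
# Hypothetical check: read `crsctl query css votedisk` output on stdin.
# Exit status 0 if "Located N voting disk(s)." reports N > 0 and exactly
# N rows are ONLINE; non-zero otherwise.
votedisks_ok() {
    awk '
        $1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { online++ }
        $1 == "Located" && $3 == "voting"   { located = $2 }
        END { exit !(located > 0 && online == located) }
    '
}

# Example usage, run as root with the grid home from this post:
# /opt/oracle/product/grid/11.2.0.4/bin/crsctl query css votedisk | votedisks_ok \
#     && echo "voting files look fine"
```

Against the earlier "Located 0 voting disk(s)." output the function returns a non-zero status, which is the case this whole story was about.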

Sometimes Oracle really surprises me 🙂
