My server went into a faulty state because the database could not write to its partition. I found that the partition had gone into read-only mode. In the end, I had to do a hard reboot to fix it.

Linux 2.6.18-164.el5PAE #1 SMP Tue Aug 18 15:59:11 EDT 2009 i686 i686 i386 GNU/Linux

/var/log/messages

Oct 31 00:56:45 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network 
Oct 31 00:57:05 ota3g1 Had[17275]: VCS CRITICAL V-16-1-50086 CPU usage on ota3g1.mtsallstream.com is 100% 
Oct 31 01:01:47 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network 
Oct 31 01:06:50 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network 
Oct 31 01:11:52 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network 
Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x80000 x0 x0
Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1303 Link Up Event x3 received Data: x3 x1 x10 x1 x0 x0 0
Oct 31 01:12:12 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x80000 x0 x0
Oct 31 01:12:40 ota3g1 kernel:  rport-8:0-0: blocked FC remote port time out: saving binding
Oct 31 01:12:40 ota3g1 kernel: lpfc 0000:29:00.1: 1:(0):0203 Devloss timeout on WWPN 20:25:00:a0:b8:74:f5:65 NPort x0000e4 Data: x0 x7 x0
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 38617577
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283532153
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 90825
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16.
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 868841
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10.
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37759889
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283349449
Oct 31 01:12:40 ota3g1 kernel: printk: 6 messages suppressed.
Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-12.
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_reserve_inode_write: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 1545
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 12745
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-10, logical block 1545
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_reserve_inode_write: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-10
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_dirty_inode: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37757897
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 1097
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0
Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_dirty_inode: Journal has aborted
Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121
Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0
Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-12
Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000
Oct 31 01:12:41 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089
Oct 31 01:12:41 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0
Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-16
Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000

df -h

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/cciss-root
                      4.9G  730M  3.9G  16% /
/dev/mapper/cciss-home
                      9.7G  1.2G  8.1G  13% /home
/dev/mapper/cciss-var
                      9.7G  494M  8.8G   6% /var
/dev/mapper/cciss-usr
                       15G  2.6G   12G  19% /usr
/dev/mapper/cciss-tmp
                      3.9G  153M  3.6G   5% /tmp
/dev/sda1             996M   43M  902M   5% /boot
tmpfs                 5.9G     0  5.9G   0% /dev/shm
/dev/mapper/cciss-product
                       25G   16G  7.4G  68% /product
/dev/mapper/cciss-opt
                       20G  4.5G   14G  25% /opt
/dev/mapper/dg_db1-vol_db1_system
                       18G  2.2G   15G  14% /database/OTADB/sys
/dev/mapper/dg_db1-vol_db1_undo
                       18G  5.8G   12G  35% /database/OTADB/undo
/dev/mapper/dg_db1-vol_db1_redo
                      8.9G  4.3G  4.2G  51% /database/OTADB/redo
/dev/mapper/dg_db1-vol_db1_sgbd
                      8.9G  654M  7.8G   8% /database/OTADB/admin
/dev/mapper/dg_db1-vol_db1_arch
                       98G   24G   69G  26% /database/OTADB/arch
/dev/mapper/dg_db1-vol_db1_indexes
                      240G   14G  214G   6% /database/OTADB/index
/dev/mapper/dg_db1-vol_db1_data
                      275G   47G  215G  18% /database/OTADB/data
/dev/mapper/dg_dbrman-vol_db_rman
                      8.9G  351M  8.1G   5% /database/RMAN
/dev/mapper/dg_app1-vol_app1
                      151G  113G   31G  79% /files/ota

/etc/fstab

/dev/cciss/root         /                       ext3    defaults        1 1
/dev/cciss/home         /home                   ext3    defaults        1 2
/dev/cciss/var          /var                    ext3    defaults        1 2
/dev/cciss/usr          /usr                    ext3    defaults        1 2
/dev/cciss/tmp          /tmp                    ext3    defaults        1 2
LABEL=/boot             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/cciss/swap         swap                    swap    defaults        0 0
/dev/cciss/product              /product                ext3    defaults        1 2
/dev/cciss/opt          /opt            ext3    defaults        1 2
/dev/dg_db1/vol_db1_system              /database/OTADB/sys             ext3    defaults        1 2
/dev/dg_db1/vol_db1_undo                /database/OTADB/undo            ext3    defaults        1 2
/dev/dg_db1/vol_db1_redo                /database/OTADB/redo            ext3    defaults        1 2
/dev/dg_db1/vol_db1_sgbd                /database/OTADB/admin           ext3    defaults        1 2
/dev/dg_db1/vol_db1_arch                /database/OTADB/arch            ext3    defaults        1 2
/dev/dg_db1/vol_db1_indexes             /database/OTADB/index           ext3    defaults        1 2
/dev/dg_db1/vol_db1_data                /database/OTADB/data            ext3    defaults        1 2
/dev/dg_dbrman/vol_db_rman              /database/RMAN          ext3    defaults        1 2
/dev/dg_app1/vol_app1           /files/ota              ext3    defaults        1 2

Thanks for all the help.

sysadmin1138
Dev G

1 Answer

Your Emulex card (log line: lpfc 0000:29:00.1: 1:1305) saw a disconnect event on its Fibre Channel port. The ext3 filesystems on that storage therefore could not write their journals; the kernel aborted the journals, which is why the filesystems went read-only. You will probably have to fsck the affected filesystems once the connection is back up. As with all hard disconnect events, there is a risk of data loss, but it should be limited to unflushed dirty pages.
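To see which filesystems are candidates for an fsck, you can pull the devices named in the "Aborting journal" lines out of /var/log/messages. Here is a minimal sketch in Python; the function name `aborted_journal_devices` is my own, and the sample lines are copied from the log in the question. It only lists the dm-N names as the kernel printed them. Mapping those to logical volumes is a separate step (for example with `dmsetup ls`, which shows the minor number for each mapped device).

```python
import re

# Matches kernel lines such as "Aborting journal on device dm-16."
ABORT_RE = re.compile(r"Aborting journal on device (\S+?)\.?$")


def aborted_journal_devices(log_lines):
    """Return device names whose journal was aborted, in first-seen order.

    Hypothetical helper for illustration; it only inspects the text of
    syslog lines and does not touch any device.
    """
    seen = []
    for line in log_lines:
        match = ABORT_RE.search(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample lines taken from the log posted in the question.
sample = [
    "Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16.",
    "Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 868841",
    "Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10.",
    "Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-12.",
]

print(aborted_journal_devices(sample))
```

On the real system you would feed it the whole file, e.g. `aborted_journal_devices(open("/var/log/messages"))`. Note the log also reports "6 messages suppressed", so the list may be incomplete; checking every filesystem on the affected LUN is the safer choice.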

As for the Oracle environment (which is what your LVM naming scheme suggests to me), I'm not qualified to guess how much hot water you're in.

sysadmin1138