Its a relatively common problem when something goes wrong in a SAN for ext3 to detect the disk write errors and remount the filesystem read-only. Thats all well and good, only when the SAN is fixed I can't figure out how to re-re-mount the filesystem read-write without rebooting.
Behold:
[root@localhost ~]# multipath -ll
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ 2:0:0:1 sdc 8:32 [active][ready]
[root@localhost ~]# mount /dev/mapper/mpath0 /mnt/foo
[root@localhost ~]# touch /mnt/foo/blah
All good, now I yank the LUN out from under it.
[root@localhost ~]# touch /mnt/foo/blah
[root@localhost ~]# touch /mnt/foo/blah
touch: cannot touch `/mnt/foo/blah': Read-only file system
[root@localhost ~]# tail /var/log/messages
Mar 18 13:17:33 localhost multipathd: sdb: tur checker reports path is down
Mar 18 13:17:34 localhost multipathd: sdc: tur checker reports path is down
Mar 18 13:17:35 localhost kernel: Aborting journal on device dm-2.
Mar 18 13:17:35 localhost kernel: Buffer I/O error on device dm-2, logical block 1545
Mar 18 13:17:35 localhost kernel: lost page write due to I/O error on dm-2
Mar 18 13:17:36 localhost kernel: ext3_abort called.
Mar 18 13:17:36 localhost kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
Mar 18 13:17:36 localhost kernel: Remounting filesystem read-only
It only thinks its read-only, in reality its not even there.
[root@localhost ~]# multipath -ll
sdb: checker msg is "tur checker reports path is down"
sdc: checker msg is "tur checker reports path is down"
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdb 8:16 [failed][faulty]
\_ 2:0:0:1 sdc 8:32 [failed][faulty]
[root@localhost ~]# ll /mnt/foo/
ls: reading directory /mnt/foo/: Input/output error
total 20
-rw-r--r-- 1 root root 0 Mar 18 13:11 bar
How it still remembers that 'bar' file being there... mystery, but not important right now. Now I re-present the LUN:
[root@localhost ~]# tail /var/log/messages
Mar 18 13:23:58 localhost multipathd: sdb: tur checker reports path is up
Mar 18 13:23:58 localhost multipathd: 8:16: reinstated
Mar 18 13:23:58 localhost multipathd: mpath0: queue_if_no_path enabled
Mar 18 13:23:58 localhost multipathd: mpath0: Recovered to normal mode
Mar 18 13:23:58 localhost multipathd: mpath0: remaining active paths: 1
Mar 18 13:23:58 localhost multipathd: dm-2: add map (uevent)
Mar 18 13:23:58 localhost multipathd: dm-2: devmap already registered
Mar 18 13:23:59 localhost multipathd: sdc: tur checker reports path is up
Mar 18 13:23:59 localhost multipathd: 8:32: reinstated
Mar 18 13:23:59 localhost multipathd: mpath0: remaining active paths: 2
Mar 18 13:23:59 localhost multipathd: dm-2: add map (uevent)
Mar 18 13:23:59 localhost multipathd: dm-2: devmap already registered
[root@localhost ~]# multipath -ll
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ 2:0:0:1 sdc 8:32 [active][ready]
Great right? It says [rw] right there. Not so fast:
[root@localhost ~]# touch /mnt/foo/blah
touch: cannot touch `/mnt/foo/blah': Read-only file system
OK, doesn't do it automatically, I'll just give it a little push:
[root@localhost ~]# mount -o remount /mnt/foo
mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only
The hell you are:
[root@localhost ~]# mount -o remount,rw /mnt/foo
mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only
Noooooooooo.
I have tried all sorts of different mount/tune2fs/dmsetup commands and I cannot figure out how to get it to un-flag the block device as write-protected. Rebooting will fix it, but I'd much rather do it on-line. An hour of googling has gotten me nowhere either. Save me ServerFault.
-
Do you think it's related to
http://www.redhat.com/magazine/026dec06/features/tips_tricks/
see section on Why does the ext3 filesystems on my Storage Area Network (SAN) repeatedly become read-only?
it's quite an old article, and is talking about fibre channel. It may be related to your problem.
cagenut : Yep, its not that exact specific bug since I'm running much newer versions than those they reference, but all sorts of similar situations can cause it. The world of fibre-channel, hbas/hba-firmware/hba-drivers, array firmware, switch firmware, fabric design, device-mapper/multipathd config, lvm, and ext3 is just plain a lot of moving parts. Work on enough environments and you'll see this scenario caused by a grab bag of similar but not identical problems. The question at hand is, how to recover/remount without rebooting. -
mount -o remount,rw /mnt/fo
Chris S : I know FreeBSD, not Linux. But for fBSD it's `mount -rw /mnt/foo`, so this one looks the most right to me.Alex : I have never had this work in the scenario outlined in the question. Once the disk is marked read-only due to errors, it has always taken a reboot for me.cagenut : I'll edit this into the OP, but Alex is right here, the problem appears to be below the filesystem: [root@localhost ~]# mount -o remount,rw /mnt/foo mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only -
Your problem is not the remounting, it's the multipath device which is still read-only.
cagenut : yes I got that. the question is how to change it.Teddy : @cagenut: No, that wasn't what you asked. That may be what you should have asked, though.cagenut : Right there in the post Teddy: "I cannot figure out how to get it to un-flag the block device as write-protected."From Teddy
0 comments:
Post a Comment