Friday, January 28, 2011

Surprising corruption and never-ending fsck after resizing a filesystem.

The system in question runs Debian Lenny with a 2.6.27.38 kernel, and has 16GB of memory and 8x1TB drives behind a 3Ware RAID card.

The storage is managed via LVM, and consists exclusively of ext3 filesystems.

Short version:

  • We were running a KVM guest which had 1.7TB of storage allocated to it.
  • The guest was close to filling its disk.
  • So we decided to resize the disk it was running on.

We're pretty familiar with LVM and KVM, so we figured this would be a painless operation (a sketch of the commands follows the list):

  • Stop the KVM guest.
  • Extend the size of the LVM logical volume: "lvextend -L+500G ..."
  • Check the filesystem : "e2fsck -f /dev/mapper/..."
  • Resize the filesystem: "resize2fs /dev/mapper/..."
  • Start the guest.
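
For concreteness, here's a minimal sketch of that sequence; the volume group and logical volume names (vg0/guest-disk) and the way the guest is restarted are placeholders rather than the real ones on this host:

    # guest stopped first, so nothing is using the device
    lvextend -L +500G /dev/vg0/guest-disk

    # ext3 must pass a forced check before an offline resize2fs will proceed
    e2fsck -f /dev/vg0/guest-disk

    # grow the filesystem to fill the enlarged volume
    resize2fs /dev/vg0/guest-disk

    # restart the guest (via libvirt here; adjust to however the guest is run)
    virsh start guest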

The guest booted successfully, and running "df" showed the extra space. However, a short time later the system remounted the filesystem read-only, without any explicit indication of an error.
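
In hindsight the trigger for the read-only remount should be visible from inside the guest; roughly where to look, using /dev/vda1 as a stand-in for however the disk actually appears in the guest:

    # inside the guest: ext3 logs the error that caused errors=remount-ro to fire
    dmesg | grep -i ext3

    # the superblock also records whether the filesystem was marked as having errors
    tune2fs -l /dev/vda1 | grep -i -E 'state|errors'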

Being paranoid, we shut the guest down and ran the filesystem check again. Given the new size of the filesystem we expected this to take a while, but it has now been running for more than 24 hours and there is no indication of how long it will take.

Using strace I can see the fsck is "doing stuff"; similarly, running "vmstat 1" shows a lot of block input/output operations occurring.
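
For the record, these are roughly the checks in question (assuming a single e2fsck process is running); starting the check with "e2fsck -C 0 -f ..." next time would at least print a progress indicator:

    # attach to the running fsck and confirm it is still issuing I/O syscalls
    strace -p "$(pidof e2fsck)" -e trace=read,write,lseek

    # watch the bi/bo (blocks in/out) columns for sustained activity
    vmstat 1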

So now my question is threefold:

  • Has anybody come across a similar situation? We've done this kind of resize in the past with zero issues.

  • What is the most likely cause? (The 3Ware card reports the RAID arrays backing the store as healthy, the host system hasn't rebooted, and nothing in dmesg looks important/unusual.)

  • Ignoring btrfs (not mature enough to trust) and ext3 itself, should we put our larger partitions on a different filesystem in the future, either to avoid this corruption (whatever the cause) or to reduce the fsck time? xfs seems like the obvious candidate?

From Jim:

  • It seems that volumes larger than 1TB have problems with virtio:

    https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/574665

    http://kerneltrap.org/mailarchive/linux-kvm/2010/4/23/6261185/thread

    http://sourceforge.net/tracker/index.php?func=detail&aid=2933400&group_id=180599&atid=893831
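
If the guest's disk is attached via libvirt (an assumption; the post doesn't say how the guest is defined), it's easy to see whether the 1.7TB volume is on the virtio bus, and to take virtio out of the picture for a test boot to rule that bug in or out:

    # show how the disk is attached to the guest
    virsh dumpxml guest | grep -A 3 '<disk'

    # a virtio disk appears as something like:
    #   <target dev='vda' bus='virtio'/>
    # temporarily changing bus='virtio' to bus='ide' (and dev='vda' to dev='hda')
    # with "virsh edit guest" avoids the virtio path for a test boot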
From Rodger:

  • In this case it's probably virtio and the 1TB problem.

    That said, I have come across similar problems when accessing a device alternately from outside a virtual machine and from inside it (shutting the machine down in between). If the virtual machine accesses the block device directly (e.g. configured in KVM without cache/buffers) while the host accesses it through its buffer cache, you can get the following problem:

    • Resize the device outside the VM; the cache/buffers on the KVM host get filled.

    • Start the VM, notice (other!) problems, and shut it down.

    • fsck the device.

    If everything went very badly, you read data from the host's cache that had already been changed by the previously-running virtual machine, which accessed the device without those buffers/cache!

    I also do a lot of ext3 resizes (since 2.6.18), and I do them ONLINE every time! AFAIK an online resize uses kernel code, while an offline resize uses userland code.

  • Also check your KVM cache settings; KVM can do no caching, read caching, writeback, and writethrough caching, which can take you rather by surprise.
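
Taking Rodger's two points together, a hedged sketch of the mitigations they imply; the device path is the same placeholder as above, and whether either step is needed depends on how the guest is actually configured:

    # after shutting the guest down, invalidate the host's buffer cache for the
    # device before running fsck, so stale host-side buffers can't be read back
    blockdev --flushbufs /dev/vg0/guest-disk

    # and be explicit about the caching mode when starting the guest; with plain
    # qemu-kvm the cache mode is part of the -drive option (other options omitted)
    kvm -m 2048 -drive file=/dev/vg0/guest-disk,if=virtio,cache=none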
