
[ogfs-dev] Stabilizing some OpenGFS corner cases



As I have been working with OpenGFS, I have come across several system
crashes.  I checked in a few of the simpler fixes this morning, but I
have a couple of additional fixes on which I would like feedback.

1) OGFS_ASSERT(list_empty(&sdp->sd_log_ail),); in ogfs_shutdown_log()

An easy way to reproduce is to start "iozone -a" on an OpenGFS filesystem in
the background.  Chdir out of the OpenGFS filesystem and wait 10-20 seconds.
Kill the iozone, and unmount the filesystem immediately.  My node takes the
assert every time.

The problem is that there are dirty buffers associated with transactions on
the AIL at the time ogfs_pull_tail() is called from ogfs_put_super().  This
causes the transactions to remain on the AIL, and then ogfs_shutdown_log()
takes the assert.

My fix involves creating a new function called ogfs_ail_flush(), modeled
after ogfs_trans_check_empty() and clear_from_ail().  ogfs_put_super() calls
this function in a loop along with ogfs_pull_tail() until the AIL is empty.
Only then does ogfs_put_super() call ogfs_shutdown_log().
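
To make the ordering concrete, here is a small userspace sketch of the loop.
It is only a toy model, not the attached patch: every toy_* name and struct
is made up for illustration, and the assumption that one flush pass writes
back only one buffer per transaction is mine, chosen so that the loop needs
more than one pass.

```c
#include <assert.h>

/* Toy model only: these are NOT the OpenGFS structures. */
struct toy_trans {
    int dirty_bufs;  /* buffers not yet written back in place */
    int on_ail;      /* 1 while the transaction sits on the AIL */
};

struct toy_log {
    struct toy_trans tr[4];
    int ntrans;
};

/* Hypothetical stand-in for ogfs_ail_flush(): write back one dirty
 * buffer per AIL transaction on each call (an assumption of this sketch). */
static void toy_ail_flush(struct toy_log *log)
{
    for (int i = 0; i < log->ntrans; i++)
        if (log->tr[i].on_ail && log->tr[i].dirty_bufs > 0)
            log->tr[i].dirty_bufs--;
}

/* Hypothetical stand-in for ogfs_pull_tail(): a transaction can leave
 * the AIL only once it has no dirty buffers left. */
static void toy_pull_tail(struct toy_log *log)
{
    for (int i = 0; i < log->ntrans; i++)
        if (log->tr[i].on_ail && log->tr[i].dirty_bufs == 0)
            log->tr[i].on_ail = 0;
}

static int toy_ail_empty(const struct toy_log *log)
{
    for (int i = 0; i < log->ntrans; i++)
        if (log->tr[i].on_ail)
            return 0;
    return 1;
}

/* Stand-in for ogfs_shutdown_log(): this is where the assert fired. */
static void toy_shutdown_log(const struct toy_log *log)
{
    assert(toy_ail_empty(log));
}

/* Stand-in for the unmount path: flush and pull until the AIL is
 * empty, and only then shut the log down.  Returns the pass count. */
static int toy_put_super(struct toy_log *log)
{
    int passes = 0;

    while (!toy_ail_empty(log)) {
        toy_ail_flush(log);
        toy_pull_tail(log);
        passes++;
    }
    toy_shutdown_log(log);
    return passes;
}
```

With three transactions on the AIL holding 0, 2 and 3 dirty buffers,
toy_put_super() needs three passes in this model.  A single call to
toy_pull_tail() would have left two transactions behind, which is the
situation that trips the assert.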

I have attached a patch that I have been using for about a month without
problems.

2) OGFS_ASSERT(*block != BLKALLOC_INTERNAL_NOENT,); in ogfs_blkalloc()

This assert has since been replaced with a return of -EIO, but the problem
still remains.

This happens when the filesystem is near capacity and a reservation is made
requiring both metadata and data blocks.  try_rgrp_fit() reserves the data
blocks first, then the metadata blocks.  If there are not enough free
metadata blocks, it pulls blocks from the free data block pool in groups of
OGFS_META_CLUMP (64) until it has taken all of the free data blocks.  It is
that last partial clump that causes the problem.  Code often allocates the
metadata blocks (via ogfs_metaalloc()) before it allocates the data blocks
(via ogfs_blkalloc()).  ogfs_metaalloc() will then call clump_alloc(), which
will deplete the entire free data block pool, converting blocks that were
intended by the reservation to be used as data blocks.  

A simple fix is to disallow try_rgrp_fit() from reserving a partial clump of
metadata blocks (possibly causing reservations to fail when they strictly
should succeed).
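
As a rough illustration of that rule, the sketch below models the fit check
in userspace.  It is not the real try_rgrp_fit(): toy_rgrp, toy_rgrp_fit()
and the allow_partial switch are hypothetical names, and only the clump size
of 64 comes from OGFS_META_CLUMP as described above.

```c
#include <stdbool.h>

#define TOY_META_CLUMP 64  /* mirrors OGFS_META_CLUMP from the text */

/* Toy model only: NOT the OpenGFS resource group structure. */
struct toy_rgrp {
    unsigned free_data;  /* free data blocks */
    unsigned free_meta;  /* free metadata blocks */
};

/* Decide whether a reservation of `data` data blocks and `meta`
 * metadata blocks fits.  Data blocks are reserved first; missing
 * metadata blocks are then covered by whole clumps taken from the
 * remaining free data blocks.  With allow_partial set, a final clump
 * smaller than TOY_META_CLUMP is accepted (the old behaviour as I
 * understand it); without it, such a reservation is refused. */
static bool toy_rgrp_fit(const struct toy_rgrp *rg, unsigned data,
                         unsigned meta, bool allow_partial)
{
    unsigned free_data = rg->free_data;
    unsigned free_meta = rg->free_meta;

    if (free_data < data)
        return false;
    free_data -= data;

    while (free_meta < meta) {
        unsigned take = TOY_META_CLUMP;

        if (free_data < take) {
            /* Only a partial clump (or nothing) is left. */
            if (!allow_partial || free_data == 0)
                return false;
            take = free_data;
        }
        free_data -= take;
        free_meta += take;
    }
    return true;
}
```

In this model, a resource group with 100 free data blocks and no free
metadata blocks accepts a request for 50 data and 10 metadata blocks only
when partial clumps are allowed.  With the proposed rule the request is
refused, which is the "fails when it strictly should succeed" case, but no
reservation ever depends on a clump that a later allocation cannot honor.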

(Credit to Shobhit Dayal for finding this problem and suggesting the fix.)

I'd appreciate any feedback y'all can offer!

-steve
--
Steve Landherr -- steve-sf <at> chiquapin.com
San Francisco, California

Attachment: umount-ail-assert.patch
Description: Binary data

