On Fri, Aug 07, 2009 at 05:07:32PM +0800, Yan Zheng wrote:
> On 08/07/2009 03:19 PM, Jens Axboe wrote:
> > On Fri, Aug 07 2009, Yan Zheng wrote:
> >> On 08/07/2009 02:50 PM, Jens Axboe wrote:
> >>> On Fri, Aug 07 2009, Yan Zheng wrote:
> >>>> invalidate_inode_pages2_range may return -EBUSY occasionally
> >>>> which results Oops. This patch fixes the issue by moving
> >>>> invalidate_inode_pages2_range into a loop and keeping calling
> >>>> it until the return value is not -EBUSY.
> >>>>
> >>>> Signed-off-by: Yan Zheng <zheng.yan@xxxxxxxxxx>
> >>>>
> >>>> ---
> >>>> diff -urp 1/fs/btrfs/relocation.c 2/fs/btrfs/relocation.c
> >>>> --- 1/fs/btrfs/relocation.c 2009-07-29 10:03:04.367858774 +0800
> >>>> +++ 2/fs/btrfs/relocation.c 2009-08-07 13:26:43.882147138 +0800
> >>>> @@ -2553,8 +2553,13 @@ int relocate_inode_pages(struct inode *i
> >>>> last_index = (start + len - 1) >> PAGE_CACHE_SHIFT;
> >>>>
> >>>> /* make sure the dirty trick played by the caller work */
> >>>> - ret = invalidate_inode_pages2_range(inode->i_mapping,
> >>>> - first_index, last_index);
> >>>> + while (1) {
> >>>> + ret = invalidate_inode_pages2_range(inode->i_mapping,
> >>>> + first_index, last_index);
> >>>> + if (ret != -EBUSY)
> >>>> + break;
> >>>> + cond_resched();
> >>>> + }
> >>> If it returns EBUSY, would it not make more sense to call
> >>> filemap_write_and_wait_range() instead of hammering on invalidate?
> >>>
> >> The pages to invalidate are not dirty, they are from page read-ahead.
> >> Actually I have no idea how invalidate_inode_pages2_range can return
> >> -EBUSY here. (the only user of the inode is the balancer, and it does
> >> not hold references to the pages)
> >
> > Weird, I looked it up, and it already does a writeback wait. But I guess
> > that's not your issue. Patch still looks like a hack though, it would be
> > far better to figure out why it returns EBUSY and fix/wait appropriately
> > for that to pass.
> >
>
> EBUSY is from the EXTENT_LOCK test in try_release_extent_state. The test
> can be true is because some codes call lock_extent while corresponding
> pages are not all locked. (one example is btrfs_finish_ordered_io)
Ok, please use schedule_timeout(HZ/10) instead then.
-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html