On Sun, Nov 14, 2010 at 05:00:18PM -0500, Christoph Hellwig wrote:
> On Sun, Nov 14, 2010 at 09:42:06PM +0100, Andrea Arcangeli wrote:
> > btrfs misses this:
> >
> > + .migratepage = btree_migratepage,
> >
> > It's a bug that can trigger upstream too (not only with THP) if there
> > are hugepage allocations (like while increasing nr_hugepages). Chris
> > already fixed it with an experimental patch.
>
> If the lack of an obscure method causes data corruption something
> is seriously wrong with THP. At least from the 10.000 foot view

As I wrote above, it can happen upstream without THP. It's not
THP related at all. THP is only the consumer; this is a problem in
the migration code that will trigger just as well with migrate_pages
or any other migration API.

If more people were using hugetlbfs, they would have noticed it
without THP.

> I can't quite figure what the exact issue is, though.
> fallback_migrate_page seems to do the right thing to me for that
> case.
>
> Btw, there's also another issue with the page migration code when used
> for filesystem pages. It directly calls into ->writepage instead
> of using the flusher threads. On most filesystems this will
> "only" cause nasty I/O patterns, but on ext4 for example it will
> be more nasty as ext4 doesn't do conversions from delayed allocations to
> real ones. So unless you're doing a lot of overwrites it will be
> hard to make any progress in writeout().
+static int btree_migratepage(struct address_space *mapping,
+ struct page *newpage, struct page *page)
+{
+ /*
+ * we can't safely write a btree page from here,
+ * we haven't done the locking hook
+ */
+ if (PageDirty(page))
+ return -EAGAIN;

fallback_migrate_page would call writeout(), which is apparently not
OK in btrfs because of locking issues that lead to corruption.
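
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. struct mock_page and the two model_* functions are
hypothetical names invented for illustration. The fallback model assumes
the generic path writes a dirty page out and then asks the caller to
retry, which is a simplification of the behaviour described above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the dirty bit matters here. */
struct mock_page {
    bool dirty;
    int writeout_calls;   /* how often the model tried to write this page */
};

/*
 * Model of the generic fallback (assumption: a dirty page is written
 * out first and the caller is asked to retry the migration).
 */
int model_fallback_migrate(struct mock_page *newpage, struct mock_page *page)
{
    if (page->dirty) {
        page->writeout_calls++;   /* the step reported as unsafe for btree pages */
        page->dirty = false;
        return -EAGAIN;
    }
    newpage->dirty = false;
    return 0;
}

/*
 * Model of the quoted hunk: a dirty page is never written from here,
 * the migration is simply refused so the caller can retry later.
 */
int model_btree_migrate(struct mock_page *newpage, struct mock_page *page)
{
    if (page->dirty)
        return -EAGAIN;
    newpage->dirty = false;
    return 0;
}
```

In this model both paths refuse a dirty page with -EAGAIN; the only
difference is that the btree variant never reaches the writeout step.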
> Btw, what codepath does THP call migrate_pages from? If you don't
> use an explicit thread writeout will be a no-op on btrfs and XFS, too.

THP never calls migrate_pages itself; it's memory compaction that
calls it from inside alloc_pages(order=9). It only got noticed with
THP because THP makes more frequent hugepage allocations than
nr_hugepages in hugetlbfs does (and maybe there are more THP users
already).