diff options
author | Andrew Morton <akpm@osdl.org> | 2006-09-30 23:29:29 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@g5.osdl.org> | 2006-10-01 00:39:33 -0700 |
commit | bd4c8ce41a2e2f0c5bf54343ab54e8e09faec021 (patch) | |
tree | e8a7d3cfafbe6ee35672953718ba4223e450d938 /mm/truncate.c | |
parent | d025c9db7f31fc0554ce7fb2dfc78d35a77f3487 (diff) | |
download | kernel_samsung_smdk4412-bd4c8ce41a2e2f0c5bf54343ab54e8e09faec021.tar.gz kernel_samsung_smdk4412-bd4c8ce41a2e2f0c5bf54343ab54e8e09faec021.tar.bz2 kernel_samsung_smdk4412-bd4c8ce41a2e2f0c5bf54343ab54e8e09faec021.zip |
[PATCH] invalidate_inode_pages2(): ignore page refcounts
The recent fix to invalidate_inode_pages() (git commit 016eb4a) managed to
unfix invalidate_inode_pages2().
The problem is that various bits of code in the kernel can take transient refs
on pages: the page scanner will do this when inspecting a batch of pages, and
the lru_cache_add() batching pagevecs also hold a ref.
Net result is transient failures in invalidate_inode_pages2(). This affects
NFS directory invalidation (observed) and presumably also block-backed
direct-io (not yet reported).
Fix it by reverting invalidate_inode_pages2() back to the old version which
ignores the page refcounts.
We may come up with something more clever later, but for now we need a 2.6.18
fix for NFS.
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'mm/truncate.c')
-rw-r--r-- | mm/truncate.c | 34 |
1 files changed, 32 insertions, 2 deletions
diff --git a/mm/truncate.c b/mm/truncate.c index 8fde6580657..f4edbc179d1 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -287,9 +287,39 @@ unsigned long invalidate_inode_pages(struct address_space *mapping) { return invalidate_mapping_pages(mapping, 0, ~0UL); } - EXPORT_SYMBOL(invalidate_inode_pages); +/* + * This is like invalidate_complete_page(), except it ignores the page's + * refcount. We do this because invalidate_inode_pages2() needs stronger + * invalidation guarantees, and cannot afford to leave pages behind because + * shrink_list() has a temp ref on them, or because they're transiently sitting + * in the lru_cache_add() pagevecs. + */ +static int +invalidate_complete_page2(struct address_space *mapping, struct page *page) +{ + if (page->mapping != mapping) + return 0; + + if (PagePrivate(page) && !try_to_release_page(page, 0)) + return 0; + + write_lock_irq(&mapping->tree_lock); + if (PageDirty(page)) + goto failed; + + BUG_ON(PagePrivate(page)); + __remove_from_page_cache(page); + write_unlock_irq(&mapping->tree_lock); + ClearPageUptodate(page); + page_cache_release(page); /* pagecache ref */ + return 1; +failed: + write_unlock_irq(&mapping->tree_lock); + return 0; +} + /** * invalidate_inode_pages2_range - remove range of pages from an address_space * @mapping: the address_space @@ -356,7 +386,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping, } } was_dirty = test_clear_page_dirty(page); - if (!invalidate_complete_page(mapping, page)) { + if (!invalidate_complete_page2(mapping, page)) { if (was_dirty) set_page_dirty(page); ret = -EIO; |