| Commit message (Collapse) | Author | Age | Files | Lines |
| |
nfs_migrate_page() does not wait for FS-Cache to finish with a page, probably
leading to the following bad-page-state:
BUG: Bad page state in process python-bin pfn:17d39b
page:ffffea00053649e8 flags:004000000000100c count:0 mapcount:0 mapping:(null)
index:38686 (Tainted: G B ---------------- )
Pid: 31053, comm: python-bin Tainted: G B ----------------
2.6.32-71.24.1.el6.x86_64 #1
Call Trace:
[<ffffffff8111bfe7>] bad_page+0x107/0x160
[<ffffffff8111ee69>] free_hot_cold_page+0x1c9/0x220
[<ffffffff8111ef19>] __pagevec_free+0x59/0xb0
[<ffffffff8104b988>] ? flush_tlb_others_ipi+0x128/0x130
[<ffffffff8112230c>] release_pages+0x21c/0x250
[<ffffffff8115b92a>] ? remove_migration_pte+0x28a/0x2b0
[<ffffffff8115f3f8>] ? mem_cgroup_get_reclaim_stat_from_page+0x18/0x70
[<ffffffff81122687>] ____pagevec_lru_add+0x167/0x180
[<ffffffff811226f8>] __lru_cache_add+0x58/0x70
[<ffffffff81122731>] lru_cache_add_lru+0x21/0x40
[<ffffffff81123f49>] putback_lru_page+0x69/0x100
[<ffffffff8115c0bd>] migrate_pages+0x13d/0x5d0
[<ffffffff81122687>] ? ____pagevec_lru_add+0x167/0x180
[<ffffffff81152ab0>] ? compaction_alloc+0x0/0x370
[<ffffffff8115255c>] compact_zone+0x4cc/0x600
[<ffffffff8111cfac>] ? get_page_from_freelist+0x15c/0x820
[<ffffffff810672f4>] ? check_preempt_wakeup+0x1c4/0x3c0
[<ffffffff8115290e>] compact_zone_order+0x7e/0xb0
[<ffffffff81152a49>] try_to_compact_pages+0x109/0x170
[<ffffffff8111e94d>] __alloc_pages_nodemask+0x5ed/0x850
[<ffffffff814c9136>] ? thread_return+0x4e/0x778
[<ffffffff81150d43>] alloc_pages_vma+0x93/0x150
[<ffffffff81167ea5>] do_huge_pmd_anonymous_page+0x135/0x340
[<ffffffff814cb6f6>] ? rwsem_down_read_failed+0x26/0x30
[<ffffffff81136755>] handle_mm_fault+0x245/0x2b0
[<ffffffff814ce383>] do_page_fault+0x123/0x3a0
[<ffffffff814cbdf5>] page_fault+0x25/0x30
nfs_migrate_page() calls nfs_fscache_release_page() which doesn't actually wait
- even if __GFP_WAIT is set. The reason it doesn't wait is that
fscache_maybe_release_page() might deadlock the allocator as the work threads
writing to the cache may all end up sleeping on memory allocation.
However, I wonder if that is actually a problem. There are a number of things
I can do to deal with this:
(1) Make nfs_migrate_page() wait.
(2) Make fscache_maybe_release_page() honour the __GFP_WAIT flag.
(3) Set a timeout around the wait.
(4) Make nfs_migrate_page() return an error if the page is still busy.
For the moment, I'll select (2) and (4).
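As a rough illustration of options (2) and (4) together, here is a minimal userspace sketch. Everything prefixed model_ or MODEL_ is a hypothetical stand-in rather than the kernel API, and the choice of -EBUSY as the error code is an assumption; the message above only says "an error".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for __GFP_WAIT; not the kernel's value. */
#define MODEL_GFP_WAIT 0x10u

struct model_page {
	bool cache_busy;  /* the cache is still using the page */
	int  busy_ticks;  /* pretend "time" until the cache lets go */
};

/* Pretend to sleep until the cache has finished with the page. */
static void model_wait_on_cache(struct model_page *page)
{
	while (page->busy_ticks > 0)
		page->busy_ticks--;
	page->cache_busy = false;
}

/* Option (2): only wait if the caller's gfp mask permits sleeping. */
static bool model_maybe_release_page(struct model_page *page, unsigned int gfp)
{
	assert(page != NULL);
	if (!page->cache_busy)
		return true;
	if (!(gfp & MODEL_GFP_WAIT))
		return false;
	model_wait_on_cache(page);
	return true;
}

/* Option (4): refuse to migrate a page the cache has not released. */
static int model_migrate_page(struct model_page *page, unsigned int gfp)
{
	if (!model_maybe_release_page(page, gfp))
		return -EBUSY;  /* assumed error code */
	return 0;
}
```

With a busy page, model_migrate_page() fails when the wait flag is clear and succeeds once the flag allows the release step to wait.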
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
| |
The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG
if there's been an I/O error because it may see the parent cache object in an
unexpected state. It should only BUG if there hasn't been an I/O error.
In this case the problem was produced by remounting the cache partition to be
R/O. The EROFS state was detected and the cache was aborted, but not
everything handled the aborting correctly.
SysRq : Emergency Remount R/O
EXT4-fs (sda6): re-mounted. Opts: (null)
Emergency Remount complete
CacheFiles: I/O Error: Failed to update xattr with error -30
FS-Cache: Cache cachefiles stopped due to I/O error
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:128!
invalid opcode: 0000 [#1] SMP
CPU 0
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY
RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache]
RSP: 0018:ffff880000853d40 EFLAGS: 00010206
RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0
RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280
RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268
R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162
FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040)
Stack:
ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260
ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198
ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308
Call Trace:
[<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache]
[<ffffffff81047e44>] process_one_work+0x1eb/0x37f
[<ffffffff81047de6>] ? process_one_work+0x18d/0x37f
[<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache]
[<ffffffff810482e4>] worker_thread+0x15a/0x21a
[<ffffffff8104818a>] ? rescuer_thread+0x188/0x188
[<ffffffff8104bf96>] kthread+0x7f/0x87
[<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10
[<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
[<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe
[<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53
[<ffffffff813ad6f0>] ? gs_change+0xb/0xb
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Limit the number of I/O error reports for a cache to 1 to prevent massive
amounts of noise. After the first I/O error the cache is taken off line
automatically, so must be restarted to resume caching.
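The once-only reporting can be pictured with a small single-threaded sketch. The struct and function names are made up for illustration, and the real code would need an atomic test-and-set rather than a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct model_cache {
	const char *name;
	bool ioerror;  /* set once the first I/O error has been seen */
	int  reports;  /* number of messages actually emitted */
};

/* Returns true only for the call that reported the error. */
static bool model_io_error(struct model_cache *cache)
{
	assert(cache != NULL);
	if (cache->ioerror)
		return false;       /* already reported; stay quiet */
	cache->ioerror = true;      /* cache is now considered off line */
	cache->reports++;
	fprintf(stderr, "cache %s stopped due to I/O error\n", cache->name);
	return true;
}
```

Clearing the flag again would correspond to restarting the cache.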
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Don't mask off the object event mask when printing it. That way it can be seen
if there are bits set that shouldn't be.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Initialise the object event mask with the calculated mask rather than unmasking
undefined events also.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Convert the fscache_object event IDs from #defines into an enum. Also add an
extra label to the enum to carry the event count and redefine the event mask
in terms of that.
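A sketch of the pattern, with hypothetical event names (the real ones are in the fscache headers and are not listed here): the trailing enum label carries the count, and the mask of defined events is derived from it, so bits outside the mask can be detected.

```c
#include <assert.h>

/* Hypothetical event names, for illustration only. */
enum model_object_event {
	MODEL_EV_REQUEUE,
	MODEL_EV_UPDATE,
	MODEL_EV_CLEARED,
	MODEL_EV_ERROR,
	MODEL_NR_EVENTS  /* extra label: carries the event count */
};

/* Derived from the count, so it cannot drift from the enum. */
#define MODEL_EVENTS_MASK ((1UL << MODEL_NR_EVENTS) - 1)

static unsigned long model_event_bit(enum model_object_event ev)
{
	assert(ev < MODEL_NR_EVENTS);
	return 1UL << ev;
}

/* Non-zero if any bit outside the defined events is set. */
static int model_has_undefined_events(unsigned long events)
{
	return (events & ~MODEL_EVENTS_MASK) != 0;
}
```

Adding an event before MODEL_NR_EVENTS widens the mask automatically.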
Signed-off-by: David Howells <dhowells@redhat.com>
| |
CacheFiles is missing some calls to fscache_retrieval_complete() in the error
handling/collision paths of its reader functions.
This can be seen by the following assertion tripping in fscache_put_operation()
whereby the operation being destroyed is still in the in-progress state and has
not been cancelled or completed:
FS-Cache: Assertion failed
3 == 5 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:408!
invalid opcode: 0000 [#1] SMP
CPU 2
Modules linked in: xfs ioatdma dca loop joydev evdev
psmouse dcdbas pcspkr serio_raw i5000_edac edac_core i5k_amb shpchp
pci_hotplug sg sr_mod]
Pid: 8062, comm: httpd Not tainted 3.1.0-rc8 #1 Dell Inc. PowerEdge 1950/0DT097
RIP: 0010:[<ffffffff81197b24>] [<ffffffff81197b24>] fscache_put_operation+0x304/0x330
RSP: 0018:ffff880062f739d8 EFLAGS: 00010296
RAX: 0000000000000025 RBX: ffff8800c5122e84 RCX: ffffffff81ddf040
RDX: 00000000ffffffff RSI: 0000000000000082 RDI: ffffffff81ddef30
RBP: ffff880062f739f8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000003 R12: ffff8800c5122e40
R13: ffff880037a2cd20 R14: ffff880087c7a058 R15: ffff880087c7a000
FS: 00007f63dcf636e0(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0c0a91f000 CR3: 0000000062ec2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process httpd (pid: 8062, threadinfo ffff880062f72000, task ffff880087e58000)
Stack:
ffff880062f73bf8 0000000000000000 ffff880062f73bf8 ffff880037a2cd20
ffff880062f73a68 ffffffff8119aa7e ffff88006540e000 ffff880062f73ad4
ffff88008e9a4308 ffff880037a2cd20 ffff880062f73a48 ffff8800c5122e40
Call Trace:
[<ffffffff8119aa7e>] __fscache_read_or_alloc_pages+0x1fe/0x530
[<ffffffff81250780>] __nfs_readpages_from_fscache+0x70/0x1c0
[<ffffffff8123142a>] nfs_readpages+0xca/0x1e0
[<ffffffff815f3c06>] ? rpc_do_put_task+0x36/0x50
[<ffffffff8122755b>] ? alloc_nfs_open_context+0x4b/0x110
[<ffffffff815ecd1a>] ? rpc_call_sync+0x5a/0x70
[<ffffffff810e7e9a>] __do_page_cache_readahead+0x1ca/0x270
[<ffffffff810e7f61>] ra_submit+0x21/0x30
[<ffffffff810e818d>] ondemand_readahead+0x11d/0x250
[<ffffffff810e83b6>] page_cache_sync_readahead+0x36/0x60
[<ffffffff810dffa4>] generic_file_aio_read+0x454/0x770
[<ffffffff81224ce1>] nfs_file_read+0xe1/0x130
[<ffffffff81121bd9>] do_sync_read+0xd9/0x120
[<ffffffff8114088f>] ? mntput+0x1f/0x40
[<ffffffff811238cb>] ? fput+0x1cb/0x260
[<ffffffff81122938>] vfs_read+0xc8/0x180
[<ffffffff81122af5>] sys_read+0x55/0x90
Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Use the new FS-Cache invalidation facility from NFS to deal with foreign
changes being detected on the server rather than attempting to retire the old
cookie and get a new one.
The problem with the old method was that NFS did not wait for all outstanding
storage and retrieval ops on the cache to complete. There was no automatic
wait between the calls to ->readpages() and calls to invalidate_inode_pages2()
as the latter can only wait on locked pages that have been added to the
pagecache (which they haven't yet on entry to ->readpages()).
This was leading to oopses like the one below when an outstanding read got cut
off from its cookie by a premature release.
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
PGD 15889067 PUD 15890067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
Pid: 4544, comm: tar Not tainted 3.1.0-rc4-fsdevel+ #1064 /DG965RY
RIP: 0010:[<ffffffffa0075118>] [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
RSP: 0018:ffff8800158799e8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800070d41e0 RCX: ffff8800083dc1b0
RDX: 0000000000000000 RSI: ffff880015879960 RDI: ffff88003e627b90
RBP: ffff880015879a28 R08: 0000000000000002 R09: 0000000000000002
R10: 0000000000000001 R11: ffff880015879950 R12: ffff880015879aa4
R13: 0000000000000000 R14: ffff8800083dc158 R15: ffff880015879be8
FS: 00007f671e9d87c0(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000a8 CR3: 000000001587f000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tar (pid: 4544, threadinfo ffff880015878000, task ffff880015875040)
Stack:
ffffffffa00b1759 ffff8800070dc158 ffff8800000213da ffff88002a286508
ffff880015879aa4 ffff880015879be8 0000000000000001 ffff88002a2866e8
ffff880015879a88 ffffffffa00b20be 00000000000200da ffff880015875040
Call Trace:
[<ffffffffa00b1759>] ? nfs_fscache_wait_bit+0xd/0xd [nfs]
[<ffffffffa00b20be>] __nfs_readpages_from_fscache+0x7e/0x13f [nfs]
[<ffffffff81095fe7>] ? __alloc_pages_nodemask+0x156/0x662
[<ffffffffa0098763>] nfs_readpages+0xee/0x187 [nfs]
[<ffffffff81098a5e>] __do_page_cache_readahead+0x1be/0x267
[<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
[<ffffffff81098d7b>] ra_submit+0x1c/0x20
[<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
[<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
[<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
[<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
[<ffffffff810c22c4>] do_sync_read+0xba/0xfa
[<ffffffff810a62c9>] ? might_fault+0x4e/0x9e
[<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
[<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
[<ffffffff810c29a4>] vfs_read+0xaa/0x13a
[<ffffffff810c2a79>] sys_read+0x45/0x6c
[<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Implement invalidation for CacheFiles. This is in two parts:
(1) Provide an invalidation method (which just truncates the backing file).
(2) Abort attempts to copy anything read from the backing file whilst
invalidation is in progress.
Question: CacheFiles uses truncation in a couple of places. It has been using
notify_change() rather than sys_truncate() or something similar. This means
it bypasses a bunch of checks and suchlike that it possibly should be making
(security, file locking, lease breaking, vfsmount write). Should it be using
vfs_truncate() as added by a preceding patch or should it use notify_change()
and assume that anyone poking around in the cache files on disk gets
everything they deserve?
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Make a more complete truncate operation available to CacheFiles (including
security checks and suchlike) so that it can use this to clear invalidated
cache files.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
| |
Provide a proper invalidation method rather than relying on the netfs retiring
the cookie it has and getting a new one. The problem with this is that it isn't
easy for the netfs to make sure that it has completed/cancelled all its
outstanding storage and retrieval operations on the cookie it is retiring.
Instead, have the cache provide an invalidation method that will cancel or wait
for all currently outstanding operations before invalidating the cache, and
will cause new operations to queue up behind that. Whilst invalidation is in
progress, some requests will be rejected until the cache can stack a barrier on
the operation queue to cause new operations to be deferred behind it.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Fix the state management of internal fscache operations and the accounting of
what operations are in what states.
This is done by:
(1) Give struct fscache_operation an enum variable that directly represents the
state it's currently in, rather than spreading this knowledge over a bunch
of flags, who's processing the operation at the moment and whether it is
queued or not.
This makes it easier to write assertions to check the state at various
points and to prevent invalid state transitions.
(2) Add an 'operation complete' state and supply a function to indicate the
completion of an operation (fscache_op_complete()) and make things call
it. The final call to fscache_put_operation() can then check that an op
is in the appropriate state (complete or cancelled).
(3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
govern the state of an object:
(a) The ->n_ops is now the number of extant operations on the object
and is now decremented by fscache_put_operation() only.
(b) The ->n_in_progress is simply the number of operations that have been
taken off of the object's pending queue for the purposes of being
run. This is decremented by fscache_op_complete() only.
(c) The ->n_exclusive is the number of exclusive ops that have been
submitted and queued or are in progress. It is decremented by
fscache_op_complete() and by fscache_cancel_op().
fscache_put_operation() and fscache_operation_gc() now no longer try to
clean up ->n_exclusive and ->n_in_progress. That was leading to double
decrements against fscache_cancel_op().
fscache_cancel_op() now no longer decrements ->n_ops. That was leading to
double decrements against fscache_put_operation().
fscache_submit_exclusive_op() now decides whether it has to queue an op
based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
will persist in being true even after all preceding operations have been
cancelled or completed. Furthermore, if an object is active and there are
runnable ops against it, there must be at least one op running.
(4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
provide a function to record completion of the pages as they complete.
When n_pages reaches 0, the operation is deemed to be complete and
fscache_op_complete() is called.
Add calls to fscache_retrieval_complete() anywhere we've finished with a
page we've been given to read or allocate for. This includes places where
we just return pages to the netfs for reading from the server and where
accessing the cache fails and we discard the proposed netfs page.
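Points (1), (2) and (4) can be modelled in userspace roughly as follows. The state names and the model_ helpers are assumptions made for this sketch; only the ideas (an explicit state enum, a completion call, and a remaining-pages counter that triggers completion at zero) come from the description above.

```c
#include <assert.h>

/* Hypothetical state names; the text only says such an enum exists. */
enum model_op_state {
	MODEL_OP_IN_PROGRESS,
	MODEL_OP_COMPLETE,
	MODEL_OP_CANCELLED,
	MODEL_OP_DEAD
};

struct model_retrieval {
	enum model_op_state state;
	int n_pages;  /* pages not yet accounted for */
};

/* Only an in-progress operation may be completed. */
static void model_op_complete(struct model_retrieval *op)
{
	assert(op->state == MODEL_OP_IN_PROGRESS);
	op->state = MODEL_OP_COMPLETE;
}

/* Record that n pages are finished; the last one completes the op. */
static void model_retrieval_complete(struct model_retrieval *op, int n)
{
	assert(n > 0 && n <= op->n_pages);
	op->n_pages -= n;
	if (op->n_pages == 0)
		model_op_complete(op);
}

/* The final put may only see a completed or cancelled operation. */
static void model_put_operation(struct model_retrieval *op)
{
	assert(op->state == MODEL_OP_COMPLETE ||
	       op->state == MODEL_OP_CANCELLED);
	op->state = MODEL_OP_DEAD;
}
```

A path that forgets to account for a page leaves n_pages above zero, so the assertion in model_put_operation() trips instead of the mistake going unnoticed.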
The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs. This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.
FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY
RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
[<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
[<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
[<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
[<ffffffff810d8d47>] evict+0xa1/0x15c
[<ffffffff810d8e2e>] dispose_list+0x2c/0x38
[<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
[<ffffffff810c56b7>] prune_super+0xd5/0x140
[<ffffffff8109b615>] shrink_slab+0x102/0x1ab
[<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
[<ffffffff8103e009>] ? process_timeout+0xb/0xb
[<ffffffff8109dba3>] kswapd+0x270/0x289
[<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
[<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
[<ffffffff8104bf7a>] kthread+0x7f/0x87
[<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
[<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
[<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
[<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
[<ffffffff813ad6b0>] ? gs_change+0xb/0xb
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Make fscache_relinquish_cookie() log a warning and wait if there are any
outstanding reads left on the cookie it was given.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Downgrade some debugging statements to not unconditionally print stuff, but
rather be conditional on the appropriate module parameter setting.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Check that the netfs isn't trying to relinquish a cookie that still has read
operations in progress upon it. If there are, then log a warning and BUG.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Downgrade the requirements passed to the allocator in the gfp flags parameter.
FS-Cache/CacheFiles can handle OOM conditions simply by aborting the attempt to
store an object or a page in the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
| |
Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.
There are, however, two problems with this:
(1) It can lead to the PG_fscache mark being applied _after_ the page is set
PG_uptodate and unlocked (by the call to fscache_end_io()).
(2) CacheFiles's ref on the page is dropped immediately following
fscache_end_io() - and so may not still be held when the mark is applied.
This can lead to the page being passed back to the allocator before the
mark is applied.
Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page. This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.
The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:
BUG: Bad page state in process tar pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping: (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G W 3.1.0-rc4-fsdevel+ #1064
Call Trace:
[<ffffffff8109583c>] ? dump_page+0xb9/0xbe
[<ffffffff81095916>] bad_page+0xd5/0xea
[<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
[<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
[<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
[<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
[<ffffffff81098d7b>] ra_submit+0x1c/0x20
[<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
[<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
[<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
[<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
[<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
[<ffffffff810c22c4>] do_sync_read+0xba/0xfa
[<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
[<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
[<ffffffff810c29a4>] vfs_read+0xaa/0x13a
[<ffffffff810c2a79>] sys_read+0x45/0x6c
[<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b
As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.
Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't. This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.
Reported-by: M. Stevens <m@tippett.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
| |
Merge commit 752451f01c45 ("Merge branch 'i2c-embedded/for-next' of
git://git.pengutronix.de/git/wsa/linux") resulted in a build breakage
for OMAP
arch/arm/mach-omap2/i2c.c: In function 'omap_pm_set_max_mpu_wakeup_lat_compat':
arch/arm/mach-omap2/i2c.c:130:2: error: implicit declaration of function 'omap_pm_set_max_mpu_wakeup_lat'
make[1]: *** [arch/arm/mach-omap2/i2c.o] Error 1
Fix this by including the appropriate header file with the function
prototype.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull virtio update from Rusty Russell:
"Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.
There's a slightly non-trivial merge in virtio-net, as we cleaned up
the virtio add_buf interface while DaveM accepted the mq virtio-net
patches."
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
virtio_console: Add support for remoteproc serial
virtio_console: Merge struct buffer_token into struct port_buffer
virtio: add drv_to_virtio to make code clearly
virtio: use dev_to_virtio wrapper in virtio
virtio-mmio: Fix irq parsing in command line parameter
virtio_console: Free buffers from out-queue upon close
virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
virtio_console: Use kmalloc instead of kzalloc
virtio_console: Free buffer if splice fails
virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
virtio: console: don't rely on virtqueue_add_buf() returning capacity.
virtio_net: don't rely on virtqueue_add_buf() returning capacity.
virtio-net: remove unused skb_vnet_hdr->num_sg field
virtio-net: correct capacity math on ring full
virtio: move queue_index and num_free fields into core struct virtqueue.
...
| | |
Add a simple serial connection driver called
VIRTIO_ID_RPROC_SERIAL (11) for communicating with a
remote processor in an asymmetric multi-processing
configuration.
This implementation reuses the existing virtio_console
implementation, and adds support for DMA allocation
of data buffers and disables use of tty console and
the virtio control queue.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Refactoring the splice functionality by unifying the approach for
sending scatter-lists and regular buffers. This simplifies
buffer handling and reduces code size. Splice will now allocate
a port_buffer and send_buf() and free_buf() can always be used
for any buffer.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Add drv_to_virtio wrapper to get virtio_driver from device_driver.
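Such a wrapper is typically a container_of() lookup on the embedded struct. The userspace sketch below assumes the generic driver is embedded as a member named driver; the cut-down struct definitions and the feature_count field are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the kernel types have many more members. */
struct device_driver {
	const char *name;
};

struct virtio_driver {
	int feature_count;            /* illustrative field only */
	struct device_driver driver;  /* embedded generic driver */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Map an embedded device_driver back to its containing virtio_driver. */
static struct virtio_driver *drv_to_virtio(struct device_driver *drv)
{
	assert(drv != NULL);
	return container_of(drv, struct virtio_driver, driver);
}
```

Callers then write drv_to_virtio(dev->driver) instead of repeating the container_of() expression.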
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Use dev_to_virtio wrapper in virtio to make the code clearer.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
When resource_size_t is 64 bits long, the sscanf() on
the virtio device command line parameter string may return
a wrong value because its format was defined as "%u". Fixed
by using an intermediate local value of a known length.
Also cleaned up the resource creation and added extra
comments to make the parameter parsing easier to follow.
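The intermediate-local technique looks roughly like this in userspace. The "<irq>:<id>" input format and the model_parse() name are invented for the sketch and are not the driver's real parameter syntax; resource_size_t is assumed to be 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel typedef, assumed 64-bit here. */
typedef uint64_t resource_size_t;

/*
 * "%u" must be paired with an unsigned int, so scan into locals of
 * exactly that type and widen afterwards, instead of pointing sscanf()
 * at a field whose width depends on the configuration.
 */
static int model_parse(const char *s, resource_size_t *irq, int *id)
{
	unsigned int irq_tmp;
	int id_tmp;

	assert(s && irq && id);
	if (sscanf(s, "%u:%d", &irq_tmp, &id_tmp) != 2)
		return -1;
	*irq = irq_tmp;  /* safe widening assignment */
	*id = id_tmp;
	return 0;
}
```

Because the outputs are only written after a full match, a malformed string leaves the caller's values untouched.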
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Free pending output buffers from the virtio out-queue when
host has acknowledged port_close.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (rebased & cut down)
| | |
dev_<level> calls take less code than dev_printk(KERN_<LEVEL>
and reducing object size is good.
Convert if (printk_ratelimit()) dev_printk to dev_<level>_ratelimited.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Avoid the more CPU-expensive kzalloc when allocating buffers.
Originally kzalloc was intended for isolating the guest from
the host by not sending random guest data to the host. But device
isolation is not yet in place so kzalloc is not really needed.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Free the allocated scatter list if send_pages fails in function
port_splice_write.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
We simplified virtqueue_add_buf(), make it clear in the callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
We simplified virtqueue_add_buf(), make it clear in the callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
We simplified virtqueue_add_buf(), make it clear in the callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
We simplified virtqueue_add_buf(), make it clear in the callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
We simplified virtqueue_add_buf(), make it clear in the callers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Now that no one relies on this behavior, simplify virtqueue_add_buf() so it
returns 0 or -errno.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | |
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |
Now that we can easily use vq->num_free to determine if there are descriptors
left in the queue, we're about to change virtqueue_add_buf() to return 0
on success. The virtio_net driver is the only one which actually uses
the return value, so change that.
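The caller-side change can be sketched with a toy queue. The model_ names are hypothetical, and -ENOSPC is an assumed error value for a full ring; the point is only that the add call reports success or failure, while remaining capacity is read from num_free.

```c
#include <assert.h>
#include <errno.h>

/* Toy queue: only the free-descriptor count matters here. */
struct model_virtqueue {
	unsigned int num_free;
};

/* New-style contract: 0 on success, -errno on failure, never a capacity. */
static int model_add_buf(struct model_virtqueue *vq, unsigned int descs)
{
	assert(vq);
	if (descs > vq->num_free)
		return -ENOSPC;  /* assumed error code for a full ring */
	vq->num_free -= descs;
	return 0;
}

/* The caller decides whether to stop tx by reading num_free directly. */
static int model_should_stop_tx(const struct model_virtqueue *vq,
				unsigned int max_descs_per_packet)
{
	return vq->num_free < max_descs_per_packet;
}
```

A driver written this way no longer needs the add call to return a capacity.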
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | |
[Split from "correct capacity math on ring full" -- Rusty]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | |
Capacity math on ring full is wrong: we are
looking at num_sg but that might be optimistic
because of indirect buffer use.
The implementation also penalizes fast path
with extra memory accesses for the benefit of
ring full condition handling which is slow path.
It's easy to query ring capacity so let's do just that.
This change also makes it easier to move vnet header
for tx around as a follow-up patch does.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | |
They're generic concepts, so hoist them. This also avoids accessor
functions (though kept around for merge with DaveM's net tree).
This goes even further than Jason Wang's 17bb6d4088 patch
("virtio-ring: move queue_index to vring_virtqueue") which moved the
queue_index from the specific transport.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use the module_pci_driver() macro to make the code simpler
by eliminating module_init and module_exit calls.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |
| |
| |
| |
| | |
Signed-off-by: Alex Russell <giles.alex@hotmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
virtio requests are scatter-gather-style descriptors, but no
assumptions should be made about the layout. lguest was lazy here,
but saved by the fact that the network device hands all requests to
tun (which does it correctly) and console and random devices simply
use readv and writev.
Block devices, however, are broken: we convert to iovecs internally,
so just make sure we handle them correctly.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
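
A rough illustration of reading a request without assuming its layout,
not the lguest code itself: toy_iov_consume() is a hypothetical helper
that copies a fixed number of bytes out of an iovec array no matter how
the data is split across elements.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to len bytes from iov[] into dest, crossing element
 * boundaries as needed. The iovec is consumed: iov_base and iov_len
 * are advanced past the copied bytes. Returns the bytes copied.
 */
static size_t toy_iov_consume(struct iovec *iov, unsigned int num_iov,
			      void *dest, size_t len)
{
	size_t copied = 0;
	unsigned int i;

	for (i = 0; i < num_iov && copied < len; i++) {
		size_t take = iov[i].iov_len;

		if (take > len - copied)
			take = len - copied;
		memcpy((char *)dest + copied, iov[i].iov_base, take);
		iov[i].iov_base = (char *)iov[i].iov_base + take;
		iov[i].iov_len -= take;
		copied += take;
	}
	return copied;
}
```

A header that straddles two elements is read in one call, and a second
call picks up the payload from wherever the first one stopped.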
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is analogous to commit a1b383870a made by Rusty Russell to all
the VirtIO headers at the time. This eases the use of the header as
is by other OSes.
Signed-off-by: Bryan Venteicher <bryanv@daemoninthecloset.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Virtio devices may attempt to add descriptors to a virtqueue from atomic
context using GFP_ATOMIC allocation. This is problematic because such
allocations can fall outside of the lowmem mapping, causing virt_to_phys
to report bogus physical addresses which are subsequently passed to
userspace via the buffers for the virtual device.
This patch masks out __GFP_HIGH and __GFP_HIGHMEM from the requested
flags when allocating descriptors for a virtqueue. If an atomic
allocation is requested and later fails, we will return -ENOSPC which
will be handled by the driver.
Cc: stable@kernel.org
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
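
The masking step can be sketched as follows. The TOY_GFP_* bit values
and toy_desc_gfp() are made up for illustration; the real __GFP_* values
depend on the kernel version.

```c
/* Hypothetical flag bits, chosen only so the example is self-contained. */
#define TOY_GFP_WAIT	0x01u
#define TOY_GFP_HIGH	0x02u
#define TOY_GFP_HIGHMEM	0x04u

/*
 * Clear the two flags named in the commit message before allocating
 * descriptors, leaving every other requested flag untouched.
 */
static unsigned int toy_desc_gfp(unsigned int gfp)
{
	return gfp & ~(TOY_GFP_HIGHMEM | TOY_GFP_HIGH);
}
```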
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When using a virtio transport, the 9p net device may pass the physical
address of a kernel buffer to userspace via a scatterlist inside a
virtqueue. If the kernel buffer is mapped outside of the linear mapping
(e.g. highmem), then virt_to_page will return a bogus value and we will
populate the scatterlist with junk.
This patch uses kmap_to_page when populating the page array for a kernel
buffer.
Cc: stable@kernel.org
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some virtio device drivers (9p) need to translate high virtual addresses
to physical addresses, which are inserted into the virtqueue for
processing by userspace.
This patch exports the kmap_to_page symbol, so that the affected drivers
can be compiled as modules.
Cc: stable@kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This update contains overall only driver-specific fixes. Slightly
large LOC are seen in usb-audio driver for a couple of new device
quirks and cs42l73 ASoC driver for enhanced features. The others are
a few small (regression) fixes for HD-audio, and yet other small / trivial
ASoC fixes."
* tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Support for Digidesign Mbox 2 USB sound card:
ALSA: HDA: Fix sound resume hang
ALSA: hda - bug fix for invalid connection list of Haswell HDMI codec pins
ALSA: hda - Fix the wrong pincaps set in ALC861VD dallas/hp fixup
ALSA: hda - Set codec->single_adc_amp flag for Realtek codecs
ASoC: atmel-ssc: change disable to disable in dts node
ASoC: Prevent pop_wait overwrite
ALSA: usb-audio: ignore-quirk for HP Wireless Audio
ALSA: hda - Always turn on pins for HDMI/DP
ALSA: hda - Fix pin configuration of HP Pavilion dv7
ASoC: core: Fix splitting of log messages
ASoC: cs42l73: Change VSPIN/VSPOUT to VSPINOUT
ASoC: cs42l73: Add DAPM events for power down.
ASoC: cs42l73: Add DMIC's as DAPM inputs.
ASoC: sigmadsp: Fix endianness conversion issue
ASoC: tpa6130a2: Use devm_* APIs
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch is the result of a lot of trial and error, since there are no specs
available for the device.
Full duplex support is provided, i.e. playback and recording in stereo.
The format is hardcoded at 48000Hz @ 24 bit, which is the maximum that the
device supports. Also, MIDI in and MIDI out both work.
Users will notice that the S/PDIF light also flashes when playback or recording
is active. I believe this means that S/PDIF input/output is simultaneously
activated with the analogue i/o during use.
But this particular functionality remains untested.
Note that this particular version of the patch is so far untested on the
physical hardware because I have not compiled a full kernel with the changes.
However, extensive testing has been done by many users of the hardware
who believe other versions of my patch have worked since circa 2009.
[Modified to make a function static by tiwai]
Signed-off-by: Damien Zammit <damien@zamaudio.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Resuming a switcheroo'd HDA controller hangs since the completion
is one-shot (thus works the first time). Fix by using completions
that explicitly need rearming, so remain fired before.
Signed-off-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Haswell HDMI codec pins may report invalid connection list entries, which
will cause failure to play audio via HDMI or Display Port.
So this patch adds a fixup for Haswell to work around this hardware issue:
enable DP1.2 mode and override the pins' connection list entries with proper
values.
Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
Signed-off-by: Xingchao Wang <xingchao.wang@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|