Commit message (Author, Age, Files, Lines)
* uml: fix irqstack crash (Jeff Dike, 2007-09-19, 3 files, -6/+7)
|
|   This patch fixes a crash caused by an interrupt coming in when an IRQ stack is being torn down. When this happens, handle_signal will loop, setting up the IRQ stack again because the tearing down had finished, and handling whatever signals had come in.
|
|   However, to_irq_stack returns a mask of pending signals to be handled, plus bit zero is set if the IRQ stack was already active, and thus shouldn't be torn down. This causes a problem because when handle_signal goes around the loop, sig will be zero, and to_irq_stack will duly set bit zero in the returned mask, faking handle_signal into believing that it shouldn't tear down the IRQ stack and return thread_info pointers back to their original values.
|
|   This will eventually cause a crash, as the IRQ stack thread_info will continue pointing to the original task_struct and an interrupt will look into it after it has been freed.
|
|   The fix is to stop passing a signal number into to_irq_stack. Rather, the pending signals mask is initialized beforehand with the bit for sig already set. References to sig in to_irq_stack can be replaced with references to the mask.
|
|   [akpm@linux-foundation.org: use UL]
|   Signed-off-by: Jeff Dike <jdike@linux.intel.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
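The bit-zero collision described in the irqstack fix can be sketched in plain C. This is a minimal illustration, not the UML code: the helper names (mask_from_sig, seed_mask, looks_already_active) and the IRQ_STACK_WAS_ACTIVE macro are hypothetical, and only the mask arithmetic is modelled.

```c
#include <stdbool.h>

/* Bit zero of the mask is reserved: "IRQ stack was already active". */
#define IRQ_STACK_WAS_ACTIVE 1UL

/* Old shape (hypothetical helper): the callee turns a signal number into
 * a bit. On a later trip around the loop sig is 0, so bit zero gets set
 * by accident. */
unsigned long mask_from_sig(int sig, unsigned long pending)
{
    return pending | (1UL << sig);
}

/* New shape (hypothetical helper): the caller seeds the mask once with
 * the bit for sig; later iterations only pass the mask along. */
unsigned long seed_mask(int sig)
{
    return 1UL << sig;
}

/* How the caller decides whether to skip tearing the stack down. */
bool looks_already_active(unsigned long mask)
{
    return (mask & IRQ_STACK_WAS_ACTIVE) != 0;
}
```

With the old shape, a second loop iteration with sig == 0 makes the mask look as if the stack had already been active; with the seeded mask, bit zero is only ever set deliberately. The constant is written as 1UL so the shift is done in unsigned long, matching the "use UL" note in the commit.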
* Fix NUMA Memory Policy Reference Counting (Lee Schermerhorn, 2007-09-19, 3 files, -12/+75)
|
|   This patch proposes fixes to the reference counting of memory policy in the page allocation paths and in show_numa_map(). Extracted from my "Memory Policy Cleanups and Enhancements" series as stand-alone.
|
|   Shared policy lookup [shmem] has always added a reference to the policy, but this was never unrefed after page allocation or after formatting the numa map data.
|
|   Default system policy should not require additional ref counting, nor should the current task's task policy. However, show_numa_map() calls get_vma_policy() to examine what may be [likely is] another task's policy. The latter case needs protection against freeing of the policy.
|
|   This patch adds a reference count to a mempolicy returned by get_vma_policy() when the policy is a vma policy or another task's mempolicy. Again, shared policy is already reference counted on lookup. A matching "unref" [__mpol_free()] is performed in alloc_page_vma() for shared and vma policies, and in show_numa_map() for shared and another task's mempolicy. We can call __mpol_free() directly, saving an admittedly inexpensive inline NULL test, because we know we have a non-NULL policy.
|
|   Handling policy ref counts for hugepages is a bit trickier. huge_zonelist() returns a zone list that might come from a shared or vma 'BIND policy. In this case, we should hold the reference until after the huge page allocation in dequeue_hugepage(). The patch modifies huge_zonelist() to return a pointer to the mempolicy if it needs to be unref'd after allocation.
|
|   Kernel Build [16cpu, 32GB, ia64] - average of 10 runs:
|
|                w/o patch             w/ refcount patch
|                Avg       Std Devn    Avg       Std Devn
|       Real:    100.59    0.38        100.63    0.43
|       User:    1209.60   0.37        1209.91   0.31
|       System:  81.52     0.42        81.64     0.34
|
|   Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
|   Acked-by: Andi Kleen <ak@suse.de>
|   Cc: Christoph Lameter <clameter@sgi.com>
|   Acked-by: Mel Gorman <mel@csn.ul.ie>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix user namespace exiting OOPs (Pavel Emelyanov, 2007-09-19, 3 files, -2/+27)
|
|   It turned out that the user namespace is released during do_exit() in exit_task_namespaces(), but the struct user_struct is released only during put_task_struct(), i.e. MUCH later.
|
|   On debug kernels with poisoned slabs this will cause an oops in uid_hash_remove() because the head of the chain, which resides inside the struct user_namespace, will already be freed and poisoned.
|
|   Since the uid hash itself is required only when someone can search it, i.e. when the namespace is alive, we can safely unhash all the user_struct-s from it while the namespace is exiting. The subsequent free_uid() will complete the user_struct destruction.
|
|   For example, this simple program
|
|       #include <sched.h>
|
|       char stack[2 * 1024 * 1024];
|
|       int f(void *foo)
|       {
|               return 0;
|       }
|
|       int main(void)
|       {
|               /* 0x10000000 is CLONE_NEWUSER */
|               clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
|               return 0;
|       }
|
|   run on a kernel with CONFIG_USER_NS turned on will oops the kernel immediately.
|
|   This was spotted during OpenVZ kernel testing.
|
|   Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
|   Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
|   Acked-by: "Serge E. Hallyn" <serue@us.ibm.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Convert uid hash to hlist (Pavel Emelyanov, 2007-09-19, 4 files, -10/+11)
|
|   Surprisingly (spotted by Alexey Dobriyan), the uid hash still uses list_heads, thus occupying twice as much space as it could. Convert it to hlist_heads.
|
|   Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
|   Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
|   Acked-by: Serge Hallyn <serue@us.ibm.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
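The "twice as much" claim in the hlist conversion follows from the layout of the two head types. A small userspace sketch, with the struct layouts written out to mirror the kernel's list types and a hypothetical bucket count (UIDHASH_SZ_SKETCH, not the kernel's value):

```c
#include <stddef.h>

/* Doubly linked list head: two pointers per hash bucket. */
struct list_head  { struct list_head *next, *prev; };

/* Hash list: nodes carry two pointers, but the bucket head needs only one. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define UIDHASH_SZ_SKETCH 8   /* hypothetical bucket count for the sketch */

/* Bytes used by a bucket table built from each head type. */
size_t list_table_bytes(void)
{
    return sizeof(struct list_head[UIDHASH_SZ_SKETCH]);
}

size_t hlist_table_bytes(void)
{
    return sizeof(struct hlist_head[UIDHASH_SZ_SKETCH]);
}
```

The saving applies to the bucket array only; each hashed element still carries a two-pointer node.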
* kernel/user.c: Use list_for_each_entry instead of list_for_each (Matthias Kaehlcke, 2007-09-19, 1 file, -6/+2)
|
|   kernel/user.c: Convert list_for_each to list_for_each_entry in uid_hash_find()
|
|   Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ext34: ensure do_split leaves enough free space in both blocks (Eric Sandeen, 2007-09-19, 2 files, -8/+70)
|
|   The do_split() function for htree dir blocks is intended to split a leaf block to make room for a new entry. It sorts the entries in the original block by hash value, then moves the last half of the entries to the new block - without accounting for how much space this actually moves. (IOW, it moves half of the entry *count* not half of the entry *space*). If by chance we have both large & small entries, and we move only the smallest entries, and we have a large new entry to insert, we may not have created enough space for it.
|
|   The patch below stores each record size when calculating the dx_map, and then walks the hash-sorted dx_map, calculating how many entries must be moved to more evenly split the existing entries between the old block and the new block, guaranteeing enough space for the new entry.
|
|   The dx_map "offs" member is reduced to u16 so that the overall map size does not change - it is temporarily stored at the end of the new block, and if it grows too large it may be overwritten. By making offs and size both u16, we won't grow the map size.
|
|   Also add a few comments to the functions involved.
|
|   This fixes the testcase reported by hooanon05@yahoo.co.jp on the linux-ext4 list, "ext3 dir_index causes an error"
|
|   Thanks to Andreas Dilger for discussing the problem & solution with me.
|
|   Signed-off-by: Eric Sandeen <sandeen@redhat.com>
|   Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
|   Tested-by: Junjiro Okajima <hooanon05@yahoo.co.jp>
|   Cc: Theodore Ts'o <tytso@mit.edu>
|   Cc: <linux-ext4@vger.kernel.org>
|   Cc: <stable@kernel.org>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
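The difference between splitting by entry count and splitting by entry space can be sketched as a walk over the hash-sorted size map. This is an illustration of the idea described above, not the patch itself: the function name entries_to_move and its signature are assumptions, and the map is reduced to a bare array of u16 sizes.

```c
#include <stdint.h>

/*
 * Count how many entries, taken from the tail of a hash-sorted size map,
 * should move to the new block so that roughly half of the used *space*
 * (not half of the entry count) moves.
 */
int entries_to_move(const uint16_t *sizes, int count, unsigned blocksize)
{
    unsigned moved_bytes = 0;
    int move = 0;

    for (int i = count - 1; i >= 0; i--) {
        /* Stop once more than half of this entry would land past the midpoint. */
        if (moved_bytes + sizes[i] / 2 > blocksize / 2)
            break;
        moved_bytes += sizes[i];
        move++;
    }
    return move;
}
```

For a full 432-byte block holding four 100-byte entries followed by four 8-byte entries, this walk moves six entries, whereas a count-based split would move only the four small ones and free just 32 bytes in the old block.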
* disable sys_timerfd() for 2.6.23 (Andrew Morton, 2007-09-19, 1 file, -0/+1)
|
|   There is still some confusion and disagreement over what this interface should actually do. So it is best that we disable it in 2.6.23 until we get that fully sorted out.
|
|   (sys_timerfd() was present in 2.6.22 but it was apparently broken, so here we assume that nobody is using it yet).
|
|   Cc: Michael Kerrisk <mtk-manpages@gmx.net>
|   Cc: Davide Libenzi <davidel@xmailserver.org>
|   Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* nfs: fix oops re sysctls and V4 support (Alexey Dobriyan, 2007-09-19, 1 file, -1/+1)
|
|   NFS unregisters sysctls only if V4 support is compiled in. However, the sysctl table is not V4 specific, so unregister it always.
|
|   Steps to reproduce:
|
|       [build nfs.ko with CONFIG_NFS_V4=n]
|       modprobe nfs
|       rmmod nfs
|       ls /proc/sys
|
|   Unable to handle kernel paging request at ffffffff880661c0 RIP:
|    [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
|   PGD 203067 PUD 207063 PMD 7e216067 PTE 0
|   Oops: 0000 [1] SMP
|   CPU 1
|   Modules linked in: lockd nfs_acl sunrpc
|   Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
|   RIP: 0010:[<ffffffff802af8e3>] [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
|   RSP: 0018:ffff81007fd93e78 EFLAGS: 00010286
|   RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
|   RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
|   RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
|   R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
|   R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
|   FS: 00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
|   CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
|   CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
|   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
|   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
|   Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
|   Stack: ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
|    ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
|    2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
|   Call Trace:
|    [<ffffffff80283f30>] filldir+0x0/0xf0
|    [<ffffffff80283f30>] filldir+0x0/0xf0
|    [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
|    [<ffffffff80284376>] sys_getdents+0x96/0xe0
|    [<ffffffff8020bb3e>] system_call+0x7e/0x83
|
|   Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
|   RIP [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
|    RSP <ffff81007fd93e78>
|   CR2: ffffffff880661c0
|   Kernel panic - not syncing: Fatal exception
|
|   Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|   Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
|   Cc: "J. Bruce Fields" <bfields@fieldses.org>
|   Cc: <stable@kernel.org>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* dir_index: error out instead of BUG on corrupt dx dirs (Eric Sandeen, 2007-09-19, 2 files, -8/+60)
|
|   Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable errors with helpful warnings.
|
|   With help catching other asserts from Duane Griffin <duaneg@dghda.com>
|
|   Signed-off-by: Eric Sandeen <sandeen@redhat.com>
|   Acked-by: Duane Griffin <duaneg@dghda.com>
|   Acked-by: Theodore Ts'o <tytso@mit.edu>
|   Cc: <stable@kernel.org>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* intel-agp: Fix i830 mask variable that changed with G33 support (Dave Airlie, 2007-09-19, 2 files, -2/+3)
|
|   The mask on i830 should always be 0x70; on later chips 0xF0 should be okay.
|
|   Signed-off-by: Dave Airlie <airlied@linux.ie>
|   Acked-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
|   Cc: Michael Haas <laga@laga.ath.cx>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* intelfb: Fix bug in DPLL disable (Antonino A. Daplas, 2007-09-19, 1 file, -1/+1)
|
|   Reported in Kernel Bugzilla 9006.
|
|   Fix an obvious bug in DPLL disable.
|
|   Signed-off-by: Antonino Daplas <adaplas@gmail.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xen: don't bother trying to set cr4 (Jeremy Fitzhardinge, 2007-09-19, 1 file, -2/+2)
|
|   Xen ignores all updates to cr4, and some versions will kill the domain if you try to change its value. Just ignore all changes.
|
|   Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
|   Cc: Andi Kleen <ak@suse.de>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pci: fix unterminated pci_device_id lists (Kees Cook, 2007-09-19, 3 files, -2/+5)
|
|   Fix a couple of drivers that do not correctly terminate their pci_device_id lists. This results in garbage being spewed into modules.pcimap when the module happens to not have 28 NULL bytes following the table, and/or the last PCI ID is actually truncated from the table when calculating the modules.alias PCI aliases, causing those unfortunate device IDs to not auto-load.
|
|   Signed-off-by: Kees Cook <kees@ubuntu.com>
|   Acked-by: Corey Minyard <minyard@acm.org>
|   Cc: David Woodhouse <dwmw2@infradead.org>
|   Acked-by: Jeff Garzik <jeff@garzik.org>
|   Cc: Greg KH <greg@kroah.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
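Why the terminator matters can be shown with a sentinel-terminated table in userspace C. This is a simplified sketch: struct dev_id_sketch has only two fields (the real pci_device_id has more), and the ID values and names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for a device ID entry (hypothetical layout). */
struct dev_id_sketch {
    uint32_t vendor;
    uint32_t device;
};

/* The table carries no length; consumers walk it until an all-zero entry. */
const struct dev_id_sketch example_ids[] = {
    { 0x8086, 0x1234 },   /* illustrative values */
    { 0x10ec, 0x8139 },
    { 0, 0 },             /* terminator: without it the walk runs off the end */
};

/* Count entries up to, but not including, the all-zero terminator. */
size_t count_ids(const struct dev_id_sketch *table)
{
    size_t n = 0;

    while (table[n].vendor != 0 || table[n].device != 0)
        n++;
    return n;
}
```

A table that omits the final all-zero entry gives any such walker no stopping point, so whatever bytes follow the array are read as further IDs.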
* mspec: handle shrinking virtual memory areas (Cliff Wickman, 2007-09-19, 1 file, -21/+48)
|
|   The shrinking of a virtual memory area that is mmap(2)'d to a memory special file (device drivers/char/mspec.c) can cause a panic.
|
|   If the mapped size of the vma (vm_area_struct) is very large, mspec allocates a large vma_data structure with vmalloc(). But such a vma can be shrunk by an munmap(2). The current driver uses the current size of each vma to deduce whether its vma_data structure was allocated by kmalloc() or vmalloc(). So if the vma was shrunk it appears to have been allocated by kmalloc(), and mspec attempts to free it with kfree(). This results in a panic.
|
|   This patch avoids the panic (by preserving the type of the allocation) and also makes mspec work correctly as the vma is split into pieces by the munmap(2)'s.
|
|   All vma's derived from such a split vma share the same vma_data structure that represents all the pages mapped into this set of vma's. The mspec driver must be made capable of using the right portion of the structure for each member vma. In other words, it must index into the array of page addresses using the portion of the array that represents the current vma. This is enabled by storing the vma group's vm_start in the vma_data structure.
|
|   The shared vma_data's are not protected by mm->mmap_sem in the fork() case so the reference count is left as atomic_t.
|
|   Signed-off-by: Cliff Wickman <cpw@sgi.com>
|   Acked-by: Jes Sorensen <jes@sgi.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
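The two ideas in the mspec fix, remembering how the structure was allocated and indexing pages relative to the original mapping start, can be sketched in userspace C. All names here (vma_data_sketch, alloc_kind, kind_from_size, page_index) and the 4 KiB page size are assumptions for illustration; the driver's real structure differs.

```c
#include <stddef.h>

#define PAGE_SHIFT_SKETCH 12
#define PAGE_SIZE_SKETCH  (1UL << PAGE_SHIFT_SKETCH)

enum alloc_kind { ALLOC_KMALLOC, ALLOC_VMALLOC };

struct vma_data_sketch {
    enum alloc_kind kind;     /* recorded once, at allocation time */
    unsigned long vm_start;   /* start of the original, unsplit mapping */
    unsigned long vm_end;     /* end of the original, unsplit mapping */
};

/* Fragile: re-derive the allocator from a size that munmap(2) can shrink. */
enum alloc_kind kind_from_size(size_t table_bytes)
{
    return table_bytes <= PAGE_SIZE_SKETCH ? ALLOC_KMALLOC : ALLOC_VMALLOC;
}

/* Index of the page backing addr within the shared per-group page array. */
size_t page_index(const struct vma_data_sketch *vd, unsigned long addr)
{
    return (addr - vd->vm_start) >> PAGE_SHIFT_SKETCH;
}
```

Once the mapping has been shrunk, kind_from_size() no longer agrees with the recorded kind, which is the mismatch that led to kfree() being called on vmalloc() memory. Because page_index() is relative to the group's vm_start, every vma split off from the original mapping lands on its own slice of the shared array.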
* rtc: rtc-ds1553.c should use resource_size_t for base address (Atsushi Nemoto, 2007-09-19, 1 file, -1/+1)
|
|   Currently the rtc driver, rtc-ds1553.c, uses an unsigned long to store the base mmio address of the NVRAM/RTC. This breaks on 32-bit systems with larger physical addresses.
|
|   Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
|   Cc: David Brownell <david-b@pacbell.net>
|   Cc: Alessandro Zummo <a.zummo@towertech.it>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* rtc-ds1742.c should use resource_size_t for base address (David Gibson, 2007-09-19, 1 file, -1/+1)
|
|   Currently the rtc driver, rtc-ds1742.c, uses an unsigned long to store the base mmio address of the NVRAM/RTC. This breaks on systems like PowerPC 440, which is a 32-bit core with 36-bit physical addresses: IO on the system, including the RTC, is typically above the 4GB point, and cannot fit into an unsigned long.
|
|   This patch fixes the problem by replacing the unsigned long with a resource_size_t.
|
|   Tested on Ebony (PPC440) (with additional patches to instantiate the ds1742 platform device appropriately).
|
|   Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
|   Acked-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
|   Cc: Alessandro Zummo <a.zummo@towertech.it>
|   Cc: David Brownell <david-b@pacbell.net>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
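The truncation behind both rtc fixes can be reproduced with fixed-width types. In this sketch uint32_t stands in for unsigned long on a 32-bit core and uint64_t for a resource_size_t wide enough for 36-bit physical addresses; the type and function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for "unsigned long" on a 32-bit core (assumption for the sketch). */
typedef uint32_t ulong32_sketch;

/* Stand-in for resource_size_t when physical addresses exceed 32 bits. */
typedef uint64_t resource_size_sketch;

/* What survives when a physical address is stored in a 32-bit unsigned long. */
uint64_t stored_as_ulong32(uint64_t phys)
{
    ulong32_sketch narrowed = (ulong32_sketch)phys;   /* upper bits are dropped */
    return narrowed;
}

/* The same address kept in a type wide enough to hold it. */
uint64_t stored_as_resource_size(uint64_t phys)
{
    resource_size_sketch wide = phys;
    return wide;
}
```

An illustrative base address just above the 4GB point, such as 0x148000000, comes back as 0x48000000 from the narrow variant, so the driver would end up mapping the wrong physical range.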
* Fix UTS corruption during clone(CLONE_NEWUTS) (Alexey Dobriyan, 2007-09-19, 1 file, -0/+2)
|
|   struct utsname is copied from master one without any exclusion.
|
|   Here is sample output from one proggie doing
|
|       sethostname("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
|       sethostname("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
|
|   and another
|
|       clone(,, CLONE_NEWUTS, ...)
|       uname()
|
|       hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaabbbbb'
|       hostname = 'bbbaaaaaaaaaaaaaaaaaaaaaaaaaaa'
|       hostname = 'aaaaaaaabbbbbbbbbbbbbbbbbbbbbb'
|       hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaabbbb'
|       hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaabb'
|       hostname = 'aaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
|       hostname = 'bbbbbbbbbbbbbbbbaaaaaaaaaaaaaa'
|
|   Hostname is sometimes corrupted.
|
|   Yes, even _the_ simplest namespace activity had bug in it. :-(
|
|   Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
|   Acked-by: Serge Hallyn <serue@us.ibm.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Fix failure to resume from initrds (Nigel Cunningham, 2007-09-19, 1 file, -1/+3)
|
|   Commit 831441862956fffa17b9801db37e6ea1650b0f69 (Freezer: make kernel threads nonfreezable by default) breaks freezing when attempting to resume from an initrd, because the init (which is freezable) spins while waiting for another thread to run /linuxrc, but doesn't check whether it has been told to enter the refrigerator. The original patch replaced a call to try_to_freeze() with a call to yield(). I believe a simple reversion is wrong because if !CONFIG_PM_SLEEP, try_to_freeze() is a noop. It should still yield.
|
|   Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net>
|   Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
|   Acked-by: Pavel Machek <pavel@ucw.cz>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* uml: use correct type in BLKGETSIZE ioctl (Nicolas George, 2007-09-19, 1 file, -1/+2)
|
|   I found a type mismatch in UML that makes host block devices unusable as ubd devices on x86_64 and other 64-bit systems (segfault of the mm subsystem):
|
|   In block/ioctl.c, the following lines show that the BLKGETSIZE ioctl expects a pointer to a long:
|
|       case BLKGETSIZE:
|               if ((bdev->bd_inode->i_size >> 9) > ~0UL)
|                       return -EFBIG;
|               return put_ulong(arg, bdev->bd_inode->i_size >> 9);
|
|   In arch/um/os-Linux/file.c, os_file_size calls it with an int.
|
|   The ioctl_list man page should be fixed as well.
|
|   Cc: Jeff Dike <jdike@addtoit.com>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
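A caller that matches the kernel side passes the address of an unsigned long. A minimal Linux userspace sketch follows; the function name block_device_sectors is hypothetical and this is not the UML os_file_size() code.

```c
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKGETSIZE */

/*
 * Read the size of a block device in 512-byte sectors.
 * Returns 0 on success, -1 on failure (errno is set by ioctl).
 */
int block_device_sectors(int fd, unsigned long *sectors)
{
    /* The kernel stores an unsigned long through this pointer, so the
     * buffer must be one; an int is too small on 64-bit hosts. */
    unsigned long size = 0;

    if (ioctl(fd, BLKGETSIZE, &size) < 0)
        return -1;
    *sectors = size;
    return 0;
}
```

Passing the address of an int instead lets the kernel write eight bytes into a four-byte object on x86_64, which is the kind of corruption the commit describes.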
* Fix "Fix DAC960 driver on machines which don't support 64-bit DMA" (Andrew Morton, 2007-09-19, 1 file, -0/+1)
|
|   sparc32:
|
|   drivers/block/DAC960.c: In function 'DAC960_V1_EnableMemoryMailboxInterface':
|   drivers/block/DAC960.c:1168: error: 'DMA_32BIT_MASK' undeclared (first use in this function)
|   drivers/block/DAC960.c:1168: error: (Each undeclared identifier is reported only
|
|   Cc: <dac@conglom-o.org>
|   Cc: <stable@kernel.org>
|   Cc: Alessandro Polverini <alex@nibbles.it>
|   Cc: Jeff Garzik <jeff@garzik.org>
|   Cc: Matthew Wilcox <matthew@wil.cx>
|   Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 (Linus Torvalds, 2007-09-16, 7 files, -103/+172)
|\
| |   * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
| |     ACPI: thinkpad-acpi: bump up version to 0.16
| |     ACPI: thinkpad-acpi: revert new 2.6.23 CONFIG_THINKPAD_ACPI_INPUT_ENABLED option
| |     ACPI: fix CONFIG_NET=n acpi_bus_generate_netlink_event build failure
| |     msi-laptop: replace ',' with ';'
| |     ACPI: (more) delete CONFIG_ACPI_PROCFS_SLEEP (again)
| * Pull thinkpad into release branch (Len Brown, 2007-09-17, 4 files, -96/+165)
| |\
| | * ACPI: thinkpad-acpi: bump up version to 0.16 (Henrique de Moraes Holschuh, 2007-09-17, 2 files, -3/+3)
| | |
| | |   Name it thinkpad-acpi version 0.16 to avoid any confusion with some 0.15 thinkpad-acpi development snapshots and backports that had input layer support, but no hotkey_report_mode support.
| | |
| | |   Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
| | |   Signed-off-by: Len Brown <len.brown@intel.com>
| | * ACPI: thinkpad-acpi: revert new 2.6.23 CONFIG_THINKPAD_ACPI_INPUT_ENABLED option (Henrique de Moraes Holschuh, 2007-09-17, 4 files, -93/+162)
| | |
| | |   Revert new 2.6.23 CONFIG_THINKPAD_ACPI_INPUT_ENABLED Kconfig option because it would create a legacy we don't want to support.
| | |
| | |   CONFIG_THINKPAD_ACPI_INPUT_ENABLED was added to try to fix an issue that is now moot with the addition of the netlink ACPI event report interface to the ACPI core.
| | |
| | |   Now that ACPI core can send events over netlink, we can use a different strategy to keep backwards compatibility with older userspace, without the need for the CONFIG_THINKPAD_ACPI_INPUT_ENABLED games. And it arrived before CONFIG_THINKPAD_ACPI_INPUT_ENABLED made it to a stable mainline kernel, even, which is Good.
| | |
| | |   This patch is in sync with some changes to thinkpad-acpi backports, that will keep things sane for userspace across different combinations of kernel versions, thinkpad-acpi backports (or the lack thereof), and userspace capabilities:
| | |
| | |   Unless a module parameter is used, thinkpad-acpi will now behave in such a way that it will work well (by default) with userspace that still uses only the old ACPI procfs event interface and doesn't care for thinkpad-acpi input devices.
| | |
| | |   It will also always work well with userspace that has been updated to use both the thinkpad-acpi input devices, and ACPI core netlink event interface, regardless of any module parameter.
| | |
| | |   The module parameter was added to allow thinkpad-acpi to work with userspace that has been partially updated to use thinkpad-acpi input devices, but not the new ACPI core netlink event interface. To use this mode of hot key reporting, one has to specify the hotkey_report_mode=2 module parameter.
| | |
| | |   The thinkpad-acpi driver exports the value of hotkey_report_mode through sysfs, as well. thinkpad-acpi backports to older kernels, that do not support the new ACPI core netlink interface, have code to allow userspace to switch hotkey_report_mode at runtime through sysfs. This capability will not be provided in mainline thinkpad-acpi as it is not needed there.
| | |
| | |   Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
| | |   Cc: Michael S. Tsirkin <mst@dev.mellanox.co.il>
| | |   Cc: Hugh Dickins <hugh@veritas.com>
| | |   Cc: Richard Hughes <hughsient@gmail.com>
| | |   Signed-off-by: Len Brown <len.brown@intel.com>
| * | Pull misc into release branch (Len Brown, 2007-09-17, 3 files, -7/+7)
| |\ \
| | * | ACPI: fix CONFIG_NET=n acpi_bus_generate_netlink_event build failure (Henrique de Moraes Holschuh, 2007-09-03, 1 file, -1/+1)
| | | |
| | | |   drivers/acpi/event.c:243: error: 'acpi_generate_netlink_event' undeclared here (not in a function)
| | | |
| | | |   Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
| | | |   Signed-off-by: Len Brown <len.brown@intel.com>
| | * | msi-laptop: replace ',' with ';' (Jonathan Woithe, 2007-08-29, 1 file, -1/+1)
| | | |
| | | |   Signed-off-by: Jonathan Woithe <jwoithe@physics.adelaide.edu.au>
| | | |   Signed-off-by: Len Brown <len.brown@intel.com>
| | * | ACPI: (more) delete CONFIG_ACPI_PROCFS_SLEEP (again) (Christian Borntraeger, 2007-08-28, 1 file, -5/+5)
| | | |
| | | |   Commit 2bcf9dddeb8e79a4ba55bf191533f70f39ce ('ACPI: delete CONFIG_ACPI_PROCFS_SLEEP (again)') was incomplete.
| | | |
| | | |   Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | | |   Signed-off-by: Len Brown <len.brown@intel.com>
* | | | Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 (Linus Torvalds, 2007-09-16, 7 files, -43/+73)
|\ \ \ \
| | | | |   * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
| | | | |     [SPARC64]: Warn user if cpu is ignored.
| | | | |     [SPARC64]: Fix lockdep, particularly on SMP.
| | | | |     [SPARC64]: Update defconfig.
| * | | | [SPARC64]: Warn user if cpu is ignored. (David S. Miller, 2007-09-16, 2 files, -2/+10)
| | | | |
| | | | |   When NR_CPUS is smaller than the cpu probed, let the user know that the cpu won't be used.
| | | | |
| | | | |   Suggested by Al Viro.
| | | | |
| | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [SPARC64]: Fix lockdep, particularly on SMP. (David S. Miller, 2007-09-16, 4 files, -30/+58)
| | | | |
| | | | |   As noted by Al Viro, when we try to call prom_set_trap_table() in the SMP trampoline code we try to take the PROM call spinlock, which doesn't work because the current thread pointer isn't valid yet and lockdep depends upon that being correct.
| | | | |
| | | | |   Furthermore, we cannot set the current thread pointer register because it can't be properly dereferenced until we return from prom_set_trap_table(). Kernel TLB misses only work after that call.
| | | | |
| | | | |   So do the PROM call to set the trap table directly instead of going through the OBP library C code, and thus avoid the lock altogether.
| | | | |
| | | | |   These calls are guaranteed to be serialized fully.
| | | | |
| | | | |   Since there are now no calls to the prom_set_trap_table{_sun4v}() library functions, they can be deleted.
| | | | |
| | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [SPARC64]: Update defconfig. (David S. Miller, 2007-09-16, 1 file, -11/+5)
| | | | |
| | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 (Linus Torvalds, 2007-09-16, 27 files, -404/+398)
|\ \ \ \ \
| | | | | |   * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
| | | | | |     [VLAN]: Fix net_device leak.
| | | | | |     [PPP] generic: Fix receive path data clobbering & non-linear handling
| | | | | |     [PPP] generic: Call skb_cow_head before scribbling over skb
| | | | | |     [NET] skbuff: Add skb_cow_head
| | | | | |     [BRIDGE]: Kill clone argument to br_flood_*
| | | | | |     [PPP] pppoe: Fill in header directly in __pppoe_xmit
| | | | | |     [PPP] pppoe: Fix data clobbering in __pppoe_xmit and return value
| | | | | |     [PPP] pppoe: Fix skb_unshare_check call position
| | | | | |     [SCTP]: Convert bind_addr_list locking to RCU
| | | | | |     [SCTP]: Add RCU synchronization around sctp_localaddr_list
| | | | | |     [PKT_SCHED]: sch_cbq.c: Shut up uninitialized variable warning
| | | | | |     [PKTGEN]: srcmac fix
| | | | | |     [IPV6]: Fix source address selection.
| | | | | |     [IPV4]: Just increment OutDatagrams once per a datagram.
| | | | | |     [IPV6]: Just increment OutDatagrams once per a datagram.
| | | | | |     [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.
| | | | | |     [NET_SCHED] protect action config/dump from irqs
| | | | | |     [NET]: Fix two issues wrt. SO_BINDTODEVICE.
| * | | | | [VLAN]: Fix net_device leak. (Al Viro, 2007-09-16, 1 file, -2/+0)
| | | | | |
| | | | | |   In "[VLAN]: Move device registation to seperate function" (commit e89fe42cd03c8fd3686df82d8390a235717a66de), a pile of code got moved to register_vlan_dev(), including grabbing a reference to underlying device. However, original dev_hold() had been left behind, so we leak a reference to net_device now...
| | | | | |
| | | | | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PPP] generic: Fix receive path data clobbering & non-linear handling (Herbert Xu, 2007-09-16, 1 file, -19/+25)
| | | | | |
| | | | | |   This patch adds missing pskb_may_pull calls to deal with non-linear packets that may arrive from pppoe or pppol2tp.
| | | | | |
| | | | | |   It also copies cloned packets before writing over them.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PPP] generic: Call skb_cow_head before scribbling over skb (Herbert Xu, 2007-09-16, 1 file, -11/+3)
| | | | | |
| | | | | |   It's rude to write over data that other people are still using. So call skb_cow_head before PPP proceeds to modify the skb data.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [NET] skbuff: Add skb_cow_head (Herbert Xu, 2007-09-16, 3 files, -11/+33)
| | | | | |
| | | | | |   This patch adds an optimised version of skb_cow that avoids the copy if the header can be modified even if the rest of the payload is cloned.
| | | | | |
| | | | | |   This can be used in encapsulating paths where we only need to modify the header. As it is, this can be used in PPPOE and bridging.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [BRIDGE]: Kill clone argument to br_flood_* (Herbert Xu, 2007-09-16, 4 files, -50/+31)
| | | | | |
| | | | | |   The clone argument is only used by one caller and that caller can clone the packet itself. This patch moves the clone call into the caller and kills the clone argument.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PPP] pppoe: Fill in header directly in __pppoe_xmit (Herbert Xu, 2007-09-16, 1 file, -11/+9)
| | | | | |
| | | | | |   This patch removes the hdr variable (which is copied into the skb) and instead sets the header directly in the skb.
| | | | | |
| | | | | |   It also uses __skb_push instead of skb_push since we've just checked using skb_cow for enough head room.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PPP] pppoe: Fix data clobbering in __pppoe_xmit and return value (Herbert Xu, 2007-09-16, 1 file, -37/+13)
| | | | | |
| | | | | |   The function __pppoe_xmit modifies the skb data and therefore it needs to copy the skb data if it's cloned.
| | | | | |
| | | | | |   In fact, it currently allocates a new skb so that it can return 0 in case of error without freeing the original skb. This is totally wrong because returning zero is meant to indicate congestion whereupon pppoe is supposed to wake up the upper layer once the congestion subsides.
| | | | | |
| | | | | |   This makes sense for ppp_async and ppp_sync but is out-of-place for pppoe.
| | | | | |
| | | | | |   This patch makes it always return 1 and free the skb.
| | | | | |
| | | | | |   Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PPP] pppoe: Fix skb_unshare_check call position  Herbert Xu  2007-09-16  1  -3/+3
| | | | | | The skb_unshare_check call needs to be made before pskb_may_pull, not after.
| | | | | |
| | | | | | Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [SCTP]: Convert bind_addr_list locking to RCU  Vlad Yasevich  2007-09-16  8  -163/+106
| | | | | | Since the sctp_sockaddr_entry is now RCU enabled as part of the patch to synchronize sctp_localaddr_list, it makes sense to change all handling of these entries to RCU. This includes the sctp_bind_addrs structure and its list of bound addresses.
| | | | | |
| | | | | | This list is currently protected by an external rw_lock and that looks like overkill. There are only 2 writers to the list: bind()/bindx() calls, and BH processing of ASCONF-ACK chunks. These are already serialized via the socket lock, so they will not step on each other. These are also relatively rare, so we should be good with RCU.
| | | | | |
| | | | | | The readers are varied and they are easily converted to RCU.
| | | | | |
| | | | | | Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
| | | | | | Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | | | | | Acked-by: Sridhar Samdurala <sri@us.ibm.com>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
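The split between serialized writers and lock-free readers described above can be sketched in userspace with C11 atomics. This models only the publication half of RCU: it leaves out removal and the deferred freeing that the kernel handles with call_rcu() or synchronize_rcu(). `toy_entry`, `toy_head`, `toy_add` and `toy_contains` are hypothetical names, not SCTP code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; not the kernel's sctp_sockaddr_entry. */
struct toy_entry {
    int addr;
    struct toy_entry *_Atomic next;
};

struct toy_entry *_Atomic toy_head; /* list head, starts out NULL */

/*
 * Writer side. The caller is assumed to hold whatever lock serializes
 * writers (the socket lock in the message above), so two adds never race.
 */
void toy_add(struct toy_entry *e, int addr)
{
    assert(e != NULL);
    e->addr = addr;
    /* The node is not visible yet, so plain initialization is enough. */
    atomic_init(&e->next,
                atomic_load_explicit(&toy_head, memory_order_relaxed));
    /* Release store: readers that see e also see its fields. */
    atomic_store_explicit(&toy_head, e, memory_order_release);
}

/* Reader side: takes no lock, only acquire loads of the published pointers. */
int toy_contains(int addr)
{
    struct toy_entry *e =
        atomic_load_explicit(&toy_head, memory_order_acquire);

    while (e != NULL) {
        if (e->addr == addr)
            return 1;
        e = atomic_load_explicit(&e->next, memory_order_acquire);
    }
    return 0;
}
```

The design point is that readers pay nothing for synchronization, which suits a list that is read often and, as the message notes, written rarely.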
| * | | | | [SCTP]: Add RCU synchronization around sctp_localaddr_list  Vlad Yasevich  2007-09-16  6  -38/+97
| | | | | | sctp_localaddr_list is modified dynamically via NETDEV_UP and NETDEV_DOWN events, but there is no synchronization between the writer (event handler) and readers. As a result, the readers can access an entry that has been freed and crash the system.
| | | | | |
| | | | | | Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
| | | | | | Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| | | | | | Acked-by: Sridhar Samdurala <sri@us.ibm.com>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PKT_SCHED]: sch_cbq.c: Shut up uninitialized variable warning  Satyam Sharma  2007-09-16  1  -1/+1
| | | | | | net/sched/sch_cbq.c: In function 'cbq_enqueue':
| | | | | | net/sched/sch_cbq.c:383: warning: 'ret' may be used uninitialized in this function
| | | | | |
| | | | | | has been verified to be a bogus case. So let's shut it up.
| | | | | |
| | | | | | Signed-off-by: Satyam Sharma <satyam@infradead.org>
| | | | | | Acked-by: Patrick McHardy <kaber@trash.net>
| | | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [PKTGEN]: srcmac fix  Adit Ranadive  2007-09-16  1  -0/+10
| | | | | | From: Adit Ranadive <adit.262@gmail.com>
| | | | | |
| | | | | | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [IPV6]: Fix source address selection.  Jiri Kosina  2007-09-16  1  -1/+1
| | | | | | The commit 95c385 broke proper source address selection for cases in which there is an address which is marked 'deprecated'. The commit mistakenly changed ifa->flags to ifa_result->flags (probably a copy/paste error from a few lines above) in the 'Rule 3' address selection code.
| | | | | |
| | | | | | The patch restores the previous RFC-compliant behavior.
| | | | | |
| | | | | | Signed-off-by: Jiri Kosina <jkosina@suse.cz>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
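Why it matters which structure's flags are tested can be seen in a simplified selection loop. This is a userspace sketch of "avoid deprecated addresses" only, not the kernel's selection code: `toy_addr`, `toy_select` and `TOY_F_DEPRECATED` are hypothetical, and the real code scores candidates on several other rules as well.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_F_DEPRECATED 0x1u /* hypothetical flag value */

/* Hypothetical address record. */
struct toy_addr {
    unsigned int flags;
    int id;
};

/*
 * Picks the first non-deprecated address, falling back to the first
 * address when every candidate is deprecated. Returns NULL for an
 * empty list.
 */
const struct toy_addr *toy_select(const struct toy_addr *list, size_t n)
{
    const struct toy_addr *result = NULL;

    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct toy_addr *ifa = &list[i];

        if (result == NULL) {
            result = ifa;
            continue;
        }
        /*
         * The candidate's own flags (ifa) decide whether it is an
         * improvement. Testing result->flags in the second condition
         * would make that condition always false here, so a preferred
         * candidate could never replace a deprecated result.
         */
        if ((result->flags & TOY_F_DEPRECATED) &&
            !(ifa->flags & TOY_F_DEPRECATED))
            result = ifa;
    }
    return result;
}
```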
| * | | | | [IPV4]: Just increment OutDatagrams once per a datagram.  YOSHIFUJI Hideaki  2007-09-14  1  -3/+3
| | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [IPV6]: Just increment OutDatagrams once per a datagram.  YOSHIFUJI Hideaki  2007-09-14  1  -3/+3
| | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.  YOSHIFUJI Hideaki  2007-09-14  1  -2/+1
| | | | | | Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | [NET_SCHED] protect action config/dump from irqs  Jamal Hadi Salim  2007-09-14  2  -6/+6
| | | | | | (with no apologies to C Heston)
| | | | | |
| | | | | | On Mon, 2007-10-09 at 21:00 +0800, Herbert Xu wrote:
| | | | | | On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote:
| | | | | | > >
| | | | | | > > after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep
| | | | | | > > was quite noisy when I tried to shape my external (wireless) interface:
| | | | | | > >
| | | | | | > > [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
| | | | | | > > [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>]
| | | | | | > > netif_receive_skb+0x2d5/0x3c0
| | | | | | > > [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the
| | | | | | > > past:
| | | | | | > > [ 6400.535145] (police_lock){-.--}
| | | | | | >
| | | | | | > This is a genuine dead-lock. The police lock can be taken
| | | | | | > for reading with softirqs on. If a second CPU tries to take
| | | | | | > the police lock for writing, while holding the ingress lock,
| | | | | | > then a softirq on the first CPU can dead-lock when it tries
| | | | | | > to get the ingress lock.
| | | | | |
| | | | | | Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
| | | | | | Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
| | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>