aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* x86: fix NULL pointer deref in __switch_toSuresh Siddha2008-06-201-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am able to reproduce the oops reported by Simon in __switch_to() with lguest. My debug showed that there is at least one lguest specific issue (which should be present in 2.6.25 and before aswell) and it got exposed with a kernel oops with the recent fpu dynamic allocation patches. In addition to the previous possible scenario (with fpu_counter), in the presence of lguest, it is possible that the cpu's TS bit it still set and the lguest launcher task's thread_info has TS_USEDFPU still set. This is because of the way the lguest launcher handling the guest's TS bit. (look at lguest_set_ts() in lguest_arch_run_guest()). This can result in a DNA fault while doing unlazy_fpu() in __switch_to(). This will end up causing a DNA fault in the context of new process thats getting context switched in (as opossed to handling DNA fault in the context of lguest launcher/helper process). This is wrong in both pre and post 2.6.25 kernels. In the recent 2.6.26-rc series, this is showing up as NULL pointer dereferences or sleeping function called from atomic context(__switch_to()), as we free and dynamically allocate the FPU context for the newly created threads. Older kernels might show some FPU corruption for processes running inside of lguest. With the appended patch, my test system is running for more than 50 mins now. So atleast some of your oops (hopefully all!) should get fixed. Please give it a try. I will spend more time with this fix tomorrow. Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk> Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86, geode: add a VSA2 ID for General SoftwareJordan Crouse2008-06-192-3/+6
| | | | | | | | | | | | General Software writes their own VSA2 module for their version of the Geode BIOS, which returns a different ID then the standard VSA2. This was causing the framebuffer driver to break for most GSW boards. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Cc: tglx@linutronix.de Cc: linux-geode@lists.infradead.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: use BOOTMEM_EXCLUSIVE on 32-bitBernhard Walle2008-06-191-2/+8
| | | | | | | | | | | | This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for i386 and prints a error message on failure. The patch is still for 2.6.26 since it is only bug fixing. The unification of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
* x86, 32-bit: fix boot failure on TSC-less processorsMikael Pettersson2008-06-191-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6" (invalid opcode) and a kernel halt immediately after the kernel has been uncompressed. The BUG shows EIP pointing to an rdtsc instruction in native_read_tsc(), invoked from native_sched_clock(). (This error occurs so early that not even the serial console can capture it.) A bisection showed that this bug first occurs in 2.6.26-rc3-git7, via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910: >x86: distangle user disabled TSC from unstable > >tsc_enabled is set to 0 from the command line switch "notsc" and from >the mark_tsc_unstable code. Seperate those functionalities and replace >tsc_enable with tsc_disable. This makes also the native_sched_clock() >decision when to use TSC understandable. > >Preparatory patch to solve the sched_clock() issue on 32 bit. > >Signed-off-by: Thomas Gleixner <tglx@linutronix.de> The core reason for this bug is that native_sched_clock() gets called before tsc_init(). Before the commit above, tsc_32.c used a "tsc_enabled" variable which defaulted to 0 == disabled, and which only got enabled late in tsc_init(). Thus early calls to native_sched_clock() would skip the TSC and use jiffies instead. After the commit above, tsc_32.c uses a "tsc_disabled" variable which defaults to 0, meaning that the TSC is Ok to use. Early calls to native_sched_clock() now erroneously try to use the TSC on !cpu_has_tsc processors, leading to invalid opcode exceptions. My proposed fix is to initialise tsc_disabled to a "soft disabled" state distinct from the hard disabled state set up by the "notsc" kernel option. This fixes the native_sched_clock() problem. It also allows tsc_init() to be simplified: instead of setting tsc_disabled = 1 on every error return, we just set tsc_disabled = 0 once when all checks have succeeded. I've verified that this lets my 486 boot again. I've also verified that a Core2 machine still uses the TSC as clocksource after the patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: fix NULL pointer deref in __switch_toSuresh Siddha2008-06-192-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118 > > [62060.043009] *pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... /* * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... } Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.Jeremy Fitzhardinge2008-06-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can potentially have the same number of physical address bits as the 64-bit host ("Enhanced Legacy PAE Paging"). This means, in theory, we could have up to 52 bits of physical address in a pte. The 32-bit kernel uses a 32-bit unsigned long to represent a pfn. This means that it can only represent physical addresses up to 32+12=44 bits wide. Rather than widening pfns everywhere, just set 2^44 as the Linux x86_32-PAE architectural limit for physical address size. This is a bugfix for two cases: 1. running a 32-bit PAE kernel on a machine with more than 64GB RAM. 2. running a 32-bit PAE Xen guest on a host machine with more than 64GB RAM In both cases, a pte could need to have more than 36 bits of physical, and masking it to 36-bits will cause fairly severe havoc. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* agp: brown paper bag patch - put back two lines that got lostDave Airlie2008-06-181-0/+2
| | | | | | | | | | | Commit 62c96b9d0917894c164aa3e474a3ff3bca1554ae ("agp/intel: cleanup some serious whitespace badness") didn't just fix whitespace. It also lost two lines. Noticed by Linus. No more whitespace diffs for me. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'agp-patches' of ↵Linus Torvalds2008-06-1821-168/+235
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp/intel: cleanup some serious whitespace badness [AGP] intel_agp: Add support for Intel 4 series chipsets [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset agp: more boolean conversions. drivers/char/agp - use bool agp: two-stage page destruction issue agp/via: fixup pci ids
| * agp/intel: cleanup some serious whitespace badnessDave Airlie2008-06-191-68/+66
| | | | | | | | Signed-off-by: Dave Airlie <airlied@redhat.com>
| * [AGP] intel_agp: Add support for Intel 4 series chipsetsZhenyu Wang2008-06-191-10/+73
| | | | | | | | | | Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * [AGP] intel_agp: extra stolen mem size available for IGD_GM chipsetZhenyu Wang2008-06-191-2/+2
| | | | | | | | | | | | | | | | | | This adds missing stolen memory size detect for IGD_GM, be sure to detect right size as current X intel driver (2.3.2) which has already worked out. Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * agp: more boolean conversions.Dave Airlie2008-06-196-15/+15
| | | | | | | | Signed-off-by: Dave Airlie <airlied@redhat.com>
| * drivers/char/agp - use boolJoe Perches2008-06-1916-63/+55
| | | | | | | | | | | | | | | | Use boolean in AGP instead of having own TRUE/FALSE -- Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * agp: two-stage page destruction issueJan Beulich2008-06-193-12/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | besides it apparently being useful only in 2.6.24 (the changes in 2.6.25 really mean that it could be converted back to a single-stage mechanism), I'm seeing an issue in Xen Dom0 kernels, which is caused by the calling of gart_to_virt() in the second stage invocations of the destroy function. I think that besides this being a real issue with Xen (where unmap_page_from_agp() is not just a page table attribute change), this also is invalid from a theoretical perspective: One should not assume that gart_to_virt() is still valid after unmapping a page. So minimally (keeping the 2-stage mechanism) a patch like the one below would be needed. Jan Signed-off-by: Dave Airlie <airlied@redhat.com>
| * agp/via: fixup pci idsGreg KH2008-06-191-2/+11
| | | | | | | | | | | | | | | | | | add a new PCI ID and remove an old dodgy one, include the explaination in the commented code so nobody readds later. (davej also sent the pci id addition). Signed-off-by: Dave Airlie <airlied@redhat.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2008-06-182-4/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler() RDMA/nes: Fix off-by-one in nes_reg_user_mr() error path
| * | IB/uverbs: Fix check of is_closed flag check in ib_uverbs_async_handler()Jack Morgenstein2008-06-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1ae5c187 ("IB/uverbs: Don't store struct file * for event files") changed the way that closed files are handled in the uverbs code. However, after the conversion, is_closed flag is checked incorrectly in ib_uverbs_async_handler(). As a result, no async events are ever passed to applications. Found by: Ronni Zimmerman <ronniz@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | RDMA/nes: Fix off-by-one in nes_reg_user_mr() error pathRoland Dreier2008-06-101-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nes_reg_user_mr() should fail if page_count becomes >= 1024 * 512 rather than just testing for strict >, because page_count is essentially used as an index into an array with 1024 * 512 entries, so allowing the loop to continue with page_count == 1024 * 512 means that memory after the end of the array is corrupted. This leads to a crash triggerable by a userspace application that requests registration of a too-big region. Also get rid of the call to pci_free_consistent() here to avoid corrupting state with a double free, since the same memory will be freed in the code jumped to at reg_user_mr_err. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds2008-06-182-16/+20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: Revert "[WATCHDOG] hpwdt: Fix NMI handling." [WATCHDOG] hpwdt: Add CFLAGS to get driver working Revert "[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static"
| * | | Revert "[WATCHDOG] hpwdt: Fix NMI handling."Wim Van Sebroeck2008-06-181-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The old setup works better. Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * | | [WATCHDOG] hpwdt: Add CFLAGS to get driver workingThomas Mingarelli2008-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To get this driver working we need the CFLAGS_hpwdt.o += -O in the Makefile. Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
| * | | Revert "[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static"Thomas Mingarelli2008-06-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver needs the asmlinkage tag and the CFLAGS line in the Makefile. Without it the driver doesn't work. Signed-off-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds2008-06-183-3/+12
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] dpt_i2o: Add PROC_IA64 define [SCSI] scsi_host regression: fix scsi host leak [SCSI] sr: fix corrupt CD data after media change and delay
| * | | | [SCSI] dpt_i2o: Add PROC_IA64 defineJeff Mahoney2008-06-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following compile failure: drivers/scsi/dpt_i2o.c:83: error: 'PROC_IA64' undeclared here (not in a function) Mark Salyzyn <Mark_Salyzyn@adaptec.com> indicated that IA64 must report itself as PROC_INTEL, so I've changed the comment for PROC_INTEL. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Mark Salyzyn <Mark_Salyzyn@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | | [SCSI] scsi_host regression: fix scsi host leakMike Christie2008-06-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9c7701088a61cc0cf8a6e1c68d1e74e3cc2ee0b7 Author: Dave Young <hidave.darkstar@gmail.com> Date: Tue Jan 22 14:01:34 2008 +0800 scsi: use class iteration api Isn't a correct replacement for the original hand rolled host lookup. The problem is that class_find_child would get a reference to the host's class device which is never released. Since the host class device holds a reference to the host gendev, the host can never be freed. In 2.6.26 we started using class_find_device, and this function also gets a reference to the device, so we end up with an extra ref and the host will not get released. This patch adds a put_device to balance the class_find_device() get. I kept the scsi_host_get in scsi_host_lookup, because the target layer is using scsi_host_lookup and it looks like it needs the SHOST_DEL check. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * | | | [SCSI] sr: fix corrupt CD data after media change and delayJames Bottomley2008-06-101-0/+3
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> If you delay 30s or more before mounting a CD after inserting it then the kernel has the wrong value for the CD size. http://marc.info/?t=121276133000001 The problem is in sr_test_unit_ready(): the function eats unit attentions without adjusting the sdev->changed status. This means that when the CD signals changed media via unit attention, we can ignore it. Fix by making sr_test_unit_ready() adjust the changed status. Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* | | | Merge branch 'merge' of ↵Linus Torvalds2008-06-182-1/+10
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Clear sub-page HPTE present bits when demoting page size [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector
| * | | | [POWERPC] Clear sub-page HPTE present bits when demoting page sizePaul Mackerras2008-06-181-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we demote a slice from 64k to 4k, and we are about to insert an HPTE for a 4k subpage and we notice that there is an existing 64k HPTE, we first invalidate that HPTE before inserting the new 4k subpage HPTE. Since the bits that encode which hash bucket the old HPTE was in overlap with the bits that encode which of the 16 subpages have HPTEs, we need to clear out the subpage HPTE-present bits before starting to insert HPTEs for the 4k subpages. If we don't do that, we can erroneously think that a subpage already has an HPTE when it doesn't. That in itself wouldn't be such a problem except that when we go to update the HPTE that we think is present on machines with a hypervisor, the hypervisor can tell us that the HPTE we think is there is actually there even though it isn't, which can lead to a process getting stuck in a loop, continually faulting. The reason for the confusion is that the AVPN (abbreviated virtual page number) we are looking for in the HPTE for a 4k subpage can actually match the AVPN in a stale HPTE for another 64k page. For example, the HPTE for the 4k subpage at 0x84000f000 will be in the same hash bucket and have the same AVPN as the HPTE for the 64k page at 0x8400f0000. This fixes the code to clear out the subpage HPTE-present bits. Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | [POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vectorJosh Boyer2008-06-181-1/+6
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent commit added support for the new 440x6 and 464 cores that have the added WL1, IL1I, IL1D, IL2I, and ILD2 bits for the caching attributes in the TLBs. The new bits were cleared in the finish_tlb_load function, however a similar bit of code was missed in the DataStorage interrupt vector. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | | Merge branch 'for_linus' of ↵Linus Torvalds2008-06-181-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: restore UDFFS_DEBUG to being undefined by default
| * | | | udf: restore UDFFS_DEBUG to being undefined by defaultPaul Collins2008-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 706047a79725b585cf272fdefc234b31b6545c72, "udf: Fix compilation warnings when UDF debug is on" inadvertently (I assume) enabled debugging messages by default for UDF. This patch disables them again. Signed-off-by: Paul Collins <paul@ondioline.org> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-06-1843-437/+544
|\ \ \ \ \ | |_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits) netlink: genl: fix circular locking Revert "mac80211: Use skb_header_cloned() on TX path." af_unix: fix 'poll for write'/ connected DGRAM sockets tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set atl1: relax eeprom mac address error check net/enc28j60: low power mode net/enc28j60: section fix sky2: 88E8040T pci device id netxen: download firmware in pci probe netxen: cleanup debug messages netxen: remove global physical_port array netxen: fix portnum for hp mezz cards ibm_newemac: select CRC32 in Kconfig xfrm: fix fragmentation for ipv4 xfrm tunnel netfilter: nf_conntrack_h323: fix module unload crash netfilter: nf_conntrack_h323: fix memory leak in module initialization error path netfilter: nf_nat: fix RCU races atm: [he] send idle cells instead of unassigned when in SDH mode atm: [he] limit queries to the device's register space atm: [br2864] fix routed vcmux support ...
| * | | | netlink: genl: fix circular lockingPatrick McHardy2008-06-181-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | genetlink has a circular locking dependency when dumping the registered families: - dump start: genl_rcv() : take genl_mutex genl_rcv_msg() : call netlink_dump_start() while holding genl_mutex netlink_dump_start(), netlink_dump() : take nlk->cb_mutex ctrl_dumpfamily() : try to detect this case and not take genl_mutex a second time - dump continuance: netlink_rcv() : call netlink_dump netlink_dump : take nlk->cb_mutex ctrl_dumpfamily() : take genl_mutex Register genl_lock as callback mutex with netlink to fix this. This slightly widens an already existing module unload race, the genl ops used during the dump might go away when the module is unloaded. Thomas Graf is working on a seperate fix for this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Revert "mac80211: Use skb_header_cloned() on TX path."David S. Miller2008-06-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 608961a5eca8d3c6bd07172febc27b5559408c5d. The problem is that the mac80211 stack not only needs to be able to muck with the link-level headers, it also might need to mangle all of the packet data if doing sw wireless encryption. This fixes kernel bugzilla #10903. Thanks to Didier Raboud (for the bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for bringing this bisection analysis to my attention), and Ilpo (for trying to analyze this purely from the TCP side). In 2.6.27 we can take another stab at this, by using something like skb_cow_data() when the TX path of mac80211 ends up with a non-NULL tx->key. The ESP protocol code in the IPSEC stack can be used as a model for implementation. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | af_unix: fix 'poll for write'/ connected DGRAM socketsRainer Weikusat2008-06-171-9/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The unix_dgram_sendmsg routine implements a (somewhat crude) form of receiver-imposed flow control by comparing the length of the receive queue of the 'peer socket' with the max_ack_backlog value stored in the corresponding sock structure, either blocking the thread which caused the send-routine to be called or returning EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET sockets. The poll-implementation for these socket types is datagram_poll from core/datagram.c. A socket is deemed to be writeable by this routine when the memory presently consumed by datagrams owned by it is less than the configured socket send buffer size. This is always wrong for connected PF_UNIX non-stream sockets when the abovementioned receive queue is currently considered to be full. 'poll' will then return, indicating that the socket is writeable, but a subsequent write result in EAGAIN, effectively causing an (usual) application to 'poll for writeability by repeated send request with O_NONBLOCK set' until it has consumed its time quantum. The change below uses a suitably modified variant of the datagram_poll routines for both type of PF_UNIX sockets, which tests if the recv-queue of the peer a socket is connected to is presently considered to be 'full' as part of the 'is this socket writeable'-checking code. The socket being polled is additionally put onto the peer_wait wait queue associated with its peer, because the unix_dgram_sendmsg routine does a wake up on this queue after a datagram was received and the 'other wakeup call' is done implicitly as part of skb destruction, meaning, a process blocked in poll because of a full peer receive queue could otherwise sleep forever if no datagram owned by its socket was already sitting on this queue. Among this change is a small (inline) helper routine named 'unix_recvq_full', which consolidates the actual testing code (in three different places) into a single location. Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'davem-fixes' of ↵David S. Miller2008-06-1711-242/+189
| |\ \ \ \ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| | * | | | atl1: relax eeprom mac address error checkRadu Cristescu2008-06-171-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The atl1 driver tries to determine the MAC address thusly: - If an EEPROM exists, read the MAC address from EEPROM and validate it. - If an EEPROM doesn't exist, try to read a MAC address from SPI flash. - If that fails, try to read a MAC address directly from the MAC Station Address register. - If that fails, assign a random MAC address provided by the kernel. We now have a report of a system fitted with an EEPROM containing all zeros where we expect the MAC address to be, and we currently handle this as an error condition. Turns out, on this system the BIOS writes a valid MAC address to the NIC's MAC Station Address register, but we never try to read it because we return an error when we find the all- zeros address in EEPROM. This patch relaxes the error check and continues looking for a MAC address even if it finds an illegal one in EEPROM. Signed-off-by: Radu Cristescu <advantis@gmx.net> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | net/enc28j60: low power modeDavid Brownell2008-06-171-24/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep enc28j60 chips in low-power mode when they're not in use. At typically 120 mA, these chips run hot even when idle; this low power mode cuts that power usage by a factor of around 100. This version provides a generic routine to poll a register until its masked value equals some value ... e.g. bit set or cleared. It's basically what the previous wait_phy_ready() did, but this version is generalized to support the handshaking needed to enter and exit low power mode. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Claudio Lanconelli <lanconelli.claudio@eptar.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | net/enc28j60: section fixDavid Brownell2008-06-171-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor bugfixes to the enc28j60 driver ... wrong section marking, indentation, and bogus use of spi_bus_type. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Claudio Lanconelli <lanconelli.claudio@eptar.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | sky2: 88E8040T pci device idStephen Hemminger2008-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Missed one pci id for 88E8040T. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | netxen: download firmware in pci probeDhananjay Phadke2008-06-172-59/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Downloading firmware in pci probe allows recovery in case of firmware failure by reloading the driver. Also reduced delays in firmware load. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | netxen: cleanup debug messagesDhananjay Phadke2008-06-174-132/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | o Remove unnecessary debug prints and functions. o Explicitly specify pci class (0x020000) to avoid enabling management function. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | netxen: remove global physical_port arrayDhananjay Phadke2008-06-176-24/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Store physical port number in netxen_adapter structure. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | netxen: fix portnum for hp mezz cardsDhananjay Phadke2008-06-171-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a the issue where logical port number is set incorrectly for HP blade mezz cards. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| | * | | | ibm_newemac: select CRC32 in KconfigJosh Boyer2008-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ibm_newemac driver requires ether_crc to be defined. Apparently it is possible to generate a .config without CONFIG_CRC32 set which causes the following link errors if IBM_NEW_EMAC is selected: LD .tmp_vmlinux1 drivers/built-in.o: In function `emac_hash_mc': core.c:(.text+0x2f524): undefined reference to `crc32_le' core.c:(.text+0x2f528): undefined reference to `bitrev32' make: *** [.tmp_vmlinux1] Error 1 This patch has IBM_NEW_EMAC select CRC32 so we don't hit this error. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * | | | | tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is setAng Way Chuang2008-06-171-0/+15
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, tun.c running in TUN_TUN_DEV mode will set the protocol of packet to IPv4 if TUN_NO_PI is set. My program failed to work when I assumed that the driver will check the first nibble of packet, determine IP version and set the appropriate protocol. Signed-off-by: Ang Way Chuang <wcang@nav6.org> Acked-by: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | xfrm: fix fragmentation for ipv4 xfrm tunnelSteffen Klassert2008-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When generating the ip header for the transformed packet we just copy the frag_off field of the ip header from the original packet to the ip header of the new generated packet. If we receive a packet as a chain of fragments, all but the last of the new generated packets have the IP_MF flag set. We have to mask the frag_off field to only keep the IP_DF flag from the original packet. This got lost with git commit 36cf9acf93e8561d9faec24849e57688a81eb9c5 ("[IPSEC]: Separate inner/outer mode processing on output") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netfilter: nf_conntrack_h323: fix module unload crashPatrick McHardy2008-06-171-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The H.245 helper is not registered/unregistered, but assigned to connections manually from the Q.931 helper. This means on unload existing expectations and connections using the helper are not cleaned up, leading to the following oops on module unload: CPU 0 Unable to handle kernel paging request at virtual address c00a6828, epc == 802224dc, ra == 801d4e7c Oops[#1]: Cpu 0 $ 0 : 00000000 00000000 00000004 c00a67f0 $ 4 : 802a5ad0 81657e00 00000000 00000000 $ 8 : 00000008 801461c8 00000000 80570050 $12 : 819b0280 819b04b0 00000006 00000000 $16 : 802a5a60 80000000 80b46000 80321010 $20 : 00000000 00000004 802a5ad0 00000001 $24 : 00000000 802257a8 $28 : 802a4000 802a59e8 00000004 801d4e7c Hi : 0000000b Lo : 00506320 epc : 802224dc ip_conntrack_help+0x38/0x74 Tainted: P ra : 801d4e7c nf_iterate+0xbc/0x130 Status: 1000f403 KERNEL EXL IE Cause : 00800008 BadVA : c00a6828 PrId : 00019374 Modules linked in: ip_nat_pptp ip_conntrack_pptp ath_pktlog wlan_acl wlan_wep wlan_tkip wlan_ccmp wlan_xauth ath_pci ath_dev ath_dfs ath_rate_atheros wlan ath_hal ip_nat_tftp ip_conntrack_tftp ip_nat_ftp ip_conntrack_ftp pppoe ppp_async ppp_deflate ppp_mppe pppox ppp_generic slhc Process swapper (pid: 0, threadinfo=802a4000, task=802a6000) Stack : 801e7d98 00000004 802a5a60 80000000 801d4e7c 801d4e7c 802a5ad0 00000004 00000000 00000000 801e7d98 00000000 00000004 802a5ad0 00000000 00000010 801e7d98 80b46000 802a5a60 80320000 80000000 801d4f8c 802a5b00 00000002 80063834 00000000 80b46000 802a5a60 801e7d98 80000000 802ba854 00000000 81a02180 80b7e260 81a021b0 819b0000 819b0000 80570056 00000000 00000001 ... Call Trace: [<801e7d98>] ip_finish_output+0x0/0x23c [<801d4e7c>] nf_iterate+0xbc/0x130 [<801d4e7c>] nf_iterate+0xbc/0x130 [<801e7d98>] ip_finish_output+0x0/0x23c [<801e7d98>] ip_finish_output+0x0/0x23c [<801d4f8c>] nf_hook_slow+0x9c/0x1a4 One way to fix this would be to split helper cleanup from the unregistration function and invoke it for the H.245 helper, but since ctnetlink needs to be able to find the helper for synchonization purposes, a better fix is to register it normally and make sure its not assigned to connections during helper lookup. The missing l3num initialization is enough for this, this patch changes it to use AF_UNSPEC to make it more explicit though. Reported-by: liannan <liannan@twsz.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netfilter: nf_conntrack_h323: fix memory leak in module initialization error ↵Patrick McHardy2008-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | path Properly free h323_buffer when helper registration fails. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netfilter: nf_nat: fix RCU racesPatrick McHardy2008-06-173-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix three ct_extend/NAT extension related races: - When cleaning up the extension area and removing it from the bysource hash, the nat->ct pointer must not be set to NULL since it may still be used in a RCU read side - When replacing a NAT extension area in the bysource hash, the nat->ct pointer must be assigned before performing the replacement - When reallocating extension storage in ct_extend, the old memory must not be freed immediately since it may still be used by a RCU read side Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315 and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>