aboutsummaryrefslogtreecommitdiffstats
path: root/kernel
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfdLinus Torvalds2011-05-252-0/+81
|\ | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd: net: fix get_net_ns_by_fd for !CONFIG_NET_NS ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry. ns: Declare sys_setns in syscalls.h net: Allow setting the network namespace by fd ns proc: Add support for the ipc namespace ns proc: Add support for the uts namespace ns proc: Add support for the network namespace. ns: Introduce the setns syscall ns: proc files for namespace naming policy.
| * ns proc: Add support for the uts namespaceEric W. Biederman2011-05-101-0/+39
| | | | | | | | | | Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| * ns: Introduce the setns syscallEric W. Biederman2011-05-101-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the networking stack today there is demand to handle multiple network stacks at a time. Not in the context of containers but in the context of people doing interesting things with routing. There is also demand in the context of containers to have an efficient way to execute some code in the container itself. If nothing else it is very useful ad a debugging technique. Both problems can be solved by starting some form of login daemon in the namespaces people want access to, or you can play games by ptracing a process and getting the traced process to do things you want it to do. However it turns out that a login daemon or a ptrace puppet controller are more code, they are more prone to failure, and generally they are less efficient than simply changing the namespace of a process to a specified one. Pieces of this puzzle can also be solved by instead of coming up with a general purpose system call coming up with targed system calls perhaps socketat that solve a subset of the larger problem. Overall that appears to be more work for less reward. int setns(int fd, int nstype); The fd argument is a file descriptor referring to a proc file of the namespace you want to switch the process to. In the setns system call the nstype is 0 or specifies an clone flag of the namespace you intend to change to prevent changing a namespace unintentionally. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: Cleaned up the system call arguments - Changed the order. - Modified nstype to take the standard clone flags. v6: Added missing error handling as pointed out by Matt Helsley <matthltc@us.ibm.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | Merge branch 'for-2.6.40' of ↵Linus Torvalds2011-05-252-3/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: signal: sys_pause() should check signal_pending() ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread
| * | signal: sys_pause() should check signal_pending()Oleg Nesterov2011-05-251-2/+4
| | | | | | | | | | | | | | | | | | | | | ERESTART* is always wrong without TIF_SIGPENDING. Teach sys_pause() to handle the spurious wakeup correctly. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
| * | ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED threadOleg Nesterov2011-05-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not clear why ptrace_resume() does wake_up_process(). Unless the caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do not need the extra and potentionally spurious wakeup. If the caller is PTRACE_KILL, wake_up_process() is even more wrong. The tracee can sleep in any state in any place, and if we have a buggy code which doesn't handle a spurious wakeup correctly PTRACE_KILL can be used to exploit it. For example: int main(void) { int child, status; child = fork(); if (!child) { int ret; assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); ret = pause(); printf("pause: %d %m\n", ret); return 0x23; } sleep(1); assert(ptrace(PTRACE_KILL, child, 0,0) == 0); assert(child == wait(&status)); printf("wait: %x\n", status); return 0; } prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the userland. In this case sys_pause() is buggy as well and should be fixed. I do not know what was the original rationality behind PTRACE_KILL. The man page is simply wrong and afaics it was always wrong. Imho it should be deprecated, or may be it should do send_sig(SIGKILL) as Denys suggests, but in any case I do not think that the current behaviour was intentional. Note: there is another problem, ptrace_resume() changes ->exit_code and this can race with SIGKILL too. Eventually we should change ptrace to not use ->exit_code. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds2011-05-252-1/+9
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (26 commits) arch/tile: prefer "tilepro" as the name of the 32-bit architecture compat: include aio_abi.h for aio_context_t arch/tile: cleanups for tilegx compat mode arch/tile: allocate PCI IRQs later in boot arch/tile: support signal "exception-trace" hook arch/tile: use better definitions of xchg() and cmpxchg() include/linux/compat.h: coding-style fixes tile: add an RTC driver for the Tilera hypervisor arch/tile: finish enabling support for TILE-Gx 64-bit chip compat: fixes to allow working with tile arch arch/tile: update defconfig file to something more useful tile: do_hardwall_trap: do not play with task->sighand tile: replace mm->cpu_vm_mask with mm_cpumask() tile,mn10300: add device parameter to dma_cache_sync() audit: support the "standard" <asm-generic/unistd.h> arch/tile: clarify flush_buffer()/finv_buffer() function names arch/tile: kernel-related cleanups from removing static page size arch/tile: various header improvements for building drivers arch/tile: disable GX prefetcher during cache flush arch/tile: tolerate disabling CONFIG_BLK_DEV_INITRD ...
| * | | arch/tile: support signal "exception-trace" hookChris Metcalf2011-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds support for /proc/sys/debug/exception-trace to tile. Like x86 and sparc, by default it is set to "1", generating a one-line printk whenever a user process crashes. By setting it to "2", we get a much more complete userspace diagnostic at crash time, including a user-space backtrace, register dump, and memory dump around the address of the crash. Some vestiges of the Tilera-internal version of this support are removed with this patch (the show_crashinfo variable and the arch_coredump_signal function). We retain a "crashinfo" boot parameter which allows you to set the boot-time value of exception-trace. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
| * | | compat: fixes to allow working with tile archChris Metcalf2011-05-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing <asm-generic/unistd.h> mechanism doesn't really provide enough to create the 64-bit "compat" ABI properly in a generic way, since the compat ABI is a mix of things were you can re-use the 64-bit versions of syscalls and things where you need a compat wrapper. To provide this in the most direct way possible, I added two new macros to go along with the existing __SYSCALL and __SC_3264 macros: __SC_COMP and SC_COMP_3264. These macros take an additional argument, typically a "compat_sys_xxx" function, which is passed to __SYSCALL if you define __SYSCALL_COMPAT when including the header, resulting in a pointer to the compat function being placed in the generated syscall table. The change also adds some missing definitions to <linux/compat.h> so that it actually has declarations for all the compat syscalls, since the "[nr] = ##call" approach requires proper C declarations for all the functions included in the syscall table. Finally, compat.c defines compat_sys_sigpending() and compat_sys_sigprocmask() even if the underlying architecture doesn't request it, which tries to pull in undefined compat_old_sigset_t defines. We need to guard those compat syscall definitions with appropriate __ARCH_WANT_SYS_xxx ifdefs. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
* | | | hrtimers: Fix typo causing erratic timersThomas Gleixner2011-05-251-1/+1
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9ec2690758a5 ("timerfd: Manage cancelable timers in timerfd") introduced a CONFIG_HIGHRES_TIMERS (should be CONFIG_HIGH_RES_TIMERS) typo, which caused applications depending on CLOCK_REALTIME timers to become sluggy due to the fact that the time base of the realtime timers was not updated when the wall clock time was set. This causes anything from 100% CPU use for some applications to odd delays and hickups. Reported-bisected-and-tested-by: Anca Emanuel <anca.emanuel@gmail.com> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Fatfingered-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds2011-05-251-11/+14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix-timers: RCU conversion
| * | | posix-timers: RCU conversionEric Dumazet2011-05-241-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ben Nagy reported a scalability problem with KVM/QEMU that hit very hard a single spinlock (idr_lock) in posix-timers code, on its 48 core machine. Even on a 16 cpu machine (2x4x2), a single test can show 98% of cpu time used in ticket_spin_lock, from lock_timer Ref: http://www.spinics.net/lists/kvm/msg51526.html Switching to RCU is quite easy, IDR being already RCU ready. idr_lock should be locked only for an insert/delete, not a lookup. Benchmark on a 2x4x2 machine, 16 processes calling timer_gettime(). Before : real 1m18.669s user 0m1.346s sys 1m17.180s After : real 0m3.296s user 0m1.366s sys 0m1.926s Reported-by: Ben Nagy <ben@iagu.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Ben Nagy <ben@iagu.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | printk: allocate kernel log buffer earlierMike Travis2011-05-251-29/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On larger systems, because of the numerous ACPI, Bootmem and EFI messages, the static log buffer overflows before the larger one specified by the log_buf_len param is allocated. Minimize the overflow by allocating the new log buffer as soon as possible. On kernels without memblock, a later call to setup_log_buf from kernel/init.c is the fallback. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build] Signed-off-by: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | bitmap, irq: add smp_affinity_list interface to /proc/irqMike Travis2011-05-251-4/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the cpu count is large. Setting smp affinity to cpus 256 to 263 would be: echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity instead of: echo 256-263 > smp_affinity_list Think about what it looks like for cpus around say, 4088 to 4095. We already have many alternate "list" interfaces: /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list /sys/devices/system/cpu/cpuX/topology/thread_siblings_list /sys/devices/system/cpu/cpuX/topology/core_siblings_list /sys/devices/system/node/nodeX/cpulist /sys/devices/pci***/***/local_cpulist Add a companion interface, smp_affinity_list to use cpu lists instead of cpu maps. This conforms to other companion interfaces where both a map and a list interface exists. This required adding a bitmap_parselist_user() function in a manner similar to the bitmap_parse_user() function. [akpm@linux-foundation.org: make __bitmap_parselist() static] Signed-off-by: Mike Travis <travis@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jack Steiner <steiner@sgi.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: convert mm->cpu_vm_cpumask into cpumask_var_tKOSAKI Motohiro2011-05-251-3/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cpumask_t is very big struct and cpu_vm_mask is placed wrong position. It might lead to reduce cache hit ratio. This patch has two change. 1) Move the place of cpumask into last of mm_struct. Because usually cpumask is accessed only front bits when the system has cpu-hotplug capability 2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory footprint if cpumask_size() will use nr_cpumask_bits properly in future. In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var. It may help to detect out of tree cpu_vm_mask users. This patch has no functional change. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: Convert i_mmap_lock to a mutexPeter Zijlstra2011-05-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Straightforward conversion of i_mmap_lock to a mutex. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Hugh Dickins <hughd@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: Remove i_mmap_lock lockbreakPeter Zijlstra2011-05-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hugh says: "The only significant loser, I think, would be page reclaim (when concurrent with truncation): could spin for a long time waiting for the i_mmap_mutex it expects would soon be dropped? " Counter points: - cpu contention makes the spin stop (need_resched()) - zap pages should be freeing pages at a higher rate than reclaim ever can I think the simplification of the truncate code is definitely worth it. Effectively reverts: 2aa15890f3c ("mm: prevent concurrent unmap_mapping_range() on the same inode") and takes out the code that caused its problem. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | lockdep, mutex: provide mutex_lock_nest_lockPeter Zijlstra2011-05-251-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to convert i_mmap_lock to a mutex we need a mutex equivalent to spin_lock_nest_lock(), thus provide the mutex_lock_nest_lock() annotation. As with spin_lock_nest_lock(), mutex_lock_nest_lock() allows annotation of the locking pattern where an outer lock serializes the acquisition order of nested locks. That is, if every time you lock multiple locks A, say A1 and A2 you first acquire N, the order of acquiring A1 and A2 is irrelevant. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-05-244-7/+109
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (43 commits) TOMOYO: Fix wrong domainname validation. SELINUX: add /sys/fs/selinux mount point to put selinuxfs CRED: Fix load_flat_shared_library() to initialise bprm correctly SELinux: introduce path_has_perm flex_array: allow 0 length elements flex_arrays: allow zero length flex arrays flex_array: flex_array_prealloc takes a number of elements, not an end SELinux: pass last path component in may_create SELinux: put name based create rules in a hashtable SELinux: generic hashtab entry counter SELinux: calculate and print hashtab stats with a generic function SELinux: skip filename trans rules if ttype does not match parent dir SELinux: rename filename_compute_type argument to *type instead of *con SELinux: fix comment to state filename_compute_type takes an objname not a qstr SMACK: smack_file_lock can use the struct path LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH LSM: split LSM_AUDIT_DATA_FS into _PATH and _INODE SELINUX: Make selinux cache VFS RCU walks safe SECURITY: Move exec_permission RCU checks into security modules SELinux: security_read_policy should take a size_t not ssize_t ...
| * \ \ \ Merge branch 'next' into for-linusJames Morris2011-05-244-7/+109
| |\ \ \ \ | | |/ / / | |/| | |
| | * | | Merge branch 'master' into nextJames Morris2011-05-1918-40/+103
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: include/linux/capability.h Manually resolve merge conflict w/ thanks to Stephen Rothwell. Signed-off-by: James Morris <jmorris@namei.org>
| | * \ \ \ Merge branch 'master'; commit 'v2.6.39-rc3' into nextJames Morris2011-04-1948-153/+255
| | |\ \ \ \
| | * | | | | capabilities: delete all CAP_INIT macrosEric Paris2011-04-041-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CAP_INIT macros of INH, BSET, and EFF made sense at one point in time, but now days they aren't helping. Just open code the logic in the init_cred. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
| | * | | | | capabilities: delete unused cap_set_fullEric Paris2011-04-041-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | unused code. Clean it up. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
| | * | | | | capabilities: do not drop CAP_SETPCAP from the initial taskEric Paris2011-04-041-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In olden' days of yore CAP_SETPCAP had special meaning for the init task. We actually have code to make sure that CAP_SETPCAP wasn't in pE of things using the init_cred. But CAP_SETPCAP isn't so special any more and we don't have a reason to special case dropping it for init or kthreads.... Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
| | * | | | | capabilites: allow the application of capability limits to usermode helpersEric Paris2011-04-042-0/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no way to limit the capabilities of usermodehelpers. This problem reared its head recently when someone complained that any user with cap_net_admin was able to load arbitrary kernel modules, even though the user didn't have cap_sys_module. The reason is because the actual load is done by a usermode helper and those always have the full cap set. This patch addes new sysctls which allow us to bound the permissions of usermode helpers. /proc/sys/kernel/usermodehelper/bset /proc/sys/kernel/usermodehelper/inheritable You must have CAP_SYS_MODULE and CAP_SETPCAP to change these (changes are &= ONLY). When the kernel launches a usermodehelper it will do so with these as the bset and pI. -v2: make globals static create spinlock to protect globals -v3: require both CAP_SETPCAP and CAP_SYS_MODULE -v4: fix the typo s/CAP_SET_PCAP/CAP_SETPCAP/ because I didn't commit Signed-off-by: Eric Paris <eparis@redhat.com> No-objection-from: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
* | | | | | | Merge branch 'for-2.6.40' of ↵Linus Torvalds2011-05-241-3/+1
|\ \ \ \ \ \ \ | |/ / / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Unify input section names percpu: Avoid extra NOP in percpu_cmpxchg16b_double percpu: Cast away printk format warning percpu: Always align percpu output section to PAGE_SIZE Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
| * | | | | | Merge branch 'fixes-2.6.39' into for-2.6.40Tejun Heo2011-05-2413-60/+88
| |\ \ \ \ \ \
| * | | | | | | percpu: Always align percpu output section to PAGE_SIZETejun Heo2011-03-241-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Percpu allocator honors alignment request upto PAGE_SIZE and both the percpu addresses in the percpu address space and the translated kernel addresses should be aligned accordingly. The calculation of the former depends on the alignment of percpu output section in the kernel image. The linker script macros PERCPU_VADDR() and PERCPU() are used to define this output section and the latter takes @align parameter. Several architectures are using @align smaller than PAGE_SIZE breaking percpu memory alignment. This patch removes @align parameter from PERCPU(), renames it to PERCPU_SECTION() and makes it always align to PAGE_SIZE. While at it, add PCPU_SETUP_BUG_ON() checks such that alignment problems are reliably detected and remove percpu alignment comment recently added in workqueue.c as the condition would trigger BUG way before reaching there. For um, this patch raises the alignment of percpu area. As the area is in .init, there shouldn't be any noticeable difference. This problem was discovered by David Howells while debugging boot failure on mn10300. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Cc: uclinux-dist-devel@blackfin.uclinux.org Cc: David Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: user-mode-linux-devel@lists.sourceforge.net
* | | | | | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2011-05-231-4/+2
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Fix sample type size calculation in 32 bits archs profile: Use vzalloc() rather than vmalloc() & memset()
| * \ \ \ \ \ \ \ Merge commit '559fa6e76b27' into perf/urgentIngo Molnar2011-05-231-4/+2
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: this commit was queued up quite some time ago but was forgotten about. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | profile: Use vzalloc() rather than vmalloc() & memset()Jesper Juhl2010-10-311-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no reason to memset() manually when we have vzalloc(). Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Arjan van de Ven <arjan@infradead.org> Cc: William Irwin <wli@holomorphy.com> LKML-Reference: <alpine.LNX.2.00.1010302150310.1572@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | | | kernel/watchdog.c: Use proper ANSI C prototypesLinus Torvalds2011-05-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We try to enforce it by using -Wstrict-prototypes, but apparently they sometimes get through. Introduced by 4eec42f39204 ("watchdog: Change the default timeout and configure nmi watchdog period based"). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2011-05-232-37/+54
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Increase SCHED_LOAD_SCALE resolution sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculations sched: Cleanup set_load_weight()
| * | | | | | | | | | sched: Increase SCHED_LOAD_SCALE resolutionNikhil Rao2011-05-201-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce SCHED_LOAD_RESOLUTION, which scales is added to SCHED_LOAD_SHIFT and increases the resolution of SCHED_LOAD_SCALE. This patch sets the value of SCHED_LOAD_RESOLUTION to 10, scaling up the weights for all sched entities by a factor of 1024. With this extra resolution, we can handle deeper cgroup hiearchies and the scheduler can do better shares distribution and load load balancing on larger systems (especially for low weight task groups). This does not change the existing user interface, the scaled weights are only used internally. We do not modify prio_to_weight values or inverses, but use the original weights when calculating the inverse which is used to scale execution time delta in calc_delta_mine(). This ensures we do not lose accuracy when accounting time to the sched entities. Thanks to Nikunj Dadhania for fixing an bug in c_d_m() that broken fairness. Below is some analysis of the performance costs/improvements of this patch. 1. Micro-arch performance costs: Experiment was to run Ingo's pipe_test_100k 200 times with the task pinned to one cpu. I measured instruction, cycles and stalled-cycles for the runs. See: http://thread.gmane.org/gmane.linux.kernel/1129232/focus=1129389 for more info. -tip (baseline): Performance counter stats for '/root/load-scale/pipe-test-100k' (200 runs): 964,991,769 instructions # 0.82 insns per cycle # 0.33 stalled cycles per insn # ( +- 0.05% ) 1,171,186,635 cycles # 0.000 GHz ( +- 0.08% ) 306,373,664 stalled-cycles-backend # 26.16% backend cycles idle ( +- 0.28% ) 314,933,621 stalled-cycles-frontend # 26.89% frontend cycles idle ( +- 0.34% ) 1.122405684 seconds time elapsed ( +- 0.05% ) -tip+patches: Performance counter stats for './load-scale/pipe-test-100k' (200 runs): 963,624,821 instructions # 0.82 insns per cycle # 0.33 stalled cycles per insn # ( +- 0.04% ) 1,175,215,649 cycles # 0.000 GHz ( +- 0.08% ) 315,321,126 stalled-cycles-backend # 26.83% backend cycles idle ( +- 0.28% ) 316,835,873 stalled-cycles-frontend # 26.96% frontend cycles idle ( +- 0.29% ) 1.122238659 seconds time elapsed ( +- 0.06% ) With this patch, instructions decrease by ~0.10% and cycles increase by 0.27%. This doesn't look statistically significant. The number of stalled cycles in the backend increased from 26.16% to 26.83%. This can be attributed to the shifts we do in c_d_m() and other places. The fraction of stalled cycles in the frontend remains about the same, at 26.96% compared to 26.89% in -tip. 2. Balancing low-weight task groups Test setup: run 50 tasks with random sleep/busy times (biased around 100ms) in a low weight container (with cpu.shares = 2). Measure %idle as reported by mpstat over a 10s window. -tip (baseline): 06:47:48 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle intr/s 06:47:49 PM all 94.32 0.00 0.06 0.00 0.00 0.00 0.00 0.00 5.62 15888.00 06:47:50 PM all 94.57 0.00 0.62 0.00 0.00 0.00 0.00 0.00 4.81 16180.00 06:47:51 PM all 94.69 0.00 0.06 0.00 0.00 0.00 0.00 0.00 5.25 15966.00 06:47:52 PM all 95.81 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.19 16053.00 06:47:53 PM all 94.88 0.06 0.00 0.00 0.00 0.00 0.00 0.00 5.06 15984.00 06:47:54 PM all 93.31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 6.69 15806.00 06:47:55 PM all 94.19 0.00 0.06 0.00 0.00 0.00 0.00 0.00 5.75 15896.00 06:47:56 PM all 92.87 0.00 0.00 0.00 0.00 0.00 0.00 0.00 7.13 15716.00 06:47:57 PM all 94.88 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.12 15982.00 06:47:58 PM all 95.44 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.56 16075.00 Average: all 94.49 0.01 0.08 0.00 0.00 0.00 0.00 0.00 5.42 15954.60 -tip+patches: 06:47:03 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle intr/s 06:47:04 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16630.00 06:47:05 PM all 99.69 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.31 16580.20 06:47:06 PM all 99.69 0.00 0.06 0.00 0.00 0.00 0.00 0.00 0.25 16596.00 06:47:07 PM all 99.20 0.00 0.74 0.00 0.00 0.06 0.00 0.00 0.00 17838.61 06:47:08 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16540.00 06:47:09 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16575.00 06:47:10 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16614.00 06:47:11 PM all 99.94 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.06 16588.00 06:47:12 PM all 99.94 0.00 0.06 0.00 0.00 0.00 0.00 0.00 0.00 16593.00 06:47:13 PM all 99.94 0.00 0.06 0.00 0.00 0.00 0.00 0.00 0.00 16551.00 Average: all 99.84 0.00 0.09 0.00 0.00 0.01 0.00 0.00 0.06 16711.58 We see an improvement in idle% on the system (drops from 5.42% on -tip to 0.06% with the patches). We see an improvement in idle% on the system (drops from 5.42% on -tip to 0.06% with the patches). Signed-off-by: Nikhil Rao <ncrao@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1305754668-18792-1-git-send-email-ncrao@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | sched: Introduce SCHED_POWER_SCALE to scale cpu_power calculationsNikhil Rao2011-05-202-27/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCHED_LOAD_SCALE is used to increase nice resolution and to scale cpu_power calculations in the scheduler. This patch introduces SCHED_POWER_SCALE and converts all uses of SCHED_LOAD_SCALE for scaling cpu_power to use SCHED_POWER_SCALE instead. This is a preparatory patch for increasing the resolution of SCHED_LOAD_SCALE, and there is no need to increase resolution for cpu_power calculations. Signed-off-by: Nikhil Rao <ncrao@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de> Cc: Mike Galbraith <efault@gmx.de> Link: http://lkml.kernel.org/r/1305738580-9924-3-git-send-email-ncrao@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | sched: Cleanup set_load_weight()Nikhil Rao2011-05-201-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid using long repetitious names; make this simpler and nicer to read. No functional change introduced in this patch. Signed-off-by: Nikhil Rao <ncrao@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Stephan Barwolf <stephan.baerwolf@tu-ilmenau.de> Cc: Mike Galbraith <efault@gmx.de> Link: http://lkml.kernel.org/r/1305738580-9924-2-git-send-email-ncrao@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | | | | Merge branch 'staging-next' of ↵Linus Torvalds2011-05-231-2/+2
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6 * 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (970 commits) staging: usbip: replace usbip_u{dbg,err,info} and printk with dev_ and pr_ staging:iio: Trivial kconfig reorganization and uniformity improvements. staging:iio:documenation partial update. staging:iio: use pollfunc allocation helpers in remaining drivers. staging:iio:max1363 misc cleanups and use of for_each_bit_set to simplify event code spitting out. staging:iio: implement an iio_info structure to take some of the constant elements out of iio_dev. staging:iio:meter:ade7758: Use private data space from iio_allocate_device staging:iio:accel:lis3l02dq make write_reg_8 take value not a pointer to value. staging:iio: ring core cleanups + check if read_last available in lis3l02dq staging:iio:core cleanup: squash tiny wrappers and use dev_set_name to handle creation of event interface name. staging:iio: poll func allocation clean up. staging:iio:ad7780 trivial unused header cleanup. staging:iio:adc: AD7780: Use private data space from iio_allocate_device + trivial fixes staging:iio:adc:AD7780: Convert to new channel registration method staging:iio:adc: AD7606: Drop dev_data in favour of iio_priv() staging:iio:adc: AD7606: Consitently use indio_dev staging:iio: Rip out helper for software rings. staging:iio:adc:AD7298: Use private data space from iio_allocate_device staging:iio: rationalization of different buffer implementation hooks. staging:iio:imu:adis16400 avoid allocating rx, tx, and state separately from iio_dev. ... Fix up trivial conflicts in - drivers/staging/intel_sst/intelmid.c: patches applied in both branches - drivers/staging/rt2860/common/cmm_data_{pci,usb}.c: removed vs spelling - drivers/staging/usbip/vhci_sysfs.c: trivial header file inclusion
| * | | | | | | | | | | modules: Enabled dynamic debugging for staging modulesRoland Vossen2011-04-251-2/+2
| | |_|_|_|_|_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver modules from the staging directory are marked 'tainted' by module.c. Subsequently, tainted modules are denied dynamic debugging. This is unwanted behavior, since staging modules should be able to use the dynamic debugging mechanism. Please merge this also into the staging-linus branch. Signed-off-by: Roland Vossen <rvossen@broadcom.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | | | | | | | | | Merge branch 'timers-for-linus' of ↵Linus Torvalds2011-05-236-92/+125
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimers: Reorder clock bases hrtimers: Avoid touching inactive timer bases hrtimers: Make struct hrtimer_cpu_base layout less stupid timerfd: Manage cancelable timers in timerfd clockevents: Move C3 stop test outside lock alarmtimer: Drop device refcount after rtc_open() alarmtimer: Check return value of class_find_device() timerfd: Allow timers to be cancelled when clock was set hrtimers: Prepare for cancel on clock was set timers
| * | | | | | | | | | | hrtimers: Reorder clock basesThomas Gleixner2011-05-231-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ordering of the clock bases is historical due to the CLOCK_REALTIME and CLOCK_MONOTONIC constants. Now the hrtimer bases have their own enumeration due to the gap between CLOCK_MONOTONIC and CLOCK_BOOTTIME. So we can be more clever as most timers end up on the CLOCK_MONOTONIC base due to the virtue of POSIX declaring that relative CLOCK_REALTIME timers are not affected by time changes. In desktop environments this is slowly changing as applications switch to absolute timers, but I've observed empty CLOCK_REALTIME bases often enough. There is no performance penalty or overhead when CLOCK_REALTIME timers are active, but in case they are not we don't skip over a full cache line. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Peter Zijlstra <peterz@infradead.org>
| * | | | | | | | | | | hrtimers: Avoid touching inactive timer basesThomas Gleixner2011-05-234-16/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of iterating over all possible timer bases avoid it by marking the active bases in the cpu base. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Peter Zijlstra <peterz@infradead.org>
| * | | | | | | | | | | timerfd: Manage cancelable timers in timerfdThomas Gleixner2011-05-231-62/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Peter is concerned about the extra scan of CLOCK_REALTIME_COS in the timer interrupt. Yes, I did not think about it, because the solution was so elegant. I didn't like the extra list in timerfd when it was proposed some time ago, but with a rcu based list the list walk it's less horrible than the original global lock, which was held over the list iteration. Requested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Peter Zijlstra <peterz@infradead.org>
| * | | | | | | | | | | Merge branch 'timers/urgent' into timers/coreThomas Gleixner2011-05-2075-2326/+4643
| |\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reason: Get upstream fixes and kfree_rcu which is necessary for a follow up patch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | | | clockevents: Move C3 stop test outside lockAndi Kleen2011-05-051-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid taking broadcast_lock in the idle path for systems where the timer doesn't stop in C3. [ tglx: Removed the stale label and added comment ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Dave Kleikamp <dkleikamp@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: lenb@kernel.org Cc: paulmck@us.ibm.com Link: http://lkml.kernel.org/r/%3C20110504234806.GF2925%40one.firstfloor.org%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | | | alarmtimer: Drop device refcount after rtc_open()Thomas Gleixner2011-05-041-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | class_find_device() takes a refcount on the rtc device. rtc_open() takes another one, so we can drop it after the rtc_open() call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org>
| * | | | | | | | | | | | alarmtimer: Check return value of class_find_device()Thomas Gleixner2011-05-041-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alarmtimer_late_init() uses class_find_device() to find a alarm capable rtc device. The match callback stores a pointer to the name in the char pointer handed in from the call site. alarmtimer_late_init() checks the char pointer for NULL, but the pointer is on the stack and not initialized to NULL before the call. So it can have random content when the match function did not identify a device, which leads to random access in the following rtc_open() call where the pointer is dereferenced Instead of relying on the char pointer, check the return value of class_find_device. If a device is found then the name pointer is valid as well. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | | | timerfd: Allow timers to be cancelled when clock was setThomas Gleixner2011-05-022-1/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some applications must be aware of clock realtime being set backward. A simple example is a clock applet which arms a timer for the next minute display. If clock realtime is set backward then the applet displays a stale time for the amount of time which the clock was set backwards. Due to that applications poll the time because we don't have an interface. Extend the timerfd interface by adding a flag which puts the timer onto a different internal realtime clock. All timers on this clock are expired whenever the clock was set. The timerfd core records the monotonic offset when the timer is created. When the timer is armed, then the current offset is compared to the previous recorded offset. When it has changed, then timerfd_settime returns -ECANCELED. When a timer is read the offset is compared and if it changed -ECANCELED returned to user space. Periodic timers are not rearmed in the cancelation case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Chris Friesen <chris.friesen@genband.com> Tested-by: Kay Sievers <kay.sievers@vrfy.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davide Libenzi <davidel@xmailserver.org> Reviewed-by: Alexander Shishkin <virtuoso@slind.org> Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1104271359580.3323%40ionos%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | | | | | | | | | | hrtimers: Prepare for cancel on clock was set timersThomas Gleixner2011-05-022-65/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make clock_was_set() unconditional and rename hres_timers_resume to hrtimers_resume. This is a preparatory patch for hrtimers which are cancelled when clock realtime was set. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | | | | | | | | | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2011-05-232-26/+38
|\ \ \ \ \ \ \ \ \ \ \ \ \ | | |_|_|_|/ / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Fix sample size bit operations perf tools: Fix ommitted mmap data update on remap watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh watchdog: Disable watchdog when thresh is zero watchdog: Only disable/enable watchdog if neccessary watchdog: Fix rounding bug in get_sample_period() perf tools: Propagate event parse error handling perf tools: Robustify dynamic sample content fetch perf tools: Pre-check sample size before parsing perf tools: Move evlist sample helpers to evlist area perf tools: Remove junk code in mmap size handling perf tools: Check we are able to read the event size on mmap