path: root/kernel
Commit message  Author  Age  Files  Lines
* sched: small schedstat fix  Ingo Molnar  2007-08-28  1  -1/+3
| | | | | | | | | | small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes' statistics counters missed newly forked tasks and thus had a constant negative skew. Fix this. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>
* sched: fix wait_start_fair condition in update_stats_wait_end()  Ingo Molnar  2007-08-28  1  -0/+3
| | | | | | | | | | | | | | Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which is disabled by default at the moment): it relies on se.wait_start_fair being 0 while update_stats_wait_end() did not recognize a 0 value, so instead of 'skipping' the initial interval we gave the new child a maximum boost of +runtime-limit ... (No impact on the default kernel, but nice to fix for completeness.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>
* sched: call update_curr() in task_tick_fair()  Ting Yang  2007-08-28  1  -2/+3
| | | | | | | | | | | update the fair-clock before using it for the key value. [ mingo@elte.hu: small cleanups. ] Signed-off-by: Ting Yang <tingy@cs.umass.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* sched: make the scheduler converge to the ideal latency  Ingo Molnar  2007-08-28  2  -4/+23
| de-HZ-ification of the granularity defaults unearthed a pre-existing property of CFS: while it correctly converges to the granularity goal, it does not prevent run-time fluctuations in the range of [-gran ... 0 ... +gran].
|
| With the increase of the granularity due to the removal of HZ dependencies, this becomes visible in chew-max output (with 5 tasks running):
|
|     out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
|     out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40
|     out: 27 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 36 . 40
|     out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40
|     out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
|     out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40
|     out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
|
| average slice is the ideal 13 msecs and the period is picture-perfect 40 msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no mechanism in CFS to keep that from happening: it's a perfectly valid solution that CFS finds.
|
| to fix this we add a granularity/preemption rule that knows about the "target latency", which makes tasks that run longer than the ideal latency run a bit less. The simplest approach is to simply decrease the preemption granularity when a task overruns its ideal latency. For this we have to track how much the task executed since its last preemption.
|
| ( this adds a new field to task_struct, but we can eliminate that overhead in 2.6.24 by putting all the scheduler timestamps into an anonymous union. )
|
| with this change in place, chew-max output is fluctuation-less all around:
|
|     out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
|     out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
|     out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
|     out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
|     out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
|     out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
|
| this patch has no impact on any fastpath or on any globally observable scheduling property. (unless you have sharp enough eyes to see millisecond-level ruckles in glxgears smoothness :-)
|
| Signed-off-by: Ingo Molnar <mingo@elte.hu>
| Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Signed-off-by: Mike Galbraith <efault@gmx.de>
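The overrun rule in the commit above can be illustrated with a minimal sketch. It assumes the task's runtime since its last preemption is the difference of two cumulative counters, and that "decreasing" the granularity means dropping it to zero; the commit message does not say by how much the granularity shrinks, so that choice, the function name and the nanosecond units are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: pick the preemption
 * granularity for the running task. Once the task has run longer than
 * its ideal slice since the last preemption, stop giving it extra
 * slack (assumption: the granularity drops all the way to zero).
 */
static uint64_t effective_granularity_ns(uint64_t sum_exec_ns,
                                         uint64_t prev_sum_exec_ns,
                                         uint64_t ideal_runtime_ns,
                                         uint64_t granularity_ns)
{
    /* Runtime accumulated since the task was last preempted. */
    uint64_t ran_ns = sum_exec_ns - prev_sum_exec_ns;

    return ran_ns > ideal_runtime_ns ? 0 : granularity_ns;
}
```

With a 13 msec ideal slice, a task that has run 17 msecs loses its slack while one that has run 9 msecs keeps the full granularity, which is the kind of pull toward the ideal value that the second chew-max listing shows.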
* sched: fix sleeper bonus limit  Mike Galbraith  2007-08-28  1  -1/+1
| There is an Amarok song switch time increase (regression) under hefty load. What is happening is that sleeper_bonus is never consumed, and only rarely goes below runtime_limit, so for the most part, Amarok isn't getting any bonus at all. We're keeping sleeper_bonus right at runtime_limit (sched_latency == sched_runtime_limit == 40ms) forever, i.e. we don't consume if we're lower than that, and don't add if we're above it. One Amarok thread waking (or anybody else) will push us past the threshold, so the next thread waking gets nada, but will reap pain from the previous thread waking until we drop back to runtime_limit. It looks to me like under load, some random task gets a bonus, and everybody else pays, whether deserving or not. This diff fixed the regression for me at any load rate. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* fix bogus hotplug cpu warning  Hugh Dickins  2007-08-27  1  -1/+1
| | | | | | | | | Fix bogus DEBUG_PREEMPT warning on x86_64, when cpu brought online after bootup: current_is_keventd is right to note its use of smp_processor_id is preempt-safe, but should use raw_smp_processor_id to avoid the warning. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sched: s/sched_latency/sched_min_granularity  Ingo Molnar  2007-08-25  1  -2/+2
| runtime limit and wakeup granularity used to be a function of granularity, and that was incorrectly changed to sched_latency. Fix this to make wakeup granularity a function of min-granularity, and the runtime limit equal to latency. Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: cleanup, sched_granularity -> sched_min_granularity  Ingo Molnar  2007-08-25  3  -7/+7
| | | | | | | | | due to adaptive granularity scheduling the role of sched_granularity has changed to "minimum granularity", so rename the variable (and the tunable) accordingly. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* sched: adaptive scheduler granularity  Peter Zijlstra  2007-08-25  3  -17/+85
| Instead of specifying the preemption granularity, specify the wanted latency. By fixing the granularity to a constant, the wakeup latency is a function of the number of running tasks on the rq. Invert this relation. sysctl_sched_granularity becomes a minimum for the dynamic granularity computed from the new sysctl_sched_latency. Then use this latency to do more intelligent granularity decisions: if there are fewer tasks running then we can schedule coarser. This helps performance while still always keeping the latency target. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
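A minimal sketch of the inverted relation described in this commit, assuming the dynamic granularity is simply the latency target divided by the number of runnable tasks and floored at the minimum granularity. The function name and nanosecond units are illustrative, and the formula in the actual patch may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: derive a preemption
 * granularity from a latency target instead of fixing it up front.
 */
static uint64_t dynamic_granularity_ns(uint64_t latency_ns,
                                       uint64_t min_granularity_ns,
                                       unsigned int nr_running)
{
    uint64_t gran;

    /* With zero or one runnable task nobody shares the latency period. */
    if (nr_running <= 1)
        return latency_ns;

    /* Fewer tasks give a coarser granularity; many tasks hit the floor. */
    gran = latency_ns / nr_running;
    return gran < min_granularity_ns ? min_granularity_ns : gran;
}
```

For a 20 msec latency target and a 2 msec floor, four runnable tasks get 5 msecs each, while a hundred tasks are clamped to the 2 msec minimum, at which point the latency target can no longer be met.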
* sched: fix CONFIG_SCHED_DEBUG dependency of lockdep sysctls  Peter Zijlstra  2007-08-25  1  -9/+9
| | | | | | | Make the lockdep sysctls not depend on CONFIG_SCHED_DEBUG. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: fix startup penalty calculation  Ingo Molnar  2007-08-24  1  -1/+1
| | | | | | | | fix task startup penalty miscalculation: sysctl_sched_granularity is unsigned int and wait_runtime is long so we first have to convert it to long before turning it negative ... Signed-off-by: Ingo Molnar <mingo@elte.hu>
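The conversion pitfall behind this fix can be reproduced in isolation. The sketch below uses hypothetical function names, models the kernel's 64-bit long with int64_t, and assumes a platform where unsigned int is 32 bits wide. Negating the unsigned value first wraps around in 32-bit arithmetic, so widening it afterwards yields a large positive number instead of a negative penalty.

```c
#include <stdint.h>

/* Buggy order: negate as unsigned 32-bit (wraps), then widen. */
static int64_t startup_penalty_buggy(uint32_t granularity)
{
    return -granularity;
}

/* Fixed order: widen to a signed 64-bit value first, then negate. */
static int64_t startup_penalty_fixed(uint32_t granularity)
{
    return -(int64_t)granularity;
}
```

For a granularity of 10000000 the buggy variant returns 4284967296 (2^32 minus the granularity), while the fixed variant returns -10000000.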
* sched: simplify bonus calculation #2  Peter Zijlstra  2007-08-24  1  -2/+1
| current code:
|
|     delta = calc_delta_mine(delta_exec, curr->load.weight, lw);
|     delta = min((u64)delta, cfs_rq->sleeper_bonus);
|
| Notice that this calc_delta_mine() line is exactly delta_mine, which gives:
|
|     delta = min((u64)delta_mine, cfs_rq->sleeper_bonus);
|
| Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: simplify bonus calculation #1  Peter Zijlstra  2007-08-24  1  -2/+1
| current code:
|
|     delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
|     delta = calc_delta_mine(delta, curr->load.weight, lw);
|     delta = min((u64)delta, cfs_rq->sleeper_bonus);
|
| drop the first min(), because we clip against sleeper_bonus in the 3rd line again. That gives:
|
|     delta = calc_delta_mine(delta_exec, curr->load.weight, lw);
|     delta = min((u64)delta, cfs_rq->sleeper_bonus);
|
| Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
| Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: tidy up and simplify the bonus balance  Ingo Molnar  2007-08-24  1  -4/+10
| make the bonus balance more consistent: do not hand out a bonus if there's too much in flight already, and only deduct as much from a runner as it has the capacity. This makes the bonus engine a zero-sum game (as intended).
|
| this also simplifies the code:
|
|      text    data     bss     dec     hex filename
|     34770    2998      24   37792    93a0 sched.o.before
|     34749    2998      24   37771    938b sched.o.after
|
| and it also avoids overscheduling in sleep-happy workloads like hackbench.c.
|
| Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: optimize task_tick_rt() a bit  Dmitry Adamushko  2007-08-24  1  -3/+8
| | | | | | | | | | | | | | | | | Mitchell Erblich suggested a quality-of-implementation change to not requeue SCHED_RR tasks if there's only a single task on the runqueue, by checking for rq->nr_running == 1. provide a more efficient implementation of that, to check that particular RT priority-queue only. [ From: mingo@elte.hu ] Also first requeue the task then set need_resched - results in slightly better machine-instruction ordering. Also clean up the code a bit. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
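One way to check "is this the only task on its priority queue" without counting is to compare a queued node's two neighbours in a circular doubly linked list. The sketch below is self-contained and uses hypothetical names (list_node, needs_requeue); it only illustrates the idea and is not the patch itself.

```c
/* Minimal circular doubly linked list, standing in for one RT priority queue. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * A queued task is alone on its queue when both neighbours are the same
 * node (the list head). Only then can the round-robin requeue be skipped.
 */
static int needs_requeue(const struct list_node *task)
{
    return task->prev != task->next;
}

/* Demo: build a queue of nr_tasks (1..4) and test the first task. */
static int first_task_needs_requeue(int nr_tasks)
{
    struct list_node head, tasks[4];
    int i;

    if (nr_tasks < 1 || nr_tasks > 4)
        return -1;

    list_init(&head);
    for (i = 0; i < nr_tasks; i++)
        list_add_tail(&tasks[i], &head);

    return needs_requeue(&tasks[0]);
}
```

Unlike a runqueue-wide rq->nr_running == 1 test, this looks only at the queue for the task's own priority, so a lone SCHED_RR task is left in place even when tasks of other priorities are runnable.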
* sched: simplify can_migrate_task()  Sven-Thorsten Dietrich  2007-08-24  1  -6/+0
| Remove trivial conditional branch in Linux scheduler's can_migrate_task() function.
|
|      text    data     bss     dec     hex filename
|     34770    2998      24   37792    93a0 sched.o.before
|     34757    2998      24   37779    9393 sched.o.after
|
| Signed-off-by: Sven-Thorsten Dietrich <sven@thebigcorporation.com>
| Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: remove HZ dependency from the granularity default  Ingo Molnar  2007-08-24  2  -8/+7
| | | | | | | | | | remove HZ dependency from the granularity default. Use 10 msec for the base granularity, 1 msec for wakeup granularity and 25 msec for batch wakeup granularity. (These defaults are close to the values that the default HZ=250 setting got previously, and thus it's the most common setting.) Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: CONFIG_SCHED_GROUP_FAIR=y fixlet  Bruce Ashfield  2007-08-24  1  -1/+1
| when I built with CONFIG_FAIR_GROUP_SCHED=y, I needed the following change to make things right. [ From: mingo@elte.hu ] this config option is not upstream-configurable right now but let's fix this for completeness. Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched  Linus Torvalds  2007-08-23  2  -18/+53
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
| |   sched: tweak the sched_runtime_limit tunable
| |   sched: skip updating rq's next_balance under null SD
| |   sched: fix broken SMT/MC optimizations
| |   sched: accounting regression since rc1
| |   sched: fix sysctl directory permissions
| |   sched: sched_clock_idle_[sleep|wakeup]_event()
| * sched: tweak the sched_runtime_limit tunable  Ingo Molnar  2007-08-23  1  -1/+1
| | | | | | | | | | | | | | | | Michael Gerdau reported reniced task CPU usage weirdnesses. Such symptoms can be caused by limit underruns so double the sched_runtime_limit. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: skip updating rq's next_balance under null SD  Suresh Siddha  2007-08-23  1  -2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | Was playing with sched_smt_power_savings/sched_mc_power_savings and found out that while the scheduler domains are reconstructed when sysfs settings change, rebalance_domains() can get triggered with null domain on other cpus, which is setting next_balance to jiffies + 60*HZ. Resulting in no idle/busy balancing for 60 seconds. Fix this. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: fix broken SMT/MC optimizations  Suresh Siddha  2007-08-23  1  -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a four package system with HT - HT load balancing optimizations were broken. For example, if two tasks end up running on two logical threads of one of the packages, scheduler is not able to pull one of the tasks to a completely idle package. In this scenario, for nice-0 tasks, imbalance calculated by scheduler will be 512 and find_busiest_queue() will return 0 (as each cpu's load is 1024 > imbalance and has only one task running). Similarly MC scheduler optimizations also get fixed with this patch. [ mingo@elte.hu: restored fair balancing by increasing the fuzz and adding it back to the power decision, without the /2 factor. ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: fix sysctl directory permissions  Eric W. Biederman  2007-08-23  1  -4/+5
| | There are two remaining gotchas:
| | - The directories have impossible permissions (writeable).
| | - The ctl_name for the kernel directory is inconsistent with everything else. It should be CTL_KERN.
| |
| | Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
| | Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: sched_clock_idle_[sleep|wakeup]_event()  Ingo Molnar  2007-08-23  2  -10/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | construct a more or less wall-clock time out of sched_clock(), by using ACPI-idle's existing knowledge about how much time we spent idling. This allows the rq clock to work around TSC-stops-in-C2, TSC-gets-corrupted-in-C3 type of problems. ( Besides the scheduler's statistics this also benefits blktrace and printk-timestamps as well. ) Furthermore, the precise before-C2/C3-sleep and after-C2/C3-wakeup callbacks allow the scheduler to get out the most of the period where the CPU has a reliable TSC. This results in slightly more precise task statistics. the ACPI bits were acked by Len. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Len Brown <len.brown@intel.com>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6  Linus Torvalds  2007-08-23  1  -2/+1
|\ \
| |/
|/|
| | * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6:
| |   sysfs: don't warn on removal of a nonexistent binary file
| |   HOWTO: latest lxr url address changed
| |   HOWTO: korean translation of Documentation/HOWTO
| |   Fix Off-by-one in /sys/module/*/refcnt
| |   sysfs: fix locking in sysfs_lookup() and sysfs_rename_dir()
| * Fix Off-by-one in /sys/module/*/refcnt  Alexey Dobriyan  2007-08-22  1  -2/+1
| | | | | | | | | | | | | | | | | | | | sysfs internals were changed to not pin module in question. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | signalfd: fix interaction with posix-timers  Oleg Nesterov  2007-08-22  1  -2/+2
| | dequeue_signal:
| |
| |     if (__SI_TIMER) {
| |         spin_unlock(&tsk->sighand->siglock);
| |         do_schedule_next_timer(info);
| |         spin_lock(&tsk->sighand->siglock);
| |     }
| |
| | Unless tsk == current, this is absolutely unsafe: nothing prevents tsk from exiting. If signalfd was passed to another process, do_schedule_next_timer() is just wrong.
| |
| | Add yet another "tsk == current" check into dequeue_signal().
| |
| | This patch fixes an oopsable bug, but breaks the scheduling of posix timers if the shared __SI_TIMER signal was fetched via signalfd attached to another sub-thread. Mostly fixed by the next patch.
| |
| | Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
| | Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | Cc: Davide Libenzi <davidel@xmailserver.org>
| | Cc: Ingo Molnar <mingo@elte.hu>
| | Cc: Michael Kerrisk <mtk-manpages@gmx.net>
| | Cc: Roland McGrath <roland@redhat.com>
| | Cc: Thomas Gleixner <tglx@linutronix.de>
| | Cc: <stable@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | posix-timers: fix creation race  Oleg Nesterov  2007-08-22  1  -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sys_timer_create() sets ->it_process and unlocks ->siglock, then checks tmr->it_sigev_notify to define if get_task_struct() is needed. We already passed ->it_id to the caller, another thread can delete this timer and free its memory in between. As a minimal fix, move this code under ->siglock, sys_timer_delete() takes it too before calling release_posix_timer(). A proper serialization would be to take ->it_lock, we add a partly initialized timer on posix_timers_id, not good. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | posix-timers: fix deletion race  Thomas Gleixner  2007-08-22  1  -3/+4
| | timer_delete does:
| |
| |     lock_timer();
| |     timer->it_process = NULL;
| |     unlock_timer();
| |     release_posix_timer();
| |
| | timer->it_process is checked in lock_timer() to prevent access to a timer, which is on the way to be deleted, but the check happens after idr_lock is dropped. This allows release_posix_timer() to delete the timer before the lock code can check the timer:
| |
| |     CPU 0                           CPU 1
| |     lock_timer();
| |     timer->it_process = NULL;
| |     unlock_timer();
| |                                     lock_timer()
| |                                         spin_lock(idr_lock);
| |                                         timer = idr_find();
| |                                         spin_lock(timer->lock);
| |                                         spin_unlock(idr_lock);
| |     release_posix_timer();
| |         spin_lock(idr_lock);
| |         idr_remove(timer);
| |         spin_unlock(idr_lock);
| |         free_timer(timer);
| |                                         if (timer->......)
| |
| | Change the locking to prevent this.
| |
| | Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | free_irq(): fix DEBUG_SHIRQ handling  Andrew Morton  2007-08-22  1  -0/+2
| | | | | | | | | | | | | | | | | | | | | | If we're going to run the handler from free_irq() then we must do it with local irq's disabled. Otherwise lockdep complains that the handler is taking irq-safe spinlocks in a non-irq-safe fashion. Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | futex_unlock_pi() hurts my brain and may cause application deadlock  john stultz  2007-08-22  1  -0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Avoid futex_unlock_pi returning -EFAULT (which results in deadlock), by clearing uval before jumping to retry_locked. Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | kernel/auditsc.c: fix an off-by-one  Adrian Bunk  2007-08-22  1  -1/+1
|/ | | | | | | | | | | This patch fixes an off-by-one in a BUG_ON() spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Amy Griffis <amy.griffis@hp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fix - ensure we don't use bootconsoles after init has been released  Robin Getz  2007-08-21  1  -4/+6
| | | | | | | | | Gerd Hoffmann pointed out that my patch from yesterday can lead to a null pointer dereference if the kernel is booted with no console, and no earlyprintk defined. This fixes that issue. Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ensure we don't use bootconsoles after init has been released  Robin Getz  2007-08-20  1  -0/+11
| | | | | | | | | | | | | | | | | This is a followup to the cleanups for earlyprintk patch from Gerd Hoffmann http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=69331af79cf29e26d1231152a172a1a10c2df511 This ensures that a bootconsole is unregistered if it is not replaced. The current implementation spews garbage out the bootconsole in this case, since the bootconsole structure is normally in the init section, and is freed, but still used. Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Remove double inclusion of linux/capability.h  Christian Heim  2007-08-19  1  -1/+0
| | | | | | | | | Remove the second inclusion of linux/capability.h, which has been introduced with "[PATCH] move capable() to capability.h" (commit c59ede7b78db329949d9cdcd7064e22d357560ef) Signed-off-by: Christian Heim <phreak@gentoo.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched  Linus Torvalds  2007-08-12  2  -30/+30
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
| |   sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/
| |   sched: fix sleeper bonus
| |   sched: make global code static
| * sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/  Oleg Nesterov  2007-08-12  1  -1/+1
| | rebalance_domains(SCHED_IDLE) looks strange (typo), change it to CPU_IDLE. The effect of this bug was slightly more aggressive idle-balancing on SMP than intended. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: fix sleeper bonus  Ingo Molnar  2007-08-12  1  -6/+6
| | Peter Zijlstra noticed that the sleeper bonus deduction code was not properly rate-limited: a task that scheduled more frequently would get a disproportionately large deduction. So limit the deduction to delta_exec. Signed-off-by: Ingo Molnar <mingo@elte.hu>
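The rate limit in this fix can be sketched as a clamp: each accounting step deducts no more than the time the task actually ran, and no more than what is left in the bonus pool. The helper names and abstract time units below are hypothetical, and the load weighting the scheduler applies is left out.

```c
#include <stdint.h>

/* Deduct at most delta_exec, and at most what remains in the pool. */
static uint64_t bonus_deduction(uint64_t sleeper_bonus, uint64_t delta_exec)
{
    return sleeper_bonus < delta_exec ? sleeper_bonus : delta_exec;
}

/* Total deducted from a task that runs `slices` slices of equal length. */
static uint64_t total_deducted(uint64_t sleeper_bonus, uint64_t slice,
                               unsigned int slices)
{
    uint64_t total = 0;

    while (slices--) {
        uint64_t d = bonus_deduction(sleeper_bonus, slice);

        sleeper_bonus -= d;
        total += d;
    }
    return total;
}
```

With a pool of 20 units, a task running twelve slices of 1 unit pays the same 12 units as a task running three slices of 4 units, so how often a task schedules no longer changes how much it pays.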
| * sched: make global code static  Adrian Bunk  2007-08-12  1  -23/+23
| | This patch makes the following needlessly global code static:
| | - arch_reinit_sched_domains()
| | - struct attr_sched_mc_power_savings
| | - struct attr_sched_smt_power_savings
| |
| | Signed-off-by: Adrian Bunk <bunk@stusta.de>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | genirq: suppress resend of level interrupts  Thomas Gleixner  2007-08-12  1  -1/+6
| | | | | | | | | | | | | | | | | | | | Level type interrupts are resent by the interrupt hardware when they are still active at irq_enable(). Suppress the resend mechanism for interrupts marked as level. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | genirq: cleanup mismerge artifact  Thomas Gleixner  2007-08-12  1  -4/+1
|/ | | | | | | | | | | | Commit 5a43a066b11ac2fe84cf67307f20b83bea390f83: "genirq: Allow fasteoi handler to retrigger disabled interrupts" was erroneously applied to handle_level_irq(). This added the irq retrigger / resend functionality to the level irq handler. Revert the offending bits. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched  Linus Torvalds  2007-08-11  3  -7/+17
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
| |   sched debug: dont print kernel address in /proc/sched_debug
| |   sched: fix typo in the FAIR_GROUP_SCHED branch
| |   sched: improve rq-clock overflow logic
| * sched debug: dont print kernel address in /proc/sched_debug  Ingo Molnar  2007-08-10  1  -1/+1
| | | | | | | | | | | | | | | | Arjan van de Ven pointed out that we should not print kernel addresses in world-readable /proc files - fix that. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
| * sched: fix typo in the FAIR_GROUP_SCHED branch  Ingo Molnar  2007-08-10  1  -4/+3
| | while there's no in-tree way to turn on group scheduling at the moment, fix a typo in it nevertheless. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: improve rq-clock overflow logic  Ingo Molnar  2007-08-10  1  -2/+13
| | | | | | | | | | | | | | | | | | | | | | | | improve the rq-clock overflow logic: limit the absolute rq->clock delta since the last scheduler tick, instead of limiting the delta itself. tested by Arjan van de Ven - whole laptop was misbehaving due to an incorrectly calibrated cpu_khz confusing sched_clock(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
* | fix compilation with gcc 4.2  Peter Chubb  2007-08-11  2  -2/+7
| | gcc-4.2 is a lot more picky about its symbol handling. EXPORT_SYMBOL no longer works on symbols that are undefined or defined with static scope.
| |
| | For example, with CONFIG_PROFILE off, I see:
| |
| |     kernel/profile.c:206: error: __ksymtab_profile_event_unregister causes a section type conflict
| |     kernel/profile.c:205: error: __ksymtab_profile_event_register causes a section type conflict
| |
| | This patch moves the EXPORTs inside the #ifdef CONFIG_PROFILE, so we only try to export symbols that are defined.
| |
| | Also, in kernel/kprobes.c there's an EXPORT_SYMBOL_GPL() for jprobe_return, which if CONFIG_JPROBES is undefined is a static inline and gives the same error.
| |
| | And in drivers/acpi/resources/rsxface.c, there's an ACPI_EXPORT_SYMBOL() for a static symbol. If it's static, it's not accessible from outside the compilation unit, so should not be exported.
| |
| | These three changes allow building a zx1_defconfig kernel with gcc 4.2 on IA64.
| |
| | [akpm@linux-foundation.org: export jprobe_return properly]
| |
| | Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
| | Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
| | Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
| | Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
| | Cc: "Luck, Tony" <tony.luck@intel.com>
| | Cc: Len Brown <lenb@kernel.org>
| | Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | timer: remove clockevents_unregister_notifier  Miao Xie  2007-08-11  1  -10/+0
| | I found a function (clockevents_unregister_notifier) which is not called by anything in tree. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Hibernation: do not try to mark invalid PFNs as nosave  Rafael J. Wysocki  2007-08-11  1  -1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some systems some PFNs reported by the early initialization code as 'nosave' may be invalid. If we try to set the corresponding bits in the hibernation bitmap, BUG_ON() in memory_bm_find_bit() will be triggered and the system won't be able to boot (cf. https://bugzilla.novell.com/show_bug.cgi?id=296242). Prevent this from happening by verifying if the 'nosave' PFNs are valid in mark_nosave_pages(). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Fix missing numa_zonelist_order sysctl  Lee Schermerhorn  2007-08-11  1  -1/+1
|/ | | | | | | | | Misplaced #endif is hiding the numa_zonelist_order sysctl when !SECURITY. Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched  Linus Torvalds  2007-08-09  5  -322/+303
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: (61 commits)
| |   sched: refine negative nice level granularity
| |   sched: fix update_stats_enqueue() reniced codepath
| |   sched: round a bit better
| |   sched: make the multiplication table more accurate
| |   sched: optimize update_rq_clock() calls in the load-balancer
| |   sched: optimize activate_task()
| |   sched: clean up set_curr_task_fair()
| |   sched: remove __update_rq_clock() call from entity_tick()
| |   sched: move the __update_rq_clock() call to scheduler_tick()
| |   sched debug: remove the 'u64 now' parameter from print_task()/_rq()
| |   sched: remove the 'u64 now' local variables
| |   sched: remove the 'u64 now' parameter from deactivate_task()
| |   sched: remove the 'u64 now' parameter from dequeue_task()
| |   sched: remove the 'u64 now' parameter from enqueue_task()
| |   sched: remove the 'u64 now' parameter from dec_nr_running()
| |   sched: remove the 'u64 now' parameter from inc_nr_running()
| |   sched: remove the 'u64 now' parameter from dec_load()
| |   sched: remove the 'u64 now' parameter from inc_load()
| |   sched: remove the 'u64 now' parameter from update_curr_load()
| |   sched: remove the 'u64 now' parameter from ->task_new()
| |   ...