From 3d43321b7015387cfebbe26436d0e9d299162ea1 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Thu, 2 Apr 2009 15:49:29 -0700 Subject: modules: sysctl to block module loading Implement a sysctl file that disables module-loading system-wide since there is no longer a viable way to remove CAP_SYS_MODULE after the system bounding capability set was removed in 2.6.25. Value can only be set to "1", and is tested only if standard capability checks allow CAP_SYS_MODULE. Given existing /dev/mem protections, this should allow administrators a one-way method to block module loading after initial boot-time module loading has finished. Signed-off-by: Kees Cook Acked-by: Serge Hallyn Signed-off-by: James Morris --- Documentation/sysctl/kernel.txt | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'Documentation') diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index a4ccdd1..02b1349 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -30,6 +30,7 @@ show up in /proc/sys/kernel: - kstack_depth_to_print [ X86 only ] - l2cr [ PPC only ] - modprobe ==> Documentation/debugging-modules.txt +- modules_disabled - msgmax - msgmnb - msgmni @@ -179,6 +180,16 @@ kernel stack. ============================================================== +modules_disabled: + +A toggle value indicating if modules are allowed to be loaded +in an otherwise modular kernel. This toggle defaults to off +(0), but can be set true (1). Once true, modules can be +neither loaded nor unloaded, and the toggle cannot be set back +to false. + +============================================================== + osrelease, ostype & version: # cat osrelease -- cgit v1.1 From a0c31fdfb7f158780faa43ab83cc08b414bebd7e Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Wed, 8 Apr 2009 16:01:16 -0700 Subject: ISDN: update Documentation/isdn/00-INDEX After the merging of mISDN, state which files refer only to the old isdn4linux subsystem. Also add a few missing files and sort alphabetically. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/00-INDEX | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-) (limited to 'Documentation') diff --git a/Documentation/isdn/00-INDEX b/Documentation/isdn/00-INDEX index 9fee5f2..33543ac 100644 --- a/Documentation/isdn/00-INDEX +++ b/Documentation/isdn/00-INDEX @@ -2,28 +2,20 @@ - this file (info on ISDN implementation for Linux) CREDITS - list of the kind folks that brought you this stuff. +HiSax.cert + - information about the ITU approval certification of the HiSax driver. INTERFACE - - description of Linklevel and Hardwarelevel ISDN interface. + - description of isdn4linux Link Level and Hardware Level interfaces. +INTERFACE.fax + - description of the fax subinterface of isdn4linux. README - general info on what you need and what to do for Linux ISDN. README.FAQ - general info for FAQ. -README.audio - - info for running audio over ISDN. -README.fax - - info for using Fax over ISDN. -README.icn - - info on the ICN-ISDN-card and its driver. README.HiSax - info on the HiSax driver which replaces the old teles. -README.hfc-pci - - info on hfc-pci based cards. -README.pcbit - - info on the PCBIT-D ISDN adapter and driver. -README.syncppp - - info on running Sync PPP over ISDN. -syncPPP.FAQ - - frequently asked questions about running PPP over ISDN. +README.audio + - info for running audio over ISDN. README.avmb1 - info on driver for AVM-B1 ISDN card. README.act2000 @@ -34,10 +26,25 @@ README.concap - info on "CONCAP" encapsulation protocol interface used for X.25. README.diversion - info on module for isdn diversion services. +README.fax + - info for using Fax over ISDN. +README.gigaset + - info on the drivers for Siemens Gigaset ISDN adapters +README.hfc-pci + - info on hfc-pci based cards. +README.hysdn + - info on driver for Hypercope active HYSDN cards +README.icn + - info on the ICN-ISDN-card and its driver. +README.mISDN + - info on the Modular ISDN subsystem (mISDN) +README.pcbit + - info on the PCBIT-D ISDN adapter and driver. README.sc - info on driver for Spellcaster cards. +README.syncppp + - info on running Sync PPP over ISDN. README.x25 - _ info for running X.25 over ISDN. -README.hysdn - - info on driver for Hypercope active HYSDN cards - + - info on running X.25 over ISDN. +syncPPP.FAQ + - frequently asked questions about running PPP over ISDN. -- cgit v1.1 From 2fad2d9bb8310889f3261035b594b4e068b6eb8b Mon Sep 17 00:00:00 2001 From: Mark Langsdorf Date: Thu, 9 Apr 2009 15:31:53 +0200 Subject: x86/docs: add description for cache_disable sysfs interface Signed-off-by: Mark Langsdorf Signed-off-by: Andreas Herrmann Cc: Andrew Morton LKML-Reference: <20090409133153.GL31527@alberich.amd.com> Signed-off-by: Ingo Molnar --- Documentation/ABI/testing/sysfs-devices-cache_disable | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-devices-cache_disable (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-devices-cache_disable b/Documentation/ABI/testing/sysfs-devices-cache_disable new file mode 100644 index 0000000..175bb4f --- /dev/null +++ b/Documentation/ABI/testing/sysfs-devices-cache_disable @@ -0,0 +1,18 @@ +What: /sys/devices/system/cpu/cpu*/cache/index*/cache_disable_X +Date: August 2008 +KernelVersion: 2.6.27 +Contact: mark.langsdorf@amd.com +Description: These files exist in every cpu's cache index directories. + There are currently 2 cache_disable_# files in each + directory. Reading from these files on a supported + processor will return that cache disable index value + for that processor and node. Writing to one of these + files will cause the specificed cache index to be disabled. + + Currently, only AMD Family 10h Processors support cache index + disable, and only for their L3 caches. See the BIOS and + Kernel Developer's Guide at + http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/31116-Public-GH-BKDG_3.20_2-4-09.pdf + for formatting information and other details on the + cache index disable. +Users: joachim.deguara@amd.com -- cgit v1.1 From 56c49951747f250d8398582509e02ae5ce1d36d1 Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Sat, 11 Apr 2009 15:51:19 -0400 Subject: tracing: Add documentation for the power tracer Signed-off-by: "Theodore Ts'o" Acked-by: Arjan van de Ven Cc: Frederic Weisbecker Cc: Steven Rostedt LKML-Reference: <1239479479-2603-4-git-send-email-tytso@mit.edu> Signed-off-by: Ingo Molnar --- Documentation/trace/power.txt | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) create mode 100644 Documentation/trace/power.txt (limited to 'Documentation') diff --git a/Documentation/trace/power.txt b/Documentation/trace/power.txt new file mode 100644 index 0000000..cd805e1 --- /dev/null +++ b/Documentation/trace/power.txt @@ -0,0 +1,17 @@ +The power tracer collects detailed information about C-state and P-state +transitions, instead of just looking at the high-level "average" +information. + +There is a helper script found in scrips/tracing/power.pl in the kernel +sources which can be used to parse this information and create a +Scalable Vector Graphics (SVG) picture from the trace data. + +To use this tracer: + + echo 0 > /sys/kernel/debug/tracing/tracing_enabled + echo power > /sys/kernel/debug/tracing/current_tracer + echo 1 > /sys/kernel/debug/tracing/tracing_enabled + sleep 1 + echo 0 > /sys/kernel/debug/tracing/tracing_enabled + cat /sys/kernel/debug/tracing/trace | \ + perl scripts/tracing/power.pl > out.sv -- cgit v1.1 From abd41443ac76d3e9c29a8c1d9e9a3312306cc55e Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Sat, 11 Apr 2009 15:51:18 -0400 Subject: tracing: Document the event tracing system Signed-off-by: "Theodore Ts'o" Cc: Theodore Ts'o Cc: Steven Rostedt LKML-Reference: <1239479479-2603-3-git-send-email-tytso@mit.edu> Signed-off-by: Ingo Molnar --- Documentation/trace/events.txt | 135 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 135 insertions(+) create mode 100644 Documentation/trace/events.txt (limited to 'Documentation') diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt new file mode 100644 index 0000000..abdee66 --- /dev/null +++ b/Documentation/trace/events.txt @@ -0,0 +1,135 @@ + Event Tracing + + Documentation written by Theodore Ts'o + +Introduction +============ + +Tracepoints (see Documentation/trace/tracepoints.txt) can be used +without creating custom kernel modules to register probe functions +using the event tracing infrastructure. + +Not all tracepoints can be traced using the event tracing system; +the kernel developer must provide code snippets which define how the +tracing information is saved into the tracing buffer, and how the +the tracing information should be printed. + +Using Event Tracing +=================== + +The events which are available for tracing can be found in the file +/sys/kernel/debug/tracing/available_events. + +To enable a particular event, such as 'sched_wakeup', simply echo it +to /sys/debug/tracing/set_event. For example: + + # echo sched_wakeup > /sys/kernel/debug/tracing/set_event + +[ Note: events can also be enabled/disabled via the 'enabled' toggle + found in the /sys/kernel/tracing/events/ hierarchy of directories. ] + +To disable an event, echo the event name to the set_event file prefixed +with an exclamation point: + + # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event + +To disable events, echo an empty line to the set_event file: + + # echo > /sys/kernel/debug/tracing/set_event + +The events are organized into subsystems, such as ext4, irq, sched, +etc., and a full event name looks like this: :. The +subsystem name is optional, but it is displayed in the available_events +file. All of the events in a subsystem can be specified via the syntax +":*"; for example, to enable all irq events, you can use the +command: + + # echo 'irq:*' > /sys/kernel/debug/tracing/set_event + +Defining an event-enabled tracepoint +------------------------------------ + +A kernel developer which wishes to define an event-enabled tracepoint +must declare the tracepoint using TRACE_EVENT instead of DECLARE_TRACE. +This is done via two header files in include/trace. For example, to +event-enable the jbd2 subsystem, we must create two files, +include/trace/jbd2.h and include/trace/jbd2_event_types.h. The +include/trace/jbd2.h file should be included by kernel source files that +will have a tracepoint inserted, and might look like this: + +#ifndef _TRACE_JBD2_H +#define _TRACE_JBD2_H + +#include +#include + +#include + +#endif + +In a file that utilizes a jbd2 tracepoint, this header file would be +included. Note that you still have to use DEFINE_TRACE(). So for +example, if fs/jbd2/commit.c planned to use the jbd2_start_commit +tracepoint, it would have the following near the beginning of the file: + +#include + +DEFINE_TRACE(jbd2_start_commit); + +Then in the function that would call the tracepoint, it would call the +tracepoint function. (For more information, please see the tracepoint +documentation in Documentation/trace/tracepoints.txt): + + trace_jbd2_start_commit(journal, commit_transaction); + +The code snippets which allow jbd2_start_commit to be an event-enabled +tracepoint are placed in the file include/trace/jbd2_event_types.h: + +/* use instead */ +#ifndef TRACE_EVENT +# error Do not include this file directly. +# error Unless you know what you are doing. +#endif + +#undef TRACE_SYSTEM +#define TRACE_SYSTEM jbd2 + +#include + +TRACE_EVENT(jbd2_start_commit, + TP_PROTO(journal_t *journal, transaction_t *commit_transaction), + TP_ARGS(journal, commit_transaction), + TP_STRUCT__entry( + __array( char, devname, BDEVNAME_SIZE+24 ) + __field( int, transaction ) + ), + TP_fast_assign( + memcpy(__entry->devname, journal->j_devname, BDEVNAME_SIZE+24); + __entry->transaction = commit_transaction->t_tid; + ), + TP_printk("dev %s transaction %d", + __entry->devname, __entry->transaction) +); + +The TP_PROTO and TP_ARGS are unchanged from DECLARE_TRACE. The new +arguments to TRACE_EVENT are TP_STRUCT__entry, TP_fast_assign, and +TP_printk. + +TP_STRUCT__entry defines the data structure which will be stored in the +trace buffer. Normally, fields in __entry will be arrays or simple +types. It is possible to place data structures in __entry --- however, +pointers in the data structure can not be trusted, since they will be +accessed sometime later by TP_printk, and if the data structure contains +fields that will not or cannot be used by TP_printk, this will waste +space in the trace buffer. In general, data structures should be +avoided, unless they do only contain non-pointer types and all of the +fields will be used by TP_printk. + +TP_fast_assign defines the code snippet which saves information into the +__entry data structure, using the passed-in arguments defined in +TP_PROTO and TP_ARGS. + +Finally, TP_printk will print the __entry data structure. At the time +when the code snippet defined by TP_printk is executed, it will not have +access to the TP_ARGS arguments; it can only use the information saved +in the __entry data structure. -- cgit v1.1 From ecfcc53fef3c357574bb6143dce6631e6d56295c Mon Sep 17 00:00:00 2001 From: Etienne Basset Date: Wed, 8 Apr 2009 20:40:06 +0200 Subject: smack: implement logging V3 the following patch, add logging of Smack security decisions. This is of course very useful to understand what your current smack policy does. As suggested by Casey, it also now forbids labels with ', " or \ It introduces a '/smack/logging' switch : 0: no logging 1: log denied (default) 2: log accepted 3: log denied&accepted Signed-off-by: Etienne Basset Acked-by: Casey Schaufler Acked-by: Eric Paris Signed-off-by: James Morris --- Documentation/Smack.txt | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/Smack.txt b/Documentation/Smack.txt index 629c92e..34614b4 100644 --- a/Documentation/Smack.txt +++ b/Documentation/Smack.txt @@ -184,8 +184,9 @@ length. Single character labels using special characters, that being anything other than a letter or digit, are reserved for use by the Smack development team. Smack labels are unstructured, case sensitive, and the only operation ever performed on them is comparison for equality. Smack labels cannot -contain unprintable characters or the "/" (slash) character. Smack labels -cannot begin with a '-', which is reserved for special options. +contain unprintable characters, the "/" (slash), the "\" (backslash), the "'" +(quote) and '"' (double-quote) characters. +Smack labels cannot begin with a '-', which is reserved for special options. There are some predefined labels: @@ -523,3 +524,18 @@ Smack supports some mount options: These mount options apply to all file system types. +Smack auditing + +If you want Smack auditing of security events, you need to set CONFIG_AUDIT +in your kernel configuration. +By default, all denied events will be audited. You can change this behavior by +writing a single character to the /smack/logging file : +0 : no logging +1 : log denied (default) +2 : log accepted +3 : log denied & accepted + +Events are logged as 'key=value' pairs, for each event you at least will get +the subjet, the object, the rights requested, the action, the kernel function +that triggered the event, plus other pairs depending on the type of event +audited. -- cgit v1.1 From 6fd9b3a40b82081d9e6490b0d7cd656e9a78a134 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Mon, 13 Apr 2009 21:31:18 -0700 Subject: rcu: Update RCU tracing documentation for __rcu_pending This patch updates the RCU documentation to reflect the changes in tracing made in the previous patch in the set. Located-by: Anton Blanchard Tested-by: Anton Blanchard Signed-off-by: Paul E. McKenney Cc: anton@samba.org Cc: akpm@linux-foundation.org Cc: dipankar@in.ibm.com Cc: manfred@colorfullife.com Cc: cl@linux-foundation.org Cc: josht@linux.vnet.ibm.com Cc: schamp@sgi.com Cc: niv@us.ibm.com Cc: dvhltc@us.ibm.com Cc: ego@in.ibm.com Cc: laijs@cn.fujitsu.com Cc: rostedt@goodmis.org Cc: peterz@infradead.org Cc: penberg@cs.helsinki.fi Cc: andi@firstfloor.org Cc: "Paul E. McKenney" LKML-Reference: <12396834792865-git-send-email-> Signed-off-by: Ingo Molnar --- Documentation/RCU/trace.txt | 102 ++++++++++++++++++++++++++++++++++---------- 1 file changed, 80 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index 0688482..02cced1 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -192,23 +192,24 @@ rcu/rcuhier (which displays the struct rcu_node hierarchy). The output of "cat rcu/rcudata" looks as follows: rcu: - 0 c=4011 g=4012 pq=1 pqc=4011 qp=0 rpfq=1 rp=3c2a dt=23301/73 dn=2 df=1882 of=0 ri=2126 ql=2 b=10 - 1 c=4011 g=4012 pq=1 pqc=4011 qp=0 rpfq=3 rp=39a6 dt=78073/1 dn=2 df=1402 of=0 ri=1875 ql=46 b=10 - 2 c=4010 g=4010 pq=1 pqc=4010 qp=0 rpfq=-5 rp=1d12 dt=16646/0 dn=2 df=3140 of=0 ri=2080 ql=0 b=10 - 3 c=4012 g=4013 pq=1 pqc=4012 qp=1 rpfq=3 rp=2b50 dt=21159/1 dn=2 df=2230 of=0 ri=1923 ql=72 b=10 - 4 c=4012 g=4013 pq=1 pqc=4012 qp=1 rpfq=3 rp=1644 dt=5783/1 dn=2 df=3348 of=0 ri=2805 ql=7 b=10 - 5 c=4012 g=4013 pq=0 pqc=4011 qp=1 rpfq=3 rp=1aac dt=5879/1 dn=2 df=3140 of=0 ri=2066 ql=10 b=10 - 6 c=4012 g=4013 pq=1 pqc=4012 qp=1 rpfq=3 rp=ed8 dt=5847/1 dn=2 df=3797 of=0 ri=1266 ql=10 b=10 - 7 c=4012 g=4013 pq=1 pqc=4012 qp=1 rpfq=3 rp=1fa2 dt=6199/1 dn=2 df=2795 of=0 ri=2162 ql=28 b=10 +rcu: + 0 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=10951/1 dn=0 df=1101 of=0 ri=36 ql=0 b=10 + 1 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=16117/1 dn=0 df=1015 of=0 ri=0 ql=0 b=10 + 2 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=1445/1 dn=0 df=1839 of=0 ri=0 ql=0 b=10 + 3 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=6681/1 dn=0 df=1545 of=0 ri=0 ql=0 b=10 + 4 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=1003/1 dn=0 df=1992 of=0 ri=0 ql=0 b=10 + 5 c=17829 g=17830 pq=1 pqc=17829 qp=1 dt=3887/1 dn=0 df=3331 of=0 ri=4 ql=2 b=10 + 6 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=859/1 dn=0 df=3224 of=0 ri=0 ql=0 b=10 + 7 c=17829 g=17830 pq=0 pqc=17829 qp=1 dt=3761/1 dn=0 df=1818 of=0 ri=0 ql=2 b=10 rcu_bh: - 0 c=-268 g=-268 pq=1 pqc=-268 qp=0 rpfq=-145 rp=21d6 dt=23301/73 dn=2 df=0 of=0 ri=0 ql=0 b=10 - 1 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-170 rp=20ce dt=78073/1 dn=2 df=26 of=0 ri=5 ql=0 b=10 - 2 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-83 rp=fbd dt=16646/0 dn=2 df=28 of=0 ri=4 ql=0 b=10 - 3 c=-268 g=-268 pq=1 pqc=-268 qp=0 rpfq=-105 rp=178c dt=21159/1 dn=2 df=28 of=0 ri=2 ql=0 b=10 - 4 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-30 rp=b54 dt=5783/1 dn=2 df=32 of=0 ri=0 ql=0 b=10 - 5 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-29 rp=df5 dt=5879/1 dn=2 df=30 of=0 ri=3 ql=0 b=10 - 6 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-28 rp=788 dt=5847/1 dn=2 df=32 of=0 ri=0 ql=0 b=10 - 7 c=-268 g=-268 pq=1 pqc=-268 qp=1 rpfq=-53 rp=1098 dt=6199/1 dn=2 df=30 of=0 ri=3 ql=0 b=10 + 0 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=10951/1 dn=0 df=0 of=0 ri=0 ql=0 b=10 + 1 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=16117/1 dn=0 df=13 of=0 ri=0 ql=0 b=10 + 2 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=1445/1 dn=0 df=15 of=0 ri=0 ql=0 b=10 + 3 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=6681/1 dn=0 df=9 of=0 ri=0 ql=0 b=10 + 4 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=1003/1 dn=0 df=15 of=0 ri=0 ql=0 b=10 + 5 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=3887/1 dn=0 df=15 of=0 ri=0 ql=0 b=10 + 6 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=859/1 dn=0 df=15 of=0 ri=0 ql=0 b=10 + 7 c=-275 g=-275 pq=1 pqc=-275 qp=0 dt=3761/1 dn=0 df=15 of=0 ri=0 ql=0 b=10 The first section lists the rcu_data structures for rcu, the second for rcu_bh. Each section has one line per CPU, or eight for this 8-CPU system. @@ -253,12 +254,6 @@ o "pqc" indicates which grace period the last-observed quiescent o "qp" indicates that RCU still expects a quiescent state from this CPU. -o "rpfq" is the number of rcu_pending() calls on this CPU required - to induce this CPU to invoke force_quiescent_state(). - -o "rp" is low-order four hex digits of the count of how many times - rcu_pending() has been invoked on this CPU. - o "dt" is the current value of the dyntick counter that is incremented when entering or leaving dynticks idle state, either by the scheduler or by irq. The number after the "/" is the interrupt @@ -305,6 +300,9 @@ o "b" is the batch limit for this CPU. If more than this number of RCU callbacks is ready to invoke, then the remainder will be deferred. +There is also an rcu/rcudata.csv file with the same information in +comma-separated-variable spreadsheet format. + The output of "cat rcu/rcugp" looks as follows: @@ -411,3 +409,63 @@ o Each element of the form "1/1 0:127 ^0" represents one struct For example, the first entry at the lowest level shows "^0", indicating that it corresponds to bit zero in the first entry at the middle level. + + +The output of "cat rcu/rcu_pending" looks as follows: + +rcu: + 0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 + 1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 + 2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 + 3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 + 4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 + 5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 + 6 np=219995 qsp=46718 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 + 7 np=249893 qsp=49390 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 +rcu_bh: + 0 np=146741 qsp=1419 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 + 1 np=155792 qsp=12597 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 + 2 np=136629 qsp=18680 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 + 3 np=137723 qsp=2843 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 + 4 np=123110 qsp=12433 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 + 5 np=137456 qsp=4210 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 + 6 np=120834 qsp=9902 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 + 7 np=144888 qsp=26336 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 + +As always, this is once again split into "rcu" and "rcu_bh" portions. +The fields are as follows: + +o "np" is the number of times that __rcu_pending() has been invoked + for the corresponding flavor of RCU. + +o "qsp" is the number of times that the RCU was waiting for a + quiescent state from this CPU. + +o "cbr" is the number of times that this CPU had RCU callbacks + that had passed through a grace period, and were thus ready + to be invoked. + +o "cng" is the number of times that this CPU needed another + grace period while RCU was idle. + +o "gpc" is the number of times that an old grace period had + completed, but this CPU was not yet aware of it. + +o "gps" is the number of times that a new grace period had started, + but this CPU was not yet aware of it. + +o "nf" is the number of times that this CPU suspected that the + current grace period had run for too long, and thus needed to + be forced. + + Please note that "forcing" consists of sending resched IPIs + to holdout CPUs. If that CPU really still is in an old RCU + read-side critical section, then we really do have to wait for it. + The assumption behing "forcing" is that the CPU is not still in + an old RCU read-side critical section, but has not yet responded + for some other reason. + +o "nn" is the number of times that this CPU needed nothing. Alert + readers will note that the rcu "nn" number for a given CPU very + closely matches the rcu_bh "np" number for that same CPU. This + is due to short-circuit evaluation in rcu_pending(). -- cgit v1.1 From 8338c300642f67af5a8cc279ca5e088c7055b7eb Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 14 Apr 2009 12:30:36 +0200 Subject: ALSA: Add missing description of lx6464es to ALSA-Configuration.txt Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/ALSA-Configuration.txt | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..dfd266e 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -1093,6 +1093,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. This module supports multiple cards. The driver requires the firmware loader support on kernel. + Module snd-lx6464es + ------------------- + + Module for Digigram LX6464ES boards + + This module supports multiple cards. + Module snd-maestro3 ------------------- -- cgit v1.1 From bd3ce6556072bdc8ea66dfd5448e184f189bdc7f Mon Sep 17 00:00:00 2001 From: H Hartley Sweeten Date: Fri, 17 Apr 2009 20:12:35 -0700 Subject: Input: rotary_encoder - add support for REL_* axes The rotary encoder driver only supports returning input events for ABS_* axes, this adds support for REL_* axes. The relative axis input event is reported as -1 for each counter-clockwise step and +1 for each clockwise step. The ability to clamp the position of ABS_* axes between 0 and a maximum of "steps" has also been added. Signed-off-by: H Hartley Sweeten Signed-off-by: Daniel Mack Signed-off-by: Dmitry Torokhov --- Documentation/input/rotary-encoder.txt | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/input/rotary-encoder.txt b/Documentation/input/rotary-encoder.txt index 435102a..3a6aec4 100644 --- a/Documentation/input/rotary-encoder.txt +++ b/Documentation/input/rotary-encoder.txt @@ -67,7 +67,12 @@ data with it. struct rotary_encoder_platform_data is declared in include/linux/rotary-encoder.h and needs to be filled with the number of steps the encoder has and can carry information about externally inverted -signals (because of used invertig buffer or other reasons). +signals (because of an inverting buffer or other reasons). The encoder +can be set up to deliver input information as either an absolute or relative +axes. For relative axes the input event returns +/-1 for each step. For +absolute axes the position of the encoder can either roll over between zero +and the number of steps or will clamp at the maximum and zero depending on +the configuration. Because GPIO to IRQ mapping is platform specific, this information must be given in seperately to the driver. See the example below. @@ -85,6 +90,8 @@ be given in seperately to the driver. See the example below. static struct rotary_encoder_platform_data my_rotary_encoder_info = { .steps = 24, .axis = ABS_X, + .relative_axis = false, + .rollover = false, .gpio_a = GPIO_ROTARY_A, .gpio_b = GPIO_ROTARY_B, .inverted_a = 0, -- cgit v1.1 From 4a74491ef514c01f45d820b10501a5cfcccd3c73 Mon Sep 17 00:00:00 2001 From: Thadeu Lima de Souza Cascardo Date: Fri, 17 Apr 2009 20:35:58 -0700 Subject: Input: fix typo in documentation Event type for key presses is EV_KEY, not REL_KEY. Signed-off-by: Dmitry Torokhov --- Documentation/input/input.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt index 686ee99..b93c084 100644 --- a/Documentation/input/input.txt +++ b/Documentation/input/input.txt @@ -278,7 +278,7 @@ struct input_event { }; 'time' is the timestamp, it returns the time at which the event happened. -Type is for example EV_REL for relative moment, REL_KEY for a keypress or +Type is for example EV_REL for relative moment, EV_KEY for a keypress or release. More types are defined in include/linux/input.h. 'code' is event code, for example REL_X or KEY_BACKSPACE, again a complete -- cgit v1.1 From 2b2fd87a6ef56ba7647a578e81bb8c8efda166b8 Mon Sep 17 00:00:00 2001 From: Weidong Han Date: Fri, 17 Apr 2009 16:42:12 +0800 Subject: docs, x86: add nox2apic back to kernel-parameters.txt "nox2apic" was removed from kernel-parameters.txt by mistake, when entries were sorted in alpha order (commit 0cb55ad2). But this early parameter is still there, add it back to kernel-parameters.txt. [ Impact: add boot parameter description ] Signed-off-by: Suresh Siddha Signed-off-by: Weidong Han Acked-by: David Woodhouse Cc: Randy Dunlap Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-2-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar --- Documentation/kernel-parameters.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 6172e43..33989d2 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1588,6 +1588,8 @@ and is between 256 and 4096 characters. It is defined in the file nowb [ARM] + nox2apic [X86-64,APIC] Do not enable x2APIC mode. + nptcg= [IA64] Override max number of concurrent global TLB purges which is reported from either PAL_VM_SUMMARY or SAL PALO. -- cgit v1.1 From 03ea81550676296d94596e4337c771c6ba29f542 Mon Sep 17 00:00:00 2001 From: Weidong Han Date: Fri, 17 Apr 2009 16:42:15 +0800 Subject: x86, intr-remap: add option to disable interrupt remapping Add option "nointremap" to disable interrupt remapping. [ Impact: add new boot option ] Signed-off-by: Suresh Siddha Signed-off-by: Weidong Han Acked-by: David Woodhouse Cc: iommu@lists.linux-foundation.org Cc: allen.m.kay@intel.com Cc: fenghua.yu@intel.com LKML-Reference: <1239957736-6161-5-git-send-email-weidong.han@intel.com> Signed-off-by: Ingo Molnar --- Documentation/kernel-parameters.txt | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 33989d2..843cb66 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1533,6 +1533,9 @@ and is between 256 and 4096 characters. It is defined in the file noinitrd [RAM] Tells the kernel not to load any configured initial RAM disk. + nointremap [X86-64, Intel-IOMMU] Do not enable interrupt + remapping. + nointroute [IA-64] nojitter [IA64] Disables jitter checking for ITC timers. -- cgit v1.1 From 246d0a17f5e09af0794960164269fc8988a8811c Mon Sep 17 00:00:00 2001 From: Mark Brown Date: Wed, 22 Apr 2009 18:24:55 +0100 Subject: ASoC: Add power supply widget to DAPM Many modern CODECs have shared resources on chip which must be enabled for portions of the chip to work but which can be disabled at other times in order to achieve power savings. Examples of such resources include power supplies and some internal clocks. Since these widgets are dependencies for the audio path but do not carry audio signals they require slightly different handling to most widgets - they do not contribute to the audio path and so should not be counted as either inputs or outputs during path walks. Cases where one supply provides a supply for another will require additional work. There is also room for more optimisation of the graph walking to avoid repeated checks for the same thing. Signed-off-by: Mark Brown --- Documentation/sound/alsa/soc/dapm.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/soc/dapm.txt b/Documentation/sound/alsa/soc/dapm.txt index 9e67632..9ac842b 100644 --- a/Documentation/sound/alsa/soc/dapm.txt +++ b/Documentation/sound/alsa/soc/dapm.txt @@ -62,6 +62,7 @@ Audio DAPM widgets fall into a number of types:- o Mic - Mic (and optional Jack) o Line - Line Input/Output (and optional Jack) o Speaker - Speaker + o Supply - Power or clock supply widget used by other widgets. o Pre - Special PRE widget (exec before all others) o Post - Special POST widget (exec after all others) -- cgit v1.1 From 621cac85297de5ba655e3430b007dd2e0da91da6 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Fri, 27 Mar 2009 14:14:31 +0100 Subject: rfkill: remove user_claim stuff Almost all drivers do not support user_claim, so remove it completely and always report -EOPNOTSUPP to userspace. Since userspace cannot really drive rfkill _anyway_ (due to the odd restrictions imposed by the documentation) having this code is just pointless. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/rfkill.txt | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index 4d3ee31..40c3a3f 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -521,16 +521,12 @@ status of the system. Input devices may issue events that are related to rfkill. These are the various KEY_* events and SW_* events supported by rfkill-input.c. -******IMPORTANT****** -When rfkill-input is ACTIVE, userspace is NOT TO CHANGE THE STATE OF AN RFKILL -SWITCH IN RESPONSE TO AN INPUT EVENT also handled by rfkill-input, unless it -has set to true the user_claim attribute for that particular switch. This rule -is *absolute*; do NOT violate it. -******IMPORTANT****** - -Userspace must not assume it is the only source of control for rfkill switches. -Their state CAN and WILL change due to firmware actions, direct user actions, -and the rfkill-input EPO override for *_RFKILL_ALL. +Userspace may not change the state of an rfkill switch in response to an +input event, it should refrain from changing states entirely. + +Userspace cannot assume it is the only source of control for rfkill switches. +Their state can change due to firmware actions, direct user actions, and the +rfkill-input EPO override for *_RFKILL_ALL. When rfkill-input is not active, userspace must initiate a rfkill status change by writing to the "state" attribute in order for anything to happen. -- cgit v1.1 From de2b3e864aa908e613dd9912def88af7877d85f3 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Wed, 8 Apr 2009 12:54:41 +0200 Subject: mac80211: update injection documentation We don't currently support antenna or rate setting, so remove that. Also update the link -- the current one is dead and the wiki can be kept updated easier. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/networking/mac80211-injection.txt | 28 ++++++------------------- 1 file changed, 6 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt index 84906ef..b30e81a 100644 --- a/Documentation/networking/mac80211-injection.txt +++ b/Documentation/networking/mac80211-injection.txt @@ -12,38 +12,22 @@ following format: The radiotap format is discussed in ./Documentation/networking/radiotap-headers.txt. -Despite 13 radiotap argument types are currently defined, most only make sense +Despite many radiotap parameters being currently defined, most only make sense to appear on received packets. The following information is parsed from the radiotap headers and used to control injection: - * IEEE80211_RADIOTAP_RATE - - rate in 500kbps units, automatic if invalid or not present - - - * IEEE80211_RADIOTAP_ANTENNA - - antenna to use, automatic if not present - - - * IEEE80211_RADIOTAP_DBM_TX_POWER - - transmit power in dBm, automatic if not present - - * IEEE80211_RADIOTAP_FLAGS IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the - current fragmentation threshold. Note that - this flag is only reliable when software - fragmentation is enabled) + current fragmentation threshold. + The injection code can also skip all other currently defined radiotap fields facilitating replay of captured radiotap headers directly. -Here is an example valid radiotap header defining these three parameters +Here is an example valid radiotap header defining some parameters 0x00, 0x00, // <-- radiotap version 0x0b, 0x00, // <- radiotap header length @@ -72,8 +56,8 @@ interface), along the following lines: ... r = pcap_inject(ppcap, u8aSendBuffer, nLength); -You can also find sources for a complete inject test applet here: +You can also find a link to a complete inject application here: -http://penumbra.warmcat.com/_twk/tiki-index.php?page=packetspammer +http://wireless.kernel.org/en/users/Documentation/packetspammer Andy Green -- cgit v1.1 From 4ba170c2bb77713e999ac6fb9ffe6ddd99d2f25a Mon Sep 17 00:00:00 2001 From: Benny Halevy Date: Tue, 7 Apr 2009 21:45:37 +0300 Subject: update Documentation/filesystems/00-INDEX with new nfsd related docs. Signed-off-by: Benny Halevy Cc: James Lentini Signed-off-by: J. Bruce Fields --- Documentation/filesystems/00-INDEX | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX index 8dd6db7..f15621e 100644 --- a/Documentation/filesystems/00-INDEX +++ b/Documentation/filesystems/00-INDEX @@ -66,6 +66,10 @@ mandatory-locking.txt - info on the Linux implementation of Sys V mandatory file locking. ncpfs.txt - info on Novell Netware(tm) filesystem using NCP protocol. +nfs41-server.txt + - info on the Linux server implementation of NFSv4 minor version 1. +nfs-rdma.txt + - how to install and setup the Linux NFS/RDMA client and server software. nfsroot.txt - short guide on setting up a diskless box with NFS root filesystem. nilfs2.txt -- cgit v1.1 From 4ed0d3e6c64cfd9ba4ceb2099b10d1cf8ece4320 Mon Sep 17 00:00:00 2001 From: Fenghua Yu Date: Fri, 24 Apr 2009 17:30:20 -0700 Subject: Intel IOMMU Pass Through Support The patch adds kernel parameter intel_iommu=pt to set up pass through mode in context mapping entry. This disables DMAR in linux kernel; but KVM still runs on VT-d and interrupt remapping still works. In this mode, kernel uses swiotlb for DMA API functions but other VT-d functionalities are enabled for KVM. KVM always uses multi level translation page table in VT-d. By default, pass though mode is disabled in kernel. This is useful when people don't want to enable VT-d DMAR in kernel but still want to use KVM and interrupt remapping for reasons like DMAR performance concern or debug purpose. Signed-off-by: Fenghua Yu Acked-by: Weidong Han Signed-off-by: David Woodhouse --- Documentation/kernel-parameters.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 600cdd7..fa4faeb 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -965,6 +965,7 @@ and is between 256 and 4096 characters. It is defined in the file nomerge forcesac soft + pt [x86, IA64] io7= [HW] IO7 for Marvel based alpha systems See comment before marvel_specify_io7 in -- cgit v1.1 From 50fa610a3b6ba7cf91d7a92229177dfaff2b81a1 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 28 Apr 2009 15:01:38 +0100 Subject: sched: Document memory barriers implied by sleep/wake-up primitives Add a section to the memory barriers document to note the implied memory barriers of sleep primitives (set_current_state() and wrappers) and wake-up primitives (wake_up() and co.). Also extend the in-code comments on the wake_up() functions to note these implied barriers. [ Impact: add documentation ] Signed-off-by: David Howells Cc: Oleg Nesterov Cc: Linus Torvalds Cc: Andrew Morton LKML-Reference: <20090428140138.1192.94723.stgit@warthog.procyon.org.uk> Signed-off-by: Ingo Molnar --- Documentation/memory-barriers.txt | 129 +++++++++++++++++++++++++++++++++++++- 1 file changed, 128 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index f5b7127..7f5809e 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -31,6 +31,7 @@ Contents: - Locking functions. - Interrupt disabling functions. + - Sleep and wake-up functions. - Miscellaneous functions. (*) Inter-CPU locking barrier effects. @@ -1217,6 +1218,132 @@ barriers are required in such a situation, they must be provided from some other means. +SLEEP AND WAKE-UP FUNCTIONS +--------------------------- + +Sleeping and waking on an event flagged in global data can be viewed as an +interaction between two pieces of data: the task state of the task waiting for +the event and the global data used to indicate the event. To make sure that +these appear to happen in the right order, the primitives to begin the process +of going to sleep, and the primitives to initiate a wake up imply certain +barriers. + +Firstly, the sleeper normally follows something like this sequence of events: + + for (;;) { + set_current_state(TASK_UNINTERRUPTIBLE); + if (event_indicated) + break; + schedule(); + } + +A general memory barrier is interpolated automatically by set_current_state() +after it has altered the task state: + + CPU 1 + =============================== + set_current_state(); + set_mb(); + STORE current->state + + LOAD event_indicated + +set_current_state() may be wrapped by: + + prepare_to_wait(); + prepare_to_wait_exclusive(); + +which therefore also imply a general memory barrier after setting the state. +The whole sequence above is available in various canned forms, all of which +interpolate the memory barrier in the right place: + + wait_event(); + wait_event_interruptible(); + wait_event_interruptible_exclusive(); + wait_event_interruptible_timeout(); + wait_event_killable(); + wait_event_timeout(); + wait_on_bit(); + wait_on_bit_lock(); + + +Secondly, code that performs a wake up normally follows something like this: + + event_indicated = 1; + wake_up(&event_wait_queue); + +or: + + event_indicated = 1; + wake_up_process(event_daemon); + +A write memory barrier is implied by wake_up() and co. if and only if they wake +something up. The barrier occurs before the task state is cleared, and so sits +between the STORE to indicate the event and the STORE to set TASK_RUNNING: + + CPU 1 CPU 2 + =============================== =============================== + set_current_state(); STORE event_indicated + set_mb(); wake_up(); + STORE current->state + STORE current->state + LOAD event_indicated + +The available waker functions include: + + complete(); + wake_up(); + wake_up_all(); + wake_up_bit(); + wake_up_interruptible(); + wake_up_interruptible_all(); + wake_up_interruptible_nr(); + wake_up_interruptible_poll(); + wake_up_interruptible_sync(); + wake_up_interruptible_sync_poll(); + wake_up_locked(); + wake_up_locked_poll(); + wake_up_nr(); + wake_up_poll(); + wake_up_process(); + + +[!] Note that the memory barriers implied by the sleeper and the waker do _not_ +order multiple stores before the wake-up with respect to loads of those stored +values after the sleeper has called set_current_state(). For instance, if the +sleeper does: + + set_current_state(TASK_INTERRUPTIBLE); + if (event_indicated) + break; + __set_current_state(TASK_RUNNING); + do_something(my_data); + +and the waker does: + + my_data = value; + event_indicated = 1; + wake_up(&event_wait_queue); + +there's no guarantee that the change to event_indicated will be perceived by +the sleeper as coming after the change to my_data. In such a circumstance, the +code on both sides must interpolate its own memory barriers between the +separate data accesses. Thus the above sleeper ought to do: + + set_current_state(TASK_INTERRUPTIBLE); + if (event_indicated) { + smp_rmb(); + do_something(my_data); + } + +and the waker should do: + + my_data = value; + smp_wmb(); + event_indicated = 1; + wake_up(&event_wait_queue); + + MISCELLANEOUS FUNCTIONS ----------------------- @@ -1366,7 +1493,7 @@ WHERE ARE MEMORY BARRIERS NEEDED? Under normal operation, memory operation reordering is generally not going to be a problem as a single-threaded linear piece of code will still appear to -work correctly, even if it's in an SMP kernel. There are, however, three +work correctly, even if it's in an SMP kernel. There are, however, four circumstances in which reordering definitely _could_ be a problem: (*) Interprocessor interaction. -- cgit v1.1 From a76f8c6da1e48fd4ef025f42c736389532ff30ba Mon Sep 17 00:00:00 2001 From: Jason Baron Date: Thu, 30 Apr 2009 13:29:42 -0400 Subject: tracing: add new tracepoints docbook Add tracepoint docbook. This will help us document and understand what tracepoints are in the kernel. Since there are multiple macros, and files that contain tracepoints. [ Impact: add documentation ] Signed-off-by: Jason Baron Acked-by: Randy Dunlap Cc: akpm@linux-foundation.org Cc: rostedt@goodmis.org Cc: fweisbec@gmail.com Cc: mathieu.desnoyers@polymtl.ca Cc: wcohen@redhat.com LKML-Reference: <84160b6bd94aff02455da7e12bad054d34c579a0.1241107197.git.jbaron@redhat.com> Signed-off-by: Ingo Molnar --- Documentation/DocBook/Makefile | 3 +- Documentation/DocBook/tracepoint.tmpl | 84 +++++++++++++++++++++++++++++++++++ 2 files changed, 86 insertions(+), 1 deletion(-) create mode 100644 Documentation/DocBook/tracepoint.tmpl (limited to 'Documentation') diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 8918a32..4c8f4d6 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -13,7 +13,8 @@ DOCBOOKS := z8530book.xml mcabook.xml device-drivers.xml \ gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \ genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \ mac80211.xml debugobjects.xml sh.xml regulator.xml \ - alsa-driver-api.xml writing-an-alsa-driver.xml + alsa-driver-api.xml writing-an-alsa-driver.xml \ + tracepoint.xml ### # The build process is as follows (targets): diff --git a/Documentation/DocBook/tracepoint.tmpl b/Documentation/DocBook/tracepoint.tmpl new file mode 100644 index 0000000..70891bc --- /dev/null +++ b/Documentation/DocBook/tracepoint.tmpl @@ -0,0 +1,84 @@ + + + + + + The Linux Kernel Tracepoint API + + + + Jason + Baron + +
+ jbaron@redhat.com +
+
+
+
+ + + + This documentation is free software; you can redistribute + it and/or modify it under the terms of the GNU General Public + License as published by the Free Software Foundation; either + version 2 of the License, or (at your option) any later + version. + + + + This program is distributed in the hope that it will be + useful, but WITHOUT ANY WARRANTY; without even the implied + warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + See the GNU General Public License for more details. + + + + You should have received a copy of the GNU General Public + License along with this program; if not, write to the Free + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, + MA 02111-1307 USA + + + + For more details see the file COPYING in the source + distribution of Linux. + + +
+ + + + Introduction + + Tracepoints are static probe points that are located in strategic points + throughout the kernel. 'Probes' register/unregister with tracepoints + via a callback mechanism. The 'probes' are strictly typed functions that + are passed a unique set of parameters defined by each tracepoint. + + + + From this simple callback mechanism, 'probes' can be used to profile, debug, + and understand kernel behavior. There are a number of tools that provide a + framework for using 'probes'. These tools include Systemtap, ftrace, and + LTTng. + + + + Tracepoints are defined in a number of header files via various macros. Thus, + the purpose of this document is to provide a clear accounting of the available + tracepoints. The intention is to understand not only what tracepoints are + available but also to understand where future tracepoints might be added. + + + + The API presented has functions of the form: + trace_tracepointname(function parameters). These are the + tracepoints callbacks that are found throughout the code. Registering and + unregistering probes with these callback sites is covered in the + Documentation/trace/* directory. + + + +
-- cgit v1.1 From 9ee1983c9aa18f12388ef660d0c76a23dc112959 Mon Sep 17 00:00:00 2001 From: Jason Baron Date: Thu, 30 Apr 2009 13:29:47 -0400 Subject: tracing: add irq tracepoint documentation Document irqs for the newly created docbook. [ Impact: add documentation ] Signed-off-by: Jason Baron Acked-by: Randy Dunlap Cc: akpm@linux-foundation.org Cc: rostedt@goodmis.org Cc: fweisbec@gmail.com Cc: mathieu.desnoyers@polymtl.ca Cc: wcohen@redhat.com LKML-Reference: <73ff42be3420157667ec548e9b0e409c3cfad05f.1241107197.git.jbaron@redhat.com> Signed-off-by: Ingo Molnar --- Documentation/DocBook/tracepoint.tmpl | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DocBook/tracepoint.tmpl b/Documentation/DocBook/tracepoint.tmpl index 70891bc..b0756d0 100644 --- a/Documentation/DocBook/tracepoint.tmpl +++ b/Documentation/DocBook/tracepoint.tmpl @@ -81,4 +81,9 @@ + + IRQ +!Iinclude/trace/events/irq.h + + -- cgit v1.1 From 514bf54cd8c7f172816d3c003a6d022e9165a29b Mon Sep 17 00:00:00 2001 From: James Gardiner Date: Sun, 3 May 2009 04:00:44 -0400 Subject: ALSA: hda - Addition for HP dv4-1222nr laptop support Signed-off-by: James Gardiner Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 8eec05b..36c9712 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -347,6 +347,7 @@ STAC92HD71B* hp-m4 HP mini 1000 hp-dv5 HP dv series hp-hdx HP HDX series + hp-dv4-1222nr HP dv4-1222nr (with LED support) auto BIOS setup (default) STAC92HD73* -- cgit v1.1 From b0ec3a30bc01c15cc6277b223fae136f7b71e90c Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Sun, 3 May 2009 10:39:19 +0200 Subject: ALSA: sc6000: enable joystick port Add module parameter to enable or disable joystick port (gameport) on the SC6600 and later cards. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/ALSA-Configuration.txt | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..11a80a9c71 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -1543,13 +1543,15 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module snd-sc6000 ----------------- - Module for Gallant SC-6000 soundcard. + Module for Gallant SC-6000 soundcard and later models: SC-6600 + and SC-7000. port - Port # (0x220 or 0x240) mss_port - MSS Port # (0x530 or 0xe80) irq - IRQ # (5,7,9,10,11) mpu_irq - MPU-401 IRQ # (5,7,9,10) ,0 - no MPU-401 irq dma - DMA # (1,3,0) + joystick - Enable gameport - 0 = disable (default), 1 = enable This module supports multiple cards. -- cgit v1.1 From 255cac91c3c9ce7dca7713b93ab03c75b7902e0e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= Date: Mon, 4 May 2009 11:07:36 -0700 Subject: tcp: extend ECN sysctl to allow server-side only ECN MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This should be very safe compared with full enabled, so I see no reason why it shouldn't be done right away. As ECN can only be negotiated if the SYN sending party is also supporting it, somebody in the loop probably knows what he/she is doing. If SYN does not ask for ECN, the server side SYN-ACK is identical to what it is without ECN. Thus it's quite safe. The chosen value is safe w.r.t to existing configs which choose to currently set manually either 0 or 1 but silently upgrades those who have not explicitly requested ECN off. Whether to just enable both sides comes up time to time but unless that gets done now we can at least make the servers aware of ECN already. As there are some known problems to occur if ECN is enabled, it's currently questionable whether there's any real gain from enabling clients as servers mostly won't support it anyway (so we'd hit just the negative sides). After enabling the servers and getting that deployed, the client end enable really has some potential gain too. Signed-off-by: Ilpo Järvinen Signed-off-by: David S. Miller --- Documentation/networking/ip-sysctl.txt | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index ec5de02..7f98aa3 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -168,7 +168,16 @@ tcp_dsack - BOOLEAN Allows TCP to send "duplicate" SACKs. tcp_ecn - BOOLEAN - Enable Explicit Congestion Notification in TCP. + Enable Explicit Congestion Notification (ECN) in TCP. ECN is only + used when both ends of the TCP flow support it. It is useful to + avoid losses due to congestion (when the bottleneck router supports + ECN). + Possible values are: + 0 disable ECN + 1 ECN enabled + 2 Only server-side ECN enabled. If the other end does + not support ECN, behavior is like with ECN disabled. + Default: 2 tcp_fack - BOOLEAN Enable FACK congestion avoidance and fast retransmission. -- cgit v1.1 From 60aa605dfce2976e54fa76e805ab0f221372d4d9 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Tue, 5 May 2009 17:50:21 +0200 Subject: sched: rt: document the risk of small values in the bandwidth settings Thomas noted that we should disallow sysctl_sched_rt_runtime == 0 for (!RT_GROUP) since the root group always has some RT tasks in it. Further, update the documentation to inspire clue. [ Impact: exclude corner-case sysctl_sched_rt_runtime value ] Reported-by: Thomas Gleixner Signed-off-by: Peter Zijlstra LKML-Reference: <20090505155436.863098054@chello.nl> Signed-off-by: Ingo Molnar --- Documentation/scheduler/sched-rt-group.txt | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'Documentation') diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt index 5ba4d3f..eb74b01 100644 --- a/Documentation/scheduler/sched-rt-group.txt +++ b/Documentation/scheduler/sched-rt-group.txt @@ -4,6 +4,7 @@ CONTENTS ======== +0. WARNING 1. Overview 1.1 The problem 1.2 The solution @@ -14,6 +15,23 @@ CONTENTS 3. Future plans +0. WARNING +========== + + Fiddling with these settings can result in an unstable system, the knobs are + root only and assumes root knows what he is doing. + +Most notable: + + * very small values in sched_rt_period_us can result in an unstable + system when the period is smaller than either the available hrtimer + resolution, or the time it takes to handle the budget refresh itself. + + * very small values in sched_rt_runtime_us can result in an unstable + system when the runtime is so small the system has difficulty making + forward progress (NOTE: the migration thread and kstopmachine both + are real-time processes). + 1. Overview =========== -- cgit v1.1 From c898faf91b3ec6b0f6efa35831b3984fa3331db0 Mon Sep 17 00:00:00 2001 From: Rik van Riel Date: Tue, 5 May 2009 17:28:56 -0400 Subject: x86: 46 bit physical address support on 64 bits Extend the maximum addressable memory on x86-64 from 2^44 to 2^46 bytes. This requires some shuffling around of the vmalloc and virtual memmap memory areas, to keep them away from the direct mapping of up to 64TB of physical memory. This patch also introduces a guard hole between the vmalloc area and the virtual memory map space. There's really no good reason why we wouldn't have a guard hole there. [ Impact: future hardware enablement ] Signed-off-by: Rik van Riel LKML-Reference: <20090505172856.6820db22@cuia.bos.redhat.com> Signed-off-by: H. Peter Anvin --- Documentation/x86/x86_64/mm.txt | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index 29b52b1..5394132 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -6,10 +6,11 @@ Virtual memory map with 4 level page tables: 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm hole caused by [48:63] sign extension ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole -ffff880000000000 - ffffc0ffffffffff (=57 TB) direct mapping of all phys. memory -ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole -ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space -ffffe20000000000 - ffffe2ffffffffff (=40 bits) virtual memory map (1TB) +ffff880000000000 - ffffc8ffffffffff (=64 TB) direct mapping of all phys. memory +ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole +ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space +ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole +ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB) ... unused hole ... ffffffff80000000 - ffffffffa0000000 (=512 MB) kernel text mapping, from phys 0 ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space -- cgit v1.1 From 2feceeff1e771850e49f9074307f071964fd9e3e Mon Sep 17 00:00:00 2001 From: "H. Peter Anvin" Date: Tue, 5 May 2009 19:07:07 -0700 Subject: x86: fix typo in address space documentation Fix a trivial typo in Documentation/x86/x86_64/mm.txt. [ Impact: documentation only ] Signed-off-by: H. Peter Anvin Cc: Rik van Riel --- Documentation/x86/x86_64/mm.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt index 5394132..d6498e3 100644 --- a/Documentation/x86/x86_64/mm.txt +++ b/Documentation/x86/x86_64/mm.txt @@ -6,7 +6,7 @@ Virtual memory map with 4 level page tables: 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm hole caused by [48:63] sign extension ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole -ffff880000000000 - ffffc8ffffffffff (=64 TB) direct mapping of all phys. memory +ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole -- cgit v1.1 From 72cbfd45fac627de4bd38c203baaac40cd26477c Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 May 2009 17:33:19 +0200 Subject: ALSA: ice1724 - Add ESI Maya44 support Added the support for ESI Maya44 board to ice1724 driver. Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/ALSA-Configuration.txt | 3 +- Documentation/sound/alsa/README.maya44 | 163 ++++++++++++++++++++++++ 2 files changed, 165 insertions(+), 1 deletion(-) create mode 100644 Documentation/sound/alsa/README.maya44 (limited to 'Documentation') diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..821a994 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -925,6 +925,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * Onkyo SE-90PCI * Onkyo SE-200PCI * ESI Juli@ + * ESI Maya44 * Hercules Fortissimo IV * EGO-SYS WaveTerminal 192M @@ -933,7 +934,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. prodigy71xt, prodigy71hifi, prodigyhd2, prodigy192, juli, aureon51, aureon71, universe, ap192, k8x800, phase22, phase28, ms300, av710, se200pci, se90pci, - fortissimo4, sn25p, WT192M + fortissimo4, sn25p, WT192M, maya44 This module supports multiple cards and autoprobe. diff --git a/Documentation/sound/alsa/README.maya44 b/Documentation/sound/alsa/README.maya44 new file mode 100644 index 0000000..0e41576 --- /dev/null +++ b/Documentation/sound/alsa/README.maya44 @@ -0,0 +1,163 @@ +NOTE: The following is the original document of Rainer's patch that the +current maya44 code based on. Some contents might be obsoleted, but I +keep here as reference -- tiwai + +---------------------------------------------------------------- + +STATE OF DEVELOPMENT: + +This driver is being developed on the initiative of Piotr Makowski (oponek@gmail.com) and financed by Lars Bergmann. +Development is carried out by Rainer Zimmermann (mail@lightshed.de). + +ESI provided a sample Maya44 card for the development work. + +However, unfortunately it has turned out difficult to get detailed programming information, so I (Rainer Zimmermann) had to find out some card-specific information by experiment and conjecture. Some information (in particular, several GPIO bits) is still missing. + +This is the first testing version of the Maya44 driver released to the alsa-devel mailing list (Feb 5, 2008). + + +The following functions work, as tested by Rainer Zimmermann and Piotr Makowski: + +- playback and capture at all sampling rates +- input/output level +- crossmixing +- line/mic switch +- phantom power switch +- analogue monitor a.k.a bypass + + +The following functions *should* work, but are not fully tested: + +- Channel 3+4 analogue - S/PDIF input switching +- S/PDIF output +- all inputs/outputs on the M/IO/DIO extension card +- internal/external clock selection + + +*In particular, we would appreciate testing of these functions by anyone who has access to an M/IO/DIO extension card.* + + +Things that do not seem to work: + +- The level meters ("multi track") in 'alsamixer' do not seem to react to signals in (if this is a bug, it would probably be in the existing ICE1724 code). + +- Ardour 2.1 seems to work only via JACK, not using ALSA directly or via OSS. This still needs to be tracked down. + + +DRIVER DETAILS: + +the following files were added: + +pci/ice1724/maya44.c - Maya44 specific code +pci/ice1724/maya44.h +pci/ice1724/ice1724.patch +pci/ice1724/ice1724.h.patch - PROPOSED patch to ice1724.h (see SAMPLING RATES) +i2c/other/wm8776.c - low-level access routines for Wolfson WM8776 codecs +include/wm8776.h + + +Note that the wm8776.c code is meant to be card-independent and does not actually register the codec with the ALSA infrastructure. +This is done in maya44.c, mainly because some of the WM8776 controls are used in Maya44-specific ways, and should be named appropriately. + + +the following files were created in pci/ice1724, simply #including the corresponding file from the alsa-kernel tree: + +wtm.h +vt1720_mobo.h +revo.h +prodigy192.h +pontis.h +phase.h +maya44.h +juli.h +aureon.h +amp.h +envy24ht.h +se.h +prodigy_hifi.h + + +*I hope this is the correct way to do things.* + + +SAMPLING RATES: + +The Maya44 card (or more exactly, the Wolfson WM8776 codecs) allow a maximum sampling rate of 192 kHz for playback and 92 kHz for capture. + +As the ICE1724 chip only allows one global sampling rate, this is handled as follows: + +* setting the sampling rate on any open PCM device on the maya44 card will always set the *global* sampling rate for all playback and capture channels. + +* In the current state of the driver, setting rates of up to 192 kHz is permitted even for capture devices. + +*AVOID CAPTURING AT RATES ABOVE 96kHz*, even though it may appear to work. The codec cannot actually capture at such rates, meaning poor quality. + + +I propose some additional code for limiting the sampling rate when setting on a capture pcm device. However because of the global sampling rate, this logic would be somewhat problematic. + +The proposed code (currently deactivated) is in ice1712.h.patch, ice1724.c and maya44.c (in pci/ice1712). + + +SOUND DEVICES: + +PCM devices correspond to inputs/outputs as follows (assuming Maya44 is card #0): + +hw:0,0 input - stereo, analog input 1+2 +hw:0,0 output - stereo, analog output 1+2 +hw:0,1 input - stereo, analog input 3+4 OR S/PDIF input +hw:0,1 output - stereo, analog output 3+4 (and SPDIF out) + + +NAMING OF MIXER CONTROLS: + +(for more information about the signal flow, please refer to the block diagram on p.24 of the ESI Maya44 manual, or in the ESI windows software). + + +PCM: (digital) output level for channel 1+2 +PCM 1: same for channel 3+4 + +Mic Phantom+48V: switch for +48V phantom power for electrostatic microphones on input 1/2. + Make sure this is not turned on while any other source is connected to input 1/2. + It might damage the source and/or the maya44 card. + +Mic/Line input: if switch is is on, input jack 1/2 is microphone input (mono), otherwise line input (stereo). + +Bypass: analogue bypass from ADC input to output for channel 1+2. Same as "Monitor" in the windows driver. +Bypass 1: same for channel 3+4. + +Crossmix: cross-mixer from channels 1+2 to channels 3+4 +Crossmix 1: cross-mixer from channels 3+4 to channels 1+2 + +IEC958 Output: switch for S/PDIF output. + This is not supported by the ESI windows driver. + S/PDIF should output the same signal as channel 3+4. [untested!] + + +Digitial output selectors: + + These switches allow a direct digital routing from the ADCs to the DACs. + Each switch determines where the digital input data to one of the DACs comes from. + They are not supported by the ESI windows driver. + For normal operation, they should all be set to "PCM out". + +H/W: Output source channel 1 +H/W 1: Output source channel 2 +H/W 2: Output source channel 3 +H/W 3: Output source channel 4 + +H/W 4 ... H/W 9: unknown function, left in to enable testing. + Possibly some of these control S/PDIF output(s). + If these turn out to be unused, they will go away in later driver versions. + +Selectable values for each of the digital output selectors are: + "PCM out" -> DAC output of the corresponding channel (default setting) + "Input 1"... + "Input 4" -> direct routing from ADC output of the selected input channel + + +-------- + +Feb 14, 2008 +Rainer Zimmermann +mail@lightshed.de + -- cgit v1.1 From 1dcdb5a9e7c235e6e80f1f4d5b8247b3e5347e48 Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Mon, 27 Apr 2009 17:44:11 +0200 Subject: oprofile: re-add force_arch_perfmon option This re-adds the force_arch_perfmon option that was in the original arch perfmon patchkit. Originally this was rejected in favour of a generalized perfmon=name option, but it turned out implementing the later in a reliable way is hard (and it would have been easy to crash the kernel if a user gets it wrong) But now Atom and Core i7 support being readded a user would need to update their oprofile userland to beyond 0.9.4 to use oprofile again on Atom or Core i7. To avoid this problem readd the force_arch_perfmon option. Signed-off-by: Andi Kleen Signed-off-by: Robert Richter --- Documentation/kernel-parameters.txt | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 90b3924..9b9566b 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1650,6 +1650,12 @@ and is between 256 and 4096 characters. It is defined in the file oprofile.timer= [HW] Use timer interrupt instead of performance counters + oprofile.force_arch_perfmon=1 [X86] + Force use of architectural perfmon instead of + the CPU specific event set. + This might be useful if you have older oprofile + userland or if you want common events over Intel CPUs. + osst= [HW,SCSI] SCSI Tape Driver Format: , See also Documentation/scsi/st.txt. -- cgit v1.1 From 7e4e0bd50e80df2fe5501f48f872448376cdd997 Mon Sep 17 00:00:00 2001 From: Robert Richter Date: Wed, 6 May 2009 12:10:23 +0200 Subject: oprofile: introduce module_param oprofile.cpu_type This patch removes module_param oprofile.force_arch_perfmon and introduces oprofile.cpu_type=archperfmon instead. This new parameter can be reused for other models and architectures. Currently only archperfmon is supported. Cc: Andi Kleen Signed-off-by: Robert Richter --- Documentation/kernel-parameters.txt | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 9b9566b..6ce5f48 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1650,11 +1650,13 @@ and is between 256 and 4096 characters. It is defined in the file oprofile.timer= [HW] Use timer interrupt instead of performance counters - oprofile.force_arch_perfmon=1 [X86] - Force use of architectural perfmon instead of - the CPU specific event set. - This might be useful if you have older oprofile - userland or if you want common events over Intel CPUs. + oprofile.cpu_type= Force an oprofile cpu type + This might be useful if you have an older oprofile + userland or if you want common events. + Format: { archperfmon } + archperfmon: [X86] Force use of architectural + perfmon on Intel CPUs instead of the + CPU specific event set. osst= [HW,SCSI] SCSI Tape Driver Format: , -- cgit v1.1 From b30505c81a9d4adea8b70ecff512b0216929b797 Mon Sep 17 00:00:00 2001 From: Darren Hart Date: Thu, 7 May 2009 15:40:14 -0700 Subject: futex: add requeue-pi documentation Add Documentation/futex-requeue-pi.txt describing the motivation for the newly added FUTEX_*REQUEUE_PI op codes and their implementation. [ Impact: add documentation ] Signed-off-by: Darren Hart Cc: Sripathi Kodi Cc: Peter Zijlstra Cc: John Stultz Cc: Steven Rostedt Cc: Dinakar Guniguntala Cc: Ulrich Drepper Cc: Eric Dumazet Cc: Jakub Jelinek LKML-Reference: <4A03634E.3080609@us.ibm.com> [ reformatted the file ] Signed-off-by: Ingo Molnar --- Documentation/futex-requeue-pi.txt | 131 +++++++++++++++++++++++++++++++++++++ 1 file changed, 131 insertions(+) create mode 100644 Documentation/futex-requeue-pi.txt (limited to 'Documentation') diff --git a/Documentation/futex-requeue-pi.txt b/Documentation/futex-requeue-pi.txt new file mode 100644 index 0000000..9dc1ff4 --- /dev/null +++ b/Documentation/futex-requeue-pi.txt @@ -0,0 +1,131 @@ +Futex Requeue PI +---------------- + +Requeueing of tasks from a non-PI futex to a PI futex requires +special handling in order to ensure the underlying rt_mutex is never +left without an owner if it has waiters; doing so would break the PI +boosting logic [see rt-mutex-desgin.txt] For the purposes of +brevity, this action will be referred to as "requeue_pi" throughout +this document. Priority inheritance is abbreviated throughout as +"PI". + +Motivation +---------- + +Without requeue_pi, the glibc implementation of +pthread_cond_broadcast() must resort to waking all the tasks waiting +on a pthread_condvar and letting them try to sort out which task +gets to run first in classic thundering-herd formation. An ideal +implementation would wake the highest-priority waiter, and leave the +rest to the natural wakeup inherent in unlocking the mutex +associated with the condvar. + +Consider the simplified glibc calls: + +/* caller must lock mutex */ +pthread_cond_wait(cond, mutex) +{ + lock(cond->__data.__lock); + unlock(mutex); + do { + unlock(cond->__data.__lock); + futex_wait(cond->__data.__futex); + lock(cond->__data.__lock); + } while(...) + unlock(cond->__data.__lock); + lock(mutex); +} + +pthread_cond_broadcast(cond) +{ + lock(cond->__data.__lock); + unlock(cond->__data.__lock); + futex_requeue(cond->data.__futex, cond->mutex); +} + +Once pthread_cond_broadcast() requeues the tasks, the cond->mutex +has waiters. Note that pthread_cond_wait() attempts to lock the +mutex only after it has returned to user space. This will leave the +underlying rt_mutex with waiters, and no owner, breaking the +previously mentioned PI-boosting algorithms. + +In order to support PI-aware pthread_condvar's, the kernel needs to +be able to requeue tasks to PI futexes. This support implies that +upon a successful futex_wait system call, the caller would return to +user space already holding the PI futex. The glibc implementation +would be modified as follows: + + +/* caller must lock mutex */ +pthread_cond_wait_pi(cond, mutex) +{ + lock(cond->__data.__lock); + unlock(mutex); + do { + unlock(cond->__data.__lock); + futex_wait_requeue_pi(cond->__data.__futex); + lock(cond->__data.__lock); + } while(...) + unlock(cond->__data.__lock); + /* the kernel acquired the the mutex for us */ +} + +pthread_cond_broadcast_pi(cond) +{ + lock(cond->__data.__lock); + unlock(cond->__data.__lock); + futex_requeue_pi(cond->data.__futex, cond->mutex); +} + +The actual glibc implementation will likely test for PI and make the +necessary changes inside the existing calls rather than creating new +calls for the PI cases. Similar changes are needed for +pthread_cond_timedwait() and pthread_cond_signal(). + +Implementation +-------------- + +In order to ensure the rt_mutex has an owner if it has waiters, it +is necessary for both the requeue code, as well as the waiting code, +to be able to acquire the rt_mutex before returning to user space. +The requeue code cannot simply wake the waiter and leave it to +acquire the rt_mutex as it would open a race window between the +requeue call returning to user space and the waiter waking and +starting to run. This is especially true in the uncontended case. + +The solution involves two new rt_mutex helper routines, +rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which +allow the requeue code to acquire an uncontended rt_mutex on behalf +of the waiter and to enqueue the waiter on a contended rt_mutex. +Two new system calls provide the kernel<->user interface to +requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_REQUEUE_CMP_PI. + +FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait() +and pthread_cond_timedwait()) to block on the initial futex and wait +to be requeued to a PI-aware futex. The implementation is the +result of a high-speed collision between futex_wait() and +futex_lock_pi(), with some extra logic to check for the additional +wake-up scenarios. + +FUTEX_REQUEUE_CMP_PI is called by the waker +(pthread_cond_broadcast() and pthread_cond_signal()) to requeue and +possibly wake the waiting tasks. Internally, this system call is +still handled by futex_requeue (by passing requeue_pi=1). Before +requeueing, futex_requeue() attempts to acquire the requeue target +PI futex on behalf of the top waiter. If it can, this waiter is +woken. futex_requeue() then proceeds to requeue the remaining +nr_wake+nr_requeue tasks to the PI futex, calling +rt_mutex_start_proxy_lock() prior to each requeue to prepare the +task as a waiter on the underlying rt_mutex. It is possible that +the lock can be acquired at this stage as well, if so, the next +waiter is woken to finish the acquisition of the lock. + +FUTEX_REQUEUE_PI accepts nr_wake and nr_requeue as arguments, but +their sum is all that really matters. futex_requeue() will wake or +requeue up to nr_wake + nr_requeue tasks. It will wake only as many +tasks as it can acquire the lock for, which in the majority of cases +should be 0 as good programming practice dictates that the caller of +either pthread_cond_broadcast() or pthread_cond_signal() acquire the +mutex prior to making the call. FUTEX_REQUEUE_PI requires that +nr_wake=1. nr_requeue should be INT_MAX for broadcast and 0 for +signal. -- cgit v1.1 From 01f2bd48d08c6bbde12d86b66c760612e33e49a9 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 11 May 2009 08:12:43 +0200 Subject: ALSA: hda - Add missing models for Realtek codecs Added the missing descriptions and the model names for Realtek codecs to the documentation and the config table. Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 36c9712..92c9d3a 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -36,6 +36,7 @@ ALC260 acer Acer TravelMate will Will laptops (PB V7900) replacer Replacer 672V + favorit100 Maxdata Favorit 100XS basic fixed pin assignment (old default model) test for testing/debugging purpose, almost all controls can adjusted. Appearing only when compiled with @@ -85,10 +86,11 @@ ALC269 eeepc-p703 ASUS Eeepc P703 P900A eeepc-p901 ASUS Eeepc P901 S101 fujitsu FSC Amilo + lifebook Fujitsu Lifebook S6420 auto auto-config reading BIOS (default) -ALC662/663 -========== +ALC662/663/272 +============== 3stack-dig 3-stack (2-channel) with SPDIF 3stack-6ch 3-stack (6-channel) 3stack-6ch-dig 3-stack (6-channel) with SPDIF @@ -107,6 +109,8 @@ ALC662/663 asus-mode4 ASUS asus-mode5 ASUS asus-mode6 ASUS + dell Dell with ALC272 + dell-zm1 Dell ZM1 with ALC272 auto auto-config reading BIOS (default) ALC882/885 @@ -118,6 +122,7 @@ ALC882/885 asus-a7j ASUS A7J asus-a7m ASUS A7M macpro MacPro support + mb5 Macbook 5,1 mbp3 Macbook Pro rev3 imac24 iMac 24'' with jack detection w2jc ASUS W2JC @@ -150,6 +155,7 @@ ALC883/888 fujitsu-pi2515 Fujitsu AMILO Pi2515 fujitsu-xa3530 Fujitsu AMILO XA3530 3stack-6ch-intel Intel DG33* boards + asus-p5q ASUS P5Q-EM boards auto auto-config reading BIOS (default) ALC861/660 -- cgit v1.1 From d297366ba692faf1f0384811a6ff0b20c3470b1b Mon Sep 17 00:00:00 2001 From: "H. Peter Anvin" Date: Mon, 11 May 2009 16:06:23 -0700 Subject: x86: document new bzImage fields Document the new bzImage fields for kernel memory placement. [ Impact: adds documentation ] Signed-off-by: H. Peter Anvin --- Documentation/x86/boot.txt | 69 +++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 65 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt index e020366..cf8dfc7 100644 --- a/Documentation/x86/boot.txt +++ b/Documentation/x86/boot.txt @@ -50,6 +50,11 @@ Protocol 2.08: (Kernel 2.6.26) Added crc32 checksum and ELF format Protocol 2.09: (Kernel 2.6.26) Added a field of 64-bit physical pointer to single linked list of struct setup_data. +Protocol 2.10: (Kernel 2.6.31) A protocol for relaxed alignment + beyond the kernel_alignment added, new init_size and + pref_address fields. + + **** MEMORY LAYOUT The traditional memory map for the kernel loader, used for Image or @@ -173,7 +178,7 @@ Offset Proto Name Meaning 022C/4 2.03+ ramdisk_max Highest legal initrd address 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not -0235/1 N/A pad2 Unused +0235/1 2.10+ min_alignment Minimum alignment, as a power of two 0236/2 N/A pad3 Unused 0238/4 2.06+ cmdline_size Maximum size of the kernel command line 023C/4 2.07+ hardware_subarch Hardware subarchitecture @@ -182,6 +187,8 @@ Offset Proto Name Meaning 024C/4 2.08+ payload_length Length of kernel payload 0250/8 2.09+ setup_data 64-bit physical pointer to linked list of struct setup_data +0258/8 2.10+ pref_address Preferred loading address +0260/4 2.10+ init_size Linear memory required during initialization (1) For backwards compatibility, if the setup_sects field contains 0, the real value is 4. @@ -482,11 +489,19 @@ Protocol: 2.03+ 0x37FFFFFF, you can start your ramdisk at 0x37FE0000.) Field name: kernel_alignment -Type: read (reloc) +Type: read/modify (reloc) Offset/size: 0x230/4 -Protocol: 2.05+ +Protocol: 2.05+ (read), 2.10+ (modify) + + Alignment unit required by the kernel (if relocatable_kernel is + true.) A relocatable kernel that is loaded at an alignment + incompatible with the value in this field will be realigned during + kernel initialization. - Alignment unit required by the kernel (if relocatable_kernel is true.) + Starting with protocol version 2.10, this reflects the kernel + alignment preferred for optimal performance; it is possible for the + loader to modify this field to permit a lesser alignment. See the + min_alignment and pref_address field below. Field name: relocatable_kernel Type: read (reloc) @@ -498,6 +513,22 @@ Protocol: 2.05+ After loading, the boot loader must set the code32_start field to point to the loaded code, or to a boot loader hook. +Field name: min_alignment +Type: read (reloc) +Offset/size: 0x235/1 +Protocol: 2.10+ + + This field, if nonzero, indicates as a power of two the minimum + alignment required, as opposed to preferred, by the kernel to boot. + If a boot loader makes use of this field, it should update the + kernel_alignment field with the alignment unit desired; typically: + + kernel_alignment = 1 << min_alignment + + There may be a considerable performance cost with an excessively + misaligned kernel. Therefore, a loader should typically try each + power-of-two alignment from kernel_alignment down to this alignment. + Field name: cmdline_size Type: read Offset/size: 0x238/4 @@ -582,6 +613,36 @@ Protocol: 2.09+ sure to consider the case where the linked list already contains entries. +Field name: pref_address +Type: read (reloc) +Offset/size: 0x258/8 +Protocol: 2.10+ + + This field, if nonzero, represents a preferred load address for the + kernel. A relocating bootloader should attempt to load at this + address if possible. + + A non-relocatable kernel will unconditionally move itself and to run + at this address. + +Field name: init_size +Type: read +Offset/size: 0x25c/4 + + This field indicates the amount of linear contiguous memory starting + at the kernel runtime start address that the kernel needs before it + is capable of examining its memory map. This is not the same thing + as the total amount of memory the kernel needs to boot, but it can + be used by a relocating boot loader to help select a safe load + address for the kernel. + + The kernel runtime start address is determined by the following algorithm: + + if (relocatable_kernel) + runtime_start = align_up(load_address, kernel_alignment) + else + runtime_start = pref_address + **** THE IMAGE CHECKSUM -- cgit v1.1 From 5031296c57024a78ddad4edfc993367dbf4abb98 Mon Sep 17 00:00:00 2001 From: "H. Peter Anvin" Date: Thu, 7 May 2009 16:54:11 -0700 Subject: x86: add extension fields for bootloader type and version A long ago, in days of yore, it all began with a god named Thor. There were vikings and boats and some plans for a Linux kernel header. Unfortunately, a single 8-bit field was used for bootloader type and version. This has generally worked without *too* much pain, but we're getting close to flat running out of ID fields. Add extension fields for both type and version. The type will be extended if it the old field is 0xE; the version is a simple MSB extension. Keep /proc/sys/kernel/bootloader_type containing (type << 4) + (ver & 0xf) for backwards compatiblity, but also add /proc/sys/kernel/bootloader_version which contains the full version number. [ Impact: new feature to support more bootloaders ] Signed-off-by: H. Peter Anvin --- Documentation/x86/boot.txt | 59 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 52 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt index cf8dfc7..8da3a79 100644 --- a/Documentation/x86/boot.txt +++ b/Documentation/x86/boot.txt @@ -50,10 +50,9 @@ Protocol 2.08: (Kernel 2.6.26) Added crc32 checksum and ELF format Protocol 2.09: (Kernel 2.6.26) Added a field of 64-bit physical pointer to single linked list of struct setup_data. -Protocol 2.10: (Kernel 2.6.31) A protocol for relaxed alignment +Protocol 2.10: (Kernel 2.6.31) Added a protocol for relaxed alignment beyond the kernel_alignment added, new init_size and - pref_address fields. - + pref_address fields. Added extended boot loader IDs. **** MEMORY LAYOUT @@ -173,7 +172,8 @@ Offset Proto Name Meaning 021C/4 2.00+ ramdisk_size initrd size (set by boot loader) 0220/4 2.00+ bootsect_kludge DO NOT USE - for bootsect.S use only 0224/2 2.01+ heap_end_ptr Free memory after setup end -0226/2 N/A pad1 Unused +0226/1 2.02+(3 ext_loader_ver Extended boot loader version +0227/1 2.02+(3 ext_loader_type Extended boot loader ID 0228/4 2.02+ cmd_line_ptr 32-bit pointer to the kernel command line 022C/4 2.03+ ramdisk_max Highest legal initrd address 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel @@ -197,6 +197,8 @@ Offset Proto Name Meaning field are unusable, which means the size of a bzImage kernel cannot be determined. +(3) Ignored, but safe to set, for boot protocols 2.02-2.09. + If the "HdrS" (0x53726448) magic number is not found at offset 0x202, the boot protocol version is "old". Loading an old kernel, the following parameters should be assumed: @@ -350,18 +352,32 @@ Protocol: 2.00+ 0xTV here, where T is an identifier for the boot loader and V is a version number. Otherwise, enter 0xFF here. + For boot loader IDs above T = 0xD, write T = 0xE to this field and + write the extended ID minus 0x10 to the ext_loader_type field. + Similarly, the ext_loader_ver field can be used to provide more than + four bits for the bootloader version. + + For example, for T = 0x15, V = 0x234, write: + + type_of_loader <- 0xE4 + ext_loader_type <- 0x05 + ext_loader_ver <- 0x23 + Assigned boot loader ids: 0 LILO (0x00 reserved for pre-2.00 bootloader) 1 Loadlin 2 bootsect-loader (0x20, all other values reserved) - 3 SYSLINUX - 4 EtherBoot + 3 Syslinux + 4 Etherboot/gPXE 5 ELILO 7 GRUB - 8 U-BOOT + 8 U-Boot 9 Xen A Gujin B Qemu + C Arcturus Networks uCbootloader + E Extended (see ext_loader_type) + F Special (0xFF = undefined) Please contact if you need a bootloader ID value assigned. @@ -460,6 +476,35 @@ Protocol: 2.01+ Set this field to the offset (from the beginning of the real-mode code) of the end of the setup stack/heap, minus 0x0200. +Field name: ext_loader_ver +Type: write (optional) +Offset/size: 0x226/1 +Protocol: 2.02+ + + This field is used as an extension of the version number in the + type_of_loader field. The total version number is considered to be + (type_of_loader & 0x0f) + (ext_loader_ver << 4). + + The use of this field is boot loader specific. If not written, it + is zero. + + Kernels prior to 2.6.31 did not recognize this field, but it is safe + to write for protocol version 2.02 or higher. + +Field name: ext_loader_type +Type: write (obligatory if (type_of_loader & 0xf0) == 0xe0) +Offset/size: 0x227/1 +Protocol: 2.02+ + + This field is used as an extension of the type number in + type_of_loader field. If the type in type_of_loader is 0xE, then + the actual type is (ext_loader_type + 0x10). + + This field is ignored if the type in type_of_loader is not 0xE. + + Kernels prior to 2.6.31 did not recognize this field, but it is safe + to write for protocol version 2.02 or higher. + Field name: cmd_line_ptr Type: write (obligatory) Offset/size: 0x228/4 -- cgit v1.1 From 9541ba1d665542c96f7c0b5b836bbc1fd9d961b6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Chris=20Pockel=C3=A9?= Date: Tue, 12 May 2009 08:08:53 +0200 Subject: ALSA: hda - Add support of Samsung NC10 mini notebook MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add specific configuration for Samsung NC10 mini notebook. Internal mic/speakers will be correctly muted when plugging in external ones. Mixer controls are added for speakers, headphones and PC beep. "Boost" is added for the microphones. Signed-off-by: Chris Pockelé Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 92c9d3a..14ed1d1 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -111,6 +111,7 @@ ALC662/663/272 asus-mode6 ASUS dell Dell with ALC272 dell-zm1 Dell ZM1 with ALC272 + samsung-nc10 Samsung NC10 mini notebook auto auto-config reading BIOS (default) ALC882/885 -- cgit v1.1 From 8cc72361481f00253f1e468ade5795427386d593 Mon Sep 17 00:00:00 2001 From: Wai Yew CHAY Date: Thu, 14 May 2009 08:05:58 +0200 Subject: ALSA: SB X-Fi driver merge MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Sound Blaster X-Fi driver supports Creative solutions based on 20K1 and 20K2 chipsets. Supported hardware : Creative Sound Blaster X-Fi Titanium Fatal1ty® Champion Series Creative Sound Blaster X-Fi Titanium Fatal1ty Professional Series Creative Sound Blaster X-Fi Titanium Professional Audio Creative Sound Blaster X-Fi Titanium Creative Sound Blaster X-Fi Elite Pro Creative Sound Blaster X-Fi Platinum Creative Sound Blaster X-Fi Fatal1ty Creative Sound Blaster X-Fi XtremeGamer Creative Sound Blaster X-Fi XtremeMusic Current release features: * ALSA PCM Playback * ALSA Record * ALSA Mixer Note: * External I/O modules detection not included. Signed-off-by: Wai Yew CHAY Singed-off-by: Ryan RICHARDS Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/ALSA-Configuration.txt | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..c903db8 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -460,6 +460,25 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. The power-management is supported. + Module snd-ctxfi + ---------------- + + Module for Creative Sound Blaster X-Fi boards (20k1 / 20k2 chips) + * Creative Sound Blaster X-Fi Titanium Fatal1ty Champion Series + * Creative Sound Blaster X-Fi Titanium Fatal1ty Professional Series + * Creative Sound Blaster X-Fi Titanium Professional Audio + * Creative Sound Blaster X-Fi Titanium + * Creative Sound Blaster X-Fi Elite Pro + * Creative Sound Blaster X-Fi Platinum + * Creative Sound Blaster X-Fi Fatal1ty + * Creative Sound Blaster X-Fi XtremeGamer + * Creative Sound Blaster X-Fi XtremeMusic + + reference_rate - reference sample rate, 44100 or 48000 (default) + multiple - multiple to ref. sample rate, 1 or 2 (default) + + This module supports multiple cards. + Module snd-darla20 ------------------ -- cgit v1.1 From b3fcb13f1c866ae0330c445c3cb481014c36a02f Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Wed, 13 May 2009 12:44:17 +0000 Subject: gigaset: documentation update Mention handling of unregisteted DECT wireless datasets in README.gigaset. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/README.gigaset | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset index 02c0e93..f6e9eaa 100644 --- a/Documentation/isdn/README.gigaset +++ b/Documentation/isdn/README.gigaset @@ -192,7 +192,6 @@ GigaSet 307x Device Driver 2.6. M105 Undocumented USB Requests ------------------------------ - The Gigaset M105 USB data box understands a couple of useful, but undocumented USB commands. These requests are not used in normal operation (for wireless access to the base), but are needed for access @@ -204,6 +203,20 @@ GigaSet 307x Device Driver M105, try setting that option to "y" via 'make {x,menu}config' and recompiling the driver. +2.7. Unregistered Wireless Devices (M101/M105) + ----------------------------------------- + The main purpose of the ser_gigaset and usb_gigaset drivers is to allow + the M101 and M105 wireless devices to be used as ISDN devices for ISDN + connections through a Gigaset base. Therefore they assume that the device + is registered to a DECT base. + + If the M101/M105 device is not registered to a base, initialization of + the device fails, and a corresponding error message is logged by the + driver. In that situation, a restricted set of functions is available + which includes, in particular, those necessary for registering the device + to a base or for switching it between Fixed Part and Portable Part + modes. For the M105, these commands require the "Support for undocumented + USB requests" configuration option (see section 2.6.) to be enabled. 3. Troubleshooting --------------- @@ -240,6 +253,14 @@ GigaSet 307x Device Driver Recompile the usb_gigaset driver with the kernel configuration option CONFIG_GIGASET_UNDOCREQ set to 'y'. (see section 2.6.) + Problem: + Messages like this: + usb_gigaset 3-2:1.0: Could not initialize the device. + appear in your syslog. + Solution: + Check whether your M10x wireless device is correctly registered to the + Gigaset base. (see section 2.7.) + 3.2. Telling the driver to provide more information ---------------------------------------------- Building the driver with the "Gigaset debugging" kernel configuration -- cgit v1.1 From b88bd95655c7bc059606529e01467594978d7b72 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Wed, 13 May 2009 12:44:18 +0000 Subject: gigaset: remove UNDOCREQ config option Drop the kernel config option GIGASET_UNDOCREQ, permanently activating the code it controlled, as there have been no reports of problems caused by its activation but many problems caused by it being disabled. Also fix a few bad comments while we're at it. Impact: cleanup Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/README.gigaset | 33 +++++---------------------------- 1 file changed, 5 insertions(+), 28 deletions(-) (limited to 'Documentation') diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset index f6e9eaa..f996310 100644 --- a/Documentation/isdn/README.gigaset +++ b/Documentation/isdn/README.gigaset @@ -149,10 +149,8 @@ GigaSet 307x Device Driver configuration files and chat scripts in the gigaset-VERSION/ppp directory in the driver packages from http://sourceforge.net/projects/gigaset307x/. Please note that the USB drivers are not able to change the state of the - control lines (the M105 driver can be configured to use some undocumented - control requests, if you really need the control lines, though). This means - you must use "Stupid Mode" if you are using wvdial or you should use the - nocrtscts option of pppd. + control lines. This means you must use "Stupid Mode" if you are using + wvdial or you should use the nocrtscts option of pppd. You must also assure that the ppp_async module is loaded with the parameter flag_time=0. You can do this e.g. by adding a line like @@ -190,20 +188,7 @@ GigaSet 307x Device Driver You can also use /sys/class/tty/ttyGxy/cidmode for changing the CID mode setting (ttyGxy is ttyGU0 or ttyGB0). -2.6. M105 Undocumented USB Requests - ------------------------------ - The Gigaset M105 USB data box understands a couple of useful, but - undocumented USB commands. These requests are not used in normal - operation (for wireless access to the base), but are needed for access - to the M105's own configuration mode (registration to the base, baudrate - and line format settings, device status queries) via the gigacontr - utility. Their use is controlled by the kernel configuration option - "Support for undocumented USB requests" (CONFIG_GIGASET_UNDOCREQ). If you - encounter error code -ENOTTY when trying to use some features of the - M105, try setting that option to "y" via 'make {x,menu}config' and - recompiling the driver. - -2.7. Unregistered Wireless Devices (M101/M105) +2.6. Unregistered Wireless Devices (M101/M105) ----------------------------------------- The main purpose of the ser_gigaset and usb_gigaset drivers is to allow the M101 and M105 wireless devices to be used as ISDN devices for ISDN @@ -215,8 +200,7 @@ GigaSet 307x Device Driver driver. In that situation, a restricted set of functions is available which includes, in particular, those necessary for registering the device to a base or for switching it between Fixed Part and Portable Part - modes. For the M105, these commands require the "Support for undocumented - USB requests" configuration option (see section 2.6.) to be enabled. + modes. 3. Troubleshooting --------------- @@ -247,19 +231,12 @@ GigaSet 307x Device Driver Select Unimodem mode for all DECT data adapters. (see section 2.4.) Problem: - You want to configure your USB DECT data adapter (M105) but gigacontr - reports an error: "/dev/ttyGU0: Inappropriate ioctl for device". - Solution: - Recompile the usb_gigaset driver with the kernel configuration option - CONFIG_GIGASET_UNDOCREQ set to 'y'. (see section 2.6.) - - Problem: Messages like this: usb_gigaset 3-2:1.0: Could not initialize the device. appear in your syslog. Solution: Check whether your M10x wireless device is correctly registered to the - Gigaset base. (see section 2.7.) + Gigaset base. (see section 2.6.) 3.2. Telling the driver to provide more information ---------------------------------------------- -- cgit v1.1 From 888a589f6be07d624e21e2174d98375e9f95911b Mon Sep 17 00:00:00 2001 From: Yinghai Lu Date: Fri, 15 May 2009 13:59:37 -0700 Subject: mm, x86: remove MEMORY_HOTPLUG_RESERVE related code after: | commit b263295dbffd33b0fbff670720fa178c30e3392a | Author: Christoph Lameter | Date: Wed Jan 30 13:30:47 2008 +0100 | | x86: 64-bit, make sparsemem vmemmap the only memory model we don't have MEMORY_HOTPLUG_RESERVE anymore. Historically, x86-64 had an architecture-specific method for memory hotplug whereby it scanned the SRAT for physical memory ranges that could be potentially used for memory hot-add later. By reserving those ranges without physical memory, the memmap would be allocated and left dormant until needed. This depended on the DISCONTIG memory model which has been removed so the code implementing HOTPLUG_RESERVE is now dead. This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE. (Changelog authored by Mel.) v2: updated changelog, and remove hotadd= in doc [ Impact: remove dead code ] Signed-off-by: Yinghai Lu Reviewed-by: Christoph Lameter Reviewed-by: Mel Gorman Workflow-found-OK-by: Andrew Morton LKML-Reference: <4A0C4910.7090508@kernel.org> Signed-off-by: Ingo Molnar --- Documentation/x86/x86_64/boot-options.txt | 5 ----- 1 file changed, 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 34c1304..2db5893 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -150,11 +150,6 @@ NUMA Otherwise, the remaining system RAM is allocated to an additional node. - numa=hotadd=percent - Only allow hotadd memory to preallocate page structures upto - percent of already available memory. - numa=hotadd=0 will disable hotadd memory. - ACPI acpi=off Don't enable ACPI -- cgit v1.1 From eb4c41d30ba1f973c8b10be3644571ab997ae0d6 Mon Sep 17 00:00:00 2001 From: Torben Schulz Date: Mon, 18 May 2009 15:02:35 +0200 Subject: ALSA: hda - Improved MacBook 3,1 support This patch adds support for MacBook 3,1 sound by adding a model new "mb31" with the appropriate init verbs, mixers and channel modes to the ALC883 configuration. patch_alc882() and patch_alc883() are modified to handle the MacBook 3,1 sound-chip (Realtek ALC889A) correctly. Signed-off-by: Torben Schulz Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 14ed1d1..ebb444d 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -157,6 +157,7 @@ ALC883/888 fujitsu-xa3530 Fujitsu AMILO XA3530 3stack-6ch-intel Intel DG33* boards asus-p5q ASUS P5Q-EM boards + mb31 MacBook 3,1 auto auto-config reading BIOS (default) ALC861/660 -- cgit v1.1 From 070276d5d049f385763dee19112bea08f56c9a0d Mon Sep 17 00:00:00 2001 From: Ben Dooks Date: Sun, 17 May 2009 22:32:23 +0100 Subject: [ARM] S3C24XX: GPIO: Change to macros for GPIO numbering Prepare to remove the large number of S3C2410_GPxn defines by moving to S3C2410_GPx(n) in arch/arm. The following perl was used to change the files: perl -pi~ -e 's/S3C2410_GP([A-Z])([0-9]+)([^_^0-9])/S3C2410_GP\1\(\2\)\3/g' Signed-off-by: Ben Dooks --- Documentation/arm/Samsung-S3C24XX/GPIO.txt | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/arm/Samsung-S3C24XX/GPIO.txt b/Documentation/arm/Samsung-S3C24XX/GPIO.txt index ea7ccfc..948c871 100644 --- a/Documentation/arm/Samsung-S3C24XX/GPIO.txt +++ b/Documentation/arm/Samsung-S3C24XX/GPIO.txt @@ -51,7 +51,7 @@ PIN Numbers ----------- Each pin has an unique number associated with it in regs-gpio.h, - eg S3C2410_GPA0 or S3C2410_GPF1. These defines are used to tell + eg S3C2410_GPA(0) or S3C2410_GPF(1). These defines are used to tell the GPIO functions which pin is to be used. @@ -65,11 +65,11 @@ Configuring a pin Eg: - s3c2410_gpio_cfgpin(S3C2410_GPA0, S3C2410_GPA0_ADDR0); - s3c2410_gpio_cfgpin(S3C2410_GPE8, S3C2410_GPE8_SDDAT1); + s3c2410_gpio_cfgpin(S3C2410_GPA(0), S3C2410_GPA0_ADDR0); + s3c2410_gpio_cfgpin(S3C2410_GPE(8), S3C2410_GPE8_SDDAT1); - which would turn GPA0 into the lowest Address line A0, and set - GPE8 to be connected to the SDIO/MMC controller's SDDAT1 line. + which would turn GPA(0) into the lowest Address line A0, and set + GPE(8) to be connected to the SDIO/MMC controller's SDDAT1 line. Reading the current configuration -- cgit v1.1 From e20dad964aeac229a204e298c563b6ea7ff1e987 Mon Sep 17 00:00:00 2001 From: Wolfgang Grandegger Date: Fri, 15 May 2009 23:39:27 +0000 Subject: can: Documentation for the CAN device driver interface This patch documents the CAN netowrk device drivers interface, removes obsolete documentation and adds some useful links to CAN resources. Signed-off-by: Wolfgang Grandegger Signed-off-by: Oliver Hartkopp Signed-off-by: David S. Miller --- Documentation/networking/can.txt | 235 ++++++++++++++++++++++++++++++++------- 1 file changed, 197 insertions(+), 38 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt index 2035bc4..6cd6627 100644 --- a/Documentation/networking/can.txt +++ b/Documentation/networking/can.txt @@ -36,10 +36,15 @@ This file contains 6.2 local loopback of sent frames 6.3 CAN controller hardware filters 6.4 The virtual CAN driver (vcan) - 6.5 currently supported CAN hardware - 6.6 todo + 6.5 The CAN network device driver interface + 6.5.1 Netlink interface to set/get devices properties + 6.5.2 Setting the CAN bit-timing + 6.5.3 Starting and stopping the CAN network device + 6.6 supported CAN hardware - 7 Credits + 7 Socket CAN resources + + 8 Credits ============================================================================ @@ -234,6 +239,8 @@ solution for a couple of reasons: the user application using the common CAN filter mechanisms. Inside this filter definition the (interested) type of errors may be selected. The reception of error frames is disabled by default. + The format of the CAN error frame is briefly decribed in the Linux + header file "include/linux/can/error.h". 4. How to use Socket CAN ------------------------ @@ -605,61 +612,213 @@ solution for a couple of reasons: removal of vcan network devices can be managed with the ip(8) tool: - Create a virtual CAN network interface: - ip link add type vcan + $ ip link add type vcan - Create a virtual CAN network interface with a specific name 'vcan42': - ip link add dev vcan42 type vcan + $ ip link add dev vcan42 type vcan - Remove a (virtual CAN) network interface 'vcan42': - ip link del vcan42 - - The tool 'vcan' from the SocketCAN SVN repository on BerliOS is obsolete. - - Virtual CAN network device creation in older Kernels: - In Linux Kernel versions < 2.6.24 the vcan driver creates 4 vcan - netdevices at module load time by default. This value can be changed - with the module parameter 'numdev'. E.g. 'modprobe vcan numdev=8' - - 6.5 currently supported CAN hardware + $ ip link del vcan42 + + 6.5 The CAN network device driver interface + + The CAN network device driver interface provides a generic interface + to setup, configure and monitor CAN network devices. The user can then + configure the CAN device, like setting the bit-timing parameters, via + the netlink interface using the program "ip" from the "IPROUTE2" + utility suite. The following chapter describes briefly how to use it. + Furthermore, the interface uses a common data structure and exports a + set of common functions, which all real CAN network device drivers + should use. Please have a look to the SJA1000 or MSCAN driver to + understand how to use them. The name of the module is can-dev.ko. + + 6.5.1 Netlink interface to set/get devices properties + + The CAN device must be configured via netlink interface. The supported + netlink message types are defined and briefly described in + "include/linux/can/netlink.h". CAN link support for the program "ip" + of the IPROUTE2 utility suite is avaiable and it can be used as shown + below: + + - Setting CAN device properties: + + $ ip link set can0 type can help + Usage: ip link set DEVICE type can + [ bitrate BITRATE [ sample-point SAMPLE-POINT] ] | + [ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1 + phase-seg2 PHASE-SEG2 [ sjw SJW ] ] + + [ loopback { on | off } ] + [ listen-only { on | off } ] + [ triple-sampling { on | off } ] + + [ restart-ms TIME-MS ] + [ restart ] + + Where: BITRATE := { 1..1000000 } + SAMPLE-POINT := { 0.000..0.999 } + TQ := { NUMBER } + PROP-SEG := { 1..8 } + PHASE-SEG1 := { 1..8 } + PHASE-SEG2 := { 1..8 } + SJW := { 1..4 } + RESTART-MS := { 0 | NUMBER } + + - Display CAN device details and statistics: + + $ ip -details -statistics link show can0 + 2: can0: mtu 16 qdisc pfifo_fast state UP qlen 10 + link/can + can state ERROR-ACTIVE restart-ms 100 + bitrate 125000 sample_point 0.875 + tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1 + sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1 + clock 8000000 + re-started bus-errors arbit-lost error-warn error-pass bus-off + 41 17457 0 41 42 41 + RX: bytes packets errors dropped overrun mcast + 140859 17608 17457 0 0 0 + TX: bytes packets errors dropped carrier collsns + 861 112 0 41 0 0 + + More info to the above output: + + "" + Shows the list of selected CAN controller modes: LOOPBACK, + LISTEN-ONLY, or TRIPLE-SAMPLING. + + "state ERROR-ACTIVE" + The current state of the CAN controller: "ERROR-ACTIVE", + "ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF" or "STOPPED" + + "restart-ms 100" + Automatic restart delay time. If set to a non-zero value, a + restart of the CAN controller will be triggered automatically + in case of a bus-off condition after the specified delay time + in milliseconds. By default it's off. + + "bitrate 125000 sample_point 0.875" + Shows the real bit-rate in bits/sec and the sample-point in the + range 0.000..0.999. If the calculation of bit-timing parameters + is enabled in the kernel (CONFIG_CAN_CALC_BITTIMING=y), the + bit-timing can be defined by setting the "bitrate" argument. + Optionally the "sample-point" can be specified. By default it's + 0.000 assuming CIA-recommended sample-points. + + "tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1" + Shows the time quanta in ns, propagation segment, phase buffer + segment 1 and 2 and the synchronisation jump width in units of + tq. They allow to define the CAN bit-timing in a hardware + independent format as proposed by the Bosch CAN 2.0 spec (see + chapter 8 of http://www.semiconductors.bosch.de/pdf/can2spec.pdf). + + "sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1 + clock 8000000" + Shows the bit-timing constants of the CAN controller, here the + "sja1000". The minimum and maximum values of the time segment 1 + and 2, the synchronisation jump width in units of tq, the + bitrate pre-scaler and the CAN system clock frequency in Hz. + These constants could be used for user-defined (non-standard) + bit-timing calculation algorithms in user-space. + + "re-started bus-errors arbit-lost error-warn error-pass bus-off" + Shows the number of restarts, bus and arbitration lost errors, + and the state changes to the error-warning, error-passive and + bus-off state. RX overrun errors are listed in the "overrun" + field of the standard network statistics. + + 6.5.2 Setting the CAN bit-timing + + The CAN bit-timing parameters can always be defined in a hardware + independent format as proposed in the Bosch CAN 2.0 specification + specifying the arguments "tq", "prop_seg", "phase_seg1", "phase_seg2" + and "sjw": + + $ ip link set canX type can tq 125 prop-seg 6 \ + phase-seg1 7 phase-seg2 2 sjw 1 + + If the kernel option CONFIG_CAN_CALC_BITTIMING is enabled, CIA + recommended CAN bit-timing parameters will be calculated if the bit- + rate is specified with the argument "bitrate": + + $ ip link set canX type can bitrate 125000 + + Note that this works fine for the most common CAN controllers with + standard bit-rates but may *fail* for exotic bit-rates or CAN system + clock frequencies. Disabling CONFIG_CAN_CALC_BITTIMING saves some + space and allows user-space tools to solely determine and set the + bit-timing parameters. The CAN controller specific bit-timing + constants can be used for that purpose. They are listed by the + following command: + + $ ip -details link show can0 + ... + sja1000: clock 8000000 tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1 + + 6.5.3 Starting and stopping the CAN network device + + A CAN network device is started or stopped as usual with the command + "ifconfig canX up/down" or "ip link set canX up/down". Be aware that + you *must* define proper bit-timing parameters for real CAN devices + before you can start it to avoid error-prone default settings: + + $ ip link set canX up type can bitrate 125000 + + A device may enter the "bus-off" state if too much errors occurred on + the CAN bus. Then no more messages are received or sent. An automatic + bus-off recovery can be enabled by setting the "restart-ms" to a + non-zero value, e.g.: + + $ ip link set canX type can restart-ms 100 + + Alternatively, the application may realize the "bus-off" condition + by monitoring CAN error frames and do a restart when appropriate with + the command: + + $ ip link set canX type can restart + + Note that a restart will also create a CAN error frame (see also + chapter 3.4). - On the project website http://developer.berlios.de/projects/socketcan - there are different drivers available: + 6.6 Supported CAN hardware - vcan: Virtual CAN interface driver (if no real hardware is available) - sja1000: Philips SJA1000 CAN controller (recommended) - i82527: Intel i82527 CAN controller - mscan: Motorola/Freescale CAN controller (e.g. inside SOC MPC5200) - ccan: CCAN controller core (e.g. inside SOC h7202) - slcan: For a bunch of CAN adaptors that are attached via a - serial line ASCII protocol (for serial / USB adaptors) + Please check the "Kconfig" file in "drivers/net/can" to get an actual + list of the support CAN hardware. On the Socket CAN project website + (see chapter 7) there might be further drivers available, also for + older kernel versions. - Additionally the different CAN adaptors (ISA/PCI/PCMCIA/USB/Parport) - from PEAK Systemtechnik support the CAN netdevice driver model - since Linux driver v6.0: http://www.peak-system.com/linux/index.htm +7. Socket CAN resources +----------------------- - Please check the Mailing Lists on the berlios OSS project website. + You can find further resources for Socket CAN like user space tools, + support for old kernel versions, more drivers, mailing lists, etc. + at the BerliOS OSS project website for Socket CAN: - 6.6 todo + http://developer.berlios.de/projects/socketcan - The configuration interface for CAN network drivers is still an open - issue that has not been finalized in the socketcan project. Also the - idea of having a library module (candev.ko) that holds functions - that are needed by all CAN netdevices is not ready to ship. - Your contribution is welcome. + If you have questions, bug fixes, etc., don't hesitate to post them to + the Socketcan-Users mailing list. But please search the archives first. -7. Credits +8. Credits ---------- - Oliver Hartkopp (PF_CAN core, filters, drivers, bcm) + Oliver Hartkopp (PF_CAN core, filters, drivers, bcm, SJA1000 driver) Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan) Jan Kizka (RT-SocketCAN core, Socket-API reconciliation) - Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews) + Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews, + CAN device driver interface, MSCAN driver) Robert Schwebel (design reviews, PTXdist integration) Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers) Benedikt Spranger (reviews) Thomas Gleixner (LKML reviews, coding style, posting hints) - Andrey Volkov (kernel subtree structure, ioctls, mscan driver) + Andrey Volkov (kernel subtree structure, ioctls, MSCAN driver) Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003) Klaus Hitschler (PEAK driver integration) Uwe Koppe (CAN netdevices with PF_PACKET approach) Michael Schulze (driver layer loopback requirement, RT CAN drivers review) + Pavel Pisa (Bit-timing calculation) + Sascha Hauer (SJA1000 platform driver) + Sebastian Haas (SJA1000 EMS PCI driver) + Markus Plessing (SJA1000 EMS PCI driver) + Per Dalen (SJA1000 Kvaser PCI driver) + Sam Ravnborg (reviews, coding style, kbuild help) -- cgit v1.1 From 69e3c75f4d541a6eb151b3ef91f34033cb3ad6e1 Mon Sep 17 00:00:00 2001 From: Johann Baudy Date: Mon, 18 May 2009 22:11:22 -0700 Subject: net: TX_RING and packet mmap New packet socket feature that makes packet socket more efficient for transmission. - It reduces number of system call through a PACKET_TX_RING mechanism, based on PACKET_RX_RING (Circular buffer allocated in kernel space which is mmapped from user space). - It minimizes CPU copy using fragmented SKB (almost zero copy). Signed-off-by: Johann Baudy Signed-off-by: David S. Miller --- Documentation/networking/packet_mmap.txt | 140 ++++++++++++++++++++++++++----- 1 file changed, 121 insertions(+), 19 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index 07c53d5..a22fd85 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -4,16 +4,18 @@ This file documents the CONFIG_PACKET_MMAP option available with the PACKET socket interface on 2.4 and 2.6 kernels. This type of sockets is used for -capture network traffic with utilities like tcpdump or any other that uses -the libpcap library. - -You can find the latest version of this document at +capture network traffic with utilities like tcpdump or any other that needs +raw access to network interface. +You can find the latest version of this document at: http://pusa.uv.es/~ulisses/packet_mmap/ -Please send me your comments to +Howto can be found at: + http://wiki.gnu-log.net (packet_mmap) +Please send your comments to Ulisses Alonso Camaró + Johann Baudy ------------------------------------------------------------------------------- + Why use PACKET_MMAP @@ -25,19 +27,24 @@ to capture each packet, it requires two if you want to get packet's timestamp (like libpcap always does). In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size -configurable circular buffer mapped in user space. This way reading packets just -needs to wait for them, most of the time there is no need to issue a single -system call. By using a shared buffer between the kernel and the user -also has the benefit of minimizing packet copies. - -It's fine to use PACKET_MMAP to improve the performance of the capture process, -but it isn't everything. At least, if you are capturing at high speeds (this -is relative to the cpu speed), you should check if the device driver of your -network interface card supports some sort of interrupt load mitigation or -(even better) if it supports NAPI, also make sure it is enabled. +configurable circular buffer mapped in user space that can be used to either +send or receive packets. This way reading packets just needs to wait for them, +most of the time there is no need to issue a single system call. Concerning +transmission, multiple packets can be sent through one system call to get the +highest bandwidth. +By using a shared buffer between the kernel and the user also has the benefit +of minimizing packet copies. + +It's fine to use PACKET_MMAP to improve the performance of the capture and +transmission process, but it isn't everything. At least, if you are capturing +at high speeds (this is relative to the cpu speed), you should check if the +device driver of your network interface card supports some sort of interrupt +load mitigation or (even better) if it supports NAPI, also make sure it is +enabled. For transmission, check the MTU (Maximum Transmission Unit) used and +supported by devices of your network. -------------------------------------------------------------------------------- -+ How to use CONFIG_PACKET_MMAP ++ How to use CONFIG_PACKET_MMAP to improve capture process -------------------------------------------------------------------------------- From the user standpoint, you should use the higher level libpcap library, which @@ -57,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP support. -------------------------------------------------------------------------------- -+ How to use CONFIG_PACKET_MMAP directly ++ How to use CONFIG_PACKET_MMAP directly to improve capture process -------------------------------------------------------------------------------- From the system calls stand point, the use of PACKET_MMAP involves @@ -66,6 +73,7 @@ the following process: [setup] socket() -------> creation of the capture socket setsockopt() ---> allocation of the circular buffer (ring) + option: PACKET_RX_RING mmap() ---------> mapping of the allocated buffer to the user process @@ -97,13 +105,75 @@ also the mapping of the circular buffer in the user process and the use of this buffer. -------------------------------------------------------------------------------- ++ How to use CONFIG_PACKET_MMAP directly to improve transmission process +-------------------------------------------------------------------------------- +Transmission process is similar to capture as shown below. + +[setup] socket() -------> creation of the transmission socket + setsockopt() ---> allocation of the circular buffer (ring) + option: PACKET_TX_RING + bind() ---------> bind transmission socket with a network interface + mmap() ---------> mapping of the allocated buffer to the + user process + +[transmission] poll() ---------> wait for free packets (optional) + send() ---------> send all packets that are set as ready in + the ring + The flag MSG_DONTWAIT can be used to return + before end of transfer. + +[shutdown] close() --------> destruction of the transmission socket and + deallocation of all associated resources. + +Binding the socket to your network interface is mandatory (with zero copy) to +know the header size of frames used in the circular buffer. + +As capture, each frame contains two parts: + + -------------------- +| struct tpacket_hdr | Header. It contains the status of +| | of this frame +|--------------------| +| data buffer | +. . Data that will be sent over the network interface. +. . + -------------------- + + bind() associates the socket to your network interface thanks to + sll_ifindex parameter of struct sockaddr_ll. + + Initialization example: + + struct sockaddr_ll my_addr; + struct ifreq s_ifr; + ... + + strncpy (s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name)); + + /* get interface index of eth0 */ + ioctl(this->socket, SIOCGIFINDEX, &s_ifr); + + /* fill sockaddr_ll struct to prepare binding */ + my_addr.sll_family = AF_PACKET; + my_addr.sll_protocol = ETH_P_ALL; + my_addr.sll_ifindex = s_ifr.ifr_ifindex; + + /* bind socket to eth0 */ + bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll)); + + A complete tutorial is available at: http://wiki.gnu-log.net/ + +-------------------------------------------------------------------------------- + PACKET_MMAP settings -------------------------------------------------------------------------------- To setup PACKET_MMAP from user level code is done with a call like + - Capture process setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req)) + - Transmission process + setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req)) The most significant argument in the previous call is the req parameter, this parameter must to have the following structure: @@ -117,11 +187,11 @@ this parameter must to have the following structure: }; This structure is defined in /usr/include/linux/if_packet.h and establishes a -circular buffer (ring) of unswappable memory mapped in the capture process. +circular buffer (ring) of unswappable memory. Being mapped in the capture process allows reading the captured frames and related meta-information like timestamps without requiring a system call. -Captured frames are grouped in blocks. Each block is a physically contiguous +Frames are grouped in blocks. Each block is a physically contiguous region of memory and holds tp_block_size/tp_frame_size frames. The total number of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because @@ -336,6 +406,7 @@ struct tpacket_hdr). If this field is 0 means that the frame is ready to be used for the kernel, If not, there is a frame the user can read and the following flags apply: ++++ Capture process: from include/linux/if_packet.h #define TP_STATUS_COPY 2 @@ -391,6 +462,37 @@ packets are in the ring: It doesn't incur in a race condition to first check the status value and then poll for frames. + +++ Transmission process +Those defines are also used for transmission: + + #define TP_STATUS_AVAILABLE 0 // Frame is available + #define TP_STATUS_SEND_REQUEST 1 // Frame will be sent on next send() + #define TP_STATUS_SENDING 2 // Frame is currently in transmission + #define TP_STATUS_WRONG_FORMAT 4 // Frame format is not correct + +First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a +packet, the user fills a data buffer of an available frame, sets tp_len to +current data buffer size and sets its status field to TP_STATUS_SEND_REQUEST. +This can be done on multiple frames. Once the user is ready to transmit, it +calls send(). Then all buffers with status equal to TP_STATUS_SEND_REQUEST are +forwarded to the network device. The kernel updates each status of sent +frames with TP_STATUS_SENDING until the end of transfer. +At the end of each transfer, buffer status returns to TP_STATUS_AVAILABLE. + + header->tp_len = in_i_size; + header->tp_status = TP_STATUS_SEND_REQUEST; + retval = send(this->socket, NULL, 0, 0); + +The user can also use poll() to check if a buffer is available: +(status == TP_STATUS_SENDING) + + struct pollfd pfd; + pfd.fd = fd; + pfd.revents = 0; + pfd.events = POLLOUT; + retval = poll(&pfd, 1, timeout); + -------------------------------------------------------------------------------- + THANKS -------------------------------------------------------------------------------- -- cgit v1.1 From 6ca05ee11f3da9484c5323aeaec7cee540af70ed Mon Sep 17 00:00:00 2001 From: Kumar Gala Date: Mon, 27 Apr 2009 13:17:47 -0500 Subject: powerpc/85xx: Add binding for LAWs and ECM The first 4k region of CCSR space is well defined for local access windows, CCSRBAR, etc. The second 4k region is well defined as register for configuring and getting errors for the ECM coherency module. Signed-off-by: Kumar Gala --- Documentation/powerpc/dts-bindings/ecm.txt | 64 ++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 Documentation/powerpc/dts-bindings/ecm.txt (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/ecm.txt b/Documentation/powerpc/dts-bindings/ecm.txt new file mode 100644 index 0000000..f514f29 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/ecm.txt @@ -0,0 +1,64 @@ +===================================================================== +E500 LAW & Coherency Module Device Tree Binding +Copyright (C) 2009 Freescale Semiconductor Inc. +===================================================================== + +Local Access Window (LAW) Node + +The LAW node represents the region of CCSR space where local access +windows are configured. For ECM based devices this is the first 4k +of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some +number of local access windows as specified by fsl,num-laws. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,ecm-law" + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - fsl,num-laws + Usage: required + Value type: + Definition: The value specifies the number of local access + windows for this device. + +===================================================================== + +E500 Coherency Module Node + +The E500 LAW node represents the region of CCSR space where ECM config +and error reporting registers exist, this is the second 4k (0x1000) +of CCSR space. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,CHIP-ecm", "fsl,ecm" where + CHIP is the processor (mpc8572, mpc8544, etc.) + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - interrupts + Usage: required + Value type: + + - interrupt-parent + Usage: required + Value type: + +===================================================================== -- cgit v1.1 From 5a928079c9b95e7c34526b07a3bb80b8d062347f Mon Sep 17 00:00:00 2001 From: Kumar Gala Date: Wed, 22 Apr 2009 13:12:14 -0500 Subject: powerpc/86xx: Add binding for LAWs and MCM The first 4k region of CCSR space is well defined for local access windows, CCSRBAR, etc. The second 4k region is well defined as register for configuring and getting errors for the MPX coherency module. Signed-off-by: Kumar Gala --- Documentation/powerpc/dts-bindings/fsl/mcm.txt | 64 ++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 Documentation/powerpc/dts-bindings/fsl/mcm.txt (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/mcm.txt b/Documentation/powerpc/dts-bindings/fsl/mcm.txt new file mode 100644 index 0000000..4ceda9b --- /dev/null +++ b/Documentation/powerpc/dts-bindings/fsl/mcm.txt @@ -0,0 +1,64 @@ +===================================================================== +MPX LAW & Coherency Module Device Tree Binding +Copyright (C) 2009 Freescale Semiconductor Inc. +===================================================================== + +Local Access Window (LAW) Node + +The LAW node represents the region of CCSR space where local access +windows are configured. For MCM based devices this is the first 4k +of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some +number of local access windows as specified by fsl,num-laws. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,mcm-law" + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - fsl,num-laws + Usage: required + Value type: + Definition: The value specifies the number of local access + windows for this device. + +===================================================================== + +MPX Coherency Module Node + +The MPX LAW node represents the region of CCSR space where MCM config +and error reporting registers exist, this is the second 4k (0x1000) +of CCSR space. + +PROPERTIES + + - compatible + Usage: required + Value type: + Definition: Must include "fsl,CHIP-mcm", "fsl,mcm" where + CHIP is the processor (mpc8641, mpc8610, etc.) + + - reg + Usage: required + Value type: + Definition: A standard property. The value specifies the + physical address offset and length of the CCSR space + registers. + + - interrupts + Usage: required + Value type: + + - interrupt-parent + Usage: required + Value type: + +===================================================================== -- cgit v1.1 From 06c4435021f4856261edd01e2691071edeb8fa51 Mon Sep 17 00:00:00 2001 From: Haiying Wang Date: Fri, 1 May 2009 15:40:47 -0400 Subject: powerpc/qe: update risc allocation for QE Change the RISC allocation to macros instead of enum, add function to read the number of risc engines from the new property "fsl,qe-num-riscs" under the qe node in dts. Add new property "fsl,qe-num-riscs" description in qe.txt Signed-off-by: Haiying Wang Acked-by: Timur Tabi Signed-off-by: Kumar Gala --- Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt index 78790d5..39b5d1f 100644 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt +++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt @@ -17,6 +17,7 @@ Required properties: - model : precise model of the QE, Can be "QE", "CPM", or "CPM2" - reg : offset and length of the device registers. - bus-frequency : the clock frequency for QUICC Engine. +- fsl,qe-num-riscs: define how many RISC engines the QE has. Recommended properties - brg-frequency : the internal clock source frequency for baud-rate -- cgit v1.1 From 98ca77af23da6682bb3e34961a3f32e2c064a4ce Mon Sep 17 00:00:00 2001 From: Haiying Wang Date: Fri, 1 May 2009 15:40:48 -0400 Subject: powerpc/qe: update QE Serial Number The latest QE chip may have more Serial Number(SNUM)s of thread to use. We will get the number of SNUMs from device tree by reading the new property "fsl,qe-num-snums", and set 28 as the default number of SNUMs so that it is compatible with the old QE chips' device trees which don't have this new property. The macro QE_NUM_OF_SNUM is defined as the maximum number in QE snum table which is 256. Also we update the snum_init[] array with 18 more new SNUMs which are confirmed to be useful on new chip. Signed-off-by: Haiying Wang Acked-by: Timur Tabi Signed-off-by: Kumar Gala --- Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt index 39b5d1f..6e37be1 100644 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt +++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/qe.txt @@ -18,6 +18,8 @@ Required properties: - reg : offset and length of the device registers. - bus-frequency : the clock frequency for QUICC Engine. - fsl,qe-num-riscs: define how many RISC engines the QE has. +- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the + threads. Recommended properties - brg-frequency : the internal clock source frequency for baud-rate -- cgit v1.1 From 404614728f857d0ac63d29c3a29d0cf392a15598 Mon Sep 17 00:00:00 2001 From: Kumar Gala Date: Fri, 8 May 2009 08:47:45 -0500 Subject: powerpc/fsl: Update FSL esdhc binding Updated the binding spec to use "fsl,eshdc" as the base compatible rather than the first chip in the family. Signed-off-by: Kumar Gala --- Documentation/powerpc/dts-bindings/fsl/esdhc.txt | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt index 6008465..5093ddf 100644 --- a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt +++ b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt @@ -5,8 +5,7 @@ for MMC, SD, and SDIO types of memory cards. Required properties: - compatible : should be - "fsl,-esdhc", "fsl,mpc8379-esdhc" for MPC83xx processors. - "fsl,-esdhc", "fsl,mpc8536-esdhc" for MPC85xx processors. + "fsl,-esdhc", "fsl,esdhc" - reg : should contain eSDHC registers location and length. - interrupts : should contain eSDHC interrupt. - interrupt-parent : interrupt source phandle. @@ -15,7 +14,7 @@ Required properties: Example: sdhci@2e000 { - compatible = "fsl,mpc8378-esdhc", "fsl,mpc8379-esdhc"; + compatible = "fsl,mpc8378-esdhc", "fsl,esdhc"; reg = <0x2e000 0x1000>; interrupts = <42 0x8>; interrupt-parent = <&ipic>; -- cgit v1.1 From 143c145e3a475065a4be661468d0df1bd0b25f74 Mon Sep 17 00:00:00 2001 From: Li Zefan Date: Tue, 19 May 2009 14:43:15 +0800 Subject: tracing/events: Documentation updates - fix some typos - document the difference between '>' and '>>' - document the 'enable' toggle - remove section "Defining an event-enabled tracepoint", since it's out-dated and sample/trace_events/ already serves this purpose. v2: add "Updated by Li Zefan" [ Impact: make documentation up-to-date ] Signed-off-by: Li Zefan Cc: Steven Rostedt Cc: Frederic Weisbecker Cc: "Theodore Ts'o" LKML-Reference: <4A125503.5060406@cn.fujitsu.com> Signed-off-by: Ingo Molnar --- Documentation/trace/events.txt | 159 +++++++++++++++-------------------------- 1 file changed, 57 insertions(+), 102 deletions(-) (limited to 'Documentation') diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt index abdee66..f157d75 100644 --- a/Documentation/trace/events.txt +++ b/Documentation/trace/events.txt @@ -1,9 +1,10 @@ Event Tracing Documentation written by Theodore Ts'o + Updated by Li Zefan -Introduction -============ +1. Introduction +=============== Tracepoints (see Documentation/trace/tracepoints.txt) can be used without creating custom kernel modules to register probe functions @@ -12,30 +13,37 @@ using the event tracing infrastructure. Not all tracepoints can be traced using the event tracing system; the kernel developer must provide code snippets which define how the tracing information is saved into the tracing buffer, and how the -the tracing information should be printed. +tracing information should be printed. -Using Event Tracing -=================== +2. Using Event Tracing +====================== + +2.1 Via the 'set_event' interface +--------------------------------- The events which are available for tracing can be found in the file -/sys/kernel/debug/tracing/available_events. +/debug/tracing/available_events. To enable a particular event, such as 'sched_wakeup', simply echo it -to /sys/debug/tracing/set_event. For example: +to /debug/tracing/set_event. For example: - # echo sched_wakeup > /sys/kernel/debug/tracing/set_event + # echo sched_wakeup >> /debug/tracing/set_event -[ Note: events can also be enabled/disabled via the 'enabled' toggle - found in the /sys/kernel/tracing/events/ hierarchy of directories. ] +[ Note: '>>' is necessary, otherwise it will firstly disable + all the events. ] To disable an event, echo the event name to the set_event file prefixed with an exclamation point: - # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event + # echo '!sched_wakeup' >> /debug/tracing/set_event + +To disable all events, echo an empty line to the set_event file: + + # echo > /debug/tracing/set_event -To disable events, echo an empty line to the set_event file: +To enable all events, echo '*:*' or '*:' to the set_event file: - # echo > /sys/kernel/debug/tracing/set_event + # echo *:* > /debug/tracing/set_event The events are organized into subsystems, such as ext4, irq, sched, etc., and a full event name looks like this: :. The @@ -44,92 +52,39 @@ file. All of the events in a subsystem can be specified via the syntax ":*"; for example, to enable all irq events, you can use the command: - # echo 'irq:*' > /sys/kernel/debug/tracing/set_event - -Defining an event-enabled tracepoint ------------------------------------- - -A kernel developer which wishes to define an event-enabled tracepoint -must declare the tracepoint using TRACE_EVENT instead of DECLARE_TRACE. -This is done via two header files in include/trace. For example, to -event-enable the jbd2 subsystem, we must create two files, -include/trace/jbd2.h and include/trace/jbd2_event_types.h. The -include/trace/jbd2.h file should be included by kernel source files that -will have a tracepoint inserted, and might look like this: - -#ifndef _TRACE_JBD2_H -#define _TRACE_JBD2_H - -#include -#include - -#include - -#endif - -In a file that utilizes a jbd2 tracepoint, this header file would be -included. Note that you still have to use DEFINE_TRACE(). So for -example, if fs/jbd2/commit.c planned to use the jbd2_start_commit -tracepoint, it would have the following near the beginning of the file: - -#include - -DEFINE_TRACE(jbd2_start_commit); - -Then in the function that would call the tracepoint, it would call the -tracepoint function. (For more information, please see the tracepoint -documentation in Documentation/trace/tracepoints.txt): - - trace_jbd2_start_commit(journal, commit_transaction); - -The code snippets which allow jbd2_start_commit to be an event-enabled -tracepoint are placed in the file include/trace/jbd2_event_types.h: - -/* use instead */ -#ifndef TRACE_EVENT -# error Do not include this file directly. -# error Unless you know what you are doing. -#endif - -#undef TRACE_SYSTEM -#define TRACE_SYSTEM jbd2 - -#include - -TRACE_EVENT(jbd2_start_commit, - TP_PROTO(journal_t *journal, transaction_t *commit_transaction), - TP_ARGS(journal, commit_transaction), - TP_STRUCT__entry( - __array( char, devname, BDEVNAME_SIZE+24 ) - __field( int, transaction ) - ), - TP_fast_assign( - memcpy(__entry->devname, journal->j_devname, BDEVNAME_SIZE+24); - __entry->transaction = commit_transaction->t_tid; - ), - TP_printk("dev %s transaction %d", - __entry->devname, __entry->transaction) -); - -The TP_PROTO and TP_ARGS are unchanged from DECLARE_TRACE. The new -arguments to TRACE_EVENT are TP_STRUCT__entry, TP_fast_assign, and -TP_printk. - -TP_STRUCT__entry defines the data structure which will be stored in the -trace buffer. Normally, fields in __entry will be arrays or simple -types. It is possible to place data structures in __entry --- however, -pointers in the data structure can not be trusted, since they will be -accessed sometime later by TP_printk, and if the data structure contains -fields that will not or cannot be used by TP_printk, this will waste -space in the trace buffer. In general, data structures should be -avoided, unless they do only contain non-pointer types and all of the -fields will be used by TP_printk. - -TP_fast_assign defines the code snippet which saves information into the -__entry data structure, using the passed-in arguments defined in -TP_PROTO and TP_ARGS. - -Finally, TP_printk will print the __entry data structure. At the time -when the code snippet defined by TP_printk is executed, it will not have -access to the TP_ARGS arguments; it can only use the information saved -in the __entry data structure. + # echo 'irq:*' > /debug/tracing/set_event + +2.2 Via the 'enable' toggle +--------------------------- + +The events available are also listed in /debug/tracing/events/ hierarchy +of directories. + +To enable event 'sched_wakeup': + + # echo 1 > /debug/tracing/events/sched/sched_wakeup/enable + +To disable it: + + # echo 0 > /debug/tracing/events/sched/sched_wakeup/enable + +To enable all events in sched subsystem: + + # echo 1 > /debug/tracing/events/sched/enable + +To eanble all events: + + # echo 1 > /debug/tracing/events/enable + +When reading one of these enable files, there are four results: + + 0 - all events this file affects are disabled + 1 - all events this file affects are enabled + X - there is a mixture of events enabled and disabled + ? - this file does not affect any event + +3. Defining an event-enabled tracepoint +======================================= + +See The example provided in samples/trace_events + -- cgit v1.1 From e9ccb73ab57ada469602506496c42e5b4468ac3e Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Tue, 19 May 2009 10:23:23 +0100 Subject: GFS2: Update docs Update a few things which were out of date, and fix a typo. Signed-off-by: Steven Whitehouse --- Documentation/filesystems/gfs2-glocks.txt | 2 +- Documentation/filesystems/gfs2.txt | 19 +++++++++++-------- 2 files changed, 12 insertions(+), 9 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/gfs2-glocks.txt b/Documentation/filesystems/gfs2-glocks.txt index 4dae9a3..0494f78 100644 --- a/Documentation/filesystems/gfs2-glocks.txt +++ b/Documentation/filesystems/gfs2-glocks.txt @@ -60,7 +60,7 @@ go_lock | Called for the first local holder of a lock go_unlock | Called on the final local unlock of a lock go_dump | Called to print content of object for debugfs file, or on | error to dump glock to the log. -go_type; | The type of the glock, LM_TYPE_..... +go_type | The type of the glock, LM_TYPE_..... go_min_hold_time | The minimum hold time The minimum hold time for each lock is the time after a remote lock diff --git a/Documentation/filesystems/gfs2.txt b/Documentation/filesystems/gfs2.txt index 593004b..5e3ab8f 100644 --- a/Documentation/filesystems/gfs2.txt +++ b/Documentation/filesystems/gfs2.txt @@ -11,18 +11,15 @@ their I/O so file system consistency is maintained. One of the nifty features of GFS is perfect consistency -- changes made to the file system on one machine show up immediately on all other machines in the cluster. -GFS uses interchangable inter-node locking mechanisms. Different lock -modules can plug into GFS and each file system selects the appropriate -lock module at mount time. Lock modules include: +GFS uses interchangable inter-node locking mechanisms, the currently +supported mechanisms are: lock_nolock -- allows gfs to be used as a local file system lock_dlm -- uses a distributed lock manager (dlm) for inter-node locking The dlm is found at linux/fs/dlm/ -In addition to interfacing with an external locking manager, a gfs lock -module is responsible for interacting with external cluster management -systems. Lock_dlm depends on user space cluster management systems found +Lock_dlm depends on user space cluster management systems found at the URL above. To use gfs as a local file system, no external clustering systems are @@ -31,13 +28,19 @@ needed, simply: $ mkfs -t gfs2 -p lock_nolock -j 1 /dev/block_device $ mount -t gfs2 /dev/block_device /dir -GFS2 is not on-disk compatible with previous versions of GFS. +If you are using Fedora, you need to install the gfs2-utils package +and, for lock_dlm, you will also need to install the cman package +and write a cluster.conf as per the documentation. + +GFS2 is not on-disk compatible with previous versions of GFS, but it +is pretty close. The following man pages can be found at the URL above: - gfs2_fsck to repair a filesystem + fsck.gfs2 to repair a filesystem gfs2_grow to expand a filesystem online gfs2_jadd to add journals to a filesystem online gfs2_tool to manipulate, examine and tune a filesystem gfs2_quota to examine and change quota values in a filesystem + gfs2_convert to convert a gfs filesystem to gfs2 in-place mount.gfs2 to help mount(8) mount a filesystem mkfs.gfs2 to make a filesystem -- cgit v1.1 From cce4c77b87ce7e71a0f244a3dfb6ac1c3a1bc67e Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 19 May 2009 10:39:34 +0200 Subject: mac80211: fix kernel-doc Moving information from config_interface to bss_info_changed removed struct ieee80211_if_conf which the documentation still refers to, additionally there's one kernel-doc description too much and one other missing, fix all this. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/DocBook/mac80211.tmpl | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/mac80211.tmpl b/Documentation/DocBook/mac80211.tmpl index fbeaffc..e369866 100644 --- a/Documentation/DocBook/mac80211.tmpl +++ b/Documentation/DocBook/mac80211.tmpl @@ -145,7 +145,6 @@ usage should require reading the full document. interface in STA mode at first! !Finclude/net/mac80211.h ieee80211_if_init_conf -!Finclude/net/mac80211.h ieee80211_if_conf -- cgit v1.1 From 14f966e79445015cd89d0fa0ceb6b33702e951b6 Mon Sep 17 00:00:00 2001 From: Robert Jennings Date: Wed, 15 Apr 2009 05:55:32 +0000 Subject: powerpc/pseries: CMO unused page hinting Adds support for the "unused" page hint which can be used in shared memory partitions to flag pages not in use, which will then be stolen before active pages by the hypervisor when memory needs to be moved to LPARs in need of additional memory. Failure to mark pages as 'unused' makes the LPAR slower to give up unused memory to other partitions. This adds the kernel parameter 'cmo_free_hint' to disable this functionality. Signed-off-by: Brian King Signed-off-by: Robert Jennings Signed-off-by: Benjamin Herrenschmidt --- Documentation/kernel-parameters.txt | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..844e9a6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -497,6 +497,13 @@ and is between 256 and 4096 characters. It is defined in the file Also note the kernel might malfunction if you disable some critical bits. + cmo_free_hint= [PPC] Format: { yes | no } + Specify whether pages are marked as being inactive + when they are freed. This is used in CMO environments + to determine OS memory pressure for page stealing by + a hypervisor. + Default: yes + code_bytes [X86] How many bytes of object code to print in an oops report. Range: 0 - 8192 -- cgit v1.1 From 5789ba3bd0a3cd20df5980ebf03358f2eb44fd67 Mon Sep 17 00:00:00 2001 From: Eric Paris Date: Thu, 21 May 2009 15:47:06 -0400 Subject: IMA: Minimal IMA policy and boot param for TCB IMA policy The IMA TCB policy is dangerous. A normal use can use all of a system's memory (which cannot be freed) simply by building and running lots of executables. The TCB policy is also nearly useless because logging in as root often causes a policy violation when dealing with utmp, thus rendering the measurements meaningless. There is no good fix for this in the kernel. A full TCB policy would need to be loaded in userspace using LSM rule matching to get both a protected and useful system. But, if too little is measured before userspace can load a real policy one again ends up with a meaningless set of measurements. One option would be to put the policy load inside the initrd in order to get it early enough in the boot sequence to be useful, but this runs into trouble with the LSM. For IMA to measure the LSM policy and the LSM policy loading mechanism it needs rules to do so, but we already talked about problems with defaulting to such broad rules.... IMA also depends on the files being measured to be on an FS which implements and supports i_version. Since the only FS with this support (ext4) doesn't even use it by default it seems silly to have any IMA rules by default. This should reduce the performance overhead of IMA to near 0 while still letting users who choose to configure their machine as such to inclue the ima_tcb kernel paramenter and get measurements during boot before they can load a customized, reasonable policy in userspace. Signed-off-by: Eric Paris Acked-by: Mimi Zohar Signed-off-by: James Morris --- Documentation/kernel-parameters.txt | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..d9a24a0 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -914,6 +914,12 @@ and is between 256 and 4096 characters. It is defined in the file Formt: { "sha1" | "md5" } default: "sha1" + ima_tcb [IMA] + Load a policy which meets the needs of the Trusted + Computing Base. This means IMA will measure all + programs exec'd, files mmap'd for exec, and all files + opened for read by uid=0. + in2000= [HW,SCSI] See header of drivers/scsi/in2000.c. -- cgit v1.1 From c72758f33784e5e2a1a4bb9421ef3e6de8f9fcf3 Mon Sep 17 00:00:00 2001 From: "Martin K. Petersen" Date: Fri, 22 May 2009 17:17:53 -0400 Subject: block: Export I/O topology for block devices and partitions To support devices with physical block sizes bigger than 512 bytes we need to ensure proper alignment. This patch adds support for exposing I/O topology characteristics as devices are stacked. logical_block_size is the smallest unit the device can address. physical_block_size indicates the smallest I/O the device can write without incurring a read-modify-write penalty. The io_min parameter is the smallest preferred I/O size reported by the device. In many cases this is the same as the physical block size. However, the io_min parameter can be scaled up when stacking (RAID5 chunk size > physical block size). The io_opt characteristic indicates the optimal I/O size reported by the device. This is usually the stripe width for arrays. The alignment_offset parameter indicates the number of bytes the start of the device/partition is offset from the device's natural alignment. Partition tools and MD/DM utilities can use this to pad their offsets so filesystems start on proper boundaries. Signed-off-by: Martin K. Petersen Signed-off-by: Jens Axboe --- Documentation/ABI/testing/sysfs-block | 59 +++++++++++++++++++++++++++++++++++ 1 file changed, 59 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index 44f52a4..cbbd3e0 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block @@ -60,3 +60,62 @@ Description: Indicates whether the block layer should automatically generate checksums for write requests bound for devices that support receiving integrity metadata. + +What: /sys/block//alignment_offset +Date: April 2009 +Contact: Martin K. Petersen +Description: + Storage devices may report a physical block size that is + bigger than the logical block size (for instance a drive + with 4KB physical sectors exposing 512-byte logical + blocks to the operating system). This parameter + indicates how many bytes the beginning of the device is + offset from the disk's natural alignment. + +What: /sys/block///alignment_offset +Date: April 2009 +Contact: Martin K. Petersen +Description: + Storage devices may report a physical block size that is + bigger than the logical block size (for instance a drive + with 4KB physical sectors exposing 512-byte logical + blocks to the operating system). This parameter + indicates how many bytes the beginning of the partition + is offset from the disk's natural alignment. + +What: /sys/block//queue/logical_block_size +Date: May 2009 +Contact: Martin K. Petersen +Description: + This is the smallest unit the storage device can + address. It is typically 512 bytes. + +What: /sys/block//queue/physical_block_size +Date: May 2009 +Contact: Martin K. Petersen +Description: + This is the smallest unit the storage device can write + without resorting to read-modify-write operation. It is + usually the same as the logical block size but may be + bigger. One example is SATA drives with 4KB sectors + that expose a 512-byte logical block size to the + operating system. + +What: /sys/block//queue/minimum_io_size +Date: April 2009 +Contact: Martin K. Petersen +Description: + Storage devices may report a preferred minimum I/O size, + which is the smallest request the device can perform + without incurring a read-modify-write penalty. For disk + drives this is often the physical block size. For RAID + arrays it is often the stripe chunk size. + +What: /sys/block//queue/optimal_io_size +Date: April 2009 +Contact: Martin K. Petersen +Description: + Storage devices may report an optimal I/O size, which is + the device's preferred unit of receiving I/O. This is + rarely reported for disk drives. For RAID devices it is + usually the stripe width or the internal block size. -- cgit v1.1 From 04f9890df1bad2115665b7027e664aaffa44088d Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Mon, 25 May 2009 10:11:29 +0200 Subject: sound: virtuoso: add Xonar Essence ST support Add support for the Asus Xonar Essence ST and its daughterboard. Signed-off-by: Clemens Ladisch Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/ALSA-Configuration.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..68ef84f 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -1859,7 +1859,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ------------------- Module for sound cards based on the Asus AV100/AV200 chips, - i.e., Xonar D1, DX, D2, D2X, HDAV1.3 (Deluxe), and Essence STX. + i.e., Xonar D1, DX, D2, D2X, HDAV1.3 (Deluxe), Essence ST + (Deluxe) and Essence STX. This module supports autoprobe and multiple cards. -- cgit v1.1 From 29fcefba8a2f0fea11e2b721fe174a1832801284 Mon Sep 17 00:00:00 2001 From: Pekka Enberg Date: Sun, 24 May 2009 11:13:17 +0300 Subject: kmemtrace: fix kernel parameter documentation The kmemtrace.enable kernel parameter no longer works. To enable kmemtrace at boot-time, you must pass "ftrace=kmemtrace" instead. [ Impact: remove obsolete kernel parameter documentation ] Cc: Eduard - Gabriel Munteanu Signed-off-by: Pekka Enberg LKML-Reference: Signed-off-by: Frederic Weisbecker --- Documentation/kernel-parameters.txt | 10 ---------- 1 file changed, 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..9243dd8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -56,7 +56,6 @@ parameter is applicable: ISAPNP ISA PnP code is enabled. ISDN Appropriate ISDN support is enabled. JOY Appropriate joystick support is enabled. - KMEMTRACE kmemtrace is enabled. LIBATA Libata driver is enabled LP Printer support is enabled. LOOP Loopback device support is enabled. @@ -1054,15 +1053,6 @@ and is between 256 and 4096 characters. It is defined in the file use the HighMem zone if it exists, and the Normal zone if it does not. - kmemtrace.enable= [KNL,KMEMTRACE] Format: { yes | no } - Controls whether kmemtrace is enabled - at boot-time. - - kmemtrace.subbufs=n [KNL,KMEMTRACE] Overrides the number of - subbufs kmemtrace's relay channel has. Set this - higher than default (KMEMTRACE_N_SUBBUFS in code) if - you experience buffer overruns. - kgdboc= [HW] kgdb over consoles. Requires a tty driver that supports console polling. (only serial suported for now) -- cgit v1.1 From 113b3a2b901573961509e81a28e9546cf9defef0 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Wed, 27 May 2009 21:46:11 -0400 Subject: ACPI: delete acpi.power_nocheck from kernel-parameters.txt Signed-off-by: Len Brown --- Documentation/kernel-parameters.txt | 8 -------- 1 file changed, 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index fd5cac0..ed3d8b9 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -229,14 +229,6 @@ and is between 256 and 4096 characters. It is defined in the file to assume that this machine's pmtimer latches its value and always returns good values. - acpi.power_nocheck= [HW,ACPI] - Format: 1/0 enable/disable the check of power state. - On some bogus BIOS the _PSC object/_STA object of - power resource can't return the correct device power - state. In such case it is unneccessary to check its - power state again in power transition. - 1 : disable the power state check - acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode Format: { level | edge | high | low } -- cgit v1.1 From d9cfed925448f097ec7faab80d903eb7e5f99712 Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Tue, 19 May 2009 12:16:29 +0200 Subject: amd-iommu: remove amd_iommu_size kernel parameter This parameter is not longer necessary when aperture increases dynamically. Signed-off-by: Joerg Roedel --- Documentation/kernel-parameters.txt | 5 ----- 1 file changed, 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..5b776c6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -329,11 +329,6 @@ and is between 256 and 4096 characters. It is defined in the file flushed before they will be reused, which is a lot of faster - amd_iommu_size= [HW,X86-64] - Define the size of the aperture for the AMD IOMMU - driver. Possible values are: - '32M', '64M' (default), '128M', '256M', '512M', '1G' - amijoy.map= [HW,JOY] Amiga joystick support Map of devices attached to JOY0DAT and JOY1DAT Format: , -- cgit v1.1 From 45f458e9a8a216b02b76fe61d9e8bc40d659fbe8 Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Tue, 28 Apr 2009 23:18:26 +0200 Subject: x86, mce: deprecate old 32bit machine check code Schedule for removal in 2.6.32 Signed-off-by: Andi Kleen Signed-off-by: H. Peter Anvin Signed-off-by: Hidetoshi Seto Signed-off-by: H. Peter Anvin --- Documentation/feature-removal-schedule.txt | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index de491a3..ec9ef5d 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -437,3 +437,13 @@ Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate driver but this caused driver conflicts. Who: Jean Delvare Krzysztof Helt + +---------------------------- + +What: CONFIG_X86_OLD_MCE +When: 2.6.32 +Why: Remove the old legacy 32bit machine check code. This has been + superseded by the newer machine check code from the 64bit port, + but the old version has been kept around for easier testing. Note this + doesn't impact the old P5 and WinChip machine check handlers. +Who: Andi Kleen -- cgit v1.1 From 172d899db4bf0beb7766d583379e5ed552130e4a Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Tue, 28 Apr 2009 23:37:02 +0200 Subject: x86, mce: document new 32bit mcelog requirement in Documentation/Changes Signed-off-by: Andi Kleen Signed-off-by: H. Peter Anvin Signed-off-by: Hidetoshi Seto Signed-off-by: H. Peter Anvin --- Documentation/Changes | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/Changes b/Documentation/Changes index b95082b..d21b3b5 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -48,6 +48,7 @@ o procps 3.2.0 # ps --version o oprofile 0.9 # oprofiled --version o udev 081 # udevinfo -V o grub 0.93 # grub --version +o mcelog 0.6 Kernel compilation ================== @@ -276,6 +277,16 @@ before running exportfs or mountd. It is recommended that all NFS services be protected from the internet-at-large by a firewall where that is possible. +mcelog +------ + +In Linux 2.6.31+ the i386 kernel needs to run the mcelog utility +as a regular cronjob similar to the x86-64 kernel to process and log +machine check events when CONFIG_X86_NEW_MCE is enabled. Machine check +events are errors reported by the CPU. Processing them is strongly encouraged. +All x86-64 kernels since 2.6.4 require the mcelog utility to +process machine checks. + Getting updated software ======================== @@ -365,6 +376,10 @@ FUSE ---- o +mcelog +------ +o + Networking ********** -- cgit v1.1 From 8780e8e0f6b34862cdf2c62d4d2674d6bc3207db Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Wed, 27 May 2009 21:56:56 +0200 Subject: x86, mce: improve documentation Document that check_interval set to 0 means no polling. Noticed by Hidetoshi Seto Also add a reference from boot options to the sysfs tunables Acked-by: Hidetoshi Seto Signed-off-by: Andi Kleen Signed-off-by: Hidetoshi Seto Signed-off-by: H. Peter Anvin --- Documentation/x86/x86_64/boot-options.txt | 2 ++ Documentation/x86/x86_64/machinecheck | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 34c1304..63fca71 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -5,6 +5,8 @@ only the AMD64 specific ones are listed here. Machine check + Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables. + mce=off disable machine check mce=bootlog Enable logging of machine checks left over from booting. Disabled by default on AMD because some BIOS leave bogus ones. diff --git a/Documentation/x86/x86_64/machinecheck b/Documentation/x86/x86_64/machinecheck index a05e58e..a4fdb25 100644 --- a/Documentation/x86/x86_64/machinecheck +++ b/Documentation/x86/x86_64/machinecheck @@ -41,7 +41,9 @@ check_interval the polling interval. When the poller stops finding MCEs, it triggers an exponential backoff (poll less often) on the polling interval. The check_interval variable is both the initial and - maximum polling interval. + maximum polling interval. 0 means no polling for corrected machine + check errors (but some corrected errors might be still reported + in other ways) tolerant Tolerance level. When a machine check exception occurs for a non -- cgit v1.1 From 19fe7f1a00023d2aa97617655b7ea56eb72f4db8 Mon Sep 17 00:00:00 2001 From: Kevin Cernekee Date: Wed, 8 Apr 2009 22:51:43 -0700 Subject: Documentation: add MTD sysfs docs Signed-off-by: Kevin Cernekee Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse --- Documentation/ABI/testing/sysfs-class-mtd | 125 ++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-class-mtd (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-class-mtd b/Documentation/ABI/testing/sysfs-class-mtd new file mode 100644 index 0000000..4d55a18 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-mtd @@ -0,0 +1,125 @@ +What: /sys/class/mtd/ +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + The mtd/ class subdirectory belongs to the MTD subsystem + (MTD core). + +What: /sys/class/mtd/mtdX/ +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + The /sys/class/mtd/mtd{0,1,2,3,...} directories correspond + to each /dev/mtdX character device. These may represent + physical/simulated flash devices, partitions on a flash + device, or concatenated flash devices. They exist regardless + of whether CONFIG_MTD_CHAR is actually enabled. + +What: /sys/class/mtd/mtdXro/ +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + These directories provide the corresponding read-only device + nodes for /sys/class/mtd/mtdX/ . They are only created + (for the benefit of udev) if CONFIG_MTD_CHAR is enabled. + +What: /sys/class/mtd/mtdX/dev +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + Major and minor numbers of the character device corresponding + to this MTD device (in : format). This is the + read-write device so will be even. + +What: /sys/class/mtd/mtdXro/dev +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + Major and minor numbers of the character device corresponding + to the read-only variant of thie MTD device (in + : format). In this case will be odd. + +What: /sys/class/mtd/mtdX/erasesize +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + "Major" erase size for the device. If numeraseregions is + zero, this is the eraseblock size for the entire device. + Otherwise, the MEMGETREGIONCOUNT/MEMGETREGIONINFO ioctls + can be used to determine the actual eraseblock layout. + +What: /sys/class/mtd/mtdX/flags +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + A hexadecimal value representing the device flags, ORed + together: + + 0x0400: MTD_WRITEABLE - device is writable + 0x0800: MTD_BIT_WRITEABLE - single bits can be flipped + 0x1000: MTD_NO_ERASE - no erase necessary + 0x2000: MTD_POWERUP_LOCK - always locked after reset + +What: /sys/class/mtd/mtdX/name +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + A human-readable ASCII name for the device or partition. + This will match the name in /proc/mtd . + +What: /sys/class/mtd/mtdX/numeraseregions +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + For devices that have variable eraseblock sizes, this + provides the total number of erase regions. Otherwise, + it will read back as zero. + +What: /sys/class/mtd/mtdX/oobsize +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + Number of OOB bytes per page. + +What: /sys/class/mtd/mtdX/size +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + Total size of the device/partition, in bytes. + +What: /sys/class/mtd/mtdX/type +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + One of the following ASCII strings, representing the device + type: + + absent, ram, rom, nor, nand, dataflash, ubi, unknown + +What: /sys/class/mtd/mtdX/writesize +Date: April 2009 +KernelVersion: 2.6.29 +Contact: linux-mtd@lists.infradead.org +Description: + Minimal writable flash unit size. This will always be + a positive integer. + + In the case of NOR flash it is 1 (even though individual + bits can be cleared). + + In the case of NAND flash it is one NAND page (or a + half page, or a quarter page). + + In the case of ECC NOR, it is the ECC block size. -- cgit v1.1 From 294ae4011530d008c59c4fb9847738e39228821e Mon Sep 17 00:00:00 2001 From: GeunSik Lim Date: Thu, 28 May 2009 10:36:11 +0900 Subject: ftrace: fix typo about map of kernel priority in ftrace.txt file. Fix typo about chart to map the kernel priority to user land priorities. * About sched_setscheduler(2) Processes scheduled under SCHED_FIFO or SCHED_RR can have a (user-space) static priority in the range 1 to 99. (reference: http://www.kernel.org/doc/man-pages/online/pages/ man2/sched_setscheduler.2.html) * From: Steven Rostedt 0 to 98 - maps to RT tasks 99 to 1 (SCHED_RR or SCHED_FIFO) 99 - maps to internal kernel threads that want to be lower than RT tasks but higher than SCHED_OTHER tasks. Although I'm not sure if any kernel thread actually uses this. I'm not even sure how this can be set, because the internal sched_setscheduler function does not allow for it. 100 to 139 - maps nice levels -20 to 19. These are not set via sched_setscheduler, but are set via the nice system call. 140 - reserved for idle tasks. Signed-off-by: GeunSik Lim Acked-by: Steven Rostedt Signed-off-by: Peter Zijlstra LKML-Reference: Signed-off-by: Ingo Molnar --- Documentation/trace/ftrace.txt | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index fd9a3e6..e362f50 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -518,9 +518,18 @@ priority with zero (0) being the highest priority and the nice values starting at 100 (nice -20). Below is a quick chart to map the kernel priority to user land priorities. - Kernel priority: 0 to 99 ==> user RT priority 99 to 0 - Kernel priority: 100 to 139 ==> user nice -20 to 19 - Kernel priority: 140 ==> idle task priority + Kernel Space User Space + =============================================================== + 0(high) to 98(low) user RT priority 99(high) to 1(low) + with SCHED_RR or SCHED_FIFO + --------------------------------------------------------------- + 99 sched_priority is not used in scheduling + decisions(it must be specified as 0) + --------------------------------------------------------------- + 100(high) to 139(low) user nice -20(high) to 19(low) + --------------------------------------------------------------- + 140 idle task priority + --------------------------------------------------------------- The task states are: -- cgit v1.1 From f04d82b7e0c63d0251f9952a537a4bc4d73aa1a9 Mon Sep 17 00:00:00 2001 From: GeunSik Lim Date: Thu, 28 May 2009 10:36:14 +0900 Subject: sched: fix typo in sched-rt-group.txt file Fix typo about static priority's range. Kernel Space User Space =============================================================== 0(high) to 98(low) user RT priority 99(high) to 1(low) with SCHED_RR or SCHED_FIFO --------------------------------------------------------------- 99 sched_priority is not used in scheduling decisions(it must be specified as 0) --------------------------------------------------------------- 100(high) to 139(low) user nice -20(high) to 19(low) --------------------------------------------------------------- 140 idle task priority --------------------------------------------------------------- * ref) http://www.kernel.org/doc/man-pages/online/pages/man2/sched_setscheduler.2.html Signed-off-by: GeunSik Lim CC: Steven Rostedt Signed-off-by: Peter Zijlstra LKML-Reference: Signed-off-by: Ingo Molnar --- Documentation/scheduler/sched-rt-group.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt index eb74b01..1df7f9c 100644 --- a/Documentation/scheduler/sched-rt-group.txt +++ b/Documentation/scheduler/sched-rt-group.txt @@ -187,7 +187,7 @@ get their allocated time. Implementing SCHED_EDF might take a while to complete. Priority Inheritance is the biggest challenge as the current linux PI infrastructure is geared towards -the limited static priority levels 0-139. With deadline scheduling you need to +the limited static priority levels 0-99. With deadline scheduling you need to do deadline inheritance (since priority is inversely proportional to the deadline delta (deadline - now). -- cgit v1.1 From d1a277c584d0862dbf51991baea947ea5f2ce6bf Mon Sep 17 00:00:00 2001 From: Wolfgang Grandegger Date: Sat, 30 May 2009 07:55:50 +0000 Subject: can: sja1000: generic OF platform bus driver This patch adds a generic driver for SJA1000 chips on the OpenFirmware platform bus found on embedded PowerPC systems. You need a SJA1000 node definition in your flattened device tree source (DTS) file similar to: can@3,100 { compatible = "nxp,sja1000"; reg = <3 0x100 0x80>; interrupts = <2 0>; interrupt-parent = <&mpic>; nxp,external-clock-frequency = <16000000>; }; See also Documentation/powerpc/dts-bindings/can/sja1000.txt. CC: devicetree-discuss@ozlabs.org Signed-off-by: Wolfgang Grandegger Signed-off-by: David S. Miller --- Documentation/powerpc/dts-bindings/can/sja1000.txt | 53 ++++++++++++++++++++++ 1 file changed, 53 insertions(+) create mode 100644 Documentation/powerpc/dts-bindings/can/sja1000.txt (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/can/sja1000.txt b/Documentation/powerpc/dts-bindings/can/sja1000.txt new file mode 100644 index 0000000..d6d209d --- /dev/null +++ b/Documentation/powerpc/dts-bindings/can/sja1000.txt @@ -0,0 +1,53 @@ +Memory mapped SJA1000 CAN controller from NXP (formerly Philips) + +Required properties: + +- compatible : should be "nxp,sja1000". + +- reg : should specify the chip select, address offset and size required + to map the registers of the SJA1000. The size is usually 0x80. + +- interrupts: property with a value describing the interrupt source + (number and sensitivity) required for the SJA1000. + +Optional properties: + +- nxp,external-clock-frequency : Frequency of the external oscillator + clock in Hz. Note that the internal clock frequency used by the + SJA1000 is half of that value. If not specified, a default value + of 16000000 (16 MHz) is used. + +- nxp,tx-output-mode : operation mode of the TX output control logic: + <0x0> : bi-phase output mode + <0x1> : normal output mode (default) + <0x2> : test output mode + <0x3> : clock output mode + +- nxp,tx-output-config : TX output pin configuration: + <0x01> : TX0 invert + <0x02> : TX0 pull-down (default) + <0x04> : TX0 pull-up + <0x06> : TX0 push-pull + <0x08> : TX1 invert + <0x10> : TX1 pull-down + <0x20> : TX1 pull-up + <0x30> : TX1 push-pull + +- nxp,clock-out-frequency : clock frequency in Hz on the CLKOUT pin. + If not specified or if the specified value is 0, the CLKOUT pin + will be disabled. + +- nxp,no-comparator-bypass : Allows to disable the CAN input comperator. + +For futher information, please have a look to the SJA1000 data sheet. + +Examples: + +can@3,100 { + compatible = "nxp,sja1000"; + reg = <3 0x100 0x80>; + interrupts = <2 0>; + interrupt-parent = <&mpic>; + nxp,external-clock-frequency = <16000000>; +}; + -- cgit v1.1 From 56d417b12e57dfe11c9b7ba4bea3882c62a55815 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Mon, 1 Jun 2009 03:07:33 -0700 Subject: IPv6: Add 'autoconf' and 'disable_ipv6' module parameters Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module. The first controls if IPv6 addresses are autoconfigured from prefixes received in Router Advertisements. The IPv6 loopback (::1) and link-local addresses are still configured. The second controls if IPv6 addresses are desired at all. No IPv6 addresses will be added to any interfaces. Signed-off-by: Brian Haley Signed-off-by: David S. Miller --- Documentation/networking/ip-sysctl.txt | 7 +++++++ Documentation/networking/ipv6.txt | 37 ++++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 3ffd233..8be7623 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -1057,6 +1057,13 @@ disable_ipv6 - BOOLEAN address. Default: FALSE (enable IPv6 operation) + When this value is changed from 1 to 0 (IPv6 is being enabled), + it will dynamically create a link-local address on the given + interface and start Duplicate Address Detection, if necessary. + + When this value is changed from 0 to 1 (IPv6 is being disabled), + it will dynamically delete all address on the given interface. + accept_dad - INTEGER Whether to accept DAD (Duplicate Address Detection). 0: Disable DAD diff --git a/Documentation/networking/ipv6.txt b/Documentation/networking/ipv6.txt index 268e5c1..9fd7e21 100644 --- a/Documentation/networking/ipv6.txt +++ b/Documentation/networking/ipv6.txt @@ -33,3 +33,40 @@ disable A reboot is required to enable IPv6. +autoconf + + Specifies whether to enable IPv6 address autoconfiguration + on all interfaces. This might be used when one does not wish + for addresses to be automatically generated from prefixes + received in Router Advertisements. + + The possible values and their effects are: + + 0 + IPv6 address autoconfiguration is disabled on all interfaces. + + Only the IPv6 loopback address (::1) and link-local addresses + will be added to interfaces. + + 1 + IPv6 address autoconfiguration is enabled on all interfaces. + + This is the default value. + +disable_ipv6 + + Specifies whether to disable IPv6 on all interfaces. + This might be used when no IPv6 addresses are desired. + + The possible values and their effects are: + + 0 + IPv6 is enabled on all interfaces. + + This is the default value. + + 1 + IPv6 is disabled on all interfaces. + + No IPv6 addresses will be added to interfaces. + -- cgit v1.1 From 2af15d6a44b871ad4c2a651302374cde8f335480 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 28 May 2009 13:37:24 -0400 Subject: ftrace: add kernel command line function filtering When using ftrace=function on the command line to trace functions on boot up, one can not filter out functions that are commonly called. This patch adds two new ftrace command line commands. ftrace_notrace=function-list ftrace_filter=function-list Where function-list is a comma separated list of functions to filter. The ftrace_notrace will make the functions listed not be included in the function tracing, and ftrace_filter will only trace the functions listed. These two act the same as the debugfs/tracing/set_ftrace_notrace and debugfs/tracing/set_ftrace_filter respectively. The simple glob expressions that are allowed by the filter files can also be used by the command line interface. ftrace_notrace=rcu*,*lock,*spin* Will not trace any function that starts with rcu, ends with lock, or has the word spin in it. Note, if the self tests are enabled, they may interfere with the filtering set by the command lines. Signed-off-by: Steven Rostedt --- Documentation/kernel-parameters.txt | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 9243dd8..fcd3bfb 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -751,12 +751,25 @@ and is between 256 and 4096 characters. It is defined in the file ia64_pal_cache_flush instead of SAL_CACHE_FLUSH. ftrace=[tracer] - [ftrace] will set and start the specified tracer + [FTRACE] will set and start the specified tracer as early as possible in order to facilitate early boot debugging. ftrace_dump_on_oops - [ftrace] will dump the trace buffers on oops. + [FTRACE] will dump the trace buffers on oops. + + ftrace_filter=[function-list] + [FTRACE] Limit the functions traced by the function + tracer at boot up. function-list is a comma separated + list of functions. This list can be changed at run + time by the set_ftrace_filter file in the debugfs + tracing directory. + + ftrace_notrace=[function-list] + [FTRACE] Do not trace the functions specified in + function-list. This list can be changed at run time + by the set_ftrace_notrace file in the debugfs + tracing directory. gamecon.map[2|3]= [HW,JOY] Multisystem joystick and NES/SNES/PSX pad -- cgit v1.1 From 3b315d70b094e8b439358756a9084438fd7a71c2 Mon Sep 17 00:00:00 2001 From: Hector Martin Date: Tue, 2 Jun 2009 10:54:19 +0200 Subject: ALSA: hda - Acer Aspire 8930G support Short story: this laptop has 5.1 built-in speakers which you *really* want to use (the not-so-"sub" woofer is what makes the audio above average for a laptop), so 6-channel support is important (plus a decent asound.conf to upmix stereo). It also has the 3 typical jacks that ought to have a selectable mode. And it's based on ALC889, which sucks. Rationale/explanations: The const_channel_count stuff was added because, for a laptop like this, you always have 6 channels available (internal speakers) but still need to set the mode for the 3 external jacks. Therefore, the device always needs to be in 6-channel mode but there still needs to be a mixer control for the jack mode. You could use line/mic-in at the same time as the 6 internal speakers, for example. You might be tempted to make it even smarter by dynamically switching the max channel count when headphones are plugged in (therefore muting the internal speakers and reducing the physical channel count to the jack channel mode), but as a user I consider this to be harmful because I want the audio to blow up to 6 channels / upmixed as soon as I unplug the headphones, and having opened the device while in 2-channel mode would prevent this from working (and always making 6-channel mode available doesn't do any harm). The hardware needs EAPD turned on and the DACs routed to the internal speaker pins, so the patch adds those verbs. The ALC889 CLFE and subsequent (side/aux, here unused) DACs do NOT work by default, at least here. I wasted much time trying to talk to Realtek/pshou about this, but they just kept sending me useless updates to patch_realtek.c that did nothing relevant. In the end I gave up and brute forced the issue by trying to flip every bit in the proprietary coefficient registers, and eventually found the two magic registers that need to be cleared to enable all DACs. I have only heard Acer users complain, but that might be because ALC889 is pretty new and using 5.1 (and noticing the missing center/lfe channels) might not be that common. If this is a generalized issue with all ALC889 systems then those verbs should probably be moved to a common verb array. The internal mic is untested and probably doesn't work. These settings will probably work for other Acer Gemstone laptops with the same 5.1 speaker config. When identified, those should be added to the PCI subsystem ID list. Signed-off-by: Hector Martin Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 818d06f..29c6125 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -139,6 +139,7 @@ ALC883/888 acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc) acer-aspire Acer Aspire 9810 acer-aspire-4930g Acer Aspire 4930G + acer-aspire-8930g Acer Aspire 8930G medion Medion Laptops medion-md2 Medion MD2 targa-dig Targa/MSI -- cgit v1.1 From 7fe063268e73681cdca1a6496a25f93d3332f517 Mon Sep 17 00:00:00 2001 From: Andrew Patterson Date: Tue, 2 Jun 2009 14:48:39 +0200 Subject: cciss: add cciss driver sysfs entries Add sysfs entries to the cciss driver needed for the dm/multipath tools. A file for vendor, model, rev, and unique_id is added for each logical drive under directory /sys/bus/pci/devices//ccissX/cXdY. Where X = the controller (or host) number and Y is the logical drive number. A link from /sys/bus/pci/devices//ccissX/cXdY/block:cciss!cXdY to /sys/block/cciss!cXdY/device is also created. A bus is created in /sys/bus/cciss. A link is created from the pci ccissX entry to /sys/bus/cciss/devices/ccissX. Please consider this for inclusion. Signed-off-by: Mike Miller Cc: Stephen M. Cameron Signed-off-by: Jens Axboe --- .../ABI/testing/sysfs-bus-pci-devices-cciss | 33 ++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-bus-pci-devices-cciss (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss new file mode 100644 index 0000000..0a92a7c --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss @@ -0,0 +1,33 @@ +Where: /sys/bus/pci/devices//ccissX/cXdY/model +Date: March 2009 +Kernel Version: 2.6.30 +Contact: iss_storagedev@hp.com +Description: Displays the SCSI INQUIRY page 0 model for logical drive + Y of controller X. + +Where: /sys/bus/pci/devices//ccissX/cXdY/rev +Date: March 2009 +Kernel Version: 2.6.30 +Contact: iss_storagedev@hp.com +Description: Displays the SCSI INQUIRY page 0 revision for logical + drive Y of controller X. + +Where: /sys/bus/pci/devices//ccissX/cXdY/unique_id +Date: March 2009 +Kernel Version: 2.6.30 +Contact: iss_storagedev@hp.com +Description: Displays the SCSI INQUIRY page 83 serial number for logical + drive Y of controller X. + +Where: /sys/bus/pci/devices//ccissX/cXdY/vendor +Date: March 2009 +Kernel Version: 2.6.30 +Contact: iss_storagedev@hp.com +Description: Displays the SCSI INQUIRY page 0 vendor for logical drive + Y of controller X. + +Where: /sys/bus/pci/devices//ccissX/cXdY/block:cciss!cXdY +Date: March 2009 +Kernel Version: 2.6.30 +Contact: iss_storagedev@hp.com +Description: A symbolic link to /sys/block/cciss!cXdY -- cgit v1.1 From dbdc9dd342f0a7e32f40f0d4ade662bdfe057484 Mon Sep 17 00:00:00 2001 From: vibi sreenivasan Date: Tue, 2 Jun 2009 14:52:32 +0200 Subject: Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt File Documentation/PCI/PCI-DMA-mapping.txt does not exist. Documentation/DMA-mapping.txt contains DMA Mapping details Signed-off-by: vibi sreenivasan Signed-off-by: Jens Axboe --- Documentation/block/biodoc.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt index 6fab97e..8d2158a 100644 --- a/Documentation/block/biodoc.txt +++ b/Documentation/block/biodoc.txt @@ -186,7 +186,7 @@ a virtual address mapping (unlike the earlier scheme of virtual address do not have a corresponding kernel virtual address space mapping) and low-memory pages. -Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion +Note: Please refer to Documentation/DMA-mapping.txt for a discussion on PCI high mem DMA aspects and mapping of scatter gather lists, and support for 64 bit PCI. -- cgit v1.1 From 1745de5e5639457513fe43440f2800e23c3cbc7d Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Fri, 22 May 2009 21:49:51 +0200 Subject: dma-debug: add dma_debug_driver kernel command line This patch add the dma_debug_driver= boot parameter to enable the driver filter for early boot. Signed-off-by: Joerg Roedel --- Documentation/kernel-parameters.txt | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..b3f1314 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -646,6 +646,13 @@ and is between 256 and 4096 characters. It is defined in the file DMA-API debugging code disables itself because the architectural default is too low. + dma_debug_driver= + With this option the DMA-API debugging driver + filter feature can be enabled at boot time. Just + pass the driver to filter for as the parameter. + The filter can be disabled or changed to another + driver later using sysfs. + dscc4.setup= [NET] dtc3181e= [HW,SCSI] -- cgit v1.1 From 016ea6874a6d58df85b54f56997d26df13c307b2 Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Fri, 22 May 2009 21:57:23 +0200 Subject: dma-debug: add documentation for the driver filter This patch adds the driver filter feature to the dma-debug documentation. Signed-off-by: Joerg Roedel --- Documentation/DMA-API.txt | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index d9aa43d..25fb8bc 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -704,12 +704,24 @@ this directory the following files can currently be found: The current number of free dma_debug_entries in the allocator. + dma-api/driver-filter + You can write a name of a driver into this file + to limit the debug output to requests from that + particular driver. Write an empty string to + that file to disable the filter and see + all errors again. + If you have this code compiled into your kernel it will be enabled by default. If you want to boot without the bookkeeping anyway you can provide 'dma_debug=off' as a boot parameter. This will disable DMA-API debugging. Notice that you can not enable it again at runtime. You have to reboot to do so. +If you want to see debug messages only for a special device driver you can +specify the dma_debug_driver= parameter. This will enable the +driver filter at boot time. The debug code will only print errors for that +driver afterwards. This filter can be disabled or changed later using debugfs. + When the code disables itself at runtime this is most likely because it ran out of dma_debug_entries. These entries are preallocated at boot. The number of preallocated entries is defined per architecture. If it is too low for you -- cgit v1.1 From 85c7859190c4197a7c34066db14c25903c401187 Mon Sep 17 00:00:00 2001 From: Denis Karpov Date: Thu, 4 Jun 2009 02:34:22 +0900 Subject: FAT: add 'errors' mount option On severe errors FAT remounts itself in read-only mode. Allow to specify FAT fs desired behavior through 'errors' mount option: panic, continue or remount read-only. `mount -t [fat|vfat] -o errors=[panic,remount-ro,continue] \ ` This is analog to ext2 fs 'errors' mount option. Signed-off-by: Denis Karpov Signed-off-by: OGAWA Hirofumi --- Documentation/filesystems/vfat.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt index 3a5ddc9..7d41c0d 100644 --- a/Documentation/filesystems/vfat.txt +++ b/Documentation/filesystems/vfat.txt @@ -132,6 +132,11 @@ rodir -- FAT has the ATTR_RO (read-only) attribute. But on Windows, If you want to use ATTR_RO as read-only flag even for the directory, set this option. +errors=panic|continue|remount-ro + -- specify FAT behavior on critical errors: panic, continue + without doing anything or remount the partition in + read-only mode (default behavior). + : 0,1,yes,no,true,false TODO -- cgit v1.1 From 19d337dff95cbf76edd3ad95c0cee2732c3e1ec5 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 2 Jun 2009 13:01:37 +0200 Subject: rfkill: rewrite This patch completely rewrites the rfkill core to address the following deficiencies: * all rfkill drivers need to implement polling where necessary rather than having one central implementation * updating the rfkill state cannot be done from arbitrary contexts, forcing drivers to use schedule_work and requiring lots of code * rfkill drivers need to keep track of soft/hard blocked internally -- the core should do this * the rfkill API has many unexpected quirks, for example being asymmetric wrt. alloc/free and register/unregister * rfkill can call back into a driver from within a function the driver called -- this is prone to deadlocks and generally should be avoided * rfkill-input pointlessly is a separate module * drivers need to #ifdef rfkill functions (unless they want to depend on or select RFKILL) -- rfkill should provide inlines that do nothing if it isn't compiled in * the rfkill structure is not opaque -- drivers need to initialise it correctly (lots of sanity checking code required) -- instead force drivers to pass the right variables to rfkill_alloc() * the documentation is hard to read because it always assumes the reader is completely clueless and contains way TOO MANY CAPS * the rfkill code needlessly uses a lot of locks and atomic operations in locked sections * fix LED trigger to actually change the LED when the radio state changes -- this wasn't done before Tested-by: Alan Jenkins Signed-off-by: Henrique de Moraes Holschuh [thinkpad] Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/rfkill.txt | 597 +++++++---------------------------------------- 1 file changed, 78 insertions(+), 519 deletions(-) (limited to 'Documentation') diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index 40c3a3f..de941e3 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -1,571 +1,130 @@ -rfkill - RF switch subsystem support -==================================== +rfkill - RF kill switch support +=============================== -1 Introduction -2 Implementation details -3 Kernel driver guidelines -3.1 wireless device drivers -3.2 platform/switch drivers -3.3 input device drivers -4 Kernel API -5 Userspace support +1. Introduction +2. Implementation details +3. Kernel driver guidelines +4. Kernel API +5. Userspace support -1. Introduction: +1. Introduction -The rfkill switch subsystem exists to add a generic interface to circuitry that -can enable or disable the signal output of a wireless *transmitter* of any -type. By far, the most common use is to disable radio-frequency transmitters. +The rfkill subsystem provides a generic interface to disabling any radio +transmitter in the system. When a transmitter is blocked, it shall not +radiate any power. -Note that disabling the signal output means that the the transmitter is to be -made to not emit any energy when "blocked". rfkill is not about blocking data -transmissions, it is about blocking energy emission. +The subsystem also provides the ability to react on button presses and +disable all transmitters of a certain type (or all). This is intended for +situations where transmitters need to be turned off, for example on +aircraft. -The rfkill subsystem offers support for keys and switches often found on -laptops to enable wireless devices like WiFi and Bluetooth, so that these keys -and switches actually perform an action in all wireless devices of a given type -attached to the system. -The buttons to enable and disable the wireless transmitters are important in -situations where the user is for example using his laptop on a location where -radio-frequency transmitters _must_ be disabled (e.g. airplanes). -Because of this requirement, userspace support for the keys should not be made -mandatory. Because userspace might want to perform some additional smarter -tasks when the key is pressed, rfkill provides userspace the possibility to -take over the task to handle the key events. - -=============================================================================== -2: Implementation details +2. Implementation details The rfkill subsystem is composed of various components: the rfkill class, the rfkill-input module (an input layer handler), and some specific input layer events. -The rfkill class provides kernel drivers with an interface that allows them to -know when they should enable or disable a wireless network device transmitter. -This is enabled by the CONFIG_RFKILL Kconfig option. - -The rfkill class support makes sure userspace will be notified of all state -changes on rfkill devices through uevents. It provides a notification chain -for interested parties in the kernel to also get notified of rfkill state -changes in other drivers. It creates several sysfs entries which can be used -by userspace. See section "Userspace support". - -The rfkill-input module provides the kernel with the ability to implement a -basic response when the user presses a key or button (or toggles a switch) -related to rfkill functionality. It is an in-kernel implementation of default -policy of reacting to rfkill-related input events and neither mandatory nor -required for wireless drivers to operate. It is enabled by the -CONFIG_RFKILL_INPUT Kconfig option. - -rfkill-input is a rfkill-related events input layer handler. This handler will -listen to all rfkill key events and will change the rfkill state of the -wireless devices accordingly. With this option enabled userspace could either -do nothing or simply perform monitoring tasks. - -The rfkill-input module also provides EPO (emergency power-off) functionality -for all wireless transmitters. This function cannot be overridden, and it is -always active. rfkill EPO is related to *_RFKILL_ALL input layer events. - - -Important terms for the rfkill subsystem: - -In order to avoid confusion, we avoid the term "switch" in rfkill when it is -referring to an electronic control circuit that enables or disables a -transmitter. We reserve it for the physical device a human manipulates -(which is an input device, by the way): - -rfkill switch: - - A physical device a human manipulates. Its state can be perceived by - the kernel either directly (through a GPIO pin, ACPI GPE) or by its - effect on a rfkill line of a wireless device. - -rfkill controller: - - A hardware circuit that controls the state of a rfkill line, which a - kernel driver can interact with *to modify* that state (i.e. it has - either write-only or read/write access). - -rfkill line: - - An input channel (hardware or software) of a wireless device, which - causes a wireless transmitter to stop emitting energy (BLOCK) when it - is active. Point of view is extremely important here: rfkill lines are - always seen from the PoV of a wireless device (and its driver). - -soft rfkill line/software rfkill line: - - A rfkill line the wireless device driver can directly change the state - of. Related to rfkill_state RFKILL_STATE_SOFT_BLOCKED. - -hard rfkill line/hardware rfkill line: - - A rfkill line that works fully in hardware or firmware, and that cannot - be overridden by the kernel driver. The hardware device or the - firmware just exports its status to the driver, but it is read-only. - Related to rfkill_state RFKILL_STATE_HARD_BLOCKED. - -The enum rfkill_state describes the rfkill state of a transmitter: - -When a rfkill line or rfkill controller is in the RFKILL_STATE_UNBLOCKED state, -the wireless transmitter (radio TX circuit for example) is *enabled*. When the -it is in the RFKILL_STATE_SOFT_BLOCKED or RFKILL_STATE_HARD_BLOCKED, the -wireless transmitter is to be *blocked* from operating. - -RFKILL_STATE_SOFT_BLOCKED indicates that a call to toggle_radio() can change -that state. RFKILL_STATE_HARD_BLOCKED indicates that a call to toggle_radio() -will not be able to change the state and will return with a suitable error if -attempts are made to set the state to RFKILL_STATE_UNBLOCKED. - -RFKILL_STATE_HARD_BLOCKED is used by drivers to signal that the device is -locked in the BLOCKED state by a hardwire rfkill line (typically an input pin -that, when active, forces the transmitter to be disabled) which the driver -CANNOT override. - -Full rfkill functionality requires two different subsystems to cooperate: the -input layer and the rfkill class. The input layer issues *commands* to the -entire system requesting that devices registered to the rfkill class change -state. The way this interaction happens is not complex, but it is not obvious -either: - -Kernel Input layer: - - * Generates KEY_WWAN, KEY_WLAN, KEY_BLUETOOTH, SW_RFKILL_ALL, and - other such events when the user presses certain keys, buttons, or - toggles certain physical switches. - - THE INPUT LAYER IS NEVER USED TO PROPAGATE STATUS, NOTIFICATIONS OR THE - KIND OF STUFF AN ON-SCREEN-DISPLAY APPLICATION WOULD REPORT. It is - used to issue *commands* for the system to change behaviour, and these - commands may or may not be carried out by some kernel driver or - userspace application. It follows that doing user feedback based only - on input events is broken, as there is no guarantee that an input event - will be acted upon. - - Most wireless communication device drivers implementing rfkill - functionality MUST NOT generate these events, and have no reason to - register themselves with the input layer. Doing otherwise is a common - misconception. There is an API to propagate rfkill status change - information, and it is NOT the input layer. - -rfkill class: - - * Calls a hook in a driver to effectively change the wireless - transmitter state; - * Keeps track of the wireless transmitter state (with help from - the driver); - * Generates userspace notifications (uevents) and a call to a - notification chain (kernel) when there is a wireless transmitter - state change; - * Connects a wireless communications driver with the common rfkill - control system, which, for example, allows actions such as - "switch all bluetooth devices offline" to be carried out by - userspace or by rfkill-input. - - THE RFKILL CLASS NEVER ISSUES INPUT EVENTS. THE RFKILL CLASS DOES - NOT LISTEN TO INPUT EVENTS. NO DRIVER USING THE RFKILL CLASS SHALL - EVER LISTEN TO, OR ACT ON RFKILL INPUT EVENTS. Doing otherwise is - a layering violation. - - Most wireless data communication drivers in the kernel have just to - implement the rfkill class API to work properly. Interfacing to the - input layer is not often required (and is very often a *bug*) on - wireless drivers. - - Platform drivers often have to attach to the input layer to *issue* - (but never to listen to) rfkill events for rfkill switches, and also to - the rfkill class to export a control interface for the platform rfkill - controllers to the rfkill subsystem. This does NOT mean the rfkill - switch is attached to a rfkill class (doing so is almost always wrong). - It just means the same kernel module is the driver for different - devices (rfkill switches and rfkill controllers). - - -Userspace input handlers (uevents) or kernel input handlers (rfkill-input): - - * Implements the policy of what should happen when one of the input - layer events related to rfkill operation is received. - * Uses the sysfs interface (userspace) or private rfkill API calls - to tell the devices registered with the rfkill class to change - their state (i.e. translates the input layer event into real - action). - - * rfkill-input implements EPO by handling EV_SW SW_RFKILL_ALL 0 - (power off all transmitters) in a special way: it ignores any - overrides and local state cache and forces all transmitters to the - RFKILL_STATE_SOFT_BLOCKED state (including those which are already - supposed to be BLOCKED). - * rfkill EPO will remain active until rfkill-input receives an - EV_SW SW_RFKILL_ALL 1 event. While the EPO is active, transmitters - are locked in the blocked state (rfkill will refuse to unblock them). - * rfkill-input implements different policies that the user can - select for handling EV_SW SW_RFKILL_ALL 1. It will unlock rfkill, - and either do nothing (leave transmitters blocked, but now unlocked), - restore the transmitters to their state before the EPO, or unblock - them all. - -Userspace uevent handler or kernel platform-specific drivers hooked to the -rfkill notifier chain: - - * Taps into the rfkill notifier chain or to KOBJ_CHANGE uevents, - in order to know when a device that is registered with the rfkill - class changes state; - * Issues feedback notifications to the user; - * In the rare platforms where this is required, synthesizes an input - event to command all *OTHER* rfkill devices to also change their - statues when a specific rfkill device changes state. - - -=============================================================================== -3: Kernel driver guidelines - -Remember: point-of-view is everything for a driver that connects to the rfkill -subsystem. All the details below must be measured/perceived from the point of -view of the specific driver being modified. - -The first thing one needs to know is whether his driver should be talking to -the rfkill class or to the input layer. In rare cases (platform drivers), it -could happen that you need to do both, as platform drivers often handle a -variety of devices in the same driver. - -Do not mistake input devices for rfkill controllers. The only type of "rfkill -switch" device that is to be registered with the rfkill class are those -directly controlling the circuits that cause a wireless transmitter to stop -working (or the software equivalent of them), i.e. what we call a rfkill -controller. Every other kind of "rfkill switch" is just an input device and -MUST NOT be registered with the rfkill class. - -A driver should register a device with the rfkill class when ALL of the -following conditions are met (they define a rfkill controller): - -1. The device is/controls a data communications wireless transmitter; - -2. The kernel can interact with the hardware/firmware to CHANGE the wireless - transmitter state (block/unblock TX operation); - -3. The transmitter can be made to not emit any energy when "blocked": - rfkill is not about blocking data transmissions, it is about blocking - energy emission; - -A driver should register a device with the input subsystem to issue -rfkill-related events (KEY_WLAN, KEY_BLUETOOTH, KEY_WWAN, KEY_WIMAX, -SW_RFKILL_ALL, etc) when ALL of the folowing conditions are met: - -1. It is directly related to some physical device the user interacts with, to - command the O.S./firmware/hardware to enable/disable a data communications - wireless transmitter. - - Examples of the physical device are: buttons, keys and switches the user - will press/touch/slide/switch to enable or disable the wireless - communication device. - -2. It is NOT slaved to another device, i.e. there is no other device that - issues rfkill-related input events in preference to this one. - - Please refer to the corner cases and examples section for more details. - -When in doubt, do not issue input events. For drivers that should generate -input events in some platforms, but not in others (e.g. b43), the best solution -is to NEVER generate input events in the first place. That work should be -deferred to a platform-specific kernel module (which will know when to generate -events through the rfkill notifier chain) or to userspace. This avoids the -usual maintenance problems with DMI whitelisting. - - -Corner cases and examples: -==================================== - -1. If the device is an input device that, because of hardware or firmware, -causes wireless transmitters to be blocked regardless of the kernel's will, it -is still just an input device, and NOT to be registered with the rfkill class. - -2. If the wireless transmitter switch control is read-only, it is an input -device and not to be registered with the rfkill class (and maybe not to be made -an input layer event source either, see below). - -3. If there is some other device driver *closer* to the actual hardware the -user interacted with (the button/switch/key) to issue an input event, THAT is -the device driver that should be issuing input events. - -E.g: - [RFKILL slider switch] -- [GPIO hardware] -- [WLAN card rf-kill input] - (platform driver) (wireless card driver) - -The user is closer to the RFKILL slide switch plaform driver, so the driver -which must issue input events is the platform driver looking at the GPIO -hardware, and NEVER the wireless card driver (which is just a slave). It is -very likely that there are other leaves than just the WLAN card rf-kill input -(e.g. a bluetooth card, etc)... - -On the other hand, some embedded devices do this: - - [RFKILL slider switch] -- [WLAN card rf-kill input] - (wireless card driver) - -In this situation, the wireless card driver *could* register itself as an input -device and issue rf-kill related input events... but in order to AVOID the need -for DMI whitelisting, the wireless card driver does NOT do it. Userspace (HAL) -or a platform driver (that exists only on these embedded devices) will do the -dirty job of issuing the input events. - - -COMMON MISTAKES in kernel drivers, related to rfkill: -==================================== - -1. NEVER confuse input device keys and buttons with input device switches. - - 1a. Switches are always set or reset. They report the current state - (on position or off position). - - 1b. Keys and buttons are either in the pressed or not-pressed state, and - that's it. A "button" that latches down when you press it, and - unlatches when you press it again is in fact a switch as far as input - devices go. - -Add the SW_* events you need for switches, do NOT try to emulate a button using -KEY_* events just because there is no such SW_* event yet. Do NOT try to use, -for example, KEY_BLUETOOTH when you should be using SW_BLUETOOTH instead. - -2. Input device switches (sources of EV_SW events) DO store their current state -(so you *must* initialize it by issuing a gratuitous input layer event on -driver start-up and also when resuming from sleep), and that state CAN be -queried from userspace through IOCTLs. There is no sysfs interface for this, -but that doesn't mean you should break things trying to hook it to the rfkill -class to get a sysfs interface :-) - -3. Do not issue *_RFKILL_ALL events by default, unless you are sure it is the -correct event for your switch/button. These events are emergency power-off -events when they are trying to turn the transmitters off. An example of an -input device which SHOULD generate *_RFKILL_ALL events is the wireless-kill -switch in a laptop which is NOT a hotkey, but a real sliding/rocker switch. -An example of an input device which SHOULD NOT generate *_RFKILL_ALL events by -default, is any sort of hot key that is type-specific (e.g. the one for WLAN). - - -3.1 Guidelines for wireless device drivers ------------------------------------------- - -(in this text, rfkill->foo means the foo field of struct rfkill). - -1. Each independent transmitter in a wireless device (usually there is only one -transmitter per device) should have a SINGLE rfkill class attached to it. - -2. If the device does not have any sort of hardware assistance to allow the -driver to rfkill the device, the driver should emulate it by taking all actions -required to silence the transmitter. - -3. If it is impossible to silence the transmitter (i.e. it still emits energy, -even if it is just in brief pulses, when there is no data to transmit and there -is no hardware support to turn it off) do NOT lie to the users. Do not attach -it to a rfkill class. The rfkill subsystem does not deal with data -transmission, it deals with energy emission. If the transmitter is emitting -energy, it is not blocked in rfkill terms. - -4. It doesn't matter if the device has multiple rfkill input lines affecting -the same transmitter, their combined state is to be exported as a single state -per transmitter (see rule 1). - -This rule exists because users of the rfkill subsystem expect to get (and set, -when possible) the overall transmitter rfkill state, not of a particular rfkill -line. +The rfkill class is provided for kernel drivers to register their radio +transmitter with the kernel, provide methods for turning it on and off and, +optionally, letting the system know about hardware-disabled states that may +be implemented on the device. This code is enabled with the CONFIG_RFKILL +Kconfig option, which drivers can "select". -5. The wireless device driver MUST NOT leave the transmitter enabled during -suspend and hibernation unless: +The rfkill class code also notifies userspace of state changes, this is +achieved via uevents. It also provides some sysfs files for userspace to +check the status of radio transmitters. See the "Userspace support" section +below. - 5.1. The transmitter has to be enabled for some sort of functionality - like wake-on-wireless-packet or autonomous packed forwarding in a mesh - network, and that functionality is enabled for this suspend/hibernation - cycle. -AND +The rfkill-input code implements a basic response to rfkill buttons -- it +implements turning on/off all devices of a certain class (or all). - 5.2. The device was not on a user-requested BLOCKED state before - the suspend (i.e. the driver must NOT unblock a device, not even - to support wake-on-wireless-packet or remain in the mesh). +When the device is hard-blocked (either by a call to rfkill_set_hw_state() +or from query_hw_block) set_block() will be invoked but drivers can well +ignore the method call since they can use the return value of the function +rfkill_set_hw_state() to sync the software state instead of keeping track +of calls to set_block(). -In other words, there is absolutely no allowed scenario where a driver can -automatically take action to unblock a rfkill controller (obviously, this deals -with scenarios where soft-blocking or both soft and hard blocking is happening. -Scenarios where hardware rfkill lines are the only ones blocking the -transmitter are outside of this rule, since the wireless device driver does not -control its input hardware rfkill lines in the first place). -6. During resume, rfkill will try to restore its previous state. +The entire functionality is spread over more than one subsystem: -7. After a rfkill class is suspended, it will *not* call rfkill->toggle_radio -until it is resumed. + * The kernel input layer generates KEY_WWAN, KEY_WLAN etc. and + SW_RFKILL_ALL -- when the user presses a button. Drivers for radio + transmitters generally do not register to the input layer, unless the + device really provides an input device (i.e. a button that has no + effect other than generating a button press event) + * The rfkill-input code hooks up to these events and switches the soft-block + of the various radio transmitters, depending on the button type. -Example of a WLAN wireless driver connected to the rfkill subsystem: --------------------------------------------------------------------- + * The rfkill drivers turn off/on their transmitters as requested. -A certain WLAN card has one input pin that causes it to block the transmitter -and makes the status of that input pin available (only for reading!) to the -kernel driver. This is a hard rfkill input line (it cannot be overridden by -the kernel driver). + * The rfkill class will generate userspace notifications (uevents) to tell + userspace what the current state is. -The card also has one PCI register that, if manipulated by the driver, causes -it to block the transmitter. This is a soft rfkill input line. -It has also a thermal protection circuitry that shuts down its transmitter if -the card overheats, and makes the status of that protection available (only for -reading!) to the kernel driver. This is also a hard rfkill input line. -If either one of these rfkill lines are active, the transmitter is blocked by -the hardware and forced offline. +3. Kernel driver guidelines -The driver should allocate and attach to its struct device *ONE* instance of -the rfkill class (there is only one transmitter). -It can implement the get_state() hook, and return RFKILL_STATE_HARD_BLOCKED if -either one of its two hard rfkill input lines are active. If the two hard -rfkill lines are inactive, it must return RFKILL_STATE_SOFT_BLOCKED if its soft -rfkill input line is active. Only if none of the rfkill input lines are -active, will it return RFKILL_STATE_UNBLOCKED. +Drivers for radio transmitters normally implement only the rfkill class. +These drivers may not unblock the transmitter based on own decisions, they +should act on information provided by the rfkill class only. -Since the device has a hardware rfkill line, it IS subject to state changes -external to rfkill. Therefore, the driver must make sure that it calls -rfkill_force_state() to keep the status always up-to-date, and it must do a -rfkill_force_state() on resume from sleep. +Platform drivers might implement input devices if the rfkill button is just +that, a button. If that button influences the hardware then you need to +implement an rfkill class instead. This also applies if the platform provides +a way to turn on/off the transmitter(s). -Every time the driver gets a notification from the card that one of its rfkill -lines changed state (polling might be needed on badly designed cards that don't -generate interrupts for such events), it recomputes the rfkill state as per -above, and calls rfkill_force_state() to update it. +During suspend/hibernation, transmitters should only be left enabled when +wake-on wlan or similar functionality requires it and the device wasn't +blocked before suspend/hibernate. Note that it may be necessary to update +the rfkill subsystem's idea of what the current state is at resume time if +the state may have changed over suspend. -The driver should implement the toggle_radio() hook, that: -1. Returns an error if one of the hardware rfkill lines are active, and the -caller asked for RFKILL_STATE_UNBLOCKED. -2. Activates the soft rfkill line if the caller asked for state -RFKILL_STATE_SOFT_BLOCKED. It should do this even if one of the hard rfkill -lines are active, effectively double-blocking the transmitter. - -3. Deactivates the soft rfkill line if none of the hardware rfkill lines are -active and the caller asked for RFKILL_STATE_UNBLOCKED. - -=============================================================================== -4: Kernel API +4. Kernel API To build a driver with rfkill subsystem support, the driver should depend on -(or select) the Kconfig symbol RFKILL; it should _not_ depend on RKFILL_INPUT. +(or select) the Kconfig symbol RFKILL. The hardware the driver talks to may be write-only (where the current state of the hardware is unknown), or read-write (where the hardware can be queried about its current state). -The rfkill class will call the get_state hook of a device every time it needs -to know the *real* current state of the hardware. This can happen often, but -it does not do any polling, so it is not enough on hardware that is subject -to state changes outside of the rfkill subsystem. - -Therefore, calling rfkill_force_state() when a state change happens is -mandatory when the device has a hardware rfkill line, or when something else -like the firmware could cause its state to be changed without going through the -rfkill class. - -Some hardware provides events when its status changes. In these cases, it is -best for the driver to not provide a get_state hook, and instead register the -rfkill class *already* with the correct status, and keep it updated using -rfkill_force_state() when it gets an event from the hardware. - -rfkill_force_state() must be used on the device resume handlers to update the -rfkill status, should there be any chance of the device status changing during -the sleep. - -There is no provision for a statically-allocated rfkill struct. You must -use rfkill_allocate() to allocate one. - -You should: - - rfkill_allocate() - - modify rfkill fields (flags, name) - - modify state to the current hardware state (THIS IS THE ONLY TIME - YOU CAN ACCESS state DIRECTLY) - - rfkill_register() +Calling rfkill_set_hw_state() when a state change happens is required from +rfkill drivers that control devices that can be hard-blocked unless they also +assign the poll_hw_block() callback (then the rfkill core will poll the +device). Don't do this unless you cannot get the event in any other way. -The only way to set a device to the RFKILL_STATE_HARD_BLOCKED state is through -a suitable return of get_state() or through rfkill_force_state(). -When a device is in the RFKILL_STATE_HARD_BLOCKED state, the only way to switch -it to a different state is through a suitable return of get_state() or through -rfkill_force_state(). -If toggle_radio() is called to set a device to state RFKILL_STATE_SOFT_BLOCKED -when that device is already at the RFKILL_STATE_HARD_BLOCKED state, it should -not return an error. Instead, it should try to double-block the transmitter, -so that its state will change from RFKILL_STATE_HARD_BLOCKED to -RFKILL_STATE_SOFT_BLOCKED should the hardware blocking cease. - -Please refer to the source for more documentation. - -=============================================================================== -5: Userspace support - -rfkill devices issue uevents (with an action of "change"), with the following -environment variables set: - -RFKILL_NAME -RFKILL_STATE -RFKILL_TYPE +5. Userspace support -The ABI for these variables is defined by the sysfs attributes. It is best -to take a quick look at the source to make sure of the possible values. - -It is expected that HAL will trap those, and bridge them to DBUS, etc. These -events CAN and SHOULD be used to give feedback to the user about the rfkill -status of the system. - -Input devices may issue events that are related to rfkill. These are the -various KEY_* events and SW_* events supported by rfkill-input.c. - -Userspace may not change the state of an rfkill switch in response to an -input event, it should refrain from changing states entirely. - -Userspace cannot assume it is the only source of control for rfkill switches. -Their state can change due to firmware actions, direct user actions, and the -rfkill-input EPO override for *_RFKILL_ALL. - -When rfkill-input is not active, userspace must initiate a rfkill status -change by writing to the "state" attribute in order for anything to happen. - -Take particular care to implement EV_SW SW_RFKILL_ALL properly. When that -switch is set to OFF, *every* rfkill device *MUST* be immediately put into the -RFKILL_STATE_SOFT_BLOCKED state, no questions asked. - -The following sysfs entries will be created: +The following sysfs entries exist for every rfkill device: name: Name assigned by driver to this key (interface or driver name). type: Name of the key type ("wlan", "bluetooth", etc). state: Current state of the transmitter 0: RFKILL_STATE_SOFT_BLOCKED - transmitter is forced off, but one can override it - by a write to the state attribute; + transmitter is turned off by software 1: RFKILL_STATE_UNBLOCKED - transmiter is NOT forced off, and may operate if - all other conditions for such operation are met - (such as interface is up and configured, etc); + transmiter is (potentially) active 2: RFKILL_STATE_HARD_BLOCKED transmitter is forced off by something outside of - the driver's control. One cannot set a device to - this state through writes to the state attribute; - claim: 1: Userspace handles events, 0: Kernel handles events - -Both the "state" and "claim" entries are also writable. For the "state" entry -this means that when 1 or 0 is written, the device rfkill state (if not yet in -the requested state), will be will be toggled accordingly. - -For the "claim" entry writing 1 to it means that the kernel no longer handles -key events even though RFKILL_INPUT input was enabled. When "claim" has been -set to 0, userspace should make sure that it listens for the input events or -check the sysfs "state" entry regularly to correctly perform the required tasks -when the rkfill key is pressed. - -A note about input devices and EV_SW events: - -In order to know the current state of an input device switch (like -SW_RFKILL_ALL), you will need to use an IOCTL. That information is not -available through sysfs in a generic way at this time, and it is not available -through the rfkill class AT ALL. + the driver's control. + claim: 0: Kernel handles events (currently always reads that value) + +rfkill devices also issue uevents (with an action of "change"), with the +following environment variables set: + +RFKILL_NAME +RFKILL_STATE +RFKILL_TYPE + +The contents of these variables corresponds to the "name", "state" and +"type" sysfs files explained above. -- cgit v1.1 From c64fb01627e24725d1f9d535e4426475a4415753 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 2 Jun 2009 13:01:38 +0200 Subject: rfkill: create useful userspace interface The new code added by this patch will make rfkill create a misc character device /dev/rfkill that userspace can use to control rfkill soft blocks and get status of devices as well as events when the status changes. Using it is very simple -- when you open it you can read a number of times to get the initial state, and every further read blocks (you can poll) on getting the next event from the kernel. The same structure you read is also used when writing to it to change the soft block of a given device, all devices of a given type, or all devices. This also makes CONFIG_RFKILL_INPUT selectable again in order to be able to test without it present since its functionality can now be replaced by userspace entirely and distros and users may not want the input part of rfkill interfering with their userspace code. We will also write a userspace daemon to handle all that and consequently add the input code to the feature removal schedule. In order to have rfkilld support both kernels with and without CONFIG_RFKILL_INPUT (or new kernels after its eventual removal) we also add an ioctl (that only exists if rfkill-input is present) to disable rfkill-input. It is not very efficient, but at least gives the correct behaviour in all cases. Signed-off-by: Johannes Berg Acked-by: Marcel Holtmann Signed-off-by: John W. Linville --- Documentation/feature-removal-schedule.txt | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index de491a3..edb2f0b 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -437,3 +437,10 @@ Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate driver but this caused driver conflicts. Who: Jean Delvare Krzysztof Helt + +--------------------------- + +What: CONFIG_RFKILL_INPUT +When: 2.6.33 +Why: Should be implemented in userspace, policy daemon. +Who: Johannes Berg -- cgit v1.1 From f71fea23a27ba8ec53375832aab6a80fc14622e0 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Wed, 3 Jun 2009 10:17:59 +0200 Subject: rfkill: document /dev/rfkill Add some blurb about /dev/rfkill to the documentation and fix the "transmiter" spelling error. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/rfkill.txt | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index de941e3..1b74b5f 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -113,7 +113,7 @@ The following sysfs entries exist for every rfkill device: 0: RFKILL_STATE_SOFT_BLOCKED transmitter is turned off by software 1: RFKILL_STATE_UNBLOCKED - transmiter is (potentially) active + transmitter is (potentially) active 2: RFKILL_STATE_HARD_BLOCKED transmitter is forced off by something outside of the driver's control. @@ -128,3 +128,9 @@ RFKILL_TYPE The contents of these variables corresponds to the "name", "state" and "type" sysfs files explained above. + +An alternative userspace interface exists as a misc device /dev/rfkill, +which allows userspace to obtain and set the state of rfkill devices and +sets of devices. It also notifies userspace about device addition and +removal. The API is a simple read/write API that is defined in +linux/rfkill.h. -- cgit v1.1 From 3c0797925f4ef9d55a32059d2af61a9c262e639d Mon Sep 17 00:00:00 2001 From: Andi Kleen Date: Wed, 27 May 2009 21:56:55 +0200 Subject: x86, mce: switch x86 machine check handler to Monarch election. On Intel platforms machine check exceptions are always broadcast to all CPUs. This patch makes the machine check handler synchronize all these machine checks, elect a Monarch to handle the event and collect the worst event from all CPUs and then process it first. This has some advantages: - When there is a truly data corrupting error the system panics as quickly as possible. This improves containment of corrupted data and makes sure the corrupted data never hits stable storage. - The panics are synchronized and do not reenter the panic code on multiple CPUs (which currently does not handle this well). - All the errors are reported. Currently it often happens that another CPU happens to do the panic first, but reports useless information (empty machine check) because the real error happened on another CPU which came in later. This is a big advantage on Nehalem where the 8 threads per CPU lead to often the wrong CPU winning the race and dumping useless information on a machine check. The problem also occurs in a less severe form on older CPUs. - The system can detect when no CPUs detected a machine check and shut down the system. This can happen when one CPU is so badly hung that that it cannot process a machine check anymore or when some external agent wants to stop the system by asserting the machine check pin. This follows Intel hardware recommendations. - This matches the recommended error model by the CPU designers. - The events can be output in true severity order - When a panic happens on another CPU it makes sure to be actually be able to process the stop IPI by enabling interrupts. The code is extremly careful to handle timeouts while waiting for other CPUs. It can't rely on the normal timing mechanisms (jiffies, ktime_get) because of its asynchronous/lockless nature, so it uses own timeouts using ndelay() and a "SPINUNIT" The timeout is configurable. By default it waits for upto one second for the other CPUs. This can be also disabled. From some informal testing AMD systems do not see to broadcast machine checks, so right now it's always disabled by default on non Intel CPUs or also on very old Intel systems. Includes fixes from Ying Huang Fixed a "ecception" in a comment (H.Seto) Moved global_nwo reset later based on suggestion from H.Seto v2: Avoid duplicate messages [ Impact: feature, fixes long standing problems. ] Signed-off-by: Andi Kleen Signed-off-by: Hidetoshi Seto Signed-off-by: H. Peter Anvin --- Documentation/x86/x86_64/boot-options.txt | 6 +++++- Documentation/x86/x86_64/machinecheck | 4 ++++ 2 files changed, 9 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 63fca71..0ee5e3b 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -15,13 +15,17 @@ Machine check in a reboot. On Intel systems it is enabled by default. mce=nobootlog Disable boot machine check logging. - mce=tolerancelevel (number) + mce=tolerancelevel[,monarchtimeout] (number,number) + tolerance levels: 0: always panic on uncorrected errors, log corrected errors 1: panic or SIGBUS on uncorrected errors, log corrected errors 2: SIGBUS or log uncorrected errors, log corrected errors 3: never panic or SIGBUS, log all errors (for testing only) Default is 1 Can be also set using sysfs which is preferable. + monarchtimeout: + Sets the time in us to wait for other CPUs on machine checks. 0 + to disable. nomce (for compatibility with i386): same as mce=off diff --git a/Documentation/x86/x86_64/machinecheck b/Documentation/x86/x86_64/machinecheck index a4fdb25..b1fb302 100644 --- a/Documentation/x86/x86_64/machinecheck +++ b/Documentation/x86/x86_64/machinecheck @@ -69,6 +69,10 @@ trigger Program to run when a machine check event is detected. This is an alternative to running mcelog regularly from cron and allows to detect events faster. +monarch_timeout + How long to wait for the other CPUs to machine check too on a + exception. 0 to disable waiting for other CPUs. + Unit: us TBD document entries for AMD threshold interrupt configuration -- cgit v1.1 From bbb0a4247aaf1eabbd6d87750eafe99c577920f7 Mon Sep 17 00:00:00 2001 From: Jonathan Corbet Date: Fri, 16 Jan 2009 09:49:50 -0700 Subject: Document Reported-by in SubmittingPatches Randy pointed out that the Reported-By tag should be documented with the others in SubmittingPatches. Reported-by: Randy Dunlap Signed-off-by: Jonathan Corbet --- Documentation/SubmittingPatches | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index f309d3c..6f29e29 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -405,7 +405,14 @@ person it names. This tag documents that potentially interested parties have been included in the discussion -14) Using Tested-by: and Reviewed-by: +14) Using Reported-by:, Tested-by: and Reviewed-by: + +If this patch fixes a problem reported by somebody else, consider adding a +Reported-by: tag to credit the reporter for their contribution. Please +note that this tag should not be added without the reporter's permission, +especially if the problem was not reported in a public forum. That said, +if we diligently credit our bug reporters, they will, hopefully, be +inspired to help us again in the future. A Tested-by: tag indicates that the patch has been successfully tested (in some environment) by the person named. This tag informs maintainers that -- cgit v1.1 From 5d98932ab0acb699dc56d9e252f056b9b2cdab25 Mon Sep 17 00:00:00 2001 From: Jonathan Corbet Date: Tue, 21 Apr 2009 13:33:06 -0600 Subject: docs: Encourage better changelogs in the development process document Add a couple of paragraphs to the "patch formatting" section on how patches should be described. This text is shamelessly cribbed from suggestions posted by Rusty Russell. Signed-off-by: Jonathan Corbet --- Documentation/development-process/5.Posting | 31 ++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/development-process/5.Posting b/Documentation/development-process/5.Posting index dd48132..f622c1e 100644 --- a/Documentation/development-process/5.Posting +++ b/Documentation/development-process/5.Posting @@ -119,7 +119,7 @@ which takes quite a bit of time and thought after the "real work" has been done. When done properly, though, it is time well spent. -5.4: PATCH FORMATTING +5.4: PATCH FORMATTING AND CHANGELOGS So now you have a perfect series of patches for posting, but the work is not done quite yet. Each patch needs to be formatted into a message which @@ -146,8 +146,33 @@ that end, each patch will be composed of the following: - One or more tag lines, with, at a minimum, one Signed-off-by: line from the author of the patch. Tags will be described in more detail below. -The above three items should, normally, be the text used when committing -the change to a revision control system. They are followed by: +The items above, together, form the changelog for the patch. Writing good +changelogs is a crucial but often-neglected art; it's worth spending +another moment discussing this issue. When writing a changelog, you should +bear in mind that a number of different people will be reading your words. +These include subsystem maintainers and reviewers who need to decide +whether the patch should be included, distributors and other maintainers +trying to decide whether a patch should be backported to other kernels, bug +hunters wondering whether the patch is responsible for a problem they are +chasing, users who want to know how the kernel has changed, and more. A +good changelog conveys the needed information to all of these people in the +most direct and concise way possible. + +To that end, the summary line should describe the effects of and motivation +for the change as well as possible given the one-line constraint. The +detailed description can then amplify on those topics and provide any +needed additional information. If the patch fixes a bug, cite the commit +which introduced the bug if possible. If a problem is associated with +specific log or compiler output, include that output to help others +searching for a solution to the same problem. If the change is meant to +support other changes coming in later patch, say so. If internal APIs are +changed, detail those changes and how other developers should respond. In +general, the more you can put yourself into the shoes of everybody who will +be reading your changelog, the better that changelog (and the kernel as a +whole) will be. + +Needless to say, the changelog should be the text used when committing the +change to a revision control system. It will be followed by: - The patch itself, in the unified ("-u") patch format. Using the "-p" option to diff will associate function names with changes, making the -- cgit v1.1 From 5801da1b2f1207da21271ffd6768cd40a6c7f1c4 Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Thu, 4 Jun 2009 16:26:50 +0200 Subject: SubmittingPatches: fix typo Fix typo. Signed-off-by: Pavel Machek Signed-off-by: Jonathan Corbet --- Documentation/SubmittingPatches | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 6f29e29..71b6da2 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -451,7 +451,7 @@ offer a Reviewed-by tag for a patch. This tag serves to give credit to reviewers and to inform maintainers of the degree of review which has been done on the patch. Reviewed-by: tags, when supplied by reviewers known to understand the subject area and to perform thorough reviews, will normally -increase the liklihood of your patch getting into the kernel. +increase the likelihood of your patch getting into the kernel. 15) The canonical patch format -- cgit v1.1 From 2ae19acaa50a09c1099956efb895c0aca74ab050 Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Thu, 16 Apr 2009 07:44:45 -0400 Subject: Documentation: Add "how to write a good patch summary" to SubmittingPatches Unfortunately many patch submissions are arriving with painfully poor patch descriptions. As a result of the discussion on LKML: http://lkml.org/lkml/2009/4/15/296 explain how to submit a better patch description, in the (perhaps vain) hope that maintainers won't end up having to rewrite the git commit logs as often as they do today. Signed-off-by: "Theodore Ts'o" Cc: Ingo Molnar Signed-off-by: Jonathan Corbet --- Documentation/SubmittingPatches | 65 ++++++++++++++++++++++++++++++++--------- 1 file changed, 51 insertions(+), 14 deletions(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 71b6da2..6c45683 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -91,6 +91,10 @@ Be as specific as possible. The WORST descriptions possible include things like "update driver X", "bug fix for driver X", or "this patch includes updates for subsystem X. Please apply." +The maintainer will thank you if you write your patch description in a +form which can be easily pulled into Linux's source code management +system, git, as a "commit log". See #15, below. + If your description starts to get long, that's a sign that you probably need to split up your patch. See #3, next. @@ -492,12 +496,33 @@ phrase" should not be a filename. Do not use the same "summary phrase" for every patch in a whole patch series (where a "patch series" is an ordered sequence of multiple, related patches). -Bear in mind that the "summary phrase" of your email becomes -a globally-unique identifier for that patch. It propagates -all the way into the git changelog. The "summary phrase" may -later be used in developer discussions which refer to the patch. -People will want to google for the "summary phrase" to read -discussion regarding that patch. +Bear in mind that the "summary phrase" of your email becomes a +globally-unique identifier for that patch. It propagates all the way +into the git changelog. The "summary phrase" may later be used in +developer discussions which refer to the patch. People will want to +google for the "summary phrase" to read discussion regarding that +patch. It will also be the only thing that people may quickly see +when, two or three months later, they are going through perhaps +thousands of patches using tools such as "gitk" or "git log +--oneline". + +For these reasons, the "summary" must be no more than 70-75 +characters, and it must describe both what the patch changes, as well +as why the patch might be necessary. It is challenging to be both +succinct and descriptive, but that is what a well-written summary +should do. + +The "summary phrase" may be prefixed by tags enclosed in square +brackets: "Subject: [PATCH tag] ". The tags are not +considered part of the summary phrase, but describe how the patch +should be treated. Common tags might include a version descriptor if +the multiple versions of the patch have been sent out in response to +comments (i.e., "v1, v2, v3"), or "RFC" to indicate a request for +comments. If there are four patches in a patch series the individual +patches may be numbered like this: 1/4, 2/4, 3/4, 4/4. This assures +that developers understand the order in which the patches should be +applied and that they have reviewed or applied all of the patches in +the patch series. A couple of example Subjects: @@ -517,19 +542,31 @@ the patch author in the changelog. The explanation body will be committed to the permanent source changelog, so should make sense to a competent reader who has long since forgotten the immediate details of the discussion that might -have led to this patch. +have led to this patch. Including symptoms of the failure which the +patch addresses (kernel log messages, oops messages, etc.) is +especially useful for people who might be searching the commit logs +looking for the applicable patch. If a patch fixes a compile failure, +it may not be necessary to include _all_ of the compile failures; just +enough that it is likely that someone searching for the patch can find +it. As in the "summary phrase", it is important to be both succinct as +well as descriptive. The "---" marker line serves the essential purpose of marking for patch handling tools where the changelog message ends. One good use for the additional comments after the "---" marker is for -a diffstat, to show what files have changed, and the number of inserted -and deleted lines per file. A diffstat is especially useful on bigger -patches. Other comments relevant only to the moment or the maintainer, -not suitable for the permanent changelog, should also go here. -Use diffstat options "-p 1 -w 70" so that filenames are listed from the -top of the kernel source tree and don't use too much horizontal space -(easily fit in 80 columns, maybe with some indentation). +a diffstat, to show what files have changed, and the number of +inserted and deleted lines per file. A diffstat is especially useful +on bigger patches. Other comments relevant only to the moment or the +maintainer, not suitable for the permanent changelog, should also go +here. A good example of such comments might be "patch changelogs" +which describe what has changed between the v1 and v2 version of the +patch. + +If you are going to include a diffstat after the "---" marker, please +use diffstat options "-p 1 -w 70" so that filenames are listed from +the top of the kernel source tree and don't use too much horizontal +space (easily fit in 80 columns, maybe with some indentation). See more details on the proper patch format in the following references. -- cgit v1.1 From 3e1647c5b54a91a7182e121cfe569e6f0bf167ec Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Guido=20G=C3=BCnther?= Date: Fri, 5 Jun 2009 00:47:26 +0200 Subject: ALSA: support Sony Vaio TT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit with BIOS probing only we offer a non functional headphone swith and volume slider. Signed-off-by: Guido Günther Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 29c6125..54b0805 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -159,6 +159,7 @@ ALC883/888 3stack-6ch-intel Intel DG33* boards asus-p5q ASUS P5Q-EM boards mb31 MacBook 3,1 + sony-vaio-tt Sony VAIO TT auto auto-config reading BIOS (default) ALC861/660 -- cgit v1.1 From 5988af2319781bc8e0ce418affec4e09cfa77907 Mon Sep 17 00:00:00 2001 From: Rohit Hagargundgi Date: Tue, 12 May 2009 13:46:57 -0700 Subject: mtd: Flex-OneNAND support Add support for Samsung Flex-OneNAND devices. Flex-OneNAND combines SLC and MLC technologies into a single device. SLC area provides increased reliability and speed, suitable for storing code such as bootloader, kernel and root file system. MLC area provides high density and is suitable for storing user data. SLC and MLC regions can be configured through kernel parameter. [akpm@linux-foundation.org: export flexoand_region and onenand_addr] Signed-off-by: Rohit Hagargundgi Signed-off-by: Kyungmin Park Cc: Vishak G Signed-off-by: Andrew Morton Signed-off-by: David Woodhouse --- Documentation/kernel-parameters.txt | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e87bdbf..12df135 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1380,6 +1380,16 @@ and is between 256 and 4096 characters. It is defined in the file mtdparts= [MTD] See drivers/mtd/cmdlinepart.c. + onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration + + Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock] + + boundary - index of last SLC block on Flex-OneNAND. + The remaining blocks are configured as MLC blocks. + lock - Configure if Flex-OneNAND boundary should be locked. + Once locked, the boundary cannot be changed. + 1 indicates lock status, 0 indicates unlock status. + mtdset= [ARM] ARM/S3C2412 JIVE boot control -- cgit v1.1 From f89d7eaf6c34828070f407d0e04b73127f176ec5 Mon Sep 17 00:00:00 2001 From: Jonathan Corbet Date: Thu, 4 Jun 2009 16:35:25 -0600 Subject: Document the debugfs API This is an updated document covering the internal API for the debugfs filesystem. Thanks to Shen Feng for suggesting that I put this text here and noting that the old LWN version was rather out of date. Acked-by: Greg Kroah-Hartman Reported-by: Shen Feng Signed-off-by: Jonathan Corbet --- Documentation/filesystems/debugfs.txt | 158 ++++++++++++++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100644 Documentation/filesystems/debugfs.txt (limited to 'Documentation') diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt new file mode 100644 index 0000000..ed52af6 --- /dev/null +++ b/Documentation/filesystems/debugfs.txt @@ -0,0 +1,158 @@ +Copyright 2009 Jonathan Corbet + +Debugfs exists as a simple way for kernel developers to make information +available to user space. Unlike /proc, which is only meant for information +about a process, or sysfs, which has strict one-value-per-file rules, +debugfs has no rules at all. Developers can put any information they want +there. The debugfs filesystem is also intended to not serve as a stable +ABI to user space; in theory, there are no stability constraints placed on +files exported there. The real world is not always so simple, though [1]; +even debugfs interfaces are best designed with the idea that they will need +to be maintained forever. + +Debugfs is typically mounted with a command like: + + mount -t debugfs none /sys/kernel/debug + +(Or an equivalent /etc/fstab line). + +Note that the debugfs API is exported GPL-only to modules. + +Code using debugfs should include . Then, the first order +of business will be to create at least one directory to hold a set of +debugfs files: + + struct dentry *debugfs_create_dir(const char *name, struct dentry *parent); + +This call, if successful, will make a directory called name underneath the +indicated parent directory. If parent is NULL, the directory will be +created in the debugfs root. On success, the return value is a struct +dentry pointer which can be used to create files in the directory (and to +clean it up at the end). A NULL return value indicates that something went +wrong. If ERR_PTR(-ENODEV) is returned, that is an indication that the +kernel has been built without debugfs support and none of the functions +described below will work. + +The most general way to create a file within a debugfs directory is with: + + struct dentry *debugfs_create_file(const char *name, mode_t mode, + struct dentry *parent, void *data, + const struct file_operations *fops); + +Here, name is the name of the file to create, mode describes the access +permissions the file should have, parent indicates the directory which +should hold the file, data will be stored in the i_private field of the +resulting inode structure, and fops is a set of file operations which +implement the file's behavior. At a minimum, the read() and/or write() +operations should be provided; others can be included as needed. Again, +the return value will be a dentry pointer to the created file, NULL for +error, or ERR_PTR(-ENODEV) if debugfs support is missing. + +In a number of cases, the creation of a set of file operations is not +actually necessary; the debugfs code provides a number of helper functions +for simple situations. Files containing a single integer value can be +created with any of: + + struct dentry *debugfs_create_u8(const char *name, mode_t mode, + struct dentry *parent, u8 *value); + struct dentry *debugfs_create_u16(const char *name, mode_t mode, + struct dentry *parent, u16 *value); + struct dentry *debugfs_create_u32(const char *name, mode_t mode, + struct dentry *parent, u32 *value); + struct dentry *debugfs_create_u64(const char *name, mode_t mode, + struct dentry *parent, u64 *value); + +These files support both reading and writing the given value; if a specific +file should not be written to, simply set the mode bits accordingly. The +values in these files are in decimal; if hexadecimal is more appropriate, +the following functions can be used instead: + + struct dentry *debugfs_create_x8(const char *name, mode_t mode, + struct dentry *parent, u8 *value); + struct dentry *debugfs_create_x16(const char *name, mode_t mode, + struct dentry *parent, u16 *value); + struct dentry *debugfs_create_x32(const char *name, mode_t mode, + struct dentry *parent, u32 *value); + +Note that there is no debugfs_create_x64(). + +These functions are useful as long as the developer knows the size of the +value to be exported. Some types can have different widths on different +architectures, though, complicating the situation somewhat. There is a +function meant to help out in one special case: + + struct dentry *debugfs_create_size_t(const char *name, mode_t mode, + struct dentry *parent, + size_t *value); + +As might be expected, this function will create a debugfs file to represent +a variable of type size_t. + +Boolean values can be placed in debugfs with: + + struct dentry *debugfs_create_bool(const char *name, mode_t mode, + struct dentry *parent, u32 *value); + +A read on the resulting file will yield either Y (for non-zero values) or +N, followed by a newline. If written to, it will accept either upper- or +lower-case values, or 1 or 0. Any other input will be silently ignored. + +Finally, a block of arbitrary binary data can be exported with: + + struct debugfs_blob_wrapper { + void *data; + unsigned long size; + }; + + struct dentry *debugfs_create_blob(const char *name, mode_t mode, + struct dentry *parent, + struct debugfs_blob_wrapper *blob); + +A read of this file will return the data pointed to by the +debugfs_blob_wrapper structure. Some drivers use "blobs" as a simple way +to return several lines of (static) formatted text output. This function +can be used to export binary information, but there does not appear to be +any code which does so in the mainline. Note that all files created with +debugfs_create_blob() are read-only. + +There are a couple of other directory-oriented helper functions: + + struct dentry *debugfs_rename(struct dentry *old_dir, + struct dentry *old_dentry, + struct dentry *new_dir, + const char *new_name); + + struct dentry *debugfs_create_symlink(const char *name, + struct dentry *parent, + const char *target); + +A call to debugfs_rename() will give a new name to an existing debugfs +file, possibly in a different directory. The new_name must not exist prior +to the call; the return value is old_dentry with updated information. +Symbolic links can be created with debugfs_create_symlink(). + +There is one important thing that all debugfs users must take into account: +there is no automatic cleanup of any directories created in debugfs. If a +module is unloaded without explicitly removing debugfs entries, the result +will be a lot of stale pointers and no end of highly antisocial behavior. +So all debugfs users - at least those which can be built as modules - must +be prepared to remove all files and directories they create there. A file +can be removed with: + + void debugfs_remove(struct dentry *dentry); + +The dentry value can be NULL, in which case nothing will be removed. + +Once upon a time, debugfs users were required to remember the dentry +pointer for every debugfs file they created so that all files could be +cleaned up. We live in more civilized times now, though, and debugfs users +can call: + + void debugfs_remove_recursive(struct dentry *dentry); + +If this function is passed a pointer for the dentry corresponding to the +top-level directory, the entire hierarchy below that directory will be +removed. + +Notes: + [1] http://lwn.net/Articles/309298/ -- cgit v1.1 From 075affcbe01d4d7cefcd0e30a98df1253bcf8d92 Mon Sep 17 00:00:00 2001 From: Bartlomiej Zolnierkiewicz Date: Sun, 7 Jun 2009 13:52:52 +0200 Subject: ide: preserve Host Protected Area by default (v2) From the perspective of most users of recent systems, disabling Host Protected Area (HPA) can break vendor RAID formats, GPT partitions and risks corrupting firmware or overwriting vendor system recovery tools. Unfortunately the original (kernels < 2.6.30) behavior (unconditionally disabling HPA and using full disk capacity) was introduced at the time when the main use of HPA was to make the drive look small enough for the BIOS to allow the system to boot with large capacity drives. Thus to allow the maximum compatibility with the existing setups (using HPA and partitioned with HPA disabled) we automically disable HPA if any partitions overlapping HPA are detected. Additionally HPA can also be disabled using the "nohpa" module parameter (i.e. "ide_core.nohpa=0.0" to disable HPA on /dev/hda). v2: Fix ->resume HPA support. While at it: - remove stale "idebus=" entry from Documentation/kernel-parameters.txt Cc: Robert Hancock Cc: Frans Pop Cc: "Andries E. Brouwer" Cc: Al Viro Acked-by: Sergei Shtylyov [patch description was based on input from Alan Cox and Frans Pop] Emphatically-Acked-by: Alan Cox Signed-off-by: Bartlomiej Zolnierkiewicz --- Documentation/ide/ide.txt | 2 ++ Documentation/kernel-parameters.txt | 7 ++----- 2 files changed, 4 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/ide/ide.txt b/Documentation/ide/ide.txt index 0c78f4b..e77bebf 100644 --- a/Documentation/ide/ide.txt +++ b/Documentation/ide/ide.txt @@ -216,6 +216,8 @@ Other kernel parameters for ide_core are: * "noflush=[interface_number.device_number]" to disable flush requests +* "nohpa=[interface_number.device_number]" to disable Host Protected Area + * "noprobe=[interface_number.device_number]" to skip probing * "nowerr=[interface_number.device_number]" to ignore the WRERR_STAT bit diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index a19f021..e58c91c 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -835,11 +835,8 @@ and is between 256 and 4096 characters. It is defined in the file ide-core.nodma= [HW] (E)IDE subsystem Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc - .vlb_clock .pci_clock .noflush .noprobe .nowerr .cdrom - .chs .ignore_cable are additional options - See Documentation/ide/ide.txt. - - idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed + .vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr + .cdrom .chs .ignore_cable are additional options See Documentation/ide/ide.txt. ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem -- cgit v1.1 From 4e329972052c3649367b91de783f6293b8653cb2 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sun, 7 Jun 2009 09:09:23 +0000 Subject: isdn: rename capi_ctr_reseted() to capi_ctr_down() Change the name of the Kernel CAPI exported function capi_ctr_reseted() to something representing its purpose better. Impact: renaming, no functional change Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/INTERFACE.CAPI | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/isdn/INTERFACE.CAPI b/Documentation/isdn/INTERFACE.CAPI index 786d619..4631538 100644 --- a/Documentation/isdn/INTERFACE.CAPI +++ b/Documentation/isdn/INTERFACE.CAPI @@ -45,7 +45,7 @@ From then on, Kernel CAPI may call the registered callback functions for the device. If the device becomes unusable for any reason (shutdown, disconnect ...), the -driver has to call capi_ctr_reseted(). This will prevent further calls to the +driver has to call capi_ctr_down(). This will prevent further calls to the callback functions by Kernel CAPI. @@ -166,7 +166,7 @@ int detach_capi_ctr(struct capi_ctr *ctrlr) register/unregister a device (controller) with Kernel CAPI void capi_ctr_ready(struct capi_ctr *ctrlr) -void capi_ctr_reseted(struct capi_ctr *ctrlr) +void capi_ctr_down(struct capi_ctr *ctrlr) signal controller ready/not ready void capi_ctr_suspend_output(struct capi_ctr *ctrlr) -- cgit v1.1 From fe93299a008a7056fe1790744b3a425ddf79a16b Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sun, 7 Jun 2009 09:09:24 +0000 Subject: isdn: extend INTERFACE.CAPI document Clarify calling context and return codes of callback methods, and add a description of the _cmsg structure and helper functions. Impact: documentation Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/INTERFACE.CAPI | 90 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 88 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/isdn/INTERFACE.CAPI b/Documentation/isdn/INTERFACE.CAPI index 4631538..686e107 100644 --- a/Documentation/isdn/INTERFACE.CAPI +++ b/Documentation/isdn/INTERFACE.CAPI @@ -114,20 +114,36 @@ char *driver_name int (*load_firmware)(struct capi_ctr *ctrlr, capiloaddata *ldata) (optional) pointer to a callback function for sending firmware and configuration data to the device + Return value: 0 on success, error code on error + Called in process context. void (*reset_ctr)(struct capi_ctr *ctrlr) - pointer to a callback function for performing a reset on the device, - releasing all registered applications + (optional) pointer to a callback function for performing a reset on + the device, releasing all registered applications + Called in process context. void (*register_appl)(struct capi_ctr *ctrlr, u16 applid, capi_register_params *rparam) void (*release_appl)(struct capi_ctr *ctrlr, u16 applid) pointers to callback functions for registration and deregistration of applications with the device + Calls to these functions are serialized by Kernel CAPI so that only + one call to any of them is active at any time. u16 (*send_message)(struct capi_ctr *ctrlr, struct sk_buff *skb) pointer to a callback function for sending a CAPI message to the device + Return value: CAPI error code + If the method returns 0 (CAPI_NOERROR) the driver has taken ownership + of the skb and the caller may no longer access it. If it returns a + non-zero (error) value then ownership of the skb returns to the caller + who may reuse or free it. + The return value should only be used to signal problems with respect + to accepting or queueing the message. Errors occurring during the + actual processing of the message should be signaled with an + appropriate reply message. + Calls to this function are not serialized by Kernel CAPI, ie. it must + be prepared to be re-entered. char *(*procinfo)(struct capi_ctr *ctrlr) pointer to a callback function returning the entry for the device in @@ -138,6 +154,8 @@ read_proc_t *ctr_read_proc system entry, /proc/capi/controllers/; will be called with a pointer to the device's capi_ctr structure as the last (data) argument +Note: Callback functions are never called in interrupt context. + - to be filled in before calling capi_ctr_ready(): u8 manu[CAPI_MANUFACTURER_LEN] @@ -153,6 +171,45 @@ u8 serial[CAPI_SERIAL_LEN] value to return for CAPI_GET_SERIAL +4.3 The _cmsg Structure + +(declared in ) + +The _cmsg structure stores the contents of a CAPI 2.0 message in an easily +accessible form. It contains members for all possible CAPI 2.0 parameters, of +which only those appearing in the message type currently being processed are +actually used. Unused members should be set to zero. + +Members are named after the CAPI 2.0 standard names of the parameters they +represent. See for the exact spelling. Member data +types are: + +u8 for CAPI parameters of type 'byte' + +u16 for CAPI parameters of type 'word' + +u32 for CAPI parameters of type 'dword' + +_cstruct for CAPI parameters of type 'struct' not containing any + variably-sized (struct) subparameters (eg. 'Called Party Number') + The member is a pointer to a buffer containing the parameter in + CAPI encoding (length + content). It may also be NULL, which will + be taken to represent an empty (zero length) parameter. + +_cmstruct for CAPI parameters of type 'struct' containing 'struct' + subparameters ('Additional Info' and 'B Protocol') + The representation is a single byte containing one of the values: + CAPI_DEFAULT: the parameter is empty + CAPI_COMPOSE: the values of the subparameters are stored + individually in the corresponding _cmsg structure members + +Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert +messages between their transport encoding described in the CAPI 2.0 standard +and their _cmsg structure representation. Note that capi_cmsg2message() does +not know or check the size of its destination buffer. The caller must make +sure it is big enough to accomodate the resulting CAPI message. + + 5. Lower Layer Interface Functions (declared in ) @@ -211,3 +268,32 @@ CAPIMSG_CONTROL(m) CAPIMSG_SETCONTROL(m, contr) Controller/PLCI/NCCI (u32) CAPIMSG_DATALEN(m) CAPIMSG_SETDATALEN(m, len) Data Length (u16) + +Library functions for working with _cmsg structures +(from ): + +unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg) + Assembles a CAPI 2.0 message from the parameters in *cmsg, storing the + result in *msg. + +unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg) + Disassembles the CAPI 2.0 message in *msg, storing the parameters in + *cmsg. + +unsigned capi_cmsg_header(_cmsg *cmsg, u16 ApplId, u8 Command, u8 Subcommand, + u16 Messagenumber, u32 Controller) + Fills the header part and address field of the _cmsg structure *cmsg + with the given values, zeroing the remainder of the structure so only + parameters with non-default values need to be changed before sending + the message. + +void capi_cmsg_answer(_cmsg *cmsg) + Sets the low bit of the Subcommand field in *cmsg, thereby converting + _REQ to _CONF and _IND to _RESP. + +char *capi_cmd2str(u8 Command, u8 Subcommand) + Returns the CAPI 2.0 message name corresponding to the given command + and subcommand values, as a static ASCII string. The return value may + be NULL if the command/subcommand is not one of those defined in the + CAPI 2.0 standard. + -- cgit v1.1 From 64a8be74357477558183b43156c5536b642de134 Mon Sep 17 00:00:00 2001 From: David Heidelberger Date: Mon, 8 Jun 2009 16:15:18 +0200 Subject: ALSA: hda - Add 7.1 support for MSI GX620 Added 7.1 support for MSI GX620 and jack quirk. Reference: kernel bug#13430 http://bugzilla.kernel.org/show_bug.cgi?id=13430 Signed-off-by: David Heidelberger Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 54b0805..de8e10a 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -143,7 +143,8 @@ ALC883/888 medion Medion Laptops medion-md2 Medion MD2 targa-dig Targa/MSI - targa-2ch-dig Targs/MSI with 2-channel + targa-2ch-dig Targa/MSI with 2-channel + targa-8ch-dig Targa/MSI with 8-channel (MSI GX620) laptop-eapd 3-jack with SPDIF I/O and EAPD (Clevo M540JE, M550JE) lenovo-101e Lenovo 101E lenovo-nb0763 Lenovo NB0763 -- cgit v1.1 From 02cf228639233aa227a152955a98564c7a18f9ee Mon Sep 17 00:00:00 2001 From: Sergey Lapin Date: Mon, 8 Jun 2009 12:18:50 +0000 Subject: ieee802154: add documentation about our stack Add MAINTAINERS entry and a small text describing our stack interfaces, how to hook the drivers, etc. Signed-off-by: Dmitry Eremin-Solenikov Signed-off-by: Sergey Lapin Signed-off-by: David S. Miller --- Documentation/networking/ieee802154.txt | 76 +++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 Documentation/networking/ieee802154.txt (limited to 'Documentation') diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt new file mode 100644 index 0000000..a0280ad --- /dev/null +++ b/Documentation/networking/ieee802154.txt @@ -0,0 +1,76 @@ + + Linux IEEE 802.15.4 implementation + + +Introduction +============ + +The Linux-ZigBee project goal is to provide complete implementation +of IEEE 802.15.4 / ZigBee / 6LoWPAN protocols. IEEE 802.15.4 is a stack +of protocols for organizing Low-Rate Wireless Personal Area Networks. + +Currently only IEEE 802.15.4 layer is implemented. We have choosen +to use plain Berkeley socket API, the generic Linux networking stack +to transfer IEEE 802.15.4 messages and a special protocol over genetlink +for configuration/management + + +Socket API +========== + +int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0); +..... + +The address family, socket addresses etc. are defined in the +include/net/ieee802154/af_ieee802154.h header or in the special header +in our userspace package (see either linux-zigbee sourceforge download page +or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee). + +One can use SOCK_RAW for passing raw data towards device xmit function. YMMV. + + +MLME - MAC Level Management +============================ + +Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands. +See the include/net/ieee802154/nl802154.h header. Our userspace tools package +(see above) provides CLI configuration utility for radio interfaces and simple +coordinator for IEEE 802.15.4 networks as an example users of MLME protocol. + + +Kernel side +============= + +Like with WiFi, there are several types of devices implementing IEEE 802.15.4. +1) 'HardMAC'. The MAC layer is implemented in the device itself, the device + exports MLME and data API. +2) 'SoftMAC' or just radio. These types of devices are just radio transceivers + possibly with some kinds of acceleration like automatic CRC computation and + comparation, automagic ACK handling, address matching, etc. + +Those types of devices require different approach to be hooked into Linux kernel. + + +HardMAC +======= + +See the header include/net/ieee802154/netdevice.h. You have to implement Linux +net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family +code via plain sk_buffs. The control block of sk_buffs will contain additional +info as described in the struct ieee802154_mac_cb. + +To hook the MLME interface you have to populate the ml_priv field of your +net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are +required. + +We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c + + +SoftMAC +======= + +We are going to provide intermediate layer impelementing IEEE 802.15.4 MAC +in software. This is currently WIP. + +See header include/net/ieee802154/mac802154.h and several drivers in +drivers/ieee802154/ -- cgit v1.1 From 3e56f08bffe9e3e2b936eb73bd51d8800d1b42c2 Mon Sep 17 00:00:00 2001 From: David VomLehn Date: Sat, 30 May 2009 18:13:32 -0700 Subject: kbuild/Documentation: Incorrect makefile syntax in example There is an error in the make syntax for one of the kbuild examples Signed-off-by: David VomLehn Signed-off-by: Sam Ravnborg --- Documentation/kbuild/modules.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/kbuild/modules.txt b/Documentation/kbuild/modules.txt index b1096da..0767cf6 100644 --- a/Documentation/kbuild/modules.txt +++ b/Documentation/kbuild/modules.txt @@ -275,7 +275,7 @@ following files: KERNELDIR := /lib/modules/`uname -r`/build all:: - $(MAKE) -C $KERNELDIR M=`pwd` $@ + $(MAKE) -C $(KERNELDIR) M=`pwd` $@ # Module specific targets genbin: -- cgit v1.1 From 98f540d31ba0d3598b52177e194dde0bc498352d Mon Sep 17 00:00:00 2001 From: Markus Heidelberg Date: Mon, 18 May 2009 01:36:47 +0200 Subject: kconfig: resort the documentation of the environment variables All the KCONFIG_ environment variables were previously located in a section "Environment variables in 'menuconfig'", but neither are they restricted to 'menuconfig' nor are they all used by 'menuconfig'. Introduce the following three sections for these variables: * Environment variables for '*config' * Environment variables for '{allyes/allmod/allno/rand}config' * Environment variables for 'silentoldconfig' Furthermore this puts MENUCONFIG_MODE next to MENUCONFIG_COLOR into a common section "User interface options for 'menuconfig'". Signed-off-by: Markus Heidelberg Signed-off-by: Sam Ravnborg --- Documentation/kbuild/kconfig.txt | 116 ++++++++++++++++++++------------------- 1 file changed, 60 insertions(+), 56 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kbuild/kconfig.txt b/Documentation/kbuild/kconfig.txt index 26a7c0a..849b5e5 100644 --- a/Documentation/kbuild/kconfig.txt +++ b/Documentation/kbuild/kconfig.txt @@ -35,48 +35,26 @@ new .config files to see the differences: (Yes, we need something better here.) - -====================================================================== -menuconfig --------------------------------------------------- - -SEARCHING for CONFIG symbols - -Searching in menuconfig: - - The Search function searches for kernel configuration symbol - names, so you have to know something close to what you are - looking for. - - Example: - /hotplug - This lists all config symbols that contain "hotplug", - e.g., HOTPLUG, HOTPLUG_CPU, MEMORY_HOTPLUG. - - For search help, enter / followed TAB-TAB-TAB (to highlight - ) and Enter. This will tell you that you can also use - regular expressions (regexes) in the search string, so if you - are not interested in MEMORY_HOTPLUG, you could try - - /^hotplug - - ______________________________________________________________________ -Color Themes for 'menuconfig' +Environment variables for '*config' -It is possible to select different color themes using the variable -MENUCONFIG_COLOR. To select a theme use: +KCONFIG_CONFIG +-------------------------------------------------- +This environment variable can be used to specify a default kernel config +file name to override the default name of ".config". - make MENUCONFIG_COLOR= menuconfig +KCONFIG_OVERWRITECONFIG +-------------------------------------------------- +If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not +break symlinks when .config is a symlink to somewhere else. -Available themes are: - mono => selects colors suitable for monochrome displays - blackbg => selects a color scheme with black background - classic => theme with blue background. The classic look - bluetitle => a LCD friendly version of classic. (default) +KCONFIG_NOTIMESTAMP +-------------------------------------------------- +If this environment variable exists and is non-null, the timestamp line +in generated .config files is omitted. ______________________________________________________________________ -Environment variables in 'menuconfig' +Environment variables for '{allyes/allmod/allno/rand}config' KCONFIG_ALLCONFIG -------------------------------------------------- @@ -95,8 +73,7 @@ values. This enables you to create "miniature" config (miniconfig) or custom config files containing just the config symbols that you are interested in. Then the kernel config system generates the full .config file, -including dependencies of your miniconfig file, based on the miniconfig -file. +including symbols of your miniconfig file. This 'KCONFIG_ALLCONFIG' file is a config file which contains (usually a subset of all) preset config symbols. These variable @@ -113,26 +90,14 @@ These examples will disable most options (allnoconfig) but enable or disable the options that are explicitly listed in the specified mini-config files. +______________________________________________________________________ +Environment variables for 'silentoldconfig' + KCONFIG_NOSILENTUPDATE -------------------------------------------------- If this variable has a non-blank value, it prevents silent kernel config udpates (requires explicit updates). -KCONFIG_CONFIG --------------------------------------------------- -This environment variable can be used to specify a default kernel config -file name to override the default name of ".config". - -KCONFIG_OVERWRITECONFIG --------------------------------------------------- -If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not -break symlinks when .config is a symlink to somewhere else. - -KCONFIG_NOTIMESTAMP --------------------------------------------------- -If this environment variable exists and is non-null, the timestamp line -in generated .config files is omitted. - KCONFIG_AUTOCONFIG -------------------------------------------------- This environment variable can be set to specify the path & name of the @@ -143,15 +108,54 @@ KCONFIG_AUTOHEADER This environment variable can be set to specify the path & name of the "autoconf.h" (header) file. Its default value is "include/linux/autoconf.h". + +====================================================================== +menuconfig +-------------------------------------------------- + +SEARCHING for CONFIG symbols + +Searching in menuconfig: + + The Search function searches for kernel configuration symbol + names, so you have to know something close to what you are + looking for. + + Example: + /hotplug + This lists all config symbols that contain "hotplug", + e.g., HOTPLUG, HOTPLUG_CPU, MEMORY_HOTPLUG. + + For search help, enter / followed TAB-TAB-TAB (to highlight + ) and Enter. This will tell you that you can also use + regular expressions (regexes) in the search string, so if you + are not interested in MEMORY_HOTPLUG, you could try + + /^hotplug + ______________________________________________________________________ -menuconfig User Interface Options ----------------------------------------------------------------------- +User interface options for 'menuconfig' + +MENUCONFIG_COLOR +-------------------------------------------------- +It is possible to select different color themes using the variable +MENUCONFIG_COLOR. To select a theme use: + + make MENUCONFIG_COLOR= menuconfig + +Available themes are: + mono => selects colors suitable for monochrome displays + blackbg => selects a color scheme with black background + classic => theme with blue background. The classic look + bluetitle => a LCD friendly version of classic. (default) + MENUCONFIG_MODE -------------------------------------------------- This mode shows all sub-menus in one large tree. Example: - MENUCONFIG_MODE=single_menu make menuconfig + make MENUCONFIG_MODE=single_menu menuconfig + ====================================================================== xconfig -- cgit v1.1 From fb6e7113ae3ba6c7d0de77c6ccbcfa659899ff0f Mon Sep 17 00:00:00 2001 From: Ryusuke Konishi Date: Sat, 30 May 2009 11:27:17 +0900 Subject: nilfs2: modify list of unsupported features in caveats This clarifies missing features of nilfs as a regular filesystem. Signed-off-by: Ryusuke Konishi --- Documentation/filesystems/nilfs2.txt | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt index 55c4300..01539f4 100644 --- a/Documentation/filesystems/nilfs2.txt +++ b/Documentation/filesystems/nilfs2.txt @@ -39,9 +39,8 @@ Features which NILFS2 does not support yet: - extended attributes - POSIX ACLs - quotas - - writable snapshots - - remote backup (CDP) - - data integrity + - fsck + - resize - defragmentation Mount options -- cgit v1.1 From bc5c6c043d8381676339fb3da59cc4cc5921d368 Mon Sep 17 00:00:00 2001 From: Mike Frysinger Date: Wed, 10 Jun 2009 04:48:41 -0400 Subject: ftrace/documentation: fix typo in function grapher name The function graph tracer is called just "function_graph" (no trailing "_tracer" needed). Signed-off-by: Mike Frysinger LKML-Reference: <1244623722-6325-1-git-send-email-vapier@gentoo.org> Signed-off-by: Steven Rostedt --- Documentation/trace/ftrace.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index fd9a3e6..5ad2ded 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -179,7 +179,7 @@ Here is the list of current tracers that may be configured. Function call tracer to trace all kernel functions. - "function_graph_tracer" + "function_graph" Similar to the function tracer except that the function tracer probes the functions on their entry -- cgit v1.1 From 62fdac5913f71f8f200bd2c9bd59a02e9a1498e9 Mon Sep 17 00:00:00 2001 From: Hidetoshi Seto Date: Thu, 11 Jun 2009 16:06:07 +0900 Subject: x86, mce: Add boot options for corrected errors This patch introduces three boot options (no_cmci, dont_log_ce and ignore_ce) to control handling for corrected errors. The "mce=no_cmci" boot option disables the CMCI feature. Since CMCI is a new feature so having boot controls to disable it will be a help if the hardware is misbehaving. The "mce=dont_log_ce" boot option disables logging for corrected errors. All reported corrected errors will be cleared silently. This option will be useful if you never care about corrected errors. The "mce=ignore_ce" boot option disables features for corrected errors, i.e. polling timer and cmci. All corrected events are not cleared and kept in bank MSRs. Usually this disablement is not recommended, however it will be a help if there are some conflict with the BIOS or hardware monitoring applications etc., that clears corrected events in banks instead of OS. [ And trivial cleanup (space -> tab) for doc is included. ] Signed-off-by: Hidetoshi Seto Reviewed-by: Andi Kleen LKML-Reference: <4A30ACDF.5030408@jp.fujitsu.com> Signed-off-by: Ingo Molnar --- Documentation/x86/x86_64/boot-options.txt | 36 +++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 0ee5e3b..fa2bed0 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -7,12 +7,36 @@ Machine check Please see Documentation/x86/x86_64/machinecheck for sysfs runtime tunables. - mce=off disable machine check - mce=bootlog Enable logging of machine checks left over from booting. - Disabled by default on AMD because some BIOS leave bogus ones. - If your BIOS doesn't do that it's a good idea to enable though - to make sure you log even machine check events that result - in a reboot. On Intel systems it is enabled by default. + mce=off + Disable machine check + mce=no_cmci + Disable CMCI(Corrected Machine Check Interrupt) that + Intel processor supports. Usually this disablement is + not recommended, but it might be handy if your hardware + is misbehaving. + Note that you'll get more problems without CMCI than with + due to the shared banks, i.e. you might get duplicated + error logs. + mce=dont_log_ce + Don't make logs for corrected errors. All events reported + as corrected are silently cleared by OS. + This option will be useful if you have no interest in any + of corrected errors. + mce=ignore_ce + Disable features for corrected errors, e.g. polling timer + and CMCI. All events reported as corrected are not cleared + by OS and remained in its error banks. + Usually this disablement is not recommended, however if + there is an agent checking/clearing corrected errors + (e.g. BIOS or hardware monitoring applications), conflicting + with OS's error handling, and you cannot deactivate the agent, + then this option will be a help. + mce=bootlog + Enable logging of machine checks left over from booting. + Disabled by default on AMD because some BIOS leave bogus ones. + If your BIOS doesn't do that it's a good idea to enable though + to make sure you log even machine check events that result + in a reboot. On Intel systems it is enabled by default. mce=nobootlog Disable boot machine check logging. mce=tolerancelevel[,monarchtimeout] (number,number) -- cgit v1.1 From 4f64e150191bfddc7f5c0768f325f747dbca1913 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 11 Jun 2009 16:14:11 +0200 Subject: ALSA: pcm - Update document about xrun_debug proc file Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/Procfile.txt | 36 +++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/Procfile.txt b/Documentation/sound/alsa/Procfile.txt index cfac20c..381908d 100644 --- a/Documentation/sound/alsa/Procfile.txt +++ b/Documentation/sound/alsa/Procfile.txt @@ -88,26 +88,34 @@ card*/pcm*/info substreams, etc. card*/pcm*/xrun_debug - This file appears when CONFIG_SND_DEBUG=y. - This shows the status of xrun (= buffer overrun/xrun) debug of - ALSA PCM middle layer, as an integer from 0 to 2. The value - can be changed by writing to this file, such as - - # cat 2 > /proc/asound/card0/pcm0p/xrun_debug - - When this value is greater than 0, the driver will show the - messages to kernel log when an xrun is detected. The debug - message is shown also when the invalid H/W pointer is detected - at the update of periods (usually called from the interrupt + This file appears when CONFIG_SND_DEBUG=y and + CONFIG_PCM_XRUN_DEBUG=y. + This shows the status of xrun (= buffer overrun/xrun) and + invalid PCM position debug/check of ALSA PCM middle layer. + It takes an integer value, can be changed by writing to this + file, such as + + # cat 5 > /proc/asound/card0/pcm0p/xrun_debug + + The value consists of the following bit flags: + bit 0 = Enable XRUN/jiffies debug messages + bit 1 = Show stack trace at XRUN / jiffies check + bit 2 = Enable additional jiffies check + + When the bit 0 is set, the driver will show the messages to + kernel log when an xrun is detected. The debug message is + shown also when the invalid H/W pointer is detected at the + update of periods (usually called from the interrupt handler). - When this value is greater than 1, the driver will show the - stack trace additionally. This may help the debugging. + When the bit 1 is set, the driver will show the stack trace + additionally. This may help the debugging. - Since 2.6.30, this option also enables the hwptr check using + Since 2.6.30, this option can enable the hwptr check using jiffies. This detects spontaneous invalid pointer callback values, but can be lead to too much corrections for a (mostly buggy) hardware that doesn't give smooth pointer updates. + This feature is enabled via the bit 2. card*/pcm*/sub*/info The general information of this PCM sub-stream. -- cgit v1.1 From 04f70336c80c43a15e617b36c2043dfa0ad6ed0f Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Thu, 11 Jun 2009 13:22:39 +0100 Subject: kmemleak: Add documentation on the memory leak detector This patch adds the Documentation/kmemleak.txt file with some information about how kmemleak works. Signed-off-by: Catalin Marinas --- Documentation/kernel-parameters.txt | 4 + Documentation/kmemleak.txt | 142 ++++++++++++++++++++++++++++++++++++ 2 files changed, 146 insertions(+) create mode 100644 Documentation/kmemleak.txt (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4a3c220..04a44cc 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1077,6 +1077,10 @@ and is between 256 and 4096 characters. It is defined in the file Configure the RouterBoard 532 series on-chip Ethernet adapter MAC address. + kmemleak= [KNL] Boot-time kmemleak enable/disable + Valid arguments: on, off + Default: on + kstack=N [X86] Print N words from the kernel stack in oops dumps. diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt new file mode 100644 index 0000000..0112da3 --- /dev/null +++ b/Documentation/kmemleak.txt @@ -0,0 +1,142 @@ +Kernel Memory Leak Detector +=========================== + +Introduction +------------ + +Kmemleak provides a way of detecting possible kernel memory leaks in a +way similar to a tracing garbage collector +(http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Tracing_garbage_collectors), +with the difference that the orphan objects are not freed but only +reported via /sys/kernel/debug/kmemleak. A similar method is used by the +Valgrind tool (memcheck --leak-check) to detect the memory leaks in +user-space applications. + +Usage +----- + +CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel +thread scans the memory every 10 minutes (by default) and prints any new +unreferenced objects found. To trigger an intermediate scan and display +all the possible memory leaks: + + # mount -t debugfs nodev /sys/kernel/debug/ + # cat /sys/kernel/debug/kmemleak + +Note that the orphan objects are listed in the order they were allocated +and one object at the beginning of the list may cause other subsequent +objects to be reported as orphan. + +Memory scanning parameters can be modified at run-time by writing to the +/sys/kernel/debug/kmemleak file. The following parameters are supported: + + off - disable kmemleak (irreversible) + stack=on - enable the task stacks scanning + stack=off - disable the tasks stacks scanning + scan=on - start the automatic memory scanning thread + scan=off - stop the automatic memory scanning thread + scan= - set the automatic memory scanning period in seconds (0 + to disable it) + +Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on +the kernel command line. + +Basic Algorithm +--------------- + +The memory allocations via kmalloc, vmalloc, kmem_cache_alloc and +friends are traced and the pointers, together with additional +information like size and stack trace, are stored in a prio search tree. +The corresponding freeing function calls are tracked and the pointers +removed from the kmemleak data structures. + +An allocated block of memory is considered orphan if no pointer to its +start address or to any location inside the block can be found by +scanning the memory (including saved registers). This means that there +might be no way for the kernel to pass the address of the allocated +block to a freeing function and therefore the block is considered a +memory leak. + +The scanning algorithm steps: + + 1. mark all objects as white (remaining white objects will later be + considered orphan) + 2. scan the memory starting with the data section and stacks, checking + the values against the addresses stored in the prio search tree. If + a pointer to a white object is found, the object is added to the + gray list + 3. scan the gray objects for matching addresses (some white objects + can become gray and added at the end of the gray list) until the + gray set is finished + 4. the remaining white objects are considered orphan and reported via + /sys/kernel/debug/kmemleak + +Some allocated memory blocks have pointers stored in the kernel's +internal data structures and they cannot be detected as orphans. To +avoid this, kmemleak can also store the number of values pointing to an +address inside the block address range that need to be found so that the +block is not considered a leak. One example is __vmalloc(). + +Kmemleak API +------------ + +See the include/linux/kmemleak.h header for the functions prototype. + +kmemleak_init - initialize kmemleak +kmemleak_alloc - notify of a memory block allocation +kmemleak_free - notify of a memory block freeing +kmemleak_not_leak - mark an object as not a leak +kmemleak_ignore - do not scan or report an object as leak +kmemleak_scan_area - add scan areas inside a memory block +kmemleak_no_scan - do not scan a memory block +kmemleak_erase - erase an old value in a pointer variable +kmemleak_alloc_recursive - as kmemleak_alloc but checks the recursiveness +kmemleak_free_recursive - as kmemleak_free but checks the recursiveness + +Dealing with false positives/negatives +-------------------------------------- + +The false negatives are real memory leaks (orphan objects) but not +reported by kmemleak because values found during the memory scanning +point to such objects. To reduce the number of false negatives, kmemleak +provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and +kmemleak_erase functions (see above). The task stacks also increase the +amount of false negatives and their scanning is not enabled by default. + +The false positives are objects wrongly reported as being memory leaks +(orphan). For objects known not to be leaks, kmemleak provides the +kmemleak_not_leak function. The kmemleak_ignore could also be used if +the memory block is known not to contain other pointers and it will no +longer be scanned. + +Some of the reported leaks are only transient, especially on SMP +systems, because of pointers temporarily stored in CPU registers or +stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing +the minimum age of an object to be reported as a memory leak. + +Limitations and Drawbacks +------------------------- + +The main drawback is the reduced performance of memory allocation and +freeing. To avoid other penalties, the memory scanning is only performed +when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is +intended for debugging purposes where the performance might not be the +most important requirement. + +To keep the algorithm simple, kmemleak scans for values pointing to any +address inside a block's address range. This may lead to an increased +number of false negatives. However, it is likely that a real memory leak +will eventually become visible. + +Another source of false negatives is the data stored in non-pointer +values. In a future version, kmemleak could only scan the pointer +members in the allocated structures. This feature would solve many of +the false negative cases described above. + +The tool can report false positives. These are cases where an allocated +block doesn't need to be freed (some cases in the init_call functions), +the pointer is calculated by other methods than the usual container_of +macro or the pointer is stored in a location not scanned by kmemleak. + +Page allocations and ioremap are not tracked. Only the ARM and x86 +architectures are currently supported. -- cgit v1.1 From 9e9f46c44e487af0a82eb61b624553e2f7118f5b Mon Sep 17 00:00:00 2001 From: Jesse Barnes Date: Thu, 11 Jun 2009 10:58:28 -0700 Subject: PCI: use ACPI _CRS data by default At this point, it seems to solve more problems than it causes, so let's try using it by default. It's an easy revert if it ends up causing trouble. Reviewed-by: Yinghai Lu Acked-by: Bjorn Helgaas Signed-off-by: Jesse Barnes --- Documentation/kernel-parameters.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index fd5cac0..7bdaf50 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1787,7 +1787,7 @@ and is between 256 and 4096 characters. It is defined in the file IRQ routing is enabled. noacpi [X86] Do not use ACPI for IRQ routing or for PCI scanning. - use_crs [X86] Use _CRS for PCI resource + nocrs [X86] Don't use _CRS for PCI resource allocation. routeirq Do IRQ routing for all PCI devices. This is normally done in pci_enable_device(), -- cgit v1.1 From 43c16408842b0eeb367c23a6fa540ce69f99e347 Mon Sep 17 00:00:00 2001 From: Andrew Patterson Date: Wed, 22 Apr 2009 16:52:09 -0600 Subject: PCI: Add support for turning PCIe ECRC on or off Adds support for PCI Express transaction layer end-to-end CRC checking (ECRC). This patch will enable/disable ECRC checking by setting/clearing the ECRC Check Enable and/or ECRC Generation Enable bits for devices that support ECRC. The ECRC setting is controlled by the "pci=ecrc=" command-line option. If this option is not set or is set to 'bios", the enable and generation bits are left in whatever state that firmware/BIOS set them to. The "off" setting turns them off, and the "on" option turns them on (if the device supports it). Turning ECRC on or off can be a data integrity versus performance tradeoff. In theory, turning it on will catch more data errors, turning it off means possibly better performance since CRC does not need to be calculated by the PCIe hardware and packet sizes are reduced. Signed-off-by: Andrew Patterson Signed-off-by: Jesse Barnes --- Documentation/kernel-parameters.txt | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7bdaf50..395d1a0 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1824,6 +1824,12 @@ and is between 256 and 4096 characters. It is defined in the file PAGE_SIZE is used as alignment. PCI-PCI bridge can be specified, if resource windows need to be expanded. + ecrc= Enable/disable PCIe ECRC (transaction layer + end-to-end CRC checking). + bios: Use BIOS/firmware settings. This is the + the default. + off: Turn ECRC off + on: Turn ECRC on. pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power Management. -- cgit v1.1 From 746e6ad23cd6fec2edce056e014a0eabeffa838c Mon Sep 17 00:00:00 2001 From: John Dykstra Date: Thu, 11 Jun 2009 20:57:21 -0700 Subject: [PATCH] net core: Some interface flags not returned by SIOCGIFFLAGS Commit b00055aacdb172c05067612278ba27265fcd05ce " [NET] core: add RFC2863 operstate" defined new interface flag values. Its documentation specified that these flags could be accessed from user space via SIOCGIFFLAGS. However, this does not work because the new flags do not fit in that ioctl's argument width. Change the documentation to match the code's behavior. Also change the source to explicitly show the truncation. This _should_ have no effect on executable code, and did not with gcc 4.2.4 generating x86 code. A new ioctl could be defined to return all interface flags to user space. However, since this has been broken for three years with no one complaining, there doesn't seem much need. They are still accessible via netlink. Reported-by: "Fredrik Arnerup" Signed-off-by: John Dykstra Signed-off-by: David S. Miller --- Documentation/networking/operstates.txt | 3 --- 1 file changed, 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/operstates.txt b/Documentation/networking/operstates.txt index c9074f9..1a77a3c 100644 --- a/Documentation/networking/operstates.txt +++ b/Documentation/networking/operstates.txt @@ -38,9 +38,6 @@ ifinfomsg::if_flags & IFF_LOWER_UP: ifinfomsg::if_flags & IFF_DORMANT: Driver has signaled netif_dormant_on() -These interface flags can also be queried without netlink using the -SIOCGIFFLAGS ioctl. - TLV IFLA_OPERSTATE contains RFC2863 state of the interface in numeric representation: -- cgit v1.1 From 713b15b3781240653d2b38414da3f4567dcbcf91 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:26:58 -0600 Subject: lguest: be paranoid about guest playing with device descriptors. We can't trust the values in the device descriptor table once the guest has booted, so keep local copies. They could set them to strange values then cause us to segv (they're 8 bit values, so they can't make our pointers go too wild). This becomes more important with the following patches which read them. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 27 +++++++++++++++++---------- 1 file changed, 17 insertions(+), 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index d36fcc0..e65d6cb 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -126,9 +126,13 @@ struct device /* The linked-list pointer. */ struct device *next; - /* The this device's descriptor, as mapped into the Guest. */ + /* The device's descriptor, as mapped into the Guest. */ struct lguest_device_desc *desc; + /* We can't trust desc values once Guest has booted: we use these. */ + unsigned int feature_len; + unsigned int num_vq; + /* The name of this device, for --verbose. */ const char *name; @@ -245,7 +249,7 @@ static void iov_consume(struct iovec iov[], unsigned num_iov, unsigned len) static u8 *get_feature_bits(struct device *dev) { return (u8 *)(dev->desc + 1) - + dev->desc->num_vq * sizeof(struct lguest_vqconfig); + + dev->num_vq * sizeof(struct lguest_vqconfig); } /*L:100 The Launcher code itself takes us out into userspace, that scary place @@ -979,8 +983,8 @@ static void update_device_status(struct device *dev) verbose("Resetting device %s\n", dev->name); /* Clear any features they've acked. */ - memset(get_feature_bits(dev) + dev->desc->feature_len, 0, - dev->desc->feature_len); + memset(get_feature_bits(dev) + dev->feature_len, 0, + dev->feature_len); /* Zero out the virtqueues. */ for (vq = dev->vq; vq; vq = vq->next) { @@ -994,12 +998,12 @@ static void update_device_status(struct device *dev) unsigned int i; verbose("Device %s OK: offered", dev->name); - for (i = 0; i < dev->desc->feature_len; i++) + for (i = 0; i < dev->feature_len; i++) verbose(" %02x", get_feature_bits(dev)[i]); verbose(", accepted"); - for (i = 0; i < dev->desc->feature_len; i++) + for (i = 0; i < dev->feature_len; i++) verbose(" %02x", get_feature_bits(dev) - [dev->desc->feature_len+i]); + [dev->feature_len+i]); if (dev->ready) dev->ready(dev); @@ -1129,8 +1133,8 @@ static void handle_input(int fd) static u8 *device_config(const struct device *dev) { return (void *)(dev->desc + 1) - + dev->desc->num_vq * sizeof(struct lguest_vqconfig) - + dev->desc->feature_len * 2; + + dev->num_vq * sizeof(struct lguest_vqconfig) + + dev->feature_len * 2; } /* This routine allocates a new "struct lguest_device_desc" from descriptor @@ -1191,6 +1195,7 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, * yet, otherwise we'd be overwriting them. */ assert(dev->desc->config_len == 0 && dev->desc->feature_len == 0); memcpy(device_config(dev), &vq->config, sizeof(vq->config)); + dev->num_vq++; dev->desc->num_vq++; verbose("Virtqueue page %#lx\n", to_guest_phys(p)); @@ -1219,7 +1224,7 @@ static void add_feature(struct device *dev, unsigned bit) /* We can't extend the feature bits once we've added config bytes */ if (dev->desc->feature_len <= bit / CHAR_BIT) { assert(dev->desc->config_len == 0); - dev->desc->feature_len = (bit / CHAR_BIT) + 1; + dev->feature_len = dev->desc->feature_len = (bit/CHAR_BIT) + 1; } features[bit / CHAR_BIT] |= (1 << (bit % CHAR_BIT)); @@ -1259,6 +1264,8 @@ static struct device *new_device(const char *name, u16 type, int fd, dev->name = name; dev->vq = NULL; dev->ready = NULL; + dev->feature_len = 0; + dev->num_vq = 0; /* Append to device list. Prepending to a single-linked list is * easier, but the user expects the devices to be arranged on the bus -- cgit v1.1 From 56739c802ca845435f60e909104637880e14c769 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:26:59 -0600 Subject: lguest: cleanup passing of /dev/lguest fd around example launcher. We hand the /dev/lguest fd everywhere; it's far neater to just make it a global (it already is, in fact, hidden in the waker_fds struct). Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 102 +++++++++++++++++++----------------------- 1 file changed, 47 insertions(+), 55 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index e65d6cb..1a2b906 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -79,7 +79,6 @@ static bool verbose; /* File descriptors for the Waker. */ struct { int pipe[2]; - int lguest_fd; } waker_fds; /* The pointer to the start of guest memory. */ @@ -89,6 +88,8 @@ static unsigned long guest_limit, guest_max; /* The pipe for signal hander to write to. */ static int timeoutpipe[2]; static unsigned int timeout_usec = 500; +/* The /dev/lguest file descriptor. */ +static int lguest_fd; /* a per-cpu variable indicating whose vcpu is currently running */ static unsigned int __thread cpu_id; @@ -139,7 +140,7 @@ struct device /* If handle_input is set, it wants to be called when this file * descriptor is ready. */ int fd; - bool (*handle_input)(int fd, struct device *me); + bool (*handle_input)(struct device *me); /* Any queues attached to this device */ struct virtqueue *vq; @@ -169,7 +170,7 @@ struct virtqueue u16 last_avail_idx; /* The routine to call when the Guest pings us, or timeout. */ - void (*handle_output)(int fd, struct virtqueue *me, bool timeout); + void (*handle_output)(struct virtqueue *me, bool timeout); /* Outstanding buffers */ unsigned int inflight; @@ -509,21 +510,16 @@ static void concat(char *dst, char *args[]) * saw the arguments it expects when we looked at initialize() in lguest_user.c: * the base of Guest "physical" memory, the top physical page to allow and the * entry point for the Guest. */ -static int tell_kernel(unsigned long start) +static void tell_kernel(unsigned long start) { unsigned long args[] = { LHREQ_INITIALIZE, (unsigned long)guest_base, guest_limit / getpagesize(), start }; - int fd; - verbose("Guest: %p - %p (%#lx)\n", guest_base, guest_base + guest_limit, guest_limit); - fd = open_or_die("/dev/lguest", O_RDWR); - if (write(fd, args, sizeof(args)) < 0) + lguest_fd = open_or_die("/dev/lguest", O_RDWR); + if (write(lguest_fd, args, sizeof(args)) < 0) err(1, "Writing to /dev/lguest"); - - /* We return the /dev/lguest file descriptor to control this Guest */ - return fd; } /*:*/ @@ -583,21 +579,18 @@ static int waker(void *unused) } /* Send LHREQ_BREAK command to snap the Launcher out of it. */ - pwrite(waker_fds.lguest_fd, args, sizeof(args), cpu_id); + pwrite(lguest_fd, args, sizeof(args), cpu_id); } return 0; } /* This routine just sets up a pipe to the Waker process. */ -static void setup_waker(int lguest_fd) +static void setup_waker(void) { /* This pipe is closed when Launcher dies, telling Waker. */ if (pipe(waker_fds.pipe) != 0) err(1, "Creating pipe for Waker"); - /* Waker also needs to know the lguest fd */ - waker_fds.lguest_fd = lguest_fd; - if (clone(waker, malloc(4096) + 4096, CLONE_VM | SIGCHLD, NULL) == -1) err(1, "Creating Waker"); } @@ -727,7 +720,7 @@ static void add_used(struct virtqueue *vq, unsigned int head, int len) } /* This actually sends the interrupt for this virtqueue */ -static void trigger_irq(int fd, struct virtqueue *vq) +static void trigger_irq(struct virtqueue *vq) { unsigned long buf[] = { LHREQ_IRQ, vq->config.irq }; @@ -737,16 +730,15 @@ static void trigger_irq(int fd, struct virtqueue *vq) return; /* Send the Guest an interrupt tell them we used something up. */ - if (write(fd, buf, sizeof(buf)) != 0) + if (write(lguest_fd, buf, sizeof(buf)) != 0) err(1, "Triggering irq %i", vq->config.irq); } /* And here's the combo meal deal. Supersize me! */ -static void add_used_and_trigger(int fd, struct virtqueue *vq, - unsigned int head, int len) +static void add_used_and_trigger(struct virtqueue *vq, unsigned head, int len) { add_used(vq, head, len); - trigger_irq(fd, vq); + trigger_irq(vq); } /* @@ -770,7 +762,7 @@ struct console_abort }; /* This is the routine which handles console input (ie. stdin). */ -static bool handle_console_input(int fd, struct device *dev) +static bool handle_console_input(struct device *dev) { int len; unsigned int head, in_num, out_num; @@ -804,7 +796,7 @@ static bool handle_console_input(int fd, struct device *dev) } /* Tell the Guest about the new input. */ - add_used_and_trigger(fd, dev->vq, head, len); + add_used_and_trigger(dev->vq, head, len); /* Three ^C within one second? Exit. * @@ -824,7 +816,7 @@ static bool handle_console_input(int fd, struct device *dev) close(waker_fds.pipe[1]); /* Just in case Waker is blocked in BREAK, send * unbreak now. */ - write(fd, args, sizeof(args)); + write(lguest_fd, args, sizeof(args)); exit(2); } abort->count = 0; @@ -839,7 +831,7 @@ static bool handle_console_input(int fd, struct device *dev) /* Handling output for console is simple: we just get all the output buffers * and write them to stdout. */ -static void handle_console_output(int fd, struct virtqueue *vq, bool timeout) +static void handle_console_output(struct virtqueue *vq, bool timeout) { unsigned int head, out, in; int len; @@ -850,7 +842,7 @@ static void handle_console_output(int fd, struct virtqueue *vq, bool timeout) if (in) errx(1, "Input buffers in output queue?"); len = writev(STDOUT_FILENO, iov, out); - add_used_and_trigger(fd, vq, head, len); + add_used_and_trigger(vq, head, len); } } @@ -879,7 +871,7 @@ static void block_vq(struct virtqueue *vq) * and write them (ignoring the first element) to this device's file descriptor * (/dev/net/tun). */ -static void handle_net_output(int fd, struct virtqueue *vq, bool timeout) +static void handle_net_output(struct virtqueue *vq, bool timeout) { unsigned int head, out, in, num = 0; int len; @@ -893,7 +885,7 @@ static void handle_net_output(int fd, struct virtqueue *vq, bool timeout) len = writev(vq->dev->fd, iov, out); if (len < 0) err(1, "Writing network packet to tun"); - add_used_and_trigger(fd, vq, head, len); + add_used_and_trigger(vq, head, len); num++; } @@ -917,7 +909,7 @@ static void handle_net_output(int fd, struct virtqueue *vq, bool timeout) /* This is where we handle a packet coming in from the tun device to our * Guest. */ -static bool handle_tun_input(int fd, struct device *dev) +static bool handle_tun_input(struct device *dev) { unsigned int head, in_num, out_num; int len; @@ -946,7 +938,7 @@ static bool handle_tun_input(int fd, struct device *dev) err(1, "reading network"); /* Tell the Guest about the new packet. */ - add_used_and_trigger(fd, dev->vq, head, len); + add_used_and_trigger(dev->vq, head, len); verbose("tun input packet len %i [%02x %02x] (%s)\n", len, ((u8 *)iov[1].iov_base)[0], ((u8 *)iov[1].iov_base)[1], @@ -959,18 +951,18 @@ static bool handle_tun_input(int fd, struct device *dev) /*L:215 This is the callback attached to the network and console input * virtqueues: it ensures we try again, in case we stopped console or net * delivery because Guest didn't have any buffers. */ -static void enable_fd(int fd, struct virtqueue *vq, bool timeout) +static void enable_fd(struct virtqueue *vq, bool timeout) { add_device_fd(vq->dev->fd); /* Snap the Waker out of its select loop. */ write(waker_fds.pipe[1], "", 1); } -static void net_enable_fd(int fd, struct virtqueue *vq, bool timeout) +static void net_enable_fd(struct virtqueue *vq, bool timeout) { /* We don't need to know again when Guest refills receive buffer. */ vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY; - enable_fd(fd, vq, timeout); + enable_fd(vq, timeout); } /* When the Guest tells us they updated the status field, we handle it. */ @@ -1011,7 +1003,7 @@ static void update_device_status(struct device *dev) } /* This is the generic routine we call when the Guest uses LHCALL_NOTIFY. */ -static void handle_output(int fd, unsigned long addr) +static void handle_output(unsigned long addr) { struct device *i; struct virtqueue *vq; @@ -1039,7 +1031,7 @@ static void handle_output(int fd, unsigned long addr) if (strcmp(vq->dev->name, "console") != 0) verbose("Output to %s\n", vq->dev->name); if (vq->handle_output) - vq->handle_output(fd, vq, false); + vq->handle_output(vq, false); return; } } @@ -1053,7 +1045,7 @@ static void handle_output(int fd, unsigned long addr) strnlen(from_guest_phys(addr), guest_limit - addr)); } -static void handle_timeout(int fd) +static void handle_timeout(void) { char buf[32]; struct device *i; @@ -1071,14 +1063,14 @@ static void handle_timeout(int fd) vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY; vq->blocked = false; if (vq->handle_output) - vq->handle_output(fd, vq, true); + vq->handle_output(vq, true); } } } /* This is called when the Waker wakes us up: check for incoming file * descriptors. */ -static void handle_input(int fd) +static void handle_input(void) { /* select() wants a zeroed timeval to mean "don't wait". */ struct timeval poll = { .tv_sec = 0, .tv_usec = 0 }; @@ -1100,7 +1092,7 @@ static void handle_input(int fd) * descriptors and a method of handling them. */ for (i = devices.dev; i; i = i->next) { if (i->handle_input && FD_ISSET(i->fd, &fds)) { - if (i->handle_input(fd, i)) + if (i->handle_input(i)) continue; /* If handle_input() returns false, it means we @@ -1114,7 +1106,7 @@ static void handle_input(int fd) /* Is this the timeout fd? */ if (FD_ISSET(timeoutpipe[0], &fds)) - handle_timeout(fd); + handle_timeout(); } } @@ -1163,7 +1155,7 @@ static struct lguest_device_desc *new_dev_desc(u16 type) /* Each device descriptor is followed by the description of its virtqueues. We * specify how many descriptors the virtqueue is to have. */ static void add_virtqueue(struct device *dev, unsigned int num_descs, - void (*handle_output)(int, struct virtqueue *, bool)) + void (*handle_output)(struct virtqueue *, bool)) { unsigned int pages; struct virtqueue **i, *vq = malloc(sizeof(*vq)); @@ -1249,7 +1241,7 @@ static void set_config(struct device *dev, unsigned len, const void *conf) * * See what I mean about userspace being boring? */ static struct device *new_device(const char *name, u16 type, int fd, - bool (*handle_input)(int, struct device *)) + bool (*handle_input)(struct device *)) { struct device *dev = malloc(sizeof(*dev)); @@ -1678,7 +1670,7 @@ static int io_thread(void *_dev) /* Now we've seen the I/O thread, we return to the Launcher to see what happens * when that thread tells us it's completed some I/O. */ -static bool handle_io_finish(int fd, struct device *dev) +static bool handle_io_finish(struct device *dev) { char c; @@ -1688,12 +1680,12 @@ static bool handle_io_finish(int fd, struct device *dev) exit(1); /* It did some work, so trigger the irq. */ - trigger_irq(fd, dev->vq); + trigger_irq(dev->vq); return true; } /* When the Guest submits some I/O, we just need to wake the I/O thread. */ -static void handle_virtblk_output(int fd, struct virtqueue *vq, bool timeout) +static void handle_virtblk_output(struct virtqueue *vq, bool timeout) { struct vblk_info *vblk = vq->dev->priv; char c = 0; @@ -1771,7 +1763,7 @@ static void setup_block_file(const char *filename) * console is the reverse. * * The same logic applies, however. */ -static bool handle_rng_input(int fd, struct device *dev) +static bool handle_rng_input(struct device *dev) { int len; unsigned int head, in_num, out_num, totlen = 0; @@ -1800,7 +1792,7 @@ static bool handle_rng_input(int fd, struct device *dev) } /* Tell the Guest about the new input. */ - add_used_and_trigger(fd, dev->vq, head, totlen); + add_used_and_trigger(dev->vq, head, totlen); /* Everything went OK! */ return true; @@ -1841,7 +1833,7 @@ static void __attribute__((noreturn)) restart_guest(void) /*L:220 Finally we reach the core of the Launcher which runs the Guest, serves * its input and output, and finally, lays it to rest. */ -static void __attribute__((noreturn)) run_guest(int lguest_fd) +static void __attribute__((noreturn)) run_guest(void) { for (;;) { unsigned long args[] = { LHREQ_BREAK, 0 }; @@ -1855,7 +1847,7 @@ static void __attribute__((noreturn)) run_guest(int lguest_fd) /* One unsigned long means the Guest did HCALL_NOTIFY */ if (readval == sizeof(notify_addr)) { verbose("Notify on address %#lx\n", notify_addr); - handle_output(lguest_fd, notify_addr); + handle_output(notify_addr); continue; /* ENOENT means the Guest died. Reading tells us why. */ } else if (errno == ENOENT) { @@ -1875,7 +1867,7 @@ static void __attribute__((noreturn)) run_guest(int lguest_fd) continue; /* Service input, then unset the BREAK to release the Waker. */ - handle_input(lguest_fd); + handle_input(); if (pwrite(lguest_fd, args, sizeof(args), cpu_id) < 0) err(1, "Resetting break"); } @@ -1911,8 +1903,8 @@ int main(int argc, char *argv[]) /* Memory, top-level pagetable, code startpoint and size of the * (optional) initrd. */ unsigned long mem = 0, start, initrd_size = 0; - /* Two temporaries and the /dev/lguest file descriptor. */ - int i, c, lguest_fd; + /* Two temporaries. */ + int i, c; /* The boot information for the Guest. */ struct boot_params *boot; /* If they specify an initrd file to load. */ @@ -2030,15 +2022,15 @@ int main(int argc, char *argv[]) /* We tell the kernel to initialize the Guest: this returns the open * /dev/lguest file descriptor. */ - lguest_fd = tell_kernel(start); + tell_kernel(start); /* We clone off a thread, which wakes the Launcher whenever one of the * input file descriptors needs attention. We call this the Waker, and * we'll cover it in a moment. */ - setup_waker(lguest_fd); + setup_waker(); /* Finally, run the Guest. This doesn't return. */ - run_guest(lguest_fd); + run_guest(); } /*:*/ -- cgit v1.1 From f7027c6387d0c3acf569845165ec7947e2083c82 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:00 -0600 Subject: lguest: get more serious about wmb() in example Launcher code Since the Launcher process runs the Guest, it doesn't have to be very serious about its barriers: the Guest isn't running while we are (Guest is UP). Before we change to use threads to service devices, we need to fix this. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 1a2b906..1e31d1e 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -182,9 +182,10 @@ struct virtqueue /* Remember the arguments to the program so we can "reboot" */ static char **main_args; -/* Since guest is UP and we don't run at the same time, we don't need barriers. - * But I include them in the code in case others copy it. */ -#define wmb() +/* We have to be careful with barriers: our devices are all run in separate + * threads and so we need to make sure that changes visible to the Guest happen + * in precise order. */ +#define wmb() __asm__ __volatile__("" : : : "memory") /* Convert an iovec element to the given type. * -- cgit v1.1 From ebf9a5a99c1a464afe0b4dfa64416fc8b273bc5c Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:01 -0600 Subject: lguest: remove invalid interrupt forcing logic. 20887611523e749d99cc7d64ff6c97d27529fbae (lguest: notify on empty) introduced lguest support for the VIRTIO_F_NOTIFY_ON_EMPTY flag, but in fact it turned on interrupts all the time. Because we always process one buffer at a time, the inflight count is always 0 when call trigger_irq and so we always ignore VRING_AVAIL_F_NO_INTERRUPT from the Guest. It should be looking to see if there are more buffers in the Guest's queue: if it's empty, then we force an interrupt. This makes little difference, since we usually have an empty queue; but that's the subject of another patch. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 1e31d1e..9f3240c 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -172,9 +172,6 @@ struct virtqueue /* The routine to call when the Guest pings us, or timeout. */ void (*handle_output)(struct virtqueue *me, bool timeout); - /* Outstanding buffers */ - unsigned int inflight; - /* Is this blocked awaiting a timer? */ bool blocked; }; @@ -699,7 +696,6 @@ static unsigned get_vq_desc(struct virtqueue *vq, errx(1, "Looped descriptor"); } while ((i = next_desc(vq, i)) != vq->vring.num); - vq->inflight++; return head; } @@ -717,7 +713,6 @@ static void add_used(struct virtqueue *vq, unsigned int head, int len) /* Make sure buffer is written before we update index. */ wmb(); vq->vring.used->idx++; - vq->inflight--; } /* This actually sends the interrupt for this virtqueue */ @@ -727,7 +722,7 @@ static void trigger_irq(struct virtqueue *vq) /* If they don't want an interrupt, don't send one, unless empty. */ if ((vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) - && vq->inflight) + && lg_last_avail(vq) != vq->vring.avail->idx) return; /* Send the Guest an interrupt tell them we used something up. */ @@ -1171,7 +1166,6 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, vq->next = NULL; vq->last_avail_idx = 0; vq->dev = dev; - vq->inflight = 0; vq->blocked = false; /* Initialize the configuration. */ -- cgit v1.1 From 2644f17d6c932929fd68cfec95691490947e0fd1 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:03 -0600 Subject: lguest: clean up example launcher compile flags. 18 months ago 5bbf89fc260830f3f58b331d946a16b39ad1ca2d changed to loading bzImages directly, and no longer manually ungzipping them, so we no longer need libz. Also, -m32 is useful for those on 64-bit platforms (and harmless on 32-bit). Reported-by: Ron Minnich Signed-off-by: Rusty Russell --- Documentation/lguest/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/Makefile b/Documentation/lguest/Makefile index 1f4f9e8..28c8cdf 100644 --- a/Documentation/lguest/Makefile +++ b/Documentation/lguest/Makefile @@ -1,6 +1,5 @@ # This creates the demonstration utility "lguest" which runs a Linux guest. -CFLAGS:=-Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include -I../../arch/x86/include -U_FORTIFY_SOURCE -LDLIBS:=-lz +CFLAGS:=-m32 -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include -I../../arch/x86/include -U_FORTIFY_SOURCE all: lguest -- cgit v1.1 From e606490c440900e50ccf73a54f6fc6150ff40815 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:04 -0600 Subject: lguest: clean up length-used value in example launcher The "len" field in the used ring for virtio indicates the number of bytes *written* to the buffer. This means the guest doesn't have to zero the buffers in advance as it always knows the used length. Erroneously, the console and network example code puts the length *read* into that field. The guest ignores it, but it's wrong. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 9f3240c..8704600 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -830,15 +830,14 @@ static bool handle_console_input(struct device *dev) static void handle_console_output(struct virtqueue *vq, bool timeout) { unsigned int head, out, in; - int len; struct iovec iov[vq->vring.num]; /* Keep getting output buffers from the Guest until we run out. */ while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) { if (in) errx(1, "Input buffers in output queue?"); - len = writev(STDOUT_FILENO, iov, out); - add_used_and_trigger(vq, head, len); + writev(STDOUT_FILENO, iov, out); + add_used_and_trigger(vq, head, 0); } } @@ -870,7 +869,6 @@ static void block_vq(struct virtqueue *vq) static void handle_net_output(struct virtqueue *vq, bool timeout) { unsigned int head, out, in, num = 0; - int len; struct iovec iov[vq->vring.num]; static int last_timeout_num; @@ -878,10 +876,9 @@ static void handle_net_output(struct virtqueue *vq, bool timeout) while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) { if (in) errx(1, "Input buffers in output queue?"); - len = writev(vq->dev->fd, iov, out); - if (len < 0) + if (writev(vq->dev->fd, iov, out) < 0) err(1, "Writing network packet to tun"); - add_used_and_trigger(vq, head, len); + add_used_and_trigger(vq, head, 0); num++; } -- cgit v1.1 From 7b5c806c35f6ff76b2e36a8b5b1513c8a83fcff7 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:05 -0600 Subject: lguest: fix writev returning short on console output I've never seen it here, but I can't find anywhere that says writev will write everything. Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 8704600..02fa524 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -836,7 +836,12 @@ static void handle_console_output(struct virtqueue *vq, bool timeout) while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) { if (in) errx(1, "Input buffers in output queue?"); - writev(STDOUT_FILENO, iov, out); + while (!iov_empty(iov, out)) { + int len = writev(STDOUT_FILENO, iov, out); + if (len <= 0) + err(1, "Write to stdout gave %i", len); + iov_consume(iov, out, len); + } add_used_and_trigger(vq, head, 0); } } -- cgit v1.1 From acdd0b6292b282c4511897ac2691a47befbf1c6a Mon Sep 17 00:00:00 2001 From: Matias Zabaljauregui Date: Fri, 12 Jun 2009 22:27:07 -0600 Subject: lguest: PAE support This version requires that host and guest have the same PAE status. NX cap is not offered to the guest, yet. Signed-off-by: Matias Zabaljauregui Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.txt | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.txt b/Documentation/lguest/lguest.txt index 28c7473..efb3a6a 100644 --- a/Documentation/lguest/lguest.txt +++ b/Documentation/lguest/lguest.txt @@ -37,7 +37,6 @@ Running Lguest: "Paravirtualized guest support" = Y "Lguest guest support" = Y "High Memory Support" = off/4GB - "PAE (Physical Address Extension) Support" = N "Alignment value to which kernel should be aligned" = 0x100000 (CONFIG_PARAVIRT=y, CONFIG_LGUEST_GUEST=y, CONFIG_HIGHMEM64G=n and CONFIG_PHYSICAL_ALIGN=0x100000) -- cgit v1.1 From 659a0e6633567246edcb7bd400c7e2bece9237d9 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:10 -0600 Subject: lguest: have example Launcher service all devices in separate threads Currently lguest has three threads: the main Launcher thread, a Waker thread, and a thread for the block device (because synchronous block was simply too painful to bear). The Waker selects() on all the input file descriptors (eg. stdin, net devices, pipe to the block thread) and when one becomes readable it calls into the kernel to kick the Launcher thread out into userspace, which repeats the poll, services the device(s), and then tells the kernel to release the Waker before re-entering the kernel to run the Guest. Also, to make a slightly-decent network transmit routine, the Launcher would suppress further network interrupts while it set a timer: that signal handler would write to a pipe, which would rouse the Waker which would prod the Launcher out of the kernel to check the network device again. Now we can convert all our virtqueues to separate threads: each one has a separate eventfd for when the Guest pokes the device, and can trigger interrupts in the Guest directly. The linecount shows how much this simplifies, but to really bring it home, here's an strace analysis of single Guest->Host ping before: * Guest sends packet, notifies xmit vq, return control to Launcher * Launcher clears notification flag on xmit ring * Launcher writes packet to TUN device writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\366\r\224`\2058\272m\224vf\274\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108 * Launcher sets up interrupt for Guest (xmit ring is empty) write(10, "\2\0\0\0\3\0\0\0", 8) = 0 * Launcher sets up timer for interrupt mitigation setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 505}}, NULL) = 0 * Launcher re-runs guest pread64(10, 0xbfa5f4d4, 4, 0) ... * Waker notices reply packet in tun device (it was in select) select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [4]) * Waker kicks Launcher out of guest: pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0 * Launcher returns from running guest: ... = -1 EAGAIN (Resource temporarily unavailable) * Launcher looks at input fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [4], left {0, 0}) * Launcher reads pong from tun device: readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\272m\224vf\274\366\r\224`\2058\10\0E\0\0T\364\26\0\0@"..., 1518}], 2) = 108 * Launcher injects guest notification: write(10, "\2\0\0\0\2\0\0\0", 8) = 0 * Launcher rechecks fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout) * Launcher clears Waker: pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0 * Launcher reruns Guest: pread64(10, 0xbfa5f4d4, 4, 0) = ? ERESTARTSYS (To be restarted) * Signal comes in, uses pipe to wake up Launcher: --- SIGALRM (Alarm clock) @ 0 (0) --- write(8, "\0", 1) = 1 sigreturn() = ? (mask now []) * Waker sees write on pipe: select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [6]) * Waker kicks Launcher out of Guest: pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0 * Launcher exits from kernel: pread64(10, 0xbfa5f4d4, 4, 0) = -1 EAGAIN (Resource temporarily unavailable) * Launcher looks to see what fd woke it: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [6], left {0, 0}) * Launcher reads timeout fd, sets notification flag on xmit ring read(6, "\0", 32) = 1 * Launcher rechecks fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout) * Launcher clears Waker: pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0 * Launcher resumes Guest: pread64(10, "\0p\0\4", 4, 0) .... strace analysis of single Guest->Host ping after: * Guest sends packet, notifies xmit vq, creates event on eventfd. * Network xmit thread wakes from read on eventfd: read(7, "\1\0\0\0\0\0\0\0", 8) = 8 * Network xmit thread writes packet to TUN device writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"J\217\232FI\37j\27\375\276\0\304\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108 * Network recv thread wakes up from read on tunfd: readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"j\27\375\276\0\304J\217\232FI\37\10\0E\0\0TiO\0\0@\1\214"..., 1518}], 2) = 108 * Network recv thread sets up interrupt for the Guest write(6, "\2\0\0\0\2\0\0\0", 8) = 0 * Network recv thread goes back to reading tunfd 13:39:42.460285 readv(4, * Network xmit thread sets up interrupt for Guest (xmit ring is empty) write(6, "\2\0\0\0\3\0\0\0", 8) = 0 * Network xmit thread goes back to reading from eventfd read(7, Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 833 +++++++++++++----------------------------- 1 file changed, 259 insertions(+), 574 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 02fa524..5470b8e 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -59,7 +60,6 @@ typedef uint8_t u8; /*:*/ #define PAGE_PRESENT 0x7 /* Present, RW, Execute */ -#define NET_PEERNUM 1 #define BRIDGE_PFX "bridge:" #ifndef SIOCBRADDIF #define SIOCBRADDIF 0x89a2 /* add interface to bridge */ @@ -76,18 +76,10 @@ static bool verbose; do { if (verbose) printf(args); } while(0) /*:*/ -/* File descriptors for the Waker. */ -struct { - int pipe[2]; -} waker_fds; - /* The pointer to the start of guest memory. */ static void *guest_base; /* The maximum guest physical address allowed, and maximum possible. */ static unsigned long guest_limit, guest_max; -/* The pipe for signal hander to write to. */ -static int timeoutpipe[2]; -static unsigned int timeout_usec = 500; /* The /dev/lguest file descriptor. */ static int lguest_fd; @@ -97,11 +89,6 @@ static unsigned int __thread cpu_id; /* This is our list of devices. */ struct device_list { - /* Summary information about the devices in our list: ready to pass to - * select() to ask which need servicing.*/ - fd_set infds; - int max_infd; - /* Counter to assign interrupt numbers. */ unsigned int next_irq; @@ -137,16 +124,11 @@ struct device /* The name of this device, for --verbose. */ const char *name; - /* If handle_input is set, it wants to be called when this file - * descriptor is ready. */ - int fd; - bool (*handle_input)(struct device *me); - /* Any queues attached to this device */ struct virtqueue *vq; - /* Handle status being finalized (ie. feature bits stable). */ - void (*ready)(struct device *me); + /* Is it operational */ + bool running; /* Device-specific data. */ void *priv; @@ -169,16 +151,20 @@ struct virtqueue /* Last available index we saw. */ u16 last_avail_idx; - /* The routine to call when the Guest pings us, or timeout. */ - void (*handle_output)(struct virtqueue *me, bool timeout); + /* Eventfd where Guest notifications arrive. */ + int eventfd; - /* Is this blocked awaiting a timer? */ - bool blocked; + /* Function for the thread which is servicing this virtqueue. */ + void (*service)(struct virtqueue *vq); + pid_t thread; }; /* Remember the arguments to the program so we can "reboot" */ static char **main_args; +/* The original tty settings to restore on exit. */ +static struct termios orig_term; + /* We have to be careful with barriers: our devices are all run in separate * threads and so we need to make sure that changes visible to the Guest happen * in precise order. */ @@ -521,78 +507,6 @@ static void tell_kernel(unsigned long start) } /*:*/ -static void add_device_fd(int fd) -{ - FD_SET(fd, &devices.infds); - if (fd > devices.max_infd) - devices.max_infd = fd; -} - -/*L:200 - * The Waker. - * - * With console, block and network devices, we can have lots of input which we - * need to process. We could try to tell the kernel what file descriptors to - * watch, but handing a file descriptor mask through to the kernel is fairly - * icky. - * - * Instead, we clone off a thread which watches the file descriptors and writes - * the LHREQ_BREAK command to the /dev/lguest file descriptor to tell the Host - * stop running the Guest. This causes the Launcher to return from the - * /dev/lguest read with -EAGAIN, where it will write to /dev/lguest to reset - * the LHREQ_BREAK and wake us up again. - * - * This, of course, is merely a different *kind* of icky. - * - * Given my well-known antipathy to threads, I'd prefer to use processes. But - * it's easier to share Guest memory with threads, and trivial to share the - * devices.infds as the Launcher changes it. - */ -static int waker(void *unused) -{ - /* Close the write end of the pipe: only the Launcher has it open. */ - close(waker_fds.pipe[1]); - - for (;;) { - fd_set rfds = devices.infds; - unsigned long args[] = { LHREQ_BREAK, 1 }; - unsigned int maxfd = devices.max_infd; - - /* We also listen to the pipe from the Launcher. */ - FD_SET(waker_fds.pipe[0], &rfds); - if (waker_fds.pipe[0] > maxfd) - maxfd = waker_fds.pipe[0]; - - /* Wait until input is ready from one of the devices. */ - select(maxfd+1, &rfds, NULL, NULL, NULL); - - /* Message from Launcher? */ - if (FD_ISSET(waker_fds.pipe[0], &rfds)) { - char c; - /* If this fails, then assume Launcher has exited. - * Don't do anything on exit: we're just a thread! */ - if (read(waker_fds.pipe[0], &c, 1) != 1) - _exit(0); - continue; - } - - /* Send LHREQ_BREAK command to snap the Launcher out of it. */ - pwrite(lguest_fd, args, sizeof(args), cpu_id); - } - return 0; -} - -/* This routine just sets up a pipe to the Waker process. */ -static void setup_waker(void) -{ - /* This pipe is closed when Launcher dies, telling Waker. */ - if (pipe(waker_fds.pipe) != 0) - err(1, "Creating pipe for Waker"); - - if (clone(waker, malloc(4096) + 4096, CLONE_VM | SIGCHLD, NULL) == -1) - err(1, "Creating Waker"); -} - /* * Device Handling. * @@ -642,25 +556,27 @@ static unsigned next_desc(struct virtqueue *vq, unsigned int i) * number of output then some number of input descriptors, it's actually two * iovecs, but we pack them into one and note how many of each there were. * - * This function returns the descriptor number found, or vq->vring.num (which - * is never a valid descriptor number) if none was found. */ -static unsigned get_vq_desc(struct virtqueue *vq, - struct iovec iov[], - unsigned int *out_num, unsigned int *in_num) + * This function returns the descriptor number found. */ +static unsigned wait_for_vq_desc(struct virtqueue *vq, + struct iovec iov[], + unsigned int *out_num, unsigned int *in_num) { unsigned int i, head; - u16 last_avail; + u16 last_avail = lg_last_avail(vq); + + while (last_avail == vq->vring.avail->idx) { + u64 event; + + /* Nothing new? Wait for eventfd to tell us they refilled. */ + if (read(vq->eventfd, &event, sizeof(event)) != sizeof(event)) + errx(1, "Event read failed?"); + } /* Check it isn't doing very strange things with descriptor numbers. */ - last_avail = lg_last_avail(vq); if ((u16)(vq->vring.avail->idx - last_avail) > vq->vring.num) errx(1, "Guest moved used index from %u to %u", last_avail, vq->vring.avail->idx); - /* If there's nothing new since last we looked, return invalid. */ - if (vq->vring.avail->idx == last_avail) - return vq->vring.num; - /* Grab the next descriptor number they're advertising, and increment * the index we've seen. */ head = vq->vring.avail->ring[last_avail % vq->vring.num]; @@ -740,15 +656,7 @@ static void add_used_and_trigger(struct virtqueue *vq, unsigned head, int len) /* * The Console * - * Here is the input terminal setting we save, and the routine to restore them - * on exit so the user gets their terminal back. */ -static struct termios orig_term; -static void restore_term(void) -{ - tcsetattr(STDIN_FILENO, TCSANOW, &orig_term); -} - -/* We associate some data with the console for our exit hack. */ + * We associate some data with the console for our exit hack. */ struct console_abort { /* How many times have they hit ^C? */ @@ -758,245 +666,235 @@ struct console_abort }; /* This is the routine which handles console input (ie. stdin). */ -static bool handle_console_input(struct device *dev) +static void console_input(struct virtqueue *vq) { int len; unsigned int head, in_num, out_num; - struct iovec iov[dev->vq->vring.num]; - struct console_abort *abort = dev->priv; - - /* First we need a console buffer from the Guests's input virtqueue. */ - head = get_vq_desc(dev->vq, iov, &out_num, &in_num); - - /* If they're not ready for input, stop listening to this file - * descriptor. We'll start again once they add an input buffer. */ - if (head == dev->vq->vring.num) - return false; + struct console_abort *abort = vq->dev->priv; + struct iovec iov[vq->vring.num]; + /* Make sure there's a descriptor waiting. */ + head = wait_for_vq_desc(vq, iov, &out_num, &in_num); if (out_num) errx(1, "Output buffers in console in queue?"); - /* This is why we convert to iovecs: the readv() call uses them, and so - * it reads straight into the Guest's buffer. */ - len = readv(dev->fd, iov, in_num); + /* Read it in. */ + len = readv(STDIN_FILENO, iov, in_num); if (len <= 0) { - /* This implies that the console is closed, is /dev/null, or - * something went terribly wrong. */ + /* Ran out of input? */ warnx("Failed to get console input, ignoring console."); - /* Put the input terminal back. */ - restore_term(); - /* Remove callback from input vq, so it doesn't restart us. */ - dev->vq->handle_output = NULL; - /* Stop listening to this fd: don't call us again. */ - return false; + /* For simplicity, dying threads kill the whole Launcher. So + * just nap here. */ + for (;;) + pause(); } - /* Tell the Guest about the new input. */ - add_used_and_trigger(dev->vq, head, len); + add_used_and_trigger(vq, head, len); /* Three ^C within one second? Exit. * - * This is such a hack, but works surprisingly well. Each ^C has to be - * in a buffer by itself, so they can't be too fast. But we check that - * we get three within about a second, so they can't be too slow. */ - if (len == 1 && ((char *)iov[0].iov_base)[0] == 3) { - if (!abort->count++) - gettimeofday(&abort->start, NULL); - else if (abort->count == 3) { - struct timeval now; - gettimeofday(&now, NULL); - if (now.tv_sec <= abort->start.tv_sec+1) { - unsigned long args[] = { LHREQ_BREAK, 0 }; - /* Close the fd so Waker will know it has to - * exit. */ - close(waker_fds.pipe[1]); - /* Just in case Waker is blocked in BREAK, send - * unbreak now. */ - write(lguest_fd, args, sizeof(args)); - exit(2); - } - abort->count = 0; - } - } else - /* Any other key resets the abort counter. */ + * This is such a hack, but works surprisingly well. Each ^C has to + * be in a buffer by itself, so they can't be too fast. But we check + * that we get three within about a second, so they can't be too + * slow. */ + if (len != 1 || ((char *)iov[0].iov_base)[0] != 3) { abort->count = 0; + return; + } - /* Everything went OK! */ - return true; + abort->count++; + if (abort->count == 1) + gettimeofday(&abort->start, NULL); + else if (abort->count == 3) { + struct timeval now; + gettimeofday(&now, NULL); + /* Kill all Launcher processes with SIGINT, like normal ^C */ + if (now.tv_sec <= abort->start.tv_sec+1) + kill(0, SIGINT); + abort->count = 0; + } } -/* Handling output for console is simple: we just get all the output buffers - * and write them to stdout. */ -static void handle_console_output(struct virtqueue *vq, bool timeout) +/* This is the routine which handles console output (ie. stdout). */ +static void console_output(struct virtqueue *vq) { unsigned int head, out, in; struct iovec iov[vq->vring.num]; - /* Keep getting output buffers from the Guest until we run out. */ - while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) { - if (in) - errx(1, "Input buffers in output queue?"); - while (!iov_empty(iov, out)) { - int len = writev(STDOUT_FILENO, iov, out); - if (len <= 0) - err(1, "Write to stdout gave %i", len); - iov_consume(iov, out, len); - } - add_used_and_trigger(vq, head, 0); + head = wait_for_vq_desc(vq, iov, &out, &in); + if (in) + errx(1, "Input buffers in console output queue?"); + while (!iov_empty(iov, out)) { + int len = writev(STDOUT_FILENO, iov, out); + if (len <= 0) + err(1, "Write to stdout gave %i", len); + iov_consume(iov, out, len); } -} - -/* This is called when we no longer want to hear about Guest changes to a - * virtqueue. This is more efficient in high-traffic cases, but it means we - * have to set a timer to check if any more changes have occurred. */ -static void block_vq(struct virtqueue *vq) -{ - struct itimerval itm; - - vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY; - vq->blocked = true; - - itm.it_interval.tv_sec = 0; - itm.it_interval.tv_usec = 0; - itm.it_value.tv_sec = 0; - itm.it_value.tv_usec = timeout_usec; - - setitimer(ITIMER_REAL, &itm, NULL); + add_used_and_trigger(vq, head, 0); } /* * The Network * * Handling output for network is also simple: we get all the output buffers - * and write them (ignoring the first element) to this device's file descriptor - * (/dev/net/tun). + * and write them to /dev/net/tun. */ -static void handle_net_output(struct virtqueue *vq, bool timeout) +struct net_info { + int tunfd; +}; + +static void net_output(struct virtqueue *vq) { - unsigned int head, out, in, num = 0; + struct net_info *net_info = vq->dev->priv; + unsigned int head, out, in; struct iovec iov[vq->vring.num]; - static int last_timeout_num; - - /* Keep getting output buffers from the Guest until we run out. */ - while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) { - if (in) - errx(1, "Input buffers in output queue?"); - if (writev(vq->dev->fd, iov, out) < 0) - err(1, "Writing network packet to tun"); - add_used_and_trigger(vq, head, 0); - num++; - } - /* Block further kicks and set up a timer if we saw anything. */ - if (!timeout && num) - block_vq(vq); - - /* We never quite know how long should we wait before we check the - * queue again for more packets. We start at 500 microseconds, and if - * we get fewer packets than last time, we assume we made the timeout - * too small and increase it by 10 microseconds. Otherwise, we drop it - * by one microsecond every time. It seems to work well enough. */ - if (timeout) { - if (num < last_timeout_num) - timeout_usec += 10; - else if (timeout_usec > 1) - timeout_usec--; - last_timeout_num = num; - } + head = wait_for_vq_desc(vq, iov, &out, &in); + if (in) + errx(1, "Input buffers in net output queue?"); + if (writev(net_info->tunfd, iov, out) < 0) + errx(1, "Write to tun failed?"); + add_used_and_trigger(vq, head, 0); } -/* This is where we handle a packet coming in from the tun device to our +/* This is where we handle packets coming in from the tun device to our * Guest. */ -static bool handle_tun_input(struct device *dev) +static void net_input(struct virtqueue *vq) { - unsigned int head, in_num, out_num; int len; - struct iovec iov[dev->vq->vring.num]; - - /* First we need a network buffer from the Guests's recv virtqueue. */ - head = get_vq_desc(dev->vq, iov, &out_num, &in_num); - if (head == dev->vq->vring.num) { - /* Now, it's expected that if we try to send a packet too - * early, the Guest won't be ready yet. Wait until the device - * status says it's ready. */ - /* FIXME: Actually want DRIVER_ACTIVE here. */ - - /* Now tell it we want to know if new things appear. */ - dev->vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY; - wmb(); - - /* We'll turn this back on if input buffers are registered. */ - return false; - } else if (out_num) - errx(1, "Output buffers in network recv queue?"); - - /* Read the packet from the device directly into the Guest's buffer. */ - len = readv(dev->fd, iov, in_num); + unsigned int head, out, in; + struct iovec iov[vq->vring.num]; + struct net_info *net_info = vq->dev->priv; + + head = wait_for_vq_desc(vq, iov, &out, &in); + if (out) + errx(1, "Output buffers in net input queue?"); + len = readv(net_info->tunfd, iov, in); if (len <= 0) - err(1, "reading network"); + err(1, "Failed to read from tun."); + add_used_and_trigger(vq, head, len); +} - /* Tell the Guest about the new packet. */ - add_used_and_trigger(dev->vq, head, len); +/* This is the helper to create threads. */ +static int do_thread(void *_vq) +{ + struct virtqueue *vq = _vq; - verbose("tun input packet len %i [%02x %02x] (%s)\n", len, - ((u8 *)iov[1].iov_base)[0], ((u8 *)iov[1].iov_base)[1], - head != dev->vq->vring.num ? "sent" : "discarded"); + for (;;) + vq->service(vq); + return 0; +} - /* All good. */ - return true; +/* When a child dies, we kill our entire process group with SIGTERM. This + * also has the side effect that the shell restores the console for us! */ +static void kill_launcher(int signal) +{ + kill(0, SIGTERM); } -/*L:215 This is the callback attached to the network and console input - * virtqueues: it ensures we try again, in case we stopped console or net - * delivery because Guest didn't have any buffers. */ -static void enable_fd(struct virtqueue *vq, bool timeout) +static void reset_device(struct device *dev) { - add_device_fd(vq->dev->fd); - /* Snap the Waker out of its select loop. */ - write(waker_fds.pipe[1], "", 1); + struct virtqueue *vq; + + verbose("Resetting device %s\n", dev->name); + + /* Clear any features they've acked. */ + memset(get_feature_bits(dev) + dev->feature_len, 0, dev->feature_len); + + /* We're going to be explicitly killing threads, so ignore them. */ + signal(SIGCHLD, SIG_IGN); + + /* Zero out the virtqueues, get rid of their threads */ + for (vq = dev->vq; vq; vq = vq->next) { + if (vq->thread != (pid_t)-1) { + kill(vq->thread, SIGTERM); + waitpid(vq->thread, NULL, 0); + vq->thread = (pid_t)-1; + } + memset(vq->vring.desc, 0, + vring_size(vq->config.num, LGUEST_VRING_ALIGN)); + lg_last_avail(vq) = 0; + } + dev->running = false; + + /* Now we care if threads die. */ + signal(SIGCHLD, (void *)kill_launcher); } -static void net_enable_fd(struct virtqueue *vq, bool timeout) +static void create_thread(struct virtqueue *vq) { - /* We don't need to know again when Guest refills receive buffer. */ - vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY; - enable_fd(vq, timeout); + /* Create stack for thread and run it. Since stack grows + * upwards, we point the stack pointer to the end of this + * region. */ + char *stack = malloc(32768); + unsigned long args[] = { LHREQ_EVENTFD, + vq->config.pfn*getpagesize(), 0 }; + + /* Create a zero-initialized eventfd. */ + vq->eventfd = eventfd(0, 0); + if (vq->eventfd < 0) + err(1, "Creating eventfd"); + args[2] = vq->eventfd; + + /* Attach an eventfd to this virtqueue: it will go off + * when the Guest does an LHCALL_NOTIFY for this vq. */ + if (write(lguest_fd, &args, sizeof(args)) != 0) + err(1, "Attaching eventfd"); + + /* CLONE_VM: because it has to access the Guest memory, and + * SIGCHLD so we get a signal if it dies. */ + vq->thread = clone(do_thread, stack + 32768, CLONE_VM | SIGCHLD, vq); + if (vq->thread == (pid_t)-1) + err(1, "Creating clone"); + /* We close our local copy, now the child has it. */ + close(vq->eventfd); } -/* When the Guest tells us they updated the status field, we handle it. */ -static void update_device_status(struct device *dev) +static void start_device(struct device *dev) { + unsigned int i; struct virtqueue *vq; - /* This is a reset. */ - if (dev->desc->status == 0) { - verbose("Resetting device %s\n", dev->name); + verbose("Device %s OK: offered", dev->name); + for (i = 0; i < dev->feature_len; i++) + verbose(" %02x", get_feature_bits(dev)[i]); + verbose(", accepted"); + for (i = 0; i < dev->feature_len; i++) + verbose(" %02x", get_feature_bits(dev) + [dev->feature_len+i]); + + for (vq = dev->vq; vq; vq = vq->next) { + if (vq->service) + create_thread(vq); + } + dev->running = true; +} + +static void cleanup_devices(void) +{ + struct device *dev; + + for (dev = devices.dev; dev; dev = dev->next) + reset_device(dev); - /* Clear any features they've acked. */ - memset(get_feature_bits(dev) + dev->feature_len, 0, - dev->feature_len); + /* If we saved off the original terminal settings, restore them now. */ + if (orig_term.c_lflag & (ISIG|ICANON|ECHO)) + tcsetattr(STDIN_FILENO, TCSANOW, &orig_term); +} - /* Zero out the virtqueues. */ - for (vq = dev->vq; vq; vq = vq->next) { - memset(vq->vring.desc, 0, - vring_size(vq->config.num, LGUEST_VRING_ALIGN)); - lg_last_avail(vq) = 0; - } - } else if (dev->desc->status & VIRTIO_CONFIG_S_FAILED) { +/* When the Guest tells us they updated the status field, we handle it. */ +static void update_device_status(struct device *dev) +{ + /* A zero status is a reset, otherwise it's a set of flags. */ + if (dev->desc->status == 0) + reset_device(dev); + else if (dev->desc->status & VIRTIO_CONFIG_S_FAILED) { warnx("Device %s configuration FAILED", dev->name); + if (dev->running) + reset_device(dev); } else if (dev->desc->status & VIRTIO_CONFIG_S_DRIVER_OK) { - unsigned int i; - - verbose("Device %s OK: offered", dev->name); - for (i = 0; i < dev->feature_len; i++) - verbose(" %02x", get_feature_bits(dev)[i]); - verbose(", accepted"); - for (i = 0; i < dev->feature_len; i++) - verbose(" %02x", get_feature_bits(dev) - [dev->feature_len+i]); - - if (dev->ready) - dev->ready(dev); + if (!dev->running) + start_device(dev); } } @@ -1004,32 +902,24 @@ static void update_device_status(struct device *dev) static void handle_output(unsigned long addr) { struct device *i; - struct virtqueue *vq; - /* Check each device and virtqueue. */ + /* Check each device. */ for (i = devices.dev; i; i = i->next) { + struct virtqueue *vq; + /* Notifications to device descriptors update device status. */ if (from_guest_phys(addr) == i->desc) { update_device_status(i); return; } - /* Notifications to virtqueues mean output has occurred. */ + /* Devices *can* be used before status is set to DRIVER_OK. */ for (vq = i->vq; vq; vq = vq->next) { - if (vq->config.pfn != addr/getpagesize()) + if (addr != vq->config.pfn*getpagesize()) continue; - - /* Guest should acknowledge (and set features!) before - * using the device. */ - if (i->desc->status == 0) { - warnx("%s gave early output", i->name); - return; - } - - if (strcmp(vq->dev->name, "console") != 0) - verbose("Output to %s\n", vq->dev->name); - if (vq->handle_output) - vq->handle_output(vq, false); + if (i->running) + errx(1, "Notification on running %s", i->name); + start_device(i); return; } } @@ -1043,71 +933,6 @@ static void handle_output(unsigned long addr) strnlen(from_guest_phys(addr), guest_limit - addr)); } -static void handle_timeout(void) -{ - char buf[32]; - struct device *i; - struct virtqueue *vq; - - /* Clear the pipe */ - read(timeoutpipe[0], buf, sizeof(buf)); - - /* Check each device and virtqueue: flush blocked ones. */ - for (i = devices.dev; i; i = i->next) { - for (vq = i->vq; vq; vq = vq->next) { - if (!vq->blocked) - continue; - - vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY; - vq->blocked = false; - if (vq->handle_output) - vq->handle_output(vq, true); - } - } -} - -/* This is called when the Waker wakes us up: check for incoming file - * descriptors. */ -static void handle_input(void) -{ - /* select() wants a zeroed timeval to mean "don't wait". */ - struct timeval poll = { .tv_sec = 0, .tv_usec = 0 }; - - for (;;) { - struct device *i; - fd_set fds = devices.infds; - int num; - - num = select(devices.max_infd+1, &fds, NULL, NULL, &poll); - /* Could get interrupted */ - if (num < 0) - continue; - /* If nothing is ready, we're done. */ - if (num == 0) - break; - - /* Otherwise, call the device(s) which have readable file - * descriptors and a method of handling them. */ - for (i = devices.dev; i; i = i->next) { - if (i->handle_input && FD_ISSET(i->fd, &fds)) { - if (i->handle_input(i)) - continue; - - /* If handle_input() returns false, it means we - * should no longer service it. Networking and - * console do this when there's no input - * buffers to deliver into. Console also uses - * it when it discovers that stdin is closed. */ - FD_CLR(i->fd, &devices.infds); - } - } - - /* Is this the timeout fd? */ - if (FD_ISSET(timeoutpipe[0], &fds)) - handle_timeout(); - } -} - /*L:190 * Device Setup * @@ -1153,7 +978,7 @@ static struct lguest_device_desc *new_dev_desc(u16 type) /* Each device descriptor is followed by the description of its virtqueues. We * specify how many descriptors the virtqueue is to have. */ static void add_virtqueue(struct device *dev, unsigned int num_descs, - void (*handle_output)(struct virtqueue *, bool)) + void (*service)(struct virtqueue *)) { unsigned int pages; struct virtqueue **i, *vq = malloc(sizeof(*vq)); @@ -1168,7 +993,8 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, vq->next = NULL; vq->last_avail_idx = 0; vq->dev = dev; - vq->blocked = false; + vq->service = service; + vq->thread = (pid_t)-1; /* Initialize the configuration. */ vq->config.num = num_descs; @@ -1193,15 +1019,6 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs, * second. */ for (i = &dev->vq; *i; i = &(*i)->next); *i = vq; - - /* Set the routine to call when the Guest does something to this - * virtqueue. */ - vq->handle_output = handle_output; - - /* As an optimization, set the advisory "Don't Notify Me" flag if we - * don't have a handler */ - if (!handle_output) - vq->vring.used->flags = VRING_USED_F_NO_NOTIFY; } /* The first half of the feature bitmask is for us to advertise features. The @@ -1237,24 +1054,17 @@ static void set_config(struct device *dev, unsigned len, const void *conf) * calling new_dev_desc() to allocate the descriptor and device memory. * * See what I mean about userspace being boring? */ -static struct device *new_device(const char *name, u16 type, int fd, - bool (*handle_input)(struct device *)) +static struct device *new_device(const char *name, u16 type) { struct device *dev = malloc(sizeof(*dev)); /* Now we populate the fields one at a time. */ - dev->fd = fd; - /* If we have an input handler for this file descriptor, then we add it - * to the device_list's fdset and maxfd. */ - if (handle_input) - add_device_fd(dev->fd); dev->desc = new_dev_desc(type); - dev->handle_input = handle_input; dev->name = name; dev->vq = NULL; - dev->ready = NULL; dev->feature_len = 0; dev->num_vq = 0; + dev->running = false; /* Append to device list. Prepending to a single-linked list is * easier, but the user expects the devices to be arranged on the bus @@ -1282,13 +1092,10 @@ static void setup_console(void) * raw input stream to the Guest. */ term.c_lflag &= ~(ISIG|ICANON|ECHO); tcsetattr(STDIN_FILENO, TCSANOW, &term); - /* If we exit gracefully, the original settings will be - * restored so the user can see what they're typing. */ - atexit(restore_term); } - dev = new_device("console", VIRTIO_ID_CONSOLE, - STDIN_FILENO, handle_console_input); + dev = new_device("console", VIRTIO_ID_CONSOLE); + /* We store the console state in dev->priv, and initialize it. */ dev->priv = malloc(sizeof(struct console_abort)); ((struct console_abort *)dev->priv)->count = 0; @@ -1297,31 +1104,13 @@ static void setup_console(void) * they put something the input queue, we make sure we're listening to * stdin. When they put something in the output queue, we write it to * stdout. */ - add_virtqueue(dev, VIRTQUEUE_NUM, enable_fd); - add_virtqueue(dev, VIRTQUEUE_NUM, handle_console_output); + add_virtqueue(dev, VIRTQUEUE_NUM, console_input); + add_virtqueue(dev, VIRTQUEUE_NUM, console_output); - verbose("device %u: console\n", devices.device_num++); + verbose("device %u: console\n", ++devices.device_num); } /*:*/ -static void timeout_alarm(int sig) -{ - write(timeoutpipe[1], "", 1); -} - -static void setup_timeout(void) -{ - if (pipe(timeoutpipe) != 0) - err(1, "Creating timeout pipe"); - - if (fcntl(timeoutpipe[1], F_SETFL, - fcntl(timeoutpipe[1], F_GETFL) | O_NONBLOCK) != 0) - err(1, "Making timeout pipe nonblocking"); - - add_device_fd(timeoutpipe[0]); - signal(SIGALRM, timeout_alarm); -} - /*M:010 Inter-guest networking is an interesting area. Simplest is to have a * --sharenet= option which opens or creates a named pipe. This can be * used to send packets to another guest in a 1:1 manner. @@ -1443,21 +1232,23 @@ static int get_tun_device(char tapif[IFNAMSIZ]) static void setup_tun_net(char *arg) { struct device *dev; - int netfd, ipfd; + struct net_info *net_info = malloc(sizeof(*net_info)); + int ipfd; u32 ip = INADDR_ANY; bool bridging = false; char tapif[IFNAMSIZ], *p; struct virtio_net_config conf; - netfd = get_tun_device(tapif); + net_info->tunfd = get_tun_device(tapif); /* First we create a new network device. */ - dev = new_device("net", VIRTIO_ID_NET, netfd, handle_tun_input); + dev = new_device("net", VIRTIO_ID_NET); + dev->priv = net_info; /* Network devices need a receive and a send queue, just like * console. */ - add_virtqueue(dev, VIRTQUEUE_NUM, net_enable_fd); - add_virtqueue(dev, VIRTQUEUE_NUM, handle_net_output); + add_virtqueue(dev, VIRTQUEUE_NUM, net_input); + add_virtqueue(dev, VIRTQUEUE_NUM, net_output); /* We need a socket to perform the magic network ioctls to bring up the * tap interface, connect to the bridge etc. Any socket will do! */ @@ -1546,20 +1337,18 @@ struct vblk_info * Remember that the block device is handled by a separate I/O thread. We head * straight into the core of that thread here: */ -static bool service_io(struct device *dev) +static void blk_request(struct virtqueue *vq) { - struct vblk_info *vblk = dev->priv; + struct vblk_info *vblk = vq->dev->priv; unsigned int head, out_num, in_num, wlen; int ret; u8 *in; struct virtio_blk_outhdr *out; - struct iovec iov[dev->vq->vring.num]; + struct iovec iov[vq->vring.num]; off64_t off; - /* See if there's a request waiting. If not, nothing to do. */ - head = get_vq_desc(dev->vq, iov, &out_num, &in_num); - if (head == dev->vq->vring.num) - return false; + /* Get the next request. */ + head = wait_for_vq_desc(vq, iov, &out_num, &in_num); /* Every block request should contain at least one output buffer * (detailing the location on disk and the type of request) and one @@ -1633,83 +1422,21 @@ static bool service_io(struct device *dev) if (out->type & VIRTIO_BLK_T_BARRIER) fdatasync(vblk->fd); - /* We can't trigger an IRQ, because we're not the Launcher. It does - * that when we tell it we're done. */ - add_used(dev->vq, head, wlen); - return true; -} - -/* This is the thread which actually services the I/O. */ -static int io_thread(void *_dev) -{ - struct device *dev = _dev; - struct vblk_info *vblk = dev->priv; - char c; - - /* Close other side of workpipe so we get 0 read when main dies. */ - close(vblk->workpipe[1]); - /* Close the other side of the done_fd pipe. */ - close(dev->fd); - - /* When this read fails, it means Launcher died, so we follow. */ - while (read(vblk->workpipe[0], &c, 1) == 1) { - /* We acknowledge each request immediately to reduce latency, - * rather than waiting until we've done them all. I haven't - * measured to see if it makes any difference. - * - * That would be an interesting test, wouldn't it? You could - * also try having more than one I/O thread. */ - while (service_io(dev)) - write(vblk->done_fd, &c, 1); - } - return 0; -} - -/* Now we've seen the I/O thread, we return to the Launcher to see what happens - * when that thread tells us it's completed some I/O. */ -static bool handle_io_finish(struct device *dev) -{ - char c; - - /* If the I/O thread died, presumably it printed the error, so we - * simply exit. */ - if (read(dev->fd, &c, 1) != 1) - exit(1); - - /* It did some work, so trigger the irq. */ - trigger_irq(dev->vq); - return true; -} - -/* When the Guest submits some I/O, we just need to wake the I/O thread. */ -static void handle_virtblk_output(struct virtqueue *vq, bool timeout) -{ - struct vblk_info *vblk = vq->dev->priv; - char c = 0; - - /* Wake up I/O thread and tell it to go to work! */ - if (write(vblk->workpipe[1], &c, 1) != 1) - /* Presumably it indicated why it died. */ - exit(1); + add_used_and_trigger(vq, head, wlen); } /*L:198 This actually sets up a virtual block device. */ static void setup_block_file(const char *filename) { - int p[2]; struct device *dev; struct vblk_info *vblk; - void *stack; struct virtio_blk_config conf; - /* This is the pipe the I/O thread will use to tell us I/O is done. */ - pipe(p); - /* The device responds to return from I/O thread. */ - dev = new_device("block", VIRTIO_ID_BLOCK, p[0], handle_io_finish); + dev = new_device("block", VIRTIO_ID_BLOCK); /* The device has one virtqueue, where the Guest places requests. */ - add_virtqueue(dev, VIRTQUEUE_NUM, handle_virtblk_output); + add_virtqueue(dev, VIRTQUEUE_NUM, blk_request); /* Allocate the room for our own bookkeeping */ vblk = dev->priv = malloc(sizeof(*vblk)); @@ -1731,49 +1458,29 @@ static void setup_block_file(const char *filename) set_config(dev, sizeof(conf), &conf); - /* The I/O thread writes to this end of the pipe when done. */ - vblk->done_fd = p[1]; - - /* This is the second pipe, which is how we tell the I/O thread about - * more work. */ - pipe(vblk->workpipe); - - /* Create stack for thread and run it. Since stack grows upwards, we - * point the stack pointer to the end of this region. */ - stack = malloc(32768); - /* SIGCHLD - We dont "wait" for our cloned thread, so prevent it from - * becoming a zombie. */ - if (clone(io_thread, stack + 32768, CLONE_VM | SIGCHLD, dev) == -1) - err(1, "Creating clone"); - - /* We don't need to keep the I/O thread's end of the pipes open. */ - close(vblk->done_fd); - close(vblk->workpipe[0]); - verbose("device %u: virtblock %llu sectors\n", - devices.device_num, le64_to_cpu(conf.capacity)); + ++devices.device_num, le64_to_cpu(conf.capacity)); } +struct rng_info { + int rfd; +}; + /* Our random number generator device reads from /dev/random into the Guest's * input buffers. The usual case is that the Guest doesn't want random numbers * and so has no buffers although /dev/random is still readable, whereas * console is the reverse. * * The same logic applies, however. */ -static bool handle_rng_input(struct device *dev) +static void rng_input(struct virtqueue *vq) { int len; unsigned int head, in_num, out_num, totlen = 0; - struct iovec iov[dev->vq->vring.num]; + struct rng_info *rng_info = vq->dev->priv; + struct iovec iov[vq->vring.num]; /* First we need a buffer from the Guests's virtqueue. */ - head = get_vq_desc(dev->vq, iov, &out_num, &in_num); - - /* If they're not ready for input, stop listening to this file - * descriptor. We'll start again once they add an input buffer. */ - if (head == dev->vq->vring.num) - return false; - + head = wait_for_vq_desc(vq, iov, &out_num, &in_num); if (out_num) errx(1, "Output buffers in rng?"); @@ -1781,7 +1488,7 @@ static bool handle_rng_input(struct device *dev) * it reads straight into the Guest's buffer. We loop to make sure we * fill it. */ while (!iov_empty(iov, in_num)) { - len = readv(dev->fd, iov, in_num); + len = readv(rng_info->rfd, iov, in_num); if (len <= 0) err(1, "Read from /dev/random gave %i", len); iov_consume(iov, in_num, len); @@ -1789,25 +1496,23 @@ static bool handle_rng_input(struct device *dev) } /* Tell the Guest about the new input. */ - add_used_and_trigger(dev->vq, head, totlen); - - /* Everything went OK! */ - return true; + add_used_and_trigger(vq, head, totlen); } /* And this creates a "hardware" random number device for the Guest. */ static void setup_rng(void) { struct device *dev; - int fd; + struct rng_info *rng_info = malloc(sizeof(*rng_info)); - fd = open_or_die("/dev/random", O_RDONLY); + rng_info->rfd = open_or_die("/dev/random", O_RDONLY); /* The device responds to return from I/O thread. */ - dev = new_device("rng", VIRTIO_ID_RNG, fd, handle_rng_input); + dev = new_device("rng", VIRTIO_ID_RNG); + dev->priv = rng_info; /* The device has one virtqueue, where the Guest places inbufs. */ - add_virtqueue(dev, VIRTQUEUE_NUM, enable_fd); + add_virtqueue(dev, VIRTQUEUE_NUM, rng_input); verbose("device %u: rng\n", devices.device_num++); } @@ -1823,7 +1528,9 @@ static void __attribute__((noreturn)) restart_guest(void) for (i = 3; i < FD_SETSIZE; i++) close(i); - /* The exec automatically gets rid of the I/O and Waker threads. */ + /* Reset all the devices (kills all threads). */ + cleanup_devices(); + execv(main_args[0], main_args); err(1, "Could not exec %s", main_args[0]); } @@ -1833,7 +1540,6 @@ static void __attribute__((noreturn)) restart_guest(void) static void __attribute__((noreturn)) run_guest(void) { for (;;) { - unsigned long args[] = { LHREQ_BREAK, 0 }; unsigned long notify_addr; int readval; @@ -1845,7 +1551,6 @@ static void __attribute__((noreturn)) run_guest(void) if (readval == sizeof(notify_addr)) { verbose("Notify on address %#lx\n", notify_addr); handle_output(notify_addr); - continue; /* ENOENT means the Guest died. Reading tells us why. */ } else if (errno == ENOENT) { char reason[1024] = { 0 }; @@ -1854,19 +1559,9 @@ static void __attribute__((noreturn)) run_guest(void) /* ERESTART means that we need to reboot the guest */ } else if (errno == ERESTART) { restart_guest(); - /* EAGAIN means a signal (timeout). - * Anything else means a bug or incompatible change. */ - } else if (errno != EAGAIN) + /* Anything else means a bug or incompatible change. */ + } else err(1, "Running guest failed"); - - /* Only service input on thread for CPU 0. */ - if (cpu_id != 0) - continue; - - /* Service input, then unset the BREAK to release the Waker. */ - handle_input(); - if (pwrite(lguest_fd, args, sizeof(args), cpu_id) < 0) - err(1, "Resetting break"); } } /*L:240 @@ -1909,18 +1604,10 @@ int main(int argc, char *argv[]) /* Save the args: we "reboot" by execing ourselves again. */ main_args = argv; - /* We don't "wait" for the children, so prevent them from becoming - * zombies. */ - signal(SIGCHLD, SIG_IGN); - /* First we initialize the device list. Since console and network - * device receive input from a file descriptor, we keep an fdset - * (infds) and the maximum fd number (max_infd) with the head of the - * list. We also keep a pointer to the last device. Finally, we keep - * the next interrupt number to use for devices (1: remember that 0 is - * used by the timer). */ - FD_ZERO(&devices.infds); - devices.max_infd = -1; + /* First we initialize the device list. We keep a pointer to the last + * device, and the next interrupt number to use for devices (1: + * remember that 0 is used by the timer). */ devices.lastdev = NULL; devices.next_irq = 1; @@ -1978,9 +1665,6 @@ int main(int argc, char *argv[]) /* We always have a console device */ setup_console(); - /* We can timeout waiting for Guest network transmit. */ - setup_timeout(); - /* Now we load the kernel */ start = load_kernel(open_or_die(argv[optind+1], O_RDONLY)); @@ -2021,10 +1705,11 @@ int main(int argc, char *argv[]) * /dev/lguest file descriptor. */ tell_kernel(start); - /* We clone off a thread, which wakes the Launcher whenever one of the - * input file descriptors needs attention. We call this the Waker, and - * we'll cover it in a moment. */ - setup_waker(); + /* Ensure that we terminate if a child dies. */ + signal(SIGCHLD, kill_launcher); + + /* If we exit via err(), this kills all the threads, restores tty. */ + atexit(cleanup_devices); /* Finally, run the Guest. This doesn't return. */ run_guest(); -- cgit v1.1 From 38bc2b8c56a2e212bbd19de7cf9976dcc7bf9953 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:11 -0600 Subject: lguest: implement deferred interrupts in example Launcher Rather than sending an interrupt on every buffer, we only send an interrupt when we're about to wait for the Guest to send us a new one. The console input and network input still send interrupts manually, but the block device, network and console output queues can simply rely on this logic to send interrupts to the Guest at the right time. The patch is cluttered by moving trigger_irq() higher in the code. In practice, two factors make this optimization less interesting: (1) we often only get one input at a time, even for networking, (2) triggering an interrupt rapidly tends to get coalesced anyway. Before: Secs RxIRQS TxIRQs 1G TCP Guest->Host: 3.72 32784 32771 1M normal pings: 99 1000004 995541 100,000 1k pings (-l 120): 5 49510 49058 After: 1G TCP Guest->Host: 3.69 32809 32769 1M normal pings: 99 1000004 996196 100,000 1k pings (-l 120): 5 52435 52361 (Note the interrupt count on 100k pings goes *up*: see next patch). Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 41 ++++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 19 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 5470b8e..84c471b 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -551,6 +551,21 @@ static unsigned next_desc(struct virtqueue *vq, unsigned int i) return next; } +/* This actually sends the interrupt for this virtqueue */ +static void trigger_irq(struct virtqueue *vq) +{ + unsigned long buf[] = { LHREQ_IRQ, vq->config.irq }; + + /* If they don't want an interrupt, don't send one, unless empty. */ + if ((vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) + && lg_last_avail(vq) != vq->vring.avail->idx) + return; + + /* Send the Guest an interrupt tell them we used something up. */ + if (write(lguest_fd, buf, sizeof(buf)) != 0) + err(1, "Triggering irq %i", vq->config.irq); +} + /* This looks in the virtqueue and for the first available buffer, and converts * it to an iovec for convenient access. Since descriptors consist of some * number of output then some number of input descriptors, it's actually two @@ -567,6 +582,9 @@ static unsigned wait_for_vq_desc(struct virtqueue *vq, while (last_avail == vq->vring.avail->idx) { u64 event; + /* OK, tell Guest about progress up to now. */ + trigger_irq(vq); + /* Nothing new? Wait for eventfd to tell us they refilled. */ if (read(vq->eventfd, &event, sizeof(event)) != sizeof(event)) errx(1, "Event read failed?"); @@ -631,21 +649,6 @@ static void add_used(struct virtqueue *vq, unsigned int head, int len) vq->vring.used->idx++; } -/* This actually sends the interrupt for this virtqueue */ -static void trigger_irq(struct virtqueue *vq) -{ - unsigned long buf[] = { LHREQ_IRQ, vq->config.irq }; - - /* If they don't want an interrupt, don't send one, unless empty. */ - if ((vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) - && lg_last_avail(vq) != vq->vring.avail->idx) - return; - - /* Send the Guest an interrupt tell them we used something up. */ - if (write(lguest_fd, buf, sizeof(buf)) != 0) - err(1, "Triggering irq %i", vq->config.irq); -} - /* And here's the combo meal deal. Supersize me! */ static void add_used_and_trigger(struct virtqueue *vq, unsigned head, int len) { @@ -730,7 +733,7 @@ static void console_output(struct virtqueue *vq) err(1, "Write to stdout gave %i", len); iov_consume(iov, out, len); } - add_used_and_trigger(vq, head, 0); + add_used(vq, head, 0); } /* @@ -754,7 +757,7 @@ static void net_output(struct virtqueue *vq) errx(1, "Input buffers in net output queue?"); if (writev(net_info->tunfd, iov, out) < 0) errx(1, "Write to tun failed?"); - add_used_and_trigger(vq, head, 0); + add_used(vq, head, 0); } /* This is where we handle packets coming in from the tun device to our @@ -1422,7 +1425,7 @@ static void blk_request(struct virtqueue *vq) if (out->type & VIRTIO_BLK_T_BARRIER) fdatasync(vblk->fd); - add_used_and_trigger(vq, head, wlen); + add_used(vq, head, wlen); } /*L:198 This actually sets up a virtual block device. */ @@ -1496,7 +1499,7 @@ static void rng_input(struct virtqueue *vq) } /* Tell the Guest about the new input. */ - add_used_and_trigger(vq, head, totlen); + add_used(vq, head, totlen); } /* And this creates a "hardware" random number device for the Guest. */ -- cgit v1.1 From 95c517c09bad31a03e22f2fdb5f0aa26a490a92d Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:11 -0600 Subject: lguest: avoid sending interrupts to Guest when no activity occurs. If we track how many buffers we've used, we can tell whether we really need to interrupt the Guest. This happens as a side effect of spurious notifications. Spurious notifications happen because it can take a while before the Host thread wakes up and sets the VRING_USED_F_NO_NOTIFY flag, and meanwhile the Guest can more notifications. A real fix would be to use wake counts, rather than a suppression flag, but the practical difference is generally in the noise: the interrupt is usually coalesced into a pending one anyway so we just save a system call which isn't clearly measurable. Secs Spurious IRQS 1G TCP Guest->Host: 3.93 58 1M normal pings: 100 72 1M 1k pings (-l 120): 57 492904 Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 84c471b..20f8253 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -151,6 +151,9 @@ struct virtqueue /* Last available index we saw. */ u16 last_avail_idx; + /* How many are used since we sent last irq? */ + unsigned int pending_used; + /* Eventfd where Guest notifications arrive. */ int eventfd; @@ -556,6 +559,11 @@ static void trigger_irq(struct virtqueue *vq) { unsigned long buf[] = { LHREQ_IRQ, vq->config.irq }; + /* Don't inform them if nothing used. */ + if (!vq->pending_used) + return; + vq->pending_used = 0; + /* If they don't want an interrupt, don't send one, unless empty. */ if ((vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT) && lg_last_avail(vq) != vq->vring.avail->idx) @@ -647,6 +655,7 @@ static void add_used(struct virtqueue *vq, unsigned int head, int len) /* Make sure buffer is written before we update index. */ wmb(); vq->vring.used->idx++; + vq->pending_used++; } /* And here's the combo meal deal. Supersize me! */ -- cgit v1.1 From 4a8962e21bc505c714fc2508494d4c7dd3fe2d29 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:12 -0600 Subject: lguest: try to batch interrupts on network receive Rather than triggering an interrupt every time, we only trigger an interrupt when there are no more incoming packets (or the recv queue is full). However, the overhead of doing the select to figure this out is measurable: 1M pings goes from 98 to 104 seconds, and 1G Guest->Host TCP goes from 3.69 to 3.94 seconds. It's close to the noise though. I tested various timeouts, including reducing it as the number of pending packets increased, timing a 1 gigabyte TCP send from Guest -> Host and Host -> Guest (GSO disabled, to increase packet rate). // time tcpblast -o -s 65536 -c 16k 192.168.2.1:9999 > /dev/null Timeout Guest->Host Pkts/irq Host->Guest Pkts/irq Before 11.3s 1.0 6.3s 1.0 0 11.7s 1.0 6.6s 23.5 1 17.1s 8.8 8.6s 26.0 1/pending 13.4s 1.9 6.6s 23.8 2/pending 13.6s 2.8 6.6s 24.1 5/pending 14.1s 5.0 6.6s 24.4 Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 20f8253..6411079 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -769,6 +769,16 @@ static void net_output(struct virtqueue *vq) add_used(vq, head, 0); } +/* Will reading from this file descriptor block? */ +static bool will_block(int fd) +{ + fd_set fdset; + struct timeval zero = { 0, 0 }; + FD_ZERO(&fdset); + FD_SET(fd, &fdset); + return select(fd+1, &fdset, NULL, NULL, &zero) != 1; +} + /* This is where we handle packets coming in from the tun device to our * Guest. */ static void net_input(struct virtqueue *vq) @@ -781,10 +791,15 @@ static void net_input(struct virtqueue *vq) head = wait_for_vq_desc(vq, iov, &out, &in); if (out) errx(1, "Output buffers in net input queue?"); + + /* Deliver interrupt now, since we're about to sleep. */ + if (vq->pending_used && will_block(net_info->tunfd)) + trigger_irq(vq); + len = readv(net_info->tunfd, iov, in); if (len <= 0) err(1, "Failed to read from tun."); - add_used_and_trigger(vq, head, len); + add_used(vq, head, len); } /* This is the helper to create threads. */ -- cgit v1.1 From b60da13fc7bbf99d3c68578bd3fbcf66e1cb5f41 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Fri, 12 Jun 2009 22:27:12 -0600 Subject: lguest: suppress notifications in example Launcher The Guest only really needs to tell us about activity when we're going to listen to the eventfd: normally, we don't want to know. So if there are no available buffers, turn on notifications, re-check, then wait for the Guest to notify us via the eventfd, then turn notifications off again. There's enough else going on that the differences are in the noise. Before: Secs RxKicks TxKicks 1G TCP Guest->Host: 3.94 4686 32815 1M normal pings: 104 142862 1000010 1M 1k pings (-l 120): 57 142026 1000007 After: 1G TCP Guest->Host: 3.76 4691 32811 1M normal pings: 111 142859 997467 1M 1k pings (-l 120): 55 19648 501549 Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index 6411079..bb5e3c2 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -172,6 +172,7 @@ static struct termios orig_term; * threads and so we need to make sure that changes visible to the Guest happen * in precise order. */ #define wmb() __asm__ __volatile__("" : : : "memory") +#define mb() __asm__ __volatile__("" : : : "memory") /* Convert an iovec element to the given type. * @@ -593,9 +594,23 @@ static unsigned wait_for_vq_desc(struct virtqueue *vq, /* OK, tell Guest about progress up to now. */ trigger_irq(vq); + /* OK, now we need to know about added descriptors. */ + vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY; + + /* They could have slipped one in as we were doing that: make + * sure it's written, then check again. */ + mb(); + if (last_avail != vq->vring.avail->idx) { + vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY; + break; + } + /* Nothing new? Wait for eventfd to tell us they refilled. */ if (read(vq->eventfd, &event, sizeof(event)) != sizeof(event)) errx(1, "Event read failed?"); + + /* We don't need to be notified again. */ + vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY; } /* Check it isn't doing very strange things with descriptor numbers. */ -- cgit v1.1 From d1f0132e76a11b05167313c606a853953f416081 Mon Sep 17 00:00:00 2001 From: Mark McLoughlin Date: Mon, 11 May 2009 18:11:46 +0100 Subject: lguest: add support for indirect ring entries Support the VIRTIO_RING_F_INDIRECT_DESC feature. This is a simple matter of changing the descriptor walking code to operate on a struct vring_desc* and supplying it with an indirect table if detected. Signed-off-by: Mark McLoughlin Signed-off-by: Rusty Russell --- Documentation/lguest/lguest.c | 41 +++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c index bb5e3c2..9ebcd6e 100644 --- a/Documentation/lguest/lguest.c +++ b/Documentation/lguest/lguest.c @@ -536,20 +536,21 @@ static void *_check_pointer(unsigned long addr, unsigned int size, /* Each buffer in the virtqueues is actually a chain of descriptors. This * function returns the next descriptor in the chain, or vq->vring.num if we're * at the end. */ -static unsigned next_desc(struct virtqueue *vq, unsigned int i) +static unsigned next_desc(struct vring_desc *desc, + unsigned int i, unsigned int max) { unsigned int next; /* If this descriptor says it doesn't chain, we're done. */ - if (!(vq->vring.desc[i].flags & VRING_DESC_F_NEXT)) - return vq->vring.num; + if (!(desc[i].flags & VRING_DESC_F_NEXT)) + return max; /* Check they're not leading us off end of descriptors. */ - next = vq->vring.desc[i].next; + next = desc[i].next; /* Make sure compiler knows to grab that: we don't want it changing! */ wmb(); - if (next >= vq->vring.num) + if (next >= max) errx(1, "Desc next is %u", next); return next; @@ -585,7 +586,8 @@ static unsigned wait_for_vq_desc(struct virtqueue *vq, struct iovec iov[], unsigned int *out_num, unsigned int *in_num) { - unsigned int i, head; + unsigned int i, head, max; + struct vring_desc *desc; u16 last_avail = lg_last_avail(vq); while (last_avail == vq->vring.avail->idx) { @@ -630,15 +632,28 @@ static unsigned wait_for_vq_desc(struct virtqueue *vq, /* When we start there are none of either input nor output. */ *out_num = *in_num = 0; + max = vq->vring.num; + desc = vq->vring.desc; i = head; + + /* If this is an indirect entry, then this buffer contains a descriptor + * table which we handle as if it's any normal descriptor chain. */ + if (desc[i].flags & VRING_DESC_F_INDIRECT) { + if (desc[i].len % sizeof(struct vring_desc)) + errx(1, "Invalid size for indirect buffer table"); + + max = desc[i].len / sizeof(struct vring_desc); + desc = check_pointer(desc[i].addr, desc[i].len); + i = 0; + } + do { /* Grab the first descriptor, and check it's OK. */ - iov[*out_num + *in_num].iov_len = vq->vring.desc[i].len; + iov[*out_num + *in_num].iov_len = desc[i].len; iov[*out_num + *in_num].iov_base - = check_pointer(vq->vring.desc[i].addr, - vq->vring.desc[i].len); + = check_pointer(desc[i].addr, desc[i].len); /* If this is an input descriptor, increment that count. */ - if (vq->vring.desc[i].flags & VRING_DESC_F_WRITE) + if (desc[i].flags & VRING_DESC_F_WRITE) (*in_num)++; else { /* If it's an output descriptor, they're all supposed @@ -649,9 +664,9 @@ static unsigned wait_for_vq_desc(struct virtqueue *vq, } /* If we've got too many, that implies a descriptor loop. */ - if (*out_num + *in_num > vq->vring.num) + if (*out_num + *in_num > max) errx(1, "Looped descriptor"); - } while ((i = next_desc(vq, i)) != vq->vring.num); + } while ((i = next_desc(desc, i, max)) != max); return head; } @@ -1331,6 +1346,8 @@ static void setup_tun_net(char *arg) add_feature(dev, VIRTIO_NET_F_HOST_TSO4); add_feature(dev, VIRTIO_NET_F_HOST_TSO6); add_feature(dev, VIRTIO_NET_F_HOST_ECN); + /* We handle indirect ring entries */ + add_feature(dev, VIRTIO_RING_F_INDIRECT_DESC); set_config(dev, sizeof(conf), &conf); /* We don't need the socket any more; setup is done. */ -- cgit v1.1 From 98a1708de1bfa5fe1c490febba850d6043d3c7fa Mon Sep 17 00:00:00 2001 From: Martin Olsson Date: Wed, 22 Apr 2009 18:21:29 +0200 Subject: trivial: fix typos s/paramter/parameter/ and s/excute/execute/ in documentation and source comments. Signed-off-by: Martin Olsson Signed-off-by: Jiri Kosina --- Documentation/powerpc/dts-bindings/fsl/board.txt | 2 +- Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/board.txt b/Documentation/powerpc/dts-bindings/fsl/board.txt index 6c974d2..e8b5bc2 100644 --- a/Documentation/powerpc/dts-bindings/fsl/board.txt +++ b/Documentation/powerpc/dts-bindings/fsl/board.txt @@ -38,7 +38,7 @@ Required properities: - reg : Should contain the address and the length of the GPIO bank register. - #gpio-cells : Should be two. The first cell is the pin number and the - second cell is used to specify optional paramters (currently unused). + second cell is used to specify optional parameters (currently unused). - gpio-controller : Marks the port as GPIO controller. Example: diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt index 1815dfe..349f79f 100644 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt +++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/gpio.txt @@ -11,7 +11,7 @@ Required properties: "fsl,cpm1-pario-bank-c", "fsl,cpm1-pario-bank-d", "fsl,cpm1-pario-bank-e", "fsl,cpm2-pario-bank" - #gpio-cells : Should be two. The first cell is the pin number and the - second cell is used to specify optional paramters (currently unused). + second cell is used to specify optional parameters (currently unused). - gpio-controller : Marks the port as GPIO controller. Example of three SOC GPIO banks defined as gpio-controller nodes: -- cgit v1.1 From 19af5cdb7c79ff5ec96a99893ffb7f894f4a3dc1 Mon Sep 17 00:00:00 2001 From: Martin Olsson Date: Thu, 23 Apr 2009 11:37:37 +0200 Subject: trivial: fix typo milisecond/millisecond for documentation and source comments. Signed-off-by: Martin Olsson Signed-off-by: Jiri Kosina --- Documentation/CodingStyle | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index 72968cd..8bb3723 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -698,8 +698,8 @@ very often is not. Abundant use of the inline keyword leads to a much bigger kernel, which in turn slows the system as a whole down, due to a bigger icache footprint for the CPU and simply because there is less memory available for the pagecache. Just think about it; a pagecache miss causes a -disk seek, which easily takes 5 miliseconds. There are a LOT of cpu cycles -that can go into these 5 miliseconds. +disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles +that can go into these 5 milliseconds. A reasonable rule of thumb is to not put inline at functions that have more than 3 lines of code in them. An exception to this rule are the cases where -- cgit v1.1 From 19f594600110377ec4037fdf7fb93a25ec516212 Mon Sep 17 00:00:00 2001 From: Matt LaPlante Date: Mon, 27 Apr 2009 15:06:31 +0200 Subject: trivial: Miscellaneous documentation typo fixes Fix various typos in documentation txts. Signed-off-by: Matt LaPlante Signed-off-by: Jiri Kosina --- Documentation/DMA-API.txt | 4 ++-- Documentation/RCU/rculist_nulls.txt | 2 +- Documentation/SM501.txt | 2 +- Documentation/block/deadline-iosched.txt | 2 +- Documentation/braille-console.txt | 2 +- Documentation/driver-model/devres.txt | 2 +- Documentation/edac.txt | 8 ++++---- Documentation/fb/sh7760fb.txt | 2 +- Documentation/filesystems/autofs4-mount-control.txt | 2 +- Documentation/filesystems/caching/netfs-api.txt | 2 +- Documentation/filesystems/ext4.txt | 6 +++--- Documentation/filesystems/fiemap.txt | 2 +- Documentation/filesystems/nfs-rdma.txt | 2 +- Documentation/filesystems/proc.txt | 4 ++-- Documentation/filesystems/sysfs-pci.txt | 2 +- Documentation/filesystems/vfat.txt | 8 ++++---- Documentation/gpio.txt | 2 +- Documentation/kdump/kdump.txt | 4 ++-- Documentation/kernel-parameters.txt | 4 ++-- Documentation/kobject.txt | 2 +- Documentation/laptops/acer-wmi.txt | 2 +- Documentation/laptops/sony-laptop.txt | 2 +- Documentation/laptops/thinkpad-acpi.txt | 2 +- Documentation/local_ops.txt | 2 +- Documentation/memory-hotplug.txt | 8 ++++---- Documentation/mn10300/ABI.txt | 2 +- Documentation/mtd/nand_ecc.txt | 12 ++++++------ Documentation/networking/bonding.txt | 6 +++--- Documentation/networking/can.txt | 2 +- Documentation/networking/dm9000.txt | 2 +- Documentation/networking/l2tp.txt | 2 +- Documentation/networking/netdevices.txt | 2 +- Documentation/networking/phonet.txt | 2 +- Documentation/networking/regulatory.txt | 2 +- Documentation/power/regulator/consumer.txt | 2 +- Documentation/power/regulator/overview.txt | 2 +- Documentation/power/s2ram.txt | 2 +- Documentation/power/userland-swsusp.txt | 2 +- Documentation/powerpc/booting-without-of.txt | 4 ++-- Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt | 2 +- Documentation/powerpc/dts-bindings/fsl/msi-pic.txt | 2 +- Documentation/powerpc/dts-bindings/fsl/pmc.txt | 4 ++-- Documentation/powerpc/qe_firmware.txt | 2 +- Documentation/s390/Debugging390.txt | 4 ++-- Documentation/scheduler/sched-nice-design.txt | 2 +- Documentation/scsi/aic79xx.txt | 2 +- Documentation/scsi/ncr53c8xx.txt | 4 ++-- Documentation/scsi/sym53c8xx_2.txt | 2 +- Documentation/sound/alsa/ALSA-Configuration.txt | 2 +- Documentation/sound/alsa/HD-Audio.txt | 2 +- Documentation/sound/alsa/hda_codec.txt | 2 +- Documentation/sysctl/vm.txt | 4 ++-- Documentation/timers/hpet.txt | 2 +- Documentation/timers/timer_stats.txt | 2 +- Documentation/usb/WUSB-Design-overview.txt | 8 ++++---- Documentation/usb/anchors.txt | 4 ++-- Documentation/video4linux/cx18.txt | 2 +- 57 files changed, 88 insertions(+), 88 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 25fb8bc..5aceb88 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -676,8 +676,8 @@ this directory the following files can currently be found: dma-api/all_errors This file contains a numeric value. If this value is not equal to zero the debugging code will print a warning for every error it finds - into the kernel log. Be carefull with this - option. It can easily flood your logs. + into the kernel log. Be careful with this + option, as it can easily flood your logs. dma-api/disabled This read-only file contains the character 'Y' if the debugging code is disabled. This can diff --git a/Documentation/RCU/rculist_nulls.txt b/Documentation/RCU/rculist_nulls.txt index 6389dec..93cb28d 100644 --- a/Documentation/RCU/rculist_nulls.txt +++ b/Documentation/RCU/rculist_nulls.txt @@ -118,7 +118,7 @@ to another chain) checking the final 'nulls' value if the lookup met the end of chain. If final 'nulls' value is not the slot number, then we must restart the lookup at the beginning. If the object was moved to the same chain, -then the reader doesnt care : It might eventually +then the reader doesn't care : It might eventually scan the list again without harm. diff --git a/Documentation/SM501.txt b/Documentation/SM501.txt index 6fc6560..561826f 100644 --- a/Documentation/SM501.txt +++ b/Documentation/SM501.txt @@ -5,7 +5,7 @@ Copyright 2006, 2007 Simtec Electronics The Silicon Motion SM501 multimedia companion chip is a multifunction device which may provide numerous interfaces including USB host controller USB gadget, -Asyncronous Serial ports, Audio functions and a dual display video interface. +asynchronous serial ports, audio functions, and a dual display video interface. The device may be connected by PCI or local bus with varying functions enabled. Core diff --git a/Documentation/block/deadline-iosched.txt b/Documentation/block/deadline-iosched.txt index 7257676..2d82c80 100644 --- a/Documentation/block/deadline-iosched.txt +++ b/Documentation/block/deadline-iosched.txt @@ -58,7 +58,7 @@ same criteria as reads. front_merges (bool) ------------ -Sometimes it happens that a request enters the io scheduler that is contigious +Sometimes it happens that a request enters the io scheduler that is contiguous with a request that is already on the queue. Either it fits in the back of that request, or it fits at the front. That is called either a back merge candidate or a front merge candidate. Due to the way files are typically laid out, diff --git a/Documentation/braille-console.txt b/Documentation/braille-console.txt index 000b0fb..d0d042c 100644 --- a/Documentation/braille-console.txt +++ b/Documentation/braille-console.txt @@ -27,7 +27,7 @@ parameter. For simplicity, only one braille console can be enabled, other uses of console=brl,... will be discarded. Also note that it does not interfere with -the console selection mecanism described in serial-console.txt +the console selection mechanism described in serial-console.txt For now, only the VisioBraille device is supported. diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt index 387b8a7..d79aead 100644 --- a/Documentation/driver-model/devres.txt +++ b/Documentation/driver-model/devres.txt @@ -188,7 +188,7 @@ For example, you can do something like the following. void my_midlayer_destroy_something() { - devres_release_group(dev, my_midlayer_create_soemthing); + devres_release_group(dev, my_midlayer_create_something); } diff --git a/Documentation/edac.txt b/Documentation/edac.txt index 8eda3fb..06f8f46 100644 --- a/Documentation/edac.txt +++ b/Documentation/edac.txt @@ -23,8 +23,8 @@ first time, it was renamed to 'EDAC'. The bluesmoke project at sourceforge.net is now utilized as a 'staging area' for EDAC development, before it is sent upstream to kernel.org -At the bluesmoke/EDAC project site, is a series of quilt patches against -recent kernels, stored in a SVN respository. For easier downloading, there +At the bluesmoke/EDAC project site is a series of quilt patches against +recent kernels, stored in a SVN repository. For easier downloading, there is also a tarball snapshot available. ============================================================================ @@ -73,9 +73,9 @@ the vendor should tie the parity status bits to 0 if they do not intend to generate parity. Some vendors do not do this, and thus the parity bit can "float" giving false positives. -In the kernel there is a pci device attribute located in sysfs that is +In the kernel there is a PCI device attribute located in sysfs that is checked by the EDAC PCI scanning code. If that attribute is set, -PCI parity/error scannining is skipped for that device. The attribute +PCI parity/error scanning is skipped for that device. The attribute is: broken_parity_status diff --git a/Documentation/fb/sh7760fb.txt b/Documentation/fb/sh7760fb.txt index c87bfe5..b994c3b 100644 --- a/Documentation/fb/sh7760fb.txt +++ b/Documentation/fb/sh7760fb.txt @@ -1,7 +1,7 @@ SH7760/SH7763 integrated LCDC Framebuffer driver ================================================ -0. Overwiew +0. Overview ----------- The SH7760/SH7763 have an integrated LCD Display controller (LCDC) which supports (in theory) resolutions ranging from 1x1 to 1024x1024, diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt index c634174..8f78ded 100644 --- a/Documentation/filesystems/autofs4-mount-control.txt +++ b/Documentation/filesystems/autofs4-mount-control.txt @@ -369,7 +369,7 @@ The call requires an initialized struct autofs_dev_ioctl. There are two possible variations. Both use the path field set to the path of the mount point to check and the size field adjusted appropriately. One uses the ioctlfd field to identify a specific mount point to check while the other -variation uses the path and optionaly arg1 set to an autofs mount type. +variation uses the path and optionally arg1 set to an autofs mount type. The call returns 1 if this is a mount point and sets arg1 to the device number of the mount and field arg2 to the relevant super block magic number (described below) or 0 if it isn't a mountpoint. In both cases diff --git a/Documentation/filesystems/caching/netfs-api.txt b/Documentation/filesystems/caching/netfs-api.txt index 4db125b..2666b1e 100644 --- a/Documentation/filesystems/caching/netfs-api.txt +++ b/Documentation/filesystems/caching/netfs-api.txt @@ -184,7 +184,7 @@ This has the following fields: have index children. If this function is not supplied or if it returns NULL then the first - cache in the parent's list will be chosed, or failing that, the first + cache in the parent's list will be chosen, or failing that, the first cache in the master list. (4) A function to retrieve an object's key from the netfs [mandatory]. diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 97882df..608fdba 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -294,7 +294,7 @@ max_batch_time=usec Maximum amount of time ext4 should wait for amount of time (on average) that it takes to finish committing a transaction. Call this time the "commit time". If the time that the - transactoin has been running is less than the + transaction has been running is less than the commit time, ext4 will try sleeping for the commit time to see if other operations will join the transaction. The commit time is capped by @@ -328,7 +328,7 @@ noauto_da_alloc replacing existing files via patterns such as journal commit, in the default data=ordered mode, the data blocks of the new file are forced to disk before the rename() operation is - commited. This provides roughly the same level + committed. This provides roughly the same level of guarantees as ext3, and avoids the "zero-length" problem that can happen when a system crashes before the delayed allocation @@ -358,7 +358,7 @@ written to the journal first, and then to its final location. In the event of a crash, the journal can be replayed, bringing both data and metadata into a consistent state. This mode is the slowest except when data needs to be read from and written to disk at the same time where it -outperforms all others modes. Curently ext4 does not have delayed +outperforms all others modes. Currently ext4 does not have delayed allocation support if this data journalling mode is selected. References diff --git a/Documentation/filesystems/fiemap.txt b/Documentation/filesystems/fiemap.txt index 1e3defc..606233c 100644 --- a/Documentation/filesystems/fiemap.txt +++ b/Documentation/filesystems/fiemap.txt @@ -204,7 +204,7 @@ fiemap_check_flags() helper: int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags); -The struct fieinfo should be passed in as recieved from ioctl_fiemap(). The +The struct fieinfo should be passed in as received from ioctl_fiemap(). The set of fiemap flags which the fs understands should be passed via fs_flags. If fiemap_check_flags finds invalid user flags, it will place the bad values in fieinfo->fi_flags and return -EBADR. If the file system gets -EBADR, from diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt index 85eaead..e386f7e 100644 --- a/Documentation/filesystems/nfs-rdma.txt +++ b/Documentation/filesystems/nfs-rdma.txt @@ -100,7 +100,7 @@ Installation $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs In this location, mount.nfs will be invoked automatically for NFS mounts - by the system mount commmand. + by the system mount command. NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed on the NFS client machine. You do not need this specific version of diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index ce84cfc..cd8717a 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -366,7 +366,7 @@ just those considered 'most important'. The new vectors are: RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are sent from one CPU to another per the needs of the OS. Typically, their statistics are used by kernel developers and interested users to - determine the occurance of interrupt of the given type. + determine the occurrence of interrupts of the given type. The above IRQ vectors are displayed only when relevent. For example, the threshold vector does not exist on x86_64 platforms. Others are @@ -551,7 +551,7 @@ Committed_AS: The amount of memory presently allocated on the system. memory once that memory has been successfully allocated. VmallocTotal: total size of vmalloc memory area VmallocUsed: amount of vmalloc area which is used -VmallocChunk: largest contigious block of vmalloc area which is free +VmallocChunk: largest contiguous block of vmalloc area which is free .............................................................................. diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt index 26e4b8b..85354b3 100644 --- a/Documentation/filesystems/sysfs-pci.txt +++ b/Documentation/filesystems/sysfs-pci.txt @@ -72,7 +72,7 @@ The 'rom' file is special in that it provides read-only access to the device's ROM file, if available. It's disabled by default, however, so applications should write the string "1" to the file to enable it before attempting a read call, and disable it following the access by writing "0" to the file. Note -that the device must be enabled for a rom read to return data succesfully. +that the device must be enabled for a rom read to return data successfully. In the event a driver is not bound to the device, it can be enabled using the 'enable' file, documented above. diff --git a/Documentation/filesystems/vfat.txt b/Documentation/filesystems/vfat.txt index 3a5ddc9..5147be5 100644 --- a/Documentation/filesystems/vfat.txt +++ b/Documentation/filesystems/vfat.txt @@ -124,10 +124,10 @@ sys_immutable -- If set, ATTR_SYS attribute on FAT is handled as flush -- If set, the filesystem will try to flush to disk more early than normal. Not set by default. -rodir -- FAT has the ATTR_RO (read-only) attribute. But on Windows, - the ATTR_RO of the directory will be just ignored actually, - and is used by only applications as flag. E.g. it's setted - for the customized folder. +rodir -- FAT has the ATTR_RO (read-only) attribute. On Windows, + the ATTR_RO of the directory will just be ignored, + and is used only by applications as a flag (e.g. it's set + for the customized folder). If you want to use ATTR_RO as read-only flag even for the directory, set this option. diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 145c25a..e4b6985 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -458,7 +458,7 @@ debugfs interface, since it provides control over GPIO direction and value instead of just showing a gpio state summary. Plus, it could be present on production systems without debugging support. -Given approprate hardware documentation for the system, userspace could +Given appropriate hardware documentation for the system, userspace could know for example that GPIO #23 controls the write protect line used to protect boot loader segments in flash memory. System upgrade procedures may need to temporarily remove that protection, first importing a GPIO, diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt index 3f4bc84..cab61d8 100644 --- a/Documentation/kdump/kdump.txt +++ b/Documentation/kdump/kdump.txt @@ -108,7 +108,7 @@ There are two possible methods of using Kdump. 2) Or use the system kernel binary itself as dump-capture kernel and there is no need to build a separate dump-capture kernel. This is possible - only with the architecutres which support a relocatable kernel. As + only with the architectures which support a relocatable kernel. As of today, i386, x86_64, ppc64 and ia64 architectures support relocatable kernel. @@ -222,7 +222,7 @@ Dump-capture kernel config options (Arch Dependent, ia64) ---------------------------------------------------------- - No specific options are required to create a dump-capture kernel - for ia64, other than those specified in the arch idependent section + for ia64, other than those specified in the arch independent section above. This means that it is possible to use the system kernel as a dump-capture kernel if desired. diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7bcdebf..24d726f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1076,7 +1076,7 @@ and is between 256 and 4096 characters. It is defined in the file kgdboc= [HW] kgdb over consoles. Requires a tty driver that supports console polling. - (only serial suported for now) + (only serial supported for now) Format: [,baud] kmac= [MIPS] korina ethernet MAC address. @@ -1405,7 +1405,7 @@ and is between 256 and 4096 characters. It is defined in the file ('y', default) or cooked coordinates ('n') mtrr_chunk_size=nn[KMG] [X86] - used for mtrr cleanup. It is largest continous chunk + used for mtrr cleanup. It is largest continuous chunk that could hold holes aka. UC entries. mtrr_gran_size=nn[KMG] [X86] diff --git a/Documentation/kobject.txt b/Documentation/kobject.txt index b2e3745..c79ab99 100644 --- a/Documentation/kobject.txt +++ b/Documentation/kobject.txt @@ -132,7 +132,7 @@ kobject_name(): const char *kobject_name(const struct kobject * kobj); There is a helper function to both initialize and add the kobject to the -kernel at the same time, called supprisingly enough kobject_init_and_add(): +kernel at the same time, called surprisingly enough kobject_init_and_add(): int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype, struct kobject *parent, const char *fmt, ...); diff --git a/Documentation/laptops/acer-wmi.txt b/Documentation/laptops/acer-wmi.txt index 5ee2a02..0768fcc 100644 --- a/Documentation/laptops/acer-wmi.txt +++ b/Documentation/laptops/acer-wmi.txt @@ -40,7 +40,7 @@ NOTE: The Acer Aspire One is not supported hardware. It cannot work with acer-wmi until Acer fix their ACPI-WMI implementation on them, so has been blacklisted until that happens. -Please see the website for the current list of known working hardare: +Please see the website for the current list of known working hardware: http://code.google.com/p/aceracpi/wiki/SupportedHardware diff --git a/Documentation/laptops/sony-laptop.txt b/Documentation/laptops/sony-laptop.txt index 8b2bc15..23ce7d3 100644 --- a/Documentation/laptops/sony-laptop.txt +++ b/Documentation/laptops/sony-laptop.txt @@ -22,7 +22,7 @@ If your laptop model supports it, you will find sysfs files in the /sys/class/backlight/sony/ directory. You will be able to query and set the current screen brightness: - brightness get/set screen brightness (an iteger + brightness get/set screen brightness (an integer between 0 and 7) actual_brightness reading from this file will query the HW to get real brightness value diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index e7e9a690..78e354b 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -506,7 +506,7 @@ generate input device EV_KEY events. In addition to the EV_KEY events, thinkpad-acpi may also issue EV_SW events for switches: -SW_RFKILL_ALL T60 and later hardare rfkill rocker switch +SW_RFKILL_ALL T60 and later hardware rfkill rocker switch SW_TABLET_MODE Tablet ThinkPads HKEY events 0x5009 and 0x500A Non hot-key ACPI HKEY event map: diff --git a/Documentation/local_ops.txt b/Documentation/local_ops.txt index 23045b8..300da4b 100644 --- a/Documentation/local_ops.txt +++ b/Documentation/local_ops.txt @@ -34,7 +34,7 @@ out of order wrt other memory writes by the owner CPU. It can be done by slightly modifying the standard atomic operations : only their UP variant must be kept. It typically means removing LOCK prefix (on -i386 and x86_64) and any SMP sychronization barrier. If the architecture does +i386 and x86_64) and any SMP synchronization barrier. If the architecture does not have a different behavior between SMP and UP, including asm-generic/local.h in your architecture's local.h is sufficient. diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt index 4c2ecf5..bbc8a6a 100644 --- a/Documentation/memory-hotplug.txt +++ b/Documentation/memory-hotplug.txt @@ -73,13 +73,13 @@ this phase is triggered automatically. ACPI can notify this event. If not, (see Section 4.). Logical Memory Hotplug phase is to change memory state into -avaiable/unavailable for users. Amount of memory from user's view is +available/unavailable for users. Amount of memory from user's view is changed by this phase. The kernel makes all memory in it as free pages when a memory range is available. In this document, this phase is described as online/offline. -Logical Memory Hotplug phase is triggred by write of sysfs file by system +Logical Memory Hotplug phase is triggered by write of sysfs file by system administrator. For the hot-add case, it must be executed after Physical Hotplug phase by hand. (However, if you writes udev's hotplug scripts for memory hotplug, these @@ -334,7 +334,7 @@ MEMORY_CANCEL_ONLINE Generated if MEMORY_GOING_ONLINE fails. MEMORY_ONLINE - Generated when memory has succesfully brought online. The callback may + Generated when memory has successfully brought online. The callback may allocate pages from the new memory. MEMORY_GOING_OFFLINE @@ -359,7 +359,7 @@ The third argument is passed by pointer of struct memory_notify. struct memory_notify { unsigned long start_pfn; unsigned long nr_pages; - int status_cahnge_nid; + int status_change_nid; } start_pfn is start_pfn of online/offline memory. diff --git a/Documentation/mn10300/ABI.txt b/Documentation/mn10300/ABI.txt index 1fef1f0..d3507ba 100644 --- a/Documentation/mn10300/ABI.txt +++ b/Documentation/mn10300/ABI.txt @@ -26,7 +26,7 @@ registers and the stack. If the first argument is a 64-bit value, it will be passed in D0:D1. If the first argument is not a 64-bit value, but the second is, the second will be passed entirely on the stack and D1 will be unused. -Arguments smaller than 32-bits are not coelesced within a register or a stack +Arguments smaller than 32-bits are not coalesced within a register or a stack word. For example, two byte-sized arguments will always be passed in separate registers or word-sized stack slots. diff --git a/Documentation/mtd/nand_ecc.txt b/Documentation/mtd/nand_ecc.txt index bdf93b7..274821b 100644 --- a/Documentation/mtd/nand_ecc.txt +++ b/Documentation/mtd/nand_ecc.txt @@ -50,7 +50,7 @@ byte 255: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp1 rp3 rp5 ... rp15 cp5 cp5 cp5 cp5 cp4 cp4 cp4 cp4 This figure represents a sector of 256 bytes. -cp is my abbreviaton for column parity, rp for row parity. +cp is my abbreviation for column parity, rp for row parity. Let's start to explain column parity. cp0 is the parity that belongs to all bit0, bit2, bit4, bit6. @@ -560,7 +560,7 @@ Measuring this code again showed big gain. When executing the original linux code 1 million times, this took about 1 second on my system. (using time to measure the performance). After this iteration I was back to 0.075 sec. Actually I had to decide to start measuring over 10 -million interations in order not to loose too much accuracy. This one +million iterations in order not to lose too much accuracy. This one definitely seemed to be the jackpot! There is a little bit more room for improvement though. There are three @@ -571,8 +571,8 @@ loop; This eliminates 3 statements per loop. Of course after the loop we need to correct by adding: rp4 ^= rp4_6; rp6 ^= rp4_6 -Furthermore there are 4 sequential assingments to rp8. This can be -encoded slightly more efficient by saving tmppar before those 4 lines +Furthermore there are 4 sequential assignments to rp8. This can be +encoded slightly more efficiently by saving tmppar before those 4 lines and later do rp8 = rp8 ^ tmppar ^ notrp8; (where notrp8 is the value of rp8 before those 4 lines). Again a use of the commutative property of xor. @@ -622,7 +622,7 @@ Not a big change, but every penny counts :-) Analysis 7 ========== -Acutally this made things worse. Not very much, but I don't want to move +Actually this made things worse. Not very much, but I don't want to move into the wrong direction. Maybe something to investigate later. Could have to do with caching again. @@ -642,7 +642,7 @@ Analysis 8 This makes things worse. Let's stick with attempt 6 and continue from there. Although it seems that the code within the loop cannot be optimised further there is still room to optimize the generation of the ecc codes. -We can simply calcualate the total parity. If this is 0 then rp4 = rp5 +We can simply calculate the total parity. If this is 0 then rp4 = rp5 etc. If the parity is 1, then rp4 = !rp5; But if rp4 = rp5 we do not need rp5 etc. We can just write the even bits in the result byte and then do something like diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 0876275..d5181ce 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -221,7 +221,7 @@ ad_select - Any slave's 802.3ad association state changes - - The bond's adminstrative state changes to up + - The bond's administrative state changes to up count or 2 @@ -369,7 +369,7 @@ fail_over_mac When this policy is used in conjuction with the mii monitor, devices which assert link up prior to being able to actually transmit and receive are particularly - susecptible to loss of the gratuitous ARP, and an + susceptible to loss of the gratuitous ARP, and an appropriate updelay setting may be required. follow or 2 @@ -1794,7 +1794,7 @@ target to query. generally referred to as "trunk failover." This is a feature of the switch that causes the link state of a particular switch port to be set down (or up) when the state of another switch port goes down (or up). -It's purpose is to propogate link failures from logically "exterior" ports +Its purpose is to propagate link failures from logically "exterior" ports to the logically "interior" ports that bonding is able to monitor via miimon. Availability and configuration for trunk failover varies by switch, but this can be a viable alternative to the ARP monitor when using diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt index 2035bc4..463d9e0 100644 --- a/Documentation/networking/can.txt +++ b/Documentation/networking/can.txt @@ -327,7 +327,7 @@ solution for a couple of reasons: return 1; } - /* paraniod check ... */ + /* paranoid check ... */ if (nbytes < sizeof(struct can_frame)) { fprintf(stderr, "read: incomplete CAN frame\n"); return 1; diff --git a/Documentation/networking/dm9000.txt b/Documentation/networking/dm9000.txt index 65df3de..5552e2e 100644 --- a/Documentation/networking/dm9000.txt +++ b/Documentation/networking/dm9000.txt @@ -129,7 +129,7 @@ PHY Link state polling ---------------------- The driver keeps track of the link state and informs the network core -about link (carrier) availablilty. This is managed by several methods +about link (carrier) availability. This is managed by several methods depending on the version of the chip and on which PHY is being used. For the internal PHY, the original (and currently default) method is diff --git a/Documentation/networking/l2tp.txt b/Documentation/networking/l2tp.txt index 2451f55..63214b2 100644 --- a/Documentation/networking/l2tp.txt +++ b/Documentation/networking/l2tp.txt @@ -158,7 +158,7 @@ Sample Userspace Code } return 0; -Miscellanous +Miscellaneous ============ The PPPoL2TP driver was developed as part of the OpenL2TP project by diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index a2ab6a0..87b3d15 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -74,7 +74,7 @@ dev->hard_start_xmit: for this and return NETDEV_TX_LOCKED when the spin lock fails. The locking there should also properly protect against set_multicast_list. Note that the use of NETIF_F_LLTX is deprecated. - Dont use it for new drivers. + Don't use it for new drivers. Context: Process with BHs disabled or BH (timer), will be called with interrupts disabled by netconsole. diff --git a/Documentation/networking/phonet.txt b/Documentation/networking/phonet.txt index 6a07e45..6e8ce09 100644 --- a/Documentation/networking/phonet.txt +++ b/Documentation/networking/phonet.txt @@ -36,7 +36,7 @@ Phonet packets have a common header as follows: On Linux, the link-layer header includes the pn_media byte (see below). The next 7 bytes are part of the network-layer header. -The device ID is split: the 6 higher-order bits consitute the device +The device ID is split: the 6 higher-order bits constitute the device address, while the 2 lower-order bits are used for multiplexing, as are the 8-bit object identifiers. As such, Phonet can be considered as a network layer with 6 bits of address space and 10 bits for transport diff --git a/Documentation/networking/regulatory.txt b/Documentation/networking/regulatory.txt index dcf3164..eaa1a25 100644 --- a/Documentation/networking/regulatory.txt +++ b/Documentation/networking/regulatory.txt @@ -89,7 +89,7 @@ added to this document when its support is enabled. Device drivers who provide their own built regulatory domain do not need a callback as the channels registered by them are the only ones that will be allowed and therefore *additional* -cannels cannot be enabled. +channels cannot be enabled. Example code - drivers hinting an alpha2: ------------------------------------------ diff --git a/Documentation/power/regulator/consumer.txt b/Documentation/power/regulator/consumer.txt index 82b7a43..5f83fd2 100644 --- a/Documentation/power/regulator/consumer.txt +++ b/Documentation/power/regulator/consumer.txt @@ -178,5 +178,5 @@ Consumers can uregister interest by calling :- int regulator_unregister_notifier(struct regulator *regulator, struct notifier_block *nb); -Regulators use the kernel notifier framework to send event to thier interested +Regulators use the kernel notifier framework to send event to their interested consumers. diff --git a/Documentation/power/regulator/overview.txt b/Documentation/power/regulator/overview.txt index bdcb332..0cded69 100644 --- a/Documentation/power/regulator/overview.txt +++ b/Documentation/power/regulator/overview.txt @@ -119,7 +119,7 @@ Some terms used in this document:- battery power, USB power) Regulator Domains: is the new current limit within the - regulator operating parameters for input/ouput voltage. + regulator operating parameters for input/output voltage. If the regulator request passes all the constraint tests then the new regulator value is applied. diff --git a/Documentation/power/s2ram.txt b/Documentation/power/s2ram.txt index 2ebdc60..514b94f 100644 --- a/Documentation/power/s2ram.txt +++ b/Documentation/power/s2ram.txt @@ -63,7 +63,7 @@ hardware during resume operations where a value can be set that will survive a reboot. Consequence is that after a resume (even if it is successful) your system -clock will have a value corresponding to the magic mumber instead of the +clock will have a value corresponding to the magic number instead of the correct date/time! It is therefore advisable to use a program like ntp-date or rdate to reset the correct date/time from an external time source when using this trace option. diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt index 7b99636..b967cd9 100644 --- a/Documentation/power/userland-swsusp.txt +++ b/Documentation/power/userland-swsusp.txt @@ -109,7 +109,7 @@ unfreeze user space processes frozen by SNAPSHOT_UNFREEZE if they are still frozen when the device is being closed). Currently it is assumed that the userland utilities reading/writing the -snapshot image from/to the kernel will use a swap parition, called the resume +snapshot image from/to the kernel will use a swap partition, called the resume partition, or a swap file as storage space (if a swap file is used, the resume partition is the partition that holds this file). However, this is not really required, as they can use, for example, a special (blank) suspend partition or diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index d16b7a1..8d999d8 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1356,7 +1356,7 @@ platforms are moved over to use the flattened-device-tree model. - phy-map : 1 cell, optional, bitmap of addresses to probe the PHY for, used if phy-address is absent. bit 0x00000001 is MDIO address 0. - For Axon it can be absent, thouugh my current driver + For Axon it can be absent, though my current driver doesn't handle phy-address yet so for now, keep 0x00ffffff in it. - rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec @@ -1438,7 +1438,7 @@ platforms are moved over to use the flattened-device-tree model. The Xilinx EDK toolchain ships with a set of IP cores (devices) for use in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range - of standard device types (network, serial, etc.) and miscellanious + of standard device types (network, serial, etc.) and miscellaneous devices (gpio, LCD, spi, etc). Also, since these devices are implemented within the fpga fabric every instance of the device can be synthesised with different options that change the behaviour. diff --git a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt index 088fc47..160c752 100644 --- a/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt +++ b/Documentation/powerpc/dts-bindings/fsl/cpm_qe/cpm.txt @@ -19,7 +19,7 @@ Example: reg = <119c0 30>; } -* Properties common to mulitple CPM/QE devices +* Properties common to multiple CPM/QE devices - fsl,cpm-command : This value is ORed with the opcode and command flag to specify the device on which a CPM command operates. diff --git a/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt b/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt index b26b919..bcc30ba 100644 --- a/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt +++ b/Documentation/powerpc/dts-bindings/fsl/msi-pic.txt @@ -1,6 +1,6 @@ * Freescale MSI interrupt controller -Reguired properities: +Required properties: - compatible : compatible list, contains 2 entries, first is "fsl,CHIP-msi", where CHIP is the processor(mpc8610, mpc8572, etc.) and the second is "fsl,mpic-msi" or "fsl,ipic-msi" depending on diff --git a/Documentation/powerpc/dts-bindings/fsl/pmc.txt b/Documentation/powerpc/dts-bindings/fsl/pmc.txt index 02f6f43..07256b7 100644 --- a/Documentation/powerpc/dts-bindings/fsl/pmc.txt +++ b/Documentation/powerpc/dts-bindings/fsl/pmc.txt @@ -15,8 +15,8 @@ Properties: compatible; all statements below that apply to "fsl,mpc8548-pmc" also apply to "fsl,mpc8641d-pmc". - Compatibility does not include bit assigments in SCCR/PMCDR/DEVDISR; these - bit assigments are indicated via the sleep specifier in each device's + Compatibility does not include bit assignments in SCCR/PMCDR/DEVDISR; these + bit assignments are indicated via the sleep specifier in each device's sleep property. - reg: For devices compatible with "fsl,mpc8349-pmc", the first resource diff --git a/Documentation/powerpc/qe_firmware.txt b/Documentation/powerpc/qe_firmware.txt index 06da4d4..2031ddb 100644 --- a/Documentation/powerpc/qe_firmware.txt +++ b/Documentation/powerpc/qe_firmware.txt @@ -225,7 +225,7 @@ For example, to match the 8323, revision 1.0: soc.major = 1 soc.minor = 0 -'padding' is neccessary for structure alignment. This field ensures that the +'padding' is necessary for structure alignment. This field ensures that the 'extended_modes' field is aligned on a 64-bit boundary. 'extended_modes' is a bitfield that defines special functionality which has an diff --git a/Documentation/s390/Debugging390.txt b/Documentation/s390/Debugging390.txt index 10711d9..1eb576a 100644 --- a/Documentation/s390/Debugging390.txt +++ b/Documentation/s390/Debugging390.txt @@ -1984,7 +1984,7 @@ break *$pc break *0x400618 -heres a really useful one for large programs +Here's a really useful one for large programs rbr Set a breakpoint for all functions matching REGEXP e.g. @@ -2211,7 +2211,7 @@ Breakpoint 2 at 0x4d87a4: file top.c, line 2609. #5 0x51692c in readline_internal () at readline.c:521 #6 0x5164fe in readline (prompt=0x7ffff810 "\177ÿøx\177ÿ÷Ø\177ÿøxÀ") at readline.c:349 -#7 0x4d7a8a in command_line_input (prrompt=0x564420 "(gdb) ", repeat=1, +#7 0x4d7a8a in command_line_input (prompt=0x564420 "(gdb) ", repeat=1, annotation_suffix=0x4d6b44 "prompt") at top.c:2091 #8 0x4d6cf0 in command_loop () at top.c:1345 #9 0x4e25bc in main (argc=1, argv=0x7ffffdf4) at main.c:635 diff --git a/Documentation/scheduler/sched-nice-design.txt b/Documentation/scheduler/sched-nice-design.txt index e2bae5a..3ac1e46 100644 --- a/Documentation/scheduler/sched-nice-design.txt +++ b/Documentation/scheduler/sched-nice-design.txt @@ -55,7 +55,7 @@ To sum it up: we always wanted to make nice levels more consistent, but within the constraints of HZ and jiffies and their nasty design level coupling to timeslices and granularity it was not really viable. -The second (less frequent but still periodically occuring) complaint +The second (less frequent but still periodically occurring) complaint about Linux's nice level support was its assymetry around the origo (which you can see demonstrated in the picture above), or more accurately: the fact that nice level behavior depended on the _absolute_ diff --git a/Documentation/scsi/aic79xx.txt b/Documentation/scsi/aic79xx.txt index 683ccae..c014ecc 100644 --- a/Documentation/scsi/aic79xx.txt +++ b/Documentation/scsi/aic79xx.txt @@ -194,7 +194,7 @@ The following information is available in this file: - Packetized SCSI Protocol at 160MB/s and 320MB/s - Quick Arbitration Selection (QAS) - Retained Training Information (Rev B. ASIC only) - - Interrupt Coalessing + - Interrupt Coalescing - Initiator Mode (target mode not currently supported) - Support for the PCI-X standard up to 133MHz diff --git a/Documentation/scsi/ncr53c8xx.txt b/Documentation/scsi/ncr53c8xx.txt index 230e308..08e2b4d 100644 --- a/Documentation/scsi/ncr53c8xx.txt +++ b/Documentation/scsi/ncr53c8xx.txt @@ -206,7 +206,7 @@ of MOVE MEMORY instructions. The 896 and the 895A allows handling of the phase mismatch context from SCRIPTS (avoids the phase mismatch interrupt that stops the SCSI processor until the C code has saved the context of the transfer). -Implementing this without using LOAD/STORE instructions would be painfull +Implementing this without using LOAD/STORE instructions would be painful and I didn't even want to try it. The 896 chip supports 64 bit PCI transactions and addressing, while the @@ -240,7 +240,7 @@ characteristics. This feature may also reduce average command latency. In order to really gain advantage of this feature, devices must have a reasonable cache size (No miracle is to be expected for a low-end hard disk with 128 KB or less). -Some kown SCSI devices do not properly support tagged command queuing. +Some known SCSI devices do not properly support tagged command queuing. Generally, firmware revisions that fix this kind of problems are available at respective vendor web/ftp sites. All I can say is that the hard disks I use on my machines behave well with diff --git a/Documentation/scsi/sym53c8xx_2.txt b/Documentation/scsi/sym53c8xx_2.txt index 49ea5c5..eb9a7b9 100644 --- a/Documentation/scsi/sym53c8xx_2.txt +++ b/Documentation/scsi/sym53c8xx_2.txt @@ -206,7 +206,7 @@ characteristics. This feature may also reduce average command latency. In order to really gain advantage of this feature, devices must have a reasonable cache size (No miracle is to be expected for a low-end hard disk with 128 KB or less). -Some kown old SCSI devices do not properly support tagged command queuing. +Some known old SCSI devices do not properly support tagged command queuing. Generally, firmware revisions that fix this kind of problems are available at respective vendor web/ftp sites. All I can say is that I never have had problem with tagged queuing using diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 012858d..ecb969b 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -754,7 +754,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. single_cmd - Use single immediate commands to communicate with codecs (for debugging only) enable_msi - Enable Message Signaled Interrupt (MSI) (default = off) - power_save - Automatic power-saving timtout (in second, 0 = + power_save - Automatic power-saving timeout (in second, 0 = disable) power_save_controller - Reset HD-audio controller in power-saving mode (default = on) diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt index 88b7433..71ac995 100644 --- a/Documentation/sound/alsa/HD-Audio.txt +++ b/Documentation/sound/alsa/HD-Audio.txt @@ -16,7 +16,7 @@ methods for the HD-audio hardware. The HD-audio component consists of two parts: the controller chip and the codec chips on the HD-audio bus. Linux provides a single driver for all controllers, snd-hda-intel. Although the driver name contains -a word of a well-known harware vendor, it's not specific to it but for +a word of a well-known hardware vendor, it's not specific to it but for all controller chips by other companies. Since the HD-audio controllers are supposed to be compatible, the single snd-hda-driver should work in most cases. But, not surprisingly, there are known diff --git a/Documentation/sound/alsa/hda_codec.txt b/Documentation/sound/alsa/hda_codec.txt index 34e87ec..de8efbc 100644 --- a/Documentation/sound/alsa/hda_codec.txt +++ b/Documentation/sound/alsa/hda_codec.txt @@ -114,7 +114,7 @@ For writing a sequence of verbs, use snd_hda_sequence_write(). There are variants of cached read/write, snd_hda_codec_write_cache(), snd_hda_sequence_write_cache(). These are used for recording the -register states for the power-mangement resume. When no PM is needed, +register states for the power-management resume. When no PM is needed, these are equivalent with non-cached version. To retrieve the number of sub nodes connected to the given node, use diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index c302ddf..6fab2dc 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -358,7 +358,7 @@ nr_pdflush_threads The current number of pdflush threads. This value is read-only. The value changes according to the number of dirty pages in the system. -When neccessary, additional pdflush threads are created, one per second, up to +When necessary, additional pdflush threads are created, one per second, up to nr_pdflush_threads_max. ============================================================== @@ -565,7 +565,7 @@ swappiness This control is used to define how aggressive the kernel will swap memory pages. Higher values will increase agressiveness, lower values -descrease the amount of swap. +decrease the amount of swap. The default value is 60. diff --git a/Documentation/timers/hpet.txt b/Documentation/timers/hpet.txt index e7c09ab..04763a3 100644 --- a/Documentation/timers/hpet.txt +++ b/Documentation/timers/hpet.txt @@ -7,7 +7,7 @@ by Intel and Microsoft which can be found at Each HPET has one fixed-rate counter (at 10+ MHz, hence "High Precision") and up to 32 comparators. Normally three or more comparators are provided, -each of which can generate oneshot interupts and at least one of which has +each of which can generate oneshot interrupts and at least one of which has additional hardware to support periodic interrupts. The comparators are also called "timers", which can be misleading since usually timers are independent of each other ... these share a counter, complicating resets. diff --git a/Documentation/timers/timer_stats.txt b/Documentation/timers/timer_stats.txt index 20d368c..9bd00fc 100644 --- a/Documentation/timers/timer_stats.txt +++ b/Documentation/timers/timer_stats.txt @@ -62,7 +62,7 @@ Timerstats sample period: 3.888770 s The first column is the number of events, the second column the pid, the third column is the name of the process. The forth column shows the function which -initialized the timer and in parantheses the callback function which was +initialized the timer and in parenthesis the callback function which was executed on expiry. Thomas, Ingo diff --git a/Documentation/usb/WUSB-Design-overview.txt b/Documentation/usb/WUSB-Design-overview.txt index 4c3d62c..c480e9c 100644 --- a/Documentation/usb/WUSB-Design-overview.txt +++ b/Documentation/usb/WUSB-Design-overview.txt @@ -84,7 +84,7 @@ The different logical parts of this driver are: *UWB*: the Ultra-Wide-Band stack -- manages the radio and associated spectrum to allow for devices sharing it. Allows to - control bandwidth assingment, beaconing, scanning, etc + control bandwidth assignment, beaconing, scanning, etc * @@ -184,7 +184,7 @@ and sends the replies and notifications back to the API [/uwb_rc_neh_grok()/]. Notifications are handled to the UWB daemon, that is chartered, among other things, to keep the tab of how the UWB radio neighborhood looks, creating and destroying devices as they show up or -dissapear. +disappear. Command execution is very simple: a command block is sent and a event block or reply is expected back. For sending/receiving command/events, a @@ -333,7 +333,7 @@ read descriptors and move our data. *Device life cycle and keep alives* -Everytime there is a succesful transfer to/from a device, we update a +Every time there is a successful transfer to/from a device, we update a per-device activity timestamp. If not, every now and then we check and if the activity timestamp gets old, we ping the device by sending it a Keep Alive IE; it responds with a /DN_Alive/ pong during the DNTS (this @@ -411,7 +411,7 @@ context (wa_xfer) and submit it. When the xfer is done, our callback is called and we assign the status bits and release the xfer resources. In dequeue() we are basically cancelling/aborting the transfer. We issue -a xfer abort request to the HC, cancell all the URBs we had submitted +a xfer abort request to the HC, cancel all the URBs we had submitted and not yet done and when all that is done, the xfer callback will be called--this will call the URB callback. diff --git a/Documentation/usb/anchors.txt b/Documentation/usb/anchors.txt index 6f24f56..fe6a99a 100644 --- a/Documentation/usb/anchors.txt +++ b/Documentation/usb/anchors.txt @@ -27,7 +27,7 @@ Association and disassociation of URBs with anchors An association of URBs to an anchor is made by an explicit call to usb_anchor_urb(). The association is maintained until -an URB is finished by (successfull) completion. Thus disassociation +an URB is finished by (successful) completion. Thus disassociation is automatic. A function is provided to forcibly finish (kill) all URBs associated with an anchor. Furthermore, disassociation can be made with usb_unanchor_urb() @@ -76,4 +76,4 @@ usb_get_from_anchor() Returns the oldest anchored URB of an anchor. The URB is unanchored and returned with a reference. As you may mix URBs to several destinations in one anchor you have no guarantee the chronologically -first submitted URB is returned. \ No newline at end of file +first submitted URB is returned. diff --git a/Documentation/video4linux/cx18.txt b/Documentation/video4linux/cx18.txt index 914cb7e..4652c0f 100644 --- a/Documentation/video4linux/cx18.txt +++ b/Documentation/video4linux/cx18.txt @@ -11,7 +11,7 @@ encoder chip: 2) Some people have problems getting the i2c bus to work. The symptom is that the eeprom cannot be read and the card is unusable. This is probably fixed, but if you have problems - then post to the video4linux or ivtv-users mailinglist. + then post to the video4linux or ivtv-users mailing list. 3) VBI (raw or sliced) has not yet been implemented. -- cgit v1.1 From 27af1da4b58675d5c6bacf9b7de9c2746687d272 Mon Sep 17 00:00:00 2001 From: "figo.zhang" Date: Fri, 17 Apr 2009 10:58:48 +0800 Subject: trivial: Documentation/rbtree.txt: cleanup kerneldoc of rbtree.txt The first formal parameter of the rb_link_node() is a pointer, and the "node" is define a data struct (pls see line 67 and line 73 in the doc), so the actual parameter should use "&data->node". Signed-off-by: Figo.zhang Signed-off-by: Jiri Kosina --- Documentation/rbtree.txt | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/rbtree.txt b/Documentation/rbtree.txt index 7224459b..7710214 100644 --- a/Documentation/rbtree.txt +++ b/Documentation/rbtree.txt @@ -131,8 +131,8 @@ Example: } /* Add new node and rebalance tree. */ - rb_link_node(data->node, parent, new); - rb_insert_color(data->node, root); + rb_link_node(&data->node, parent, new); + rb_insert_color(&data->node, root); return TRUE; } @@ -146,10 +146,10 @@ To remove an existing node from a tree, call: Example: - struct mytype *data = mysearch(mytree, "walrus"); + struct mytype *data = mysearch(&mytree, "walrus"); if (data) { - rb_erase(data->node, mytree); + rb_erase(&data->node, &mytree); myfree(data); } -- cgit v1.1 From 190342335c2a7939407d7391e5bb6c9ee39244eb Mon Sep 17 00:00:00 2001 From: Wang Tinggong Date: Thu, 14 May 2009 11:00:20 +0200 Subject: trivial: rbtree.txt: fix rb_entry() parameters in sample code Reviewed-by: WANG Cong Signed-off-by: Jiri Kosina --- Documentation/rbtree.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/rbtree.txt b/Documentation/rbtree.txt index 7710214..aae8355 100644 --- a/Documentation/rbtree.txt +++ b/Documentation/rbtree.txt @@ -188,5 +188,5 @@ Example: struct rb_node *node; for (node = rb_first(&mytree); node; node = rb_next(node)) - printk("key=%s\n", rb_entry(node, int, keystring)); + printk("key=%s\n", rb_entry(node, struct mytype, node)->keystring); -- cgit v1.1 From baf20b3e51913e8a5003d4e7a143934be8fe52b5 Mon Sep 17 00:00:00 2001 From: GeunSik Lim Date: Mon, 1 Jun 2009 10:49:41 +0200 Subject: trivial: ftrace:fix description of trace directory Fix trace source directory from kernel/tracing/ to kernel/trace/. Signed-off-by: GeunSik Lim Signed-off-by: Jiri Kosina --- Documentation/trace/ftrace.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 2a82d86..7bd27f0 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -1834,4 +1834,4 @@ an error. ----------- More details can be found in the source code, in the -kernel/tracing/*.c files. +kernel/trace/*.c files. -- cgit v1.1 From 1b68bfc18b258f5a0f285f9101a84da502254768 Mon Sep 17 00:00:00 2001 From: Masanori Kobayasi Date: Thu, 4 Jun 2009 21:12:29 +0900 Subject: trivial: Documentation/dell_rbu.txt: fix typos Remove a period from end of command-line and fix misplaced comma. Signed-off-by: Masanori Kobayasi Signed-off-by: Jiri Kosina --- Documentation/dell_rbu.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/dell_rbu.txt b/Documentation/dell_rbu.txt index c11b931..1517498 100644 --- a/Documentation/dell_rbu.txt +++ b/Documentation/dell_rbu.txt @@ -76,9 +76,9 @@ Do the steps below to download the BIOS image. The /sys/class/firmware/dell_rbu/ entries will remain till the following is done. -echo -1 > /sys/class/firmware/dell_rbu/loading. +echo -1 > /sys/class/firmware/dell_rbu/loading Until this step is completed the driver cannot be unloaded. -Also echoing either mono ,packet or init in to image_type will free up the +Also echoing either mono, packet or init in to image_type will free up the memory allocated by the driver. If a user by accident executes steps 1 and 3 above without executing step 2; -- cgit v1.1 From 3226224039c8f8cb840d236b5f27d2a1104789e2 Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Thu, 4 Jun 2009 16:26:50 +0200 Subject: trivial: SubmittingPatches: fix typo Fix typo. Signed-off-by: Pavel Machek Signed-off-by: Jiri Kosina --- Documentation/SubmittingPatches | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index f309d3c..c282380 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -444,7 +444,7 @@ offer a Reviewed-by tag for a patch. This tag serves to give credit to reviewers and to inform maintainers of the degree of review which has been done on the patch. Reviewed-by: tags, when supplied by reviewers known to understand the subject area and to perform thorough reviews, will normally -increase the liklihood of your patch getting into the kernel. +increase the likelihood of your patch getting into the kernel. 15) The canonical patch format -- cgit v1.1 From ff2f5ff0cf224780c3bd035d3e6ff4a30fcacae7 Mon Sep 17 00:00:00 2001 From: Matt Kraai Date: Thu, 4 Jun 2009 21:43:10 -0700 Subject: trivial: Remove the hyphen from git commands Signed-off-by: Matt Kraai Signed-off-by: Jiri Kosina --- Documentation/trace/kmemtrace.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/trace/kmemtrace.txt b/Documentation/trace/kmemtrace.txt index a956d9b..6308735 100644 --- a/Documentation/trace/kmemtrace.txt +++ b/Documentation/trace/kmemtrace.txt @@ -64,7 +64,7 @@ III. Quick usage guide CONFIG_KMEMTRACE). 2) Get the userspace tool and build it: -$ git-clone git://repo.or.cz/kmemtrace-user.git # current repository +$ git clone git://repo.or.cz/kmemtrace-user.git # current repository $ cd kmemtrace-user/ $ ./autogen.sh $ ./configure -- cgit v1.1 From 6e2216895421b4f83d2ebac15c9d9506dc105cff Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?N=C3=A9meth=20M=C3=A1rton?= Date: Sat, 6 Jun 2009 19:06:36 +0200 Subject: trivial: usb: fix missing space typo in doc MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Márton Németh Signed-off-by: Jiri Kosina --- Documentation/usb/callbacks.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/usb/callbacks.txt b/Documentation/usb/callbacks.txt index 7c81241..bfb36b3 100644 --- a/Documentation/usb/callbacks.txt +++ b/Documentation/usb/callbacks.txt @@ -65,7 +65,7 @@ Accept or decline an interface. If you accept the device return 0, otherwise -ENODEV or -ENXIO. Other error codes should be used only if a genuine error occurred during initialisation which prevented a driver from accepting a device that would else have been accepted. -You are strongly encouraged to use usbcore'sfacility, +You are strongly encouraged to use usbcore's facility, usb_set_intfdata(), to associate a data structure with an interface, so that you know which internal state and identity you associate with a particular interface. The device will not be suspended and you may do IO -- cgit v1.1 From 82d27b2b2f3a80ffa7759a49b9cba39e47df476e Mon Sep 17 00:00:00 2001 From: Markus Heidelberg Date: Fri, 12 Jun 2009 01:02:34 +0200 Subject: trivial: remove the trivial patch monkey's name from SubmittingPatches It is outdated here and can be found in the MAINTAINERS file. Also remove the URL of the previous maintainer, similar content can be found in the SubmittingPatches file. Signed-off-by: Markus Heidelberg Signed-off-by: Jiri Kosina --- Documentation/SubmittingPatches | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index c282380..ddc903a 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -183,8 +183,9 @@ Even if the maintainer did not respond in step #4, make sure to ALWAYS copy the maintainer when you change their code. For small patches you may want to CC the Trivial Patch Monkey -trivial@kernel.org managed by Jesper Juhl; which collects "trivial" -patches. Trivial patches must qualify for one of the following rules: +trivial@kernel.org which collects "trivial" patches. Have a look +into the MAINTAINERS file for its current manager. +Trivial patches must qualify for one of the following rules: Spelling fixes in documentation Spelling fixes which could break grep(1) Warning fixes (cluttering with useless warnings is bad) @@ -196,7 +197,6 @@ patches. Trivial patches must qualify for one of the following rules: since people copy, as long as it's trivial) Any fix by the author/maintainer of the file (ie. patch monkey in re-transmission mode) -URL: -- cgit v1.1 From 9e4f5e29610162fd426366f3b29e3cc6e575b858 Mon Sep 17 00:00:00 2001 From: James Smart Date: Thu, 26 Mar 2009 13:33:19 -0400 Subject: [SCSI] FC Pass Thru support Attached is the ELS/CT pass-thru patch for the FC Transport. The patch creates a generic framework that lays on top of bsg and the SGIO v4 ioctl in order to pass transaction requests to LLDD's. The interface supports the following operations: On an fc_host basis: Request login to the specified N_Port_ID, creating an fc_rport. Request logout of the specified N_Port_ID, deleting an fc_rport Send ELS request to specified N_Port_ID w/o requiring a login, and wait for ELS response. Send CT request to specified N_Port_ID and wait for CT response. Login is required, but LLDD is allowed to manage login and decide whether it stays in place after the request is satisfied. Vendor-Unique request. Allows a LLDD-specific request to be passed to the LLDD, and the passing of a response back to the application. On an fc_rport basis: Send ELS request to nport and wait for ELS response. Send CT request to nport and wait for CT response. The patch also exports several headers from include/scsi such that they can be available to user-space applications: include/scsi/scsi.h include/scsi/scsi_netlink.h include/scsi/scsi_netlink_fc.h include/scsi/scsi_bsg_fc.h For further information, refer to the last RFC: http://marc.info/?l=linux-scsi&m=123436574018579&w=2 Note: Documentation is still spotty and will be added later. [bharrosh@panasas.com: update for new block API] Signed-off-by: James Smart Signed-off-by: James Bottomley --- Documentation/scsi/scsi_fc_transport.txt | 14 ++++++++++++-- Documentation/scsi/scsi_mid_low_api.txt | 5 +++++ 2 files changed, 17 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/scsi/scsi_fc_transport.txt b/Documentation/scsi/scsi_fc_transport.txt index e5b071d..d7f1817 100644 --- a/Documentation/scsi/scsi_fc_transport.txt +++ b/Documentation/scsi/scsi_fc_transport.txt @@ -1,10 +1,11 @@ SCSI FC Tansport ============================================= -Date: 4/12/2007 +Date: 11/18/2008 Kernel Revisions for features: rports : <> - vports : 2.6.22 (? TBD) + vports : 2.6.22 + bsg support : 2.6.30 (?TBD?) Introduction @@ -15,6 +16,7 @@ The FC transport can be found at: drivers/scsi/scsi_transport_fc.c include/scsi/scsi_transport_fc.h include/scsi/scsi_netlink_fc.h + include/scsi/scsi_bsg_fc.h This file is found at Documentation/scsi/scsi_fc_transport.txt @@ -472,6 +474,14 @@ int fc_vport_terminate(struct fc_vport *vport) +FC BSG support (CT & ELS passthru, and more) +======================================================================== +<< To Be Supplied >> + + + + + Credits ======= The following people have contributed to this document: diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt index a6d5354..de67229 100644 --- a/Documentation/scsi/scsi_mid_low_api.txt +++ b/Documentation/scsi/scsi_mid_low_api.txt @@ -1271,6 +1271,11 @@ of interest: hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size is set by the second argument (named 'xtr_bytes') to scsi_host_alloc() or scsi_register(). + vendor_id - a unique value that identifies the vendor supplying + the LLD for the Scsi_Host. Used most often in validating + vendor-specific message requests. Value consists of an + identifier type and a vendor-specific value. + See scsi_netlink.h for a description of valid formats. The scsi_host structure is defined in include/scsi/scsi_host.h -- cgit v1.1 From e240b58c79144708530138e05f17c6d0d8d744a8 Mon Sep 17 00:00:00 2001 From: Magnus Damm Date: Sun, 24 May 2009 22:05:54 +0200 Subject: PM: Remove bus_type suspend_late()/resume_early() V2 Remove the ->suspend_late() and ->resume_early() callbacks from struct bus_type V2. These callbacks are legacy stuff at this point and since there seem to be no in-tree users we may as well remove them. New users should use dev_pm_ops. Signed-off-by: Magnus Damm Acked-by: Pavel Machek Acked-by: Greg Kroah-Hartman Signed-off-by: Rafael J. Wysocki --- Documentation/power/devices.txt | 34 +++++----------------------------- 1 file changed, 5 insertions(+), 29 deletions(-) (limited to 'Documentation') diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt index 421e7d0..c9abbd8 100644 --- a/Documentation/power/devices.txt +++ b/Documentation/power/devices.txt @@ -75,9 +75,6 @@ may need to apply in domain-specific ways to their devices: struct bus_type { ... int (*suspend)(struct device *dev, pm_message_t state); - int (*suspend_late)(struct device *dev, pm_message_t state); - - int (*resume_early)(struct device *dev); int (*resume)(struct device *dev); }; @@ -226,20 +223,7 @@ The phases are seen by driver notifications issued in this order: This call should handle parts of device suspend logic that require sleeping. It probably does work to quiesce the device which hasn't - been abstracted into class.suspend() or bus.suspend_late(). - - 3 bus.suspend_late(dev, message) is called with IRQs disabled, and - with only one CPU active. Until the bus.resume_early() phase - completes (see later), IRQs are not enabled again. This method - won't be exposed by all busses; for message based busses like USB, - I2C, or SPI, device interactions normally require IRQs. This bus - call may be morphed into a driver call with bus-specific parameters. - - This call might save low level hardware state that might otherwise - be lost in the upcoming low power state, and actually put the - device into a low power state ... so that in some cases the device - may stay partly usable until this late. This "late" call may also - help when coping with hardware that behaves badly. + been abstracted into class.suspend(). The pm_message_t parameter is currently used to refine those semantics (described later). @@ -351,19 +335,11 @@ devices processing each phase's calls before the next phase begins. The phases are seen by driver notifications issued in this order: - 1 bus.resume_early(dev) is called with IRQs disabled, and with - only one CPU active. As with bus.suspend_late(), this method - won't be supported on busses that require IRQs in order to - interact with devices. - - This reverses the effects of bus.suspend_late(). - - 2 bus.resume(dev) is called next. This may be morphed into a device - driver call with bus-specific parameters; implementations may sleep. - - This reverses the effects of bus.suspend(). + 1 bus.resume(dev) reverses the effects of bus.suspend(). This may + be morphed into a device driver call with bus-specific parameters; + implementations may sleep. - 3 class.resume(dev) is called for devices associated with a class + 2 class.resume(dev) is called for devices associated with a class that has such a method. Implementations may sleep. This reverses the effects of class.suspend(), and would usually -- cgit v1.1 From dd14be4c274fc484eccace03ae9726e516630331 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Richard=20R=C3=B6jfors?= Date: Fri, 5 Jun 2009 15:40:32 +0200 Subject: i2c-ocores: Can add I2C devices to the bus MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit There is sometimes a need for the ocores driver to add devices to the bus when installed. i2c_register_board_info can not always be used, because the I2C devices are not known at an early state, they could for instance be connected on a I2C bus on a PCI device which has the Open Cores IP. i2c_new_device can not be used in all cases either since the resulting bus nummer might be unknown. The solution is the pass a list of I2C devices in the platform data to the Open Cores driver. This is useful for MFD drivers. Signed-off-by: Richard Röjfors Signed-off-by: Ben Dooks --- Documentation/i2c/busses/i2c-ocores | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'Documentation') diff --git a/Documentation/i2c/busses/i2c-ocores b/Documentation/i2c/busses/i2c-ocores index cfcebb1..c269aaa 100644 --- a/Documentation/i2c/busses/i2c-ocores +++ b/Documentation/i2c/busses/i2c-ocores @@ -20,6 +20,8 @@ platform_device with the base address and interrupt number. The dev.platform_data of the device should also point to a struct ocores_i2c_platform_data (see linux/i2c-ocores.h) describing the distance between registers and the input clock speed. +There is also a possibility to attach a list of i2c_board_info which +the i2c-ocores driver will add to the bus upon creation. E.G. something like: @@ -36,9 +38,24 @@ static struct resource ocores_resources[] = { }, }; +/* optional board info */ +struct i2c_board_info ocores_i2c_board_info[] = { + { + I2C_BOARD_INFO("tsc2003", 0x48), + .platform_data = &tsc2003_platform_data, + .irq = TSC_IRQ + }, + { + I2C_BOARD_INFO("adv7180", 0x42 >> 1), + .irq = ADV_IRQ + } +}; + static struct ocores_i2c_platform_data myi2c_data = { .regstep = 2, /* two bytes between registers */ .clock_khz = 50000, /* input clock of 50MHz */ + .devices = ocores_i2c_board_info, /* optional table of devices */ + .num_devices = ARRAY_SIZE(ocores_i2c_board_info), /* table size */ }; static struct platform_device myi2c = { -- cgit v1.1 From e594c8de3bd4e7732ed3340fb01e18ec94b12df2 Mon Sep 17 00:00:00 2001 From: Vegard Nossum Date: Sat, 13 Jun 2009 14:15:57 +0200 Subject: kmemcheck: add the kmemcheck documentation Thanks to Sitsofe Wheeler, Randy Dunlap, and Jonathan Corbet for providing input and feedback on this! Signed-off-by: Vegard Nossum --- Documentation/kmemcheck.txt | 773 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 773 insertions(+) create mode 100644 Documentation/kmemcheck.txt (limited to 'Documentation') diff --git a/Documentation/kmemcheck.txt b/Documentation/kmemcheck.txt new file mode 100644 index 0000000..3630446 --- /dev/null +++ b/Documentation/kmemcheck.txt @@ -0,0 +1,773 @@ +GETTING STARTED WITH KMEMCHECK +============================== + +Vegard Nossum + + +Contents +======== +0. Introduction +1. Downloading +2. Configuring and compiling +3. How to use +3.1. Booting +3.2. Run-time enable/disable +3.3. Debugging +3.4. Annotating false positives +4. Reporting errors +5. Technical description + + +0. Introduction +=============== + +kmemcheck is a debugging feature for the Linux Kernel. More specifically, it +is a dynamic checker that detects and warns about some uses of uninitialized +memory. + +Userspace programmers might be familiar with Valgrind's memcheck. The main +difference between memcheck and kmemcheck is that memcheck works for userspace +programs only, and kmemcheck works for the kernel only. The implementations +are of course vastly different. Because of this, kmemcheck is not as accurate +as memcheck, but it turns out to be good enough in practice to discover real +programmer errors that the compiler is not able to find through static +analysis. + +Enabling kmemcheck on a kernel will probably slow it down to the extent that +the machine will not be usable for normal workloads such as e.g. an +interactive desktop. kmemcheck will also cause the kernel to use about twice +as much memory as normal. For this reason, kmemcheck is strictly a debugging +feature. + + +1. Downloading +============== + +kmemcheck can only be downloaded using git. If you want to write patches +against the current code, you should use the kmemcheck development branch of +the tip tree. It is also possible to use the linux-next tree, which also +includes the latest version of kmemcheck. + +Assuming that you've already cloned the linux-2.6.git repository, all you +have to do is add the -tip tree as a remote, like this: + + $ git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git + +To actually download the tree, fetch the remote: + + $ git fetch tip + +And to check out a new local branch with the kmemcheck code: + + $ git checkout -b kmemcheck tip/kmemcheck + +General instructions for the -tip tree can be found here: +http://people.redhat.com/mingo/tip.git/readme.txt + + +2. Configuring and compiling +============================ + +kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of +configuration variables must have specific settings in order for the kmemcheck +menu to even appear in "menuconfig". These are: + + o CONFIG_CC_OPTIMIZE_FOR_SIZE=n + + This option is located under "General setup" / "Optimize for size". + + Without this, gcc will use certain optimizations that usually lead to + false positive warnings from kmemcheck. An example of this is a 16-bit + field in a struct, where gcc may load 32 bits, then discard the upper + 16 bits. kmemcheck sees only the 32-bit load, and may trigger a + warning for the upper 16 bits (if they're uninitialized). + + o CONFIG_SLAB=y or CONFIG_SLUB=y + + This option is located under "General setup" / "Choose SLAB + allocator". + + o CONFIG_FUNCTION_TRACER=n + + This option is located under "Kernel hacking" / "Tracers" / "Kernel + Function Tracer" + + When function tracing is compiled in, gcc emits a call to another + function at the beginning of every function. This means that when the + page fault handler is called, the ftrace framework will be called + before kmemcheck has had a chance to handle the fault. If ftrace then + modifies memory that was tracked by kmemcheck, the result is an + endless recursive page fault. + + o CONFIG_DEBUG_PAGEALLOC=n + + This option is located under "Kernel hacking" / "Debug page memory + allocations". + +In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also +located under "Kernel hacking". With this, you will be able to get line number +information from the kmemcheck warnings, which is extremely valuable in +debugging a problem. This option is not mandatory, however, because it slows +down the compilation process and produces a much bigger kernel image. + +Now the kmemcheck menu should be visible (under "Kernel hacking" / "kmemcheck: +trap use of uninitialized memory"). Here follows a description of the +kmemcheck configuration variables: + + o CONFIG_KMEMCHECK + + This must be enabled in order to use kmemcheck at all... + + o CONFIG_KMEMCHECK_[DISABLED | ENABLED | ONESHOT]_BY_DEFAULT + + This option controls the status of kmemcheck at boot-time. "Enabled" + will enable kmemcheck right from the start, "disabled" will boot the + kernel as normal (but with the kmemcheck code compiled in, so it can + be enabled at run-time after the kernel has booted), and "one-shot" is + a special mode which will turn kmemcheck off automatically after + detecting the first use of uninitialized memory. + + If you are using kmemcheck to actively debug a problem, then you + probably want to choose "enabled" here. + + The one-shot mode is mostly useful in automated test setups because it + can prevent floods of warnings and increase the chances of the machine + surviving in case something is really wrong. In other cases, the one- + shot mode could actually be counter-productive because it would turn + itself off at the very first error -- in the case of a false positive + too -- and this would come in the way of debugging the specific + problem you were interested in. + + If you would like to use your kernel as normal, but with a chance to + enable kmemcheck in case of some problem, it might be a good idea to + choose "disabled" here. When kmemcheck is disabled, most of the run- + time overhead is not incurred, and the kernel will be almost as fast + as normal. + + o CONFIG_KMEMCHECK_QUEUE_SIZE + + Select the maximum number of error reports to store in an internal + (fixed-size) buffer. Since errors can occur virtually anywhere and in + any context, we need a temporary storage area which is guaranteed not + to generate any other page faults when accessed. The queue will be + emptied as soon as a tasklet may be scheduled. If the queue is full, + new error reports will be lost. + + The default value of 64 is probably fine. If some code produces more + than 64 errors within an irqs-off section, then the code is likely to + produce many, many more, too, and these additional reports seldom give + any more information (the first report is usually the most valuable + anyway). + + This number might have to be adjusted if you are not using serial + console or similar to capture the kernel log. If you are using the + "dmesg" command to save the log, then getting a lot of kmemcheck + warnings might overflow the kernel log itself, and the earlier reports + will get lost in that way instead. Try setting this to 10 or so on + such a setup. + + o CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT + + Select the number of shadow bytes to save along with each entry of the + error-report queue. These bytes indicate what parts of an allocation + are initialized, uninitialized, etc. and will be displayed when an + error is detected to help the debugging of a particular problem. + + The number entered here is actually the logarithm of the number of + bytes that will be saved. So if you pick for example 5 here, kmemcheck + will save 2^5 = 32 bytes. + + The default value should be fine for debugging most problems. It also + fits nicely within 80 columns. + + o CONFIG_KMEMCHECK_PARTIAL_OK + + This option (when enabled) works around certain GCC optimizations that + produce 32-bit reads from 16-bit variables where the upper 16 bits are + thrown away afterwards. + + The default value (enabled) is recommended. This may of course hide + some real errors, but disabling it would probably produce a lot of + false positives. + + o CONFIG_KMEMCHECK_BITOPS_OK + + This option silences warnings that would be generated for bit-field + accesses where not all the bits are initialized at the same time. This + may also hide some real bugs. + + This option is probably obsolete, or it should be replaced with + the kmemcheck-/bitfield-annotations for the code in question. The + default value is therefore fine. + +Now compile the kernel as usual. + + +3. How to use +============= + +3.1. Booting +============ + +First some information about the command-line options. There is only one +option specific to kmemcheck, and this is called "kmemcheck". It can be used +to override the default mode as chosen by the CONFIG_KMEMCHECK_*_BY_DEFAULT +option. Its possible settings are: + + o kmemcheck=0 (disabled) + o kmemcheck=1 (enabled) + o kmemcheck=2 (one-shot mode) + +If SLUB debugging has been enabled in the kernel, it may take precedence over +kmemcheck in such a way that the slab caches which are under SLUB debugging +will not be tracked by kmemcheck. In order to ensure that this doesn't happen +(even though it shouldn't by default), use SLUB's boot option "slub_debug", +like this: slub_debug=- + +In fact, this option may also be used for fine-grained control over SLUB vs. +kmemcheck. For example, if the command line includes "kmemcheck=1 +slub_debug=,dentry", then SLUB debugging will be used only for the "dentry" +slab cache, and with kmemcheck tracking all the other caches. This is advanced +usage, however, and is not generally recommended. + + +3.2. Run-time enable/disable +============================ + +When the kernel has booted, it is possible to enable or disable kmemcheck at +run-time. WARNING: This feature is still experimental and may cause false +positive warnings to appear. Therefore, try not to use this. If you find that +it doesn't work properly (e.g. you see an unreasonable amount of warnings), I +will be happy to take bug reports. + +Use the file /proc/sys/kernel/kmemcheck for this purpose, e.g.: + + $ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck + +The numbers are the same as for the kmemcheck= command-line option. + + +3.3. Debugging +============== + +A typical report will look something like this: + +WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024) +80000000000000000000000000000000000000000088ffff0000000000000000 + i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u + ^ + +Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A +RIP: 0010:[] [] __dequeue_signal+0xc8/0x190 +RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002 +RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009 +RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84 +RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000 +R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e +R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8 +FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000 +CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 +CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0 +DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 +DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400 + [] dequeue_signal+0x8e/0x170 + [] get_signal_to_deliver+0x98/0x390 + [] do_notify_resume+0xad/0x7d0 + [] int_signal+0x12/0x17 + [] 0xffffffffffffffff + +The single most valuable information in this report is the RIP (or EIP on 32- +bit) value. This will help us pinpoint exactly which instruction that caused +the warning. + +If your kernel was compiled with CONFIG_DEBUG_INFO=y, then all we have to do +is give this address to the addr2line program, like this: + + $ addr2line -e vmlinux -i ffffffff8104ede8 + arch/x86/include/asm/string_64.h:12 + include/asm-generic/siginfo.h:287 + kernel/signal.c:380 + kernel/signal.c:410 + +The "-e vmlinux" tells addr2line which file to look in. IMPORTANT: This must +be the vmlinux of the kernel that produced the warning in the first place! If +not, the line number information will almost certainly be wrong. + +The "-i" tells addr2line to also print the line numbers of inlined functions. +In this case, the flag was very important, because otherwise, it would only +have printed the first line, which is just a call to memcpy(), which could be +called from a thousand places in the kernel, and is therefore not very useful. +These inlined functions would not show up in the stack trace above, simply +because the kernel doesn't load the extra debugging information. This +technique can of course be used with ordinary kernel oopses as well. + +In this case, it's the caller of memcpy() that is interesting, and it can be +found in include/asm-generic/siginfo.h, line 287: + +281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from) +282 { +283 if (from->si_code < 0) +284 memcpy(to, from, sizeof(*to)); +285 else +286 /* _sigchld is currently the largest know union member */ +287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld)); +288 } + +Since this was a read (kmemcheck usually warns about reads only, though it can +warn about writes to unallocated or freed memory as well), it was probably the +"from" argument which contained some uninitialized bytes. Following the chain +of calls, we move upwards to see where "from" was allocated or initialized, +kernel/signal.c, line 380: + +359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info) +360 { +... +367 list_for_each_entry(q, &list->list, list) { +368 if (q->info.si_signo == sig) { +369 if (first) +370 goto still_pending; +371 first = q; +... +377 if (first) { +378 still_pending: +379 list_del_init(&first->list); +380 copy_siginfo(info, &first->info); +381 __sigqueue_free(first); +... +392 } +393 } + +Here, it is &first->info that is being passed on to copy_siginfo(). The +variable "first" was found on a list -- passed in as the second argument to +collect_signal(). We continue our journey through the stack, to figure out +where the item on "list" was allocated or initialized. We move to line 410: + +395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask, +396 siginfo_t *info) +397 { +... +410 collect_signal(sig, pending, info); +... +414 } + +Now we need to follow the "pending" pointer, since that is being passed on to +collect_signal() as "list". At this point, we've run out of lines from the +"addr2line" output. Not to worry, we just paste the next addresses from the +kmemcheck stack dump, i.e.: + + [] dequeue_signal+0x8e/0x170 + [] get_signal_to_deliver+0x98/0x390 + [] do_notify_resume+0xad/0x7d0 + [] int_signal+0x12/0x17 + + $ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \ + ffffffff8100b87d ffffffff8100c7b5 + kernel/signal.c:446 + kernel/signal.c:1806 + arch/x86/kernel/signal.c:805 + arch/x86/kernel/signal.c:871 + arch/x86/kernel/entry_64.S:694 + +Remember that since these addresses were found on the stack and not as the +RIP value, they actually point to the _next_ instruction (they are return +addresses). This becomes obvious when we look at the code for line 446: + +422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info) +423 { +... +431 signr = __dequeue_signal(&tsk->signal->shared_pending, +432 mask, info); +433 /* +434 * itimer signal ? +435 * +436 * itimers are process shared and we restart periodic +437 * itimers in the signal delivery path to prevent DoS +438 * attacks in the high resolution timer case. This is +439 * compliant with the old way of self restarting +440 * itimers, as the SIGALRM is a legacy signal and only +441 * queued once. Changing the restart behaviour to +442 * restart the timer in the signal dequeue path is +443 * reducing the timer noise on heavy loaded !highres +444 * systems too. +445 */ +446 if (unlikely(signr == SIGALRM)) { +... +489 } + +So instead of looking at 446, we should be looking at 431, which is the line +that executes just before 446. Here we see that what we are looking for is +&tsk->signal->shared_pending. + +Our next task is now to figure out which function that puts items on this +"shared_pending" list. A crude, but efficient tool, is git grep: + + $ git grep -n 'shared_pending' kernel/ + ... + kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending; + kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending; + ... + +There were more results, but none of them were related to list operations, +and these were the only assignments. We inspect the line numbers more closely +and find that this is indeed where items are being added to the list: + +816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t, +817 int group) +818 { +... +828 pending = group ? &t->signal->shared_pending : &t->pending; +... +851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN && +852 (is_si_special(info) || +853 info->si_code >= 0))); +854 if (q) { +855 list_add_tail(&q->list, &pending->list); +... +890 } + +and: + +1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group) +1310 { +.... +1339 pending = group ? &t->signal->shared_pending : &t->pending; +1340 list_add_tail(&q->list, &pending->list); +.... +1347 } + +In the first case, the list element we are looking for, "q", is being returned +from the function __sigqueue_alloc(), which looks like an allocation function. +Let's take a look at it: + +187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags, +188 int override_rlimit) +189 { +190 struct sigqueue *q = NULL; +191 struct user_struct *user; +192 +193 /* +194 * We won't get problems with the target's UID changing under us +195 * because changing it requires RCU be used, and if t != current, the +196 * caller must be holding the RCU readlock (by way of a spinlock) and +197 * we use RCU protection here +198 */ +199 user = get_uid(__task_cred(t)->user); +200 atomic_inc(&user->sigpending); +201 if (override_rlimit || +202 atomic_read(&user->sigpending) <= +203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur) +204 q = kmem_cache_alloc(sigqueue_cachep, flags); +205 if (unlikely(q == NULL)) { +206 atomic_dec(&user->sigpending); +207 free_uid(user); +208 } else { +209 INIT_LIST_HEAD(&q->list); +210 q->flags = 0; +211 q->user = user; +212 } +213 +214 return q; +215 } + +We see that this function initializes q->list, q->flags, and q->user. It seems +that now is the time to look at the definition of "struct sigqueue", e.g.: + +14 struct sigqueue { +15 struct list_head list; +16 int flags; +17 siginfo_t info; +18 struct user_struct *user; +19 }; + +And, you might remember, it was a memcpy() on &first->info that caused the +warning, so this makes perfect sense. It also seems reasonable to assume that +it is the caller of __sigqueue_alloc() that has the responsibility of filling +out (initializing) this member. + +But just which fields of the struct were uninitialized? Let's look at +kmemcheck's report again: + +WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024) +80000000000000000000000000000000000000000088ffff0000000000000000 + i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u + ^ + +These first two lines are the memory dump of the memory object itself, and the +shadow bytemap, respectively. The memory object itself is in this case +&first->info. Just beware that the start of this dump is NOT the start of the +object itself! The position of the caret (^) corresponds with the address of +the read (ffff88003e4a2024). + +The shadow bytemap dump legend is as follows: + + i - initialized + u - uninitialized + a - unallocated (memory has been allocated by the slab layer, but has not + yet been handed off to anybody) + f - freed (memory has been allocated by the slab layer, but has been freed + by the previous owner) + +In order to figure out where (relative to the start of the object) the +uninitialized memory was located, we have to look at the disassembly. For +that, we'll need the RIP address again: + +RIP: 0010:[] [] __dequeue_signal+0xc8/0x190 + + $ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8: + ffffffff8104edc8: mov %r8,0x8(%r8) + ffffffff8104edcc: test %r10d,%r10d + ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168> + ffffffff8104edd5: mov %rax,%rdx + ffffffff8104edd8: mov $0xc,%ecx + ffffffff8104eddd: mov %r13,%rdi + ffffffff8104ede0: mov $0x30,%eax + ffffffff8104ede5: mov %rdx,%rsi + ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi) + ffffffff8104edea: test $0x2,%al + ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0> + ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi) + ffffffff8104edf0: test $0x1,%al + ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5> + ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi) + ffffffff8104edf5: mov %r8,%rdi + ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free> + +As expected, it's the "rep movsl" instruction from the memcpy() that causes +the warning. We know about REP MOVSL that it uses the register RCX to count +the number of remaining iterations. By taking a look at the register dump +again (from the kmemcheck report), we can figure out how many bytes were left +to copy: + +RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009 + +By looking at the disassembly, we also see that %ecx is being loaded with the +value $0xc just before (ffffffff8104edd8), so we are very lucky. Keep in mind +that this is the number of iterations, not bytes. And since this is a "long" +operation, we need to multiply by 4 to get the number of bytes. So this means +that the uninitialized value was encountered at 4 * (0xc - 0x9) = 12 bytes +from the start of the object. + +We can now try to figure out which field of the "struct siginfo" that was not +initialized. This is the beginning of the struct: + +40 typedef struct siginfo { +41 int si_signo; +42 int si_errno; +43 int si_code; +44 +45 union { +.. +92 } _sifields; +93 } siginfo_t; + +On 64-bit, the int is 4 bytes long, so it must the the union member that has +not been initialized. We can verify this using gdb: + + $ gdb vmlinux + ... + (gdb) p &((struct siginfo *) 0)->_sifields + $1 = (union {...} *) 0x10 + +Actually, it seems that the union member is located at offset 0x10 -- which +means that gcc has inserted 4 bytes of padding between the members si_code +and _sifields. We can now get a fuller picture of the memory dump: + + _----------------------------=> si_code + / _--------------------=> (padding) + | / _------------=> _sifields(._kill._pid) + | | / _----=> _sifields(._kill._uid) + | | | / +-------|-------|-------|-------| +80000000000000000000000000000000000000000088ffff0000000000000000 + i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u + +This allows us to realize another important fact: si_code contains the value +0x80. Remember that x86 is little endian, so the first 4 bytes "80000000" are +really the number 0x00000080. With a bit of research, we find that this is +actually the constant SI_KERNEL defined in include/asm-generic/siginfo.h: + +144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */ + +This macro is used in exactly one place in the x86 kernel: In send_signal() +in kernel/signal.c: + +816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t, +817 int group) +818 { +... +828 pending = group ? &t->signal->shared_pending : &t->pending; +... +851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN && +852 (is_si_special(info) || +853 info->si_code >= 0))); +854 if (q) { +855 list_add_tail(&q->list, &pending->list); +856 switch ((unsigned long) info) { +... +865 case (unsigned long) SEND_SIG_PRIV: +866 q->info.si_signo = sig; +867 q->info.si_errno = 0; +868 q->info.si_code = SI_KERNEL; +869 q->info.si_pid = 0; +870 q->info.si_uid = 0; +871 break; +... +890 } + +Not only does this match with the .si_code member, it also matches the place +we found earlier when looking for where siginfo_t objects are enqueued on the +"shared_pending" list. + +So to sum up: It seems that it is the padding introduced by the compiler +between two struct fields that is uninitialized, and this gets reported when +we do a memcpy() on the struct. This means that we have identified a false +positive warning. + +Normally, kmemcheck will not report uninitialized accesses in memcpy() calls +when both the source and destination addresses are tracked. (Instead, we copy +the shadow bytemap as well). In this case, the destination address clearly +was not tracked. We can dig a little deeper into the stack trace from above: + + arch/x86/kernel/signal.c:805 + arch/x86/kernel/signal.c:871 + arch/x86/kernel/entry_64.S:694 + +And we clearly see that the destination siginfo object is located on the +stack: + +782 static void do_signal(struct pt_regs *regs) +783 { +784 struct k_sigaction ka; +785 siginfo_t info; +... +804 signr = get_signal_to_deliver(&info, &ka, regs, NULL); +... +854 } + +And this &info is what eventually gets passed to copy_siginfo() as the +destination argument. + +Now, even though we didn't find an actual error here, the example is still a +good one, because it shows how one would go about to find out what the report +was all about. + + +3.4. Annotating false positives +=============================== + +There are a few different ways to make annotations in the source code that +will keep kmemcheck from checking and reporting certain allocations. Here +they are: + + o __GFP_NOTRACK_FALSE_POSITIVE + + This flag can be passed to kmalloc() or kmem_cache_alloc() (therefore + also to other functions that end up calling one of these) to indicate + that the allocation should not be tracked because it would lead to + a false positive report. This is a "big hammer" way of silencing + kmemcheck; after all, even if the false positive pertains to + particular field in a struct, for example, we will now lose the + ability to find (real) errors in other parts of the same struct. + + Example: + + /* No warnings will ever trigger on accessing any part of x */ + x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE); + + o kmemcheck_bitfield_begin(name)/kmemcheck_bitfield_end(name) and + kmemcheck_annotate_bitfield(ptr, name) + + The first two of these three macros can be used inside struct + definitions to signal, respectively, the beginning and end of a + bitfield. Additionally, this will assign the bitfield a name, which + is given as an argument to the macros. + + Having used these markers, one can later use + kmemcheck_annotate_bitfield() at the point of allocation, to indicate + which parts of the allocation is part of a bitfield. + + Example: + + struct foo { + int x; + + kmemcheck_bitfield_begin(flags); + int flag_a:1; + int flag_b:1; + kmemcheck_bitfield_end(flags); + + int y; + }; + + struct foo *x = kmalloc(sizeof *x); + + /* No warnings will trigger on accessing the bitfield of x */ + kmemcheck_annotate_bitfield(x, flags); + + Note that kmemcheck_annotate_bitfield() can be used even before the + return value of kmalloc() is checked -- in other words, passing NULL + as the first argument is legal (and will do nothing). + + +4. Reporting errors +=================== + +As we have seen, kmemcheck will produce false positive reports. Therefore, it +is not very wise to blindly post kmemcheck warnings to mailing lists and +maintainers. Instead, I encourage maintainers and developers to find errors +in their own code. If you get a warning, you can try to work around it, try +to figure out if it's a real error or not, or simply ignore it. Most +developers know their own code and will quickly and efficiently determine the +root cause of a kmemcheck report. This is therefore also the most efficient +way to work with kmemcheck. + +That said, we (the kmemcheck maintainers) will always be on the lookout for +false positives that we can annotate and silence. So whatever you find, +please drop us a note privately! Kernel configs and steps to reproduce (if +available) are of course a great help too. + +Happy hacking! + + +5. Technical description +======================== + +kmemcheck works by marking memory pages non-present. This means that whenever +somebody attempts to access the page, a page fault is generated. The page +fault handler notices that the page was in fact only hidden, and so it calls +on the kmemcheck code to make further investigations. + +When the investigations are completed, kmemcheck "shows" the page by marking +it present (as it would be under normal circumstances). This way, the +interrupted code can continue as usual. + +But after the instruction has been executed, we should hide the page again, so +that we can catch the next access too! Now kmemcheck makes use of a debugging +feature of the processor, namely single-stepping. When the processor has +finished the one instruction that generated the memory access, a debug +exception is raised. From here, we simply hide the page again and continue +execution, this time with the single-stepping feature turned off. + +kmemcheck requires some assistance from the memory allocator in order to work. +The memory allocator needs to + + 1. Tell kmemcheck about newly allocated pages and pages that are about to + be freed. This allows kmemcheck to set up and tear down the shadow memory + for the pages in question. The shadow memory stores the status of each + byte in the allocation proper, e.g. whether it is initialized or + uninitialized. + + 2. Tell kmemcheck which parts of memory should be marked uninitialized. + There are actually a few more states, such as "not yet allocated" and + "recently freed". + +If a slab cache is set up using the SLAB_NOTRACK flag, it will never return +memory that can take page faults because of kmemcheck. + +If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still +request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags. +This does not prevent the page faults from occurring, however, but marks the +object in question as being initialized so that no warnings will ever be +produced for this object. + +Currently, the SLAB and SLUB allocators are supported by kmemcheck. -- cgit v1.1 From 8a8a2050c844d9de224ff591e91bda3f77bd6eda Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Sat, 13 Jun 2009 10:08:59 -0400 Subject: ext4: document the "abort" mount option Signed-off-by: "Theodore Ts'o" --- Documentation/filesystems/ext4.txt | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 608fdba..7be02ac 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -235,6 +235,10 @@ minixdf Make 'df' act like Minix. debug Extra debugging information is sent to syslog. +abort Simulate the effects of calling ext4_abort() for + debugging purposes. This is normally used while + remounting a filesystem which is already mounted. + errors=remount-ro Remount the filesystem read-only on an error. errors=continue Keep going on a filesystem error. errors=panic Panic and halt the machine if an error occurs. -- cgit v1.1 From 11013911daea4820147ae6d7094dd7c6894e8651 Mon Sep 17 00:00:00 2001 From: Andreas Dilger Date: Sat, 13 Jun 2009 11:45:35 -0400 Subject: ext4: teach the inode allocator to use a goal inode number Enhance the inode allocator to take a goal inode number as a paremeter; if it is specified, it takes precedence over Orlov or parent directory inode allocation algorithms. The extents migration function uses the goal inode number so that the extent trees allocated the migration function use the correct flex_bg. In the future, the goal inode functionality will also be used to allocate an adjacent inode for the extended attributes. Also, for testing purposes the goal inode number can be specified via /sys/fs/{dev}/inode_goal. This can be useful for testing inode allocation beyond 2^32 blocks on very large filesystems. Signed-off-by: Andreas Dilger Signed-off-by: "Theodore Ts'o" --- Documentation/ABI/testing/sysfs-fs-ext4 | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-fs-ext4 b/Documentation/ABI/testing/sysfs-fs-ext4 index 4e79074..5fb7099 100644 --- a/Documentation/ABI/testing/sysfs-fs-ext4 +++ b/Documentation/ABI/testing/sysfs-fs-ext4 @@ -79,3 +79,13 @@ Description: This file is read-only and shows the number of kilobytes of data that have been written to this filesystem since it was mounted. + +What: /sys/fs/ext4//inode_goal +Date: June 2008 +Contact: "Theodore Ts'o" +Description: + Tuning parameter which (if non-zero) controls the goal + inode used by the inode allocator in p0reference to + all other allocation hueristics. This is intended for + debugging use only, and should be 0 on production + systems. -- cgit v1.1 From 2185a5ecd98d2cebc6a29b07b1ea4f7334c2ccc3 Mon Sep 17 00:00:00 2001 From: Adam Lackorzynski Date: Sun, 14 Jun 2009 22:38:59 +0200 Subject: documentation: make version fix The Makefiles in the build directories use the internal make variable MAKEFILE_LIST which is available from make 3.80 only. (The patch would be valid back to 2.6.25) Signed-off-by: Adam Lackorzynski Signed-off-by: Andrew Morton Signed-off-by: Sam Ravnborg --- Documentation/Changes | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/Changes b/Documentation/Changes index b95082b..112e765 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -29,7 +29,7 @@ hardware, for example, you probably needn't concern yourself with isdn4k-utils. o Gnu C 3.2 # gcc --version -o Gnu make 3.79.1 # make --version +o Gnu make 3.80 # make --version o binutils 2.12 # ld -v o util-linux 2.10o # fdformat --version o module-init-tools 0.9.10 # depmod -V @@ -61,7 +61,7 @@ computer. Make ---- -You will need Gnu make 3.79.1 or later to build the kernel. +You will need Gnu make 3.80 or later to build the kernel. Binutils -------- -- cgit v1.1 From 4f4d1ad6ee69027f51f9d137f7e7d3c863cbc53d Mon Sep 17 00:00:00 2001 From: Thomas Renninger Date: Wed, 22 Apr 2009 13:48:31 +0200 Subject: [CPUFREQ] Only set sampling_rate_max deprecated, sampling_rate_min is useful Update the documentation accordingly. Cleanup and use printk_once. Signed-off-by: Thomas Renninger Signed-off-by: Dave Jones --- Documentation/cpu-freq/governors.txt | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) (limited to 'Documentation') diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index ce73f3eb..aed082f 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt @@ -119,10 +119,6 @@ want the kernel to look at the CPU usage and to make decisions on what to do about the frequency. Typically this is set to values of around '10000' or more. It's default value is (cmp. with users-guide.txt): transition_latency * 1000 -The lowest value you can set is: -transition_latency * 100 or it may get restricted to a value where it -makes not sense for the kernel anymore to poll that often which depends -on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000). Be aware that transition latency is in ns and sampling_rate is in us, so you get the same sysfs value by default. Sampling rate should always get adjusted considering the transition latency @@ -131,14 +127,20 @@ in the bash (as said, 1000 is default), do: echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \ >ondemand/sampling_rate -show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT. -You can use wider ranges now and the general -cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be -used to obtain exactly the same info: -show_sampling_rate_min = transtition_latency * 500 / 1000 -show_sampling_rate_max = transtition_latency * 500000 / 1000 -(divided by 1000 is to illustrate that sampling rate is in us and -transition latency is exported ns). +show_sampling_rate_min: +The sampling rate is limited by the HW transition latency: +transition_latency * 100 +Or by kernel restrictions: +If CONFIG_NO_HZ is set, the limit is 10ms fixed. +If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the +limits depend on the CONFIG_HZ option: +HZ=1000: min=20000us (20ms) +HZ=250: min=80000us (80ms) +HZ=100: min=200000us (200ms) +The highest value of kernel and HW latency restrictions is shown and +used as the minimum sampling rate. + +show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT. up_threshold: defines what the average CPU usage between the samplings of 'sampling_rate' needs to be for the kernel to make a decision on -- cgit v1.1 From 51555c0e91160f6d4c6c1cb7a44d20ea346aed08 Mon Sep 17 00:00:00 2001 From: Chumbalkar Nagananda Date: Thu, 21 May 2009 23:29:48 +0000 Subject: [CPUFREQ] minor correction to cpu-freq documentation I have been reading the documentation for cpufreq closely. Found a couple of minor errors in the Documentation. Signed-off-by: Naga Chumbalkar Signed-off-by: Dave Jones --- Documentation/cpu-freq/cpu-drivers.txt | 2 +- Documentation/cpu-freq/user-guide.txt | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt index 43c7439..75a58d1 100644 --- a/Documentation/cpu-freq/cpu-drivers.txt +++ b/Documentation/cpu-freq/cpu-drivers.txt @@ -155,7 +155,7 @@ actual frequency must be determined using the following rules: - if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal target_freq. ("H for highest, but no higher than") -Here again the frequency table helper might assist you - see section 3 +Here again the frequency table helper might assist you - see section 2 for details. diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt index 75f4119..5d5f5fa 100644 --- a/Documentation/cpu-freq/user-guide.txt +++ b/Documentation/cpu-freq/user-guide.txt @@ -31,7 +31,6 @@ Contents: 3. How to change the CPU cpufreq policy and/or speed 3.1 Preferred interface: sysfs -3.2 Deprecated interfaces -- cgit v1.1 From a231591f0427cfb91ae247be974a7fa0e6b37389 Mon Sep 17 00:00:00 2001 From: Harald Welte Date: Mon, 15 Jun 2009 18:01:49 +0200 Subject: i2c-viapro: Add new PCI device ID for VX855 The south bridge of the VIA VX855 chipset has a different PCI Device ID so i2c-viapro.c needs to be updated with this. Signed-off-by: Harald Welte Signed-off-by: Jean Delvare --- Documentation/i2c/busses/i2c-viapro | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro index 22efedf..2e758b0 100644 --- a/Documentation/i2c/busses/i2c-viapro +++ b/Documentation/i2c/busses/i2c-viapro @@ -19,6 +19,9 @@ Supported adapters: * VIA Technologies, Inc. VX800/VX820 Datasheet: available on http://linux.via.com.tw + * VIA Technologies, Inc. VX855/VX875 + Datasheet: Availability unknown + Authors: Kyösti Mälkki , Mark D. Studebaker , @@ -53,6 +56,7 @@ Your lspci -n listing must show one of these : device 1106:3287 (VT8251) device 1106:8324 (CX700) device 1106:8353 (VX800/VX820) + device 1106:8409 (VX855/VX875) If none of these show up, you should look in the BIOS for settings like enable ACPI / SMBus or even USB. -- cgit v1.1 From 8070408b5446232ba6eb6e0809a329da58a6ae52 Mon Sep 17 00:00:00 2001 From: "Darrick J. Wong" Date: Mon, 15 Jun 2009 18:39:46 +0200 Subject: hwmon: (ibmaem) Automatically load on HC10 blade Enable auto-probing for the HC10 blade and amend the supported system list. Signed-off-by: Darrick J. Wong Signed-off-by: Jean Delvare --- Documentation/hwmon/ibmaem | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/hwmon/ibmaem b/Documentation/hwmon/ibmaem index e98bdfe..1e0d59e 100644 --- a/Documentation/hwmon/ibmaem +++ b/Documentation/hwmon/ibmaem @@ -7,7 +7,7 @@ henceforth as AEM. Supported systems: * Any recent IBM System X server with AEM support. This includes the x3350, x3550, x3650, x3655, x3755, x3850 M2, - x3950 M2, and certain HS2x/LS2x/QS2x blades. The IPMI host interface + x3950 M2, and certain HC10/HS2x/LS2x/QS2x blades. The IPMI host interface driver ("ipmi-si") needs to be loaded for this driver to do anything. Prefix: 'ibmaem' Datasheet: Not available -- cgit v1.1 From cd4e96c5dd4a72bdc54ea9981e02465708c204d3 Mon Sep 17 00:00:00 2001 From: Andre Prendel Date: Mon, 15 Jun 2009 18:39:49 +0200 Subject: hwmon: (tmp401) Add documentation Documentation for the tmp401 driver. The documentation describes the tmp401 driver and the supported Texas Instruments TMP401 and TMP411 temperature sensor chips. Further documentation for new sysfs attributes supported by this driver is added to Documentation/hwmon/sysfs-interface. Signed-off-by: Andre Prendel Acked-by: Hans de Goede Signed-off-by: Jean Delvare --- Documentation/hwmon/sysfs-interface | 19 +++++++++++++++++ Documentation/hwmon/tmp401 | 42 +++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100644 Documentation/hwmon/tmp401 (limited to 'Documentation') diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface index 004ee16..dcbd502 100644 --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -70,6 +70,7 @@ are interpreted as 0! For more on how written strings are interpreted see the [0-*] denotes any positive number starting from 0 [1-*] denotes any positive number starting from 1 RO read only value +WO write only value RW read/write value Read/write values may be read-only for some chips, depending on the @@ -295,6 +296,24 @@ temp[1-*]_label Suggested temperature channel label. user-space. RO +temp[1-*]_lowest + Historical minimum temperature + Unit: millidegree Celsius + RO + +temp[1-*]_highest + Historical maximum temperature + Unit: millidegree Celsius + RO + +temp[1-*]_reset_history + Reset temp_lowest and temp_highest + WO + +temp_reset_history + Reset temp_lowest and temp_highest for all sensors + WO + Some chips measure temperature using external thermistors and an ADC, and report the temperature measurement as a voltage. Converting this voltage back to a temperature (or the other way around for limits) requires diff --git a/Documentation/hwmon/tmp401 b/Documentation/hwmon/tmp401 new file mode 100644 index 0000000..9fc4472 --- /dev/null +++ b/Documentation/hwmon/tmp401 @@ -0,0 +1,42 @@ +Kernel driver tmp401 +==================== + +Supported chips: + * Texas Instruments TMP401 + Prefix: 'tmp401' + Addresses scanned: I2C 0x4c + Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html + * Texas Instruments TMP411 + Prefix: 'tmp411' + Addresses scanned: I2C 0x4c + Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html + +Authors: + Hans de Goede + Andre Prendel + +Description +----------- + +This driver implements support for Texas Instruments TMP401 and +TMP411 chips. These chips implements one remote and one local +temperature sensor. Temperature is measured in degrees +Celsius. Resolution of the remote sensor is 0.0625 degree. Local +sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not +supported by the driver so far, so using the default resolution of 0.5 +degree). + +The driver provides the common sysfs-interface for temperatures (see +/Documentation/hwmon/sysfs-interface under Temperatures). + +The TMP411 chip is compatible with TMP401. It provides some additional +features. + +* Minimum and Maximum temperature measured since power-on, chip-reset + + Exported via sysfs attributes tempX_lowest and tempX_highest. + +* Reset of historical minimum/maximum temperature measurements + + Exported via sysfs attribute temp_reset_history. Writing 1 to this + file triggers a reset. -- cgit v1.1 From c1e48dce05ff06266cdfd0cba55fc5367cd499a5 Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Mon, 15 Jun 2009 18:39:50 +0200 Subject: hwmon: (w83627ehf) Add W83627DHG-P support Add support for the new incarnation of the Winbond/Nuvoton W83627DHG chip known as W83627DHG-P. It is basically the same as the original W83627DHG with an additional automatic can speed control mode (not supported by the driver yet.) Signed-off-by: Jean Delvare Tested-by: Madhu --- Documentation/hwmon/w83627ehf | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/hwmon/w83627ehf b/Documentation/hwmon/w83627ehf index b6eb593..02b7489 100644 --- a/Documentation/hwmon/w83627ehf +++ b/Documentation/hwmon/w83627ehf @@ -12,6 +12,10 @@ Supported chips: Addresses scanned: ISA address retrieved from Super I/O registers Datasheet: http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf + * Winbond W83627DHG-P + Prefix: 'w83627dhg' + Addresses scanned: ISA address retrieved from Super I/O registers + Datasheet: not available * Winbond W83667HG Prefix: 'w83667hg' Addresses scanned: ISA address retrieved from Super I/O registers @@ -28,8 +32,8 @@ Description ----------- This driver implements support for the Winbond W83627EHF, W83627EHG, -W83627DHG and W83667HG super I/O chips. We will refer to them collectively -as Winbond chips. +W83627DHG, W83627DHG-P and W83667HG super I/O chips. We will refer to them +collectively as Winbond chips. The chips implement three temperature sensors, five fan rotation speed sensors, ten analog voltage sensors (only nine for the 627DHG), one @@ -135,3 +139,6 @@ done in the driver for all register addresses. The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and the ICH8 southbridge gets that data via PECI from the DHG, so that the southbridge drives the fans. And the DHG supports SST, a one-wire serial bus. + +The DHG-P has an additional automatic fan speed control mode named Smart Fan +(TM) III+. This mode is not yet supported by the driver. -- cgit v1.1 From 09475d32e652fe60901fe8c9cd50f3f6db0c4933 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Mon, 15 Jun 2009 18:39:52 +0200 Subject: hwmon: (f71882fg) Add support for the F71858F Add support for the hwmon part of the Fintek F71858FG superio IC to the f71882fg driver. Many thanks to Jelle de Jong for lending me a motherboard with this superio on it. Signed-off-by: Hans de Goede Signed-off-by: Jean Delvare --- Documentation/hwmon/f71882fg | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/hwmon/f71882fg b/Documentation/hwmon/f71882fg index a832126..bee4c30 100644 --- a/Documentation/hwmon/f71882fg +++ b/Documentation/hwmon/f71882fg @@ -2,14 +2,18 @@ Kernel driver f71882fg ====================== Supported chips: - * Fintek F71882FG and F71883FG - Prefix: 'f71882fg' + * Fintek F71858FG + Prefix: 'f71858fg' Addresses scanned: none, address read from Super I/O config space Datasheet: Available from the Fintek website * Fintek F71862FG and F71863FG Prefix: 'f71862fg' Addresses scanned: none, address read from Super I/O config space Datasheet: Available from the Fintek website + * Fintek F71882FG and F71883FG + Prefix: 'f71882fg' + Addresses scanned: none, address read from Super I/O config space + Datasheet: Available from the Fintek website * Fintek F8000 Prefix: 'f8000' Addresses scanned: none, address read from Super I/O config space @@ -66,13 +70,13 @@ printed when loading the driver. Three different fan control modes are supported; the mode number is written to the pwm#_enable file. Note that not all modes are supported on all -chips, and some modes may only be available in RPM / PWM mode on the F8000. +chips, and some modes may only be available in RPM / PWM mode. Writing an unsupported mode will result in an invalid parameter error. * 1: Manual mode You ask for a specific PWM duty cycle / DC voltage or a specific % of fan#_full_speed by writing to the pwm# file. This mode is only - available on the F8000 if the fan channel is in RPM mode. + available on the F71858FG / F8000 if the fan channel is in RPM mode. * 2: Normal auto mode You can define a number of temperature/fan speed trip points, which % the -- cgit v1.1 From ce0879e324df007f12cc25227102847b92fcafb7 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Mon, 15 Jun 2009 15:36:38 +0200 Subject: rfkill: improve docs Now that the dust has settled a bit, improve the docs on rfkill and include more information about /dev/rfkill. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/rfkill.txt | 137 ++++++++++++++++++++++++----------------------- 1 file changed, 69 insertions(+), 68 deletions(-) (limited to 'Documentation') diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index 1b74b5f..c8acd86 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -3,9 +3,8 @@ rfkill - RF kill switch support 1. Introduction 2. Implementation details -3. Kernel driver guidelines -4. Kernel API -5. Userspace support +3. Kernel API +4. Userspace support 1. Introduction @@ -19,82 +18,62 @@ disable all transmitters of a certain type (or all). This is intended for situations where transmitters need to be turned off, for example on aircraft. +The rfkill subsystem has a concept of "hard" and "soft" block, which +differ little in their meaning (block == transmitters off) but rather in +whether they can be changed or not: + - hard block: read-only radio block that cannot be overriden by software + - soft block: writable radio block (need not be readable) that is set by + the system software. 2. Implementation details -The rfkill subsystem is composed of various components: the rfkill class, the -rfkill-input module (an input layer handler), and some specific input layer -events. - -The rfkill class is provided for kernel drivers to register their radio -transmitter with the kernel, provide methods for turning it on and off and, -optionally, letting the system know about hardware-disabled states that may -be implemented on the device. This code is enabled with the CONFIG_RFKILL -Kconfig option, which drivers can "select". - -The rfkill class code also notifies userspace of state changes, this is -achieved via uevents. It also provides some sysfs files for userspace to -check the status of radio transmitters. See the "Userspace support" section -below. +The rfkill subsystem is composed of three main components: + * the rfkill core, + * the deprecated rfkill-input module (an input layer handler, being + replaced by userspace policy code) and + * the rfkill drivers. +The rfkill core provides API for kernel drivers to register their radio +transmitter with the kernel, methods for turning it on and off and, letting +the system know about hardware-disabled states that may be implemented on +the device. -The rfkill-input code implements a basic response to rfkill buttons -- it -implements turning on/off all devices of a certain class (or all). +The rfkill core code also notifies userspace of state changes, and provides +ways for userspace to query the current states. See the "Userspace support" +section below. When the device is hard-blocked (either by a call to rfkill_set_hw_state() -or from query_hw_block) set_block() will be invoked but drivers can well -ignore the method call since they can use the return value of the function -rfkill_set_hw_state() to sync the software state instead of keeping track -of calls to set_block(). - - -The entire functionality is spread over more than one subsystem: - - * The kernel input layer generates KEY_WWAN, KEY_WLAN etc. and - SW_RFKILL_ALL -- when the user presses a button. Drivers for radio - transmitters generally do not register to the input layer, unless the - device really provides an input device (i.e. a button that has no - effect other than generating a button press event) - - * The rfkill-input code hooks up to these events and switches the soft-block - of the various radio transmitters, depending on the button type. - - * The rfkill drivers turn off/on their transmitters as requested. - - * The rfkill class will generate userspace notifications (uevents) to tell - userspace what the current state is. +or from query_hw_block) set_block() will be invoked for additional software +block, but drivers can ignore the method call since they can use the return +value of the function rfkill_set_hw_state() to sync the software state +instead of keeping track of calls to set_block(). In fact, drivers should +use the return value of rfkill_set_hw_state() unless the hardware actually +keeps track of soft and hard block separately. +3. Kernel API -3. Kernel driver guidelines - -Drivers for radio transmitters normally implement only the rfkill class. -These drivers may not unblock the transmitter based on own decisions, they -should act on information provided by the rfkill class only. +Drivers for radio transmitters normally implement an rfkill driver. Platform drivers might implement input devices if the rfkill button is just that, a button. If that button influences the hardware then you need to -implement an rfkill class instead. This also applies if the platform provides +implement an rfkill driver instead. This also applies if the platform provides a way to turn on/off the transmitter(s). -During suspend/hibernation, transmitters should only be left enabled when -wake-on wlan or similar functionality requires it and the device wasn't -blocked before suspend/hibernate. Note that it may be necessary to update -the rfkill subsystem's idea of what the current state is at resume time if -the state may have changed over suspend. - +For some platforms, it is possible that the hardware state changes during +suspend/hibernation, in which case it will be necessary to update the rfkill +core with the current state is at resume time. +To create an rfkill driver, driver's Kconfig needs to have -4. Kernel API + depends on RFKILL || !RFKILL -To build a driver with rfkill subsystem support, the driver should depend on -(or select) the Kconfig symbol RFKILL. - -The hardware the driver talks to may be write-only (where the current state -of the hardware is unknown), or read-write (where the hardware can be queried -about its current state). +to ensure the driver cannot be built-in when rfkill is modular. The !RFKILL +case allows the driver to be built when rfkill is not configured, which which +case all rfkill API can still be used but will be provided by static inlines +which compile to almost nothing. Calling rfkill_set_hw_state() when a state change happens is required from rfkill drivers that control devices that can be hard-blocked unless they also @@ -105,10 +84,33 @@ device). Don't do this unless you cannot get the event in any other way. 5. Userspace support -The following sysfs entries exist for every rfkill device: +The recommended userspace interface to use is /dev/rfkill, which is a misc +character device that allows userspace to obtain and set the state of rfkill +devices and sets of devices. It also notifies userspace about device addition +and removal. The API is a simple read/write API that is defined in +linux/rfkill.h, with one ioctl that allows turning off the deprecated input +handler in the kernel for the transition period. + +Except for the one ioctl, communication with the kernel is done via read() +and write() of instances of 'struct rfkill_event'. In this structure, the +soft and hard block are properly separated (unlike sysfs, see below) and +userspace is able to get a consistent snapshot of all rfkill devices in the +system. Also, it is possible to switch all rfkill drivers (or all drivers of +a specified type) into a state which also updates the default state for +hotplugged devices. + +After an application opens /dev/rfkill, it can read the current state of +all devices, and afterwards can poll the descriptor for hotplug or state +change events. + +Applications must ignore operations (the "op" field) they do not handle, +this allows the API to be extended in the future. + +Additionally, each rfkill device is registered in sysfs and there has the +following attributes: name: Name assigned by driver to this key (interface or driver name). - type: Name of the key type ("wlan", "bluetooth", etc). + type: Driver type string ("wlan", "bluetooth", etc). state: Current state of the transmitter 0: RFKILL_STATE_SOFT_BLOCKED transmitter is turned off by software @@ -117,7 +119,12 @@ The following sysfs entries exist for every rfkill device: 2: RFKILL_STATE_HARD_BLOCKED transmitter is forced off by something outside of the driver's control. - claim: 0: Kernel handles events (currently always reads that value) + This file is deprecated because it can only properly show + three of the four possible states, soft-and-hard-blocked is + missing. + claim: 0: Kernel handles events + This file is deprecated because there no longer is a way to + claim just control over a single rfkill instance. rfkill devices also issue uevents (with an action of "change"), with the following environment variables set: @@ -128,9 +135,3 @@ RFKILL_TYPE The contents of these variables corresponds to the "name", "state" and "type" sysfs files explained above. - -An alternative userspace interface exists as a misc device /dev/rfkill, -which allows userspace to obtain and set the state of rfkill devices and -sets of devices. It also notifies userspace about device addition and -removal. The API is a simple read/write API that is defined in -linux/rfkill.h. -- cgit v1.1 From 39f45d7bc0df52db8a7dd099ac02fef89514b7fc Mon Sep 17 00:00:00 2001 From: Michel Pollet Date: Wed, 20 May 2009 11:10:31 +0100 Subject: [ARM] MINI2440: Document the mini2440= kernel parameter Also explains why the touchscreen support is not present, and were to find a working one... Signed-off-by: Michel Pollet [ben-linux@fluff.org: Fix header description] Signed-off-by: Ben Dooks --- Documentation/kernel-parameters.txt | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 5f66ba2..4b765b7 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1351,6 +1351,27 @@ and is between 256 and 4096 characters. It is defined in the file min_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory below this physical address is ignored. + mini2440= [ARM,HW,KNL] + Format:[0..2][b][c][t] + Default: "0tb" + MINI2440 configuration specification: + 0 - The attached screen is the 3.5" TFT + 1 - The attached screen is the 7" TFT + 2 - The VGA Shield is attached (1024x768) + Leaving out the screen size parameter will not load + the TFT driver, and the framebuffer will be left + unconfigured. + b - Enable backlight. The TFT backlight pin will be + linked to the kernel VESA blanking code and a GPIO + LED. This parameter is not necessary when using the + VGA shield. + c - Enable the s3c camera interface. + t - Reserved for enabling touchscreen support. The + touchscreen support is not enabled in the mainstream + kernel as of 2.6.30, a preliminary port can be found + in the "bleeding edge" mini2440 support kernel at + http://repo.or.cz/w/linux-2.6/mini2440.git + mminit_loglevel= [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this parameter allows control of the logging verbosity for -- cgit v1.1 From b22813b373749d0878e7140e9a6eadf182298709 Mon Sep 17 00:00:00 2001 From: Grant Likely Date: Fri, 6 Mar 2009 14:05:39 -0700 Subject: Driver Core: Warn driver authors about adding device attributes Add a blurb to the driver-model documentation about how (not) to add extra attributes to a struct device at driver probe time. Signed-off-by: Grant Likely Cc: Kay Sievers Signed-off-by: Greg Kroah-Hartman --- Documentation/driver-model/device.txt | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) (limited to 'Documentation') diff --git a/Documentation/driver-model/device.txt b/Documentation/driver-model/device.txt index a7cbfff..a124f31 100644 --- a/Documentation/driver-model/device.txt +++ b/Documentation/driver-model/device.txt @@ -162,3 +162,35 @@ device_remove_file(dev,&dev_attr_power); The file name will be 'power' with a mode of 0644 (-rw-r--r--). +Word of warning: While the kernel allows device_create_file() and +device_remove_file() to be called on a device at any time, userspace has +strict expectations on when attributes get created. When a new device is +registered in the kernel, a uevent is generated to notify userspace (like +udev) that a new device is available. If attributes are added after the +device is registered, then userspace won't get notified and userspace will +not know about the new attributes. + +This is important for device driver that need to publish additional +attributes for a device at driver probe time. If the device driver simply +calls device_create_file() on the device structure passed to it, then +userspace will never be notified of the new attributes. Instead, it should +probably use class_create() and class->dev_attrs to set up a list of +desired attributes in the modules_init function, and then in the .probe() +hook, and then use device_create() to create a new device as a child +of the probed device. The new device will generate a new uevent and +properly advertise the new attributes to userspace. + +For example, if a driver wanted to add the following attributes: +struct device_attribute mydriver_attribs[] = { + __ATTR(port_count, 0444, port_count_show), + __ATTR(serial_number, 0444, serial_number_show), + NULL +}; + +Then in the module init function is would do: + mydriver_class = class_create(THIS_MODULE, "my_attrs"); + mydriver_class.dev_attr = mydriver_attribs; + +And assuming 'dev' is the struct device passed into the probe hook, the driver +probe function would do something like: + create_device(&mydriver_class, dev, chrdev, &private_data, "my_name"); -- cgit v1.1 From 7fcab099795812a8a08eb3b8c8ddb35c3685045f Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Fri, 29 May 2009 11:33:19 +0800 Subject: driver core: fix documentation of request_firmware_nowait request_firmware_nowait declares it can be called in non-sleep contexts, but kthead_run called by request_firmware_nowait may sleep. So fix its documentation and comment to make callers clear about it. Signed-off-by: Ming Lei Signed-off-by: Greg Kroah-Hartman --- Documentation/firmware_class/README | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/firmware_class/README b/Documentation/firmware_class/README index c3480aa..7eceaff 100644 --- a/Documentation/firmware_class/README +++ b/Documentation/firmware_class/README @@ -77,7 +77,8 @@ seconds for the whole load operation. - request_firmware_nowait() is also provided for convenience in - non-user contexts. + user contexts to request firmware asynchronously, but can't be called + in atomic contexts. about in-kernel persistence: -- cgit v1.1 From 156f5a7801195fa2ce44aeeb62d6cf8468f3332a Mon Sep 17 00:00:00 2001 From: GeunSik Lim Date: Tue, 2 Jun 2009 15:01:37 +0900 Subject: debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/" directory name to mount debugfs filesystem for ftrace according to ./Documentation/tracers/ftrace.txt file. And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is existed in kernel source like ftrace, DRM, Wireless, Documentation, Network[sky2]files to mount debugfs filesystem. debugfs means debug filesystem for debugging easy to use by greg kroah hartman. "/sys/kernel/debug/" name is suitable as directory name of debugfs filesystem. - debugfs related reference: http://lwn.net/Articles/334546/ Fix inconsistency of directory name to mount debugfs filesystem. * From Steven Rostedt - find_debugfs() and tracing_files() in this patch. Signed-off-by: GeunSik Lim Acked-by : Inaky Perez-Gonzalez Reviewed-by : Steven Rostedt Reviewed-by : James Smart CC: Jiri Kosina CC: David Airlie CC: Peter Osterlund CC: Ananth N Mavinakayanahalli CC: Anil S Keshavamurthy CC: Masami Hiramatsu Signed-off-by: Greg Kroah-Hartman --- Documentation/DocBook/debugobjects.tmpl | 2 +- Documentation/cdrom/packet-writing.txt | 2 +- Documentation/fault-injection/fault-injection.txt | 70 +++---- Documentation/kprobes.txt | 6 +- Documentation/trace/ftrace.txt | 233 +++++++++++++--------- Documentation/trace/mmiotrace.txt | 26 +-- 6 files changed, 195 insertions(+), 144 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl index 7f5f218..08ff908 100644 --- a/Documentation/DocBook/debugobjects.tmpl +++ b/Documentation/DocBook/debugobjects.tmpl @@ -106,7 +106,7 @@ number of errors are printk'ed including a full stack trace. - The statistics are available via debugfs/debug_objects/stats. + The statistics are available via /sys/kernel/debug/debug_objects/stats. They provide information about the number of warnings and the number of successful fixups along with information about the usage of the internal tracking objects and the state of the diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.txt index cf1f812..1c40777 100644 --- a/Documentation/cdrom/packet-writing.txt +++ b/Documentation/cdrom/packet-writing.txt @@ -117,7 +117,7 @@ Using the pktcdvd debugfs interface To read pktcdvd device infos in human readable form, do: - # cat /debug/pktcdvd/pktcdvd[0-7]/info + # cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info For a description of the debugfs interface look into the file: diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt index 4bc374a..0793056 100644 --- a/Documentation/fault-injection/fault-injection.txt +++ b/Documentation/fault-injection/fault-injection.txt @@ -29,16 +29,16 @@ o debugfs entries fault-inject-debugfs kernel module provides some debugfs entries for runtime configuration of fault-injection capabilities. -- /debug/fail*/probability: +- /sys/kernel/debug/fail*/probability: likelihood of failure injection, in percent. Format: Note that one-failure-per-hundred is a very high error rate for some testcases. Consider setting probability=100 and configure - /debug/fail*/interval for such testcases. + /sys/kernel/debug/fail*/interval for such testcases. -- /debug/fail*/interval: +- /sys/kernel/debug/fail*/interval: specifies the interval between failures, for calls to should_fail() that pass all the other tests. @@ -46,18 +46,18 @@ configuration of fault-injection capabilities. Note that if you enable this, by setting interval>1, you will probably want to set probability=100. -- /debug/fail*/times: +- /sys/kernel/debug/fail*/times: specifies how many times failures may happen at most. A value of -1 means "no limit". -- /debug/fail*/space: +- /sys/kernel/debug/fail*/space: specifies an initial resource "budget", decremented by "size" on each call to should_fail(,size). Failure injection is suppressed until "space" reaches zero. -- /debug/fail*/verbose +- /sys/kernel/debug/fail*/verbose Format: { 0 | 1 | 2 } specifies the verbosity of the messages when failure is @@ -65,17 +65,17 @@ configuration of fault-injection capabilities. log line per failure; '2' will print a call trace too -- useful to debug the problems revealed by fault injection. -- /debug/fail*/task-filter: +- /sys/kernel/debug/fail*/task-filter: Format: { 'Y' | 'N' } A value of 'N' disables filtering by process (default). Any positive value limits failures to only processes indicated by /proc//make-it-fail==1. -- /debug/fail*/require-start: -- /debug/fail*/require-end: -- /debug/fail*/reject-start: -- /debug/fail*/reject-end: +- /sys/kernel/debug/fail*/require-start: +- /sys/kernel/debug/fail*/require-end: +- /sys/kernel/debug/fail*/reject-start: +- /sys/kernel/debug/fail*/reject-end: specifies the range of virtual addresses tested during stacktrace walking. Failure is injected only if some caller @@ -84,26 +84,26 @@ configuration of fault-injection capabilities. Default required range is [0,ULONG_MAX) (whole of virtual address space). Default rejected range is [0,0). -- /debug/fail*/stacktrace-depth: +- /sys/kernel/debug/fail*/stacktrace-depth: specifies the maximum stacktrace depth walked during search for a caller within [require-start,require-end) OR [reject-start,reject-end). -- /debug/fail_page_alloc/ignore-gfp-highmem: +- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem: Format: { 'Y' | 'N' } default is 'N', setting it to 'Y' won't inject failures into highmem/user allocations. -- /debug/failslab/ignore-gfp-wait: -- /debug/fail_page_alloc/ignore-gfp-wait: +- /sys/kernel/debug/failslab/ignore-gfp-wait: +- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait: Format: { 'Y' | 'N' } default is 'N', setting it to 'Y' will inject failures only into non-sleep allocations (GFP_ATOMIC allocations). -- /debug/fail_page_alloc/min-order: +- /sys/kernel/debug/fail_page_alloc/min-order: specifies the minimum page allocation order to be injected failures. @@ -166,13 +166,13 @@ o Inject slab allocation failures into module init/exit code #!/bin/bash FAILTYPE=failslab -echo Y > /debug/$FAILTYPE/task-filter -echo 10 > /debug/$FAILTYPE/probability -echo 100 > /debug/$FAILTYPE/interval -echo -1 > /debug/$FAILTYPE/times -echo 0 > /debug/$FAILTYPE/space -echo 2 > /debug/$FAILTYPE/verbose -echo 1 > /debug/$FAILTYPE/ignore-gfp-wait +echo Y > /sys/kernel/debug/$FAILTYPE/task-filter +echo 10 > /sys/kernel/debug/$FAILTYPE/probability +echo 100 > /sys/kernel/debug/$FAILTYPE/interval +echo -1 > /sys/kernel/debug/$FAILTYPE/times +echo 0 > /sys/kernel/debug/$FAILTYPE/space +echo 2 > /sys/kernel/debug/$FAILTYPE/verbose +echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait faulty_system() { @@ -217,20 +217,20 @@ then exit 1 fi -cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start -cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end +cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start +cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end -echo N > /debug/$FAILTYPE/task-filter -echo 10 > /debug/$FAILTYPE/probability -echo 100 > /debug/$FAILTYPE/interval -echo -1 > /debug/$FAILTYPE/times -echo 0 > /debug/$FAILTYPE/space -echo 2 > /debug/$FAILTYPE/verbose -echo 1 > /debug/$FAILTYPE/ignore-gfp-wait -echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem -echo 10 > /debug/$FAILTYPE/stacktrace-depth +echo N > /sys/kernel/debug/$FAILTYPE/task-filter +echo 10 > /sys/kernel/debug/$FAILTYPE/probability +echo 100 > /sys/kernel/debug/$FAILTYPE/interval +echo -1 > /sys/kernel/debug/$FAILTYPE/times +echo 0 > /sys/kernel/debug/$FAILTYPE/space +echo 2 > /sys/kernel/debug/$FAILTYPE/verbose +echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait +echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem +echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth -trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT +trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT echo "Injecting errors into the module $module... (interrupt to stop)" sleep 1000000 diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index 1e7a769..053037a 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -507,9 +507,9 @@ http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115) Appendix A: The kprobes debugfs interface With recent kernels (> 2.6.20) the list of registered kprobes is visible -under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug). +under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at //sys/kernel/debug). -/debug/kprobes/list: Lists all registered probes on the system +/sys/kernel/debug/kprobes/list: Lists all registered probes on the system c015d71a k vfs_read+0x0 c011a316 j do_fork+0x0 @@ -525,7 +525,7 @@ virtual addresses that correspond to modules that've been unloaded), such probes are marked with [GONE]. If the probe is temporarily disabled, such probes are marked with [DISABLED]. -/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly. +/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly. Provides a knob to globally and forcibly turn registered kprobes ON or OFF. By default, all kprobes are enabled. By echoing "0" to this file, all diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 7bd27f0..a39b3c7 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -7,7 +7,6 @@ Copyright 2008 Red Hat Inc. (dual licensed under the GPL v2) Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton, John Kacur, and David Teigland. - Written for: 2.6.28-rc2 Introduction @@ -33,13 +32,26 @@ The File System Ftrace uses the debugfs file system to hold the control files as well as the files to display output. -To mount the debugfs system: +When debugfs is configured into the kernel (which selecting any ftrace +option will do) the directory /sys/kernel/debug will be created. To mount +this directory, you can add to your /etc/fstab file: + + debugfs /sys/kernel/debug debugfs defaults 0 0 + +Or you can mount it at run time with: + + mount -t debugfs nodev /sys/kernel/debug - # mkdir /debug - # mount -t debugfs nodev /debug +For quicker access to that directory you may want to make a soft link to +it: -( Note: it is more common to mount at /sys/kernel/debug, but for - simplicity this document will use /debug) + ln -s /sys/kernel/debug /debug + +Any selected ftrace option will also create a directory called tracing +within the debugfs. The rest of the document will assume that you are in +the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate +on the files within that directory and not distract from the content with +the extended "/sys/kernel/debug/tracing" path name. That's it! (assuming that you have ftrace configured into your kernel) @@ -389,18 +401,18 @@ trace_options The trace_options file is used to control what gets printed in the trace output. To see what is available, simply cat the file: - cat /debug/tracing/trace_options + cat trace_options print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \ noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj To disable one of the options, echo in the option prepended with "no". - echo noprint-parent > /debug/tracing/trace_options + echo noprint-parent > trace_options To enable an option, leave off the "no". - echo sym-offset > /debug/tracing/trace_options + echo sym-offset > trace_options Here are the available options: @@ -476,11 +488,11 @@ sched_switch This tracer simply records schedule switches. Here is an example of how to use it. - # echo sched_switch > /debug/tracing/current_tracer - # echo 1 > /debug/tracing/tracing_enabled + # echo sched_switch > current_tracer + # echo 1 > tracing_enabled # sleep 1 - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/trace + # echo 0 > tracing_enabled + # cat trace # tracer: sched_switch # @@ -583,13 +595,13 @@ new trace is saved. To reset the maximum, echo 0 into tracing_max_latency. Here is an example: - # echo irqsoff > /debug/tracing/current_tracer - # echo 0 > /debug/tracing/tracing_max_latency - # echo 1 > /debug/tracing/tracing_enabled + # echo irqsoff > current_tracer + # echo 0 > tracing_max_latency + # echo 1 > tracing_enabled # ls -ltr [...] - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/latency_trace + # echo 0 > tracing_enabled + # cat latency_trace # tracer: irqsoff # irqsoff latency trace v1.1.5 on 2.6.26 @@ -690,13 +702,13 @@ Like the irqsoff tracer, it records the maximum latency for which preemption was disabled. The control of preemptoff tracer is much like the irqsoff tracer. - # echo preemptoff > /debug/tracing/current_tracer - # echo 0 > /debug/tracing/tracing_max_latency - # echo 1 > /debug/tracing/tracing_enabled + # echo preemptoff > current_tracer + # echo 0 > tracing_max_latency + # echo 1 > tracing_enabled # ls -ltr [...] - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/latency_trace + # echo 0 > tracing_enabled + # cat latency_trace # tracer: preemptoff # preemptoff latency trace v1.1.5 on 2.6.26-rc8 @@ -837,13 +849,13 @@ tracer. Again, using this trace is much like the irqsoff and preemptoff tracers. - # echo preemptirqsoff > /debug/tracing/current_tracer - # echo 0 > /debug/tracing/tracing_max_latency - # echo 1 > /debug/tracing/tracing_enabled + # echo preemptirqsoff > current_tracer + # echo 0 > tracing_max_latency + # echo 1 > tracing_enabled # ls -ltr [...] - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/latency_trace + # echo 0 > tracing_enabled + # cat latency_trace # tracer: preemptirqsoff # preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 @@ -999,12 +1011,12 @@ slightly differently than we did with the previous tracers. Instead of performing an 'ls', we will run 'sleep 1' under 'chrt' which changes the priority of the task. - # echo wakeup > /debug/tracing/current_tracer - # echo 0 > /debug/tracing/tracing_max_latency - # echo 1 > /debug/tracing/tracing_enabled + # echo wakeup > current_tracer + # echo 0 > tracing_max_latency + # echo 1 > tracing_enabled # chrt -f 5 sleep 1 - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/latency_trace + # echo 0 > tracing_enabled + # cat latency_trace # tracer: wakeup # wakeup latency trace v1.1.5 on 2.6.26-rc8 @@ -1114,11 +1126,11 @@ can be done from the debug file system. Make sure the ftrace_enabled is set; otherwise this tracer is a nop. # sysctl kernel.ftrace_enabled=1 - # echo function > /debug/tracing/current_tracer - # echo 1 > /debug/tracing/tracing_enabled + # echo function > current_tracer + # echo 1 > tracing_enabled # usleep 1 - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/trace + # echo 0 > tracing_enabled + # cat trace # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION @@ -1155,7 +1167,7 @@ int trace_fd; [...] int main(int argc, char *argv[]) { [...] - trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY); + trace_fd = open(tracing_file("tracing_enabled"), O_WRONLY); [...] if (condition_hit()) { write(trace_fd, "0", 1); @@ -1163,26 +1175,20 @@ int main(int argc, char *argv[]) { [...] } -Note: Here we hard coded the path name. The debugfs mount is not -guaranteed to be at /debug (and is more commonly at -/sys/kernel/debug). For simple one time traces, the above is -sufficent. For anything else, a search through /proc/mounts may -be needed to find where the debugfs file-system is mounted. - Single thread tracing --------------------- -By writing into /debug/tracing/set_ftrace_pid you can trace a +By writing into set_ftrace_pid you can trace a single thread. For example: -# cat /debug/tracing/set_ftrace_pid +# cat set_ftrace_pid no pid -# echo 3111 > /debug/tracing/set_ftrace_pid -# cat /debug/tracing/set_ftrace_pid +# echo 3111 > set_ftrace_pid +# cat set_ftrace_pid 3111 -# echo function > /debug/tracing/current_tracer -# cat /debug/tracing/trace | head +# echo function > current_tracer +# cat trace | head # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION @@ -1193,8 +1199,8 @@ no pid yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll -# echo -1 > /debug/tracing/set_ftrace_pid -# cat /debug/tracing/trace |head +# echo -1 > set_ftrace_pid +# cat trace |head # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION @@ -1216,6 +1222,51 @@ something like this simple program: #include #include +#define _STR(x) #x +#define STR(x) _STR(x) +#define MAX_PATH 256 + +const char *find_debugfs(void) +{ + static char debugfs[MAX_PATH+1]; + static int debugfs_found; + char type[100]; + FILE *fp; + + if (debugfs_found) + return debugfs; + + if ((fp = fopen("/proc/mounts","r")) == NULL) { + perror("/proc/mounts"); + return NULL; + } + + while (fscanf(fp, "%*s %" + STR(MAX_PATH) + "s %99s %*s %*d %*d\n", + debugfs, type) == 2) { + if (strcmp(type, "debugfs") == 0) + break; + } + fclose(fp); + + if (strcmp(type, "debugfs") != 0) { + fprintf(stderr, "debugfs not mounted"); + return NULL; + } + + debugfs_found = 1; + + return debugfs; +} + +const char *tracing_file(const char *file_name) +{ + static char trace_file[MAX_PATH+1]; + snprintf(trace_file, MAX_PATH, "%s/%s", find_debugfs(), file_name); + return trace_file; +} + int main (int argc, char **argv) { if (argc < 1) @@ -1226,12 +1277,12 @@ int main (int argc, char **argv) char line[64]; int s; - ffd = open("/debug/tracing/current_tracer", O_WRONLY); + ffd = open(tracing_file("current_tracer"), O_WRONLY); if (ffd < 0) exit(-1); write(ffd, "nop", 3); - fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY); + fd = open(tracing_file("set_ftrace_pid"), O_WRONLY); s = sprintf(line, "%d\n", getpid()); write(fd, line, s); @@ -1383,22 +1434,22 @@ want, depending on your needs. tracing_cpu_mask file) or you might sometimes see unordered function calls while cpu tracing switch. - hide: echo nofuncgraph-cpu > /debug/tracing/trace_options - show: echo funcgraph-cpu > /debug/tracing/trace_options + hide: echo nofuncgraph-cpu > trace_options + show: echo funcgraph-cpu > trace_options - The duration (function's time of execution) is displayed on the closing bracket line of a function or on the same line than the current function in case of a leaf one. It is default enabled. - hide: echo nofuncgraph-duration > /debug/tracing/trace_options - show: echo funcgraph-duration > /debug/tracing/trace_options + hide: echo nofuncgraph-duration > trace_options + show: echo funcgraph-duration > trace_options - The overhead field precedes the duration field in case of reached duration thresholds. - hide: echo nofuncgraph-overhead > /debug/tracing/trace_options - show: echo funcgraph-overhead > /debug/tracing/trace_options + hide: echo nofuncgraph-overhead > trace_options + show: echo funcgraph-overhead > trace_options depends on: funcgraph-duration ie: @@ -1427,8 +1478,8 @@ want, depending on your needs. - The task/pid field displays the thread cmdline and pid which executed the function. It is default disabled. - hide: echo nofuncgraph-proc > /debug/tracing/trace_options - show: echo funcgraph-proc > /debug/tracing/trace_options + hide: echo nofuncgraph-proc > trace_options + show: echo funcgraph-proc > trace_options ie: @@ -1451,8 +1502,8 @@ want, depending on your needs. system clock since it started. A snapshot of this time is given on each entry/exit of functions - hide: echo nofuncgraph-abstime > /debug/tracing/trace_options - show: echo funcgraph-abstime > /debug/tracing/trace_options + hide: echo nofuncgraph-abstime > trace_options + show: echo funcgraph-abstime > trace_options ie: @@ -1549,7 +1600,7 @@ listed in: available_filter_functions - # cat /debug/tracing/available_filter_functions + # cat available_filter_functions put_prev_task_idle kmem_cache_create pick_next_task_rt @@ -1561,12 +1612,12 @@ mutex_lock If I am only interested in sys_nanosleep and hrtimer_interrupt: # echo sys_nanosleep hrtimer_interrupt \ - > /debug/tracing/set_ftrace_filter - # echo ftrace > /debug/tracing/current_tracer - # echo 1 > /debug/tracing/tracing_enabled + > set_ftrace_filter + # echo ftrace > current_tracer + # echo 1 > tracing_enabled # usleep 1 - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/trace + # echo 0 > tracing_enabled + # cat trace # tracer: ftrace # # TASK-PID CPU# TIMESTAMP FUNCTION @@ -1577,7 +1628,7 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt: To see which functions are being traced, you can cat the file: - # cat /debug/tracing/set_ftrace_filter + # cat set_ftrace_filter hrtimer_interrupt sys_nanosleep @@ -1597,7 +1648,7 @@ Note: It is better to use quotes to enclose the wild cards, otherwise the shell may expand the parameters into names of files in the local directory. - # echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter + # echo 'hrtimer_*' > set_ftrace_filter Produces: @@ -1618,7 +1669,7 @@ Produces: Notice that we lost the sys_nanosleep. - # cat /debug/tracing/set_ftrace_filter + # cat set_ftrace_filter hrtimer_run_queues hrtimer_run_pending hrtimer_init @@ -1644,17 +1695,17 @@ To append to the filters, use '>>' To clear out a filter so that all functions will be recorded again: - # echo > /debug/tracing/set_ftrace_filter - # cat /debug/tracing/set_ftrace_filter + # echo > set_ftrace_filter + # cat set_ftrace_filter # Again, now we want to append. - # echo sys_nanosleep > /debug/tracing/set_ftrace_filter - # cat /debug/tracing/set_ftrace_filter + # echo sys_nanosleep > set_ftrace_filter + # cat set_ftrace_filter sys_nanosleep - # echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter - # cat /debug/tracing/set_ftrace_filter + # echo 'hrtimer_*' >> set_ftrace_filter + # cat set_ftrace_filter hrtimer_run_queues hrtimer_run_pending hrtimer_init @@ -1677,7 +1728,7 @@ hrtimer_init_sleeper The set_ftrace_notrace prevents those functions from being traced. - # echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace + # echo '*preempt*' '*lock*' > set_ftrace_notrace Produces: @@ -1767,13 +1818,13 @@ the effect on the tracing is different. Every read from trace_pipe is consumed. This means that subsequent reads will be different. The trace is live. - # echo function > /debug/tracing/current_tracer - # cat /debug/tracing/trace_pipe > /tmp/trace.out & + # echo function > current_tracer + # cat trace_pipe > /tmp/trace.out & [1] 4153 - # echo 1 > /debug/tracing/tracing_enabled + # echo 1 > tracing_enabled # usleep 1 - # echo 0 > /debug/tracing/tracing_enabled - # cat /debug/tracing/trace + # echo 0 > tracing_enabled + # cat trace # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION @@ -1809,7 +1860,7 @@ number listed is the number of entries that can be recorded per CPU. To know the full size, multiply the number of possible CPUS with the number of entries. - # cat /debug/tracing/buffer_size_kb + # cat buffer_size_kb 1408 (units kilobytes) Note, to modify this, you must have tracing completely disabled. @@ -1817,18 +1868,18 @@ To do that, echo "nop" into the current_tracer. If the current_tracer is not set to "nop", an EINVAL error will be returned. - # echo nop > /debug/tracing/current_tracer - # echo 10000 > /debug/tracing/buffer_size_kb - # cat /debug/tracing/buffer_size_kb + # echo nop > current_tracer + # echo 10000 > buffer_size_kb + # cat buffer_size_kb 10000 (units kilobytes) The number of pages which will be allocated is limited to a percentage of available memory. Allocating too much will produce an error. - # echo 1000000000000 > /debug/tracing/buffer_size_kb + # echo 1000000000000 > buffer_size_kb -bash: echo: write error: Cannot allocate memory - # cat /debug/tracing/buffer_size_kb + # cat buffer_size_kb 85 ----------- diff --git a/Documentation/trace/mmiotrace.txt b/Documentation/trace/mmiotrace.txt index 5731c67..162effb 100644 --- a/Documentation/trace/mmiotrace.txt +++ b/Documentation/trace/mmiotrace.txt @@ -32,41 +32,41 @@ is no way to automatically detect if you are losing events due to CPUs racing. Usage Quick Reference --------------------- -$ mount -t debugfs debugfs /debug -$ echo mmiotrace > /debug/tracing/current_tracer -$ cat /debug/tracing/trace_pipe > mydump.txt & +$ mount -t debugfs debugfs /sys/kernel/debug +$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer +$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt & Start X or whatever. -$ echo "X is up" > /debug/tracing/trace_marker -$ echo nop > /debug/tracing/current_tracer +$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker +$ echo nop > /sys/kernel/debug/tracing/current_tracer Check for lost events. Usage ----- -Make sure debugfs is mounted to /debug. If not, (requires root privileges) -$ mount -t debugfs debugfs /debug +Make sure debugfs is mounted to /sys/kernel/debug. If not, (requires root privileges) +$ mount -t debugfs debugfs /sys/kernel/debug Check that the driver you are about to trace is not loaded. Activate mmiotrace (requires root privileges): -$ echo mmiotrace > /debug/tracing/current_tracer +$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer Start storing the trace: -$ cat /debug/tracing/trace_pipe > mydump.txt & +$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt & The 'cat' process should stay running (sleeping) in the background. Load the driver you want to trace and use it. Mmiotrace will only catch MMIO accesses to areas that are ioremapped while mmiotrace is active. During tracing you can place comments (markers) into the trace by -$ echo "X is up" > /debug/tracing/trace_marker +$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker This makes it easier to see which part of the (huge) trace corresponds to which action. It is recommended to place descriptive markers about what you do. Shut down mmiotrace (requires root privileges): -$ echo nop > /debug/tracing/current_tracer +$ echo nop > /sys/kernel/debug/tracing/current_tracer The 'cat' process exits. If it does not, kill it by issuing 'fg' command and pressing ctrl+c. @@ -78,10 +78,10 @@ to view your kernel log and look for "mmiotrace has lost events" warning. If events were lost, the trace is incomplete. You should enlarge the buffers and try again. Buffers are enlarged by first seeing how large the current buffers are: -$ cat /debug/tracing/buffer_size_kb +$ cat /sys/kernel/debug/tracing/buffer_size_kb gives you a number. Approximately double this number and write it back, for instance: -$ echo 128000 > /debug/tracing/buffer_size_kb +$ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb Then start again from the top. If you are doing a trace for a driver project, e.g. Nouveau, you should also -- cgit v1.1 From ebf58f70e853b9ffe50d6b194d3679b7dc2cac9c Mon Sep 17 00:00:00 2001 From: Theodore Kilgore Date: Sun, 5 Apr 2009 15:36:04 -0300 Subject: V4L/DVB (11483): gspca - mr97310a: Webcam 093a:010f added. Signed-off-by: Theodore Kilgore Signed-off-by: Jean-Francois Moine Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/gspca.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 98529e0..e418a8f 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -209,6 +209,7 @@ sunplus 08ca:2050 Medion MD 41437 sunplus 08ca:2060 Aiptek PocketDV5300 tv8532 0923:010f ICM532 cams mars 093a:050f Mars-Semi Pc-Camera +mr97310a 093a:010f Sakar Digital no. 77379 pac207 093a:2460 Qtec Webcam 100 pac207 093a:2461 HP Webcam pac207 093a:2463 Philips SPC 220 NC -- cgit v1.1 From e5db5d44432abc82b1250dd05bd0a4b011392d9d Mon Sep 17 00:00:00 2001 From: Douglas Schilling Landgraf Date: Thu, 9 Apr 2009 18:24:34 -0300 Subject: V4L/DVB (11486): em28xx: Add EmpireTV board support Added EmpireTV entry. Thanks to Xwang to provide data for this board. Signed-off-by: Douglas Schilling Landgraf Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 78d0a6e..bf58d95 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -61,3 +61,4 @@ 63 -> Kaiomy TVnPC U2 (em2860) [eb1a:e303] 64 -> Easy Cap Capture DC-60 (em2860) 65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515] + 66 -> Empire dual TV (em2880) -- cgit v1.1 From d852d53dcd1f4c353d54cc055eb23cdaad18c906 Mon Sep 17 00:00:00 2001 From: Joseba Goitia Gandiaga Date: Thu, 9 Apr 2009 18:29:16 -0300 Subject: V4L/DVB (11488): get_dvb_firmware: trivial url change Trivial url changes in script Signed-off-by: Joseba Goitia Gandiaga Signed-off-by: Douglas Schilling Landgraf Signed-off-by: Mauro Carvalho Chehab --- Documentation/dvb/get_dvb_firmware | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 2f21ecd..815a5784 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -112,7 +112,7 @@ sub tda10045 { sub tda10046 { my $sourcefile = "TT_PCI_2.19h_28_11_2006.zip"; - my $url = "http://technotrend-online.com/download/software/219/$sourcefile"; + my $url = "http://www.tt-download.com/download/updates/219/$sourcefile"; my $hash = "6a7e1e2f2644b162ff0502367553c72d"; my $outfile = "dvb-fe-tda10046.fw"; my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); @@ -129,8 +129,8 @@ sub tda10046 { } sub tda10046lifeview { - my $sourcefile = "Drv_2.11.02.zip"; - my $url = "http://www.lifeview.com.tw/drivers/pci_card/FlyDVB-T/$sourcefile"; + my $sourcefile = "7%5Cdrv_2.11.02.zip"; + my $url = "http://www.lifeview.hk/dbimages/document/$sourcefile"; my $hash = "1ea24dee4eea8fe971686981f34fd2e0"; my $outfile = "dvb-fe-tda10046.fw"; my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); -- cgit v1.1 From df0dbbe24053b7c669f63341d3d3f090560c3217 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Wed, 8 Apr 2009 14:01:19 -0300 Subject: V4L/DVB (11442): saa7134: BZ#7524: Add AVerTV Studio 507UA support [mchehab@redhat.com: Fix merge conflicts and CodingStyle issues] Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.saa7134 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 6dacf28..8a15e2b 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -155,3 +155,4 @@ 154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d] 155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708] 156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a] +157 -> Avermedia AVerTV Studio 507UA [1461:a11b] -- cgit v1.1 From d46de9d2364cad55caddc04632707f5739b4cd87 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Old=C5=99ich=20Jedli=C4=8Dka?= Date: Tue, 14 Apr 2009 15:47:17 -0300 Subject: V4L/DVB (11567): saa7134: Added support for AVerMedia Cardbus Plus MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Here comes the full support for AVerMedia Cardbus Plus (E501R) - including remote control. TV, Composite and FM radio tested, I don't have S-Video to test. I've figured out that the radio works only with xtal frequency 13MHz. [mchehab@redhat.com: CodingStyle fixes] Signed-off-by: Oldřich Jedlička Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.saa7134 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 8a15e2b..fb3098d 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -156,3 +156,4 @@ 155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708] 156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a] 157 -> Avermedia AVerTV Studio 507UA [1461:a11b] +158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9] -- cgit v1.1 From 84d728c3df9931d1937e4a76324838ce065c521e Mon Sep 17 00:00:00 2001 From: Dmitri Belimov Date: Thu, 23 Apr 2009 02:32:49 -0300 Subject: V4L/DVB (11604): saa7134: split Behold`s card entries to properly identify the model Split Beholdr`s cards to correct models. Signed-off-by: Beholder Intl. Ltd. Dmitry Belimov Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.saa7134 | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index fb3098d..112ff4f 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -124,10 +124,10 @@ 123 -> Beholder BeholdTV 407 [0000:4070] 124 -> Beholder BeholdTV 407 FM [0000:4071] 125 -> Beholder BeholdTV 409 [0000:4090] -126 -> Beholder BeholdTV 505 FM/RDS [0000:5051,0000:505B,5ace:5050] -127 -> Beholder BeholdTV 507 FM/RDS / BeholdTV 509 FM [0000:5071,0000:507B,5ace:5070,5ace:5090] +126 -> Beholder BeholdTV 505 FM [5ace:5050] +127 -> Beholder BeholdTV 507 FM / BeholdTV 509 FM [5ace:5070,5ace:5090] 128 -> Beholder BeholdTV Columbus TVFM [0000:5201] -129 -> Beholder BeholdTV 607 / BeholdTV 609 [5ace:6070,5ace:6071,5ace:6072,5ace:6073,5ace:6090,5ace:6091,5ace:6092,5ace:6093] +129 -> Beholder BeholdTV 607 FM [5ace:6070] 130 -> Beholder BeholdTV M6 [5ace:6190] 131 -> Twinhan Hybrid DTV-DVB 3056 PCI [1822:0022] 132 -> Genius TVGO AM11MCE @@ -157,3 +157,13 @@ 156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a] 157 -> Avermedia AVerTV Studio 507UA [1461:a11b] 158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9] +159 -> Beholder BeholdTV 505 RDS [0000:505B] +160 -> Beholder BeholdTV 507 RDS [0000:5071] +161 -> Beholder BeholdTV 507 RDS [0000:507B] +162 -> Beholder BeholdTV 607 FM [5ace:6071] +163 -> Beholder BeholdTV 609 FM [5ace:6090] +164 -> Beholder BeholdTV 609 FM [5ace:6091] +165 -> Beholder BeholdTV 607 RDS [5ace:6072] +166 -> Beholder BeholdTV 607 RDS [5ace:6073] +167 -> Beholder BeholdTV 609 RDS [5ace:6092] +168 -> Beholder BeholdTV 609 RDS [5ace:6093] -- cgit v1.1 From 56600093644c6929a7d1809dab5b8265532df045 Mon Sep 17 00:00:00 2001 From: Robert Jarzmik Date: Fri, 24 Apr 2009 12:58:35 -0300 Subject: V4L/DVB (11613): pxa_camera: Documentation of the FSM After DMA redesign, the pxa_camera dynamic behaviour should be documented so that future contributors understand how it works, and improve it. Signed-off-by: Robert Jarzmik Signed-off-by: Guennadi Liakhovetski Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/pxa_camera.txt | 49 ++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/pxa_camera.txt b/Documentation/video4linux/pxa_camera.txt index b1137f9..4f6d0ca 100644 --- a/Documentation/video4linux/pxa_camera.txt +++ b/Documentation/video4linux/pxa_camera.txt @@ -26,6 +26,55 @@ Global video workflow Once the last buffer is filled in, the QCI interface stops. + c) Capture global finite state machine schema + + +----+ +---+ +----+ + | DQ | | Q | | DQ | + | v | v | v + +-----------+ +------------------------+ + | STOP | | Wait for capture start | + +-----------+ Q +------------------------+ ++-> | QCI: stop | ------------------> | QCI: run | <------------+ +| | DMA: stop | | DMA: stop | | +| +-----------+ +-----> +------------------------+ | +| / | | +| / +---+ +----+ | | +|capture list empty / | Q | | DQ | | QCI Irq EOF | +| / | v | v v | +| +--------------------+ +----------------------+ | +| | DMA hotlink missed | | Capture running | | +| +--------------------+ +----------------------+ | +| | QCI: run | +-----> | QCI: run | <-+ | +| | DMA: stop | / | DMA: run | | | +| +--------------------+ / +----------------------+ | Other | +| ^ /DMA still | | channels | +| | capture list / running | DMA Irq End | not | +| | not empty / | | finished | +| | / v | yet | +| +----------------------+ +----------------------+ | | +| | Videobuf released | | Channel completed | | | +| +----------------------+ +----------------------+ | | ++-- | QCI: run | | QCI: run | --+ | + | DMA: run | | DMA: run | | + +----------------------+ +----------------------+ | + ^ / | | + | no overrun / | overrun | + | / v | + +--------------------+ / +----------------------+ | + | Frame completed | / | Frame overran | | + +--------------------+ <-----+ +----------------------+ restart frame | + | QCI: run | | QCI: stop | --------------+ + | DMA: run | | DMA: stop | + +--------------------+ +----------------------+ + + Legend: - each box is a FSM state + - each arrow is the condition to transition to another state + - an arrow with a comment is a mandatory transition (no condition) + - arrow "Q" means : a buffer was enqueued + - arrow "DQ" means : a buffer was dequeued + - "QCI: stop" means the QCI interface is not enabled + - "DMA: stop" means all 3 DMA channels are stopped + - "DMA: run" means at least 1 DMA channel is still running DMA usage --------- -- cgit v1.1 From 501d8cd4e248feebd465b016a7d5b7bc084f5f1f Mon Sep 17 00:00:00 2001 From: Steven Toth Date: Sat, 28 Mar 2009 14:22:21 -0300 Subject: V4L/DVB (11665): cx88: Add support for the Hauppauge IROnly board. cx88: Add support for the Hauppauge IROnly board. Signed-off-by: Steven Toth Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx88 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 71e9db0..80527b2 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -78,3 +78,4 @@ 77 -> TBS 8910 DVB-S [8910:8888] 78 -> Prof 6200 DVB-S [b022:3022] 79 -> Terratec Cinergy HT PCI MKII [153b:1177] + 80 -> Hauppauge WinTV-IR Only [0070:9290] -- cgit v1.1 From 102e78136446faca7d7d241b628c5bd0e0d61d5d Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Sat, 2 May 2009 10:12:50 -0300 Subject: V4L/DVB (11671): v4l2: add v4l2_device_set_name() Add a utility function that can be used to setup the v4l2_device's name field in a standard manner. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/v4l2-framework.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/v4l2-framework.txt b/Documentation/video4linux/v4l2-framework.txt index 854808b..d54c1e4 100644 --- a/Documentation/video4linux/v4l2-framework.txt +++ b/Documentation/video4linux/v4l2-framework.txt @@ -89,6 +89,11 @@ from dev (driver name followed by the bus_id, to be precise). If you set it up before calling v4l2_device_register then it will be untouched. If dev is NULL, then you *must* setup v4l2_dev->name before calling v4l2_device_register. +You can use v4l2_device_set_name() to set the name based on a driver name and +a driver-global atomic_t instance. This will generate names like ivtv0, ivtv1, +etc. If the name ends with a digit, then it will insert a dash: cx18-0, +cx18-1, etc. This function returns the instance number. + The first 'dev' argument is normally the struct device pointer of a pci_dev, usb_interface or platform_device. It is rare for dev to be NULL, but it happens with ISA devices or when one device creates multiple PCI devices, thus making -- cgit v1.1 From ceec80e5a52580bd7b257c14c6c8355be58c971f Mon Sep 17 00:00:00 2001 From: Jean-Francois Moine Date: Sat, 9 May 2009 06:21:35 -0300 Subject: V4L/DVB (11717): gspca - sonixj: Webcams with bridge sn9c128 added Signed-off-by: Jean-Francois Moine Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/gspca.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index e418a8f..fd07d33 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -266,6 +266,11 @@ sonixj 0c45:60ec SN9C105+MO4000 sonixj 0c45:60fb Surfer NoName sonixj 0c45:60fc LG-LIC300 sonixj 0c45:60fe Microdia Audio +sonixj 0c45:6100 PC Camera (SN9C128) +sonixj 0c45:610a PC Camera (SN9C128) +sonixj 0c45:610b PC Camera (SN9C128) +sonixj 0c45:610c PC Camera (SN9C128) +sonixj 0c45:610e PC Camera (SN9C128) sonixj 0c45:6128 Microdia/Sonix SNP325 sonixj 0c45:612a Avant Camera sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix -- cgit v1.1 From f8eaaf4f2a2810d6e486da2916ef07f7e00665c9 Mon Sep 17 00:00:00 2001 From: Jani Monoses Date: Thu, 7 May 2009 03:32:27 -0300 Subject: V4L/DVB (11720): gspca - sonixj: Webcam 06f8:3008 added Signed-off-by: Jani Monoses Signed-off-by: Jean-Francois Moine Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/gspca.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index fd07d33..1bb103f 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -178,6 +178,7 @@ spca506 06e1:a190 ADS Instant VCD ov534 06f8:3002 Hercules Blog Webcam ov534 06f8:3003 Hercules Dualpix HD Weblog sonixj 06f8:3004 Hercules Classic Silver +sonixj 06f8:3008 Hercules Deluxe Optical Glass spca508 0733:0110 ViewQuest VQ110 spca508 0130:0130 Clone Digital Webcam 11043 spca501 0733:0401 Intel Create and Share -- cgit v1.1 From 2074dffaedebbf5a8468fd37855d6d94ba34041c Mon Sep 17 00:00:00 2001 From: Steven Toth Date: Sat, 2 May 2009 11:39:46 -0300 Subject: V4L/DVB (11767): cx23885: Add preliminary support for the HVR1270 The patch means the board will be recognised, and the parts brought out of reset correctly. This patches depends on the centralized GPIO patch to be merged. What's missing before the HVR-1270 will function for DTV? The model# needs to be added to avoid 'unknown model' output and the LG3305/Tuner need to be attached in cx23885-dvb.c Signed-off-by: Steven Toth Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx23885 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 91aa3c0..9f07a6a 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -16,3 +16,4 @@ 15 -> TeVii S470 [d470:9022] 16 -> DVBWorld DVB-S2 2005 [0001:2005] 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c] + 18 -> Hauppauge WinTV-HVR1270 [0070:2211] -- cgit v1.1 From d099becb0bd7ee01a13d58371b4ea5a2f7052c04 Mon Sep 17 00:00:00 2001 From: Michael Krufky Date: Fri, 8 May 2009 22:39:24 -0300 Subject: V4L/DVB (11769): cx23885: add ATSC/QAM tuning support for Hauppauge WinTV-HVR1275 Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx23885 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 9f07a6a..ce62f5a 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -17,3 +17,4 @@ 16 -> DVBWorld DVB-S2 2005 [0001:2005] 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c] 18 -> Hauppauge WinTV-HVR1270 [0070:2211] + 19 -> Hauppauge WinTV-HVR1275 [0070:2215] -- cgit v1.1 From 19bc57968cc854c7da4846c21b3ef2a39e43f97d Mon Sep 17 00:00:00 2001 From: Michael Krufky Date: Fri, 8 May 2009 16:05:29 -0300 Subject: V4L/DVB (11770): cx23885: add ATSC/QAM tuning support for Hauppauge WinTV-HVR1255 Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx23885 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index ce62f5a..e7ed710 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -18,3 +18,4 @@ 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c] 18 -> Hauppauge WinTV-HVR1270 [0070:2211] 19 -> Hauppauge WinTV-HVR1275 [0070:2215] + 20 -> Hauppauge WinTV-HVR1255 [0070:2251] -- cgit v1.1 From 6b926eca9824568b18825d3eade5fb39e3b5a9fb Mon Sep 17 00:00:00 2001 From: Michael Krufky Date: Tue, 12 May 2009 17:32:17 -0300 Subject: V4L/DVB (11771): cx23885: add DVB-T tuning support for Hauppauge WinTV-HVR1210 Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx23885 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index e7ed710..948e436 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -19,3 +19,4 @@ 18 -> Hauppauge WinTV-HVR1270 [0070:2211] 19 -> Hauppauge WinTV-HVR1275 [0070:2215] 20 -> Hauppauge WinTV-HVR1255 [0070:2251] + 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295] -- cgit v1.1 From 8475cbcb0f885189969915eb3680d10fc525d722 Mon Sep 17 00:00:00 2001 From: Dmitri Belimov Date: Mon, 11 May 2009 08:16:06 -0300 Subject: V4L/DVB (11775): tuner: add support Philips MK5 tuner Signed-off-by: Beholder Intl. Ltd. Dmitry Belimov Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.tuner | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index 691d2f3..22c1c55 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -76,3 +76,4 @@ tuner=75 - Philips TEA5761 FM Radio tuner=76 - Xceive 5000 tuner tuner=77 - TCL tuner MF02GIP-5N-E tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner +tuner=79 - Philips PAL/SECAM multi (FM1216 MK5) -- cgit v1.1 From 1010ed132727bbf486ac28fd149ccfb0ef5cd2ab Mon Sep 17 00:00:00 2001 From: "Cohen David.A" Date: Mon, 11 May 2009 11:00:20 -0300 Subject: V4L/DVB (11840): change kmalloc to vmalloc for sglist allocation in videobuf_dma_map/unmap Change kmalloc()/kfree() to vmalloc()/vfree() for sglist allocation during videobuf_dma_map() and videobuf_dma_unmap() High resolution sensors might require too many contiguous pages to be allocated for sglist by kmalloc() during videobuf_dma_map() (i.e. 256Kib for 8MP sensor). In such situations, kmalloc() could face some problem to find the required free memory. vmalloc() is a safer solution instead, as the allocated memory does not need to be contiguous. Signed-off-by: David Cohen Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index bf58d95..20aa65a 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -62,3 +62,5 @@ 64 -> Easy Cap Capture DC-60 (em2860) 65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515] 66 -> Empire dual TV (em2880) + 67 -> Terratec Grabby (em2860) [0ccd:0096] + 68 -> Terratec AV350 (em2860) [0ccd:0084] -- cgit v1.1 From 1bc7f51c57c52cfac1a455d8f8ef99703e719e55 Mon Sep 17 00:00:00 2001 From: Michael Krufky Date: Mon, 19 Jan 2009 01:10:49 -0300 Subject: V4L/DVB (11861): saa7134: enable digital tv support for Hauppauge WinTV-HVR1110r3 Signed-off-by: Michael Krufky Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.saa7134 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 112ff4f..e56934e 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -154,7 +154,7 @@ 153 -> Kworld Plus TV Analog Lite PCI [17de:7128] 154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d] 155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708] -156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a] +156 -> Hauppauge WinTV-HVR1110r3 DVB-T/Hybrid [0070:6707,0070:6709,0070:670a] 157 -> Avermedia AVerTV Studio 507UA [1461:a11b] 158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9] 159 -> Beholder BeholdTV 505 RDS [0000:505B] -- cgit v1.1 From 3047a17639d499691c657772667f2c1e65edabfb Mon Sep 17 00:00:00 2001 From: Miroslav Sustek Date: Sun, 31 May 2009 16:47:28 -0300 Subject: V4L/DVB (11879): Adds support for Leadtek WinFast DTV-1800H Enables analog/digital tv, radio and remote control (gpio). Tested-by: Marcin Wojcikowski Tested-by: Karel Juhanak Tested-by: Andrew Goff Tested-by: Jan Novak Signed-off-by: Miroslav Sustek Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx88 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 80527b2..89093f5 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -79,3 +79,4 @@ 78 -> Prof 6200 DVB-S [b022:3022] 79 -> Terratec Cinergy HT PCI MKII [153b:1177] 80 -> Hauppauge WinTV-IR Only [0070:9290] + 81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654] -- cgit v1.1 From 493b7127aa56d0a5c041797639bf543d96f6261b Mon Sep 17 00:00:00 2001 From: David Wong Date: Mon, 18 May 2009 05:25:49 -0300 Subject: V4L/DVB (11880): cx23885: support for card Mygica X8506 DMB-TH This patch add cx23885 support for card "Mygica X8506 DMB-TH". It should work on "Magic-Pro ProHDTV Extreme" as well, as they are same hardware with different branding. Sign-off-by: David T.L. Wong Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx23885 | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 948e436..450b8f8 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -20,3 +20,4 @@ 19 -> Hauppauge WinTV-HVR1275 [0070:2215] 20 -> Hauppauge WinTV-HVR1255 [0070:2251] 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295] + 22 -> Mygica X8506 DMB-TH [14f1:8651] -- cgit v1.1 From bfe5a7401785f864a4f30b4f066bfa045a28b9f6 Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Fri, 24 Apr 2009 10:45:31 +0800 Subject: PCI: PCIE AER: Document for PCIE AER software error injection This patch adds a minimal HOWTO for PCIE AER software error injection in Documentation/PCI/pcieaer-howto.txt. Signed-off-by: Huang Ying Signed-off-by: Jesse Barnes --- Documentation/PCI/pcieaer-howto.txt | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'Documentation') diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.txt index ddeb14b..f6b1ba7 100644 --- a/Documentation/PCI/pcieaer-howto.txt +++ b/Documentation/PCI/pcieaer-howto.txt @@ -246,3 +246,24 @@ with the PCI Express AER Root driver? A: It could call the helper functions to enable AER in devices and cleanup uncorrectable status register. Pls. refer to section 3.3. + +4. Software error injection + +Debugging PCIE AER error recovery code is quite difficult because it +is hard to trigger real hardware errors. Software based error +injection can be used to fake various kinds of PCIE errors. + +First you should enable PCIE AER software error injection in kernel +configuration, that is, following item should be in your .config. + +CONFIG_PCIEAER_INJECT=y or CONFIG_PCIEAER_INJECT=m + +After reboot with new kernel or insert the module, a device file named +/dev/aer_inject should be created. + +Then, you need a user space tool named aer-inject, which can be gotten +from: + http://www.kernel.org/pub/linux/kernel/people/yhuang/ + +More information about aer-inject can be found in the document comes +with its source code. -- cgit v1.1 From 1298307736a5882352418b4dc7c65442ab3b8bd4 Mon Sep 17 00:00:00 2001 From: Andreas Herrmann Date: Sun, 7 Jun 2009 16:15:16 +0200 Subject: x86/PCI: add description for check_enable_amd_mmconf boot parameter Describe check_enable_amd_mmconf. Signed-off-by: Andreas Herrmann Signed-off-by: Jesse Barnes --- Documentation/kernel-parameters.txt | 3 +++ 1 file changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 395d1a0..8a82489 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1735,6 +1735,9 @@ and is between 256 and 4096 characters. It is defined in the file root domains (aka PCI segments, in ACPI-speak). nommconf [X86] Disable use of MMCONFIG for PCI Configuration + check_enable_amd_mmconf [X86] check for and enable + properly configured MMIO access to PCI + config space on AMD family 10h CPU nomsi [MSI] If the PCI_MSI kernel config parameter is enabled, this kernel boot option can be used to disable the use of MSI interrupts system-wide. -- cgit v1.1 From c825bc94c8c1908750ab20413eb639c6be029e2d Mon Sep 17 00:00:00 2001 From: Kenji Kaneshige Date: Tue, 16 Jun 2009 11:01:25 +0900 Subject: PCI hotplug: create symlink to hotplug driver module Create symbolic link to hotplug driver module in the PCI slot directory (/sys/bus/pci/slots/). In the past, we need to load hotplug drivers one by one to identify the hotplug driver that handles the slot, and it was very inconvenient especially for trouble shooting. With this change, we can easily identify the hotplug driver. Signed-off-by: Taku Izumi Signed-off-by: Kenji Kaneshige Reviewed-by: Alex Chiang Signed-off-by: Jesse Barnes --- Documentation/ABI/testing/sysfs-bus-pci | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index 97ad190..6bf6805 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -122,3 +122,10 @@ Description: This symbolic link appears when a device is a Virtual Function. The symbolic link points to the PCI device sysfs entry of the Physical Function this device associates with. + +What: /sys/bus/pci/slots/.../module +Date: June 2009 +Contact: linux-pci@vger.kernel.org +Description: + This symbolic link points to the PCI hotplug controller driver + module that manages the hotplug slot. -- cgit v1.1 From 28eb27cf0839a30948335f9b2edda739f48b7a2e Mon Sep 17 00:00:00 2001 From: "Zhang, Yanmin" Date: Tue, 16 Jun 2009 13:35:11 +0800 Subject: PCI AER: support invalid error source IDs When the bus id part of error source id is equal to 0 or nosourceid=1, make the kernel probe the AER status registers of all devices under the root port to find the initial error reporter. Reviewed-by: Andrew Patterson Signed-off-by: Zhang Yanmin Signed-off-by: Jesse Barnes --- Documentation/PCI/pcieaer-howto.txt | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.txt index f6b1ba7..5408b9b 100644 --- a/Documentation/PCI/pcieaer-howto.txt +++ b/Documentation/PCI/pcieaer-howto.txt @@ -61,6 +61,10 @@ be initiated although firmwares have no _OSC support. To enable the walkaround, pls. add aerdriver.forceload=y to kernel boot parameter line when booting kernel. Note that forceload=n by default. +nosourceid, another parameter of type bool, can be used when broken +hardware (mostly chipsets) has root ports that cannot obtain the reporting +source ID. nosourceid=n by default. + 2.3 AER error output When a PCI-E AER error is captured, an error message will be outputed to console. If it's a correctable error, it is outputed as a warning. -- cgit v1.1 From c465def6bfe834b62623caa9b98f2d4f4739875a Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Mon, 15 Jun 2009 10:42:57 +0800 Subject: PCI AER: software error injection Debugging PCIE AER code can be very difficult because it is hard to trigger various real hardware errors. This patch provide a software based error injection tool, which can fake various PCIE errors with a user space helper tool named "aer-inject". Which can be gotten from: http://www.kernel.org/pub/linux/kernel/people/yhuang/ The patch fakes AER error by faking some PCIE AER related registers and an AER interrupt for specified the PCIE device. Signed-off-by: Huang Ying Signed-off-by: Jesse Barnes --- Documentation/PCI/pcieaer-howto.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/PCI/pcieaer-howto.txt b/Documentation/PCI/pcieaer-howto.txt index 5408b9b..be21001 100644 --- a/Documentation/PCI/pcieaer-howto.txt +++ b/Documentation/PCI/pcieaer-howto.txt @@ -267,7 +267,7 @@ After reboot with new kernel or insert the module, a device file named Then, you need a user space tool named aer-inject, which can be gotten from: - http://www.kernel.org/pub/linux/kernel/people/yhuang/ + http://www.kernel.org/pub/linux/utils/pci/aer-inject/ More information about aer-inject can be found in the document comes with its source code. -- cgit v1.1 From 3ed58baf5db4eab553803916a990a3dbca4dc611 Mon Sep 17 00:00:00 2001 From: Devin Heitmueller Date: Wed, 27 May 2009 23:27:26 -0300 Subject: V4L/DVB (11925): em28xx: Add support for the K-World 2800d Make the KWorld 2800d work properly. In this case, that means making the profile more generic so that it works for both the Pointnix Intra-Oral USB camera and the KWorld device. The device provides the audio through a pass-thru cable, so we don't need an actual audio capture profile (neither the K-World device nor the Pointnix have an onboard audio decoder). Thanks to Paul Thomas for providing sample hardware. Cc: Paul Thomas Signed-off-by: Devin Heitmueller Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 20aa65a..b03a685 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -17,7 +17,7 @@ 16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b] 17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227] 18 -> Hauppauge WinTV HVR 900 (R2) (em2880) [2040:6502] - 19 -> PointNix Intra-Oral Camera (em2860) + 19 -> EM2860/SAA711X Reference Design (em2860) 20 -> AMD ATI TV Wonder HD 600 (em2880) [0438:b002] 21 -> eMPIA Technology, Inc. GrabBeeX+ Video Encoder (em2800) [eb1a:2801] 22 -> Unknown EM2750/EM2751 webcam grabber (em2750) [eb1a:2750,eb1a:2751] -- cgit v1.1 From 5ddc9b100fc96e8f3c6d435cecd9d09e5b9673f9 Mon Sep 17 00:00:00 2001 From: Andy Walls Date: Sun, 7 Jun 2009 21:39:03 -0300 Subject: V4L/DVB (11933): tuner-simple, tveeprom: Add Philips FQ1216LME MK3 analog tuner Signed-off-by: Andy Walls Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.tuner | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index 22c1c55..be67844 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -77,3 +77,4 @@ tuner=76 - Xceive 5000 tuner tuner=77 - TCL tuner MF02GIP-5N-E tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner tuner=79 - Philips PAL/SECAM multi (FM1216 MK5) +tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough -- cgit v1.1 From f6a061bb0f143ff40070e6fd3d38fde5bd60027c Mon Sep 17 00:00:00 2001 From: Jan Ceuleers Date: Thu, 11 Jun 2009 16:20:23 -0300 Subject: V4L/DVB (11962): dvb: Fix broken link in get_dvb_firmware for nxt2004 (A180) Due to a reorganisation of AVermedia's websites, get_dvb_firmware no longer works for nxt2004. Fix it. Signed-off-by: Jan Ceuleers Signed-off-by: Douglas Schilling Landgraf Signed-off-by: Mauro Carvalho Chehab --- Documentation/dvb/get_dvb_firmware | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 815a5784..a52adfc 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -317,7 +317,7 @@ sub nxt2002 { sub nxt2004 { my $sourcefile = "AVerTVHD_MCE_A180_Drv_v1.2.2.16.zip"; - my $url = "http://www.aver.com/support/Drivers/$sourcefile"; + my $url = "http://www.avermedia-usa.com/support/Drivers/$sourcefile"; my $hash = "111cb885b1e009188346d72acfed024c"; my $outfile = "dvb-fe-nxt2004.fw"; my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); -- cgit v1.1 From 3d48f7d09aadccf570a871ce0d5eec34092b38c1 Mon Sep 17 00:00:00 2001 From: Jean-Francois Moine Date: Sun, 7 Jun 2009 13:51:54 -0300 Subject: V4L/DVB (11971): gspca - doc: Add the 05a9:a518 webcam to the Documentation. Signed-off-by: Jean-Francois Moine Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/gspca.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 1bb103f..2bcf788 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -163,10 +163,11 @@ sunplus 055f:c650 Mustek MDC5500Z zc3xx 055f:d003 Mustek WCam300A zc3xx 055f:d004 Mustek WCam300 AN conex 0572:0041 Creative Notebook cx11646 -ov519 05a9:0519 OmniVision +ov519 05a9:0519 OV519 Microphone ov519 05a9:0530 OmniVision -ov519 05a9:4519 OmniVision +ov519 05a9:4519 Webcam Classic ov519 05a9:8519 OmniVision +ov519 05a9:a518 D-Link DSB-C310 Webcam sunplus 05da:1018 Digital Dream Enigma 1.3 stk014 05e1:0893 Syntek DV4000 spca561 060b:a001 Maxell Compact Pc PM3 -- cgit v1.1 From d7de5d8ff74efd01916b01af875a0e87419a3599 Mon Sep 17 00:00:00 2001 From: Franklin Meng Date: Sat, 6 Jun 2009 17:05:02 -0300 Subject: V4L/DVB (11977): em28xx: Add Kworld 315 entry Added an entry for Kworld 315 (for while, dvb only) Signed-off-by: Franklin Meng Signed-off-by: Douglas Schilling Landgraf Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index b03a685..a98a688 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -64,3 +64,4 @@ 66 -> Empire dual TV (em2880) 67 -> Terratec Grabby (em2860) [0ccd:0096] 68 -> Terratec AV350 (em2860) [0ccd:0084] + 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313] -- cgit v1.1 From 27049dc30152ad0401082f32c33859821b4be029 Mon Sep 17 00:00:00 2001 From: Barry Kitson Date: Sun, 7 Jun 2009 10:41:03 -0300 Subject: V4L/DVB (11996): saa7134: add support for AVerMedia M103 (f736) Add 1461:f736 to the list of identifiers corresponding to the SAA7134_BOARD_AVERMEDIA_M103 board. This patch adds support for a variant of the AVerMedia M103 MiniPCI DVB-T Hybrid card. Signed-off-by: Barry Kitson Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.saa7134 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index e56934e..1556242 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -143,7 +143,7 @@ 142 -> Beholder BeholdTV H6 [5ace:6290] 143 -> Beholder BeholdTV M63 [5ace:6191] 144 -> Beholder BeholdTV M6 Extra [5ace:6193] -145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636] +145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636,1461:f736] 146 -> ASUSTeK P7131 Analog 147 -> Asus Tiger 3in1 [1043:4878] 148 -> Encore ENLTV-FM v5.3 [1a7f:2008] -- cgit v1.1 From 418589663d6011de9006425b6c5721e1544fb47a Mon Sep 17 00:00:00 2001 From: Mel Gorman Date: Tue, 16 Jun 2009 15:32:12 -0700 Subject: page allocator: use allocation flags as an index to the zone watermark ALLOC_WMARK_MIN, ALLOC_WMARK_LOW and ALLOC_WMARK_HIGH determin whether pages_min, pages_low or pages_high is used as the zone watermark when allocating the pages. Two branches in the allocator hotpath determine which watermark to use. This patch uses the flags as an array index into a watermark array that is indexed with WMARK_* defines accessed via helpers. All call sites that use zone->pages_* are updated to use the helpers for accessing the values and the array offsets for setting. Signed-off-by: Mel Gorman Reviewed-by: Christoph Lameter Cc: KOSAKI Motohiro Cc: Pekka Enberg Cc: Peter Zijlstra Cc: Nick Piggin Cc: Dave Hansen Cc: Lee Schermerhorn Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/sysctl/vm.txt | 11 ++++++----- Documentation/vm/balance | 18 +++++++++--------- 2 files changed, 15 insertions(+), 14 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 6fab2dc..0ea5adb 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -233,8 +233,8 @@ These protections are added to score to judge whether this zone should be used for page allocation or should be reclaimed. In this example, if normal pages (index=2) are required to this DMA zone and -pages_high is used for watermark, the kernel judges this zone should not be -used because pages_free(1355) is smaller than watermark + protection[2] +watermark[WMARK_HIGH] is used for watermark, the kernel judges this zone should +not be used because pages_free(1355) is smaller than watermark + protection[2] (4 + 2004 = 2008). If this protection value is 0, this zone would be used for normal page requirement. If requirement is DMA zone(index=0), protection[0] (=0) is used. @@ -280,9 +280,10 @@ The default value is 65536. min_free_kbytes: This is used to force the Linux VM to keep a minimum number -of kilobytes free. The VM uses this number to compute a pages_min -value for each lowmem zone in the system. Each lowmem zone gets -a number of reserved free pages based proportionally on its size. +of kilobytes free. The VM uses this number to compute a +watermark[WMARK_MIN] value for each lowmem zone in the system. +Each lowmem zone gets a number of reserved free pages based +proportionally on its size. Some minimal amount of memory is needed to satisfy PF_MEMALLOC allocations; if you set this to lower than 1024KB, your system will diff --git a/Documentation/vm/balance b/Documentation/vm/balance index bd3d31b..c46e68c 100644 --- a/Documentation/vm/balance +++ b/Documentation/vm/balance @@ -75,15 +75,15 @@ Page stealing from process memory and shm is done if stealing the page would alleviate memory pressure on any zone in the page's node that has fallen below its watermark. -pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are -per-zone fields, used to determine when a zone needs to be balanced. When -the number of pages falls below pages_min, the hysteric field low_on_memory -gets set. This stays set till the number of free pages becomes pages_high. -When low_on_memory is set, page allocation requests will try to free some -pages in the zone (providing GFP_WAIT is set in the request). Orthogonal -to this, is the decision to poke kswapd to free some zone pages. That -decision is not hysteresis based, and is done when the number of free -pages is below pages_low; in which case zone_wake_kswapd is also set. +watemark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These +are per-zone fields, used to determine when a zone needs to be balanced. When +the number of pages falls below watermark[WMARK_MIN], the hysteric field +low_on_memory gets set. This stays set till the number of free pages becomes +watermark[WMARK_HIGH]. When low_on_memory is set, page allocation requests will +try to free some pages in the zone (providing GFP_WAIT is set in the request). +Orthogonal to this, is the decision to poke kswapd to free some zone pages. +That decision is not hysteresis based, and is done when the number of free +pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set. (Good) Ideas that I have heard: -- cgit v1.1 From c9ba78e226057a1c2f19671383c496df187c02b5 Mon Sep 17 00:00:00 2001 From: Wu Fengguang Date: Tue, 16 Jun 2009 15:32:25 -0700 Subject: pagemap: document clarifications Some bit ranges were inclusive and some not. Fix them to be consistently inclusive. Signed-off-by: Wu Fengguang Cc: KOSAKI Motohiro Cc: Andi Kleen Cc: Matt Mackall Cc: Alexey Dobriyan Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/pagemap.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt index ce72c0f..1f1e69f 100644 --- a/Documentation/vm/pagemap.txt +++ b/Documentation/vm/pagemap.txt @@ -12,9 +12,9 @@ There are three components to pagemap: value for each virtual page, containing the following data (from fs/proc/task_mmu.c, above pagemap_read): - * Bits 0-55 page frame number (PFN) if present + * Bits 0-54 page frame number (PFN) if present * Bits 0-4 swap type if swapped - * Bits 5-55 swap offset if swapped + * Bits 5-54 swap offset if swapped * Bits 55-60 page shift (page size = 1< Date: Tue, 16 Jun 2009 15:32:26 -0700 Subject: pagemap: document 9 more exported page flags Also add short descriptions for all of the 20 exported page flags. Signed-off-by: Wu Fengguang Cc: KOSAKI Motohiro Cc: Andi Kleen Cc: Matt Mackall Cc: Alexey Dobriyan Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/pagemap.txt | 62 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 62 insertions(+) (limited to 'Documentation') diff --git a/Documentation/vm/pagemap.txt b/Documentation/vm/pagemap.txt index 1f1e69f..600a304 100644 --- a/Documentation/vm/pagemap.txt +++ b/Documentation/vm/pagemap.txt @@ -49,6 +49,68 @@ There are three components to pagemap: 8. WRITEBACK 9. RECLAIM 10. BUDDY + 11. MMAP + 12. ANON + 13. SWAPCACHE + 14. SWAPBACKED + 15. COMPOUND_HEAD + 16. COMPOUND_TAIL + 16. HUGE + 18. UNEVICTABLE + 20. NOPAGE + +Short descriptions to the page flags: + + 0. LOCKED + page is being locked for exclusive access, eg. by undergoing read/write IO + + 7. SLAB + page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator + When compound page is used, SLUB/SLQB will only set this flag on the head + page; SLOB will not flag it at all. + +10. BUDDY + a free memory block managed by the buddy system allocator + The buddy system organizes free memory in blocks of various orders. + An order N block has 2^N physically contiguous pages, with the BUDDY flag + set for and _only_ for the first page. + +15. COMPOUND_HEAD +16. COMPOUND_TAIL + A compound page with order N consists of 2^N physically contiguous pages. + A compound page with order 2 takes the form of "HTTT", where H donates its + head page and T donates its tail page(s). The major consumers of compound + pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), the SLUB etc. + memory allocators and various device drivers. However in this interface, + only huge/giga pages are made visible to end users. +17. HUGE + this is an integral part of a HugeTLB page + +20. NOPAGE + no page frame exists at the requested address + + [IO related page flags] + 1. ERROR IO error occurred + 3. UPTODATE page has up-to-date data + ie. for file backed page: (in-memory data revision >= on-disk one) + 4. DIRTY page has been written to, hence contains new data + ie. for file backed page: (in-memory data revision > on-disk one) + 8. WRITEBACK page is being synced to disk + + [LRU related page flags] + 5. LRU page is in one of the LRU lists + 6. ACTIVE page is in the active LRU list +18. UNEVICTABLE page is in the unevictable (non-)LRU list + It is somehow pinned and not a candidate for LRU page reclaims, + eg. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments + 2. REFERENCED page has been referenced since last LRU list enqueue/requeue + 9. RECLAIM page will be reclaimed soon after its pageout IO completed +11. MMAP a memory mapped page +12. ANON a memory mapped page that is not part of a file +13. SWAPCACHE page is mapped to swap space, ie. has an associated swap entry +14. SWAPBACKED page is backed by swap/RAM + +The page-types tool in this directory can be used to query the above flags. Using pagemap to do something useful: -- cgit v1.1 From 35efa5e993a7a00a50b87d2b7725c3eafc80b083 Mon Sep 17 00:00:00 2001 From: Wu Fengguang Date: Tue, 16 Jun 2009 15:32:27 -0700 Subject: pagemap: add page-types tool Add page-types, a handy tool for querying page flags. It will expand some of the overloaded flags: PG_slob_free = PG_private PG_slub_frozen = PG_active PG_slub_debug = PG_error PG_readahead = PG_reclaim and mask out obscure flags except in -raw mode: PG_reserved PG_mlocked PG_mappedtodisk PG_private PG_private_2 PG_owner_priv_1 PG_arch_1 PG_uncached PG_compound* for non hugeTLB pages [akpm@linux-foundation.org: fix warning] Signed-off-by: Wu Fengguang Cc: KOSAKI Motohiro Cc: Andi Kleen Cc: Matt Mackall Cc: Alexey Dobriyan Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/Makefile | 2 +- Documentation/vm/page-types.c | 698 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 699 insertions(+), 1 deletion(-) create mode 100644 Documentation/vm/page-types.c (limited to 'Documentation') diff --git a/Documentation/vm/Makefile b/Documentation/vm/Makefile index 6f562f7..27479d4 100644 --- a/Documentation/vm/Makefile +++ b/Documentation/vm/Makefile @@ -2,7 +2,7 @@ obj- := dummy.o # List of programs to build -hostprogs-y := slabinfo +hostprogs-y := slabinfo slqbinfo page-types # Tell kbuild to always build the programs always := $(hostprogs-y) diff --git a/Documentation/vm/page-types.c b/Documentation/vm/page-types.c new file mode 100644 index 0000000..0833f44 --- /dev/null +++ b/Documentation/vm/page-types.c @@ -0,0 +1,698 @@ +/* + * page-types: Tool for querying page flags + * + * Copyright (C) 2009 Intel corporation + * Copyright (C) 2009 Wu Fengguang + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +/* + * kernel page flags + */ + +#define KPF_BYTES 8 +#define PROC_KPAGEFLAGS "/proc/kpageflags" + +/* copied from kpageflags_read() */ +#define KPF_LOCKED 0 +#define KPF_ERROR 1 +#define KPF_REFERENCED 2 +#define KPF_UPTODATE 3 +#define KPF_DIRTY 4 +#define KPF_LRU 5 +#define KPF_ACTIVE 6 +#define KPF_SLAB 7 +#define KPF_WRITEBACK 8 +#define KPF_RECLAIM 9 +#define KPF_BUDDY 10 + +/* [11-20] new additions in 2.6.31 */ +#define KPF_MMAP 11 +#define KPF_ANON 12 +#define KPF_SWAPCACHE 13 +#define KPF_SWAPBACKED 14 +#define KPF_COMPOUND_HEAD 15 +#define KPF_COMPOUND_TAIL 16 +#define KPF_HUGE 17 +#define KPF_UNEVICTABLE 18 +#define KPF_NOPAGE 20 + +/* [32-] kernel hacking assistances */ +#define KPF_RESERVED 32 +#define KPF_MLOCKED 33 +#define KPF_MAPPEDTODISK 34 +#define KPF_PRIVATE 35 +#define KPF_PRIVATE_2 36 +#define KPF_OWNER_PRIVATE 37 +#define KPF_ARCH 38 +#define KPF_UNCACHED 39 + +/* [48-] take some arbitrary free slots for expanding overloaded flags + * not part of kernel API + */ +#define KPF_READAHEAD 48 +#define KPF_SLOB_FREE 49 +#define KPF_SLUB_FROZEN 50 +#define KPF_SLUB_DEBUG 51 + +#define KPF_ALL_BITS ((uint64_t)~0ULL) +#define KPF_HACKERS_BITS (0xffffULL << 32) +#define KPF_OVERLOADED_BITS (0xffffULL << 48) +#define BIT(name) (1ULL << KPF_##name) +#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL)) + +static char *page_flag_names[] = { + [KPF_LOCKED] = "L:locked", + [KPF_ERROR] = "E:error", + [KPF_REFERENCED] = "R:referenced", + [KPF_UPTODATE] = "U:uptodate", + [KPF_DIRTY] = "D:dirty", + [KPF_LRU] = "l:lru", + [KPF_ACTIVE] = "A:active", + [KPF_SLAB] = "S:slab", + [KPF_WRITEBACK] = "W:writeback", + [KPF_RECLAIM] = "I:reclaim", + [KPF_BUDDY] = "B:buddy", + + [KPF_MMAP] = "M:mmap", + [KPF_ANON] = "a:anonymous", + [KPF_SWAPCACHE] = "s:swapcache", + [KPF_SWAPBACKED] = "b:swapbacked", + [KPF_COMPOUND_HEAD] = "H:compound_head", + [KPF_COMPOUND_TAIL] = "T:compound_tail", + [KPF_HUGE] = "G:huge", + [KPF_UNEVICTABLE] = "u:unevictable", + [KPF_NOPAGE] = "n:nopage", + + [KPF_RESERVED] = "r:reserved", + [KPF_MLOCKED] = "m:mlocked", + [KPF_MAPPEDTODISK] = "d:mappedtodisk", + [KPF_PRIVATE] = "P:private", + [KPF_PRIVATE_2] = "p:private_2", + [KPF_OWNER_PRIVATE] = "O:owner_private", + [KPF_ARCH] = "h:arch", + [KPF_UNCACHED] = "c:uncached", + + [KPF_READAHEAD] = "I:readahead", + [KPF_SLOB_FREE] = "P:slob_free", + [KPF_SLUB_FROZEN] = "A:slub_frozen", + [KPF_SLUB_DEBUG] = "E:slub_debug", +}; + + +/* + * data structures + */ + +static int opt_raw; /* for kernel developers */ +static int opt_list; /* list pages (in ranges) */ +static int opt_no_summary; /* don't show summary */ +static pid_t opt_pid; /* process to walk */ + +#define MAX_ADDR_RANGES 1024 +static int nr_addr_ranges; +static unsigned long opt_offset[MAX_ADDR_RANGES]; +static unsigned long opt_size[MAX_ADDR_RANGES]; + +#define MAX_BIT_FILTERS 64 +static int nr_bit_filters; +static uint64_t opt_mask[MAX_BIT_FILTERS]; +static uint64_t opt_bits[MAX_BIT_FILTERS]; + +static int page_size; + +#define PAGES_BATCH (64 << 10) /* 64k pages */ +static int kpageflags_fd; +static uint64_t kpageflags_buf[KPF_BYTES * PAGES_BATCH]; + +#define HASH_SHIFT 13 +#define HASH_SIZE (1 << HASH_SHIFT) +#define HASH_MASK (HASH_SIZE - 1) +#define HASH_KEY(flags) (flags & HASH_MASK) + +static unsigned long total_pages; +static unsigned long nr_pages[HASH_SIZE]; +static uint64_t page_flags[HASH_SIZE]; + + +/* + * helper functions + */ + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +#define min_t(type, x, y) ({ \ + type __min1 = (x); \ + type __min2 = (y); \ + __min1 < __min2 ? __min1 : __min2; }) + +unsigned long pages2mb(unsigned long pages) +{ + return (pages * page_size) >> 20; +} + +void fatal(const char *x, ...) +{ + va_list ap; + + va_start(ap, x); + vfprintf(stderr, x, ap); + va_end(ap); + exit(EXIT_FAILURE); +} + + +/* + * page flag names + */ + +char *page_flag_name(uint64_t flags) +{ + static char buf[65]; + int present; + int i, j; + + for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { + present = (flags >> i) & 1; + if (!page_flag_names[i]) { + if (present) + fatal("unkown flag bit %d\n", i); + continue; + } + buf[j++] = present ? page_flag_names[i][0] : '_'; + } + + return buf; +} + +char *page_flag_longname(uint64_t flags) +{ + static char buf[1024]; + int i, n; + + for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + if ((flags >> i) & 1) + n += snprintf(buf + n, sizeof(buf) - n, "%s,", + page_flag_names[i] + 2); + } + if (n) + n--; + buf[n] = '\0'; + + return buf; +} + + +/* + * page list and summary + */ + +void show_page_range(unsigned long offset, uint64_t flags) +{ + static uint64_t flags0; + static unsigned long index; + static unsigned long count; + + if (flags == flags0 && offset == index + count) { + count++; + return; + } + + if (count) + printf("%lu\t%lu\t%s\n", + index, count, page_flag_name(flags0)); + + flags0 = flags; + index = offset; + count = 1; +} + +void show_page(unsigned long offset, uint64_t flags) +{ + printf("%lu\t%s\n", offset, page_flag_name(flags)); +} + +void show_summary(void) +{ + int i; + + printf(" flags\tpage-count MB" + " symbolic-flags\t\t\tlong-symbolic-flags\n"); + + for (i = 0; i < ARRAY_SIZE(nr_pages); i++) { + if (nr_pages[i]) + printf("0x%016llx\t%10lu %8lu %s\t%s\n", + (unsigned long long)page_flags[i], + nr_pages[i], + pages2mb(nr_pages[i]), + page_flag_name(page_flags[i]), + page_flag_longname(page_flags[i])); + } + + printf(" total\t%10lu %8lu\n", + total_pages, pages2mb(total_pages)); +} + + +/* + * page flag filters + */ + +int bit_mask_ok(uint64_t flags) +{ + int i; + + for (i = 0; i < nr_bit_filters; i++) { + if (opt_bits[i] == KPF_ALL_BITS) { + if ((flags & opt_mask[i]) == 0) + return 0; + } else { + if ((flags & opt_mask[i]) != opt_bits[i]) + return 0; + } + } + + return 1; +} + +uint64_t expand_overloaded_flags(uint64_t flags) +{ + /* SLOB/SLUB overload several page flags */ + if (flags & BIT(SLAB)) { + if (flags & BIT(PRIVATE)) + flags ^= BIT(PRIVATE) | BIT(SLOB_FREE); + if (flags & BIT(ACTIVE)) + flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN); + if (flags & BIT(ERROR)) + flags ^= BIT(ERROR) | BIT(SLUB_DEBUG); + } + + /* PG_reclaim is overloaded as PG_readahead in the read path */ + if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM)) + flags ^= BIT(RECLAIM) | BIT(READAHEAD); + + return flags; +} + +uint64_t well_known_flags(uint64_t flags) +{ + /* hide flags intended only for kernel hacker */ + flags &= ~KPF_HACKERS_BITS; + + /* hide non-hugeTLB compound pages */ + if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE))) + flags &= ~BITS_COMPOUND; + + return flags; +} + + +/* + * page frame walker + */ + +int hash_slot(uint64_t flags) +{ + int k = HASH_KEY(flags); + int i; + + /* Explicitly reserve slot 0 for flags 0: the following logic + * cannot distinguish an unoccupied slot from slot (flags==0). + */ + if (flags == 0) + return 0; + + /* search through the remaining (HASH_SIZE-1) slots */ + for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) { + if (!k || k >= ARRAY_SIZE(page_flags)) + k = 1; + if (page_flags[k] == 0) { + page_flags[k] = flags; + return k; + } + if (page_flags[k] == flags) + return k; + } + + fatal("hash table full: bump up HASH_SHIFT?\n"); + exit(EXIT_FAILURE); +} + +void add_page(unsigned long offset, uint64_t flags) +{ + flags = expand_overloaded_flags(flags); + + if (!opt_raw) + flags = well_known_flags(flags); + + if (!bit_mask_ok(flags)) + return; + + if (opt_list == 1) + show_page_range(offset, flags); + else if (opt_list == 2) + show_page(offset, flags); + + nr_pages[hash_slot(flags)]++; + total_pages++; +} + +void walk_pfn(unsigned long index, unsigned long count) +{ + unsigned long batch; + unsigned long n; + unsigned long i; + + if (index > ULONG_MAX / KPF_BYTES) + fatal("index overflow: %lu\n", index); + + lseek(kpageflags_fd, index * KPF_BYTES, SEEK_SET); + + while (count) { + batch = min_t(unsigned long, count, PAGES_BATCH); + n = read(kpageflags_fd, kpageflags_buf, batch * KPF_BYTES); + if (n == 0) + break; + if (n < 0) { + perror(PROC_KPAGEFLAGS); + exit(EXIT_FAILURE); + } + + if (n % KPF_BYTES != 0) + fatal("partial read: %lu bytes\n", n); + n = n / KPF_BYTES; + + for (i = 0; i < n; i++) + add_page(index + i, kpageflags_buf[i]); + + index += batch; + count -= batch; + } +} + +void walk_addr_ranges(void) +{ + int i; + + kpageflags_fd = open(PROC_KPAGEFLAGS, O_RDONLY); + if (kpageflags_fd < 0) { + perror(PROC_KPAGEFLAGS); + exit(EXIT_FAILURE); + } + + if (!nr_addr_ranges) + walk_pfn(0, ULONG_MAX); + + for (i = 0; i < nr_addr_ranges; i++) + walk_pfn(opt_offset[i], opt_size[i]); + + close(kpageflags_fd); +} + + +/* + * user interface + */ + +const char *page_flag_type(uint64_t flag) +{ + if (flag & KPF_HACKERS_BITS) + return "(r)"; + if (flag & KPF_OVERLOADED_BITS) + return "(o)"; + return " "; +} + +void usage(void) +{ + int i, j; + + printf( +"page-types [options]\n" +" -r|--raw Raw mode, for kernel developers\n" +" -a|--addr addr-spec Walk a range of pages\n" +" -b|--bits bits-spec Walk pages with specified bits\n" +#if 0 /* planned features */ +" -p|--pid pid Walk process address space\n" +" -f|--file filename Walk file address space\n" +#endif +" -l|--list Show page details in ranges\n" +" -L|--list-each Show page details one by one\n" +" -N|--no-summary Don't show summay info\n" +" -h|--help Show this usage message\n" +"addr-spec:\n" +" N one page at offset N (unit: pages)\n" +" N+M pages range from N to N+M-1\n" +" N,M pages range from N to M-1\n" +" N, pages range from N to end\n" +" ,M pages range from 0 to M\n" +"bits-spec:\n" +" bit1,bit2 (flags & (bit1|bit2)) != 0\n" +" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n" +" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n" +" =bit1,bit2 flags == (bit1|bit2)\n" +"bit-names:\n" + ); + + for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + printf("%16s%s", page_flag_names[i] + 2, + page_flag_type(1ULL << i)); + if (++j > 3) { + j = 0; + putchar('\n'); + } + } + printf("\n " + "(r) raw mode bits (o) overloaded bits\n"); +} + +unsigned long long parse_number(const char *str) +{ + unsigned long long n; + + n = strtoll(str, NULL, 0); + + if (n == 0 && str[0] != '0') + fatal("invalid name or number: %s\n", str); + + return n; +} + +void parse_pid(const char *str) +{ + opt_pid = parse_number(str); +} + +void parse_file(const char *name) +{ +} + +void add_addr_range(unsigned long offset, unsigned long size) +{ + if (nr_addr_ranges >= MAX_ADDR_RANGES) + fatal("too much addr ranges\n"); + + opt_offset[nr_addr_ranges] = offset; + opt_size[nr_addr_ranges] = size; + nr_addr_ranges++; +} + +void parse_addr_range(const char *optarg) +{ + unsigned long offset; + unsigned long size; + char *p; + + p = strchr(optarg, ','); + if (!p) + p = strchr(optarg, '+'); + + if (p == optarg) { + offset = 0; + size = parse_number(p + 1); + } else if (p) { + offset = parse_number(optarg); + if (p[1] == '\0') + size = ULONG_MAX; + else { + size = parse_number(p + 1); + if (*p == ',') { + if (size < offset) + fatal("invalid range: %lu,%lu\n", + offset, size); + size -= offset; + } + } + } else { + offset = parse_number(optarg); + size = 1; + } + + add_addr_range(offset, size); +} + +void add_bits_filter(uint64_t mask, uint64_t bits) +{ + if (nr_bit_filters >= MAX_BIT_FILTERS) + fatal("too much bit filters\n"); + + opt_mask[nr_bit_filters] = mask; + opt_bits[nr_bit_filters] = bits; + nr_bit_filters++; +} + +uint64_t parse_flag_name(const char *str, int len) +{ + int i; + + if (!*str || !len) + return 0; + + if (len <= 8 && !strncmp(str, "compound", len)) + return BITS_COMPOUND; + + for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + if (!strncmp(str, page_flag_names[i] + 2, len)) + return 1ULL << i; + } + + return parse_number(str); +} + +uint64_t parse_flag_names(const char *str, int all) +{ + const char *p = str; + uint64_t flags = 0; + + while (1) { + if (*p == ',' || *p == '=' || *p == '\0') { + if ((*str != '~') || (*str == '~' && all && *++str)) + flags |= parse_flag_name(str, p - str); + if (*p != ',') + break; + str = p + 1; + } + p++; + } + + return flags; +} + +void parse_bits_mask(const char *optarg) +{ + uint64_t mask; + uint64_t bits; + const char *p; + + p = strchr(optarg, '='); + if (p == optarg) { + mask = KPF_ALL_BITS; + bits = parse_flag_names(p + 1, 0); + } else if (p) { + mask = parse_flag_names(optarg, 0); + bits = parse_flag_names(p + 1, 0); + } else if (strchr(optarg, '~')) { + mask = parse_flag_names(optarg, 1); + bits = parse_flag_names(optarg, 0); + } else { + mask = parse_flag_names(optarg, 0); + bits = KPF_ALL_BITS; + } + + add_bits_filter(mask, bits); +} + + +struct option opts[] = { + { "raw" , 0, NULL, 'r' }, + { "pid" , 1, NULL, 'p' }, + { "file" , 1, NULL, 'f' }, + { "addr" , 1, NULL, 'a' }, + { "bits" , 1, NULL, 'b' }, + { "list" , 0, NULL, 'l' }, + { "list-each" , 0, NULL, 'L' }, + { "no-summary", 0, NULL, 'N' }, + { "help" , 0, NULL, 'h' }, + { NULL , 0, NULL, 0 } +}; + +int main(int argc, char *argv[]) +{ + int c; + + page_size = getpagesize(); + + while ((c = getopt_long(argc, argv, + "rp:f:a:b:lLNh", opts, NULL)) != -1) { + switch (c) { + case 'r': + opt_raw = 1; + break; + case 'p': + parse_pid(optarg); + break; + case 'f': + parse_file(optarg); + break; + case 'a': + parse_addr_range(optarg); + break; + case 'b': + parse_bits_mask(optarg); + break; + case 'l': + opt_list = 1; + break; + case 'L': + opt_list = 2; + break; + case 'N': + opt_no_summary = 1; + break; + case 'h': + usage(); + exit(0); + default: + usage(); + exit(1); + } + } + + if (opt_list == 1) + printf("offset\tcount\tflags\n"); + if (opt_list == 2) + printf("offset\tflags\n"); + + walk_addr_ranges(); + + if (opt_list == 1) + show_page_range(0, 0); /* drain the buffer */ + + if (opt_no_summary) + return 0; + + if (opt_list) + printf("\n\n"); + + show_summary(); + + return 0; +} -- cgit v1.1 From 2ff05b2b4eac2e63d345fc731ea151a060247f53 Mon Sep 17 00:00:00 2001 From: David Rientjes Date: Tue, 16 Jun 2009 15:32:56 -0700 Subject: oom: move oom_adj value from task_struct to mm_struct The per-task oom_adj value is a characteristic of its mm more than the task itself since it's not possible to oom kill any thread that shares the mm. If a task were to be killed while attached to an mm that could not be freed because another thread were set to OOM_DISABLE, it would have needlessly been terminated since there is no potential for future memory freeing. This patch moves oomkilladj (now more appropriately named oom_adj) from struct task_struct to struct mm_struct. This requires task_lock() on a task to check its oom_adj value to protect against exec, but it's already necessary to take the lock when dereferencing the mm to find the total VM size for the badness heuristic. This fixes a livelock if the oom killer chooses a task and another thread sharing the same memory has an oom_adj value of OOM_DISABLE. This occurs because oom_kill_task() repeatedly returns 1 and refuses to kill the chosen task while select_bad_process() will repeatedly choose the same task during the next retry. Taking task_lock() in select_bad_process() to check for OOM_DISABLE and in oom_kill_task() to check for threads sharing the same memory will be removed in the next patch in this series where it will no longer be necessary. Writing to /proc/pid/oom_adj for a kthread will now return -EINVAL since these threads are immune from oom killing already. They simply report an oom_adj value of OOM_DISABLE. Cc: Nick Piggin Cc: Rik van Riel Cc: Mel Gorman Signed-off-by: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/proc.txt | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index cd8717a..ebff3c1 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -1003,11 +1003,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS 3.1 /proc//oom_adj - Adjust the oom-killer score ------------------------------------------------------ -This file can be used to adjust the score used to select which processes -should be killed in an out-of-memory situation. Giving it a high score will -increase the likelihood of this process being killed by the oom-killer. Valid -values are in the range -16 to +15, plus the special value -17, which disables -oom-killing altogether for this process. +This file can be used to adjust the score used to select which processes should +be killed in an out-of-memory situation. The oom_adj value is a characteristic +of the task's mm, so all threads that share an mm with pid will have the same +oom_adj value. A high value will increase the likelihood of this process being +killed by the oom-killer. Valid values are in the range -16 to +15 as +explained below and a special value of -17, which disables oom-killing +altogether for threads sharing pid's mm. The process to be killed in an out-of-memory situation is selected among all others based on its badness score. This value equals the original memory size of the process @@ -1021,6 +1023,9 @@ the parent's score if they do not share the same memory. Thus forking servers are the prime candidates to be killed. Having only one 'hungry' child will make parent less preferable than the child. +/proc//oom_adj cannot be changed for kthreads since they are immune from +oom-killing already. + /proc//oom_score shows process' current badness score. The following heuristics are then applied: -- cgit v1.1 From 90afa5de6f3fa89a733861e843377302479fcf7e Mon Sep 17 00:00:00 2001 From: Mel Gorman Date: Tue, 16 Jun 2009 15:33:20 -0700 Subject: vmscan: properly account for the number of page cache pages zone_reclaim() can reclaim A bug was brought to my attention against a distro kernel but it affects mainline and I believe problems like this have been reported in various guises on the mailing lists although I don't have specific examples at the moment. The reported problem was that malloc() stalled for a long time (minutes in some cases) if a large tmpfs mount was occupying a large percentage of memory overall. The pages did not get cleaned or reclaimed by zone_reclaim() because the zone_reclaim_mode was unsuitable, but the lists are uselessly scanned frequencly making the CPU spin at near 100%. This patchset intends to address that bug and bring the behaviour of zone_reclaim() more in line with expectations which were noticed during investigation. It is based on top of mmotm and takes advantage of Kosaki's work with respect to zone_reclaim(). Patch 1 fixes the heuristics that zone_reclaim() uses to determine if the scan should go ahead. The broken heuristic is what was causing the malloc() stall as it uselessly scanned the LRU constantly. Currently, zone_reclaim is assuming zone_reclaim_mode is 1 and historically it could not deal with tmpfs pages at all. This fixes up the heuristic so that an unnecessary scan is more likely to be correctly avoided. Patch 2 notes that zone_reclaim() returning a failure automatically means the zone is marked full. This is not always true. It could have failed because the GFP mask or zone_reclaim_mode were unsuitable. Patch 3 introduces a counter zreclaim_failed that will increment each time the zone_reclaim scan-avoidance heuristics fail. If that counter is rapidly increasing, then zone_reclaim_mode should be set to 0 as a temporarily resolution and a bug reported because the scan-avoidance heuristic is still broken. This patch: On NUMA machines, the administrator can configure zone_reclaim_mode that is a more targetted form of direct reclaim. On machines with large NUMA distances for example, a zone_reclaim_mode defaults to 1 meaning that clean unmapped pages will be reclaimed if the zone watermarks are not being met. There is a heuristic that determines if the scan is worthwhile but the problem is that the heuristic is not being properly applied and is basically assuming zone_reclaim_mode is 1 if it is enabled. The lack of proper detection can manfiest as high CPU usage as the LRU list is scanned uselessly. Historically, once enabled it was depending on NR_FILE_PAGES which may include swapcache pages that the reclaim_mode cannot deal with. Patch vmscan-change-the-number-of-the-unmapped-files-in-zone-reclaim.patch by Kosaki Motohiro noted that zone_page_state(zone, NR_FILE_PAGES) included pages that were not file-backed such as swapcache and made a calculation based on the inactive, active and mapped files. This is far superior when zone_reclaim==1 but if RECLAIM_SWAP is set, then NR_FILE_PAGES is a reasonable starting figure. This patch alters how zone_reclaim() works out how many pages it might be able to reclaim given the current reclaim_mode. If RECLAIM_SWAP is set in the reclaim_mode it will either consider NR_FILE_PAGES as potential candidates or else use NR_{IN}ACTIVE}_PAGES-NR_FILE_MAPPED to discount swapcache and other non-file-backed pages. If RECLAIM_WRITE is not set, then NR_FILE_DIRTY number of pages are not candidates. If RECLAIM_SWAP is not set, then NR_FILE_MAPPED are not. [kosaki.motohiro@jp.fujitsu.com: Estimate unmapped pages minus tmpfs pages] [fengguang.wu@intel.com: Fix underflow problem in Kosaki's estimate] Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Acked-by: Christoph Lameter Cc: KOSAKI Motohiro Cc: Wu Fengguang Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/sysctl/vm.txt | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 0ea5adb..c4de635 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -315,10 +315,14 @@ min_unmapped_ratio: This is available only on NUMA kernels. -A percentage of the total pages in each zone. Zone reclaim will only -occur if more than this percentage of pages are file backed and unmapped. -This is to insure that a minimal amount of local pages is still available for -file I/O even if the node is overallocated. +This is a percentage of the total pages in each zone. Zone reclaim will +only occur if more than this percentage of pages are in a state that +zone_reclaim_mode allows to be reclaimed. + +If zone_reclaim_mode has the value 4 OR'd, then the percentage is compared +against all file-backed unmapped pages including swapcache pages and tmpfs +files. Otherwise, only unmapped pages backed by normal files but not tmpfs +files and similar are considered. The default is 1 percent. -- cgit v1.1 From b8d9a86590fb334d28c5905a4c419ece7d08e37d Mon Sep 17 00:00:00 2001 From: Jaswinder Singh Rajput Date: Tue, 16 Jun 2009 15:33:46 -0700 Subject: Documentation/accounting/getdelays.c intialize the variable before using it Fix compilation warning: Documentation/accounting/getdelays.c: In function `main': Documentation/accounting/getdelays.c:249: warning: `cmd_type' may be used uninitialized in this function This is in fact a false positive. Signed-off-by: Jaswinder Singh Rajput Acked-by: Balbir Singh Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/accounting/getdelays.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index 7ea2311..aa73e72 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c @@ -246,7 +246,8 @@ void print_ioacct(struct taskstats *t) int main(int argc, char *argv[]) { - int c, rc, rep_len, aggr_len, len2, cmd_type; + int c, rc, rep_len, aggr_len, len2; + int cmd_type = TASKSTATS_CMD_ATTR_UNSPEC; __u16 id; __u32 mypid; -- cgit v1.1 From 4764e280dc7dde1534161e148d38dbd792a2b8ab Mon Sep 17 00:00:00 2001 From: "Figo.zhang" Date: Tue, 16 Jun 2009 15:33:51 -0700 Subject: Documentation/atomic_ops.txt: fix sample code list_add() lost a parameter in sample code. Signed-off-by: Figo.zhang Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/atomic_ops.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt index 4ef2450..396bec3 100644 --- a/Documentation/atomic_ops.txt +++ b/Documentation/atomic_ops.txt @@ -229,10 +229,10 @@ kernel. It is the use of atomic counters to implement reference counting, and it works such that once the counter falls to zero it can be guaranteed that no other entity can be accessing the object: -static void obj_list_add(struct obj *obj) +static void obj_list_add(struct obj *obj, struct list_head *head) { obj->active = 1; - list_add(&obj->list); + list_add(&obj->list, head); } static void obj_list_del(struct obj *obj) -- cgit v1.1 From f324edc85e5c1137e49e3b36a58cf436ab5b1fb3 Mon Sep 17 00:00:00 2001 From: Daniel Mack Date: Tue, 16 Jun 2009 15:33:52 -0700 Subject: console: make blank timeout value a boot option The console blank timer is currently hardcoded to 10*60 seconds which might be annoying on systems with no input devices attached to wake up the console again. Especially during development, disabling the screen saver can be handy - for example when debugging the root fs mount mechanism or other scenarios where no userspace program could be started to do that at runtime from userspace. This patch defines a core_param for the variable in charge which allows users to entirely disable the blank feature at boot time by setting it 0. The value can still be overwritten at runtime using the standard ioctl call - this just allows to conditionally change the default. Signed-off-by: Daniel Mack Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/kernel-parameters.txt | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index ad38006..5578248 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -546,6 +546,10 @@ and is between 256 and 4096 characters. It is defined in the file console=brl,ttyS0 For now, only VisioBraille is supported. + consoleblank= [KNL] The console blank (screen saver) timeout in + seconds. Defaults to 10*60 = 10mins. A value of 0 + disables the blank timer. + coredump_filter= [KNL] Change the default value for /proc//coredump_filter. -- cgit v1.1 From 2d9d2fdfae4cf7fda90178a9daf0f8f750043ae8 Mon Sep 17 00:00:00 2001 From: Paul Menzel Date: Tue, 16 Jun 2009 15:34:21 -0700 Subject: Documentation/fb/vesafb.txt: fix typo Signed-off-by: Paul Menzel Cc: Gerd Knorr Cc: Nico Schmoigl Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/fb/vesafb.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt index ee277dd..950d5a6 100644 --- a/Documentation/fb/vesafb.txt +++ b/Documentation/fb/vesafb.txt @@ -95,7 +95,7 @@ There is no way to change the vesafb video mode and/or timings after booting linux. If you are not happy with the 60 Hz refresh rate, you have these options: - * configure and load the DOS-Tools for your the graphics board (if + * configure and load the DOS-Tools for the graphics board (if available) and boot linux with loadlin. * use a native driver (matroxfb/atyfb) instead if vesafb. If none is available, write a new one! -- cgit v1.1 From fe36adf47eb1f7f4972559efa30ce3d2d3f977f2 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Tue, 16 Jun 2009 13:35:01 -0400 Subject: No instance of ->bmap() needs BKL Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 3120f8d..229d7b7 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -187,7 +187,7 @@ readpages: no write_begin: no locks the page yes write_end: no yes, unlocks yes perform_write: no n/a yes -bmap: yes +bmap: no invalidatepage: no yes releasepage: no yes direct_IO: no -- cgit v1.1 From 0bd8df908de2aefe312d05bd25cd3abc21a6d1da Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Tue, 16 Jun 2009 22:38:29 -0700 Subject: Documentation/vm/Makefile: don't try to build slqbinfo For it is only in linux-next at this stage. Cc: Wu Fengguang Cc: KOSAKI Motohiro Cc: Andi Kleen Cc: Matt Mackall Cc: Alexey Dobriyan Cc: Ingo Molnar Cc: Pekka Enberg Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/vm/Makefile b/Documentation/vm/Makefile index 27479d4..5bd269b 100644 --- a/Documentation/vm/Makefile +++ b/Documentation/vm/Makefile @@ -2,7 +2,7 @@ obj- := dummy.o # List of programs to build -hostprogs-y := slabinfo slqbinfo page-types +hostprogs-y := slabinfo page-types # Tell kbuild to always build the programs always := $(hostprogs-y) -- cgit v1.1 From f21179a47ff8d1046a61c1cf5920244997a4a7bb Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Sat, 30 May 2009 13:25:08 -0300 Subject: thinkpad-acpi: enhance led support Add support for extra LEDs on recent ThinkPads, and avoid registering with the led class the LEDs which are not available for a given ThinkPad model. All non-restricted LEDs are always available through the procfs interface, as the firmware doesn't care if an attempt is made to access an invalid LED. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index e7e9a690..88fc066 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -920,7 +920,7 @@ The available commands are: echo ' off' >/proc/acpi/ibm/led echo ' blink' >/proc/acpi/ibm/led -The range is 0 to 7. The set of LEDs that can be +The range is 0 to 15. The set of LEDs that can be controlled varies from model to model. Here is the common ThinkPad mapping: @@ -932,6 +932,11 @@ mapping: 5 - UltraBase battery slot 6 - (unknown) 7 - standby + 8 - dock status 1 + 9 - dock status 2 + 10, 11 - (unknown) + 12 - thinkvantage + 13, 14, 15 - (unknown) All of the above can be turned on and off and can be made to blink. @@ -940,10 +945,12 @@ sysfs notes: The ThinkPad LED sysfs interface is described in detail by the LED class documentation, in Documentation/leds-class.txt. -The leds are named (in LED ID order, from 0 to 7): +The LEDs are named (in LED ID order, from 0 to 12): "tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt", "tpacpi::dock_active", "tpacpi::bay_active", "tpacpi::dock_batt", -"tpacpi::unknown_led", "tpacpi::standby". +"tpacpi::unknown_led", "tpacpi::standby", "tpacpi::dock_status1", +"tpacpi::dock_status2", "tpacpi::unknown_led2", "tpacpi::unknown_led3", +"tpacpi::thinkvantage". Due to limitations in the sysfs LED class, if the status of the LED indicators cannot be read due to an error, thinkpad-acpi will report it as @@ -958,6 +965,12 @@ ThinkPad indicator LED should blink in hardware accelerated mode, use the "timer" trigger, and leave the delay_on and delay_off parameters set to zero (to request hardware acceleration autodetection). +LEDs that are known not to exist in a given ThinkPad model are not +made available through the sysfs interface. If you have a dock and you +notice there are LEDs listed for your ThinkPad that do not exist (and +are not in the dock), or if you notice that there are missing LEDs, +a report to ibm-acpi-devel@lists.sourceforge.net is appreciated. + ACPI sounds -- /proc/acpi/ibm/beep ---------------------------------- @@ -1555,3 +1568,7 @@ Sysfs interface changelog: 0x020300: hotkey enable/disable support removed, attributes hotkey_bios_enabled and hotkey_enable deprecated and marked for removal. + +0x020400: Marker for 16 LEDs support. Also, LEDs that are known + to not exist in a given model are not registered with + the LED sysfs class anymore. -- cgit v1.1 From d7880f10c5d42ba182a97c1fd41d41d0b8837097 Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Thu, 18 Jun 2009 00:40:16 -0300 Subject: thinkpad-acpi: forbid the use of HBRV on Lenovo ThinkPads Forcing thinkpad-acpi to do EC-based brightness control (HBRV) on a X61 has very... interesting effects, instead of doing nothing (since it doesn't have EC-based backlight control), it causes "weirdness" in the fan tachometer readings, for example. This means the EC register that used to be HBRV has been reused by Lenovo for something else, but they didn't remove it from the DSDT. Make sure the documentation reflects this data, and forbid the user from forcing the driver to access HBRV on Lenovo ThinkPads. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index 88fc066..f2ff638 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -1169,17 +1169,19 @@ may not be distinct. Later Lenovo models that implement the ACPI display backlight brightness control methods have 16 levels, ranging from 0 to 15. -There are two interfaces to the firmware for direct brightness control, -EC and UCMS (or CMOS). To select which one should be used, use the -brightness_mode module parameter: brightness_mode=1 selects EC mode, -brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC -mode with NVRAM backing (so that brightness changes are remembered -across shutdown/reboot). +For IBM ThinkPads, there are two interfaces to the firmware for direct +brightness control, EC and UCMS (or CMOS). To select which one should be +used, use the brightness_mode module parameter: brightness_mode=1 selects +EC mode, brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC +mode with NVRAM backing (so that brightness changes are remembered across +shutdown/reboot). The driver tries to select which interface to use from a table of defaults for each ThinkPad model. If it makes a wrong choice, please report this as a bug, so that we can fix it. +Lenovo ThinkPads only support brightness_mode=2 (UCMS). + When display backlight brightness controls are available through the standard ACPI interface, it is best to use it instead of this direct ThinkPad-specific interface. The driver will disable its native -- cgit v1.1 From d73772474f6ebbacbe820c31c0fa1cffa7160246 Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Thu, 18 Jun 2009 00:40:17 -0300 Subject: thinkpad-acpi: support the second fan on the X61 Support reading the tachometer of the auxiliary fan of a X60/X61. It was found out by sheer luck, that bit 0 of EC register 0x31 (formely HBRV) selects which fan is active for tachometer readings through EC 0x84/0x085: 0 for fan1, 1 for fan2. Many thanks to Christoph Kl??nter, to Whoopie, and to weasel, who helped confirm that behaviour. Fan control through EC HFSP applies to both fans equally, regardless of the state of bit 0 of EC 0x31. That matches the way the DSDT uses HFSP. In order to better support the secondary fan, export a second tachometer over hwmon, and add defensive measures to make sure we are reading the correct tachometer. Support for the second fan is whitelist-based, as I have not found anything obvious to look for in the DSDT to detect the presence of the second fan. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index f2ff638..0d5e913 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -1269,7 +1269,7 @@ Fan control and monitoring: fan speed, fan enable/disable procfs: /proc/acpi/ibm/fan sysfs device attributes: (hwmon "thinkpad") fan1_input, pwm1, - pwm1_enable + pwm1_enable, fan2_input sysfs hwmon driver attributes: fan_watchdog NOTE NOTE NOTE: fan control operations are disabled by default for @@ -1282,6 +1282,9 @@ from the hardware registers of the embedded controller. This is known to work on later R, T, X and Z series ThinkPads but may show a bogus value on other models. +Some Lenovo ThinkPads support a secondary fan. This fan cannot be +controlled separately, it shares the main fan control. + Fan levels: Most ThinkPad fans work in "levels" at the firmware interface. Level 0 @@ -1412,6 +1415,11 @@ hwmon device attribute fan1_input: which can take up to two minutes. May return rubbish on older ThinkPads. +hwmon device attribute fan2_input: + Fan tachometer reading, in RPM, for the secondary fan. + Available only on some ThinkPads. If the secondary fan is + not installed, will always read 0. + hwmon driver attribute fan_watchdog: Fan safety watchdog timer interval, in seconds. Minimum is 1 second, maximum is 120 seconds. 0 disables the watchdog. -- cgit v1.1 From 47bece87b14b866872b52ff04d469832e4936756 Mon Sep 17 00:00:00 2001 From: Thomas Mingarelli Date: Thu, 4 Jun 2009 19:50:45 +0000 Subject: [WATCHDOG] hpwdt: Add NMI sourcing Add NMI sourcing functionality (Can only be active if nmi_watchdog is inactive). Signed-off-by: Thomas Mingarelli Signed-off-by: Wim Van Sebroeck --- Documentation/watchdog/hpwdt.txt | 84 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 84 insertions(+) create mode 100644 Documentation/watchdog/hpwdt.txt (limited to 'Documentation') diff --git a/Documentation/watchdog/hpwdt.txt b/Documentation/watchdog/hpwdt.txt new file mode 100644 index 0000000..127839e --- /dev/null +++ b/Documentation/watchdog/hpwdt.txt @@ -0,0 +1,84 @@ +Last reviewed: 06/02/2009 + + HP iLO2 NMI Watchdog Driver + NMI sourcing for iLO2 based ProLiant Servers + Documentation and Driver by + Thomas Mingarelli + + The HP iLO2 NMI Watchdog driver is a kernel module that provides basic + watchdog functionality and the added benefit of NMI sourcing. Both the + watchdog functionality and the NMI sourcing capability need to be enabled + by the user. Remember that the two modes are not dependant on one another. + A user can have the NMI sourcing without the watchdog timer and vice-versa. + + Watchdog functionality is enabled like any other common watchdog driver. That + is, an application needs to be started that kicks off the watchdog timer. A + basic application exists in the Documentation/watchdog/src directory called + watchdog-test.c. Simply compile the C file and kick it off. If the system + gets into a bad state and hangs, the HP ProLiant iLO 2 timer register will + not be updated in a timely fashion and a hardware system reset (also known as + an Automatic Server Recovery (ASR)) event will occur. + + The hpwdt driver also has three (3) module parameters. They are the following: + + soft_margin - allows the user to set the watchdog timer value + allow_kdump - allows the user to save off a kernel dump image after an NMI + nowayout - basic watchdog parameter that does not allow the timer to + be restarted or an impending ASR to be escaped. + + NOTE: More information about watchdog drivers in general, including the ioctl + interface to /dev/watchdog can be found in + Documentation/watchdog/watchdog-api.txt and Documentation/IPMI.txt. + + The NMI sourcing capability is disabled when the driver discovers that the + nmi_watchdog is turned on (nmi_watchdog = 1). This is due to the inability to + distinguish between "NMI Watchdog Ticks" and "HW generated NMI events" in the + Linux kernel. What this means is that the hpwdt nmi handler code is called + each time the NMI signal fires off. This could amount to several thousands of + NMIs in a matter of seconds. If a user sees the Linux kernel's "dazed and + confused" message in the logs or if the system gets into a hung state, then + the user should reboot with nmi_watchdog=0. + + 1. If the kernel has not been booted with nmi_watchdog turned off then + edit /boot/grub/menu.lst and place the nmi_watchdog=0 at the end of the + currently booting kernel line. + 2. reboot the sever + + Now, the hpwdt can successfully receive and source the NMI and provide a log + message that details the reason for the NMI (as determined by the HP BIOS). + + Below is a list of NMIs the HP BIOS understands along with the associated + code (reason): + + No source found 00h + + Uncorrectable Memory Error 01h + + ASR NMI 1Bh + + PCI Parity Error 20h + + NMI Button Press 27h + + SB_BUS_NMI 28h + + ILO Doorbell NMI 29h + + ILO IOP NMI 2Ah + + ILO Watchdog NMI 2Bh + + Proc Throt NMI 2Ch + + Front Side Bus NMI 2Dh + + PCI Express Error 2Fh + + DMA controller NMI 30h + + Hypertransport/CSI Error 31h + + + + -- Tom Mingarelli + (thomas.mingarelli@hp.com) -- cgit v1.1 From 9d9b8fb0e5ebf4b0398e579f6061d4451fea3242 Mon Sep 17 00:00:00 2001 From: Robin Getz Date: Wed, 17 Jun 2009 16:25:54 -0700 Subject: irqs: add IRQF_SAMPLE_RANDOM to the feature-removal-schedule.txt (deprecated) list This adds IRQF_SAMPLE_RANDOM to the feature-removal (deprecated) list since most of the IRQF_SAMPLE_RANDOM users are technically bogus as entropy sources in the kernel's current entropy model. This was discussed on the lkml the past few days, which started here: http://lkml.org/lkml/2009/4/6/283 Signed-off-by: Robin Getz Cc: Theodore Ts'o Cc: Matt Mackall Cc: Randy Dunlap Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/feature-removal-schedule.txt | 14 ++++++++++++++ 1 file changed, 14 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 7129846..8d07ed3 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -6,6 +6,20 @@ be removed from this file. --------------------------- +What: IRQF_SAMPLE_RANDOM +Check: IRQF_SAMPLE_RANDOM +When: July 2009 + +Why: Many of IRQF_SAMPLE_RANDOM users are technically bogus as entropy + sources in the kernel's current entropy model. To resolve this, every + input point to the kernel's entropy pool needs to better document the + type of entropy source it actually is. This will be replaced with + additional add_*_randomness functions in drivers/char/random.c + +Who: Robin Getz & Matt Mackall + +--------------------------- + What: The ieee80211_regdom module parameter When: March 2010 / desktop catchup -- cgit v1.1 From d3d64df21d3d0de675a0d3ffa7c10514f3644b30 Mon Sep 17 00:00:00 2001 From: Keika Kobayashi Date: Wed, 17 Jun 2009 16:25:55 -0700 Subject: proc: export statistics for softirq to /proc Export statistics for softirq in /proc/softirqs and /proc/stat. 1. /proc/softirqs Implement /proc/softirqs which shows the number of softirq for each CPU like /proc/interrupts. 2. /proc/stat Add the "softirq" line to /proc/stat. This line shows the number of softirq for all cpu. The first column is the total of all softirqs and each subsequent column is the total for particular softirq. [kosaki.motohiro@jp.fujitsu.com: remove redundant for_each_possible_cpu() loop] Signed-off-by: Keika Kobayashi Reviewed-by: Hiroshi Shimamoto Cc: KOSAKI Motohiro Cc: Ingo Molnar Cc: Eric Dumazet Cc: Alexey Dobriyan Signed-off-by: KOSAKI Motohiro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/proc.txt | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) (limited to 'Documentation') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index ebff3c1..fb7d649 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -283,6 +283,7 @@ Table 1-4: Kernel info in /proc rtc Real time clock scsi SCSI info (see text) slabinfo Slab pool info + softirqs softirq usage stat Overall statistics swaps Swap space utilization sys See chapter 2 @@ -597,6 +598,25 @@ on the kind of area : 0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ... pages=10 vmalloc N0=10 +.............................................................................. + +softirqs: + +Provides counts of softirq handlers serviced since boot time, for each cpu. + +> cat /proc/softirqs + CPU0 CPU1 CPU2 CPU3 + HI: 0 0 0 0 + TIMER: 27166 27120 27097 27034 + NET_TX: 0 0 0 17 + NET_RX: 42 0 0 39 + BLOCK: 0 0 107 1121 + TASKLET: 0 0 0 290 + SCHED: 27035 26983 26971 26746 + HRTIMER: 0 0 0 0 + RCU: 1678 1769 2178 2250 + + 1.3 IDE devices in /proc/ide ---------------------------- @@ -883,6 +903,7 @@ since the system first booted. For a quick look, simply cat the file: processes 2915 procs_running 1 procs_blocked 0 + softirq 183433 0 21755 12 39 1137 231 21459 2263 The very first "cpu" line aggregates the numbers in all of the other "cpuN" lines. These numbers identify the amount of time the CPU has spent performing @@ -918,6 +939,11 @@ CPUs. The "procs_blocked" line gives the number of processes currently blocked, waiting for I/O to complete. +The "softirq" line gives counts of softirqs serviced since boot time, for each +of the possible system softirqs. The first column is the total of all +softirqs serviced; each subsequent column is the total for that particular +softirq. + 1.9 Ext4 file system parameters ------------------------------ -- cgit v1.1 From 349888ee7b2c1ffb44c806adf6f4289ca4a6fd42 Mon Sep 17 00:00:00 2001 From: Stefani Seibold Date: Wed, 17 Jun 2009 16:26:01 -0700 Subject: proc.txt: update kernel filesystem/proc.txt documentation An update for the "Process-Specific Subdirectories" section to reflect the changes till kernel 2.6.30. Signed-off-by: Stefani Seibold Cc: Randy Dunlap Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/proc.txt | 242 +++++++++++++++++++++++++++++-------- 1 file changed, 190 insertions(+), 52 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index fb7d649..fad18f9 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -5,11 +5,12 @@ Bodo Bauer 2.4.x update Jorge Nerin November 14 2000 -move /proc/sys Shen Feng April 1 2009 +move /proc/sys Shen Feng April 1 2009 ------------------------------------------------------------------------------ Version 1.3 Kernel version 2.2.12 Kernel version 2.4.0-test11-pre4 ------------------------------------------------------------------------------ +fixes/update part 1.1 Stefani Seibold June 9 2009 Table of Contents ----------------- @@ -116,7 +117,7 @@ The link self points to the process reading the file system. Each process subdirectory has the entries listed in Table 1-1. -Table 1-1: Process specific entries in /proc +Table 1-1: Process specific entries in /proc .............................................................................. File Content clear_refs Clears page referenced bits shown in smaps output @@ -134,46 +135,103 @@ Table 1-1: Process specific entries in /proc status Process status in human readable form wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan stack Report full stack trace, enable via CONFIG_STACKTRACE - smaps Extension based on maps, the rss size for each mapped file + smaps a extension based on maps, showing the memory consumption of + each mapping .............................................................................. For example, to get the status information of a process, all you have to do is read the file /proc/PID/status: - >cat /proc/self/status - Name: cat - State: R (running) - Pid: 5452 - PPid: 743 + >cat /proc/self/status + Name: cat + State: R (running) + Tgid: 5452 + Pid: 5452 + PPid: 743 TracerPid: 0 (2.4) - Uid: 501 501 501 501 - Gid: 100 100 100 100 - Groups: 100 14 16 - VmSize: 1112 kB - VmLck: 0 kB - VmRSS: 348 kB - VmData: 24 kB - VmStk: 12 kB - VmExe: 8 kB - VmLib: 1044 kB - SigPnd: 0000000000000000 - SigBlk: 0000000000000000 - SigIgn: 0000000000000000 - SigCgt: 0000000000000000 - CapInh: 00000000fffffeff - CapPrm: 0000000000000000 - CapEff: 0000000000000000 - + Uid: 501 501 501 501 + Gid: 100 100 100 100 + FDSize: 256 + Groups: 100 14 16 + VmPeak: 5004 kB + VmSize: 5004 kB + VmLck: 0 kB + VmHWM: 476 kB + VmRSS: 476 kB + VmData: 156 kB + VmStk: 88 kB + VmExe: 68 kB + VmLib: 1412 kB + VmPTE: 20 kb + Threads: 1 + SigQ: 0/28578 + SigPnd: 0000000000000000 + ShdPnd: 0000000000000000 + SigBlk: 0000000000000000 + SigIgn: 0000000000000000 + SigCgt: 0000000000000000 + CapInh: 00000000fffffeff + CapPrm: 0000000000000000 + CapEff: 0000000000000000 + CapBnd: ffffffffffffffff + voluntary_ctxt_switches: 0 + nonvoluntary_ctxt_switches: 1 This shows you nearly the same information you would get if you viewed it with the ps command. In fact, ps uses the proc file system to obtain its -information. The statm file contains more detailed information about the -process memory usage. Its seven fields are explained in Table 1-2. The stat -file contains details information about the process itself. Its fields are -explained in Table 1-3. +information. But you get a more detailed view of the process by reading the +file /proc/PID/status. It fields are described in table 1-2. + +The statm file contains more detailed information about the process +memory usage. Its seven fields are explained in Table 1-3. The stat file +contains details information about the process itself. Its fields are +explained in Table 1-4. +Table 1-2: Contents of the statm files (as of 2.6.30-rc7) +.............................................................................. + Field Content + Name filename of the executable + State state (R is running, S is sleeping, D is sleeping + in an uninterruptible wait, Z is zombie, + T is traced or stopped) + Tgid thread group ID + Pid process id + PPid process id of the parent process + TracerPid PID of process tracing this process (0 if not) + Uid Real, effective, saved set, and file system UIDs + Gid Real, effective, saved set, and file system GIDs + FDSize number of file descriptor slots currently allocated + Groups supplementary group list + VmPeak peak virtual memory size + VmSize total program size + VmLck locked memory size + VmHWM peak resident set size ("high water mark") + VmRSS size of memory portions + VmData size of data, stack, and text segments + VmStk size of data, stack, and text segments + VmExe size of text segment + VmLib size of shared library code + VmPTE size of page table entries + Threads number of threads + SigQ number of signals queued/max. number for queue + SigPnd bitmap of pending signals for the thread + ShdPnd bitmap of shared pending signals for the process + SigBlk bitmap of blocked signals + SigIgn bitmap of ignored signals + SigCgt bitmap of catched signals + CapInh bitmap of inheritable capabilities + CapPrm bitmap of permitted capabilities + CapEff bitmap of effective capabilities + CapBnd bitmap of capabilities bounding set + Cpus_allowed mask of CPUs on which this process may run + Cpus_allowed_list Same as previous, but in "list format" + Mems_allowed mask of memory nodes allowed to this process + Mems_allowed_list Same as previous, but in "list format" + voluntary_ctxt_switches number of voluntary context switches + nonvoluntary_ctxt_switches number of non voluntary context switches +.............................................................................. -Table 1-2: Contents of the statm files (as of 2.6.8-rc3) +Table 1-3: Contents of the statm files (as of 2.6.8-rc3) .............................................................................. Field Content size total program size (pages) (same as VmSize in status) @@ -188,7 +246,7 @@ Table 1-2: Contents of the statm files (as of 2.6.8-rc3) .............................................................................. -Table 1-3: Contents of the stat files (as of 2.6.22-rc3) +Table 1-4: Contents of the stat files (as of 2.6.30-rc7) .............................................................................. Field Content pid process id @@ -222,10 +280,10 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3) start_stack address of the start of the stack esp current value of ESP eip current value of EIP - pending bitmap of pending signals (obsolete) - blocked bitmap of blocked signals (obsolete) - sigign bitmap of ignored signals (obsolete) - sigcatch bitmap of catched signals (obsolete) + pending bitmap of pending signals + blocked bitmap of blocked signals + sigign bitmap of ignored signals + sigcatch bitmap of catched signals wchan address where process went to sleep 0 (place holder) 0 (place holder) @@ -234,19 +292,99 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3) rt_priority realtime priority policy scheduling policy (man sched_setscheduler) blkio_ticks time spent waiting for block IO + gtime guest time of the task in jiffies + cgtime guest time of the task children in jiffies .............................................................................. +The /proc/PID/map file containing the currently mapped memory regions and +their access permissions. + +The format is: + +address perms offset dev inode pathname + +08048000-08049000 r-xp 00000000 03:00 8312 /opt/test +08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test +0804a000-0806b000 rw-p 00000000 00:00 0 [heap] +a7cb1000-a7cb2000 ---p 00000000 00:00 0 +a7cb2000-a7eb2000 rw-p 00000000 00:00 0 +a7eb2000-a7eb3000 ---p 00000000 00:00 0 +a7eb3000-a7ed5000 rw-p 00000000 00:00 0 +a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6 +a8008000-a800a000 r--p 00133000 03:00 4222 /lib/libc.so.6 +a800a000-a800b000 rw-p 00135000 03:00 4222 /lib/libc.so.6 +a800b000-a800e000 rw-p 00000000 00:00 0 +a800e000-a8022000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0 +a8022000-a8023000 r--p 00013000 03:00 14462 /lib/libpthread.so.0 +a8023000-a8024000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0 +a8024000-a8027000 rw-p 00000000 00:00 0 +a8027000-a8043000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2 +a8043000-a8044000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2 +a8044000-a8045000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2 +aff35000-aff4a000 rw-p 00000000 00:00 0 [stack] +ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso] + +where "address" is the address space in the process that it occupies, "perms" +is a set of permissions: + + r = read + w = write + x = execute + s = shared + p = private (copy on write) + +"offset" is the offset into the mapping, "dev" is the device (major:minor), and +"inode" is the inode on that device. 0 indicates that no inode is associated +with the memory region, as the case would be with BSS (uninitialized data). +The "pathname" shows the name associated file for this mapping. If the mapping +is not associated with a file: + + [heap] = the heap of the program + [stack] = the stack of the main process + [vdso] = the "virtual dynamic shared object", + the kernel system call handler + + or if empty, the mapping is anonymous. + + +The /proc/PID/smaps is an extension based on maps, showing the memory +consumption for each of the process's mappings. For each of mappings there +is a series of lines such as the following: + +08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash +Size: 1084 kB +Rss: 892 kB +Pss: 374 kB +Shared_Clean: 892 kB +Shared_Dirty: 0 kB +Private_Clean: 0 kB +Private_Dirty: 0 kB +Referenced: 892 kB +Swap: 0 kB +KernelPageSize: 4 kB +MMUPageSize: 4 kB + +The first of these lines shows the same information as is displayed for the +mapping in /proc/PID/maps. The remaining lines show the size of the mapping, +the amount of the mapping that is currently resident in RAM, the "proportional +set size” (divide each shared page by the number of processes sharing it), the +number of clean and dirty shared pages in the mapping, and the number of clean +and dirty private pages in the mapping. The "Referenced" indicates the amount +of memory currently marked as referenced or accessed. + +This file is only present if the CONFIG_MMU kernel configuration option is +enabled. 1.2 Kernel data --------------- Similar to the process entries, the kernel data files give information about the running kernel. The files used to obtain this information are contained in -/proc and are listed in Table 1-4. Not all of these will be present in your +/proc and are listed in Table 1-5. Not all of these will be present in your system. It depends on the kernel configuration and the loaded modules, which files are there, and which are missing. -Table 1-4: Kernel info in /proc +Table 1-5: Kernel info in /proc .............................................................................. File Content apm Advanced power management info @@ -634,10 +772,10 @@ IDE devices: More detailed information can be found in the controller specific subdirectories. These are named ide0, ide1 and so on. Each of these -directories contains the files shown in table 1-5. +directories contains the files shown in table 1-6. -Table 1-5: IDE controller info in /proc/ide/ide? +Table 1-6: IDE controller info in /proc/ide/ide? .............................................................................. File Content channel IDE channel (0 or 1) @@ -647,11 +785,11 @@ Table 1-5: IDE controller info in /proc/ide/ide? .............................................................................. Each device connected to a controller has a separate subdirectory in the -controllers directory. The files listed in table 1-6 are contained in these +controllers directory. The files listed in table 1-7 are contained in these directories. -Table 1-6: IDE device information +Table 1-7: IDE device information .............................................................................. File Content cache The cache @@ -693,12 +831,12 @@ the drive parameters: 1.4 Networking info in /proc/net -------------------------------- -The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the +The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the additional values you get for IP version 6 if you configure the kernel to -support this. Table 1-7 lists the files and their meaning. +support this. Table 1-9 lists the files and their meaning. -Table 1-6: IPv6 info in /proc/net +Table 1-8: IPv6 info in /proc/net .............................................................................. File Content udp6 UDP sockets (IPv6) @@ -713,7 +851,7 @@ Table 1-6: IPv6 info in /proc/net .............................................................................. -Table 1-7: Network info in /proc/net +Table 1-9: Network info in /proc/net .............................................................................. File Content arp Kernel ARP table @@ -837,10 +975,10 @@ The directory /proc/parport contains information about the parallel ports of your system. It has one subdirectory for each port, named after the port number (0,1,2,...). -These directories contain the four files shown in Table 1-8. +These directories contain the four files shown in Table 1-10. -Table 1-8: Files in /proc/parport +Table 1-10: Files in /proc/parport .............................................................................. File Content autoprobe Any IEEE-1284 device ID information that has been acquired. @@ -858,10 +996,10 @@ Table 1-8: Files in /proc/parport Information about the available and actually used tty's can be found in the directory /proc/tty.You'll find entries for drivers and line disciplines in -this directory, as shown in Table 1-9. +this directory, as shown in Table 1-11. -Table 1-9: Files in /proc/tty +Table 1-11: Files in /proc/tty .............................................................................. File Content drivers list of drivers and their usage @@ -952,9 +1090,9 @@ Information about mounted ext4 file systems can be found in /proc/fs/ext4. Each mounted filesystem will have a directory in /proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or /proc/fs/ext4/dm-0). The files in each per-device directory are shown -in Table 1-10, below. +in Table 1-12, below. -Table 1-10: Files in /proc/fs/ext4/ +Table 1-12: Files in /proc/fs/ext4/ .............................................................................. File Content mb_groups details of multiblock allocator buddy cache of free blocks -- cgit v1.1 From ce05b2a9db1d86635a906f14427deff97eeb6183 Mon Sep 17 00:00:00 2001 From: Michael Shields Date: Wed, 17 Jun 2009 16:26:22 -0700 Subject: Doc fix: ext2 can only have 32,000 subdirs, not 32,768 ext2.txt says that dirs can have 32,768 subdirs, but the actual value of EXT2_LINK_MAX is 32000. ext3 is the same, but the doc does not mention it. One of ext4's features is to "fix 32000 subdirectory limit". Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/ext2.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/ext2.txt b/Documentation/filesystems/ext2.txt index e055acb..67639f9 100644 --- a/Documentation/filesystems/ext2.txt +++ b/Documentation/filesystems/ext2.txt @@ -322,7 +322,7 @@ an upper limit on the block size imposed by the page size of the kernel, so 8kB blocks are only allowed on Alpha systems (and other architectures which support larger pages). -There is an upper limit of 32768 subdirectories in a single directory. +There is an upper limit of 32000 subdirectories in a single directory. There is a "soft" upper limit of about 10-15k files in a single directory with the current linear linked-list directory implementation. This limit -- cgit v1.1 From 52b680c81238ea14693ab893d5d32a4d1c0a987d Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Wed, 17 Jun 2009 16:26:25 -0700 Subject: isofs: let mode and dmode mount options override rock ridge mode setting So far, permissions set via 'mode' and/or 'dmode' mount options were effective only if the medium had no rock ridge extensions (or was mounted without them). Add 'overriderockmode' mount option to indicate that these options should override permissions set in rock ridge extensions. Maybe this should be default but the current behavior is there since mount options were created so I think we should not change how they behave. Cc: Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/filesystems/isofs.txt | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/isofs.txt b/Documentation/filesystems/isofs.txt index 6973b98..3c367c3 100644 --- a/Documentation/filesystems/isofs.txt +++ b/Documentation/filesystems/isofs.txt @@ -23,8 +23,13 @@ Mount options unique to the isofs filesystem. map=off Do not map non-Rock Ridge filenames to lower case map=normal Map non-Rock Ridge filenames to lower case map=acorn As map=normal but also apply Acorn extensions if present - mode=xxx Sets the permissions on files to xxx - dmode=xxx Sets the permissions on directories to xxx + mode=xxx Sets the permissions on files to xxx unless Rock Ridge + extensions set the permissions otherwise + dmode=xxx Sets the permissions on directories to xxx unless Rock Ridge + extensions set the permissions otherwise + overriderockperm Set permissions on files and directories according to + 'mode' and 'dmode' even though Rock Ridge extensions are + present. nojoliet Ignore Joliet extensions if they are present. norock Ignore Rock Ridge extensions if they are present. hide Completely strip hidden files from the file system. -- cgit v1.1 From 082196242e24ff13354a2d376b275e01c08e6799 Mon Sep 17 00:00:00 2001 From: Jose Luis Perez Diez Date: Wed, 17 Jun 2009 16:26:30 -0700 Subject: Documentation/Changes: perl is needed to build the kernel Perl is used on the kernel Makefile to generate documentation, firmwares in c source form, sources, graphs, and some headers and this fact is undocumented. [akpm@linux-foundation.org: 80-columns, please] Signed-off-by: Jose Luis Perez Diez Cc: Sam Ravnborg Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/Changes | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/Changes b/Documentation/Changes index 6643924..6d0f1ef 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -72,6 +72,13 @@ assembling the 16-bit boot code, removing the need for as86 to compile your kernel. This change does, however, mean that you need a recent release of binutils. +Perl +---- + +You will need perl 5 and the following modules: Getopt::Long, Getopt::Std, +File::Basename, and File::Find to build the kernel. + + System utilities ================ -- cgit v1.1 From 28f06c6f4ba2ff450b134b82a3285a26f232a2c1 Mon Sep 17 00:00:00 2001 From: Jaswinder Singh Rajput Date: Wed, 17 Jun 2009 16:26:30 -0700 Subject: Documentation/connector/cn_test.c comment unused cn_test_want_notify() Currently cn_test_want_notify() has no user. So add an ifdef and a comment which tells us to not remove it. Signed-off-by: Jaswinder Singh Rajput Acked-by: Evgeniy Polyakov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/connector/cn_test.c | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'Documentation') diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c index 6977c17..f688eba 100644 --- a/Documentation/connector/cn_test.c +++ b/Documentation/connector/cn_test.c @@ -41,6 +41,12 @@ void cn_test_callback(void *data) msg->seq, msg->ack, msg->len, (char *)msg->data); } +/* + * Do not remove this function even if no one is using it as + * this is an example of how to get notifications about new + * connector user registration + */ +#if 0 static int cn_test_want_notify(void) { struct cn_ctl_msg *ctl; @@ -117,6 +123,7 @@ nlmsg_failure: kfree_skb(skb); return -EINVAL; } +#endif static u32 cn_test_timer_counter; static void cn_test_timer_func(unsigned long __data) -- cgit v1.1 From 22a668d7c3ef833e7d67e9cef587ecc78069d532 Mon Sep 17 00:00:00 2001 From: KAMEZAWA Hiroyuki Date: Wed, 17 Jun 2009 16:27:19 -0700 Subject: memcg: fix behavior under memory.limit equals to memsw.limit A user can set memcg.limit_in_bytes == memcg.memsw.limit_in_bytes when the user just want to limit the total size of applications, in other words, not very interested in memory usage itself. In this case, swap-out will be done only by global-LRU. But, under current implementation, memory.limit_in_bytes is checked at first and try_to_free_page() may do swap-out. But, that swap-out is useless for memsw.limit_in_bytes and the thread may hit limit again. This patch tries to fix the current behavior at memory.limit == memsw.limit case. And documentation is updated to explain the behavior of this special case. Signed-off-by: KAMEZAWA Hiroyuki Cc: Daisuke Nishimura Cc: Balbir Singh Cc: Li Zefan Cc: Dhaval Giani Cc: YAMAMOTO Takashi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/cgroups/memory.txt | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index 1a60887..af48135 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -152,14 +152,19 @@ When swap is accounted, following files are added. usage of mem+swap is limited by memsw.limit_in_bytes. -Note: why 'mem+swap' rather than swap. +* why 'mem+swap' rather than swap. The global LRU(kswapd) can swap out arbitrary pages. Swap-out means to move account from memory to swap...there is no change in usage of -mem+swap. - -In other words, when we want to limit the usage of swap without affecting -global LRU, mem+swap limit is better than just limiting swap from OS point -of view. +mem+swap. In other words, when we want to limit the usage of swap without +affecting global LRU, mem+swap limit is better than just limiting swap from +OS point of view. + +* What happens when a cgroup hits memory.memsw.limit_in_bytes +When a cgroup his memory.memsw.limit_in_bytes, it's useless to do swap-out +in this cgroup. Then, swap-out will not be done by cgroup routine and file +caches are dropped. But as mentioned above, global LRU can do swapout memory +from it for sanity of the system's memory management state. You can't forbid +it by cgroup. 2.5 Reclaim -- cgit v1.1 From c5b947b28828e82814605824e5db0bc58d66d8c0 Mon Sep 17 00:00:00 2001 From: Daisuke Nishimura Date: Wed, 17 Jun 2009 16:27:20 -0700 Subject: memcg: add interface to reset limits We don't have an interface to reset mem.limit or memsw.limit now. This patch allows to reset mem.limit or memsw.limit when they are being set to -1. Signed-off-by: Daisuke Nishimura Cc: KAMEZAWA Hiroyuki Cc: Balbir Singh Cc: Li Zefan Cc: Dhaval Giani Cc: YAMAMOTO Takashi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/cgroups/memory.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index af48135..23d1262 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -209,6 +209,7 @@ We can alter the memory limit: NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo, mega or gigabytes. +NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited). # cat /cgroups/0/memory.limit_in_bytes 4194304 -- cgit v1.1 From 26c369dada267d3df1beb86cf89b865ac1178a7f Mon Sep 17 00:00:00 2001 From: Matt Helsley Date: Wed, 17 Jun 2009 16:27:58 -0700 Subject: futex: documentation: fix inconsistent description of futex list_op_pending Strictly speaking list_op_pending points to the 'lock entry', not the 'lock word' (which is actually at 'offset' from 'lock entry'). We can infer this based on reading the code in kernel/futex.c: struct robust_list __user *entry, *next_entry, *pending; ... if (fetch_robust_entry(&pending, &head->list_op_pending, &pip)) return; ... if (pending) handle_futex_death((void __user *)pending + futex_offset, curr, pip); Which is also consistent with the rest of the docs on robust futex lists. Signed-off-by: Matt Helsley Cc: Ingo Molnar Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/robust-futex-ABI.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/robust-futex-ABI.txt b/Documentation/robust-futex-ABI.txt index 535f69f..fd1cd8a 100644 --- a/Documentation/robust-futex-ABI.txt +++ b/Documentation/robust-futex-ABI.txt @@ -135,7 +135,7 @@ manipulating this list), the user code must observe the following protocol on 'lock entry' insertion and removal: On insertion: - 1) set the 'list_op_pending' word to the address of the 'lock word' + 1) set the 'list_op_pending' word to the address of the 'lock entry' to be inserted, 2) acquire the futex lock, 3) add the lock entry, with its thread id (TID) in the bottom 29 bits @@ -143,7 +143,7 @@ On insertion: 4) clear the 'list_op_pending' word. On removal: - 1) set the 'list_op_pending' word to the address of the 'lock word' + 1) set the 'list_op_pending' word to the address of the 'lock entry' to be removed, 2) remove the lock entry for this lock from the 'head' list, 2) release the futex lock, and -- cgit v1.1 From 2521f2c228ad750701ba4702484e31d876dbc386 Mon Sep 17 00:00:00 2001 From: Peter Oberparleiter Date: Wed, 17 Jun 2009 16:28:08 -0700 Subject: gcov: add gcov profiling infrastructure Enable the use of GCC's coverage testing tool gcov [1] with the Linux kernel. gcov may be useful for: * debugging (has this code been reached at all?) * test improvement (how do I change my test to cover these lines?) * minimizing kernel configurations (do I need this option if the associated code is never run?) The profiling patch incorporates the following changes: * change kbuild to include profiling flags * provide functions needed by profiling code * present profiling data as files in debugfs Note that on some architectures, enabling gcc's profiling option "-fprofile-arcs" for the entire kernel may trigger compile/link/ run-time problems, some of which are caused by toolchain bugs and others which require adjustment of architecture code. For this reason profiling the entire kernel is initially restricted to those architectures for which it is known to work without changes. This restriction can be lifted once an architecture has been tested and found compatible with gcc's profiling. Profiling of single files or directories is still available on all platforms (see config help text). [1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html Signed-off-by: Peter Oberparleiter Cc: Andi Kleen Cc: Huang Ying Cc: Li Wei Cc: Michael Ellerman Cc: Ingo Molnar Cc: Heiko Carstens Cc: Martin Schwidefsky Cc: Rusty Russell Cc: WANG Cong Cc: Sam Ravnborg Cc: Jeff Dike Cc: Al Viro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/gcov.txt | 246 ++++++++++++++++++++++++++++++++++++ Documentation/kernel-parameters.txt | 7 + 2 files changed, 253 insertions(+) create mode 100644 Documentation/gcov.txt (limited to 'Documentation') diff --git a/Documentation/gcov.txt b/Documentation/gcov.txt new file mode 100644 index 0000000..e716aad --- /dev/null +++ b/Documentation/gcov.txt @@ -0,0 +1,246 @@ +Using gcov with the Linux kernel +================================ + +1. Introduction +2. Preparation +3. Customization +4. Files +5. Modules +6. Separated build and test machines +7. Troubleshooting +Appendix A: sample script: gather_on_build.sh +Appendix B: sample script: gather_on_test.sh + + +1. Introduction +=============== + +gcov profiling kernel support enables the use of GCC's coverage testing +tool gcov [1] with the Linux kernel. Coverage data of a running kernel +is exported in gcov-compatible format via the "gcov" debugfs directory. +To get coverage data for a specific file, change to the kernel build +directory and use gcov with the -o option as follows (requires root): + +# cd /tmp/linux-out +# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c + +This will create source code files annotated with execution counts +in the current directory. In addition, graphical gcov front-ends such +as lcov [2] can be used to automate the process of collecting data +for the entire kernel and provide coverage overviews in HTML format. + +Possible uses: + +* debugging (has this line been reached at all?) +* test improvement (how do I change my test to cover these lines?) +* minimizing kernel configurations (do I need this option if the + associated code is never run?) + +-- + +[1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html +[2] http://ltp.sourceforge.net/coverage/lcov.php + + +2. Preparation +============== + +Configure the kernel with: + + CONFIG_DEBUGFS=y + CONFIG_GCOV_KERNEL=y + +and to get coverage data for the entire kernel: + + CONFIG_GCOV_PROFILE_ALL=y + +Note that kernels compiled with profiling flags will be significantly +larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported +on all architectures. + +Profiling data will only become accessible once debugfs has been +mounted: + + mount -t debugfs none /sys/kernel/debug + + +3. Customization +================ + +To enable profiling for specific files or directories, add a line +similar to the following to the respective kernel Makefile: + + For a single file (e.g. main.o): + GCOV_PROFILE_main.o := y + + For all files in one directory: + GCOV_PROFILE := y + +To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL +is specified, use: + + GCOV_PROFILE_main.o := n + and: + GCOV_PROFILE := n + +Only files which are linked to the main kernel image or are compiled as +kernel modules are supported by this mechanism. + + +4. Files +======== + +The gcov kernel support creates the following files in debugfs: + + /sys/kernel/debug/gcov + Parent directory for all gcov-related files. + + /sys/kernel/debug/gcov/reset + Global reset file: resets all coverage data to zero when + written to. + + /sys/kernel/debug/gcov/path/to/compile/dir/file.gcda + The actual gcov data file as understood by the gcov + tool. Resets file coverage data to zero when written to. + + /sys/kernel/debug/gcov/path/to/compile/dir/file.gcno + Symbolic link to a static data file required by the gcov + tool. This file is generated by gcc when compiling with + option -ftest-coverage. + + +5. Modules +========== + +Kernel modules may contain cleanup code which is only run during +module unload time. The gcov mechanism provides a means to collect +coverage data for such code by keeping a copy of the data associated +with the unloaded module. This data remains available through debugfs. +Once the module is loaded again, the associated coverage counters are +initialized with the data from its previous instantiation. + +This behavior can be deactivated by specifying the gcov_persist kernel +parameter: + + gcov_persist=0 + +At run-time, a user can also choose to discard data for an unloaded +module by writing to its data file or the global reset file. + + +6. Separated build and test machines +==================================== + +The gcov kernel profiling infrastructure is designed to work out-of-the +box for setups where kernels are built and run on the same machine. In +cases where the kernel runs on a separate machine, special preparations +must be made, depending on where the gcov tool is used: + +a) gcov is run on the TEST machine + +The gcov tool version on the test machine must be compatible with the +gcc version used for kernel build. Also the following files need to be +copied from build to test machine: + +from the source tree: + - all C source files + headers + +from the build tree: + - all C source files + headers + - all .gcda and .gcno files + - all links to directories + +It is important to note that these files need to be placed into the +exact same file system location on the test machine as on the build +machine. If any of the path components is symbolic link, the actual +directory needs to be used instead (due to make's CURDIR handling). + +b) gcov is run on the BUILD machine + +The following files need to be copied after each test case from test +to build machine: + +from the gcov directory in sysfs: + - all .gcda files + - all links to .gcno files + +These files can be copied to any location on the build machine. gcov +must then be called with the -o option pointing to that directory. + +Example directory setup on the build machine: + + /tmp/linux: kernel source tree + /tmp/out: kernel build directory as specified by make O= + /tmp/coverage: location of the files copied from the test machine + + [user@build] cd /tmp/out + [user@build] gcov -o /tmp/coverage/tmp/out/init main.c + + +7. Troubleshooting +================== + +Problem: Compilation aborts during linker step. +Cause: Profiling flags are specified for source files which are not + linked to the main kernel or which are linked by a custom + linker procedure. +Solution: Exclude affected source files from profiling by specifying + GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the + corresponding Makefile. + + +Appendix A: gather_on_build.sh +============================== + +Sample script to gather coverage meta files on the build machine +(see 6a): + +#!/bin/bash + +KSRC=$1 +KOBJ=$2 +DEST=$3 + +if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then + echo "Usage: $0 " >&2 + exit 1 +fi + +KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) +KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) + +find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \ + -perm /u+r,g+r | tar cfz $DEST -P -T - + +if [ $? -eq 0 ] ; then + echo "$DEST successfully created, copy to test system and unpack with:" + echo " tar xfz $DEST -P" +else + echo "Could not create file $DEST" +fi + + +Appendix B: gather_on_test.sh +============================= + +Sample script to gather coverage data files on the test machine +(see 6b): + +#!/bin/bash + +DEST=$1 +GCDA=/sys/kernel/debug/gcov + +if [ -z "$DEST" ] ; then + echo "Usage: $0 " >&2 + exit 1 +fi + +find $GCDA -name '*.gcno' -o -name '*.gcda' | tar cfz $DEST -T - + +if [ $? -eq 0 ] ; then + echo "$DEST successfully created, copy to build system and unpack with:" + echo " tar xfz $DEST" +else + echo "Could not create file $DEST" +fi diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 5578248..08def8d 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -48,6 +48,7 @@ parameter is applicable: EFI EFI Partitioning (GPT) is enabled EIDE EIDE/ATAPI support is enabled. FB The frame buffer device is enabled. + GCOV GCOV profiling is enabled. HW Appropriate hardware is enabled. IA-64 IA-64 architecture is enabled. IMA Integrity measurement architecture is enabled. @@ -796,6 +797,12 @@ and is between 256 and 4096 characters. It is defined in the file Format: off | on default: on + gcov_persist= [GCOV] When non-zero (default), profiling data for + kernel modules is saved and remains accessible via + debugfs, even when the module is unloaded/reloaded. + When zero, profiling data is discarded and associated + debugfs files are removed at module unload time. + gdth= [HW,SCSI] See header of drivers/scsi/gdth.c. -- cgit v1.1 From eae9d2ba0cfc27a2ad9765f23efb98fb80d80234 Mon Sep 17 00:00:00 2001 From: Rodolfo Giometti Date: Wed, 17 Jun 2009 16:28:37 -0700 Subject: LinuxPPS: core support This patch adds the kernel side of the PPS support currently named "LinuxPPS". PPS means "pulse per second" and a PPS source is just a device which provides a high precision signal each second so that an application can use it to adjust system clock time. Common use is the combination of the NTPD as userland program with a GPS receiver as PPS source to obtain a wallclock-time with sub-millisecond synchronisation to UTC. To obtain this goal the userland programs shoud use the PPS API specification (RFC 2783 - Pulse-Per-Second API for UNIX-like Operating Systems, Version 1.0) which in part is implemented by this patch. It provides a set of chars devices, one per PPS source, which can be used to get the time signal. The RFC's functions can be implemented by accessing to these char devices. Signed-off-by: Rodolfo Giometti Cc: David Woodhouse Cc: Greg KH Cc: Randy Dunlap Cc: Kay Sievers Acked-by: Alan Cox Cc: Michael Kerrisk Cc: Christoph Hellwig Cc: Roman Zippel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/ABI/testing/sysfs-pps | 73 +++++++++++++++ Documentation/ioctl/ioctl-number.txt | 2 + Documentation/pps/pps.txt | 172 +++++++++++++++++++++++++++++++++++ 3 files changed, 247 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-pps create mode 100644 Documentation/pps/pps.txt (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-pps b/Documentation/ABI/testing/sysfs-pps new file mode 100644 index 0000000..25028c7 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-pps @@ -0,0 +1,73 @@ +What: /sys/class/pps/ +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ directory will contain files and + directories that will provide a unified interface to + the PPS sources. + +What: /sys/class/pps/ppsX/ +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/ directory is related to X-th + PPS source into the system. Each directory will + contain files to manage and control its PPS source. + +What: /sys/class/pps/ppsX/assert +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/assert file reports the assert events + and the assert sequence number of the X-th source in the form: + + .# + + If the source has no assert events the content of this file + is empty. + +What: /sys/class/pps/ppsX/clear +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/clear file reports the clear events + and the clear sequence number of the X-th source in the form: + + .# + + If the source has no clear events the content of this file + is empty. + +What: /sys/class/pps/ppsX/mode +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/mode file reports the functioning + mode of the X-th source in hexadecimal encoding. + + Please, refer to linux/include/linux/pps.h for further + info. + +What: /sys/class/pps/ppsX/echo +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/echo file reports if the X-th does + or does not support an "echo" function. + +What: /sys/class/pps/ppsX/name +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/name file reports the name of the + X-th source. + +What: /sys/class/pps/ppsX/path +Date: February 2008 +Contact: Rodolfo Giometti +Description: + The /sys/class/pps/ppsX/path file reports the path name of + the device connected with the X-th source. + + If the source is not connected with any device the content + of this file is empty. diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 1f779a2..7bb0d93 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt @@ -149,6 +149,8 @@ Code Seq# Include File Comments 'p' 40-7F linux/nvram.h 'p' 80-9F user-space parport +'p' a1-a4 linux/pps.h LinuxPPS + 'q' 00-1F linux/serio.h 'q' 80-FF Internet PhoneJACK, Internet LineJACK diff --git a/Documentation/pps/pps.txt b/Documentation/pps/pps.txt new file mode 100644 index 0000000..125f4ab --- /dev/null +++ b/Documentation/pps/pps.txt @@ -0,0 +1,172 @@ + + PPS - Pulse Per Second + ---------------------- + +(C) Copyright 2007 Rodolfo Giometti + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + + + +Overview +-------- + +LinuxPPS provides a programming interface (API) to define in the +system several PPS sources. + +PPS means "pulse per second" and a PPS source is just a device which +provides a high precision signal each second so that an application +can use it to adjust system clock time. + +A PPS source can be connected to a serial port (usually to the Data +Carrier Detect pin) or to a parallel port (ACK-pin) or to a special +CPU's GPIOs (this is the common case in embedded systems) but in each +case when a new pulse arrives the system must apply to it a timestamp +and record it for userland. + +Common use is the combination of the NTPD as userland program, with a +GPS receiver as PPS source, to obtain a wallclock-time with +sub-millisecond synchronisation to UTC. + + +RFC considerations +------------------ + +While implementing a PPS API as RFC 2783 defines and using an embedded +CPU GPIO-Pin as physical link to the signal, I encountered a deeper +problem: + + At startup it needs a file descriptor as argument for the function + time_pps_create(). + +This implies that the source has a /dev/... entry. This assumption is +ok for the serial and parallel port, where you can do something +useful besides(!) the gathering of timestamps as it is the central +task for a PPS-API. But this assumption does not work for a single +purpose GPIO line. In this case even basic file-related functionality +(like read() and write()) makes no sense at all and should not be a +precondition for the use of a PPS-API. + +The problem can be simply solved if you consider that a PPS source is +not always connected with a GPS data source. + +So your programs should check if the GPS data source (the serial port +for instance) is a PPS source too, and if not they should provide the +possibility to open another device as PPS source. + +In LinuxPPS the PPS sources are simply char devices usually mapped +into files /dev/pps0, /dev/pps1, etc.. + + +Coding example +-------------- + +To register a PPS source into the kernel you should define a struct +pps_source_info_s as follows: + + static struct pps_source_info pps_ktimer_info = { + .name = "ktimer", + .path = "", + .mode = PPS_CAPTUREASSERT | PPS_OFFSETASSERT | \ + PPS_ECHOASSERT | \ + PPS_CANWAIT | PPS_TSFMT_TSPEC, + .echo = pps_ktimer_echo, + .owner = THIS_MODULE, + }; + +and then calling the function pps_register_source() in your +intialization routine as follows: + + source = pps_register_source(&pps_ktimer_info, + PPS_CAPTUREASSERT | PPS_OFFSETASSERT); + +The pps_register_source() prototype is: + + int pps_register_source(struct pps_source_info_s *info, int default_params) + +where "info" is a pointer to a structure that describes a particular +PPS source, "default_params" tells the system what the initial default +parameters for the device should be (it is obvious that these parameters +must be a subset of ones defined in the struct +pps_source_info_s which describe the capabilities of the driver). + +Once you have registered a new PPS source into the system you can +signal an assert event (for example in the interrupt handler routine) +just using: + + pps_event(source, &ts, PPS_CAPTUREASSERT, ptr) + +where "ts" is the event's timestamp. + +The same function may also run the defined echo function +(pps_ktimer_echo(), passing to it the "ptr" pointer) if the user +asked for that... etc.. + +Please see the file drivers/pps/clients/ktimer.c for example code. + + +SYSFS support +------------- + +If the SYSFS filesystem is enabled in the kernel it provides a new class: + + $ ls /sys/class/pps/ + pps0/ pps1/ pps2/ + +Every directory is the ID of a PPS sources defined in the system and +inside you find several files: + + $ ls /sys/class/pps/pps0/ + assert clear echo mode name path subsystem@ uevent + +Inside each "assert" and "clear" file you can find the timestamp and a +sequence number: + + $ cat /sys/class/pps/pps0/assert + 1170026870.983207967#8 + +Where before the "#" is the timestamp in seconds; after it is the +sequence number. Other files are: + +* echo: reports if the PPS source has an echo function or not; + +* mode: reports available PPS functioning modes; + +* name: reports the PPS source's name; + +* path: reports the PPS source's device path, that is the device the + PPS source is connected to (if it exists). + + +Testing the PPS support +----------------------- + +In order to test the PPS support even without specific hardware you can use +the ktimer driver (see the client subsection in the PPS configuration menu) +and the userland tools provided into Documentaion/pps/ directory. + +Once you have enabled the compilation of ktimer just modprobe it (if +not statically compiled): + + # modprobe ktimer + +and the run ppstest as follow: + + $ ./ppstest /dev/pps0 + trying PPS source "/dev/pps1" + found PPS source "/dev/pps1" + ok, found 1 source(s), now start fetching data... + source 0 - assert 1186592699.388832443, sequence: 364 - clear 0.000000000, sequence: 0 + source 0 - assert 1186592700.388931295, sequence: 365 - clear 0.000000000, sequence: 0 + source 0 - assert 1186592701.389032765, sequence: 366 - clear 0.000000000, sequence: 0 + +Please, note that to compile userland programs you need the file timepps.h +(see Documentation/pps/). -- cgit v1.1 From 90c699a9ee4be165966d40f1837909ccb8890a68 Mon Sep 17 00:00:00 2001 From: Bartlomiej Zolnierkiewicz Date: Fri, 19 Jun 2009 08:08:50 +0200 Subject: block: rename CONFIG_LBD to CONFIG_LBDAF Follow-up to "block: enable by default support for large devices and files on 32-bit archs". Rename CONFIG_LBD to CONFIG_LBDAF to: - allow update of existing [def]configs for "default y" change - reflect that it is used also for large files support nowadays Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Jens Axboe --- Documentation/SubmitChecklist | 2 +- Documentation/ja_JP/SubmitChecklist | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist index ac5e0b2..78a9168 100644 --- a/Documentation/SubmitChecklist +++ b/Documentation/SubmitChecklist @@ -54,7 +54,7 @@ kernel patches. CONFIG_PREEMPT. 14: If the patch affects IO/Disk, etc: has been tested with and without - CONFIG_LBD. + CONFIG_LBDAF. 15: All codepaths have been exercised with all lockdep features enabled. diff --git a/Documentation/ja_JP/SubmitChecklist b/Documentation/ja_JP/SubmitChecklist index 6c42e07..2df4576 100644 --- a/Documentation/ja_JP/SubmitChecklist +++ b/Documentation/ja_JP/SubmitChecklist @@ -75,7 +75,7 @@ Linux カーネルパッチ投稿者向けチェックリスト ビルドした上、動作確認を行ってください。 14: もしパッチがディスクのI/O性能などに影響を与えるようであれば、 - 'CONFIG_LBD'オプションを有効にした場合と無効にした場合の両方で + 'CONFIG_LBDAF'オプションを有効にした場合と無効にした場合の両方で テストを実施してみてください。 15: lockdepの機能を全て有効にした上で、全てのコードパスを評価してください。 -- cgit v1.1 From 352da9820e5506e3b8496e6052a2ad9c488efae8 Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Fri, 19 Jun 2009 16:58:17 +0200 Subject: i2c: Kill client_register and client_unregister methods These methods were useful in the legacy binding model but no longer in the new (standard) binding model. There are no users left so we can drop them. Signed-off-by: Jean Delvare Cc: David Brownell --- Documentation/feature-removal-schedule.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 8d07ed3..9163dad 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -368,8 +368,7 @@ Who: Krzysztof Piotr Oledzki --------------------------- -What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client(), - i2c_adapter->client_register(), i2c_adapter->client_unregister +What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client() When: 2.6.30 Check: i2c_attach_client i2c_detach_client Why: Deprecated by the new (standard) device driver binding model. Use -- cgit v1.1 From 729d6dd571464954f625e6b80950d9e4e3bd94f7 Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Fri, 19 Jun 2009 16:58:18 +0200 Subject: i2c: Get rid of the legacy binding model We converted all the legacy i2c drivers so we can finally get rid of the legacy binding model. Hooray! Signed-off-by: Jean Delvare Cc: David Brownell --- Documentation/feature-removal-schedule.txt | 9 --------- Documentation/i2c/writing-clients | 16 +++------------- 2 files changed, 3 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 9163dad..f8cd450 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -368,15 +368,6 @@ Who: Krzysztof Piotr Oledzki --------------------------- -What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client() -When: 2.6.30 -Check: i2c_attach_client i2c_detach_client -Why: Deprecated by the new (standard) device driver binding model. Use - i2c_driver->probe() and ->remove() instead. -Who: Jean Delvare - ---------------------------- - What: fscher and fscpos drivers When: June 2009 Why: Deprecated by the new fschmd driver. diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index c1a06f9..7860aaf 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -126,19 +126,9 @@ different) configuration information, as do drivers handling chip variants that can't be distinguished by protocol probing, or which need some board specific information to operate correctly. -Accordingly, the I2C stack now has two models for associating I2C devices -with their drivers: the original "legacy" model, and a newer one that's -fully compatible with the Linux 2.6 driver model. These models do not mix, -since the "legacy" model requires drivers to create "i2c_client" device -objects after SMBus style probing, while the Linux driver model expects -drivers to be given such device objects in their probe() routines. -The legacy model is deprecated now and will soon be removed, so we no -longer document it here. - - -Standard Driver Model Binding ("New Style") -------------------------------------------- +Device/Driver Binding +--------------------- System infrastructure, typically board-specific initialization code or boot firmware, reports what I2C devices exist. For example, there may be @@ -201,7 +191,7 @@ a given I2C bus. This is for example the case of hardware monitoring devices on a PC's SMBus. In that case, you may want to let your driver detect supported devices automatically. This is how the legacy model was working, and is now available as an extension to the standard -driver model (so that we can finally get rid of the legacy model.) +driver model. You simply have to define a detect callback which will attempt to identify supported devices (returning 0 for supported ones and -ENODEV -- cgit v1.1 From 99cd8e25875a109455b709b5a41d4891b8d8e58e Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Fri, 19 Jun 2009 16:58:20 +0200 Subject: i2c: Add a sysfs interface to instantiate devices Add a sysfs interface to instantiate and delete I2C devices. This is primarily a replacement of the force_* module parameters implemented by some i2c drivers. These module parameters were implemented internally by the I2C_CLIENT_INSMOD* macros, which don't scale well. This can also be used when developing a driver on a self-soldered board which doesn't yet have proper I2C device declaration at the platform level, and presumably for various debugging situations. Signed-off-by: Jean Delvare Cc: David Brownell --- Documentation/i2c/instantiating-devices | 44 +++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) (limited to 'Documentation') diff --git a/Documentation/i2c/instantiating-devices b/Documentation/i2c/instantiating-devices index b55ce57..c740b7b 100644 --- a/Documentation/i2c/instantiating-devices +++ b/Documentation/i2c/instantiating-devices @@ -165,3 +165,47 @@ was done there. Two significant differences are: Once again, method 3 should be avoided wherever possible. Explicit device instantiation (methods 1 and 2) is much preferred for it is safer and faster. + + +Method 4: Instantiate from user-space +------------------------------------- + +In general, the kernel should know which I2C devices are connected and +what addresses they live at. However, in certain cases, it does not, so a +sysfs interface was added to let the user provide the information. This +interface is made of 2 attribute files which are created in every I2C bus +directory: new_device and delete_device. Both files are write only and you +must write the right parameters to them in order to properly instantiate, +respectively delete, an I2C device. + +File new_device takes 2 parameters: the name of the I2C device (a string) +and the address of the I2C device (a number, typically expressed in +hexadecimal starting with 0x, but can also be expressed in decimal.) + +File delete_device takes a single parameter: the address of the I2C +device. As no two devices can live at the same address on a given I2C +segment, the address is sufficient to uniquely identify the device to be +deleted. + +Example: +# echo eeprom 0x50 > /sys/class/i2c-adapter/i2c-3/new_device + +While this interface should only be used when in-kernel device declaration +can't be done, there is a variety of cases where it can be helpful: +* The I2C driver usually detects devices (method 3 above) but the bus + segment your device lives on doesn't have the proper class bit set and + thus detection doesn't trigger. +* The I2C driver usually detects devices, but your device lives at an + unexpected address. +* The I2C driver usually detects devices, but your device is not detected, + either because the detection routine is too strict, or because your + device is not officially supported yet but you know it is compatible. +* You are developing a driver on a test board, where you soldered the I2C + device yourself. + +This interface is a replacement for the force_* module parameters some I2C +drivers implement. Being implemented in i2c-core rather than in each +device driver individually, it is much more efficient, and also has the +advantage that you do not have to reload the driver to change a setting. +You can also instantiate the device before the driver is loaded or even +available, and you don't need to know what driver the device needs. -- cgit v1.1 From 464902e812025792c9e33e19e1555c343672d5cf Mon Sep 17 00:00:00 2001 From: Alan Jenkins Date: Tue, 16 Jun 2009 14:54:04 +0100 Subject: rfkill: export persistent attribute in sysfs This information allows userspace to implement a hybrid policy where it can store the rfkill soft-blocked state in platform non-volatile storage if available, and if not then file-based storage can be used. Some users prefer platform non-volatile storage because of the behaviour when dual-booting multiple versions of Linux, or if the rfkill setting is changed in the BIOS setting screens, or if the BIOS responds to wireless-toggle hotkeys itself before the relevant platform driver has been loaded. Signed-off-by: Alan Jenkins Acked-by: Henrique de Moraes Holschuh Signed-off-by: John W. Linville --- Documentation/rfkill.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index c8acd86..b486050 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -111,6 +111,8 @@ following attributes: name: Name assigned by driver to this key (interface or driver name). type: Driver type string ("wlan", "bluetooth", etc). + persistent: Whether the soft blocked state is initialised from + non-volatile storage at startup. state: Current state of the transmitter 0: RFKILL_STATE_SOFT_BLOCKED transmitter is turned off by software -- cgit v1.1 From 952e57ba3769d6fc6139b8a99c32ea2bb63f23e9 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 20 Jun 2009 01:10:38 -0700 Subject: isdn: clean up documentation index Remove duplicates, a stray merge conflict marker, and an entry for a file which doesn't exist, and move one entry to its correct alphabetical place. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller --- Documentation/isdn/00-INDEX | 19 ++----------------- 1 file changed, 2 insertions(+), 17 deletions(-) (limited to 'Documentation') diff --git a/Documentation/isdn/00-INDEX b/Documentation/isdn/00-INDEX index f6010a5..e87e336 100644 --- a/Documentation/isdn/00-INDEX +++ b/Documentation/isdn/00-INDEX @@ -14,25 +14,14 @@ README - general info on what you need and what to do for Linux ISDN. README.FAQ - general info for FAQ. -README.audio - - info for running audio over ISDN. -README.fax - - info for using Fax over ISDN. -README.gigaset - - info on the drivers for Siemens Gigaset ISDN adapters. -README.icn - - info on the ICN-ISDN-card and its driver. ->>>>>>> 93af7aca44f0e82e67bda10a0fb73d383edcc8bd:Documentation/isdn/00-INDEX README.HiSax - info on the HiSax driver which replaces the old teles. +README.act2000 + - info on driver for IBM ACT-2000 card. README.audio - info for running audio over ISDN. README.avmb1 - info on driver for AVM-B1 ISDN card. -README.act2000 - - info on driver for IBM ACT-2000 card. -README.eicon - - info on driver for Eicon active cards. README.concap - info on "CONCAP" encapsulation protocol interface used for X.25. README.diversion @@ -59,7 +48,3 @@ README.x25 - info for running X.25 over ISDN. syncPPP.FAQ - frequently asked questions about running PPP over ISDN. -README.hysdn - - info on driver for Hypercope active HYSDN cards -README.mISDN - - info on the Modular ISDN subsystem (mISDN). -- cgit v1.1 From b1a914690c581f8f88b897d83a79b1c6eaf494c9 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 21 Jun 2009 10:56:44 +0200 Subject: ALSA: hda - Add model=6530g option Add the new model string corresponding to the previous Acer Aspire 6530G support. Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index de8e10a..0d8d235 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -139,6 +139,7 @@ ALC883/888 acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc) acer-aspire Acer Aspire 9810 acer-aspire-4930g Acer Aspire 4930G + acer-aspire-6530g Acer Aspire 6530G acer-aspire-8930g Acer Aspire 8930G medion Medion Laptops medion-md2 Medion MD2 -- cgit v1.1 From 5fe23c7f51def59f66bc6aeee988ef1a467a2c8c Mon Sep 17 00:00:00 2001 From: Anton Vorontsov Date: Thu, 18 Jun 2009 00:14:08 +0400 Subject: sdhci: Add support for hosts that are only capable of 1-bit transfers Some hosts (hardware configurations, or particular SD/MMC slots) may not support 4-bit bus. For example, on MPC8569E-MDS boards we can switch between serial (1-bit only) and nibble (4-bit) modes, thought we have to disable more peripherals to work in 4-bit mode. Along with some small core changes, this patch modifies sdhci-of driver, so that now it looks for "sdhci,1-bit-only" property in the device-tree, and if specified we enable a proper quirk. Signed-off-by: Anton Vorontsov Acked-by: Grant Likely Signed-off-by: Pierre Ossman --- Documentation/powerpc/dts-bindings/fsl/esdhc.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt index 5093ddf..3ed3797 100644 --- a/Documentation/powerpc/dts-bindings/fsl/esdhc.txt +++ b/Documentation/powerpc/dts-bindings/fsl/esdhc.txt @@ -10,6 +10,8 @@ Required properties: - interrupts : should contain eSDHC interrupt. - interrupt-parent : interrupt source phandle. - clock-frequency : specifies eSDHC base clock frequency. + - sdhci,1-bit-only : (optional) specifies that a controller can + only handle 1-bit data transfers. Example: -- cgit v1.1 From fd5e033908b7b743b5650790f196761dd930f988 Mon Sep 17 00:00:00 2001 From: Kiyoshi Ueda Date: Mon, 22 Jun 2009 10:12:27 +0100 Subject: dm mpath: add queue length load balancer This patch adds a dynamic load balancer, dm-queue-length, which balances the number of in-flight I/Os across the paths. The code is based on the patch posted by Stefan Bader: https://www.redhat.com/archives/dm-devel/2005-October/msg00050.html Signed-off-by: Stefan Bader Signed-off-by: Kiyoshi Ueda Signed-off-by: Jun'ichi Nomura Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-queue-length.txt | 39 +++++++++++++++++++++++++ 1 file changed, 39 insertions(+) create mode 100644 Documentation/device-mapper/dm-queue-length.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-queue-length.txt b/Documentation/device-mapper/dm-queue-length.txt new file mode 100644 index 0000000..f4db256 --- /dev/null +++ b/Documentation/device-mapper/dm-queue-length.txt @@ -0,0 +1,39 @@ +dm-queue-length +=============== + +dm-queue-length is a path selector module for device-mapper targets, +which selects a path with the least number of in-flight I/Os. +The path selector name is 'queue-length'. + +Table parameters for each path: [] + : The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + +Status for each path: + : 'A' if the path is active, 'F' if the path is failed. + : The number of path failures. + : The number of in-flight I/Os on the path. + + +Algorithm +========= + +dm-queue-length increments/decrements 'in-flight' when an I/O is +dispatched/completed respectively. +dm-queue-length selects a path with the minimum 'in-flight'. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128. + +# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0 -- cgit v1.1 From f392ba889b019602976082bfe7bf486c2594f85c Mon Sep 17 00:00:00 2001 From: Kiyoshi Ueda Date: Mon, 22 Jun 2009 10:12:28 +0100 Subject: dm mpath: add service time load balancer This patch adds a service time oriented dynamic load balancer, dm-service-time, which selects the path with the shortest estimated service time for the incoming I/O. The service time is estimated by dividing the in-flight I/O size by a performance value of each path. The performance value can be given as a table argument at the table loading time. If no performance value is given, all paths are considered equal. Signed-off-by: Kiyoshi Ueda Signed-off-by: Jun'ichi Nomura Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-service-time.txt | 91 +++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100644 Documentation/device-mapper/dm-service-time.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-service-time.txt b/Documentation/device-mapper/dm-service-time.txt new file mode 100644 index 0000000..7d00668 --- /dev/null +++ b/Documentation/device-mapper/dm-service-time.txt @@ -0,0 +1,91 @@ +dm-service-time +=============== + +dm-service-time is a path selector module for device-mapper targets, +which selects a path with the shortest estimated service time for +the incoming I/O. + +The service time for each path is estimated by dividing the total size +of in-flight I/Os on a path with the performance value of the path. +The performance value is a relative throughput value among all paths +in a path-group, and it can be specified as a table argument. + +The path selector name is 'service-time'. + +Table parameters for each path: [ []] + : The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + : The relative throughput value of the path + among all paths in the path-group. + The valid range is 0-100. + If not given, minimum value '1' is used. + If '0' is given, the path isn't selected while + other paths having a positive value are available. + +Status for each path: \ + + : 'A' if the path is active, 'F' if the path is failed. + : The number of path failures. + : The size of in-flight I/Os on the path. + : The relative throughput value of the path + among all paths in the path-group. + + +Algorithm +========= + +dm-service-time adds the I/O size to 'in-flight-size' when the I/O is +dispatched and substracts when completed. +Basically, dm-service-time selects a path having minimum service time +which is calculated by: + + ('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput' + +However, some optimizations below are used to reduce the calculation +as much as possible. + + 1. If the paths have the same 'relative_throughput', skip + the division and just compare the 'in-flight-size'. + + 2. If the paths have the same 'in-flight-size', skip the division + and just compare the 'relative_throughput'. + + 3. If some paths have non-zero 'relative_throughput' and others + have zero 'relative_throughput', ignore those paths with zero + 'relative_throughput'. + +If such optimizations can't be applied, calculate service time, and +compare service time. +If calculated service time is equal, the path having maximum +'relative_throughput' may be better. So compare 'relative_throughput' +then. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128 +and sda has an average throughput 1GB/s and sdb has 4GB/s, +'relative_throughput' value may be '1' for sda and '4' for sdb. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4 + + +Or '2' for sda and '8' for sdb would be also true. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8 -- cgit v1.1 From f5db4af466e2dca0fe822019812d586ca910b00c Mon Sep 17 00:00:00 2001 From: Jonthan Brassow Date: Mon, 22 Jun 2009 10:12:35 +0100 Subject: dm raid1: add userspace log This patch contains a device-mapper mirror log module that forwards requests to userspace for processing. The structures used for communication between kernel and userspace are located in include/linux/dm-log-userspace.h. Due to the frequency, diversity, and 2-way communication nature of the exchanges between kernel and userspace, 'connector' was chosen as the interface for communication. The first log implementations written in userspace - "clustered-disk" and "clustered-core" - support clustered shared storage. A userspace daemon (in the LVM2 source code repository) uses openAIS/corosync to process requests in an ordered fashion with the rest of the nodes in the cluster so as to prevent log state corruption. Other implementations with no association to LVM or openAIS/corosync, are certainly possible. (Imagine if two machines are writing to the same region of a mirror. They would both mark the region dirty, but you need a cluster-aware entity that can handle properly marking the region clean when they are done. Otherwise, you might clear the region when the first machine is done, not the second.) Signed-off-by: Jonathan Brassow Cc: Evgeniy Polyakov Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-log.txt | 54 ++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) create mode 100644 Documentation/device-mapper/dm-log.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-log.txt b/Documentation/device-mapper/dm-log.txt new file mode 100644 index 0000000..994dd75 --- /dev/null +++ b/Documentation/device-mapper/dm-log.txt @@ -0,0 +1,54 @@ +Device-Mapper Logging +===================== +The device-mapper logging code is used by some of the device-mapper +RAID targets to track regions of the disk that are not consistent. +A region (or portion of the address space) of the disk may be +inconsistent because a RAID stripe is currently being operated on or +a machine died while the region was being altered. In the case of +mirrors, a region would be considered dirty/inconsistent while you +are writing to it because the writes need to be replicated for all +the legs of the mirror and may not reach the legs at the same time. +Once all writes are complete, the region is considered clean again. + +There is a generic logging interface that the device-mapper RAID +implementations use to perform logging operations (see +dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different +logging implementations are available and provide different +capabilities. The list includes: + +Type Files +==== ===== +disk drivers/md/dm-log.c +core drivers/md/dm-log.c +userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h + +The "disk" log type +------------------- +This log implementation commits the log state to disk. This way, the +logging state survives reboots/crashes. + +The "core" log type +------------------- +This log implementation keeps the log state in memory. The log state +will not survive a reboot or crash, but there may be a small boost in +performance. This method can also be used if no storage device is +available for storing log state. + +The "userspace" log type +------------------------ +This log type simply provides a way to export the log API to userspace, +so log implementations can be done there. This is done by forwarding most +logging requests to userspace, where a daemon receives and processes the +request. + +The structure used for communication between kernel and userspace are +located in include/linux/dm-log-userspace.h. Due to the frequency, +diversity, and 2-way communication nature of the exchanges between +kernel and userspace, 'connector' is used as the interface for +communication. + +There are currently two userspace log implementations that leverage this +framework - "clustered_disk" and "clustered_core". These implementations +provide a cluster-coherent log for shared-storage. Device-mapper mirroring +can be used in a shared-storage environment when the cluster log implementations +are employed. -- cgit v1.1 From 14422f9dd8515bfbe6fdbde37eadf59e2980f104 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Tue, 16 Jun 2009 23:55:44 -0300 Subject: V4L/DVB (12010): cx88: Properly support Leadtek TV2000 XP Global Fix Leadtek TV2000 XP Global entries and add missing PCI ID's. Thanks to Terry Wu for pointing us for the proper settings. Cc: Terry Wu Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.cx88 | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 89093f5..0736518 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -6,8 +6,8 @@ 5 -> Leadtek Winfast 2000XP Expert [107d:6611,107d:6613] 6 -> AverTV Studio 303 (M126) [1461:000b] 7 -> MSI TV-@nywhere Master [1462:8606] - 8 -> Leadtek Winfast DV2000 [107d:6620] - 9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632] + 8 -> Leadtek Winfast DV2000 [107d:6620,107d:6621] + 9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632,107d:6630,107d:6638,107d:6631,107d:6637,107d:663d] 10 -> IODATA GV-VCP3/PCI [10fc:d003] 11 -> Prolink PlayTV PVR 12 -> ASUS PVR-416 [1043:4823,1461:c111] @@ -59,7 +59,7 @@ 58 -> Pinnacle PCTV HD 800i [11bd:0051] 59 -> DViCO FusionHDTV 5 PCI nano [18ac:d530] 60 -> Pinnacle Hybrid PCTV [12ab:1788] - 61 -> Winfast TV2000 XP Global [107d:6f18] + 61 -> Leadtek TV2000 XP Global [107d:6f18,107d:6618] 62 -> PowerColor RA330 [14f1:ea3d] 63 -> Geniatech X8000-MT DVBT [14f1:8852] 64 -> DViCO FusionHDTV DVB-T PRO [18ac:db30] -- cgit v1.1 From 19859229d7d98bc2d582ff45045dd7f73d649383 Mon Sep 17 00:00:00 2001 From: Devin Heitmueller Date: Fri, 19 Jun 2009 00:33:54 -0300 Subject: V4L/DVB (12101): em28xx: add support for EVGA inDtube Add support for the EVGA inDtube. Both ATSC and analog side validated as fully functional. Thanks to Jake Crimmins from EVGA for providing the correct GPIO info. Thanks to Alan Hagge for doing all the device testing. Thanks to Greg Williamson for providing hardware for testing. Cc: Jake Crimmins Cc: Alan Hagge Cc: Greg Williamson Signed-off-by: Devin Heitmueller Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index a98a688..873630e 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -65,3 +65,4 @@ 67 -> Terratec Grabby (em2860) [0ccd:0096] 68 -> Terratec AV350 (em2860) [0ccd:0084] 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313] + 70 -> Evga inDtube (em2882) -- cgit v1.1 From 2c0b19ac3b73199fe7b3fbff884051046554c048 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Tue, 9 Jun 2009 17:29:29 -0300 Subject: V4L/DVB (12128): v4l2: update framework documentation. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/v4l2-framework.txt | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/v4l2-framework.txt b/Documentation/video4linux/v4l2-framework.txt index d54c1e4..ba4706a 100644 --- a/Documentation/video4linux/v4l2-framework.txt +++ b/Documentation/video4linux/v4l2-framework.txt @@ -390,6 +390,30 @@ later date. It differs between i2c drivers and as such can be confusing. To see which chip variants are supported you can look in the i2c driver code for the i2c_device_id table. This lists all the possibilities. +There are two more helper functions: + +v4l2_i2c_new_subdev_cfg: this function adds new irq and platform_data +arguments and has both 'addr' and 'probed_addrs' arguments: if addr is not +0 then that will be used (non-probing variant), otherwise the probed_addrs +are probed. + +For example: this will probe for address 0x10: + +struct v4l2_subdev *sd = v4l2_i2c_new_subdev_cfg(v4l2_dev, adapter, + "module_foo", "chipid", 0, NULL, 0, I2C_ADDRS(0x10)); + +v4l2_i2c_new_subdev_board uses an i2c_board_info struct which is passed +to the i2c driver and replaces the irq, platform_data and addr arguments. + +If the subdev supports the s_config core ops, then that op is called with +the irq and platform_data arguments after the subdev was setup. The older +v4l2_i2c_new_(probed_)subdev functions will call s_config as well, but with +irq set to 0 and platform_data set to NULL. + +Note that in the next kernel release the functions v4l2_i2c_new_subdev, +v4l2_i2c_new_probed_subdev and v4l2_i2c_new_probed_subdev_addr will all be +replaced by a single v4l2_i2c_new_subdev that is identical to +v4l2_i2c_new_subdev_cfg but without the irq and platform_data arguments. struct video_device ------------------- -- cgit v1.1 From 44df75353bc8f32e26e049284053a61d4f1047d6 Mon Sep 17 00:00:00 2001 From: Tom Mingarelli Date: Thu, 18 Jun 2009 23:28:57 +0000 Subject: [WATCHDOG] hpwdt: Add NMI priority option Add a priority option so that the user can choose if we do the NMI first or last. Signed-off-by: Thomas Mingarelli Signed-off-by: Wim Van Sebroeck --- Documentation/watchdog/hpwdt.txt | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/watchdog/hpwdt.txt b/Documentation/watchdog/hpwdt.txt index 127839e..9c24d5f 100644 --- a/Documentation/watchdog/hpwdt.txt +++ b/Documentation/watchdog/hpwdt.txt @@ -19,30 +19,41 @@ Last reviewed: 06/02/2009 not be updated in a timely fashion and a hardware system reset (also known as an Automatic Server Recovery (ASR)) event will occur. - The hpwdt driver also has three (3) module parameters. They are the following: + The hpwdt driver also has four (4) module parameters. They are the following: soft_margin - allows the user to set the watchdog timer value allow_kdump - allows the user to save off a kernel dump image after an NMI nowayout - basic watchdog parameter that does not allow the timer to be restarted or an impending ASR to be escaped. + priority - determines whether or not the hpwdt driver is first on the + die_notify list to handle NMIs or last. The default value + for this module parameter is 0 or LAST. If the user wants to + enable NMI sourcing then reload the hpwdt driver with + priority=1 (and boot with nmi_watchdog=0). NOTE: More information about watchdog drivers in general, including the ioctl interface to /dev/watchdog can be found in Documentation/watchdog/watchdog-api.txt and Documentation/IPMI.txt. - The NMI sourcing capability is disabled when the driver discovers that the - nmi_watchdog is turned on (nmi_watchdog = 1). This is due to the inability to + The priority parameter was introduced due to other kernel software that relied + on handling NMIs (like oprofile). Keeping hpwdt's priority at 0 (or LAST) + enables the users of NMIs for non critical events to be work as expected. + + The NMI sourcing capability is disabled by default due to the inability to distinguish between "NMI Watchdog Ticks" and "HW generated NMI events" in the Linux kernel. What this means is that the hpwdt nmi handler code is called each time the NMI signal fires off. This could amount to several thousands of NMIs in a matter of seconds. If a user sees the Linux kernel's "dazed and confused" message in the logs or if the system gets into a hung state, then - the user should reboot with nmi_watchdog=0. + the hpwdt driver can be reloaded with the "priority" module parameter set + (priority=1). 1. If the kernel has not been booted with nmi_watchdog turned off then edit /boot/grub/menu.lst and place the nmi_watchdog=0 at the end of the currently booting kernel line. 2. reboot the sever + 3. Once the system comes up perform a rmmod hpwdt + 4. insmod /lib/modules/`uname -r`/kernel/drivers/char/watchdog/hpwdt.ko priority=1 Now, the hpwdt can successfully receive and source the NMI and provide a log message that details the reason for the NMI (as determined by the HP BIOS). -- cgit v1.1 From 7e325d3a6b117c7288bfc0755410e9d9d2b71326 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 19 Jun 2009 20:22:37 +0200 Subject: update Documentation/filesystems/Locking The rules for locking in many superblock operations has changed significantly, so update the documentation for it. Also correct some older updates and ommissions. Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 43 ++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 21 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 229d7b7..18b9d0c 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -109,27 +109,28 @@ prototypes: locking rules: All may block. - BKL s_lock s_umount -alloc_inode: no no no -destroy_inode: no -dirty_inode: no (must not sleep) -write_inode: no -drop_inode: no !!!inode_lock!!! -delete_inode: no -put_super: yes yes no -write_super: no yes read -sync_fs: no no read -freeze_fs: ? -unfreeze_fs: ? -statfs: no no no -remount_fs: yes yes maybe (see below) -clear_inode: no -umount_begin: yes no no -show_options: no (vfsmount->sem) -quota_read: no no no (see below) -quota_write: no no no (see below) - -->remount_fs() will have the s_umount lock if it's already mounted. + None have BKL + s_umount +alloc_inode: +destroy_inode: +dirty_inode: (must not sleep) +write_inode: +drop_inode: !!!inode_lock!!! +delete_inode: +put_super: write +write_super: read +sync_fs: read +freeze_fs: read +unfreeze_fs: read +statfs: no +remount_fs: maybe (see below) +clear_inode: +umount_begin: no +show_options: no (namespace_sem) +quota_read: no (see below) +quota_write: no (see below) + +->remount_fs() will have the s_umount exclusive lock if it's already mounted. When called from get_sb_single, it does NOT have the s_umount lock. ->quota_read() and ->quota_write() functions are both guaranteed to be the only ones operating on the quota file by the quota code (via -- cgit v1.1 From 236e946b53ffd5e2f5d7e6abebbe72a9f0826d15 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Wed, 24 Jun 2009 16:23:03 -0700 Subject: Revert "PCI: use ACPI _CRS data by default" This reverts commit 9e9f46c44e487af0a82eb61b624553e2f7118f5b. Quoting from the commit message: "At this point, it seems to solve more problems than it causes, so let's try using it by default. It's an easy revert if it ends up causing trouble." And guess what? The _CRS code causes trouble. Signed-off-by: Linus Torvalds --- Documentation/kernel-parameters.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 040fee6..d08759a 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1855,7 +1855,7 @@ and is between 256 and 4096 characters. It is defined in the file IRQ routing is enabled. noacpi [X86] Do not use ACPI for IRQ routing or for PCI scanning. - nocrs [X86] Don't use _CRS for PCI resource + use_crs [X86] Use _CRS for PCI resource allocation. routeirq Do IRQ routing for all PCI devices. This is normally done in pci_enable_device(), -- cgit v1.1