diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/feature-removal-schedule.txt | 7 | ||||
-rw-r--r-- | Documentation/md.txt | 72 | ||||
-rw-r--r-- | Documentation/spinlocks.txt | 184 | ||||
-rw-r--r-- | Documentation/sysctl/kernel.txt | 31 |
4 files changed, 183 insertions, 111 deletions
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 2a4d779..eb2c138 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -291,13 +291,6 @@ Who: Michael Buesch <mb@bu3sch.de> --------------------------- -What: usedac i386 kernel parameter -When: 2.6.27 -Why: replaced by allowdac and no dac combination -Who: Glauber Costa <gcosta@redhat.com> - ---------------------------- - What: print_fn_descriptor_symbol() When: October 2009 Why: The %pF vsprintf format provides the same functionality in a diff --git a/Documentation/md.txt b/Documentation/md.txt index 4edd39e..188f476 100644 --- a/Documentation/md.txt +++ b/Documentation/md.txt @@ -233,9 +233,9 @@ All md devices contain: resync_start The point at which resync should start. If no resync is needed, - this will be a very large number. At array creation it will - default to 0, though starting the array as 'clean' will - set it much larger. + this will be a very large number (or 'none' since 2.6.30-rc1). At + array creation it will default to 0, though starting the array as + 'clean' will set it much larger. new_dev This file can be written but not read. The value written should @@ -296,6 +296,51 @@ All md devices contain: active-idle like active, but no writes have been seen for a while (safe_mode_delay). + bitmap/location + This indicates where the write-intent bitmap for the array is + stored. + It can be one of "none", "file" or "[+-]N". + "file" may later be extended to "file:/file/name" + "[+-]N" means that many sectors from the start of the metadata. + This is replicated on all devices. For arrays with externally + managed metadata, the offset is from the beginning of the + device. + bitmap/chunksize + The size, in bytes, of the chunk which will be represented by a + single bit. For RAID456, it is a portion of an individual + device. For RAID10, it is a portion of the array. For RAID1, it + is both (they come to the same thing). + bitmap/time_base + The time, in seconds, between looking for bits in the bitmap to + be cleared. In the current implementation, a bit will be cleared + between 2 and 3 times "time_base" after all the covered blocks + are known to be in-sync. + bitmap/backlog + When write-mostly devices are active in a RAID1, write requests + to those devices proceed in the background - the filesystem (or + other user of the device) does not have to wait for them. + 'backlog' sets a limit on the number of concurrent background + writes. If there are more than this, new writes will by + synchronous. + bitmap/metadata + This can be either 'internal' or 'external'. + 'internal' is the default and means the metadata for the bitmap + is stored in the first 256 bytes of the allocated space and is + managed by the md module. + 'external' means that bitmap metadata is managed externally to + the kernel (i.e. by some userspace program) + bitmap/can_clear + This is either 'true' or 'false'. If 'true', then bits in the + bitmap will be cleared when the corresponding blocks are thought + to be in-sync. If 'false', bits will never be cleared. + This is automatically set to 'false' if a write happens on a + degraded array, or if the array becomes degraded during a write. + When metadata is managed externally, it should be set to true + once the array becomes non-degraded, and this fact has been + recorded in the metadata. + + + As component devices are added to an md array, they appear in the 'md' directory as new directories named @@ -334,8 +379,9 @@ Each directory contains: Writing "writemostly" sets the writemostly flag. Writing "-writemostly" clears the writemostly flag. Writing "blocked" sets the "blocked" flag. - Writing "-blocked" clear the "blocked" flag and allows writes + Writing "-blocked" clears the "blocked" flag and allows writes to complete. + Writing "in_sync" sets the in_sync flag. This file responds to select/poll. Any change to 'faulty' or 'blocked' causes an event. @@ -372,6 +418,24 @@ Each directory contains: array. If a value less than the current component_size is written, it will be rejected. + recovery_start + + When the device is not 'in_sync', this records the number of + sectors from the start of the device which are known to be + correct. This is normally zero, but during a recovery + operation is will steadily increase, and if the recovery is + interrupted, restoring this value can cause recovery to + avoid repeating the earlier blocks. With v1.x metadata, this + value is saved and restored automatically. + + This can be set whenever the device is not an active member of + the array, either before the array is activated, or before + the 'slot' is set. + + Setting this to 'none' is equivalent to setting 'in_sync'. + Setting to any other value also clears the 'in_sync' flag. + + An active md device will also contain and entry for each active device in the array. These are named diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt index 619699d..178c831 100644 --- a/Documentation/spinlocks.txt +++ b/Documentation/spinlocks.txt @@ -1,73 +1,8 @@ -SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED defeat lockdep state tracking and -are hence deprecated. +Lesson 1: Spin locks -Please use DEFINE_SPINLOCK()/DEFINE_RWLOCK() or -__SPIN_LOCK_UNLOCKED()/__RW_LOCK_UNLOCKED() as appropriate for static -initialization. - -Most of the time, you can simply turn: - - static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; - -into: - - static DEFINE_SPINLOCK(xxx_lock); - -Static structure member variables go from: - - struct foo bar { - .lock = SPIN_LOCK_UNLOCKED; - }; - -to: - - struct foo bar { - .lock = __SPIN_LOCK_UNLOCKED(bar.lock); - }; - -Declaration of static rw_locks undergo a similar transformation. - -Dynamic initialization, when necessary, may be performed as -demonstrated below. - - spinlock_t xxx_lock; - rwlock_t xxx_rw_lock; - - static int __init xxx_init(void) - { - spin_lock_init(&xxx_lock); - rwlock_init(&xxx_rw_lock); - ... - } - - module_init(xxx_init); - -The following discussion is still valid, however, with the dynamic -initialization of spinlocks or with DEFINE_SPINLOCK, etc., used -instead of SPIN_LOCK_UNLOCKED. - ------------------------ - -On Fri, 2 Jan 1998, Doug Ledford wrote: -> -> I'm working on making the aic7xxx driver more SMP friendly (as well as -> importing the latest FreeBSD sequencer code to have 7895 support) and wanted -> to get some info from you. The goal here is to make the various routines -> SMP safe as well as UP safe during interrupts and other manipulating -> routines. So far, I've added a spin_lock variable to things like my queue -> structs. Now, from what I recall, there are some spin lock functions I can -> use to lock these spin locks from other use as opposed to a (nasty) -> save_flags(); cli(); stuff; restore_flags(); construct. Where do I find -> these routines and go about making use of them? Do they only lock on a -> per-processor basis or can they also lock say an interrupt routine from -> mucking with a queue if the queue routine was manipulating it when the -> interrupt occurred, or should I still use a cli(); based construct on that -> one? - -See <asm/spinlock.h>. The basic version is: - - spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; +The most basic primitive for locking is spinlock. +static DEFINE_SPINLOCK(xxx_lock); unsigned long flags; @@ -75,13 +10,11 @@ See <asm/spinlock.h>. The basic version is: ... critical section here .. spin_unlock_irqrestore(&xxx_lock, flags); -and the above is always safe. It will disable interrupts _locally_, but the +The above is always safe. It will disable interrupts _locally_, but the spinlock itself will guarantee the global lock, so it will guarantee that there is only one thread-of-control within the region(s) protected by that -lock. - -Note that it works well even under UP - the above sequence under UP -essentially is just the same as doing a +lock. This works well even under UP. The above sequence under UP +essentially is just the same as doing unsigned long flags; @@ -91,15 +24,13 @@ essentially is just the same as doing a so the code does _not_ need to worry about UP vs SMP issues: the spinlocks work correctly under both (and spinlocks are actually more efficient on -architectures that allow doing the "save_flags + cli" in one go because I -don't export that interface normally). +architectures that allow doing the "save_flags + cli" in one operation). + + NOTE! Implications of spin_locks for memory are further described in: -NOTE NOTE NOTE! The reason the spinlock is so much faster than a global -interrupt lock under SMP is exactly because it disables interrupts only on -the local CPU. The spin-lock is safe only when you _also_ use the lock -itself to do locking across CPU's, which implies that EVERYTHING that -touches a shared variable has to agree about the spinlock they want to -use. + Documentation/memory-barriers.txt + (5) LOCK operations. + (6) UNLOCK operations. The above is usually pretty simple (you usually need and want only one spinlock for most things - using more than one spinlock can make things a @@ -120,20 +51,24 @@ and another sequence that does then they are NOT mutually exclusive, and the critical regions can happen at the same time on two different CPU's. That's fine per se, but the critical regions had better be critical for different things (ie they -can't stomp on each other). +can't stomp on each other). The above is a problem mainly if you end up mixing code - for example the routines in ll_rw_block() tend to use cli/sti to protect the atomicity of their actions, and if a driver uses spinlocks instead then you should -think about issues like the above.. +think about issues like the above. This is really the only really hard part about spinlocks: once you start using spinlocks they tend to expand to areas you might not have noticed before, because you have to make sure the spinlocks correctly protect the shared data structures _everywhere_ they are used. The spinlocks are most -easily added to places that are completely independent of other code (ie -internal driver data structures that nobody else ever touches, for -example). +easily added to places that are completely independent of other code (for +example, internal driver data structures that nobody else ever touches). + + NOTE! The spin-lock is safe only when you _also_ use the lock itself + to do locking across CPU's, which implies that EVERYTHING that + touches a shared variable has to agree about the spinlock they want + to use. ---- @@ -141,13 +76,17 @@ Lesson 2: reader-writer spinlocks. If your data accesses have a very natural pattern where you usually tend to mostly read from the shared variables, the reader-writer locks -(rw_lock) versions of the spinlocks are often nicer. They allow multiple +(rw_lock) versions of the spinlocks are sometimes useful. They allow multiple readers to be in the same critical region at once, but if somebody wants -to change the variables it has to get an exclusive write lock. The -routines look the same as above: +to change the variables it has to get an exclusive write lock. - rwlock_t xxx_lock = RW_LOCK_UNLOCKED; + NOTE! reader-writer locks require more atomic memory operations than + simple spinlocks. Unless the reader critical section is long, you + are better off just using spinlocks. +The routines look the same as above: + + rwlock_t xxx_lock = RW_LOCK_UNLOCKED; unsigned long flags; @@ -159,18 +98,21 @@ routines look the same as above: .. read and write exclusive access to the info ... write_unlock_irqrestore(&xxx_lock, flags); -The above kind of lock is useful for complex data structures like linked -lists etc, especially when you know that most of the work is to just -traverse the list searching for entries without changing the list itself, -for example. Then you can use the read lock for that kind of list -traversal, which allows many concurrent readers. Anything that _changes_ -the list will have to get the write lock. +The above kind of lock may be useful for complex data structures like +linked lists, especially searching for entries without changing the list +itself. The read lock allows many concurrent readers. Anything that +_changes_ the list will have to get the write lock. + + NOTE! RCU is better for list traversal, but requires careful + attention to design detail (see Documentation/RCU/listRCU.txt). -Note: you cannot "upgrade" a read-lock to a write-lock, so if you at _any_ +Also, you cannot "upgrade" a read-lock to a write-lock, so if you at _any_ time need to do any changes (even if you don't do it every time), you have -to get the write-lock at the very beginning. I could fairly easily add a -primitive to create a "upgradeable" read-lock, but it hasn't been an issue -yet. Tell me if you'd want one. +to get the write-lock at the very beginning. + + NOTE! We are working hard to remove reader-writer spinlocks in most + cases, so please don't add a new one without consensus. (Instead, see + Documentation/RCU/rcu.txt for complete information.) ---- @@ -233,4 +175,46 @@ indeed), while write-locks need to protect themselves against interrupts. Linus +---- + +Reference information: + +For dynamic initialization, use spin_lock_init() or rwlock_init() as +appropriate: + + spinlock_t xxx_lock; + rwlock_t xxx_rw_lock; + + static int __init xxx_init(void) + { + spin_lock_init(&xxx_lock); + rwlock_init(&xxx_rw_lock); + ... + } + + module_init(xxx_init); + +For static initialization, use DEFINE_SPINLOCK() / DEFINE_RWLOCK() or +__SPIN_LOCK_UNLOCKED() / __RW_LOCK_UNLOCKED() as appropriate. + +SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated. These interfere +with lockdep state tracking. + +Most of the time, you can simply turn: + static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED; +into: + static DEFINE_SPINLOCK(xxx_lock); + +Static structure member variables go from: + + struct foo bar { + .lock = SPIN_LOCK_UNLOCKED; + }; + +to: + struct foo bar { + .lock = __SPIN_LOCK_UNLOCKED(bar.lock); + }; + +Declaration of static rw_locks undergo a similar transformation. diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 8f7a0e7..3894eaa 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -19,6 +19,8 @@ Currently, these files might (depending on your configuration) show up in /proc/sys/kernel: - acpi_video_flags - acct +- bootloader_type [ X86 only ] +- bootloader_version [ X86 only ] - callhome [ S390 only ] - auto_msgmni - core_pattern @@ -93,6 +95,35 @@ valid for 30 seconds. ============================================================== +bootloader_type: + +x86 bootloader identification + +This gives the bootloader type number as indicated by the bootloader, +shifted left by 4, and OR'd with the low four bits of the bootloader +version. The reason for this encoding is that this used to match the +type_of_loader field in the kernel header; the encoding is kept for +backwards compatibility. That is, if the full bootloader type number +is 0x15 and the full version number is 0x234, this file will contain +the value 340 = 0x154. + +See the type_of_loader and ext_loader_type fields in +Documentation/x86/boot.txt for additional information. + +============================================================== + +bootloader_version: + +x86 bootloader version + +The complete bootloader version number. In the example above, this +file will contain the value 564 = 0x234. + +See the type_of_loader and ext_loader_ver fields in +Documentation/x86/boot.txt for additional information. + +============================================================== + callhome: Controls the kernel's callhome behavior in case of a kernel panic. |