aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* IB/srp: Factor out common request reset codeIshai Rabinovitz2006-06-171-22/+18
| | | | | | | | | | | | Misc cleanups in ib_srp: 1) I think that it is more efficient to move the req entries from req_list to free_list in srp_reconnect_target (rather than rebuild the free_list). (In any case this code is shorter). 2) This allows us to reuse code in srp_reset_device and srp_reconnect_target and call a new function srp_reset_req. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Support SRP rev. 10 targetsRamachandra K2006-06-172-3/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | There has been a change in the format of port identifiers between revision 10 of the SRP specification and the current revision 16A. Revision 10 specifies port identifier format as lower 8 bytes : GUID upper 8 bytes : Extension Whereas revision 16A specifies it as lower 8 bytes : Extension upper 8 bytes : GUID There are older targets (e.g. SilverStorm Virtual Fibre Channel Bridge) which conform to revision 10 of the SRP specification. The I/O class of revision 10 is 0xFF00 and the I/O class of revision 16A is 0x0100. For supporting older targets, this patch: 1) Adds a new optional target creation parameter "io_class". Default value of io_class is 0x0100 (i.e. revision 16A) 2) Uses the correct port identifier format for targets with IO class of 0xFF00 (i.e. conforming to revision 10) Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* [SCSI] srp.h: Add I/O Class valuesRamachandra K2006-06-171-0/+5
| | | | | | | | | | Add enum values for I/O Class values from rev. 10 and rev. 16a SRP drafts. The values are used to detect targets that implement obsolete revisions of SRP, so that the initiator can use the old format for port identifier when connecting to them. Signed-off-by: Ramachandra K <rkuchimanchi@silverstorm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool.Or Gerlitz2006-06-171-3/+27
| | | | | | | | | | When creating a FMR pool, query the IB device and use the returned max_map_map_per_fmr attribute as for the max number of FMR remaps. If the device does not suport querying this attribute, use the original IB_FMR_MAX_REMAPS (32) default. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fill in max_map_per_fmr device attributeOr Gerlitz2006-06-171-0/+10
| | | | | | | | | Report the true max_map_per_fmr value from mthca_query_device(), taking into account the change in FMR remapping introduced by the Sinai performance optimization. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Add client reregister event generationRoland Dreier2006-06-171-1/+1
| | | | | | | Generate a client reregister event instead of a LID change event when client reregister bit is set. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Add client reregister event generationLeonid Arsh2006-06-171-3/+11
| | | | | | | | | | Change the mthca snoop of MADs that set PortInfo to check if the SM has set the client reregister bit, and if it has, generate a client reregister event. If the bit is not set, just generate a LID change event as usual. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Move struct port_info from ipath to <rdma/ib_smi.h>Leonid Arsh2006-06-172-38/+38
| | | | | | | | | | | Move ipath's struct port_info into <rdma/ib_smi.h>, so that it can be used by mthca to implement client reregister support. Remove the __attribute__((packed)) because all the members of the struct are naturally aligned anyway. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Handle client reregister eventsLeonid Arsh2006-06-171-1/+2
| | | | | | | | Handle client reregister events by treating them just like LID or SM changes -- flush all cached paths and rejoin multicast groups. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Add client reregister event typeLeonid Arsh2006-06-171-1/+2
| | | | | | | | Add IB_EVENT_CLIENT_REREGISTER to enum so low-level drivers can generate "client reregister" events. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Fix kernel unaligned access on ia64Jack Morgenstein2006-06-173-34/+39
| | | | | | | | | | | | | | | Fix misaligned access faults on ia64: never cast a misaligned neighbour->ha + 4 pointer to union ib_gid type; pass a void * pointer instead. The memcpy was being optimized to use full word accesses because the compiler thought that union ib_gid is always aligned. The cast in IPOIB_GID_ARG is safe, since it is fixed to access each byte separately. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Avoid using stale last_send counter when reaping AHsRoland Dreier2006-06-171-18/+9
| | | | | | | | | | | | The comparisons of priv->tx_tail to ah->last_send in ipoib_free_ah() and ipoib_post_receive() are slightly unsafe, because priv->tx_lock is not held and hence a stale value of ah->last_send might be used, which would lead to freeing an AH before the driver was really done with it. The simple way to fix this is to the optimization of early free from ipoib_free_ah() and unconditionally queue AHs for reaping, and then take priv->tx_lock in __ipoib_reap_ah(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: Check GID/LID when matching requestsJack Morgenstein2006-06-171-31/+63
| | | | | | | | | | | | | | Check GID/LID for requester side when searching for request which matches received response. This is in order to guarantee uniqueness if the same TID is used when requesting via multiple source LIDs (when LMC is not zero). Use ports' cached LMC to perform the check. Further, do not perform LID check for direct-routed packets, since the permissive LID makes a proper check impossible. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Add caching of ports' LMCJack Morgenstein2006-06-173-1/+43
| | | | | | | | | Add an LMC cache to struct ib_device, and add a function ib_get_cached_lmc() to query the cache. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/cm: remove unneeded flush_workqueueMichael S. Tsirkin2006-06-171-1/+0
| | | | | | | destroy_workqueue() already does flush_workqueue(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com>
* IB/ucm: convert semaphore to mutexSean Hefty2006-06-171-16/+16
| | | | | | | Convert semaphore in ib_ucm_file to a real mutex. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Get rid of "Target has req_lim 0" messagesRoland Dreier2006-06-172-6/+17
| | | | | | | | | | It's perfectly valid for a connection to an SRP target to have a request limit of 0, so get rid of the message about it, which can spam kernel logs even with printk_ratelimit(). Keep a count of such events in a "zero_req_lim" SCSI host attribute instead, so someone who cares can look at the statistics. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Handle DREQ events from CMIshai Rabinovitz2006-06-171-5/+9
| | | | | | | | | | | Handle IB_CM_DREQ_ERROR and IB_CM_DREQ_RECEIVED events from the CM, instead of just printing "Unhandled CM event". In the case of DREQ_ERROR, just ignore the event -- a TIMEWAIT_EXIT will be generated also. For DREQ_RECEIVED, send a DREP in response to shut the connection down cleanly. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Mention RFC numbers in documentationRoland Dreier2006-06-171-4/+8
| | | | | | | Now that the IETF has released RFCs covering IPoIB, give the numbers in the documentation for IPoIB. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Allow sg_tablesize to be adjustedVu Pham2006-06-172-9/+17
| | | | | | | | | Make the sg_tablesize used by SRP adjustable at module load time via a module parameter. Calculate the corresponding IU length required to support this. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Allow cmd_per_lun to be set per target portVu Pham2006-06-171-7/+17
| | | | | | | | | Allow userspace to throttle traffic on a given connection to a target port by adding "max_cmd_per_lun=xyz" to lower the cmd_per_lun value set for that scsi_host. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Clean up loop in srp_remove_one()Ishai Rabinovitz2006-06-171-5/+3
| | | | | | | | | | Interrupts will always be enabled in srp_remove_one(), so spin_lock_irq() can be used instead of spin_lock_irqsave(). Also, the loop takes target->scsi_host->host_lock, so target->state can just be set to SRP_TARGET_REMOVED witout testing the old value. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Make needlessly global ib_mad_cache staticRoland Dreier2006-06-172-4/+1
| | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Change target_mutex to a spinlockMatthew Wilcox2006-06-172-8/+8
| | | | | | | | | The SRP driver never sleeps while holding target_mutex, and it's just used to protect some simple list operations, so hold times will be short. So just convert it to a spinlock, which is smaller and faster. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Get rid of unneeded use of list_for_each_entry_safe()Matthew Wilcox2006-06-171-2/+1
| | | | | | | | list_for_each_entry_safe() is used in one place where the list isn't modified. So just change it to list_for_each_entry(). Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Use SCAN_WILD_CARD from SCSI headersMatthew Wilcox2006-06-171-2/+1
| | | | | | | | SCAN_WILD_CARD is indeed available from <scsi/scsi.h>, which is already included. So get rid of private hack. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Convert FW commands to use wait_for_completion_timeout()Roland Dreier2006-06-171-19/+4
| | | | | | | | | The kernel has had wait_for_completion_timeout() for a long time now. mthca should use it to handle FW commands timing out, instead of implementing the same thing in a much more complicated way by using wait_for_completion() along with a timer that does complete(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Use FMRs to map gather/scatter listsRoland Dreier2006-06-172-87/+228
| | | | | | | | | | | | | | | | Create an SRP FMR pool on HCAs that support FMRs, and use FMRs to map gather/scatter lists that have more than one entry into a single memory region that appears virtually contiguous to the SRP target (which is the RDMA initiator). This patch bails out on FMR mapping for SCSI commands where the gather/scatter list cannot be mapped into a single FMR because there are sub-page-sized entries in middle of the list. An unaligned start or end of the list is OK. Based on a patch by Vu Pham <vuhuong@mellanox.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Remove dead codeMichael S. Tsirkin2006-06-171-4/+0
| | | | | | | Kill some dead code in mthca_eq.c Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: IP address based RDMA connection managerSean Hefty2006-06-174-2/+2234
| | | | | | | | | | | | Kernel connection management agent over InfiniBand that connects based on IP addresses. The agent defines a generic RDMA connection abstraction to support clients wanting to connect over different RDMA devices. The agent also handles RDMA device hotplug events on behalf of clients. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: address translation to map IP toIB addresses (GIDs)Sean Hefty2006-06-174-1/+491
| | | | | | | | Add an address translation service that maps IP addresses to InfiniBand GID addresses using IPoIB. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* [NET]: Export ip_dev_find()Sean Hefty2006-06-171-0/+1
| | | | | | | Export ip_dev_find() to allow locating a net_device given an IP address. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/cm: Match connection requests based on private dataSean Hefty2006-06-173-19/+114
| | | | | | | | | | | Extend matching connection requests to listens in the InfiniBand CM to include private data checks. This allows applications to listen on the same service identifier, with private data directing the request to the appropriate application. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: common handling for marshalling parameters to/from userspaceSean Hefty2006-06-177-218/+328
| | | | | | | | Provide common handling for marshalling data between userspace clients and kernel InfiniBand drivers. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: memfree completion with error FW bug workaroundMichael S. Tsirkin2006-06-171-1/+10
| | | | | | | | | | | Memfree firmware is in rare cases reporting WQE index == base - 1 in receive completion with error, instead of (rq size - 1); base is 0 in mthca. Here is a patch to avoid kernel crash and report a correct WR id in this case. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: restore missing PCI registers after resetMichael S. Tsirkin2006-06-171-0/+59
| | | | | | | | | | | | | | mthca does not restore the following PCI-X/PCI Express registers after reset: PCI-X device: PCI-X command register PCI-X bridge: upstream and downstream split transaction registers PCI Express : PCI Express device control and link control registers This causes instability and/or bad performance on systems where one of these registers is set to a non-default value by BIOS. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Linux v2.6.17Linus Torvalds2006-06-171-1/+1
| | | | | Being named "Crazed Snow-Weasel" instills a lot of confidence in this release, so I'm sure this will be one of the better ones.
* [PATCH] powerpc: enable CPU_FTR_CI_LARGE_PAGE for cellArnd Bergmann2006-06-171-1/+1
| | | | | | | | Reflect the fact that the Cell Broadband Engine supports 64k pages by adding the bit to the CPU features. Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] powerpc: Fix 64k pages on non-partitioned machinesArnd Bergmann2006-06-171-2/+2
| | | | | | | | | | | | | The page size encoding passed to tlbie is incorrect for new-style large pages. This fixes it. This doesn't affect anything on older machines because mmu_psize_defs[psize].penc (the page size encoding) is 0 for 4k and 16M pages (the two are distinguished by a separate "is a large page" bit). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] arm_timer: remove a racy and obsolete PF_EXITING checkOleg Nesterov2006-06-171-3/+0
| | | | | | | | | | | | | | | | | | arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state) in run_posix_cpu_timers(). However, for some reason it does so only for CPUCLOCK_PERTHREAD case (which is imho wrong). Also, this check is not reliable, PF_EXITING could be set on another cpu without any locks/barriers just after the check, so it can't prevent from attaching the timer to the exiting task. The previous patch makes this check unneeded. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()Oleg Nesterov2006-06-172-26/+18
| | | | | | | | | | | | | | | | | | | | | | | do_exit() clears ->it_##clock##_expires, but nothing prevents another cpu to attach the timer to exiting process after that. arm_timer() tries to protect against this race, but the check is racy. After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and before do_exit() calls 'schedule() local timer interrupt can find tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu does sys_wait4) interrupted task has ->signal == NULL. At this moment exiting task has no pending cpu timers, they were cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just return from irq. John Stultz recently confirmed this bug, see http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] check_process_timers: fix possible lockupOleg Nesterov2006-06-171-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | If the local timer interrupt happens just after do_exit() sets PF_EXITING (and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call check_process_timers() with tasklist_lock + ->siglock held and check_process_timers: t = tsk; do { .... do { t = next_thread(t); } while (unlikely(t->flags & PF_EXITING)); } while (t != tsk); the outer loop will never stop. Actually, the window is bigger. Another process can attach the timer after ->it_xxx_expires was cleared (see the next commit) and the 'if (PF_EXITING)' check in arm_timer() is racy (see the one after that). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sky2: netconsole suspend/resume interactionStephen Hemminger2006-06-171-1/+6
| | | | | | | | | | | | | A couple of fixes that should prevent crashes when using netconsole and suspend/resume. First, netconsole poll routine shouldn't run unless the device is up; second, the NAPI poll should be disabled during suspend. This is only an issue on sky2, because it has to have one NAPI poll routine for both ports on dual port boards. Normal drivers use netif_rx_schedule_prep and that checks for netif_running. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Fix missing ret assignment in __bio_map_user() error pathJens Axboe2006-06-171-2/+3
| | | | | | | | | | | | If get_user_pages() returns less pages than what we asked for, we jump to out_unmap which will return ERR_PTR(ret). But ret can contain a positive number just smaller than local_nr_pages, so be sure to set it to -EFAULT always. Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix cdrom openJens Axboe2006-06-171-2/+4
| | | | | | | | | | | | | | | | Some time ago the cdrom open routine was changed so that we call the driver's open routine before checking to see if it is read only. However, if we discovered that a read write open was not possible and the open flags required a writable open, we just returned -EROFS without calling the driver's release routine. This seems to work for most cdrom drivers, but breaks the Powerpc iSeries virtual cdrom rather badly. This just inserts the release call in the error path to balance the call to "->open()" done by "open_for_data()". Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] cfq-iosched: fix crash in do_div()Jens Axboe2006-06-141-8/+3
| | | | | | | | | | | | | We don't clear the seek stat values in cfq_alloc_io_context(), and if ->seek_mean is unlucky enough to be set to -36 by chance, the first invocation of cfq_update_io_seektime() will oops with a divide by zero in do_div(). Just memset the entire cic instead of filling invididual values independently. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Return error in case flock_lock_file failureKirill Korotaev2006-06-141-0/+2
| | | | | | | | | If flock_lock_file() failed to allocate flock with locks_alloc_lock() then "error = 0" is returned. Need to return some non-zero. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sky2: stop/start hardware idle timer on suspend/resumeStephen Hemminger2006-06-131-4/+13
| | | | | | | | | The resume bug was caused not by an early interrupt but because the idle timeout was not being stopped on suspend. Also disable hardware IRQ's on suspend. Will need to revisit this with hotplug? Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sky2: save/restore base hardware irq during suspend/resumeStephen Hemminger2006-06-131-0/+3
| | | | | | | | The hardware should be fully shut off during suspend, and the base irq mask restored during resume. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sky2: fix hotplug detect during pollStephen Hemminger2006-06-131-2/+2
| | | | | | | | If the poll routine detects no hardware available, it needs to dequeue it self from the network poll list. Linus didn't understand NAPI. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>