aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mlx4
Commit message (Collapse)AuthorAgeFilesLines
* net: fix NULL dereferences in check_peer_redir()Eric Dumazet2012-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d3aaeb38c40e5a6c08dd31a1b64da65c4352be36, along with dependent backports of commits: 69cce1d1404968f78b177a0314f5822d5afdbbfb 9de79c127cccecb11ae6a21ab1499e87aa222880 218fa90f072e4aeff9003d57e390857f4f35513e 580da35a31f91a594f3090b7a2c39b85cb051a12 f7e57044eeb1841847c24aa06766c8290c202583 e049f28883126c689cf95859480d9ee4ab23b7fa ] Gergely Kalman reported crashes in check_peer_redir(). It appears commit f39925dbde778 (ipv4: Cache learned redirect information in inetpeer.) added a race, leading to possible NULL ptr dereference. Since we can now change dst neighbour, we should make sure a reader can safely use a neighbour. Add RCU protection to dst neighbour, and make sure check_peer_redir() can be called safely by different cpus in parallel. As neighbours are already freed after one RCU grace period, this patch should not add typical RCU penalty (cache cold effects) Many thanks to Gergely for providing a pretty report pointing to the bug. Reported-by: Gergely Kalman <synapse@hippy.csoma.elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* IB/mlx4: pass SMP vendor-specific attribute MADs to firmwareJack Morgenstein2012-02-131-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a6f7feae6d19e84253918d88b04153af09d3a243 upstream. In the current code, vendor-specific MADs (e.g with the FDR-10 attribute) are silently dropped by the driver, resulting in timeouts at the sending side and inability to query/configure the relevant feature. However, the ConnectX firmware is able to handle such MADs. For unsupported attributes, the firmware returns a GET_RESPONSE MAD containing an error status. For example, for a FDR-10 node with LID 11: # ibstat mlx4_0 1 CA: 'mlx4_0' Port 1: State: Active Physical state: LinkUp Rate: 40 (FDR10) Base lid: 11 LMC: 0 SM lid: 24 Capability mask: 0x02514868 Port GUID: 0x0002c903002e65d1 Link layer: InfiniBand Extended Port Query (EPI) vendor mad timeouts before the patch: # smpquery MEPI 11 -d ibwarn: [4196] smp_query_via: attr 0xff90 mod 0x0 route Lid 11 ibwarn: [4196] _do_madrpc: retry 1 (timeout 1000 ms) ibwarn: [4196] _do_madrpc: retry 2 (timeout 1000 ms) ibwarn: [4196] _do_madrpc: timeout after 3 retries, 3000 ms ibwarn: [4196] mad_rpc: _do_madrpc failed; dport (Lid 11) smpquery: iberror: [pid 4196] main: failed: operation EPI: ext port info query failed EPI query works OK with the patch: # smpquery MEPI 11 -d ibwarn: [6548] smp_query_via: attr 0xff90 mod 0x0 route Lid 11 ibwarn: [6548] mad_rpc: data offs 64 sz 64 mad data 0000 0000 0000 0001 0000 0001 0000 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 # Ext Port info: Lid 11 port 0 StateChangeEnable:...............0x00 LinkSpeedSupported:..............0x01 LinkSpeedEnabled:................0x01 LinkSpeedActive:.................0x01 Signed-off-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Ira Weiny <weiny2@llnl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mlx4: generalization of multicast steering.Yevgeny Petrilin2011-03-231-5/+5
| | | | | | | | | The same packet steering mechanism would be used both for IB and Ethernet, Both multicasts and unicasts. This commit prepares the general infrastructure for this. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_en: Reporting HW revision in ethtool -iYevgeny Petrilin2011-03-231-1/+0
| | | | | | | | HW revision is derived from device ID and rev id. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* IB/mlx4: Handle protocol field in multicast tableAleksey Senin2011-01-121-4/+5
| | | | | | | | | | | | | The newest device firmware stores IB vs. Ethernet protocol in two bits in members_count field of multicast group table (0: Infiniband, 1: Ethernet). When changing the QP members count for a multicast group, it important not to reset this information. When calling multicast attach first time, the protocol type should be specified. In this patch we always set it IB, but in the future we will handle Ethernet too. When looking for a QP, the protocol type shoud be checked too. Signed-off-by: Aleksey Senin <alekseys@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* mlx4_{core, ib, en}: Fix driver when sizeof (phys_addr_t) > sizeof (long)Roland Dreier2011-01-121-1/+2
| | | | | | | | | Some systems have PCI addresses that don't fit in unsigned long (eg some 32-bit PowerPC 440 systems have 36-bit bus addresses). Fix up mlx4 drivers by using phys_addr_t where appropriate, so we don't truncate any PCI resource addresses before ioremapping them. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2011-01-112-1/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits) IB/qib: Fix refcount leak in lkey/rkey validation IB/qib: Improve SERDES tunning on QMH boards IB/qib: Unnecessary delayed completions on RC connection IB/qib: Issue pre-emptive NAKs on eager buffer overflow IB/qib: RDMA lkey/rkey validation is inefficient for large MRs IB/qib: Change QPN increment IB/qib: Add fix missing from earlier patch IB/qib: Change receive queue/QPN selection IB/qib: Fix interrupt mitigation IB/qib: Avoid duplicate writes to the rcv head register IB/qib: Add a few new SERDES tunings IB/qib: Reset packet list after freeing IB/qib: New SERDES init routine and improvements to SI quality IB/qib: Clear WAIT_SEND flags when setting QP to error state IB/qib: Fix context allocation with multiple HCAs IB/qib: Fix multi-Florida HCA host panic on reboot IB/qib: Handle transitions from ACTIVE_DEFERRED to ACTIVE better IB/qib: UD send with immediate receive completion has wrong size IB/qib: Set port physical state even if other fields are invalid IB/qib: Generate completion callback on errors ...
| * IB/mlx4: Handle -ENOMEM in forward_trap()Dan Carpenter2011-01-101-0/+2
| | | | | | | | | | | | | | ib_create_send_mad() can return ERR_PTR(-ENOMEM) here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Don't call dma_free_coherent() with irqs disabledVladimir Sokolovsky2011-01-101-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx4_ib_free_cq_buf() should not be called under spin_lock_irq() since it calls dma_free_coherent(), which needs irqs enabled. Fix this by deferring the free to outside the locked region. This was found due to the WARN_ON(irqs_disabled()); in swiotlb_free_coherent(). Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | Merge branch 'master' of ↵David S. Miller2010-12-262-7/+7
|\ \ | |/ | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/fib_frontend.c
| * IB/mlx4: Fix IBoE link stateEli Cohen2010-12-011-1/+1
| | | | | | | | | | | | | | | | Use netif_running() and netif_carrier_ok() to report link state, exactly as is done to report Ethernet link state in sysfs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Fix IBoE reported link rateEli Cohen2010-12-011-1/+1
| | | | | | | | | | | | | | | | | | The link rate is the product of the link speed in the link width. For Etherent ports the rate is 10G, so we use 1 for the width and 4 for speed to get the correct rate. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Fix memory ordering of VLAN insertion control bitsEli Cohen2010-12-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | We must fully update the control segment before marking it as valid, so that hardware doesn't start executing it before we're ready. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Move VLAN control bit setting to before wmb(). - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | infiniband: remove dev_base_lock useEric Dumazet2010-11-241-3/+3
|/ | | | | | | | | | | dev_base_lock is the legacy way to lock the device list, and is planned to disappear. (writers hold RTNL, readers hold RCU lock) Convert rdma_translate_ip() and update_ipv6_gids() to RCU locking. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-10-266-123/+854
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits) IB/qib: clean up properly if pci_set_consistent_dma_mask() fails IB/qib: Allow driver to load if PCIe AER fails IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set IB/qib: Fix extra log level in qib_early_err() RDMA/cxgb4: Remove unnecessary KERN_<level> use RDMA/cxgb3: Remove unnecessary KERN_<level> use IB/core: Add link layer type information to sysfs IB/mlx4: Add VLAN support for IBoE IB/core: Add VLAN support for IBoE IB/mlx4: Add support for IBoE mlx4_en: Change multicast promiscuous mode to support IBoE mlx4_core: Update data structures and constants for IBoE mlx4_core: Allow protocol drivers to find corresponding interfaces IB/uverbs: Return link layer type to userspace for query port operation IB/srp: Sync buffer before posting send IB/srp: Use list_first_entry() IB/srp: Reduce number of BUSY conditions IB/srp: Eliminate two forward declarations IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144 IB: Replace EXTRA_CFLAGS with ccflags-y ...
| *-. Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', ↵Roland Dreier2010-10-266-123/+854
| |\ \ | | | | | | | | | | | | 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next
| | | * IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144Jack Morgenstein2010-10-231-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Node Description cannot be changed via MADs (it is read-only). Until now, it was changed in the driver via sysfs, and the new Node Description was simply inserted by the driver into MAD responses (replacing the description returned by FW). System startup scripts use the sysfs interface to change the node description at driver startup to show the hostname, etc. However, this has a race condition: the SM could discover the original FW node description rather than the system-specific description if it queried the port before the startup scripts finish running. For mlx4, we fix this with a new FW command (SET_NODE) that allows passing the new node description to FW. When this command is invoked, FW sends a trap 144 to the SM. When it gets this trap, the SM can query the node to obtain the new node description -- thus eliminating the effects of the race. This patch simply calls SET_NODE command when a new node description is entered via sysfs (thus causing trap 144 to be issued by the FW). We ignore all failures of the SET_NODE command (including those caused by using a device FW that predates the SET_NODE command), since in that case things work just as before. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | | * IB/mlx4: Limit size of fast registration WRsEli Cohen2010-10-112-2/+2
| | |/ | |/| | | | | | | | | | | | | | | | | | | Fix the limit on the size of max fast registration WRs that can be posted to match hardware capabilities. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/mlx4: Add VLAN support for IBoEEli Cohen2010-10-253-27/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows IBoE traffic to be encapsulated in 802.1Q tagged VLAN frames. The VLAN tag is encoded in the GID and derived from it by a simple computation. The netdev notifier callback is modified to catch VLAN device addition/removal and the port's GID table is updated to reflect the change, so that for each netdevice there is an entry in the GID table. When the port's GID table is exhausted, GID entries will not be added. Only children of the main interfaces can add to the GID table; if a VLAN interface is added on another VLAN interface (e.g. "vconfig add eth2.6 8"), then that interfaces will not add an entry to the GID table. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/core: Add VLAN support for IBoEEli Cohen2010-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/mlx4: Add support for IBoEEli Cohen2010-10-255-108/+687
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for IBoE to mlx4_ib. The bulk of the code is handling the new address vector fields; mlx4 needs the MAC address of a remote node to include it in a WQE (for datagrams) or in the QP context (for connected QPs). Address resolution is done by assuming all unicast GIDs are either link-local IPv6 addresses. Multicast group attach/detach needs to update the NIC's multicast filters; but since attaching a QP to a multicast group can be done before the QP is bound to a port, for IBoE we need to keep track of all multicast groups that a QP is attached too before it transitions from INIT to RTR (since it does not have a port in the INIT state). Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Many things cleaned up and otherwise monkeyed with; hope I didn't introduce too many bugs. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/pack: IBoE UD packet packing supportEli Cohen2010-10-141-1/+1
| |/ | | | | | | | | | | | | | | | | | | Add support for packing IBoE packet headers. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Clean up and fix ib_ud_header_init() a bit. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | infiniband: fix mlx4 kconfig dependency warningRandy Dunlap2010-10-161-0/+1
|/ | | | | | | | | Fix kconfig dependency warning to satisfy dependencies: warning: (MLX4_EN && NETDEVICES && NETDEV_10000 && PCI && INET || MLX4_INFINIBAND && INFINIBAND) selects MLX4_CORE which has unmet direct dependencies (NETDEVICES && NETDEV_10000 && PCI) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* IB/core: Allow device-specific per-port sysfs filesRalph Campbell2010-05-211-1/+1
| | | | | | | | | | | | | | | | | Add a new parameter to ib_register_device() so that low-level device drivers can pass in a pointer to a callback function that will be called for each port that is registered in sysfs. This allows low-level device drivers to create files in /sys/class/infiniband/<hca>/ports/<N>/ without having to poke through the internals of the RDMA sysfs handling. There is no need for an unregister function since the kobject reference will go to zero when ib_unregister_device() is called. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add support for masked atomic operationsVladimir Sokolovsky2010-04-213-11/+48
| | | | | | | | Add support for masked atomic operations (masked compare and swap, masked fetch and add). Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-04-091-1/+1
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Check correct variable for allocation failure RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp() RDMA/cm: Set num_paths when manually assigning path records IB/cm: Fix device_create() return value check
| * IB/mlx4: Check correct variable for allocation failureDan Carpenter2010-04-071-1/+1
| | | | | | | | | | | | | | | | The intent here is to check the "mfrpl->mapped_page_list" allocation. We checked "mfrpl->ibfrpl.page_list" earlier. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-307-0/+9
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* Merge branch 'misc' into for-nextRoland Dreier2010-03-011-1/+1
|\ | | | | | | | | Conflicts: drivers/infiniband/core/uverbs_main.c
| * IB/core: Fix and clean up ib_ud_header_init()Eli Cohen2010-02-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | ib_ud_header_init() first clears header and then fills up the various fields. Later on, it tests header->immediate_present, which it has already cleared, so the condition is always false. Fix this by adding an immediate_present parameter and setting header->immediate_present as is done with grh_present. Also remove unused calculation of header_len. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/mlx4: Simplify retrieval of ib_deviceEli Cohen2010-02-121-1/+1
|/ | | | | | | | struct ib_qp already holds a pointer to the ib device. No need to dive to the hw device object to retrieve it. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix queue overflow check in post_recvOr Gerlitz2010-01-061-1/+1
| | | | | | | | | In mlx4_ib_post_recv(), we should check the queue for overflow using recv_cq instead of send_cq (current code looks like a copy-and-paste mistake). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Initialize SRQ scatter entries when creating an SRQJack Morgenstein2010-01-061-0/+6
| | | | | | | | As for memfree mthca hardware, ConnectX also requires SRQ WQE scatter entries to be initialized with the invalid L_Key at SRQ creation time. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-162-14/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits) RDMA/cxgb3: Fix error paths in post_send and post_recv RDMA/nes: Fix stale ARP issue RDMA/nes: FIN during MPA startup causes timeout RDMA/nes: Free kmap() resources RDMA/nes: Check for zero STag RDMA/nes: Fix Xansation test crash on cm_node ref_count RDMA/nes: Abnormal listener exit causes loopback node crash RDMA/nes: Fix crash in nes_accept() RDMA/nes: Resource not freed for REJECTed connections RDMA/nes: MPA request/response error checking RDMA/nes: Fix query of ORD values RDMA/nes: Fix MAX_CM_BUFFER define RDMA/nes: Pass correct size to ioremap_nocache() RDMA/nes: Update copyright and branding string RDMA/nes: Add max_cqe check to nes_create_cq() RDMA/nes: Clean up struct nes_qp RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset RDMA/nes: Correct fast memory registration implementation IB/ehca: Fix error paths in post_send and post_recv ...
| * IB/mlx4: Remove limitation on LSO header sizeEli Cohen2009-11-122-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code has a limitation: an LSO header is not allowed to cross a 64 byte boundary. This patch removes this limitation by setting the WQE RR for large headers thus allowing LSO headers of any size. The extra buffer reserved for MLX4_IB_QP_LSO QPs has been doubled, from 64 to 128 bytes, assuming this is reasonable upper limit for header length. Also, this patch will cause IB_DEVICE_UD_TSO to be set only for HCA FW versions that set MLX4_DEV_CAP_FLAG_BLH; e.g. FW version 2.6.000 and higher. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Remove unneeded codeEli Cohen2009-11-121-1/+0
| | | | | | | | | | | | | | There is no such flag DE - the field is reserved and should be zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | tree-wide: fix assorted typos all over the placeAndré Goddard Rosa2009-12-041-1/+1
|/ | | | | | | | | | That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
*-. Merge branches 'cxgb3', 'ehca', 'ipath', 'ipoib', 'misc', 'mlx4', 'mthca' ↵Roland Dreier2009-09-103-9/+16
|\ \ | | | | | | | | | and 'nes' into for-linus
| | * IB/mlx4: Don't allow userspace open while recovering from catastrophic errorJack Morgenstein2009-09-052-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace apps are supposed to release all ib device resources if they receive a fatal async event (IBV_EVENT_DEVICE_FATAL). However, the app has no way of knowing when the device has come back up, except to repeatedly attempt ibv_open_device() until it succeeds. However, currently there is no protection against the open succeeding while the device is in being removed following the fatal event. In this case, the open will succeed, but as a result the device waits in the middle of its removal until the new app releases its resources -- and the new app will not do so, since the open succeeded at a point following the fatal event generation. This patch adds an "active" flag to the device. The active flag is set to false (in the fatal event flow) before the "fatal" event is generated, so any subsequent ibv_dev_open() call to the device will fail until the device comes back up, thus preventing the above deadlock. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/mlx4: Annotate CQ lockingRoland Dreier2009-09-051-4/+8
| |/ |/| | | | | | | | | | | | | | | mlx4_ib_lock_cqs()/mlx4_ib_unlock_cqs() are helper functions that lock/unlock both CQs attached to a QP in the proper order to avoid AB-BA deadlocks. Annotate this so sparse can understand what's going on (and warn us if we misuse these functions). Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB: Use printk_once() for driver versionsMarcin Slusarz2009-09-051-5/+1
|/ | | | | | | Replace open-coded reimplementations with printk_once(). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Add strong ordering to local inval and fast reg work requestsJack Morgenstein2009-06-051-0/+4
| | | | | | | | | | | | | | | | | The ConnectX Programmer's Reference Manual states that the "SO" bit must be set when posting Fast Register and Local Invalidate send work requests. When this bit is set, the work request will be executed only after all previous work requests on the send queue have been executed. (If the bit is not set, Fast Register and Local Invalidate WQEs may begin execution too early, which violates the defined semantics for these operations) This fixes the issue with NFS/RDMA reported in <http://lists.openfabrics.org/pipermail/general/2009-April/059253.html> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Don't overwrite fast registration page list when posting work requestJack Morgenstein2009-05-073-3/+10
| | | | | | | | | | | | | | | | | | The low-level mlx4 driver modified the page-list addresses for fast register work requests post send to big-endian, and set a "present" bit. This caused problems later when the consumer attempted to unmap the pages using the page-list (using the list addresses which were assumed to be still in CPU-endian order). Fix the mlx4 driver to allocate two buffers and use a private buffer for the hardware-format bus addresses. This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>, an NFS/RDMA server crash. The cause of the crash was found by Vu Pham of Mellanox. The fix is along the lines suggested by Steve Wise in comment #21 in bug 1571. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Use pgprot_writecombine() for BlueFlame pagesRoland Dreier2009-03-301-2/+1
| | | | | | | | | The PAT work on x86 has finally made pgprot_writecombine() a usable API for modular drivers. As the comment indicates, this is exactly what we want to use in mlx4_ib to map BlueFlame pages up to userspace, since using WC for these pages improves small message latency significantly. Signed-off-by: Roland Dreier <rolandd@cisco.com>
*-. Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', ↵Roland Dreier2009-03-243-20/+34
|\ \ | | | | | | | | | 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next
| | * IB/mlx4: Unregister IB device prior to CLOSE PORT commandYevgeny Petrilin2009-03-181-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the ConnectX programmer's reference manual, all operations should be stopped, all QPs should be torn down and all WQEs flushed before the CLOSE_PORT command is invoked. In some cases reversing the order of operations (as implemented now) could cause a loss of completions. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/mlx4: Fix dispatch of IB_EVENT_LID_CHANGE eventMoni Shoua2009-01-281-7/+20
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When snooping a PortInfo MAD, its client_reregister bit is checked. If the bit is ON then a CLIENT_REREGISTER event is dispatched, otherwise a LID_CHANGE event is dispatched. This way of decision ignores the cases where the MAD changes the LID along with an instruction to reregister (so a necessary LID_CHANGE event won't be dispatched) or the MAD is neither of these (and an unnecessary LID_CHANGE event will be dispatched). This causes problems at least with IPoIB, which will do a "light" flush on reregister, rather than the "heavy" flush required due to a LID change. Fix this by dispatching a CLIENT_REREGISTER event if the client_reregister bit is set, but also compare the LID in the MAD to the current LID. If and only if they are not identical then a LID_CHANGE event is dispatched. Signed-off-by: Moni Shoua <monis@voltaire.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB: Remove __constant_{endian} usesHarvey Harrison2009-01-171-11/+11
|/ | | | | | | | | | | The base versions handle constant folding just fine, use them directly. The replacements are OK in the include/ files as they are not exported to userspace so we don't need the __ prefixed versions. This patch does not affect code generation at all. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix memory ordering problem when posting LSO sendsRoland Dreier2009-01-161-9/+19
| | | | | | | | | | | | | | | | | | The current work request posting code writes the LSO segment before writing any data segments. This leaves a window where the LSO segment overwrites the stamping in one cacheline that the HCA prefetches before the rest of the cacheline is filled with the correct data segments. When the HCA processes this work request, a local protection error may result. Fix this by saving the LSO header size field off and writing it only after all data segments are written. This fix is a cleaned-up version of a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1383>. Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-01-131-4/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/iser: Add dependency on INFINIBAND_ADDR_TRANS IPoIB: Do not join broadcast group if interface is brought down RDMA/nes: Fix for NIPQUAD removal IPoIB: Fix loss of connectivity after bonding failover on both sides IB/mlx4: Don't register IB device for adapters with no IB ports mlx4_core: Fix warning from min() IB/ehca: spin_lock_irqsave() takes an unsigned long