aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mthca
Commit message (Collapse)AuthorAgeFilesLines
* IB/mthca: Make all device methods truly reentrantRoland Dreier2006-06-175-22/+48
| | | | | | | | | | | | | | | | Documentation/infiniband/core_locking.txt says: All of the methods in struct ib_device exported by a low-level driver must be fully reentrant. The low-level driver is required to perform all synchronization necessary to maintain consistency, even if multiple function calls using the same object are run simultaneously. However, mthca's modify_qp, modify_srq and resize_cq methods are currently not reentrant. Add a mutex to the QP, SRQ and CQ structures so that these calls can be properly serialized. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix memory leak on modify_qp error pathsRoland Dreier2006-06-171-6/+6
| | | | | | | | Some error paths after the mthca_alloc_mailbox() call in mthca_modify_qp() just do a "return -EINVAL" without freeing the mailbox. Convert these returns to "goto out" to avoid leaking the mailbox storage. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fill in max_map_per_fmr device attributeOr Gerlitz2006-06-171-0/+10
| | | | | | | | | Report the true max_map_per_fmr value from mthca_query_device(), taking into account the change in FMR remapping introduced by the Sinai performance optimization. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Add client reregister event generationLeonid Arsh2006-06-171-3/+11
| | | | | | | | | | Change the mthca snoop of MADs that set PortInfo to check if the SM has set the client reregister bit, and if it has, generate a client reregister event. If the bit is not set, just generate a LID change event as usual. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Convert FW commands to use wait_for_completion_timeout()Roland Dreier2006-06-171-19/+4
| | | | | | | | | The kernel has had wait_for_completion_timeout() for a long time now. mthca should use it to handle FW commands timing out, instead of implementing the same thing in a much more complicated way by using wait_for_completion() along with a timer that does complete(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Remove dead codeMichael S. Tsirkin2006-06-171-4/+0
| | | | | | | Kill some dead code in mthca_eq.c Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: memfree completion with error FW bug workaroundMichael S. Tsirkin2006-06-171-1/+10
| | | | | | | | | | | Memfree firmware is in rare cases reporting WQE index == base - 1 in receive completion with error, instead of (rq size - 1); base is 0 in mthca. Here is a patch to avoid kernel crash and report a correct WR id in this case. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: restore missing PCI registers after resetMichael S. Tsirkin2006-06-171-0/+59
| | | | | | | | | | | | | | mthca does not restore the following PCI-X/PCI Express registers after reset: PCI-X device: PCI-X command register PCI-X bridge: upstream and downstream split transaction registers PCI Express : PCI Express device control and link control registers This causes instability and/or bad performance on systems where one of these registers is set to a non-default value by BIOS. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix posting lists of 256 receive requests to SRQ for TavorMichael S. Tsirkin2006-05-241-20/+21
| | | | | | | | | | | | | If we post a list of length exactly a multiple of 256, nreq in doorbell gets set to 256 which is wrong: it should be encoded by 0. This is because we only zero it out on the next WR, which may not be there. The solution is to ring the doorbell after posting a WQE, not before posting the next one. This is the same bug that we just fixed for QPs with non-shared RQ. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix posting lists of 256 receive requests for TavorMichael S. Tsirkin2006-05-181-17/+18
| | | | | | | | | | | If we post a list of length 256 exactly, nreq in doorbell gets set to 256 which is wrong: it should be encoded by 0. This is because we only zero it out on the next WR, which may not be there. The solution is to ring the doorbell after posting a WQE, not before posting the next one. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Make fw_cmd_doorbell default to 0Roland Dreier2006-05-171-1/+1
| | | | | | | | | | Setting fw_cmd_doorbell allows FW command to be queued using posted writes instead of requiring polling on a "go" bit, so it should be a performance boost. However, the option causes problems with at least some device/firmware combinations, so set the default to 0 until we understand what's going on better. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: FMR ioremap fixMichael S. Tsirkin2006-05-101-4/+11
| | | | | | | | | | | Addresses for ioremap must be calculated off of pci_resource_start; we can't directly use the bus address as seen by the HCA. Fix the code that remaps device memory for FMR access. Based on patch by Klaus Smolin. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix race in reference countingRoland Dreier2006-05-095-45/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix races in in destroying various objects. If a destroy routine waits for an object to become free by doing wait_event(&obj->wait, !atomic_read(&obj->refcount)); /* now clean up and destroy the object */ and another place drops a reference to the object by doing if (atomic_dec_and_test(&obj->refcount)) wake_up(&obj->wait); then this is susceptible to a race where the wait_event() and final freeing of the object occur between the atomic_dec_and_test() and the wake_up(). And this is a use-after-free, since wake_up() will be called on part of the already-freed object. Fix this in mthca by replacing the atomic_t refcounts with plain old integers protected by a spinlock. This makes it possible to do the decrement of the reference count and the wake_up() so that it appears as a single atomic operation to the code waiting on the wait queue. While touching this code, also simplify mthca_cq_clean(): the CQ being cleaned cannot go away, because it still has a QP attached to it. So there's no reason to be paranoid and look up the CQ by number; it's perfectly safe to use the pointer that the callers already have. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix offset in query_gid methodRoland Dreier2006-05-011-1/+1
| | | | | | | | | | GuidInfo records have 8 byte GUIDs in them, so an index should be multiplied by 8 to get an offset. mthca_query_gid() was incorrectly multiplying by 16. Noticed by Leonid Keller <leonid@mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: make a function staticAdrian Bunk2006-04-191-1/+1
| | | | | | | This patch makes the needlessly global mthca_update_rate() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devicesJack Morgenstein2006-04-124-2/+30
| | | | | | | | | | | | | | | | | | | | | | The driver allocates SRQ WQEs size with a power of 2 size both for Tavor and for memfree. For Tavor, however, the hardware only requires the WQE size to be a multiple of 16, not a power of 2, and the max number of scatter-gather allowed is reported accordingly by the firmware (and this is the value currently returned by ib_query_device() and ibv_query_device()). If the max number of scatter/gather entries reported by the FW is used when creating an SRQ, the creation will fail for Tavor, since the required WQE size will be increased to the next power of 2, which turns out to be larger than the device permitted max WQE size (which is not a power of 2). This patch reduces the reported SRQ max wqe size so that it can be used successfully in creating an SRQ on Tavor HCAs. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Disable tuning PCI read burst sizeMichael S. Tsirkin2006-04-101-0/+7
| | | | | | | | | | | | | | | | | | The PCI spec recommends against drivers playing with a device's PCI read burst size, and says that systems software should configure it. And we actually have users that report that changing it from the default set by BIOS hurts performance and/or stability for them. On the other hand, the Mellanox Programmer's Reference Manual recommends turning it up all the way to the maximum value. Some tests conducted here in the lab do not show performance improvement from this tuning, but this might be just me. As a work-around, make this tuning an option, off by default (safe value), with an eye towards removing it completely one day if no one complains. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: simplify static rate encodingJack Morgenstein2006-04-108-17/+195
| | | | | | | | | | | | | | | | | Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=yRoland Dreier2006-04-024-11/+29
| | | | | | | | | | | | Change the mthca debugging trace output code so that it can enabled and disabled at runtime with the debug_level module parameter in sysfs. Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled unless CONFIG_EMBEDDED is selected. We want users (and especially distros) to have this turned on unless they really need to save space, because by the time we want debugging output, it's usually too late to rebuild a kernel. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix section mismatch problemsRoland Dreier2006-03-299-12/+12
| | | | | | | | | Quite a few cleanup functions in mthca were marked as __devexit. However, they could also be called from error paths during initialization, so they cannot be marked that way. Just delete all of the incorrect annotations. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix check of size in SRQ creationJack Morgenstein2006-03-291-1/+1
| | | | | | | | | | | | | | | The previous patch for Tavor broke MemFree logic. The driver should perform limit check only for Tavor. For MemFree, the check is incorrect, since ds (WQE stride) is always a power-of-2 (although the max_desc_size may not be). In Tavor, however, WQE stride and desc_size are the same, and are not necessarily power-of-2. The check was really for the WQE stride (and it Tavor, we use max_desc_size for the stride). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix modify QP error pathRoland Dreier2006-03-241-10/+11
| | | | | | | | | If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would still do some things it shouldn't, such as store away attributes for special QPs. Fix this, and simplify the code, by simply jumping to the exit path if mthca_MODIFY_QP() fails. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix indentationRoland Dreier2006-03-241-2/+2
| | | | | | Fix some whitespace damage (indenting with spaces) that snuck in. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix uninitialized variable in mthca_alloc_qp()Jack Morgenstein2006-03-241-4/+5
| | | | | | | | mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport before calling mthca_set_qp_size(), since the value is used there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check SRQ limit in modify SRQ operationJack Morgenstein2006-03-241-0/+2
| | | | | | | | | When setting the shared receive queue (SRQ) watermark in a modify SRQ operation, make sure that the supplied value is not larger than the full size of the SRQ. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check that SRQ WQE size does not exceed device's max valueJack Morgenstein2006-03-241-0/+4
| | | | | | | | | | Guarantee the calculated work queue entry size does not exceed the max allowable WQE size when creating an SRQ. This is a problem with Arbel in Tavor-compatibility mode because the current WQE size computation method rounds up to next power of 2. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check that sgid_index and path_mtu are valid in modify_qpDotan Barak2006-03-241-4/+23
| | | | | | | | Add a check that the modify QP parameters sgid_index and path_mtu are valid, since they might come from userspace. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Query SRQ srq_limit fixesEli Cohen2006-03-201-4/+8
| | | | | | | | | | Fix endianness handling of srq_limit: it is big-endian in the context structure, so we need to swab it before returning it. Also add support for srq_limit query for Tavor (non-MemFree) HCAs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Correct reported SRQ size in MemFree case.Dotan Barak2006-03-201-2/+2
| | | | | | | | | MemFree devices need to reserve one shared receive queue (SRQ) work request for internal use, so the capacity returned from the create_srq and query_srq methods should be srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Coverity fix to mthca_init_eq_table()Roland Dreier2006-03-201-1/+1
| | | | | | | | Fix bug found by coverity: the loop body never executed, because it was doing for (i = 0; i < MTHCA_EQ_CMD; ++i), but MTHCA_EQ_CMD is 0. The correct loop bound is MTHCA_NUM_EQ, to loop over all EQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Update firmware versionsRoland Dreier2006-03-201-3/+3
| | | | | | Update known firmware versions in driver's table to the latest releases. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Optimize large messages on Sinai HCAsEli Cohen2006-03-205-17/+51
| | | | | | | | | | | | Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Use an enum for HCA page sizeIshai Rabinovitz2006-03-204-30/+37
| | | | | | | | | | | Use a named enum for the HCA's internal page size, rather than having magic values of 4096 and shifts by 12 all over the code. Also, fix one minor bug in EQ handling: only one HCA page is mapped to the HCA during initialization, but a full kernel page is unmapped during cleanup. This might cause problems when PAGE_SIZE != 4096. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check alternate P_Key index when setting alternate pathDotan Barak2006-03-201-2/+8
| | | | | | | | | Check that the alternate P_Key index is in range when setting the alternate path for a QP. Also make a cosmetic touch up to the debug message printed when the main P_Key index is out of range. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Add support for send work request fence flagDotan Barak2006-03-201-2/+6
| | | | | | | | Add support for IB_SEND_FENCE flag in post_send methods. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Implement query_ah methodJack Morgenstein2006-03-203-0/+33
| | | | | | | | | Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Write FW commands through doorbell pageEli Cohen2006-03-202-23/+136
| | | | | | | | | | | | | | | | | | | This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Return actual capacity from create_srqDotan Barak2006-03-201-0/+3
| | | | | | | | | Have mthca's create_srq method return the actual capacity of the SRQ that gets created. Also update comments in <rdma/ib_verbs.h> to clarify that this is what is expected from ib_create_srq(). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Bump driver version and release dateRoland Dreier2006-03-201-2/+2
| | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Support for query QP and SRQEli Cohen2006-03-206-1/+184
| | | | | | | Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Convert to use ib_modify_qp_is_ok()Roland Dreier2006-03-203-290/+76
| | | | | | | Use ib_modify_qp_is_ok() in mthca, and delete the big table of attributes for queue pair state transitions. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Generate SQ drained events when requestedRoland Dreier2006-03-202-2/+15
| | | | | | | | Add low-level driver support to ib_mthca so that consumers can request a "send queue drained" event be generated when a transiton to the SQD state completes. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Enable FMR pool user to set page sizeOr Gerlitz2006-03-201-5/+5
| | | | | | | | | | This patch allows the consumer to set the page size of "pages" mapped by the pool FMRs, which is a feature already existing in the base verbs API. On the cosmetic side it changes ib_fmr_attr.page_size field to be named page_shift. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Add modify_device method to set node descriptionRoland Dreier2006-03-202-1/+48
| | | | | | | | | Add a modify_device method to mthca, which implements setting the node description. This makes the writable "node_desc" sysfs attribute work for Mellanox HCAs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Whitespace cleanupsRoland Dreier2006-03-209-26/+26
| | | | | | | Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Add device-specific support for resizing CQsRoland Dreier2006-03-207-52/+308
| | | | | | | Add low-level driver support for resizing CQs (both kernel and userspace) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Get rid of might_sleep() annotationsRoland Dreier2006-03-203-13/+0
| | | | | | | | | | | | | | | | | | | | | The might_sleep() annotations in mthca are silly -- they all occur shortly before calls that will end up in core functions like kmalloc() that will print the same warning in an unsafe context anyway. In fact, beyond cluttering the source, we're actually bloating text with CONFIG_DEBUG_SPINLOCK_SLEEP and/or CONFIG_PREEMPT_VOLUNTARY set. With both options set, getting rid of the might_sleep()s saves a lot: add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-171 (-171) function old new delta mthca_pd_alloc 132 109 -23 mthca_init_cq 969 946 -23 mthca_mr_alloc 592 568 -24 mthca_pd_free 67 42 -25 mthca_free_mr 219 194 -25 mthca_free_cq 570 545 -25 mthca_fmr_alloc 742 716 -26 Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Make functions that never fail return voidRoland Dreier2006-03-203-22/+15
| | | | | | | | | | | | | | | The function mthca_free_err_wqe() can never fail, so get rid of its return value. That means handle_error_cqe() doesn't have to check what mthca_free_err_wqe() returns, which means it can't fail either and doesn't have to return anything either. All this results in simpler source code and a slight object code improvement: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10) function old new delta mthca_free_err_wqe 83 81 -2 mthca_poll_cq 1758 1750 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: bump driver version and release dateRoland Dreier2006-02-131-2/+2
| | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Don't print debugging info until we have all valuesRoland Dreier2006-02-101-19/+19
| | | | | | | | | | | When debugging is enabled, the mthca_QUERY_DEV_LIM() firmware command function prints out some of the device limits that it queries. However the debugging prints happen before all of the fields are extracted from the firmware response, so some of the values that get printed are uninitialized junk. Move the prints to the end of the function to fix this. Signed-off-by: Roland Dreier <rolandd@cisco.com>