| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3269ee0bd6686baf86630300d528500ac5b516d7 upstream.
At best the current code only seems to free the leaf pagetables and
the root. If you're unlucky enough to have a large gap (like any
QEMU guest with more than 3G of memory), only the first chunk of leaf
pagetables are freed (plus the root). This is a massive memory leak.
This patch re-writes the pagetable freeing function to use a
recursive algorithm and manages to not only free all the pagetables,
but does it without any apparent performance loss versus the current
broken version.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit ea2447f700cab264019b52e2b417d689e052dcfd upstream.
This patch is to prevent non-USB devices that have RMRRs associated with them from
being placed into the SI Domain during init. This fixes the issue where the RMRR info
for devices being placed in and out of the SI Domain gets lost.
Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6491d4d02893d9787ba67279595990217177b351 upstream.
The dma_pte_free_pagetable() function will only free a page table page
if it is asked to free the *entire* 2MiB range that it covers. So if a
page table page was used for one or more small mappings, it's likely to
end up still present in the page tables... but with no valid PTEs.
This was fine when we'd only be repopulating it with 4KiB PTEs anyway
but the same virtual address range can end up being reused for a
*large-page* mapping. And in that case were were trying to insert the
large page into the second-level page table, and getting a complaint
from the sanity check in __domain_mapping() because there was already a
corresponding entry. This was *relatively* harmless; it led to a memory
leak of the old page table page, but no other ill-effects.
Fix it by calling dma_pte_clear_range (hopefully redundant) and
dma_pte_free_pagetable() before setting up the new large page.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Ravi Murty <Ravi.Murty@intel.com>
Tested-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3e7abe2556b583e87dabda3e0e6178a67b20d06f upstream.
When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below. It looks like a legitimate lock
ordering problem:
- domain_context_mapping_one() takes iommu->lock and calls
iommu_support_dev_iotlb(), which takes device_domain_lock (inside
iommu->lock).
- domain_remove_one_dev_info() starts by taking device_domain_lock
then takes iommu->lock inside it (near the end of the function).
So this is the classic AB-BA deadlock. It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.
BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:
iommu_support_dev_iotlb()
domain_remove_dev_info()
lock device_domain_lock
find info
unlock device_domain_lock
lock device_domain_lock
find same info
unlock device_domain_lock
free_devinfo_mem(info)
do stuff with info after it's free
However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.
Anyway here's the full lockdep output that prompted all of this:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.39.1+ #1
-------------------------------------------------------
bash/13954 is trying to acquire lock:
(&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
but task is already holding lock:
(device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (device_domain_lock){-.-...}:
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
[<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
[<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
[<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
[<ffffffff81cab204>] pci_iommu_init+0x16/0x41
[<ffffffff81002165>] do_one_initcall+0x45/0x190
[<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
[<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10
-> #0 (&(&iommu->lock)->rlock){......}:
[<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
[<ffffffff812f8b42>] device_notifier+0x72/0x90
[<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
[<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
[<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
[<ffffffff81373ccf>] device_release_driver+0x2f/0x50
[<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
[<ffffffff813724ac>] drv_attr_store+0x2c/0x30
[<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
[<ffffffff8117569e>] vfs_write+0xce/0x190
[<ffffffff811759e4>] sys_write+0x54/0xa0
[<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
6 locks held by bash/13954:
#0: (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
#1: (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
#2: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
#3: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
#4: (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
#5: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230
stack backtrace:
Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
Call Trace:
[<ffffffff810993a7>] print_circular_bug+0xf7/0x100
[<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
[<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
[<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
[<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
[<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
[<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
[<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
[<ffffffff812f8b42>] device_notifier+0x72/0x90
[<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
[<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
[<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
[<ffffffff81373ccf>] device_release_driver+0x2f/0x50
[<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
[<ffffffff813724ac>] drv_attr_store+0x2c/0x30
[<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
[<ffffffff8117569e>] vfs_write+0xce/0x190
[<ffffffff811759e4>] sys_write+0x54/0xa0
[<ffffffff81579a82>] system_call_fastpath+0x16/0x1b
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 4399c8bf2b9093696fa8160d79712e7346989c46 upstream.
If target_level == 0, current code breaks out of the while-loop if
SUPERPAGE bit is set. We should also break out if PTE is not present.
If we don't do this, KVM calls to iommu_iova_to_phys() will cause
pfn_to_dma_pte() to create mapping for 4KiB pages.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 8140a95d228efbcd64d84150e794761a32463947 upstream.
set dmar->iommu_superpage field to the smallest common denominator
of super page sizes supported by all active VT-d engines. Initialize
this field in intel_iommu_domain_init() API so intel_iommu_map() API
will be able to use iommu_superpage field to determine the appropriate
super page size to use.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 292827cb164ad00cc7689a21283b1261c0b6daed upstream.
iommu_unmap() API expects IOMMU drivers to return the actual page order
of the address being unmapped. Previous code was just returning page
order passed in from the caller. This patch fixes this problem.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
If CONFIG_PM is not set, init_iommu_pm_ops() introduced by commit
134fac3f457f3dd753ecdb25e6da3e5f6629f696 (PCI / Intel IOMMU: Use
syscore_ops instead of sysdev class and sysdev) is not defined
appropriately. Fix this issue.
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://git.infradead.org/iommu-2.6:
intel-iommu: Fix off-by-one in RMRR setup
intel-iommu: Add domain check in domain_remove_one_dev_info
intel-iommu: Remove Host Bridge devices from identity mapping
intel-iommu: Use coherent DMA mask when requested
intel-iommu: Dont cache iova above 32bit
intel-iommu: Speed up processing of the identity_mapping function
intel-iommu: Check for identity mapping candidate using system dma mask
intel-iommu: Only unlink device domains from iommu
intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
intel-iommu: Flush unmaps at domain_exit
intel-iommu: Remove obsolete comment from detect_intel_iommu
intel-iommu: fix VT-d PMR disable for TXT on S3 resume
|
| |
| |
| |
| |
| |
| |
| |
| | |
We were mapping an extra byte (and hence usually an extra page):
iommu_prepare_identity_map() expects to be given an 'end' argument which
is the last byte to be mapped; not the first byte *not* to be mapped.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The comment in domain_remove_one_dev_info() states "No need to compare
PCI domain; it has to be the same". But for the si_domain that isn't
going to be true, as it consists of all the PCI devices that are
identity mapped thus multiple PCI domains can be in si_domain. The
code needs to validate the PCI domain too.
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
that do not use the IOMMU causes a kernel panic. Fix that by not
inserting those devices into the si_domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The __intel_map_single function is not honoring the passed in DMA mask.
This results in not using the coherent DMA mask when called from
intel_alloc_coherent().
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When there are a large count of PCI devices, and the pass through
option for iommu is set, much time is spent in the identity_mapping
function hunting though the iommu domains to check if a specific
device is "identity mapped".
Speed up the function by checking the cached info to see if
it's mapped to the static identity domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The identity mapping code appears to make the assumption that if the
devices dma_mask is greater than 32bits the device can use identity
mapping. But that is not true: take the case where we have a 40bit
device in a 44bit architecture. The device can potentially receive a
physical address that it will truncate and cause incorrect addresses
to be used.
Instead check to see if the device's dma_mask is large enough
to address the system's dma_mask.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit a97590e5 added unlinking domains from iommus to reciprocate the
iommu from domains unlinking that was already done. We actually want
to only do this for device domains and never for the static
identity map domain or VM domains. The SI domain is special and
never freed, while VM domain->id lives in their own special address
space, separate from iommu->domain_ids.
In the current code, a VM can get domain->id zero, then mark that
domain unused when unbound from pci-stub. This leads to DMAR
write faults when the device is re-bound to the host driver.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
- size >= 2MiB, and
- virtual address aligned to 2MiB, and
- physical address aligned to 2MiB, and
- on hardware that supports superpages.
(and likewise for larger superpages).
We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.
Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.
Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.
==
The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.
I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.
The original sequence of commits leading to identical code was:
Youquan Song (3):
intel-iommu: super page support
intel-iommu: Fix superpage alignment calculation error
intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
David Woodhouse (4):
intel-iommu: Precalculate superpage support for dmar_domain
intel-iommu: Fix hardware_largepage_caps()
intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We typically batch unmaps to be lazily flushed out at
regular intervals. When we destroy a domain, we need
to force a flush of these lazy unmaps to be sure none
reference the domain we're about to free.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch is a follow on to https://lkml.org/lkml/2011/3/21/239, which
was merged as commit 51a63e67da6056c13b5b597dcc9e1b3bd7ceaa55.
This patch adds support for S3, as pointed out by Chris Wright.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| | | |
| \ | |
|\ \ \
| |_|/
|/| |
| | |
| | |
| | |
| | |
| | |
| | | |
'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40
Conflicts:
arch/x86/include/asm/amd_iommu_types.h
arch/x86/kernel/amd_iommu.c
arch/x86/kernel/amd_iommu_init.c
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch moves the relevant declarations from the local
header file in drivers/pci to a more accessible locations so
that it can be used by the AMD IOMMU driver too.
The file is named pci-ats.h because support for the PCI PRI
capability will also be added there in a later patch-set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* git://git.infradead.org/iommu-2.6:
intel_iommu: disable all VT-d PMRs when TXT launched
intel-iommu: Fix get_domain_for_dev() error path
intel-iommu: Unlink domain from iommu
intel-iommu: Fix use after release during device attach
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled,
on each VT-d engine, after DMA remapping is enabled on the engines.
This is because the behavior of having both enabled is not deterministic
and because, if TXT has been used to launch the kernel, the PMRs may be
programmed to cover memory regions that will be used for DMA.
Under some circumstances (certain quirks detected, lack of multiple
devices, etc.), the current code does not set up DMA remapping on some
VT-d engines. In such cases it also skips disabling the PMRs. This
causes failures when the kernel is launched with TXT (most often this
occurs on the graphics engine and results in colored vertical bars on
the display).
This patch detects when the kernel has been launched with TXT and then
disables the PMRs on all VT-d engines. In some cases where the reason
that remapping is not being enabled is due to possible ACPI DMAR table
errors, the VT-d engine addresses may not be correct and thus not able
to be safely programmed even to disable PMRs. Because part of the TXT
launch process is the verification of these addresses, it will always be
safe to disable PMRs if the TXT launch has succeeded and hence only
doing this in such cases.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If we run out of domain_ids and fail iommu_attach_domain(), we
fall into domain_exit() without having setup enough of the
domain structure for this to do anything useful. In fact, it
typically runs off into the weeds walking the bogus domain->devices
list. Just free the domain.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When we remove a device, we unlink the iommu from the domain, but
we never do the reverse unlinking of the domain from the iommu.
This means that we never clear iommu->domain_ids, eventually leading
to resource exhaustion if we repeatedly bind and unbind a device
to a driver. Also free empty domains to avoid a resource leak.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Obtain the new pgd pointer before releasing the page containing this
value.
Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |/
|/|
| |
| |
| |
| | |
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
| |
| |
| |
| |
| |
| | |
Scripted with coccinelle.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
The Intel IOMMU subsystem uses a sysdev class and a sysdev for
executing iommu_suspend() after interrupts have been turned off
on the boot CPU (during system suspend) and for executing
iommu_resume() before turning on interrupts on the boot CPU
(during system resume). However, since both of these functions
ignore their arguments, the entire mechanism may be replaced with a
struct syscore_ops object which is simpler.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
|
|\
| |
| |
| |
| |
| | |
* git://git.infradead.org/iommu-2.6:
intel-iommu: Use symbolic values instead of magic numbers in Lenovo w/a
intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 9eecabcb9a924f1e11ba670365fd4babe423045c ("intel-iommu: Abort
IOMMU setup for igfx if BIOS gave no shadow GTT space") uses a bunch of
magic numbers. Provide #defines for those to make it look slightly saner.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| | |
Yet another BIOS bug; Lenovo this time (X201). Red Hat bug #593516.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw':
drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available
drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here
Move the offending function (and its siblings) to top-of-file, remove the
forward declaration.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441
Reported-by: Martin Mokrejs <mmokrejs@ribosome.natur.cuni.cz>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| | |
* git://git.infradead.org/iommu-2.6:
intel-iommu: Fix 32-bit build warning with __cmpxchg()
intr-remap: allow disabling source id checking
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
drivers/pci/intel-iommu.c: In function 'dma_pte_addr':
drivers/pci/intel-iommu.c:239: warning: passing argument 1 of '__cmpxchg64' from incompatible pointer type
It seems that __cmpxchg64() now cares about the type of its pointer argument,
so give it a (uint64_t *) instead of a pointer to a structure which contains
only that.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (30 commits)
DMAENGINE: at_hdmac: locking fixlet
DMAENGINE: pch_dma: kill another usage of __raw_{read|write}l
dma: dmatest: fix potential sign bug
ioat2: catch and recover from broken vtd configurations v6
DMAENGINE: add runtime slave control to COH 901 318 v3
DMAENGINE: add runtime slave config to DMA40 v3
DMAENGINE: generic slave channel control v3
dmaengine: Driver for Topcliff PCH DMA controller
intel_mid: Add Mrst & Mfld DMA Drivers
drivers/dma: Eliminate a NULL pointer dereference
dma/timb_dma: compile warning on 32 bit
DMAENGINE: ste_dma40: support older silicon
DMAENGINE: ste_dma40: support disabling physical channels
DMAENGINE: ste_dma40: no disabled phy channels on ux500
DMAENGINE: ste_dma40: fix suspend bug
DMAENGINE: ste_dma40: add DB8500 memcpy channels
DMAENGINE: ste_dma40: no flow control on memcpy
DMAENGINE: ste_dma40: arch updates for LCLA and LCPA
DMAENGINE: ste_dma40: allocate LCLA dynamically
DMAENGINE: ste_dma40: no premature stop
...
Fix up trivial conflicts in arch/arm/mach-ux500/devices-db8500.c
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
On some platforms (MacPro3,1) the BIOS assigns the ioatdma device to the
incorrect iommu causing faults when the driver initializes. Add a quirk
to catch this misconfiguration and try falling back to untranslated
operation (which works in the MacPro3,1 case).
Assuming there are other platforms with misconfigured iommus teach the
ioatdma driver to treat initialization failures as non-fatal (just fail
the driver load and emit a warning instead of triggering a BUG_ON).
This can be classified as a boot regression since 2.6.32 on affected
platforms since the ioatdma module did not autoload prior to that
kernel.
Cc: <stable@kernel.org>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Reported-by: Chris Li <lkml@chrisli.org>
Tested-by: Chris Li <lkml@chrisli.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| | |
This patch allows IOMMU users to determine whether the
hardware and software support safe, isolated interrupt
remapping. Not all Intel IOMMUs have the hardware, and the
software for AMD is not there yet.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Certain revisions of this chipset appear to be broken. There is a shadow
GTT which mirrors the real GTT but contains pre-translated physical
addresses, for performance reasons. When a GTT update happens, the
translations are done once and the resulting physical addresses written
back to the shadow GTT.
Except sometimes, the physical address is actually written back to the
_real_ GTT, not the shadow GTT. Thus we start to see faults when that
physical address is fed through translation again.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
stanse found the following double lock.
In get_domain_for_dev:
spin_lock_irqsave(&device_domain_lock, flags);
domain_exit(domain);
domain_remove_dev_info(domain);
spin_lock_irqsave(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
This happens when the domain is created by another CPU at the same time
as this function is creating one, and the other CPU wins the race to
attach it to the device in question, so we have to destroy our own
newly-created one.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit a99c47a2 "intel-iommu: errors with smaller iommu widths" replace the
dmar_domain->pgd with the first entry of page table when iommu's supported
width is smaller than dmar_domain's. But it use physical address directly
for new dmar_domain->pgd...
This result in KVM oops with VT-d on some machines.
Reported-by: Allen Kay <allen.m.kay@intel.com>
Cc: Tom Lyon <pugs@cisco.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://git.infradead.org/iommu-2.6:
intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
intel-iommu: Combine the BIOS DMAR table warning messages
panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
panic: Allow warnings to set different taint flags
intel-iommu: intel_iommu_map_range failed at very end of address space
intel-iommu: errors with smaller iommu widths
intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
intel-iommu: use physfn to search drhd for VF
intel-iommu: Print out iommu seq_id
intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
intel-iommu: Avoid global flushes with caching mode.
intel-iommu: Use correct domain ID when caching mode is enabled
intel-iommu mistakenly uses offset_pfn when caching mode is enabled
intel-iommu: use for_each_set_bit()
intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
intel_iommu_map_range() doesn't allow allocation at the very end of the
address space; that code has been simplified and corrected.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When using iommu_domain_alloc with the Intel iommu, the domain address
width is always initialized to 48 bits (agaw 2). This domain->agaw value
is then used by pfn_to_dma_pte to (always) build a 4 level page table.
However, not all systems support iommu width of 48 or 4 level page tables.
In particular, the Core i5-660 and i5-670 support an address width of 36
bits (not 39!), an agaw of only 1, and only 3 level page tables.
This version of the patch simply lops off extra levels of the page tables
if the agaw value of the iommu is less than what is currently allocated
for the domain (in intel_iommu_attach_device). If there were already
allocated addresses above what the new iommu can handle, EFAULT is
returned.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
more info on system with more than one IOMMU
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
While it may be efficient on real hardware, emulation of global
invalidations is very expensive as all shadow entries must be examined.
This patch changes the behaviour when caching mode is enabled (which is
the case when IOMMU emulation takes place). In this case, page specific
invalidation is used instead.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In caching-mode mappings of pages (changes from non-present to present)
require invalidation.
Currently, this IOTLB flush is performed with domain ID of zero.
This is not according to the VT-d spec and causes big problems for
emulating software.
This patch uses the correct domain ID in IOTLB flushes.
Device IOTLB invalidation is performed only on present to non-present
changes. This decision is now based on explicit parameter instead of
zero domain-ID.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
intel_map_sg used offset_pfn which was set to zero when invalidating the IOTLB.
intel_map_sg now uses size variable for this matter.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Replace open-coded loop with for_each_set_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch changes the iommu-api functions for mapping and
unmapping page ranges to use the new page-size based
interface. This allows to remove the range based functions
later.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|