Commit Graph

70 Commits

Author SHA1 Message Date
David Gibson
82516263ce pseries: Don't expose PCIe extended config space on older machine types
bb9986452 "spapr_pci: Advertise access to PCIe extended config space"
allowed guests to access the extended config space of PCI Express devices
via the PAPR interfaces, even though the paravirtualized bus mostly acts
like plain PCI.

However, that patch enabled access unconditionally, including for existing
machine types, which is an unwise change in behaviour.  This patch limits
the change to pseries-2.9 (and later) machine types.

Suggested-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-14 11:54:17 +11:00
Cédric Le Goater
f7759e4331 ppc/xics: use the QOM interface to get irqs
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-01 11:23:39 +11:00
Paul Burton
62be393423 hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller
Add support for emulating the Xilinx AXI Root Port Bridge for PCI
Express as described by Xilinx' PG055 document. This is a PCIe
controller that can be used with certain series of Xilinx FPGAs, and is
used on the MIPS Boston board which will make use of this code.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
[yongbok.kim@imgtec.com:
  removed returning on !level,
  updated IRQ connection with GPIO logic,
  moved xilinx_pcie_init() to boston.c
  replaced stw_le_p() with pci_set_word()
  and other cosmetic changes]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-02-21 23:49:29 +00:00
Stefan Weil
5bb8590d37 include: Fix typos found by codespell
Add also a missing parenthesis in a comment.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-01-24 23:26:52 +03:00
David Gibson
5c4537bded spapr: Fix 2.7<->2.8 migration of PCI host bridge
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
from qemu-2.7 to the current version.  It split the device's MMIO
window into two pieces for 32-bit and 64-bit MMIO.

The patch included backwards compatibility code to convert the old
property into the new format.  However, the property value was also
transferred in the migration stream and compared with a (probably
unwise) VMSTATE_EQUAL.  So, the "raw" value from 2.7 is compared to
the new style converted value from (pre-)2.8 giving a mismatch and
migration failure.

Along with the actual field that caused the breakage, there are
several other ill-advised VMSTATE_EQUAL()s.  To fix forwards
migration, we read the values in the stream into scratch variables and
ignore them, instead of comparing for equality.  To fix backwards
migration, we populate those scratch variables in pre_save() with
adjusted values to match the old behaviour.

To permit the eventual possibility of removing this cruft from the
stream, we only include these compatibility fields if a new
'pre-2.8-migration' property is set.  We clear it on the pseries-2.8
machine type, which obviously can't be migrated backwards, but set it
on earlier machine type versions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-11-23 12:00:48 +11:00
David Gibson
357d1e3bc7 spapr: Improved placement of PCI host bridges in guest memory map
Currently, the MMIO space for accessing PCI on pseries guests begins at
1 TiB in guest address space.  Each PCI host bridge (PHB) has a 64 GiB
chunk of address space in which it places its outbound PIO and 32-bit and
64-bit MMIO windows.

This scheme as several problems:
  - It limits guest RAM to 1 TiB (though we have a limited fix for this
    now)
  - It limits the total MMIO window to 64 GiB.  This is not always enough
    for some of the large nVidia GPGPU cards
  - Putting all the windows into a single 64 GiB area means that naturally
    aligning things within there will waste more address space.
In addition there was a miscalculation in some of the defaults, which meant
that the MMIO windows for each PHB actually slightly overran the 64 GiB
region for that PHB.  We got away without nasty consequences because
the overrun fit within an unused area at the beginning of the next PHB's
region, but it's not pretty.

This patch implements a new scheme which addresses those problems, and is
also closer to what bare metal hardware and pHyp guests generally use.

Because some guest versions (including most current distro kernels) can't
access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
64 TiB.  This is broken into 1 TiB chunks.  The first 1 TiB contains the
PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs.  Each
subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
one PHB each.

This reduces the number of allowed PHBs (without full manual configuration
of all the windows) from 256 to 31, but this should still be plenty in
practice.

We also change some of the default window sizes for manually configured
PHBs to saner values.

Finally we adjust some tests and libqos so that it correctly uses the new
default locations.  Ideally it would parse the device tree given to the
guest, but that's a more complex problem for another time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:04:15 +11:00
David Gibson
daa2369903 spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
  - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
  - A 64-bit window which maps onto a large region somewhere high in PCI
    address space (traditionally this used an identity mapping from guest
    physical address to PCI address, but that's not always the case)

The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window, however.  At least some Linux versions expect
the two windows however, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.

This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window.  With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.

This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured.  The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).

So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified.  This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.

For now, this only adds the possibility of 64-bit windows.  The default
configuration still uses the legacy mode.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
David Gibson
6737d9ad79 spapr_pci: Delegate placement of PCI host bridges to machine type
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
and PAPR guests) to have numerous independent PHBs, each controlling a
separate PCI domain.

There are two ways of configuring the spapr-pci-host-bridge device: first
it can be done fully manually, specifying the locations and sizes of all
the IO windows.  This gives the most control, but is very awkward with 6
mandatory parameters.  Alternatively just an "index" can be specified
which essentially selects from an array of predefined PHB locations.
The PHB at index 0 is automatically created as the default PHB.

The current set of default locations causes some problems for guests with
large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
GPGPU cards via VFIO).  Obviously, for migration we can only change the
locations on a new machine type, however.

This is awkward, because the placement is currently decided within the
spapr-pci-host-bridge code, so it breaks abstraction to look inside the
machine type version.

So, this patch delegates the "default mode" PHB placement from the
spapr-pci-host-bridge device back to the machine type via a public method
in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
can do.

For now, this just changes where the calculation is done.  It doesn't
change the actual location of the host bridges, or any other behaviour.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
Alexey Kardashevskiy
4814401fa0 spapr_pci: Add numa node id
This adds a numa id property to a PHB to allow linking passed PCI device
to CPU/memory. It is up to the management stack to do CPU/memory pinning
to the node with the actual PCI device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Renamed property from "node" to "numa_node" to match the similar
 one in the pxb device]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:07 +10:00
Ladi Prosek
d4b84d564e Remove unused function declarations
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Peter Xu
cfc13df462 acpi: add DMAR scope definition for root IOAPIC
To enable interrupt remapping for intel IOMMU device, each IOAPIC device
in the system reported via ACPI MADT must be explicitly enumerated under
one specific remapping hardware unit. This patch adds the root-complex
IOAPIC into the default DMAR device.

Please refer to VT-d spec 8.3.1.1 for more information.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-20 19:30:27 +03:00
Markus Armbruster
121d07125b Clean up header guards that don't match their file name
Header guard symbols should match their file name to make guard
collisions less likely.  Offenders found with
scripts/clean-header-guards.pl -vn.

Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Markus Armbruster
20668fdebd spapr_pci: Include spapr.h instead of playing games with #error
include/hw/pci-host/spapr.h needs hw/ppc/spapr.h.  It checks whether
its header guard is defined, and errors out if it isn't.

Playing games with some other header's guard symbol is not a good
idea.  Just include the frackin' header already.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Peter Maydell
791b7d2340 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes

iommus can not be added with -device.
cleanups and fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 05 Jul 2016 11:18:32 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (30 commits)
  vmw_pvscsi: remove unnecessary internal msi state flag
  e1000e: remove unnecessary internal msi state flag
  vmxnet3: remove unnecessary internal msi state flag
  mptsas: remove unnecessary internal msi state flag
  megasas: remove unnecessary megasas_use_msi()
  pci: Convert msi_init() to Error and fix callers to check it
  pci bridge dev: change msi property type
  megasas: change msi/msix property type
  mptsas: change msi property type
  intel-hda: change msi property type
  usb xhci: change msi/msix property type
  change pvscsi_init_msi() type to void
  tests: add APIC.cphp and DSDT.cphp blobs
  tests: acpi: add CPU hotplug testcase
  log: Permit -dfilter 0..0xffffffffffffffff
  range: Replace internal representation of Range
  range: Eliminate direct Range member access
  log: Clean up misuse of Range for -dfilter
  pci_register_bar: cleanup
  Revert "virtio-net: unbreak self announcement and guest offloads after migration"
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05 16:48:24 +01:00
Alexey Kardashevskiy
ae4de14cd3 spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
This adds support for Dynamic DMA Windows (DDW) option defined by
the SPAPR specification which allows to have additional DMA window(s)

The "ddw" property is enabled by default on a PHB but for compatibility
the pseries-2.6 machine and older disable it.
This also creates a single DMA window for the older machines to
maintain backward migration.

This implements DDW for PHB with emulated and VFIO devices. The host
kernel support is required. The advertised IOMMU page sizes are 4K and
64K; 16M pages are supported but not advertised by default, in order to
enable them, the user has to specify "pgsz" property for PHB and
enable huge pages for RAM.

The existing linux guests try creating one additional huge DMA window
with 64K or 16MB pages and map the entire guest RAM to. If succeeded,
the guest switches to dma_direct_ops and never calls TCE hypercalls
(H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
and not waste time on map/unmap later. This adds a "dma64_win_addr"
property which is a bus address for the 64bit window and by default
set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware
uses and this allows having emulated and VFIO devices on the same bus.

This adds 4 RTAS handlers:
* ibm,query-pe-dma-window
* ibm,create-pe-dma-window
* ibm,remove-pe-dma-window
* ibm,reset-pe-dma-window
These are registered from type_init() callback.

These RTAS handlers are implemented in a separate file to avoid polluting
spapr_iommu.c with PCI.

This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs
and updates all references to dma_liobn. However this does not add
64bit LIOBN to the migration stream as in fact even 32bit LIOBN is
rather pointless there (as it is a PHB property and the management
software can/should pass LIOBNs via CLI) but we keep it for the backward
migration support.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Markus Armbruster
01c9742d9d pc: Eliminate PcPciInfo
PcPciInfo has two (ill-named) members: Range w32 is the PCI hole, and
w64 is the PCI64 hole.

Three users:

* I440FXState and MCHPCIState have a member PcPciInfo pci_info, but
  only pci_info.w32 is actually used.  This is confusing.  Replace by
  Range pci_hole.

* acpi_build() uses auto PcPciInfo pci_info to forward both PCI holes
  from acpi_get_pci_info() to build_dsdt().  Replace by two variables
  Range pci_hole, pci_hole64.  Rename acpi_get_pci_info() to
  acpi_get_pci_holes().

PcPciInfo is now unused; drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-07-04 14:52:10 +03:00
Marcel Apfelbaum
10d01f73e3 machine: remove iommu property
Since iommu devices can be created with '-device' there is
no need to keep iommu as machine and mch property.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:58 +03:00
Benjamin Herrenschmidt
27f2458245 ppc/xics: Replace "icp" with "xics" in most places
The "ICP" is a different object than the "XICS". For historical reasons,
we have a number of places where we name a variable "icp" while it contains
a XICSState pointer. There *is* an ICPState structure too so this makes
the code really confusing.

This is a mechanical replacement of all those instances to use the name
"xics" instead. There should be no functional change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[spapr_cpu_init has been moved to spapr_cpu_core.c, change there]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Efimov Vasily
401f2f3ef1 Q35: implement property interfece to several parameters
During creation of Q35 instance several parameters are set using direct access.
It violates Qemu device model. Correctly, the parameters should be handled as
object properties.

The patch adds four link type properties for fields:
mch.ram_memory
mch.pci_address_space
mch.system_memory
mch.address_space_io
And, it adds two size type properties for fields:
mch.below_4g_mem_size
mch.above_4g_mem_size

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Alexey Kardashevskiy
b3162f22cb spapr_pci: Add and export DMA resetting helper
This will be later used by the "ibm,reset-pe-dma-window" RTAS handler
which resets the DMA configuration to the defaults.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
David Gibson
a36304fdca spapr_pci: Remove finish_realize hook
Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
72700d7e73 spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
    1. automatically sizes its DMA window to match the host IOMMU
    2. checks if the attached VFIO container is backed by the
       VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
c1fa017c7e spapr_pci: Allow EEH on spapr-pci-host-bridge
Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
fbb4e98341 spapr_pci: Eliminate class callbacks
The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:10 +11:00
Eduardo Habkost
34be1e7c92 q35: Remove MCHPCIState.guest_info field
The field is not used for anything.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:13 +02:00
David Gibson
f93caaac36 spapr_pci: Allow PCI host bridge DMA window to be configured
At present the PCI host bridge (PHB) for the pseries machine type has a
fixed DMA window from 0..1GB (in PCI address space) which is mapped to real
memory via the PAPR paravirtualized IOMMU.

For better support of VFIO devices, we're going to want to allow for
different configurations of the DMA window.

Eventually we'll want to allow the guest itself to reconfigure the window
via the PAPR dynamic DMA window interface, but as a preliminary this patch
allows the user to reconfigure the window with new properties on the PHB
device.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
David Gibson
28e0204254 spapr: Merge sPAPREnvironment into sPAPRMachineState
The code for -machine pseries maintains a global sPAPREnvironment structure
which keeps track of general state information about the guest platform.
This predates the existence of the MachineState structure, but performs
basically the same function.

Now that we have the generic MachineState, fold sPAPREnvironment into
sPAPRMachineState, the pseries specific subclass of MachineState.

This is mostly a matter of search and replace, although a few places which
relied on the global spapr variable are changed to find the structure via
qdev_get_machine().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07 17:44:50 +02:00
Gerd Hoffmann
bafc90bdc5 q35: implement TSEG
TSEG provides larger amounts of SMRAM than the 128 KB available with
legacy SMRAM and high SMRAM.

Route access to tseg into nowhere when enabled, for both cpus and
busmaster dma, and add tseg window to smram region, so cpus can access
it in smm mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 19:45:13 +02:00
Gerd Hoffmann
68c77acfb1 q35: implement SMRAM.D_LCK
Once the SMRAM.D_LCK bit has been set by the guest several bits in SMRAM
and ESMRAMC become readonly until the next machine reset.  Implement
this by updating the wmask accordingly when the guest sets the lock bit.
As the lock it itself is locked down too we don't need to worry about
the guest clearing the lock bit.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:40 +02:00
Gerd Hoffmann
b66a67d751 q35: add config space wmask for SMRAM and ESMRAMC
Not all bits in SMRAM and ESMRAMC can be changed by the guest.
Add wmask defines accordingly and set them in mch_reset().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Gerd Hoffmann
7744752402 q35: fix ESMRAMC default
The cache bits in ESMRAMC are hardcoded to 1 (=disabled) according to
the q35 mch specs.  Add and use a define with this default.

While being at it also update the SMRAM default to use the name (no code
change, just makes things a bit more readable).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Paolo Bonzini
64130fa4a1 q35: implement high SMRAM
When H_SMRAME is 1, low memory at 0xa0000 is left alone by
SMM, and instead the chipset maps the 0xa0000-0xbffff window at
0xfeda0000-0xfedbffff.  This affects both the "non-SMM" view controlled
by D_OPEN and the SMM view controlled by G_SMRAME, so add two new
MemoryRegions and toggle the enabled/disabled state of all four
in mch_update_smram.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Paolo Bonzini
3de70c0899 hw/i386: remove smram_update
It's easier to inline it now that most of its work is done by the CPU
(rather than the chipset) through /machine/smram and the memory API.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Paolo Bonzini
f809c60512 target-i386: use memory API to implement SMRAM
Remove cpu_smm_register and cpu_smm_update.  Instead, each CPU
address space gets an extra region which is an alias of
/machine/smram.  This extra region is enabled or disabled
as the CPU enters/exits SMM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Paolo Bonzini
fe6567d5fd hw/i386: add a separate region that tracks the SMRAME bit
This region is exported at /machine/smram.  It is "empty" if
SMRAME=0 and points to SMRAM if SMRAME=1.  The CPU will
enable/disable it as it enters or exits SMRAM.

While touching nearby code, the existing memory region setup was
slightly inconsistent.  The smram_region is *disabled* in order to open
SMRAM (because the smram_region shows the low VRAM instead of the RAM
at 0xa0000).  Because SMRAM is closed at startup, the smram_region must
be enabled when creating the i440fx or q35 devices.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-05 17:36:39 +02:00
Michael Roth
7619c7b00c spapr_pci: add dynamic-reconfiguration option for spapr-pci-host-bridge
This option enables/disables PCI hotplug for a particular PHB.

Also add machine compatibility code to disable it by default for machine
types prior to pseries-2.4.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: move commas for compat fields]
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03 23:56:54 +02:00
Alexey Kardashevskiy
46c5874e9c spapr_pci: Make find_phb()/find_dev() public
This makes find_phb()/find_dev() public and changed its names
to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
or VFIO PCI error injection or VFIO EEH handling - in all these
cases there are RTAS calls which are addressed to BUID+config_addr
in IEEE1275 format.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03 23:56:51 +02:00
Alexey Kardashevskiy
3e1a01cb55 spapr_pci: Define default DMA window size as a macro
This gets rid of a magic constant describing the default DMA window size
for an emulated PHB.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03 23:56:50 +02:00
Paolo Bonzini
f2fbb40ea3 range: remove useless inclusions
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-04-30 16:05:48 +03:00
Gavin Shan
ee954280da sPAPR: Implement EEH RTAS calls
The emulation for EEH RTAS requests from guest isn't covered
by QEMU yet and the patch implements them.

The patch defines constants used by EEH RTAS calls and adds
callbacks sPAPRPHBClass::{eeh_set_option, eeh_get_state, eeh_reset,
eeh_configure}, which are going to be used as follows:

  * RTAS calls are received in spapr_pci.c, sanity check is done
    there.
  * RTAS handlers handle what they can. If there is something it
    cannot handle and the corresponding sPAPRPHBClass callback is
    defined, it is called.
  * Those callbacks are only implemented for VFIO now. They do ioctl()
    to the IOMMU container fd to complete the calls. Error codes from
    that ioctl() are transferred back to the guest.

[aik: defined RTAS tokens for EEH RTAS calls]
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 15:00:08 +01:00
Alexey Kardashevskiy
b194df478a spapr-pci: Enable huge BARs
At the moment sPAPR only supports 512MB window for MMIO BARs. However
modern devices might want bigger 64bit BARs.

This extends MMIO window from 512MB to 62GB (aligned to
SPAPR_PCI_WINDOW_SPACING) and advertises it in 2 records in
the PHB "ranges" property. 32bit gets the space from
SPAPR_PCI_MEM_WIN_BUS_OFFSET till the end of 4GB, 64bit gets the rest
of the space. If no space is left, 64bit range is not advertised.

The MMIO space size is set to old value of 0x20000000 by default
for pseries machines older than 2.3.

The approach changes the device tree which is a guest visible change, however
it won't break migration as:
1. we do not support migration to older QEMU versions
2. migration to newer QEMU will migrate the device tree as well and since
the new layout only extends the old one and does not change address mappigns,
no breakage is expected here too.

SLOF change is required to utilize this extension.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 14:59:54 +01:00
David Gibson
3e4ac96871 pseries: Limit PCI host bridge "index" value
pseries guests can have large numbers of PCI host bridges.  To avoid the
user having to specify a number of different configuration values for every
one, the device supports an "index" property which is a shorthand setting
the various window and configuration addresses from a predefined sensible
set.

There are some problems with the details at present:
  * The "index" propery is signed, but negative values will create PCI
windows below where we expect, potentially colliding with other devices
  * No limit is imposed on the "index" property and large values can
translate to extremely large window addresses.  With PCI passthrough in
particular this can mean we exceed various mapping and physical address
limits causing the guest host bridge to not work in strange ways.

This patch addresses this, by making "index" unsigned, and imposing a
limit.  Currently the limit allows indices from 0..255 which is probably
enough host bridges for the time being.  It's fairly easy to extend if
we discover we need more.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 14:59:54 +01:00
Alexander Graf
4d8fde1126 pci: Add generic PCIe host bridge
With simple exposure of MMFG, ioport window, mmio window and an IRQ line we
can successfully create a workable PCIe host bridge that can be mapped anywhere
and only needs to get described to the OS using whatever means it likes.

This patch implements such a "generic" host bridge. It handles 4 legacy IRQ
lines. MSIs need to be handled external to the host bridge.

This device is particularly useful for the "pci-host-ecam-generic" driver in
Linux.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-13 05:46:07 +00:00
Peter Maydell
2d6838e86c Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
Patch queue for ppc - 2014-09-08

Alexander Graf (11):
      PPC: KVM: Fix g3beige and mac99 when HV is loaded
      PPC: mac99: Move NVRAM to page boundary when necessary
      KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
      PPC: KVM: Use vm check_extension for pv hcall
      PPC: mac99: Fix core99 timer frequency
      PPC: mac_nvram: Remove unused functions
      PPC: mac_nvram: Allow 2 and 4 byte accesses
      PPC: mac_nvram: Split NVRAM into OF and OSX parts
      PPC: Mac: Move tbfreq into local variable
      PPC: Cuda: Use cuda timer to expose tbfreq to guest
      PPC: Fix default config ordering and add eTSEC for ppc64

Alexey Kardashevskiy (7):
      spapr: Move DT memory node rendering to a helper
      spapr: Use DT memory node rendering helper for other nodes
      spapr: Refactor spapr_populate_memory() to allow memoryless nodes
      spapr: Split memory nodes to power-of-two blocks
      spapr: Add a helper for node0_size calculation
      spapr: Fix ibm, associativity for memory nodes
      spapr_pci: Fix config space corruption

Anton Blanchard (2):
      spapr-vlan: Don't touch last entry in buffer list
      hypervisor property clashes with hypervisor node

Benjamin Herrenschmidt (2):
      loader: Add load_image_size() to replace load_image()
      spapr: Locate RTAS and device-tree based on real RMA

Bharat Bhushan (4):
      ppc: debug stub: Get trap instruction opcode from KVM
      ppc: synchronize excp_vectors for injecting exception
      ppc: Add software breakpoint support
      ppc: Add hw breakpoint watchpoint support

Gonglei (1):
      spapr: fix possible memory leak

Greg Kurz (1):
      spapr_pci: map the MSI window in each PHB

Nikunj A Dadhania (3):
      ppc: spapr-rtas - implement os-term rtas call
      spapr: add uuid/host details to device tree
      ppc/spapr: Fix MAX_CPUS to 255

Peter Maydell (1):
      hw/ppc/spapr_hcall.c: Fix typo in function names

Tom Musta (20):
      linux-user: Fix Stack Pointer Bug in PPC setup_rt_frame
      linux-user: Split PPC Trampoline Encoding from Register Save
      linux-user: Enable Signal Handlers on PPC64
      linux-user: Properly Dereference PPC64 ELFv1 Signal Handler Pointer
      linux-user: Implement do_setcontext for PPC64
      linux-user: Handle PPC64 ELFv2 Function Pointers
      target-ppc: Bug Fix: rlwinm
      target-ppc: Bug Fix: rlwnm
      target-ppc: Bug Fix: rlwimi
      target-ppc: Bug Fix: mullwo
      target-ppc: Bug Fix: mullw
      target-ppc: Bug Fix: mulldo OV Detection
      target-ppc: Bug Fix: srawi
      target-ppc: Bug Fix: srad
      target-ppc: Special Case of rlwimi Should Use Deposit
      target-ppc: Optimize rlwinm MB=0 ME=31
      target-ppc: Optimize rlwnm MB=0 ME=31
      target-ppc: Clean Up mullw
      target-ppc: Clean up mullwo
      target-ppc: Implement mulldo with TCG

# gpg: Signature made Mon 08 Sep 2014 11:51:15 BST using RSA key ID 03FEDC60
# gpg: Can't check signature: public key not found

* remotes/agraf/tags/signed-ppc-for-upstream: (52 commits)
  hypervisor property clashes with hypervisor node
  PPC: Fix default config ordering and add eTSEC for ppc64
  spapr_pci: map the MSI window in each PHB
  target-ppc: Implement mulldo with TCG
  target-ppc: Clean up mullwo
  target-ppc: Clean Up mullw
  target-ppc: Optimize rlwnm MB=0 ME=31
  target-ppc: Optimize rlwinm MB=0 ME=31
  target-ppc: Special Case of rlwimi Should Use Deposit
  spapr-vlan: Don't touch last entry in buffer list
  spapr_pci: Fix config space corruption
  PPC: Cuda: Use cuda timer to expose tbfreq to guest
  PPC: Mac: Move tbfreq into local variable
  PPC: mac_nvram: Split NVRAM into OF and OSX parts
  PPC: mac_nvram: Allow 2 and 4 byte accesses
  PPC: mac_nvram: Remove unused functions
  PPC: mac99: Fix core99 timer frequency
  PPC: KVM: Use vm check_extension for pv hcall
  KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd
  target-ppc: Bug Fix: srad
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-09-08 12:02:07 +01:00
Greg Kurz
8c46f7ec85 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Le Tan
a52a7fdfa7 intel-iommu: add Intel IOMMU emulation to q35 and add a machine option "iommu" as a switch
Add Intel IOMMU emulation to q35 chipset and expose it to the guest.
1. Add a machine option. Users can use "-machine iommu=on|off" in the command
line to enable/disable Intel IOMMU. The default is off.
2. Accroding to the machine option, q35 will initialize the Intel IOMMU and
use pci_setup_iommu() to setup q35_host_dma_iommu() as the IOMMU function for
the pci bus.
3. q35_host_dma_iommu() will return different address space according to the
bus_num and devfn of the device.

Signed-off-by: Le Tan <tamlokveer@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-08-28 23:10:22 +02:00
Gonglei
ef9f7b587d pci-host: update obsolete reference about piix_pci.c
piix_pci.c has been renamed into piix.c at commit
c0907c9e64

update the obsolete reference.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:06 +04:00
Alexey Kardashevskiy
9a321e9234 spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.

This makes use of new XICS ability to reuse interrupts.

This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.

This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.

This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.

This fixed traces to be more informative.

This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy
9fc34ada7e spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).

Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.

Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB

where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy
523e7b8ab8 spapr_iommu: Get rid of window_size in sPAPRTCETable
This removes window_size as it is basically a copy of nb_table
shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are
going to support windows as big as the entire RAM and this number
will be bigger that 32 capacity, we will have to do something
about @window_size anyway and removal seems to be the right way to go.

This removes dma_window_start/dma_window_size from sPAPRPHBState as
they are no longer used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00