Re: [Qemu-ppc] [RFC] QEMU/KVM PowerPC: virtio and guest endianness

On 04.10.2013, at 13:53, Paul Mackerras wrote:

> On Thu, Oct 03, 2013 at 04:29:52PM +0200, Greg Kurz wrote:
>> Hi,
>> 
>> There has been some work on this topic lately, but no agreement has
>> been reached yet. I want to consolidate the facts in a single mail
>> thread and restart the discussion. Please find below a recap of where
>> we stand today:
>> 
>> From a virtio POV, guest endianness is reflected by the endianness of
>> the interrupt vectors (ILE bit in the LPCR register). The guest kernel
>> relies on the H_SET_MODE_RESOURCE_LE hcall to set this bit, early in the
>> boot process.
>> 
>> Rusty sent a patchset to qemu-devel@ providing the necessary bits to
>> perform byteswapping in QEMU:
>> 
>> http://patchwork.ozlabs.org/patch/266451/
>> http://patchwork.ozlabs.org/patch/266452/
>> http://patchwork.ozlabs.org/patch/266450/
>> (plus other enablement patches for virtio drivers, not essential for
>> the discussion).
>> 
>> In non-KVM mode, QEMU implements the H_SET_MODE_RESOURCE_LE hcall and
>> updates its internal value for LPCR when the guest requests it. Rusty's
>> patchset works out-of-the-box in this mode: I could successfully set up
>> and use a 9p share over virtio transport (broader virtio testing still
>> to be done though).
>> 
>> When using KVM, the story is different: QEMU is no longer in the
>> endianness change path, provided KVM has the following patch from
>> Anton:
>> 
>> http://patchwork.ozlabs.org/patch/277079/
>> 
>> There are *at least* two approaches to bring back endianness knowledge
>> to QEMU: polling (1) and propagation (2).
>> 
>> (1) QEMU must retrieve LPCR from the kernel using the following API:
>> 
>> http://patchwork.ozlabs.org/patch/273029/
>> 
>> (2) KVM can return to the host and thus propagate
>> H_SET_MODE_RESOURCE_LE to QEMU. Laurent came up with a patch on
>> linuxppc-dev@ to do this:
>> 
>> http://patchwork.ozlabs.org/patch/278590/
>> 
>> I would say (1) is a standard and sane way of addressing the issue:
>> since the LPCR register value is held by KVM, it makes sense to
>> introduce an API to get/set it. Then, it is up to QEMU to use this API.
>> 
>> We can naively do the polling in all the places where byteswapping
>> matters: it is clearly suboptimal, especially since the LPCR_ILE bit
>> doesn't change very often. Rusty suggested we can retrieve it at virtio
>> device reset time and cache it, since an endianness change after the
>> devices have started to be used is nonsensical.
>> 
>> I have searched for an appropriate place to add the polling and I must
>> admit I did not find any... I am no QEMU expert but I suspect we would
>> need some kind of arch specific hook to be called from the virtio code
>> to do this... :-\ I hope I am wrong, please correct me if so.
>> 
>> On the other hand, (2) looks a bit hacky: KVM usually returns to the
>> host when it cannot fully handle the hcall. Propagating may look like
>> a useless path to follow from a KVM POV. From a QEMU POV, things are
>> different: propagation will trigger the fallback code in QEMU, which
>> already works in non-KVM mode. Nothing more to be done.
> 
> I don't mind particularly whether H_SET_MODE for the endianness
> setting gets handled in the kernel or in QEMU, but I don't think it
> should be handled in both.  If you want QEMU to know about the
> endianness setting immediately, make the kernel version do nothing and
> get QEMU to handle it -- which if KVM is enabled will mean iterating
> over all vcpus and getting them all to send the new LPCR setting to
> the kernel via the SET_ONE_REG ioctl.
> 
> However, I want the setting of breakpoint registers (CIABR and DAWR/X)
> via H_SET_MODE to happen in the kernel, preferably in real mode, since
> that can happen on context switch and thus needs to be quick.

I don't want to see a single hypercall be split across the QEMU/KVM barrier. So if there's a reasonable incentive to handle H_SET_MODE in KVM, we should handle all of it in KVM.
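
For reference, the caching idea quoted above (sample LPCR_ILE once at virtio device reset, then byteswap based on the cached value) could look roughly like the sketch below. This is only an illustration under stated assumptions, not code from any of the patches: struct vdev_sketch, read_lpcr_fn, the vdev_sketch_* helpers and the two stubs are hypothetical names, and the LPCR_ILE value is assumed to match the one in Linux's asm/reg.h. How read_lpcr would actually obtain the register (for example through the one-reg API from approach (1)) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ILE bit mask as defined in Linux's asm/reg.h. */
#define LPCR_ILE 0x0000000002000000ULL

/* Hypothetical stand-in for per-device virtio state. */
struct vdev_sketch {
    bool guest_is_le; /* cached at reset, not re-read on every access */
};

/* Hypothetical arch hook returning the guest's current LPCR value. */
typedef uint64_t (*read_lpcr_fn)(void);

/* Illustration-only stubs standing in for a real LPCR source. */
static uint64_t lpcr_le_stub(void) { return LPCR_ILE; }
static uint64_t lpcr_be_stub(void) { return 0; }

/* Detect host byte order at run time. */
static bool host_is_le(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Sample the ILE bit once, when the device is reset. */
static void vdev_sketch_reset(struct vdev_sketch *dev, read_lpcr_fn read_lpcr)
{
    assert(dev != NULL && read_lpcr != NULL);
    dev->guest_is_le = (read_lpcr() & LPCR_ILE) != 0;
}

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a 16-bit field between guest and host byte order. */
static uint16_t vdev_sketch_tswap16(const struct vdev_sketch *dev, uint16_t v)
{
    assert(dev != NULL);
    return dev->guest_is_le == host_is_le() ? v : swap16(v);
}
```

A caller would invoke vdev_sketch_reset() from the device reset path and then use vdev_sketch_tswap16() wherever a 16-bit ring field is read or written. Whoever ends up handling H_SET_MODE, the point of the sketch is only that the device code consults a cached flag rather than querying LPCR on each access.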


Alex




