Re: [RFC 7/11] virtio_pci: new, capability-aware driver.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Thu, 12 Jan 2012 08:00:10 +0200, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> On Thu, Jan 12, 2012 at 12:31:09PM +1030, Rusty Russell wrote:
> > If we use a 32-bit counter, we also get this though, right?
> > 
> > If counter has changed, it's a config interrupt...
> 
> But we need an exit to read the counter. We can put the counter
> in memory but this looks suspiciously like a simplified VQ -
> so why not use a VQ then?

Because now a driver first gets the data from config space.  But from
then on, they have to get it from the vq, and ignore the config space.
That's a bit weird.

> > > If we do require config VQ anyway, why not use it to notify
> > > guest of config changes? Guest could pre-post an in buffer
> > > and host uses that.
> > 
> > We could, but it's an additional burden on each device.  vqs are cheap,
> > but not free.  And the config area is so damn convenient...
> 
> Not if you start playing with counters, checking it twice,
> reinvent all kind of barriers ...

None of that appears inside the driver, though.  And let's be honest,
it's not *that* bad (very approx code):

static u32 vp_get_gen(struct virtio_pci_device *vp_dev)
{
        u32 gen;
        do {
                gen = ioread32(vp_dev->ioaddr + VIRTIO_PCI_CONFIG_GEN);
        } while (unlikely((gen & 1) == 1));

        virtio_rmb();
        return gen;
}

static bool vp_check_gen(struct virtio_pci_device *vp_dev, u32 gen)
{
        virtio_rmb();
        return ioread32(vp_dev->ioaddr + VIRTIO_PCI_CONFIG_GEN) == gen;
}

static void vp_get32(struct virtio_device *vdev, unsigned offset, u32 *buf)
{
	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
        u32 gen;

        do {
                gen = vp_get_gen(vdev);
                *buf = ioread32(vp_dev->ioaddr + VIRTIO_PCI_CONFIG(vp_dev) + offset);
        } while (unlikely(!vp_check_gen(vp_dev, gen)));
}
...

> > It was suggested by others, but I think TCP Acks are the classic one.
> > 12 + 14 + 20 + 40 = 86 bytes with virtio_net_hdr_mrg_rxbuf at the front.
> 
> That's only the simplest IPv4, right?
> Anyway, this spans multiple descriptors so this complicates allocation
> significantly.

Yes, I think any general-but-useful inline will need to span multiple
descriptors.  That's part of the fun!

Let's get totally crazy and implement our ring in stripes, like:

00 04 08 12 01 05 09 13 02 06 10 14 03 07 11 15

That way consecutive (32-byte) descriptors don't share a cacheline!
(Not serious... quiet.)

> > Yes, I'm thinking #define VIRTIO_F_VIRTIO2 (-1).  For PCI, this gets
> > mapped into a "are we using the new config layout?".  For others, it
> > gets mapped into a transport-specific feature.
> > 
> > (I'm sure you get it, but for the others) This is because I want to be
> > draw a clear line between all the legacy stuff at the same time, not
> > have to support part of it later because someone might not flip the
> > feature bit.
> 
> So my point is, config stuff and ring are completely separate,
> they are different layers.

Absolutely, and we should analyze them separately as well as together.

*But* for maintenance is far easier if we only have to test new
config+new ring and old config+old ring.  They do interact, because
remember, the allocation of the ring changes with new config, too...

Cheers,
Rusty.
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Photo]     [Yosemite]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]     [Find Someone Nice]     [Video 4 Linux]     [Linux Resources]
Add to Google Powered by Linux