Re: rfc seamless migration

Hi,

Alon Levy wrote on Sun, 10 Jun 2012 at 17:44 +0300:
> On Sun, Jun 10, 2012 at 04:32:54PM +0200, Hans de Goede wrote:
> > Hi,
> > 
> > On 06/10/2012 11:05 AM, Yonit Halperin wrote:
> > >Hi,
> > >
> > >Since the qemu team rejected integrating spice connection migration into the qemu migration process, we are left with a solution that involves libvirt and passes data from the src to the target via the client. Before I continue with the implementation, I'd like to hear your comments on the details:
> > >
> > >Here is a reminder about the problems we face:
> > >(1) Loss of data: we would like the client to continue the connection from the same point the vm was stopped. For example, we want any usb/smartcard devices to stay attached, and we don't want to lose any data that was sent from the client to the vm, or partial data that was read from a device, but hasn't reached its destination before migration.
> > >
> > >(2) The qemu process in the src side can be closed by libvirt as soon as the migration state changes to "completed". Thus, we can't reliably pass any data between the src server and the client after migration has completed.
> > >
> > >These problems can be addressed by the following:
> > >Add a qmp event for spice migration completion. libvirt will need to wait not only for qemu migration completion, but also for this qmp event, before it closes the src qemu.
> > >Spice needs to know whether libvirt supports this, in order to decide which migration approach to take (semi-seamless or seamless). To this end, we will add a new parameter to the spice configuration in the qemu command line (e.g., seamless-migration=on), and if it is set by libvirt we can assume libvirt will wait for spice migration.
> > >After qemu migration is completed, the src server will pass migration data to the target via the client/s. When the clients disconnect from the src and switch completely to the target, we send the new qmp event.
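A minimal sketch of the decision libvirt would have to make under this proposal, i.e. when it is safe to close the src qemu. The class and field names are hypothetical and only model the two conditions described above; this is not libvirt or qemu code.

```python
from dataclasses import dataclass


@dataclass
class SrcQemuState:
    # Hypothetical model: True when qemu was started with the proposed
    # seamless-migration=on spice parameter.
    seamless_migration: bool
    # Set when qemu reports that VM migration has completed.
    qemu_migration_completed: bool = False
    # Set when the proposed spice-migration-completed qmp event arrives.
    spice_migration_completed: bool = False


def may_close_src(state: SrcQemuState) -> bool:
    """Return True once the src qemu process may be closed."""
    if not state.qemu_migration_completed:
        return False
    if state.seamless_migration:
        # Seamless: also wait for the spice migration event.
        return state.spice_migration_completed
    # Semi-seamless: qemu migration completion is enough.
    return True
```

With seamless migration enabled, may_close_src() stays False after qemu migration completes until the spice event is also recorded.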
> > >
> > >
> > >migration data transfer
> > >=======================
> > >The existing MSG_MIGRATE pathway supports sending all pending outgoing data from the client to the server, and vice versa, before we fill the migration data.
> > >Each channel defines its own migration data.
> > >(1) MSG_MIGRATE is the last message that is sent from the src server channel to the client, before MIGRATE_DATA.
> > >(2) If the message flags have MIGRATE_NEED_FLUSH, the client writes all its outgoing data, and then sends FLUSH_MARK to the server.
> > >(3) Then the client channel waits for the MIGRATE_DATA message, and does nothing else.
> > >(4) When it receives the message, it switches to the target completely and passes it the migration data.
> > >
> > >(1) server channel--->MSG_MIGRATE...in-flight messages--->client
> > >(2) client channel-->MSGC_FLUSH_MARK...in-flight messages-->server
> > >(3) server channel-->MSG_MIGRATE_DATA-->client
> > >(4) client channel-->MSGC_MIGRATE_DATA-->target server
> > >
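The four steps above can be sketched as a small client-side state machine. This is only an illustration: the class, the list-based "sockets" and the flag's bit value are assumptions, and only the message names come from the description above.

```python
from enum import Enum, auto

# Assumed bit value, for illustration only.
MIGRATE_NEED_FLUSH = 1 << 0


class State(Enum):
    CONNECTED = auto()
    WAIT_MIGRATE_DATA = auto()
    SWITCHED = auto()


class ClientChannel:
    """Hypothetical model of one client channel during migration."""

    def __init__(self):
        self.state = State.CONNECTED
        self.outgoing = []        # data queued but not yet written to src
        self.sent_to_src = []     # stands in for the src connection
        self.sent_to_target = []  # stands in for the target connection

    def on_msg_migrate(self, flags):
        # Steps (1)-(2): MSG_MIGRATE arrived; flush pending data if asked,
        # then mark the end of the client's stream with MSGC_FLUSH_MARK.
        if flags & MIGRATE_NEED_FLUSH:
            self.sent_to_src.extend(self.outgoing)
            self.outgoing.clear()
            self.sent_to_src.append("MSGC_FLUSH_MARK")
        # Step (3): from now on only MIGRATE_DATA is expected.
        self.state = State.WAIT_MIGRATE_DATA

    def on_msg_migrate_data(self, blob):
        # Step (4): forward the data unchanged and switch to the target.
        if self.state is not State.WAIT_MIGRATE_DATA:
            raise RuntimeError("MIGRATE_DATA received before MSG_MIGRATE")
        self.sent_to_target.append(("MSGC_MIGRATE_DATA", blob))
        self.state = State.SWITCHED
```

Note that the client never inspects the blob in this sketch; it only relays it.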
> > >Obligatory migration data:
> > >-------------------------
> > >(1) agent/spicevmc/smartcard write buffer. i.e., data that reached the server after savevm, and thus was not written to the device.
> > >Currently, spicevmc and smartcard do not have a write buffer, but since buffers can reach the server after savevm, they should have one. I'm not sure whether, even today, they should attempt to write to the guest if it is stopped. The agent code can also write to the guest even if it is stopped; I think that is a bug.
> > >(2) agent/smartcard partial data that had been read from the device and wasn't sent to the client since its reading hasn't completed.
> > >Currently we don't have such data for spicevmc, because we push to the client any amount of data we read. In the future we might want to control the rate and the size of the data we send/receive, and then we will have an outgoing buffer.
> > 
> > I'm still not a big fan of the concept of server data going through the client; this means the server
> > will need to seriously sanity check what it receives, to avoid potential new attacks on it.
> > 
> > I'm wondering why not do the following:
> > 
> > 1) spicevmc device gets a savevm call, and tells spice-server to send a message to the client telling it
> > to stop sending more data to *this* server.
> > 2) client sends an ack to the server in response to the stop-sending-data message
> > 3) server waits for ack.
> > 4) savevm continues only after ack, which means all data which was in flight has been received.
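A minimal sketch of the server side of this stop/ack proposal, assuming an ordered transport so that the ack arrives after everything the client sent before it. The message names STOP_SENDING and STOP_ACK and the class itself are hypothetical, chosen only to illustrate steps 1-4.

```python
class VmcServer:
    """Hypothetical model of a spicevmc server channel around savevm."""

    def __init__(self):
        self.to_client = []       # messages queued for the client
        self.device_writes = []   # data delivered to the guest device
        self.stop_requested = False
        self.stop_acked = False

    def on_savevm(self):
        # Step 1: ask the client to stop sending data to this server.
        self.stop_requested = True
        self.to_client.append("STOP_SENDING")

    def on_client_message(self, msg):
        # Steps 2-3: data received before the ack is still in-flight data
        # and is written to the device; the ack itself ends the stream.
        if msg == "STOP_ACK":
            self.stop_acked = True
        else:
            self.device_writes.append(msg)

    def savevm_may_continue(self):
        # Step 4: savevm proceeds only once the ack has been seen.
        return self.stop_requested and self.stop_acked
```

The time spent waiting in savevm_may_continue() is what the qemu-devel objection quoted below is about.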
> 
> Have you seen the qemu-devel thread Yonit referred to in the beginning?
> Let me quote:
> "
> Spice is *not* getting a hook in migration where it gets to add
> arbitrary amounts of downtime to the migration traffic.  That's a
> terrible idea.
> "
> 
> It didn't continue any better.
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg00559.html
> 

Rather naïve question: has anybody tried to push two-phase migration to
qemu, first just the data needed for VM sync and second the other stuff not
needed to actually run the VM, including spice-server state?

The prospect of any data going client -> src -> client -> dst doesn't sound
like a wise use of high-latency WAN links.

BTW when I read this thread, it seems to me that semi-seamless migration
would play better with WAN conditions - if it is feasible to make
usbredir work with it...

David

> > 
> > No more reason for obligatory data 1.
> > And as you already point out, 2 is not an issue atm.
> > 
> > So no more obligatory reason to have server *state* pass through the client, which I still believe
> > is just asking for security vulnerabilities.
> > 
> > 
> > >Optional migration data:
> > >--------------
> > >- primary surface lossy region(*), or its extents
> > >If we don't send it to the client, and jpeg is enabled, we will need to resend the primary surface after migration, or set the lossy region to the whole surface, and then each non-opaque rendering operation that involves the surface will require resending parts of it losslessly.
> > 
> > So this needs to be sent to the client, but not back to the server?
> > 
> > >- list of off-screen surface ids that have been sent to the client, and their lossy regions.
> > >By keeping this data we will avoid on-demand resending of surfaces that already exist on the client side.
> > 
> > The client already knows which off-screen surface ids it has received, so it can just
> > send these to the new server without having to receive them from the old one first.
> > 
> > >- bitmap cache - list of bitmap ids + some internal cache information for each bitmap.
> > 
> > idem.
> > 
> > >- active video streams: ids, destination box, etc.
> > 
> > idem.
> > 
> > >- session bandwidth (low/high): we don't want to perform the main channel net test after the migration is completed, because it can take time (we can't do it during the migration because the main loop is not available). So we assume the bandwidth classification will stay the same. Once we have dynamic monitoring of bandwidth, we can drop this.
> > 
> > This I can live with being sent through the client, though not as opaque data; it should have
> > a special command. This could be useful in non-migration cases too, if the client
> > somehow already knows the channel characteristics.
> > 
> > 
> > >
> > >Though the above data is optional, part of it is important for avoiding a slow start of the connection to target (e.g., sending the primary lossy region, in order to avoid resending parts of it).
> > >
> > >In addition, if we wish to keep the client channels' state the same, and not require them (1) to send initialization data to the server, and (2) to reset part of their state, we should also migrate other server state details, like:
> > >- the serial of the last message sent from the display channel
> > >- main channel agent data tokens state
> > >- size of the images cache (this is usually set by the client upon new connection).
> > >Including such information in the migration data will allow us to keep the migration logic in the server. The alternative is that the client resets part of its state after migration, either on its own initiative or in response to specific messages sent from the server (which may require a new set of messages).
> > >
> > >(*) lossy-region=the region on the surface that contains bitmaps that were compressed using jpeg
> > >
> > >Transparency of migration data:
> > >------------------------------
> > >I think that the migration data shouldn't be part of spice protocol, and that it should be opaque to the client, for the following reasons:
> > 
> > As said before, I think that migration data should not be sent through the spice protocol *at all*!
> > 
> > >(a) The client is only a mediator, and it has nothing to do with the data content.
> > >(b) If the migration data of each channel is part of the spice protocol, every minor change to the migration data of one channel will require a new message and capability, and will make supporting backward compatibility in migration more cumbersome, as it will involve the client as well. Moreover, if the client supports only migration data of ver x, and the src and target both support ver x+1, we will suffer from data loss.
> > >(c) As for security issues, I don't think that it should raise a problem since the client is trusted by both the src and the target.
> > 
> > The client is trusted to access the *vm*, not the *host*, and this allows attacks on spice-server,
> > which is running on the *host*.
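To make the opaque-data argument in (a) and (b), and the sanity checking asked for above, a bit more concrete, here is a minimal sketch of a versioned blob that the client relays without parsing. The header layout (magic + version), the constants and all function names are assumptions for illustration, not the spice wire format.

```python
import struct

# Assumed header: two little-endian uint32 fields, magic then version.
HEADER = struct.Struct("<II")
MAGIC = 0x4D494752   # arbitrary illustrative value
VERSION = 1


def pack_migrate_data(payload: bytes, version: int = VERSION) -> bytes:
    """Src server: prefix the channel's migration payload with a header."""
    return HEADER.pack(MAGIC, version) + payload


def client_relay(blob: bytes) -> bytes:
    """Client: pass the blob on unchanged; it never looks inside."""
    return blob


def unpack_migrate_data(blob: bytes, max_version: int = VERSION) -> bytes:
    """Target server: sanity check the header before using the payload."""
    if len(blob) < HEADER.size:
        raise ValueError("migration data truncated")
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad migration data magic")
    if version > max_version:
        raise ValueError("unsupported migration data version")
    return blob[HEADER.size:]
```

A header check like this only rejects malformed or unknown-version blobs; the payload itself would still need per-channel validation, since it comes from the client side of the connection.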
> > 
> > 
> > Regards,
> > 
> > Hans
> _______________________________________________
> Spice-devel mailing list
> Spice-devel@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/spice-devel

-- 

David Jaša, RHCE

SPICE QE based in Brno
GPG Key:     22C33E24 
Fingerprint: 513A 060B D1B4 2A72 7F0D 0278 B125 CD00 22C3 3E24
