Re: pinning, tsc and apic


Ryan Harper wrote:
> I've been digging into some of the instability we see when running
> larger numbers of guests at the same time.  The test I'm currently using
> involves launching 64 1vcpu guests on an 8-way AMD box.

Note this is a Barcelona system and therefore should have a 
fixed-frequency TSC.

>   With the latest
> kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
> 64 of these 1 second apart,

BTW, what if you don't pace-out the startups?  Do we still have issues 
with that?

>  and only a handful (1 to 3)  end up not
> making it up.  In dmesg on the host, I get a couple messages:
> [321365.362534] vcpu not ready for apic_round_robin
> and 
> [321503.023788] Unsupported delivery mode 7
> Now, the interesting bit for me was when I used numactl to pin the guest
> to a processor, all of the guests come up with no issues at all.  As I
> looked into it, it means that we're not running any of the vcpu
> migration code which on svm is comprised of tsc_offset recalibration and
> apic migration, and on vmx, a little more per-vcpu work

Another data point is that -no-kvm-irqchip doesn't make the situation 

> I've convinced myself that svm.c's tsc offset calculation works and
> handles the migration from cpu to cpu quite well.  I added the following
> snippet to trigger if we ever encountered the case where we migrated to
> a tsc that was behind:
>     rdtscll(tsc_this);
>     delta = vcpu->arch.host_tsc - tsc_this;
>     old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
>     new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
>     if (new_time < old_time) {
>         printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
>                vcpu->cpu, cpu, old_time - new_time);
>     }
>     svm->vmcb->control.tsc_offset += delta;

Time will never go backwards, but what can happen is that the guest's 
TSC appears to run slow.  This is because upon VCPU migration, we don't 
account for the time between vcpu_put on the old processor and vcpu_load 
on the new processor.  From the guest's point of view, that time simply 
disappears.
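
To make the lost time concrete, here is a small user-space model of the 
offset arithmetic in the quoted snippet.  It is only a sketch: the struct 
and function names are made up for illustration and are not KVM code, and 
it assumes the two host TSCs are synchronized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: guest TSC = host TSC + offset. */
struct vcpu_model {
    uint64_t host_tsc;   /* host TSC sampled at vcpu_put */
    int64_t  tsc_offset; /* stands in for vmcb->control.tsc_offset */
};

uint64_t model_guest_tsc(const struct vcpu_model *v, uint64_t host_now)
{
    return host_now + (uint64_t)v->tsc_offset;
}

/* vcpu_put: remember the host TSC of the cpu we are leaving. */
void model_put(struct vcpu_model *v, uint64_t host_now)
{
    v->host_tsc = host_now;
}

/* vcpu_load after migration: same arithmetic as the quoted snippet. */
void model_load(struct vcpu_model *v, uint64_t tsc_this)
{
    int64_t delta = (int64_t)(v->host_tsc - tsc_this);
    v->tsc_offset += delta;
}

/* Returns how many host cycles the guest never gets to see. */
uint64_t model_lost_cycles(uint64_t put_tsc, uint64_t load_tsc)
{
    struct vcpu_model v = { .host_tsc = 0, .tsc_offset = 0 };
    uint64_t before, after;

    assert(load_tsc >= put_tsc); /* model assumes synchronized TSCs */

    before = model_guest_tsc(&v, put_tsc);
    model_put(&v, put_tsc);
    model_load(&v, load_tsc);
    after = model_guest_tsc(&v, load_tsc);

    return (load_tsc - put_tsc) - (after - before);
}
```

In this model the guest resumes at exactly the value it had at vcpu_put, 
so every cycle spent descheduled is reported as lost.  That would also 
explain why a "time goes back" check of this form never fires.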

A possible way to fix this (valid only on a processor with a 
fixed-frequency TSC) is to take a high-res timestamp on vcpu_put, and 
then on vcpu_load, take the time elapsed since the old TSC was saved 
and use the TSC frequency on the new pcpu to calculate the number of 
elapsed cycles.
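
Roughly, the arithmetic would look like the sketch below.  All names here 
(vcpu_tsc_state, ns_to_cycles, guest_tsc_at_load, tsc_offset_for) are 
hypothetical, the frequency is assumed to be known in kHz, and this is 
plain user-space C rather than a patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu state saved at vcpu_put (not real KVM fields). */
struct vcpu_tsc_state {
    uint64_t put_ns;    /* high-res timestamp taken at vcpu_put */
    uint64_t guest_tsc; /* guest-visible TSC at vcpu_put */
};

/*
 * Convert nanoseconds to cycles at a fixed frequency given in kHz:
 * cycles = ns * khz / 1e6.  The split into quotient and remainder
 * gives the same result while avoiding overflow of ns * khz.
 */
uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
    return (ns / 1000000ULL) * tsc_khz +
           (ns % 1000000ULL) * tsc_khz / 1000000ULL;
}

/* Guest TSC the vcpu should see when it is loaded on the new pcpu. */
uint64_t guest_tsc_at_load(const struct vcpu_tsc_state *s,
                           uint64_t now_ns, uint64_t new_cpu_tsc_khz)
{
    assert(now_ns >= s->put_ns); /* the clock used must be monotonic */
    return s->guest_tsc + ns_to_cycles(now_ns - s->put_ns, new_cpu_tsc_khz);
}

/* Offset to program so that host_tsc_now + offset == wanted_guest_tsc. */
int64_t tsc_offset_for(uint64_t wanted_guest_tsc, uint64_t host_tsc_now)
{
    return (int64_t)(wanted_guest_tsc - host_tsc_now);
}
```

For example, a vcpu that was off-cpu for 1 ms and lands on a 2.3 GHz pcpu 
would have its guest TSC advanced by 2,300,000 cycles, independent of how 
the two host TSCs compare.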

Assuming a fixed-frequency TSC, and a calibrated TSC across all 
processors, you could get the same effect by using the VT tsc delta 
logic.  Basically, it always uses the new CPU's TSC unless that would 
cause the guest to move backwards in time.  As long as you have a 
stable, calibrated TSC, this would work out.
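
As a sketch of that rule (written from the description above, not copied 
from vmx.c; the function name and parameters are made up for 
illustration):

```c
#include <assert.h>
#include <stdint.h>

/*
 * guest_tsc = host_tsc + tsc_offset.
 * Keep the offset as-is, so the guest follows the new cpu's TSC, unless
 * the new cpu's TSC is behind the value saved at vcpu_put.  In that case
 * bump the offset by the shortfall so the guest does not go backwards.
 */
int64_t offset_after_migration(int64_t tsc_offset,
                               uint64_t saved_host_tsc, uint64_t tsc_this)
{
    int64_t new_offset = tsc_offset;

    if (tsc_this < saved_host_tsc)
        new_offset += (int64_t)(saved_host_tsc - tsc_this);

    /* Post-condition: the guest's view never decreases across the move. */
    assert((int64_t)((tsc_this + (uint64_t)new_offset) -
                     (saved_host_tsc + (uint64_t)tsc_offset)) >= 0);
    return new_offset;
}
```

With synchronized TSCs the branch is never taken, so the time spent 
between vcpu_put and vcpu_load shows up in the guest for free.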

Can you try your old patch that did this and see if it fixes the problem?

> Noting that vcpu->arch.host_tsc is the tsc of the previous cpu the vcpu
> was running on (see svm_put_vcpu()).  This allows me to check if we are
> in fact increasing the guest's view of the tsc.  I've not been able to
> trigger this at all when the vcpus are migrating.
> As for the apic, the migrate code seems to be rather simple, but I've
> not yet dived in to see if we've got anything racy in there:
> lapic.c:
> void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
> {
>     struct kvm_lapic *apic = vcpu->arch.apic;
>     struct hrtimer *timer;
>     if (!apic)
>         return;
>     timer = &apic->;
>     if (hrtimer_cancel(timer))
>         hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
> }

There's a big FIXME in __apic_timer_fn() about making sure the timer 
runs on the current "pCPU".  As written, it's possible for the timer to 
fire on a different pcpu than the one the vcpu is currently running on, 
but it wasn't obvious to me that this would cause problems.  Eddie, et 
al: Care to elaborate on what the TODO was trying to address?


Anthony Liguori

> Ryan Harper
> Software Engineer; Linux Technology Center
> IBM Corp., Austin, Tx
> (512) 838-9253   T/L: 678-9253
> ryanh@xxxxxxxxxx
> _______________________________________________
> kvm-devel mailing list
> kvm-devel@xxxxxxxxxxxxxxxxxxxxx
