Re: [PATCH V2 1/6] KVM: ARM: Remove pgtable standard functions from stage-2 page tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 02/26/2019 09:45 AM, Anshuman Khandual wrote:
> 
> 
> On 02/25/2019 09:19 PM, Mark Rutland wrote:
>> On Mon, Feb 25, 2019 at 07:50:27PM +0530, Anshuman Khandual wrote:
>>>
>>>
>>> On 02/25/2019 04:30 PM, Mark Rutland wrote:
>>>> Hi Anshuman,
>>>>
>>>> On Mon, Feb 25, 2019 at 10:33:54AM +0530, Anshuman Khandual wrote:
>>>>> ARM64 standard pgtable functions are going to use pgtable_page_[ctor|dtor]
>>>>> constructs. Certain stage-2 page table allocations are multi order which
>>>>> cannot be allocated through a generic pgtable function as it does not exist
>>>>> right now. This prevents all pgtable allocations including multi order ones
>>>>> in stage-2 from moving into new ARM64 (pgtable_page_[ctor|dtor]) pgtable
>>>>> functions. Hence remove all generic pgtable allocation function dependency
>>>>> from stage-2 page tables till there is one multi-order allocation function
>>>>> available.
>>>>
>>>> I'm a bit confused by this. Which allocations are multi-order?
>>>>
>>>
>>> Stage-2 PGD.
>>>
>>> kvm_alloc_stage2_pgd -> alloc_pages_exact
>>> kvm_free_stage2_pgd -> free_pages_exact
>>>  
>>>> Why does that prevent other allcoations from using the regular routines?
>>>
>>> At present both stage-2 PGD (kvm_alloc_stage2_pgd -> alloc_pages_exact), PUD|PMD
>>> (mmu_memory_cache_alloc) allocates directly from buddy allocator but then while
>>> freeing back stage-2 PGD directly calls buddy allocator via free_pages_exact but
>>> PUD|PMD get freed with stage2_[pud|pmd]_free which calls pud|pmd_free instead
>>> of calling free_pages() directly.
>>
>> If we allocate/free the stage-2 PGD with {alloc,free}_pages_exact(),
>> then the PGD level is balanced today.
>>
>> I don't see what that has to do with the other levels of table.
> 
> The idea is either all page table levels pages go through ctor/dtor constructs
> or none of them do. Not only this makes sense logically but also prevents getting
> into bad page state errors when freeing without calling dtor. For other non-KVM
> use cases like kernel linear or vmmap it helps while tearing down the page table
> for memory hot-remove cases.
> 
>>
>>> All of these worked fine because pud|pmd_free called free_pages() directly with
>>> out going through pgtable_page_dtor(). But now we are changing pud|pmd_free to
>>> use pgtable_page_dtor() both for user and host kernel page tables. This will
>>> break stage2 page table (bad page state errors) because the new free path which
>>> would call pgtable_page_dtor() where as alloc patch never called pgtable_page_ctor().
>>
>> I'm lost as to how that relates to the alloc/free of the PGD. AFAICT,
>> that's unrelated to the problem at hand.
> 
> As mentioned above either all page table use ctor/dtor based allocation/free or
> none of them do.
> 
>>
>> What subtlety am I missing?
> 
> pmd|pud_free() will call dtor() after the series hence they cannot be used by
> stage2_pmd|pud_free because stage2_pud|pmd get allocated from pre-allocated
> mmu_memory_cache_alloc which builds with __get_free_page(PGALLOC_GFP). Is not
> that right ? 
> 
>>
>>> To fix this situation either we move all stage-2 page table to use pte_alloc_one()
>>> and pte_free() which goes through pgtable_page_ctor/dtor cycle or just keep the
>>> virtualization page tables out of it (stage2/hyp) and remove the only dependency
>>> which breaks because of these changes. This series went with the second option. 
>>
>> It sounds to me like this is just a mismatch between the alloc/free
>> paths for the PUD and PMD levels of table.
>>
>> IIUC, we allocate those with {pmd,pud}_alloc_one(), and free them with
>> stage2_{pud,pmd}_free(), which call {pud,pmd}_free().
> 
> AFAICS they get allocated from mmu_memory_cache_alloc instead. pud|pmd_alloc_one()
> get used only for the _hyp_ mappings not for the stage2 table.
> 
> kvm_handle_guest_abort
> 	user_mem_abort
> 		stage2_set_pud_huge -> stage2_get_pud -> mmu_memory_cache_alloc
> 		stage2_set_pmd_huge -> stage2_get_pmd -> mmu_memory_cache_alloc
> 		stage2_set_pte -> mmu_memory_cache_alloc
> 
> Am I missing something here.

This needs to be added into this patch here. Got skipped because my guest VM test setup
always created PMD level mappings. No impact on arm platform as it's pte_free_kernel()
always called free_page which remains unchanged.

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 30251e288629..61ea76d786b9 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -191,7 +191,7 @@ static void clear_stage2_pmd_entry(struct kvm *kvm, pmd_t *pmd, phys_addr_t addr
        VM_BUG_ON(pmd_thp_or_huge(*pmd));
        pmd_clear(pmd);
        kvm_tlb_flush_vmid_ipa(kvm, addr);
-       pte_free_kernel(NULL, pte_table);
+       __free_page(virt_to_page(pte_table));
        put_page(virt_to_page(pmd));
 }


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [CentOS ARM]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]

  Powered by Linux