Re: [PATCH 2/5] Btrfs: fix trans block rsv regression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/14/2012 08:07 PM, David Sterba wrote:
> On Fri, Sep 14, 2012 at 07:25:45PM +0800, Liu Bo wrote:
>> On 09/14/2012 07:15 PM, David Sterba wrote:
>>> On Fri, Sep 14, 2012 at 04:58:04PM +0800, Liu Bo wrote:
>>>> --- a/fs/btrfs/transaction.c
>>>> +++ b/fs/btrfs/transaction.c
>>>> @@ -306,9 +306,17 @@ static struct btrfs_trans_handle *start_transaction(struct btrfs_root *root,
>>>>  		WARN_ON(type != TRANS_JOIN && type != TRANS_JOIN_NOLOCK &&
>>>>  			type != TRANS_JOIN_ONLY);
>>>>  		h = current->journal_info;
>>>> -		h->use_count++;
>>>> -		h->orig_rsv = h->block_rsv;
>>>> +		if (h->block_rsv) {
>>>> +			struct btrfs_trans_rsv_item *item;
>>>> +			item = kmalloc(sizeof(*item), GFP_NOFS);
>>>
>>> I'd rather avoid the kmalloc here and add a list hook into
>>> btrfs_block_rsv itself (used only for this purpose).
>>>
>>> It also does not increase the failure surface and we don't have to
>>> handle error conditions from deep callchains.
>>>
>>
>> Actually I placed a list hook at first, but I found the same block_rsv could be inserted into
>> the list chain twice, which will cause list_head's terrible warnings.
> 
> Is it expected and correct to add the rsv twice? I know the transaction
> joins and commits are sometimes wildly nested so it does not mean that's
> necessarily a bug. Then it is not possible to embed the list hook, we
> need to keep the full track of the blk_rsv stack.
> 
> I'm counting two other similar structs used inside btrfs
> (list_head + 8B value), we could make a shared separate slab for them.
> 
> struct delayed_iput {
>         struct list_head           list;                 /*     0    16 */
>         struct inode *             inode;                /*    16     8 */
> 
>         /* size: 24, cachelines: 1, members: 2 */
>         /* last cacheline: 24 bytes */
> };
> 
> truct seq_list {
>         struct list_head           list;                 /*     0    16 */
>         u64                        seq;                  /*    16     8 */
> 
>         /* size: 24, cachelines: 1, members: 2 */
>         /* last cacheline: 24 bytes */
> };
> 
> seq_list is never allocated, only embedded inside tree_mod_log structures.
> 
> delayed_iput is allocated on every add_delayed_iput and freed quite
> frequently (once per-commit at least), so this is a short-lived object
> as well as the proposed btrfs_trans_rsv_item. IMO this justifies using
> the slab. Does this sound good to you?
> 

Good, a quick thinking says it will work ;)

btw, do you have any test suit to measure this kind of cacheline efficiency so that
we can get some numbers to see if it really helps?

- liubo
> 
> david
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux