Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds

Rsync finished. FWIW, in the end it reported an average speed of about
900K/sec. Without autodefrag there were no messages about hung
kworkers, even though rsync seemingly kept hanging for several
minutes at a time throughout the whole run.
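
Since the kernel log stays quiet in the no-autodefrag case, the stalls have to be measured from userspace. What follows is only a minimal sketch of one way to do that, not something that was run for this report: it samples the used space of the target filesystem (the same number df shows) and reports every span in which it did not grow for at least 120 seconds. The names used_bytes, find_stalls and monitor, the 10-second polling interval and the mount point are all assumptions made for illustration. With compression and delayed allocation the used space may lag behind what rsync has actually written, so treat the result as a rough indicator.

```python
import os
import time


def used_bytes(path):
    """Return the bytes currently in use on the filesystem holding `path`."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def find_stalls(samples, threshold=120.0):
    """Return (start, end) spans during which used space did not grow.

    `samples` is a time-ordered list of (timestamp_seconds, used_bytes).
    `end` is the last sample that still showed no growth. A span is
    reported only if it lasted at least `threshold` seconds.
    """
    stalls = []
    if not samples:
        return stalls
    start, last_used = samples[0]
    end = start
    for ts, used in samples[1:]:
        if used > last_used:
            # Growth observed: close the current span and start a new one.
            if end - start >= threshold:
                stalls.append((start, end))
            start, last_used = ts, used
        end = ts
    if end - start >= threshold:
        stalls.append((start, end))
    return stalls


def monitor(path, interval=10.0, duration=600.0):
    """Poll `path` every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), used_bytes(path)))
        time.sleep(interval)
    return find_stalls(samples)
```

Running monitor("/mnt/test") in a second terminal while the rsync is going (the mount point is hypothetical) would return the list of stalled spans, independently of whether the hung-task detector fires.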

Thanks
John



On Tue, Sep 2, 2014 at 10:48 PM, john terragon <jterragon@xxxxxxxxx> wrote:
> OK, so I'm using 3.17-rc3, same test on a flash usb drive, no
> autodefrag. The situation is even stranger. The rsync is clearly
> stuck, it's been trying to write the same file for much more than 120 secs.
> However dmesg is clean, no "INFO: task kworker/u16:11:1763 blocked for
> more than 120 seconds" or anything.
> df is responsive but shows no increase in used space.
> Consider that with autodefrag this bug is completely "reliable", the
> hung-task info starts to show up almost immediately.
>
> Oh wait (I'm live...) now rsync is unstuck, files are being written
> and df shows an increase in used space. BUT, still no hung-task
> message in the kernel log, even though rsync was actually stuck for
> several minutes.
>
> So, to summarize, same conditions except no autodefrag. Result:
> process stuck for way more than 120 secs but this time no complaints
> in the kernel log.
>
> Thanks
> John
>
>
>
> On Tue, Sep 2, 2014 at 10:23 PM, john terragon <jterragon@xxxxxxxxx> wrote:
>> I don't know what to tell you about the ENOSPC code being heavily
>> involved. At this point I'm using this simple test to see if things
>> improve:
>>
>> -freshly created btrfs on dmcrypt,
>> -rsync some stuff (since the fs is empty I could just use cp but I
>> keep the test the same as it was when I had the problem for the first
>> time)
>> -note: the rsynced stuff is about the size of the volume but with
>> compression I always end up with 1/2 to 3/4 free space
>>
>> I'm not sure how I'd even get close to involving the ENOSPC code, but
>> I'm probably not fully aware of the inner workings of btrfs.
>>
>>> Can you try flipping off autodefrag?
>>
>> As soon as the damn unkillable rsync decides to obey the kill -9...
>>
>> Thanks
>>
>> John
>>
>> On Tue, Sep 2, 2014 at 10:10 PM, Chris Mason <clm@xxxxxx> wrote:
>>> On 09/02/2014 03:56 PM, john terragon wrote:
>>>> Nice...now I get the hung task even with 3.14.17.... And I tried with
>>>> 4K for node and leaf size...same result. And to top it all off, today
>>>> I've been bitten by the bug also on my main root fs (which is on two
>>>> fast SSDs), although with 3.16.1.
>>>>
>>>> Is it at least safe for the data? I mean, as long as the hung process
>>>> terminates and no other error shows up, can I at least be sure that
>>>> the data written is correct?
>>>
>>> Your traces are a little different.  The ENOSPC code is throttling
>>> things to make sure you have enough room for the writes you're doing.
>>> The code we have in 3.17-rc3 (or my for-linus branch) is the best
>>> choice right now.  You can pull that down to 3.16 if you want all the
>>> fixes on a more stable kernel.
>>>
>>> Nailing down the ENOSPC code is going to be a little different. I think
>>> autodefrag probably isn't interacting well with being short on space and
>>> encryption.  This is leading to much more IO than we'd normally do, and
>>> dm-crypt makes it fairly intensive.
>>>
>>> Can you try flipping off autodefrag?
>>>
>>> -chris