Re: High CPU usage of RT threads...

On 19.03.2018 09:13, Shyam Prasad N wrote:
> Hi Nikolay,
> 
> Thanks for your reply on this.
> 
> Checked the stack trace for many of the stuck threads. Looks like all
> of them are stuck in this loop...
> [<ffffffff810031f2>] exit_to_usermode_loop+0x72/0xd0
> [<ffffffff81003c16>] prepare_exit_to_usermode+0x26/0x30
> [<ffffffff818390e5>] retint_user+0x8/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff

Well, this doesn't implicate btrfs at all.

How about the _full_ output of:

echo w > /proc/sysrq-trigger

Perhaps there is a lot of load in workqueues?
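
Since sysrq-w only lists blocked (uninterruptible) tasks, threads that are spinning on a CPU may not show up in it. A rough sketch of tallying the top kernel frame across all threads of one process follows. The function name top_frames and the PROC_ROOT variable are hypothetical, introduced only for illustration; reading the stack files normally requires root, and the trace of a task that is currently running can be stale.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): count how many threads of a
# process share the same innermost kernel frame.
# PROC_ROOT is an assumption added so the function can be pointed at a
# test directory; on a real system leave it unset so /proc is used.
# Usage (as root): top_frames <pid>
top_frames() {
    pid=$1
    root=${PROC_ROOT:-/proc}
    for f in "$root/$pid"/task/*/stack; do
        # Skip unreadable files and the unexpanded glob when nothing matches.
        [ -r "$f" ] || continue
        # The first line is the innermost frame; strip the "[<addr>]"
        # prefix and the "+0xoff/0xlen" suffix so identical functions group.
        head -n 1 "$f" | sed -e 's/^\[<[^>]*>] *//' -e 's/+0x.*$//'
    done | sort | uniq -c | sort -rn
}
```

Running it a few times in a row gives a crude sample; if nearly every thread reports the same frame each time, that is where to look first.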

> 
> Seems like it is stuck in a tight loop in exit_to_usermode_loop.
> FWIW, we started seeing this issue with nodatacow btrfs mount option.
> Previously we were running with default option of datacow.
> However, this also coincides with fairly heavy unlink load that we've
> been putting the system under.
> 
> Please let me know if there is anything else you can think of, based
> on the above data.
> 
> Regards,
> Shyam
> 
> 
> On Thu, Mar 15, 2018 at 12:59 PM, Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>>
>>
>> On 15.03.2018 09:23, Shyam Prasad N wrote:
>>> Hi,
>>>
>>> Our servers run some daemons that are scheduled to run many real time
>>> threads. These threads serve the client nodes by performing I/O on top
>>> of some set of disks, configured as DRBD pairs with disks on other
>>> peer servers for high availability of data. Btrfs is the filesystem
>>> that is configured on top of DRBD.
>>>
>>> While testing high availability with fairly high load, we have noticed
>>> the following behaviour a couple of times: When the server which was
>>> killed comes back up and gets ready and DRBD disks start syncing the
>>> data between the disks, a performance hit is generally expected at the
>>> peer node which has taken over the service now. However, the real time
>>> threads (mentioned above) on the active node are hogging the CPUs. As
>>> a part of debugging the issue, we tried to force a core dump on these
>>> threads by using a SIGABRT. However, these threads were not responding
>>> to any signals. Only after using real-time throttling (to reduce real
>>> time CPU usage to 50%), and waiting around for a few minutes, we were
>>> able to force a core dump. However, the corefile generated didn't have
>>> much useful info (I think it was a partial/corrupted core dump).
>>>
>>> Based on the above behaviour (signals not being picked up), it looks
>>> to me like all these threads were likely stuck inside some system
>>> call. And since the majority of the system calls by these threads are
>>> VFS calls on btrfs, I feel that these threads may have been stuck in
>>> some I/O. Specifically, based on the CPU usage, in some spinlock (I'm
>>> open to suggestions of other possibilities). And this is the reason
>>> I'm posting on this mailing list.
>>
>> When you have a bunch of those threads stuck, get a dump of the stacks
>> of all blocked tasks with "echo w > /proc/sysrq-trigger".
>>
>>>
>>> Is there a known bug which might have caused this? Kernel version
>>> we're using is 4.4.0.
>>
>> This is a rather old kernel; you should at least be using the latest
>> 4.4.y stable kernel. BTRFS is a moving target and there are a lot of
>> improvements made every release. So I'd suggest trying 4.14 on at
>> least one offending machine.
>>
>>> If we go for a kernel upgrade, what are the chances of not seeing this
>>> behaviour again?
>>>
>>> Or is my analysis of the problem entirely wrong? My feeling is that
>>> this may be some issue with using Btrfs when it doesn't get a response
>>> from DRBD quickly enough.
>>
>> Feelings don't count for anything. Next time this happens, extract a
>> stack trace from the offending threads, i.e. by sampling /proc/[pid of
>> hogging thread]/stack. Furthermore, if we assume that btrfs is indeed
>> not getting responses fast enough, this means most clients should
>> really be stuck in I/O sleep and not doing any processing.
>>
>>
>>> Because we have been using ext4 on top of DRBD for a long time, and
>>> have never seen such issues during HA tests there.
>>>
> 
> 
> 