[Bug 43305] New: Deleting a folder with 500 000 small files causes deadlock in start_this_handle.irsa.7

https://bugzilla.kernel.org/show_bug.cgi?id=43305

           Summary: Deleting a folder with 500 000 small files causes
                    deadlock in start_this_handle.irsa.7
           Product: File System
           Version: 2.5
    Kernel Version: 3.2.0-24-generic-pae
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@xxxxxxxxxxxxxxxxxxxx
        ReportedBy: aigarius@xxxxxxxxx
        Regression: No


As part of an unrelated software experiment I created a folder containing
500 000 small files (10-15 bytes of content each). When I then tried to delete
the folder I ran into trouble: whenever I ran rm against files in that folder
(the whole folder, individual files, or batches), one of the rm commands would
invariably hang in uninterruptible sleep in the function
start_this_handle.irsa.7. This blocks all other writes to that filesystem and
produces messages like this in the kernel log:

[  480.617348] INFO: task bounce:4006 blocked for more than 120 seconds.
[  480.617350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.617353] bounce          D ef39dec0     0  4006   1685 0x00000000
[  480.617357]  ef39df10 00000086 00000000 ef39dec0 c10f6526 f7570ca0 c1930e00 c1930e00
[  480.617365]  cd507544 00000052 f78c7e00 df428000 f7570ca0 ef39df2c c10f65c5 00000001
[  480.617372]  0000000e 00000001 d5af5a68 ef39dee0 ef39dee4 00000001 ef39deec c1036578
[  480.617379] Call Trace:
[  480.617382]  [<c10f6526>] ? wait_on_page_bit+0x86/0x90
[  480.617386]  [<c10f65c5>] ? filemap_fdatawait_range+0x95/0x160
[  480.617391]  [<c1036578>] ? default_spin_lock_flags+0x8/0x10
[  480.617395]  [<c15a819d>] ? _raw_spin_lock_irqsave+0x2d/0x40
[  480.617399]  [<c15a65a5>] schedule+0x35/0x50
[  480.617403]  [<c12175e5>] jbd2_log_wait_commit+0x95/0x100
[  480.617408]  [<c1079e90>] ? add_wait_queue+0x50/0x50
[  480.617412]  [<c11c81f5>] ext4_sync_file+0x1f5/0x2b0
[  480.617416]  [<c11c8000>] ? ext4_flush_completed_IO+0xa0/0xa0
[  480.617421]  [<c116cf83>] vfs_fsync+0x33/0x50
[  480.617425]  [<c116d2d6>] sys_fsync+0x26/0x50
[  480.617429]  [<c15af35f>] sysenter_do_call+0x12/0x28

If SIGKILL or SIGTERM is sent to the specific rm process that is in that
function, the process dies after up to 30 seconds, the filesystem continues to
function, and all hung operations complete. The rm process does delete some
files before the deadlock occurs, but once its wchan changes to
start_this_handle.irsa.7 no more files are deleted.
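
For anyone who wants to watch for the same symptom, the state and wchan of a
process can be read from procfs. The following is only a minimal sketch: the
helper names read_state and read_wchan are hypothetical, and it assumes a Linux
/proc that exposes /proc/<pid>/stat and /proc/<pid>/wchan (the latter may be
absent or read as 0 depending on kernel configuration and permissions).

```python
import os


def read_state(pid):
    """Return the one-letter task state from /proc/<pid>/stat, or None.

    'D' is uninterruptible sleep, the state the hung rm was reported in.
    """
    try:
        with open(f"/proc/{pid}/stat") as fh:
            data = fh.read()
    except OSError:
        return None
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so split on the last ')' and take the first field after it.
    return data.rsplit(")", 1)[1].split()[0]


def read_wchan(pid):
    """Return the kernel symbol the task is sleeping in, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/wchan") as fh:
            return fh.read().strip() or None
    except OSError:
        return None
```

Calling read_state(pid) and read_wchan(pid) with the pid of the rm process
every few seconds would show when it enters state D and which symbol it is
waiting in.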

Rebooting the machine and forcing a fsck does not change the outcome.

This is on a PCIX type SSD drive.
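
The directory layout described above can be approximated with a short script.
This is a sketch under assumptions, not the software that originally created
the files: the function names, the file naming scheme and the filler content
are all hypothetical, and there is no claim that running it triggers the hang
on any given kernel.

```python
import os
import random


def create_small_files(directory, count, min_size=10, max_size=15):
    """Create `count` files of min_size..max_size bytes each in `directory`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        size = random.randint(min_size, max_size)
        # Hypothetical naming scheme; the report does not say how files were named.
        with open(os.path.join(directory, f"f{i:06d}"), "wb") as fh:
            fh.write(b"x" * size)


def delete_files(directory):
    """Unlink every entry in `directory`, remove it, and return the count unlinked."""
    # Collect the paths first so the directory is not modified while it is scanned.
    with os.scandir(directory) as it:
        paths = [entry.path for entry in it]
    for path in paths:
        os.unlink(path)
    os.rmdir(directory)
    return len(paths)
```

To match the numbers in this report, call create_small_files(path, 500_000) on
a directory on the ext4 filesystem under test, then delete_files(path) or a
plain rm -r on the same directory.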
