Re: btrfs dev del hangs on 4.7

On Tue, Aug 09, 2016 at 06:27:33PM +0000, Hugo Mills wrote:
> On Tue, Aug 09, 2016 at 02:26:14PM -0400, Chris Mason wrote:
> > On 08/09/2016 02:23 PM, Hugo Mills wrote:
> > >   Hi, Chris,
> > >
> > >On Tue, Aug 09, 2016 at 02:02:20PM -0400, Chris Mason wrote:
> > >>On 08/09/2016 01:27 PM, Hugo Mills wrote:
> > >>>  Over the weekend, I started doing some maintenance on my server: I
> > >>>upgraded to 4.7.0, and I started deleting a device from my array,
> > >>>preparatory to putting in a larger one. About halfway through the
> > >>>operation, several kernel threads hung up for a while (resulting in
> > >>>"blocked for 120s" messages), and then the delete process seems to
> > >>>have stopped entirely, although several kernel threads are at maximum
> > >>>usage.
> > >>>
> > >>>  After a few hours, I rebooted the machine, and left it for a day or
> > >>>so. I tried the delete again this afternoon, and it's done the same
> > >>>thing again. The full log is included below. I have a kworker and a
> > >>>btrfs-transaction pegged at close to 100% of a core each, and a
> > >>>btrfs-cleaner (and the btrfs dev del process) in D state.
> > >>>
> > >>>  The FS was not under load at the time of the failure, and it passes
> > >>>scrub. I haven't tried a btrfs check yet.
> > >>
> > >>Thanks Hugo, can you nail down which line of code belongs to:
> > >>
> > >>btrfs_async_run_delayed_refs+0xc6
> > >
> > >   I'm having a spot of trouble with this. The btrfs on this kernel is
> > >built-in, and I've lost the contents of the build directory (it's done
> > >by an overnight build script, and it's already built a 4.8-rc1 for one
> > >of my other machines).
> > >
> > >(gdb) file /boot/vmlinuz-4.7.0-dirty
> > >BFD: /boot/vmlinuz-4.7.0-dirty: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .bss
> > >Reading symbols from /boot/vmlinuz-4.7.0-dirty...(no debugging symbols found)...done.
> > >(gdb) list *btrfs_async_run_delayed_refs+0xc6
> > >No symbol table is loaded.  Use the "file" command.
> > >
> > >   There must be a way of getting this info from here, but I'm not
> > >sure I know what it is. Build a new kernel from 4.7 with this
> > >machine's config and run gdb on the btrfs.o file? Not a problem to do,
> > >but it might take a little while.
> > 
> > As long as you use the same gcc and config file, it'll almost always
> > generate the same offsets/code.  You can recompile with debug
> > symbols on and it'll be accurate.
> 
>    OK. Back later.

(gdb) file fs/btrfs/btrfs.o
Reading symbols from fs/btrfs/btrfs.o...done.
(gdb) list *btrfs_async_run_delayed_refs+0xc6
0x13dae is in btrfs_async_run_delayed_refs (fs/btrfs/extent-tree.c:2915).
2910	
2911		btrfs_queue_work(root->fs_info->extent_workers, &async->work);
2912	
2913		if (wait) {
2914			wait_for_completion(&async->wait);
2915			ret = async->error;
2916			kfree(async);
2917			return ret;
2918		}
2919		return 0;
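
For anyone reading the output above: the address gdb prints (0x13dae) is simply the start of the function inside btrfs.o plus the 0xc6 offset from the trace. A minimal sketch of that arithmetic follows. The helper names parse_frame and symbol_start are hypothetical, made up for illustration, and the resulting value applies only to this particular btrfs.o, not to a linked vmlinux.

```python
import re


def parse_frame(frame):
    """Split 'symbol+0xOFF' or 'symbol+0xOFF/0xLEN' into (symbol, offset).

    Hypothetical helper: this is just the textual form used in kernel
    stack traces, not an API from gdb or the kernel.
    """
    pattern = r"([\w.]+)\+0x([0-9a-fA-F]+)(?:/0x[0-9a-fA-F]+)?"
    match = re.fullmatch(pattern, frame.strip())
    if match is None:
        raise ValueError(f"unrecognised frame: {frame!r}")
    return match.group(1), int(match.group(2), 16)


def symbol_start(resolved_address, offset):
    """Recover the function's start address from gdb's resolved address."""
    return resolved_address - offset


# Values taken from the trace and the gdb output in this message.
symbol, offset = parse_frame("btrfs_async_run_delayed_refs+0xc6")
start = symbol_start(0x13dae, offset)
print(symbol, hex(offset), hex(start))
```

Passing the whole symbol+offset expression to gdb's list command, as above, saves doing this by hand. The sketch only shows what the expression means.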

   Hugo.

-- 
Hugo Mills             | You shouldn't anthropomorphise computers. They
hugo@... carfax.org.uk | really don't like that.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |

Attachment: signature.asc
Description: Digital signature

