On 6/5/2011 2:50 AM, Chris Mason wrote:
Excerpts from Konstantinos Skarlatos's message of 2011-05-05 17:04:00 -0400:
On 5/5/2011 11:32 PM, Chris Mason wrote:
Excerpts from Konstantinos Skarlatos's message of 2011-05-05 16:27:54 -0400:
I think I made some progress. When I tried to remove the directory that
I suspect contains the problematic file, I got this on the console:
rm -rf serverloft/
Ok, our one bad block is in the extent allocation tree. This is going
to be the very hardest thing to fix.
Until I finish off the code to rebuild parts of the extent allocation
tree, I think your best bet is to copy the files off.
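
One way to copy the files off while tolerating read errors on individual
files is sketched below. It is only an illustration: the function name
salvage_copy, its three arguments, and the log file are hypothetical, and
file names containing newlines are not handled.

```shell
# Hypothetical helper: copy every regular file under SRC to DST and keep
# going when a single file cannot be read. Failed paths are appended to LOG.
salvage_copy() {
    src="$1"; dst="$2"; log="$3"
    : > "$log"                                   # start with an empty log
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        mkdir -p "$dst/$(dirname "$rel")"
        # cp -p keeps mode and timestamps; a failure is recorded, not fatal
        cp -p "$src/$rel" "$dst/$rel" 2>/dev/null || printf '%s\n' "$rel" >> "$log"
    done
}

# Example (paths are placeholders):
# salvage_copy /mnt/btrfs /mnt/backup /tmp/failed-files.txt
```

Afterwards the log lists the files that could not be copied, which may
help narrow down which files sit behind the bad block.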
The big question is, what happened to make this error? Can you describe
your setup in more detail?
I created this btrfs filesystem on an Arch Linux system (amd64, quad
core) with kernel 2.6.38.1. It is on top of an md RAID 5.
[root@linuxserver ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sde1[3] sdc1[1] sda1[0] sdf1[4]
5860535808 blocks super 1.2 level 5, 512k chunk, algorithm 2
[4/4] [UUUU]
The RAID was grown from 3 devices to 4, and then btrfs was grown to max
size. Mount options were clear_cache,compress-force.
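
In the mdstat output above, [4/4] [UUUU] means all four members are up;
an underscore in that map would mark a missing or failed member. A minimal
sketch of checking this automatically follows. The function name and the
optional file argument are made up for illustration.

```shell
# Print the name of every md array whose member map (e.g. [UU_U]) contains
# an underscore. Reads /proc/mdstat unless another file is given.
degraded_md_arrays() {
    mdstat_file="${1:-/proc/mdstat}"
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        {
            # look for a map made only of U and _ characters
            if (match($0, /\[[U_]+\]/)) {
                map = substr($0, RSTART, RLENGTH)
                if (index(map, "_") > 0 && name != "") print name
            }
        }
    ' "$mdstat_file"
}

# Example: prints nothing when every array is healthy
# degraded_md_arrays
```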
I was investigating a performance issue: over the network I could only
write to the filesystem at about 32 MB/s. While writing, btrfs-delalloc-
CPU usage was at 100%.
While investigating I disabled compression, enabled space_cache and
tried zlib compression, and various combinations, while copying large
files back and forth using Samba.
BTW, I tried to change some mount options using mount -o remount, but
although the new options were printed in dmesg, I think they were not
enabled.
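
One way to see which options are actually active after a remount is to
read them back from /proc/mounts rather than from dmesg. The sketch below
assumes hypothetical helper names, and the exact spelling of each btrfs
option in /proc/mounts depends on the kernel version, so compare against
what your kernel prints.

```shell
# Print the option string the kernel reports for a mount point.
# /proc/mounts fields: device, mount point, fs type, options, dump, pass.
# If the mount point is listed more than once, the last entry wins.
active_mount_options() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '
        $2 == mp { opts = $4 }
        END { if (opts != "") print opts }
    ' "$mounts_file"
}

# Succeed only if OPTION appears in the comma-separated option list.
mount_has_option() {
    case ",$(active_mount_options "$1" "${3:-/proc/mounts}")," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (mount point is a placeholder):
# mount_has_option /mnt/point space_cache && echo "space_cache is active"
```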
I got the first error when I was copying some files and at the same time
created a directory over Samba. After a while I upgraded to 2.6.38.5 but
nothing seems to have changed.
I really don't think there is a hardware error here, but to be safe I am
now running a check on the RAID.
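
For reference, an md check can be started and inspected through sysfs.
This is a sketch, not necessarily the exact method used here: the helper
names are made up, the second argument exists only so the functions can be
pointed at a test directory, and starting a check needs root.

```shell
# Ask md to run a consistency check on an array, e.g. md0.
start_md_check() {
    md_dir="${2:-/sys/block}/$1/md"
    echo check > "$md_dir/sync_action"
}

# Print the current sync action and the mismatch count found so far.
md_check_status() {
    md_dir="${2:-/sys/block}/$1/md"
    printf 'action=%s mismatches=%s\n' \
        "$(cat "$md_dir/sync_action")" "$(cat "$md_dir/mismatch_cnt")"
}

# Example:
# start_md_check md0
# md_check_status md0
```

Note that a clean parity check only shows the stripes are consistent with
each other; it cannot tell whether a block was written to the wrong place.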
This error basically means we didn't write the block. It could be
because the write went to the wrong spot, or the hardware stack messed
it up, or because of a btrfs bug. But, 2.6.38 is relatively recent. It
doesn't look like memory corruption because the transids are fairly
close.
When you grew the raid device, did you grow a partition as well? We've
had trouble in the past with block dev flushing code kicking in as
devices are resized.
No, I did not grow any partitions. I just added one disk to the RAID 5
md0 device, and then grew the btrfs filesystem to max size (no partitions
on md0).
I can remember that as a test (to see if shrink works) I shrank the fs
by 1 GB and then grew it again to max size.
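
To make the sequence easier to reproduce, here is a dry-run sketch that
only prints a plausible set of commands for the steps described. The exact
commands that were run are not given in this thread, so treat every line
as an assumption; the device name and mount point are placeholders, and
the relative order of the shrink test and the first grow is a guess.

```shell
# Print (do not run) a plausible command sequence for the resize steps.
# $1 = the newly added member (placeholder), $2 = mount point (placeholder)
print_resize_steps() {
    new_disk="${1:-/dev/sdX1}"
    mnt="${2:-/mnt/point}"
    cat <<EOF
mdadm --add /dev/md0 $new_disk
mdadm --grow /dev/md0 --raid-devices=4
btrfs filesystem resize max $mnt
btrfs filesystem resize -1g $mnt
btrfs filesystem resize max $mnt
EOF
}

# Example:
# print_resize_steps /dev/sdX1 /mnt/point
```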
Samba isn't doing anything exotic, and 2.6.38 has my recent fixes for
rare metadata corruption bugs in btrfs.
-chris