Re: Data Offset

I swapped the new drive out for the old one; the new drive is hopefully still
labelled /dev/sdf. Before proceeding, I decided to check the update times and
make sure the drives were in the right order, and I corrected the ordering to
match `mdadm --examine`'s output as best I could.
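
For reference, I compared them with something like the loop below, just
pulling the fields that seemed relevant (device names as I currently see them):

  # dump the ordering-relevant superblock fields for each candidate member
  for d in /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf; do
      echo "== $d =="
      mdadm --examine "$d" | grep -E 'Device Role|Data Offset|Update Time|Events'
  done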

Here's the output of `mdadm --examine /dev/sdf` and the result of running the
given `./mdadm` command (with the drives re-ordered). The binary compiled from
the git sources crashed with a segmentation fault while trying to report a
failure writing the superblock. I've tried the drives (the ones with the proper
sizes) in other combinations, following both what you posted and what
`mdadm --examine` reports as the "Device Role", but I haven't found a working
combination. Is it possible my drives got swapped around on reboot? (There's a
serial-number check sketched after the create attempt below.) A re-run of
`mdadm --examine` is at the end of my post.

root@leyline:~/mdadm# mdadm --examine /dev/sdf
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9759ad94:75e30b6b:8a726b4d:177a6eda
           Name : leyline:1  (local to host leyline)
  Creation Time : Mon Sep 12 13:19:00 2011
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
     Array Size : 7813048320 (7451.10 GiB 8000.56 GB)
  Used Dev Size : 3906524160 (1862.78 GiB 2000.14 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2edc16c6:cf45ad32:04b026a4:956ce78b

    Update Time : Fri Jun  1 03:11:54 2012
       Checksum : b3e49e59 - correct
         Events : 2127454

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA. ('A' == active, '.' == missing)
root@leyline:~/mdadm# ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5
--assume-clean -c 512 /dev/sdc3:2048s /dev/sdf:2048s /dev/sdb3:2048s
/dev/sdd3:2048s /dev/sde3:1024s
mdadm: /dev/sdc3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:10:46 2012
mdadm: /dev/sdf appears to contain an ext2fs file system
    size=242788K  mtime=Fri Oct  7 16:55:40 2011
mdadm: /dev/sdf appears to be part of a raid array:
    level=raid5 devices=5 ctime=Mon Sep 12 13:19:00 2011
mdadm: /dev/sdb3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:10:46 2012
mdadm: /dev/sdd3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:10:46 2012
Continue creating array? yes
Segmentation fault
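
To check whether the reboot really did shuffle the device names, I'm going to
match serial numbers against my notes with something like this (assumes
smartmontools is installed):

  # map the current kernel names back to physical drives
  ls -l /dev/disk/by-id/ | grep -E 'sd[b-f]$'
  for d in /dev/sd[b-f]; do
      echo "$d: $(smartctl -i "$d" | grep -i 'Serial Number')"
  done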

Since I couldn't see any harm in running the `./mdadm` command again (but I am
not a smart man, or I would not be in this position), I ran it under valgrind:
==3206== Memcheck, a memory error detector
==3206== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==3206== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for
copyright info
==3206== Command: ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean
-c 512 /dev/sdc3:2048s /dev/sdf:2048s /dev/sdb3:2048s /dev/sdd3:2048s
/dev/sde3:1024s
==3206==
==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==3206==    This could cause spurious value errors to appear.
==3206==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on
writing a proper wrapper.
==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==3206==    This could cause spurious value errors to appear.
==3206==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on
writing a proper wrapper.
==3206== Warning: noted but unhandled ioctl 0x1261 with no size/direction hints
==3206==    This could cause spurious value errors to appear.
==3206==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on
writing a proper wrapper.
mdadm: /dev/sdc3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:14:20 2012
mdadm: /dev/sdf appears to contain an ext2fs file system
    size=242788K  mtime=Fri Oct  7 16:55:40 2011
mdadm: /dev/sdf appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:14:20 2012
mdadm: /dev/sdb3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:14:20 2012
mdadm: /dev/sdd3 appears to be part of a raid array:
    level=raid5 devices=5 ctime=Tue Jun  5 00:14:20 2012
Continue creating array? ==3206== Invalid read of size 8
==3206==    at 0x43C9B7: write_init_super1 (super1.c:1327)
==3206==    by 0x41F1B9: Create (Create.c:951)
==3206==    by 0x407231: main (mdadm.c:1464)
==3206==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==3206==
==3206==
==3206== Process terminating with default action of signal 11 (SIGSEGV)
==3206==  Access not within mapped region at address 0x8
==3206==    at 0x43C9B7: write_init_super1 (super1.c:1327)
==3206==    by 0x41F1B9: Create (Create.c:951)
==3206==    by 0x407231: main (mdadm.c:1464)
==3206==  If you believe this happened as a result of a stack
==3206==  overflow in your program's main thread (unlikely but
==3206==  possible), you can try to increase the size of the
==3206==  main thread stack using the --main-stacksize= flag.
==3206==  The main thread stack size used in this run was 8388608.
==3206==
==3206== HEAP SUMMARY:
==3206==     in use at exit: 37,033 bytes in 350 blocks
==3206==   total heap usage: 673 allocs, 323 frees, 4,735,171 bytes allocated
==3206==
==3206== LEAK SUMMARY:
==3206==    definitely lost: 832 bytes in 8 blocks
==3206==    indirectly lost: 18,464 bytes in 4 blocks
==3206==      possibly lost: 0 bytes in 0 blocks
==3206==    still reachable: 17,737 bytes in 338 blocks
==3206==         suppressed: 0 bytes in 0 blocks
==3206== Rerun with --leak-check=full to see details of leaked memory
==3206==
==3206== For counts of detected and suppressed errors, rerun with: -v
==3206== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
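
If it would help, I can also try to capture the local variables at the crash
point with gdb. A rough sketch of what I'd run is below; I realize re-running
the create writes superblocks again, so I'll hold off unless you want it:

  gdb --args ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
      /dev/sdc3:2048s /dev/sdf:2048s /dev/sdb3:2048s /dev/sdd3:2048s /dev/sde3:1024s
  # at the (gdb) prompt:
  #   run
  #   bt full    # after the SIGSEGV, to see the locals around super1.c:1327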

mdadm --examine of all my drives (again):
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b
           Name : leyline:1  (local to host leyline)
  Creation Time : Tue Jun  5 00:14:35 2012
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB)
     Array Size : 7813046272 (7451.10 GiB 8000.56 GB)
  Used Dev Size : 3906523136 (1862.78 GiB 2000.14 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : feb94069:b2afeb6e:ae6b2af2:f9e3cee4

    Update Time : Tue Jun  5 00:14:35 2012
       Checksum : 1909d79e - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAA ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b
           Name : leyline:1  (local to host leyline)
  Creation Time : Tue Jun  5 00:14:35 2012
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 3906525098 (1862.78 GiB 2000.14 GB)
     Array Size : 7813046272 (7451.10 GiB 8000.56 GB)
  Used Dev Size : 3906523136 (1862.78 GiB 2000.14 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : ed6116ac:4f91c2dd:4ada53df:0e14fc2a

    Update Time : Tue Jun  5 00:14:35 2012
       Checksum : fe0cffd8 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAA ('A' == active, '.' == missing)
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 9ed0b17f:e9a7a813:a9139679:4f8f999b
           Name : leyline:1  (local to host leyline)
  Creation Time : Tue Jun  5 00:14:35 2012
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 3906523136 (1862.78 GiB 2000.14 GB)
     Array Size : 7813046272 (7451.10 GiB 8000.56 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : d839ca02:1d14cde3:65b54275:8caa0275

    Update Time : Tue Jun  5 00:14:35 2012
       Checksum : 3ac0c483 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sde3.



On Mon, Jun 4, 2012 at 5:57 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Mon, 04 Jun 2012 20:26:05 +0200 Pierre Beck <mail@xxxxxxxxxxxxxx> wrote:
>
>> I'll try and clear up some confusion (I was in IRC with freeone3000).
>>
>> /dev/sdf is an empty drive, a replacement for a failed drive. The Array
>> attempted to assemble, but failed and reported one drive as spare. This
>> is the moment we saved the --examines.
>>
>> In expectation of a lost write due to drive write-cache, we executed
>> --assemble --force, which kicked another drive.
>>
>> @James: remove /dev/sdf for now and replace /dev/sde3, which indeed has
>> a very outdated update time, with the non-present drive. Post an
>> --examine of that drive. It should report update time Jun 1st.
>>
>> We tried to re-create the array with --assume-clean. But mdadm chose a
>> different data offset for the drives. A re-create with proper data
>> offset will be necessary.
>
> OK, try:
>
>   git clone -b data_offset git://neil.brown.name/mdadm
>   cd mdadm
>   make
>
>   ./mdadm -C /dev/md1 -e 1.2 -l 5 -n 5 --assume-clean -c 512 \
>      /dev/sdc3:2048s /dev/sdb3:2048s ???  /dev/sdd3:1024s ???
>
> The number after ':' after a device name is a data offset.  's' means sectors.
> Without 's' it means kilobytes.
> I don't know what should be at slot 2 or 4 so I put '???'.  You should fill it
> in.  You should also double check the command and double check the names of
> your devices.
> Don't install this mdadm, and don't use it for anything other than
> re-creating this array.
>
> Good luck.
>
> NeilBrown
>
>>
>> Greetings,
>>
>> Pierre Beck
>>
>>
>> Am 04.06.2012 05:35, schrieb NeilBrown:
>> > On Fri, 1 Jun 2012 19:48:41 -0500 freeone3000<freeone3000@xxxxxxxxx>  wrote:
>> >
>> >> Sorry.
>> >>
>> >> /dev/sde fell out of the array, so I replaced the physical drive with
>> >> what is now /dev/sdf. udev may have relabelled the drive - smartctl
>> >> states that the drive that is now /dev/sde works fine.
>> >> /dev/sdf is a new drive. /dev/sdf has a single, whole-disk partition
>> >> with type marked as raid. It is physically larger than the others.
>> >>
>> >> /dev/sdf1 doesn't have a mdadm superblock. /dev/sdf seems to, so I
>> >> gave output of that device instead of /dev/sdf1, despite the
>> >> partition. Whole-drive RAID is fine, if it gets it working.
>> >>
>> >> What I'm attempting to do is rebuild the RAID from the data from the
>> >> other four drives, and bring the RAID back up without losing any of
>> >> the data. /dev/sdb3, /dev/sdc3, /dev/sdd3, and what is now /dev/sde3
>> >> should be used to rebuild the array, with /dev/sdf as a new drive. If
>> >> I can get the array back up with all my data and all five drives in
>> >> use, I'll be very happy.
>> > You appear to have 3 devices that are happy:
>> >    sdc3 is device 0   data-offset 2048
>> >    sdb3 is device 1   data-offset 2048
>> >    sdd3 is device 3   data-offset 1024
>> >
>> > nothing claims to be device 2 or 4.
>> >
>> > sde3 looks like it was last in the array on 23rd May, a little over
>> > a week before your report.  Could that have been when "sde fell out of the
>> > array" ??
>> > Is it possible that you replaced the wrong device?
>> > Or is it possible that the array was degraded when sde "fell out", resulting
>> > in data loss?
>> >
>> > I need more precise history to understand what happened, as I cannot suggest
>> > a fix until I have that understanding.
>> >
>> > When did the array fail?
>> > How certain are you that you replaced the correct device?
>> > Can you examine the drive that you removed and see what it says?
>> > Are you certain that the array wasn't already degraded?
>> >
>> > NeilBrown
>> >
>



-- 
James Moore

