Re: unknown issues - different sha256 hash - file corruption

The results of the cmp are here:
https://drive.google.com/file/d/0B6RZ_9vVuTEcQ3RUSV9hdXU3eUE/view?usp=sharing
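For anyone who wants to summarize such a cmp -l listing instead of reading it line by line, here is a minimal sketch in Python. It is only an illustration and was not used in this thread: the function name diff_runs and the chunk_size parameter are made up for the example, and it assumes both files have the same size (as reported for the 130GB file). It groups differing bytes into (offset, length) runs, with 0-based offsets, whereas cmp -l prints 1-based byte numbers.

```python
def diff_runs(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (offset, length) runs where the two files differ.

    Hypothetical helper for illustration. Offsets are 0-based.
    Assumes both files have the same size.
    """
    runs = []
    run_start = None  # start offset of the run currently being extended
    offset = 0        # file offset of the first byte of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            if a == b:
                # Identical chunk: any run still open ended at the chunk start.
                if run_start is not None:
                    runs.append((run_start, offset - run_start))
                    run_start = None
            else:
                # Differing chunk: walk it byte by byte.
                for i in range(min(len(a), len(b))):
                    if a[i] != b[i]:
                        if run_start is None:
                            run_start = offset + i
                    elif run_start is not None:
                        runs.append((run_start, offset + i - run_start))
                        run_start = None
            offset += max(len(a), len(b))
    # Close a run that extends to the end of the files.
    if run_start is not None:
        runs.append((run_start, offset - run_start))
    return runs
```

Calling diff_runs("file_on_lvm", "file_on_btrfs") with the placeholder names from the cmp command returns the runs in file order. Whether the runs are single bytes or long stretches, and whether their offsets fall on regular boundaries, may give a hint about where to look next, but it is not conclusive on its own.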

On Tue, Jan 26, 2016 at 2:00 AM, John Smith <lenovomi@xxxxxxxxx> wrote:
> I also copied almost 2TB of small files from LVM to btrfs, and all
> hashes are the same.
>
> Here is the full output of cmp -l file_on_lvm file_on_btrfs:
>
> https://drive.google.com/open?id=0B6RZ_9vVuTEcQ3RUSV9hdXU3eUE
>
>
>
>
> On Tue, Jan 26, 2016 at 1:15 AM, John Smith <lenovomi@xxxxxxxxx> wrote:
>> Hello,
>>
>> I copied data from btrfs to btrfs (same file) and the hashes are the
>> same. Also, btrfs to lvm was okay.
>>
>> Still waiting for the results of lvm to lvm.
>>
>> At the moment I don't have any box around that I can use to replace the cuboxi.
>>
>> Could it really be a HW problem? As you can see, data was copied from
>> btrfs to btrfs without any issues. The problem seems to occur only
>> from lvm to btrfs. I'm so confused.
>>
>> On Mon, Jan 25, 2016 at 11:02 PM, Henk Slager <eye1tm@xxxxxxxxx> wrote:
>>> On Mon, Jan 25, 2016 at 5:53 PM, John Smith <lenovomi@xxxxxxxxx> wrote:
>>>> Hello all,
>>>>
>>>> There is no network involvement in that copy process. The eSATA enclosure
>>>> is attached directly to the cuboxi, and the copy is between 2 SATA drives
>>>> in lvm (using ext4) and two btrfs drives in raid1.
>>>
>>> I thought it was raid0; anyhow, it seems not relevant, it is just a btrfs filesystem.
>>>
>>>> When I copy data from lvm to btrfs, the hash of the file on btrfs is
>>>> different compared to the one on lvm. When I copy exactly the same file
>>>> multiple times, it gets a different hash on btrfs every time.
>>>>
>>>> I did a test: I copied the file from btrfs to lvm and both hashes are the same.
>>>>
>>>>
>>>> What can interfere with / mess up the data when I copy from lvm to btrfs?
>>>
>>> My first thought was that it could be due to your use of SATA port
>>> multiplexing. But then you would likely experience some corruption of
>>> the btrfs fs, although it might take some time (days/weeks) before you
>>> discover it. You could check dmesg to see what is there for ataX.Y.
>>> A mitigation would be to do the same copy tests over USB, although
>>> port multiplexing might still be in effect then; I don't know.
>>>
>>> It could also be that the cubox with its current firmware+software
>>> fails under certain loads (btrfs writing is quite different from
>>> ext4), it might be that it is just your cubox hardware, maybe power
>>> issues or whatever. Or some btrfs/other piece of code overwrites the
>>> rsync/btrfs write buffers before crc every now and then. Or just
>>> memory errors as already suggested.
>>>
>>>>
>>>> All drives are brand new, badblocks was executed on each drive, and
>>>> SMART doesn't show any issues either.
>>>
>>> The drives are not the issue, I think. I would temporarily replace the
>>> cubox with a typical x86_64 system with kernel v4.4 from kernel.org,
>>> connect the icy box via the eSATA port (or USB if you don't have one) and
>>> execute the copy tests lvm -> btrfs. From there you can see if it is
>>> btrfs on the cubox, and maybe then just connect a single SATA disk to
>>> the cubox and repeat the tests, and maybe try a somewhat older kernel
>>> (like 3.18) as well.
>>>
>>> And what if you do this several times on the cubox and on the PC?
>>> # dd if=/dev/zero of=<fileonbtrfs> bs=1M count=130000
>>>
>>>> thank you
>>>>
>>>> On Mon, Jan 25, 2016 at 10:03 AM, Patrik Lundquist
>>>> <patrik.lundquist@xxxxxxxxx> wrote:
>>>>> On 25 January 2016 at 01:12, John Smith <lenovomi@xxxxxxxxx> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> What else, or where, was it corrupted?
>>>>>
>>>>> It got corrupted after leaving the source disk and before btrfs
>>>>> calculated the data checksum during file write.
>>>>>
>>>>>
>>>>>> I didn't check for data checksum
>>>>>> errors. Is that possible with some btrfs tools? But yes, I can read
>>>>>> the whole file after it is stored on btrfs.
>>>>>
>>>>> You wouldn't have been able to read the whole corrupted file from
>>>>> btrfs if the corruption took place after the file was written (due to
>>>>> wrong checksum).
>>>>>
>>>>>
>>>>>> By checksum errors, do you mean sha256 hash?
>>>>>
>>>>> No, I mean the built-in data checksum in btrfs that guarantees file integrity.
>>>>>
>>>>>
>>>>>> I plan to run memcheck, but on IRC it was suggested to me that it is
>>>>>> not a RAM issue, based on the output from the cmp.
>>>>>
>>>>> Perhaps not, but you have to rule that out. Leave the memtest overnight.
>>>>>
>>>>>
>>>>>> The drives are brand new
>>>>>> and badblocks + smart test was executed with no errors.
>>>>>
>>>>> That's great.
>>>>>
>>>>>
>>>>>> On Mon, Jan 25, 2016 at 1:06 AM, Patrik Lundquist
>>>>>> <patrik.lundquist@xxxxxxxxx> wrote:
>>>>>> > On 24 January 2016 at 23:00, John Smith <lenovomi@xxxxxxxxx> wrote:
>>>>>> >>
>>>>>> >> Dear,
>>>>>> >>
>>>>>> >> I have a cubox-i4 running debian with a 4.4 kernel. The icy box
>>>>>> >> IB-3664SU3 enclosure is attached to the cubox using the esata port;
>>>>>> >> the enclosure uses JM393 and JM539 chipsets.
>>>>>> >>
>>>>>> >> I use a btrfs volume in raid0 created from two drives, and an lvm ext4
>>>>>> >> volume that also contains two drives. When I copy (using rsync) a big
>>>>>> >> file (the one I copied is 130GB) from ext4 to btrfs, the sha256 hash
>>>>>> >> differs.
>>>>>> >>
>>>>>> >> I did 2 tests: copy the source file from ext4 to btrfs, compute the
>>>>>> >> sha256 hash. Each time the destination file on btrfs has a different
>>>>>> >> hash compared to the source file located on ext4, and even the hashes
>>>>>> >> of the target files on btrfs from both runs differ.
>>>>>> >>
>>>>>> >> I ran cmp -l <(hexdump source_file_ext4) <(hexdump target_file_btrfs).
>>>>>> >> The snapshot of the result is here: http://paste.debian.net/367678/,
>>>>>> >> there are so many bytes with differences. The size of the source and
>>>>>> >> target file is exactly the same.
>>>>>> >>
>>>>>> >>
>>>>>> >> I also copied a data set of around 600GB that contains small files,
>>>>>> >> music, videos, etc., and I ran sha256 on all the files, ext4 vs btrfs:
>>>>>> >> all was fine.
>>>>>> >>
>>>>>> >> Any idea what can cause that issue or how I can debug it in more detail?
>>>>>> >
>>>>>> > The data must have become corrupted before it was written to the btrfs
>>>>>> > volume, since you can read it back without data checksum errors.
>>>>>> >
>>>>>> > Try copying the big file a couple of times but from btrfs to ext4 to
>>>>>> > see if you get data checksum errors.
>>>>>> >
>>>>>> > Run memcheck and long SMART tests on the disks.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html