Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Thu, Aug 2, 2012 at 7:46 AM, Arne Jansen <sensille@xxxxxxx> wrote:
> On 02.08.2012 13:57, Liu Bo wrote:
>> On 08/02/2012 07:40 PM, Arne Jansen wrote:
>>> On 02.08.2012 13:34, Liu Bo wrote:
>>>> On 08/02/2012 07:18 PM, Arne Jansen wrote:
>>>>> On 02.08.2012 12:36, Liu Bo wrote:
>>>>>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>>>>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>>>>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>>>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>>>>>>>> With commit acce952b0, btrfs was changed to flag the filesystem with
>>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
>>>>>>>>>>>> error, such as write I/O errors on all mirrors.
>>>>>>>>>>>> In such situations, on unmount, the superblock is written in
>>>>>>>>>>>> btrfs_error_commit_super(). This is done with the intention to be able
>>>>>>>>>>>> to evaluate the error flag on the next mount. A warning is printed
>>>>>>>>>>>> in this case during the next mount and the log tree is ignored.
>>>>>>>>>>>>
>>>>>>>>>>>> The issue is that it is possible that the superblock points to a root
>>>>>>>>>>>> that was not written (due to write I/O errors).
>>>>>>>>>>>> The result is that the filesystem cannot be mounted. btrfsck also does
>>>>>>>>>>>> not start and all the other btrfs-progs tools fail to start as well.
>>>>>>>>>>>> However, mount -o recovery is working well and does the right things
>>>>>>>>>>>> to recover the filesystem (i.e., don't use the log root, clear the
>>>>>>>>>>>> free space cache and use the next mountable root that is stored in the
>>>>>>>>>>>> root backup array).
>>>>>>>>>>>>
>>>>>>>>>>>> This patch removes the writing of the superblock when
>>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
>>>>>>>>>>>> flag in the mount function.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, I have to admit that this can be a serious problem.
>>>>>>>>>>>
>>>>>>>>>>> But in the future we'll need to get the error flag stored in the super
>>>>>>>>>>> block onto disk, so that the next mount can see that the filesystem is
>>>>>>>>>>> unstable and maybe run fsck by itself.
>>>>>>>>>>
>>>>>>>>>> Hum, that's possible. However, I neither see
>>>>>>>>>>
>>>>>>>>>> a) a safe way to get that flag to disk
>>>>>>>>>>
>>>>>>>>>> nor
>>>>>>>>>>
>>>>>>>>>> b) a situation where this flag would help. When we abort a transaction, we just
>>>>>>>>>> roll everything back to the last commit, i.e. a consistent state. So if we stop
>>>>>>>>>> writing a potentially corrupt super block, we should be fine anyway. Or am I
>>>>>>>>>> missing something?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm just wondering if we can roll everything back well, why do we need fsck?
>>>>>>>>
>>>>>>>> If the disks support barriers, we roll everything back very well. The
>>>>>>>> most recent superblock on the disks always defines a consistent
>>>>>>>> filesystem state. There are only two remaining filesystem consistency
>>>>>>>> issues left that can cause inconsistent states, one is the one that the
>>>>>>>> patch in this email addresses, and the second one is that the error
>>>>>>>> result from barrier_all_devices() is ignored (which I want to change next).
>>>>>>>
>>>>>>> Hi Liu Bo,
>>>>>>>
>>>>>>> Do you have any remaining objections to that patch?
>>>>>>>
>>>>>>
>>>>>> Hi Stefan,
>>>>>>
>>>>>> Still I have another question:
>>>>>>
>>>>>> Our metadata can be flushed into disk if we reach the limit, 32k, so we
>>>>>> can end up with updated metadata and the latest superblock if we do not
>>>>>> write the current super block.
>>>>>
>>>>> The old metadata stays valid until the new superblock is written,
>>>>> so no problem here, or maybe I don't understand your question :)
>>>>>
>>>>
>>>> Yeah, Arne, you're right :)
>>>>
>>>> But for undetected and unexpected errors, as Arne had mentioned, I want
>>>> to keep the error flag, which can inform users that running fsck on
>>>> this FS is at least recommended (but not required).
>>>
>>> How about storing the flag in a different location than the superblock?
>>> If the fs is in an unknown state, every write potentially makes it only
>>> worse.
>>>
>>
>> IMO it does not make sense if we don't write the flag to disk, and on
>> ext4's side, it just tries to write the super block.
>>
>> Anyway, for now, our error flag has only been stored in memory, so what
>> about just keeping it until we find a graceful way?
>
> Yeah, we need this patch to restore consistency. We can define a fixed
> area on disk (e.g. behind the superblock) where we can write the flag
> to without risking the superblock.

Is there a reason btrfs_error_commit_super couldn't do the same as treelog:
update only the first superblock via max_mirrors=1?  I'd expect that
fsck, -o recovery and so forth should all handle this correctly
already, and we even have documentation that discusses it.
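
To make the max_mirrors=1 idea concrete, here is a rough userspace sketch,
not kernel code. Every toy_* name and the TOY_* constants are made up for
illustration; only the offset arithmetic (primary copy at 64 KiB, mirrors
at 64 MiB and 256 GiB) is meant to mirror how btrfs lays out its
superblock copies, and treating max_mirrors == 0 as "all copies" is an
assumption about the convention.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SUPER_INFO_OFFSET  (64ULL * 1024)  /* primary copy at 64 KiB */
#define TOY_SUPER_MIRROR_MAX   3
#define TOY_SUPER_MIRROR_SHIFT 12
#define TOY_SUPER_FLAG_ERROR   (1ULL << 2)     /* illustrative flag bit */

/* Hypothetical stand-in for one on-disk superblock copy. */
struct toy_super {
	uint64_t generation;
	uint64_t flags;
};

/* Byte offset of superblock copy 'mirror' (0 = primary). */
static uint64_t toy_sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	assert(mirror >= 0 && mirror < TOY_SUPER_MIRROR_MAX);
	if (mirror)
		return start << (TOY_SUPER_MIRROR_SHIFT * mirror);
	return TOY_SUPER_INFO_OFFSET;
}

/*
 * Copy 'sb' into the first 'max_mirrors' slots of 'copies'.
 * max_mirrors == 0 is treated as "all copies" (assumed convention).
 * Returns the number of copies written.
 */
static int toy_write_supers(struct toy_super copies[TOY_SUPER_MIRROR_MAX],
			    const struct toy_super *sb, int max_mirrors)
{
	int i;

	if (max_mirrors <= 0 || max_mirrors > TOY_SUPER_MIRROR_MAX)
		max_mirrors = TOY_SUPER_MIRROR_MAX;

	for (i = 0; i < max_mirrors; i++)
		copies[i] = *sb;

	return i;
}
```

With max_mirrors=1 only copies[0] is touched, so the copies at the two
mirror offsets still describe the last good commit. If the primary then
turns out to point at a root that never reached disk, a recovery path
could in principle fall back to one of the untouched copies.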

