Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

On Mon, Jan 11, 2016 at 03:20:36PM -0700, Chris Murphy wrote:
> On Mon, Jan 11, 2016 at 3:10 PM, Hugo Mills <hugo@xxxxxxxxxxxxx> wrote:
> > On Mon, Jan 11, 2016 at 02:31:41PM -0700, Chris Murphy wrote:
> >> On Mon, Jan 11, 2016 at 2:03 AM, Hugo Mills <hugo@xxxxxxxxxxxxx> wrote:
> >> > On Sun, Jan 10, 2016 at 05:13:28PM -0700, Chris Murphy wrote:
> >> >> On Sat, Jan 9, 2016 at 2:04 PM, Hugo Mills <hugo@xxxxxxxxxxxxx> wrote:
> >> >> > On Sat, Jan 09, 2016 at 09:59:29PM +0100, cheater00 . wrote:
> >> >> >> OK. How do we track down that bug and get it fixed?
> >> >> >
> >> >> >    I have no idea. I'm not a btrfs dev, I'm afraid.
> >> >> >
> >> >> >    It's been around for a number of years. None of the devs has, I
> >> >> > think, had the time to look at it. When Josef was still (publicly)
> >> >> > active, he had it second on his list of bugs to look at for many
> >> >> > months -- but it always got trumped by some new bug that could cause
> >> >> > data loss.
> >> >>
> >> >>
> >> >> Interesting. I did not know of this bug. It's pretty rare.
> >> >
> >> >    Not really. It shows up maybe on average once a week on IRC. It
> >> > gets reported much less on the mailing list.
> >>
> >> Is there a pattern? Does it only happen at a 2TiB threshold?
> >
> >    No, and no.
> >
> >    There is, as far as I can tell from some years of seeing reports of
> > this bug, no correlation with RAID level, hardware, OS, kernel
> > version, FS size, usage of the FS at failure, or allocation level of
> > either data or metadata at failure.
> >
> >    I haven't tried correlating with the phase of the moon or the
> > losses on Lloyds Register yet.
> 
> Huh. So it's goofy cakes.
> 
> This is specifically where btrfs_free_extent produces errno -28 no
> space left, and then the fs goes read-only?

   The symptoms I'm using for a diagnosis of this bug are that the FS
runs out of (usually data) space when there's still unallocated space
remaining that it could use for another block group.
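
   The diagnostic above can be written as a simple check. This is a
minimal sketch, not anything taken from btrfs itself: the names
FsUsage and matches_symptom are hypothetical, the byte counts are
assumed to be read by hand from something like the "Device size" and
"Device allocated" figures of `btrfs filesystem usage`, and the 1 GiB
threshold is an assumed nominal size for one more data block group.

```python
from dataclasses import dataclass

GIB = 1024 ** 3
TIB = 1024 ** 4


@dataclass
class FsUsage:
    """Hypothetical container for two figures read off the filesystem."""
    device_size: int        # total bytes across all devices
    device_allocated: int   # bytes already assigned to block groups

    @property
    def unallocated(self) -> int:
        # Space not yet assigned to any block group.
        return self.device_size - self.device_allocated


def matches_symptom(usage: FsUsage, got_enospc: bool,
                    min_block_group: int = GIB) -> bool:
    """True if ENOSPC was hit while there was still room for a block group.

    min_block_group is an assumption (1 GiB), not a value from the thread.
    """
    return got_enospc and usage.unallocated >= min_block_group


# Illustrative numbers loosely based on the subject line:
# a 6 TiB device with only about 2 TiB allocated.
suspect = FsUsage(device_size=6 * TIB, device_allocated=2 * TIB)
genuinely_full = FsUsage(device_size=6 * TIB, device_allocated=6 * TIB)
```

   With these numbers, matches_symptom(suspect, True) is true, while
matches_symptom(genuinely_full, True) is false: the second case is an
ordinary full filesystem rather than this bug.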

   Forced RO isn't usually a symptom, although the FS can get into a
state where you can't modify it (as distinct from being explicitly
read-only).

   Block-group-level operations, such as balance, device delete, and
device add, sometimes seem to have some kind of (usually small) effect
on the point at which the error occurs. If you hit the problem and run a
balance, you might end up making things worse by a couple of
gigabytes, or making things better by the same amount, or having no
effect at all.

   Hugo.

-- 
Hugo Mills             | "What are we going to do tonight?"
hugo@... carfax.org.uk | "The same thing we do every night, Pinky. Try to
http://carfax.org.uk/  | take over the world!"
PGP: E2AB1DE4          |


