> On 28 Jun 2019, at 1:58 PM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote: > > > > On 2019/6/28 上午10:47, Anand Jain wrote: >> On 27/6/19 10:58 PM, David Sterba wrote: >>> On Tue, Jun 25, 2019 at 04:24:57PM +0800, Qu Wenruo wrote: >>>> Ping? >>>> >>>> This patch should fix the problem of compressed extent even when >>>> nodatasum is set. >>>> >>>> It has been one year but we still didn't get a conclusion on where >>>> force_compress should behave. >>> >>> Note that pings to patches sent year ago will get lost, I noticed only >>> because you resent it and I remembered that we had some discussions, >>> without conclusions. >>> >>>> But at least to me, NODATASUM is a strong exclusion for compress, no >>>> matter whatever option we use, we should NEVER compress data without >>>> datasum/datacow. >>> >>> That's correct, >> >> But I wonder what's the reason that datasum/datacow is prerequisite for >> the compression ? > > It's easy to understand the hard requirement for data COW. > > If you overwrite compressed extent, a powerloss before transaction > commit could easily screw up the data. This scenario is even true for non compressed data, right? > Furthermore, what will happen if you're overwriting a 16K data extent > while its original compressed size is only 4K, while the new data after > compression is 8K? Sorry I can’t think of any limitation, any idea? > For checksum, I'd say it's not as a hard requirement as data cow, but > still a very preferred one. > > Since compressed data corruption could cause more problem, e.g. one bit > corruption can cause the whole extent to be corrupted, it's highly > recommended to have checksum to protect them. Isn’t it true that compression boundary/granularity is an extent, so the corrupted extent remains corrupted the same way after the decompress, and corruption won’t perpetuate to the other extents in the file. But can a non-impending corruption due to external factors be the reason for datasum perrequisite? it can’t be IMO. Thanks, Anand > Thanks, > Qu >> >> Thanks, Anand >> >>> but the way you fix it is IMO not right. This was also >>> noticed by Nikolay, that there are 2 locations that call >>> inode_need_compress but with different semantics. >>> >>> One is the decision if compression applies at all, and the second one >>> when that's certain it's compression, to do it or not based on the >>> status decision of eg. heuristics. >>> >> >
