On 07.07.2019 14:15 Qu Wenruo wrote: > > On 2019/7/6 上午4:28, Martin Raiber wrote: >> More research on this. Seems a generic error reporting mechanism for >> this is in the works https://lkml.org/lkml/2018/6/1/640 . > sync() system call is defined as void sync(void); thus it has no error > reporting. > > syncfs() could report error. > >> Wrt. to btrfs one should always use BTRFS_IOC_SYNC because only this one >> seems to wait for delalloc work to finish: >> https://patchwork.kernel.org/patch/2927491/ (five year old patch where >> Filipe Manana added this to BTRFS_IOC_SYNC and with v2->v3 not to >> syncfs() ). >> >> I was smart enough to check if the filesystem is still writable after a >> syncfs() (so the missing error return doesn't matter that much) but I >> guess the missing wait for delalloc can cause the application to think >> data is on disk even though it isn't. > Isn't syncfs() enough to return error for your use case? > > Another solution is fsync(). It's ensured to return error if data > writeback or metadata update path has something wrong. > IIRC there are quite some fstests test cases using this way to detect fs > misbehavior. > > Testing if the fs can be written after sync() is not enough in fact. > If you're doing buffer write, it only covers the buffered write part, > which normally just includes space preallocation and copy data to page > cache, doesn't include the data write back nor metadata update. > > So I'd recommend to stick to fsync() if you want to make sure your data > reach disk. This does not only apply to btrfs, but all Linux filesystems. > > Thanks, > Qu This is for UrBackup (Open Source backup software). What it does is, create btrfs snapshot of last backup of a client, sync the current client fs to the btrfs snapshot, then call syncfs(btrfs snapshot), then check if the snapshot is still writable, then set the backup to complete in its internal database. Calling fsync() on every file would kill performance (especially on btrfs). The problem I had was that there was a (complete in database) backup that had files with wrong checksums (UrBackup does its own checksums, the btrfs ones were okay), and missing files. On the day the corrupted backup completed the btrfs went read-only a few hours after the backup completed and the syncfs() was called with: [253018.670661] BTRFS: error (device md1) in btrfs_run_delayed_refs:2950: errno=-5 IO failure So my guess is using BTRFS_IOC_SYNC instead of syncfs() fixes the problem in my case, while it would probably be nice if syncfs() returns an error if the fs fails (it doesn't -- I tested it) and waits for everything to be written to disk (as expected, and the man-page somewhat confirms). > >> On 05.07.2019 16:22 Martin Raiber wrote: >>> Hi, >>> >>> I realize this isn't a btrfs specific problem but syncfs() returns no >>> error even on complete fs failure. The problem is (I think) that the >>> return value of sb->s_op->sync_fs is being ignored in fs/sync.c. I kind >>> of assumed it would return an error if it fails to write the file system >>> changes to disk. >>> >>> For btrfs there is a work-around of using BTRFS_IOC_SYNC (which I am >>> going to use now) but that is obviously less user friendly than syncfs(). >>> >>> Regards, >>> Martin Raiber >>
