On Sat, Jan 24, 2015 at 01:06:01PM -0500, Zygo Blaxell wrote:
> I am seeing a lot of spurious I/O errors that look like they come from
> the cache-facing side of btrfs.  While running a heavy load with some
> extent-sharing (e.g. building 20 Linux kernels at once from source trees
> copied with 'cp -a --reflink=always'), some files will return spurious
> EIO on read.  It happens often enough to prevent a Linux kernel build
> about 1/3 of the time.
[...]
> Observed from 3.17..3.18.3.  All filesystems affected use skinny-metadata.
> No filesystems that are not using skinny-metadata seem to have this
> problem.

I ran a test overnight using 3.18.3 on a freshly formatted filesystem
with no skinny-metadata.  The test consisted of creating reflink copies
of a Linux kernel source tree and running kernel builds in each copy
simultaneously, like this:

	# assume you have a ready-to-build kernel tree in 'linux'
	for x in $(seq 1 5); do
		cp -a --reflink linux linux-$x
	done

	# build all the kernels at once
	for x in $(seq 1 5); do
		(cd linux-$x && make -j10 2>&1 | tee make.log) &
	done
	wait

	# then tail all the make.logs and see how many failed due to
	# I/O errors

Spurious I/O errors occurred with as few as two concurrent kernel builds.

The test machine has 16GB of RAM and the filesystem is also 16GB, RAID1
on two spinning disks.
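
The last step of the test, checking each make.log for I/O errors, could
be scripted roughly as follows.  This is only a sketch: the function name
count_eio_failures is made up for illustration, and it assumes the failing
tools report EIO using the usual glibc message text "Input/output error",
which may not hold for every program run during a kernel build.

```shell
# count_eio_failures [DIR]
# Hypothetical helper: prints how many DIR/linux-*/make.log files
# contain the EIO message text.  DIR defaults to the current directory.
count_eio_failures() {
    dir=${1:-.}
    failed=0
    for log in "$dir"/linux-*/make.log; do
        # skip trees that have no log (or an unmatched glob)
        [ -f "$log" ] || continue
        # assumption: EIO shows up as the glibc strerror() text
        if grep -q 'Input/output error' "$log"; then
            echo "EIO in $log" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed"
}
```

Running "count_eio_failures ." in the directory holding the linux-$x
copies after the builds finish prints the number of affected builds and
lists the affected logs on stderr.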
