Hi all,

We are experiencing strange Input/output errors on our mail server (dovecot pop/imap), which uses btrfs for its mailspool.

The server uses software RAID10. The RAID is split into LVM logical volumes, and the mailspool logical volume uses btrfs.

For several days now we have been seeing Input/output errors on different files. We could pinpoint the first occurrence of the errors to a day when one of the RAID disks failed. More precisely, it all started while the RAID was rebuilding.

All affected files are files that are read/written frequently, like dovecot index files, maildirsize and the like. Most files return to a usable state after some time (sometimes days, sometimes minutes), which is why we didn't notice the errors right away.

We do snapshot and send/receive backups of the mailspool via cron. A file that is unreadable in the mailspool is totally OK on the backup volume, even in the latest 'copy', which must have been taken from an otherwise unreadable file. It is also possible to restore an unreadable file from backup without an Input/output error. The file is fine and usable then.

All disks taking part in the RAID show no SMART errors, btw, and a btrfs scrub on the mailspool did not indicate any errors (I somehow expected that).

All this runs on a virtual machine that uses kernel 4.1.3 (Debian build) and btrfs-progs v4.0.

So finally I would ask: what can we do to solve this problem? I would also appreciate comments on the situation and of course hints as to what is going on. This is over my head.

Thanks and cheers,
--
J.Hofmüller

Nisiti
- Abie Nathan, 1927-2008
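
For anyone wanting to reproduce the check: a minimal sketch of how the currently unreadable files could be listed, by walking a directory tree and reading every regular file to the end. The function name `find_unreadable` and the example path are illustrative assumptions, not part of the setup described above. Files that vanish between listing and opening (normal on a busy Maildir) will also show up, with ENOENT rather than EIO.

```python
import errno
import os


def find_unreadable(root):
    """Return (path, errno) pairs for regular files under root that fail to read."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip anything that is not a regular file (e.g. broken symlinks).
            if not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    # Read in 1 MiB chunks so every block of the file is touched.
                    while fh.read(1024 * 1024):
                        pass
            except OSError as exc:
                # errno.EIO is the "Input/output error" case; other codes
                # (ENOENT, EACCES, ...) are recorded too so they can be told apart.
                failures.append((path, exc.errno))
    return failures


def describe(code):
    """Human-readable label for an errno value recorded by find_unreadable."""
    return "EIO" if code == errno.EIO else errno.errorcode.get(code, str(code))
```

Calling `find_unreadable("/path/to/mailspool")` (placeholder path) from cron and logging the result with a timestamp would show which files are affected at a given moment and when they become readable again.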
