dmcrypt on top of raid5, or raid5 on top of dmcrypt?

I have a btrfs filesystem with many, many files that got slow, likely due to
btrfs optimization issues, but someone pointed out that I should also look
at write amplification problems.

This is my current array:
gargamel:~# mdadm --detail /dev/md8
/dev/md8:
        Version : 1.2
  Creation Time : Thu Mar 25 20:15:00 2010
     Raid Level : raid5
     Array Size : 7814045696 (7452.05 GiB 8001.58 GB)
  Used Dev Size : 1953511424 (1863.01 GiB 2000.40 GB)
    Persistence : Superblock is persistent
  Intent Bitmap : Internal
         Layout : left-symmetric
     Chunk Size : 512K   < I guess this is too big

http://superuser.com/questions/305716/bad-performance-with-linux-software-raid5-and-luks-encryption
says:
"LUKS has a botleneck, that is it just spawns one thread per block device.

Are you placing the encryption on top of the RAID 5? Then from the point of
view of your OS you just have one device, then it is using just one thread
for all those disks, meaning disks are working in a serial way rather than
parallel."
but that claim was disputed in a reply.
Does anyone know whether this is still valid/correct in 3.14?
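One way I figure I can sanity-check this before rebuilding (a rough sketch;
dmcrypt_md8 is just a placeholder for whatever the dmcrypt mapping on top of
md8 is called): compare raw md read throughput with reads going through
dmcrypt, and watch the kcryptd kernel threads while the second dd runs:

gargamel:~# dd if=/dev/md8 of=/dev/null bs=1M count=4096 iflag=direct
gargamel:~# dd if=/dev/mapper/dmcrypt_md8 of=/dev/null bs=1M count=4096 iflag=direct
gargamel:~# ps -eLo pid,psr,pcpu,comm | grep kcryptd

If the second dd is much slower and kcryptd sits on a single CPU, the
one-thread-per-device theory would seem to hold on my kernel.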

Since I'm going to recreate the filesystem anyway, given the troubles I've had
with it, I might as well do it better this time :)
(but copying everything back will take days, so I'd rather get it right the first time)

How would you recommend I create the array when I rebuild it?

This filesystem contains many backups with many files, most of them small, and
ideally identical files are hardlinked together (many files, many hardlinks):
gargamel:~# btrfs fi df /mnt/btrfs_pool2
Data, single: total=3.28TiB, used=2.29TiB
System, DUP: total=8.00MiB, used=384.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=74.50GiB, used=70.11GiB  <<< lots of metadata
Metadata, single: total=8.00MiB, used=0.00


#1 move the intent bitmap to another device, as sketched below this list. I
   have /boot on swraid1 with ext4, so I'll likely use that (the man page only
   mentions ext3, but I hope ext4 is fine too, right?)
#2 change the chunk size to something smaller? Is 128K better?
#3 anything else?
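For #1, I believe something like this moves the bitmap off the array (untested
here; /boot/md8-bitmap is just a placeholder path on my swraid1 /boot):

gargamel:~# mdadm --grow --bitmap=none /dev/md8
gargamel:~# mdadm --grow --bitmap=/boot/md8-bitmap /dev/md8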

Then, I used this for dmcrypt:
cryptsetup luksFormat --align-payload=8192 -s 256 -c aes-xts-plain64  

That align-payload value was good for my SSD, but is probably not right for a hard drive array.
http://wiki.drewhess.com/wiki/Creating_an_encrypted_filesystem_on_a_partition
says
"To calculate this value, multiply your RAID chunk size in bytes by the
number of data disks in the array (N/2 for RAID 1, N-1 for RAID 5 and N-2
for RAID 6), and divide by 512 bytes per sector."

So 512K * 4 data disks / 512 = 4096 sectors.
In other words, I can use --align-payload=4096 for a small reduction of write
amplification, or --align-payload=1024 if I change my raid chunk size to 128K.

Correct? 
Do you recommend that I indeed rebuild that raid5 with a chunk size of 128K?
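If 128K is indeed the way to go, I'm thinking of something along these lines
when I recreate the array and the dmcrypt layer on top (the drive names are
placeholders for my 5 disks, and the align-payload matches the
128K * 4 / 512 math above):

gargamel:~# mdadm --create /dev/md8 --level=5 --raid-devices=5 --chunk=128 \
              --bitmap=/boot/md8-bitmap /dev/sd[bcdef]1
gargamel:~# cryptsetup luksFormat --align-payload=1024 -s 256 -c aes-xts-plain64 /dev/md8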

Other bits I found that may help others:
http://superuser.com/questions/305716/bad-performance-with-linux-software-raid5-and-luks-encryption

This seems to help work around the write amplification a bit:
for i in /sys/block/md*/md/stripe_cache_size; do echo 16384 > $i; done

This looks like an easy thing, done.
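Since that sysfs setting doesn't survive a reboot, I'll also drop it into a
boot script (rc.local is just where I happen to put such things):

echo 16384 > /sys/block/md8/md/stripe_cache_size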

If you have other suggestions/comments, please share :)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  