On 4/03/2012 8:42 AM, Stan Hoeppner wrote:
> On 3/3/2012 1:36 PM, Steven Haigh wrote:
>> Hi all, I just wanted to run this past a few folk here as I want to make sure I'm doing it the Right Way(tm). I've decided to experiment with using a 128KB chunk size on my RAID6 instead of a 64KB chunk.
>
> Why? Does your target application(s) perform better with a larger chunk, and therefore larger total stripe size? If you're strictly after larger dd copy numbers then you're wasting everyone's time, including yours, as such has almost zero bearing on real world performance, as most workloads are far more random than sequential.
Purely experimental, for fun and education. I actually thought that a reshape would go at somewhere near the resync speeds I get of ~60-90MB/sec. I guess this shows I'm wrong ;)
> And apparently you're not using XFS. This reshape will screw up your alignment, and you'll need to change your fstab mount options to reflect the new RAID geometry. But my guess is you're not using it. If you were, you'd probably be experienced enough to know that doubling your chunk size isn't going to make much difference, if any, in real world system usage.
I do use XFS - but this machine's role is a Xen Dom0 - so md2 holds the filesystems for the guest VMs in LVs. One of those guest filesystems is an LV of the VG on md2, formatted as XFS. It will be interesting to see how this affects things :)
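For what it's worth, the new alignment can be worked out ahead of time. A minimal sketch, assuming the 7-drive RAID6 implied by sd[abcdefg] (so 5 data disks) and the new 128KB chunk; note that the XFS sunit/swidth mount options are expressed in 512-byte sectors:

```shell
# Recompute XFS stripe alignment after the chunk-size reshape (sketch).
CHUNK_KB=128     # new chunk size after the reshape
DATA_DISKS=5     # 7-drive RAID6 = 7 - 2 parity = 5 data disks (assumed)

SUNIT=$(( CHUNK_KB * 1024 / 512 ))   # stripe unit in 512-byte sectors
SWIDTH=$(( SUNIT * DATA_DISKS ))     # full stripe width in sectors

echo "mount -o sunit=${SUNIT},swidth=${SWIDTH}"   # -> sunit=256, swidth=1280
```

The resulting sunit/swidth pair is what would go into the fstab mount options for the XFS filesystem once the reshape finishes.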
>> I set a few 'optimisations' that I believe should help:
>>
>> ## Tweak the RAIDs
>> blockdev --setra 8192 /dev/sd[abcdefg]
>
> Read-ahead is per file descriptor, and occurs at the filesystem level. The read-ahead value used is that of the device immediately underlying the filesystem. So don't bother setting these above.
Interesting - I didn't think that was the case for whole disk arrays - but there you go... Learnt something else :)
>> blockdev --setra 8192 /dev/md0
>> blockdev --setra 8192 /dev/md1
>> blockdev --setra 16384 /dev/md2
>
> This is fine. You could theoretically set this to 1GB or more if you always read entire files, with no ill effects, as read-ahead doesn't go past EOF. However, if you do any mmap reads (many apps do) of portions of large files, this will hammer performance, obviously, as you're reading entire large files speculatively when not needed. Play with this at your own risk.
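A small aside on units, since it's easy to misread these values: `blockdev --setra` takes 512-byte sectors, not KB, so the settings above are larger than they might look:

```shell
# blockdev --setra units are 512-byte sectors, not KB.
for RA in 8192 16384; do
    echo "--setra ${RA} = $(( RA * 512 / 1024 )) KB of read-ahead"
done
# 8192 sectors = 4096 KB (4MB); 16384 sectors = 8192 KB (8MB)
```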
The workloads of the array (having LVM on top) for the VMs would probably make it quite random. This is part of the reason I am playing here - pure experimentation. I am very curious to see if it works better or worse after the reshape. I honestly don't know :)
>> echo 16384 > /sys/block/md2/md/stripe_cache_size
>>
>> for i in sda sdb sdc sdd sde sdf; do
>>     echo "Setting options for $i"
>>     echo 256 > /sys/block/$i/queue/nr_requests
>>     echo 4096 > /sys/block/$i/queue/read_ahead_kb
>
> Eliminate this line ^^^^
Any insight into why? I would have thought that this would help - however I'm not quite sure about the values, as this is much less than one chunk... That being said, wouldn't it be a good idea to have *some* read-ahead?
>> echo 1 > /sys/block/$i/device/queue_depth
>> echo deadline > /sys/block/$i/queue/scheduler
>> done
>>
>> Just wondering if anyone knows of any possible way to speed up the reshape a little, or if (like I suspect) it will take ~2 days to complete the reshape.
>
> Considering how expensive such operations are in both time and wear on the disk drives, it's better to read everything available to you on the subject and ask questions *before* performing expensive experiments on your array. If you currently have a performance problem you're trying to solve, the cause lies somewhere other than your chunk size.
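One thing worth knowing before pushing stripe_cache_size that high: it pins real memory. A rough sketch of the cost, assuming the usual accounting of one 4KB page per cache entry per member disk:

```shell
STRIPE_CACHE=16384   # entries, as set via /sys/block/md2/md/stripe_cache_size
NDISKS=7             # sd[abcdefg] (assumed from the array above)
PAGE_KB=4            # page size on x86

echo "$(( STRIPE_CACHE * PAGE_KB * NDISKS / 1024 )) MB pinned by the stripe cache"
# -> 448 MB
```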
As I said above, there really is no 'problem' I'm trying to solve. The whole reason is experimentation and education - really to see a 'what if' case. The last reshape I did on this array was a RAID5->RAID6 grow which went very well - however I have never experimented with chunk size on an mdadm RAID.
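For anyone trying this at home, two knobs that may help the reshape along (a sketch only; the device and backup path are assumptions, and a chunk-size reshape still won't approach plain resync speed, since every stripe must be read, re-striped, and rewritten):

```shell
# Raise the per-array minimum sync speed so the reshape isn't throttled (KB/s):
echo 100000 > /sys/block/md2/md/sync_speed_min

# A chunk-size change needs a backup file on a device outside the array,
# so mdadm can recover the stripes in flight if interrupted:
mdadm --grow /dev/md2 --chunk=128 --backup-file=/root/md2-reshape.backup
```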
-- Steven Haigh Email: netwiz@xxxxxxxxx Web: http://www.crc.id.au Phone: (03) 9001 6090 - 0412 935 897 Fax: (03) 8338 0299