- Subject: Re: [Lsf-pc] [LSF/MM TOPIC] a few storage topics
- From: Jeff Moyer <jmoyer@xxxxxxxxxx>
- Date: Tue, 24 Jan 2012 13:05:50 -0500
- Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, "linux-scsi@xxxxxxxxxxxxxxx" <linux-scsi@xxxxxxxxxxxxxxx>, Mike Snitzer <snitzer@xxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, "dm-devel@xxxxxxxxxx" <dm-devel@xxxxxxxxxx>, Boaz Harrosh <bharrosh@xxxxxxxxxxx>, "linux-fsdevel@xxxxxxxxxxxxxxx" <linux-fsdevel@xxxxxxxxxxxxxxx>, "lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx" <lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx>, Chris Mason <chris.mason@xxxxxxxxxx>
- In-reply-to: <186EA560-1720-4975-AC2F-8C72C4A777A9@dilger.ca> (Andreas Dilger's message of "Tue, 24 Jan 2012 10:08:47 -0700")
- References: <20120117213648.GA9457@quack.suse.cz> <20120118225808.GA3074@tux1.beaverton.ibm.com> <20120118232200.GA22019@quack.suse.cz> <4F1758D4.9010401@panasas.com> <20120119094637.GA23442@quack.suse.cz> <4F1BFF5F.6000502@panasas.com> <20120123161857.GC28526@quack.suse.cz> <20120123175353.GD30782@redhat.com> <x49r4yq9suf.fsf@segfault.boston.devel.redhat.com> <20120124151504.GQ4387@shiny> <20120124165631.GA8941@infradead.org> <186EA560-1720-4975-AC2F-8C72C4A777A9@dilger.ca>
- Reply-to: device-mapper development <dm-devel@xxxxxxxxxx>
- User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.1 (gnu/linux)
Andreas Dilger <adilger@xxxxxxxxx> writes:
> On 2012-01-24, at 9:56, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>> On Tue, Jan 24, 2012 at 10:15:04AM -0500, Chris Mason wrote:
>>> https://lkml.org/lkml/2011/12/13/326
>>>
>>> This patch is another example, although for a slightly different reason.
>>> I really have no idea yet what the right answer is in a generic sense,
>>> but you don't need a 512K request to see higher latencies from merging.
>>
>> That assumes the 512k request is created by merging. We have enough
>> workloads that create large I/O from the get go, and not splitting them
>> and eventually merging them again would be a big win. E.g. I'm
>> currently looking at a distributed block device which uses internal 4MB
>> chunks, and increasing the maximum request size to that dramatically
>> increases the read performance.
>
> (sorry about last email, hit send by accident)
>
> I don't think we can have a "one size fits all" policy here. In most
> RAID devices the IO size needs to be at least 1MB, and with newer
> devices 4MB gives better performance.
Right, and there's more to it than just I/O size. There's access
pattern, and more importantly, workload and related requirements
(latency vs throughput).
> One of the reasons that Lustre used to hack so much around the VFS and
> VM APIs is exactly to avoid the splitting of read/write requests into
> pages and then depend on the elevator to reconstruct a good-sized IO
> out of it.
>
> Things have gotten better with newer kernels, but there is still a
> ways to go w.r.t. allowing large IO requests to pass unhindered
> through to disk (or at least as far as ensuring that the IO is aligned
> to the underlying disk geometry).
I've been wondering whether it has gotten better, so I decided to run a
few quick tests.

kernel version: 3.2.0, storage: HP EVA FC array, I/O scheduler: cfq,
max_sectors_kb: 1024, test program: dd
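For anyone repeating this, the two queue settings listed above can be read back from sysfs before a run. The following is only a rough sketch: the helper names are made up for illustration, and the device name passed in is whatever your test disk happens to be.

```python
from pathlib import Path


def parse_active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_queue_settings(device, sysfs_root="/sys/block"):
    """Read the request-size limit and the active scheduler for one block device.

    `device` is a bare name such as "sdb" (a placeholder, not from the tests above).
    """
    queue = Path(sysfs_root) / device / "queue"
    max_kb = int((queue / "max_sectors_kb").read_text().strip())
    scheduler = parse_active_scheduler((queue / "scheduler").read_text())
    return {"max_sectors_kb": max_kb, "scheduler": scheduler}
```

Calling read_queue_settings("sdb") on a box configured like the one above should report a max_sectors_kb of 1024 and cfq as the scheduler; the sysfs_root argument exists only so the function can be pointed at a fake tree.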
ext3:
- buffered writes and buffered O_SYNC writes, all with a 1MB block size,
  show 4KB I/Os passed down to the I/O scheduler
- buffered 1MB reads are a little better, typically in the 128KB-256KB
  range when they hit the I/O scheduler

ext4:
- buffered writes: 512KB I/Os show up at the elevator
- buffered O_SYNC writes: data is again 512KB, journal writes are 4KB
- buffered 1MB reads get down to the scheduler in 128KB chunks

xfs:
- buffered writes: 1MB I/Os show up at the elevator
- buffered O_SYNC writes: 1MB I/Os
- buffered 1MB reads: 128KB chunks show up at the I/O scheduler
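The runs above used dd. For reference, the three access patterns in each list (buffered writes, buffered O_SYNC writes, and buffered reads, all with a 1MB block size) look roughly like the sketch below. This is an approximation of what dd does, not the commands that were run; the function names and the file path are placeholders.

```python
import os

BLOCK_SIZE = 1024 * 1024  # 1MB, matching the block size used in the tests


def write_blocks(path, count, sync=False):
    """Write `count` 1MB blocks through the page cache; add O_SYNC if `sync`."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync:
        flags |= os.O_SYNC  # each write() returns only after the data is on storage
    fd = os.open(path, flags, 0o644)
    try:
        block = b"\0" * BLOCK_SIZE
        for _ in range(count):
            remaining = memoryview(block)
            while remaining:  # os.write may write fewer bytes than requested
                written = os.write(fd, remaining)
                remaining = remaining[written:]
    finally:
        os.close(fd)


def read_blocks(path):
    """Read the file back with 1MB buffered reads; return the total bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, BLOCK_SIZE)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Note that reads only reach the block layer when the data is not already in the page cache, so a read test straight after a write to the same file will not show up at the I/O scheduler at all.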
So, ext4 is doing better than ext3, but still not perfect. xfs is
kicking ass for writes, but reads are still split up.
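If you want to collect the same kind of numbers on your own setup, one way is to trace the device with blktrace and tally request sizes from the blkparse text output. The sketch below assumes the default blkparse line layout (device, cpu, sequence, time, pid, action, rwbs, sector + count, process) and counts queue ("Q") events unless told otherwise; which action best corresponds to "shows up at the elevator" is a judgment call, and the function name is made up.

```python
from collections import Counter

SECTOR_BYTES = 512  # blktrace reports request sizes in 512-byte sectors


def request_size_histogram(lines, action="Q"):
    """Count request sizes (in KB) for one action in blkparse text output."""
    sizes = Counter()
    for line in lines:
        fields = line.split()
        # Expected layout: dev cpu seq time pid action rwbs sector + nsectors [proc]
        if len(fields) < 10 or fields[5] != action or fields[8] != "+":
            continue  # summary lines and other event types are skipped
        try:
            nsectors = int(fields[9])
        except ValueError:
            continue
        # Integer division: sizes below 1KB are counted in the 0KB bucket.
        sizes[nsectors * SECTOR_BYTES // 1024] += 1
    return sizes
```

Fed the saved blkparse output for one run, this gives a count per request size in KB, so a workload that is being chopped into pages shows up as a pile of 4KB entries, while one that passes through intact shows 1024KB entries.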
Cheers,
Jeff