Re: [RFC][PATCH] Make io_submit non-blocking

On Tue, Jul 24, 2012 at 05:11:05PM +0530, Ankit Jain wrote:
> 
> Currently, io_submit tries to execute the io requests on the
> same thread, which could block because of various reaons (eg.
> allocation of disk blocks). So, essentially, io_submit ends
> up being a blocking call.
> 
> With this patch, io_submit prepares all the kiocbs and then
> adds (kicks) them to ctx->run_list in one go and then
> schedules the workqueue. The actual operations are not executed
> in io_submit's process context, so it can return very quickly.
> 
> This run_list is processed either on a workqueue or in response to
> an io_getevents call. This utilizes the existing retry infrastructure.
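
For reference, the described flow is roughly the following -- a sketch
against the old fs/aio.c retry machinery (kick_iocb(), ctx->run_list,
aio_queue_work()), with io_prep_one() as a hypothetical stand-in for
the per-iocb setup, not the actual patch:

	/* inside sys_io_submit(), after looking up the ctx */
	long i;

	for (i = 0; i < nr; i++) {
		struct kiocb *req;

		/* copy the iocb in from userspace, allocate + init a kiocb */
		req = io_prep_one(ctx, iocbpp + i);	/* hypothetical helper */
		if (IS_ERR(req))
			break;
		kick_iocb(req);		/* adds it to ctx->run_list */
	}
	aio_queue_work(ctx);		/* schedule aio_kick_handler */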
> 
> It uses override_creds/revert_creds to use the submitting process'
> credentials when processing the iocb request from the workqueue. This
> is required for proper support of quota and reserved block access.
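
The override_creds/revert_creds pattern referred to looks roughly like
this; ki_submit_cred is a hypothetical field the patch would need in
order to stash a reference to the submitter's credentials:

	static void aio_run_one_iocb(struct kiocb *iocb)
	{
		const struct cred *old_cred;

		/* act as the submitting task so quota and reserved-block
		 * checks see the right user */
		old_cred = override_creds(iocb->ki_submit_cred);
		/* ... actually issue the read/write here ... */
		revert_creds(old_cred);
	}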
> 
> Block plugging is currently done in io_submit, since that is
> where most of the IO was submitted. This patch moves it to
> aio_kick_handler and aio_run_all_iocbs, where the IO is now
> submitted.
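
In other words, the plug now brackets the workqueue-side submission
loop, along these lines -- standard blk_start_plug()/blk_finish_plug()
usage, sketched rather than quoted from the patch:

	static void aio_run_all_iocbs(struct kioctx *ctx)
	{
		struct blk_plug plug;

		blk_start_plug(&plug);
		/* ... drain ctx->run_list, running each queued iocb ... */
		blk_finish_plug(&plug);	/* flush the merged requests */
	}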
> 
> All the tests were run with ext4.
> 
> I tested the patch with fio
>  (fio rand-rw-disk.fio --max-jobs=2 --latency-log
>  --bandwidth-log)
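
The job file itself isn't included in the mail; judging by the results
below, a plausible 4k random read/write libaio job would look something
like this (an assumed reconstruction, not the poster's actual file):

	[global]
	ioengine=libaio
	direct=1
	bs=4k
	rw=randrw

	[rand-rw-disk]
	filename=/dev/sdX
	iodepth=32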
> 
> **Unpatched**
> read : io=102120KB, bw=618740 B/s, iops=151 , runt=169006msec
> slat (usec): min=275 , max=87560 , avg=6571.88, stdev=2799.57

Hmmm, I had to check the numbers twice - that's only 600KB/s.

Perhaps you need to test on something more than a single piece of
spinning rust. Optimising AIO for SSD rates (say 100k 4k write IOPS)
is probably more relevant to the majority of AIO users....
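
Something along these lines (illustrative command only, the device
name is a placeholder):

	fio --name=randwrite --filename=/dev/your-ssd --ioengine=libaio \
	    --direct=1 --rw=randwrite --bs=4k --iodepth=64 --numjobs=4 \
	    --runtime=60 --time_based

run against a device that can actually sustain ~100k IOPS would show
whether the submit-side change helps or hurts at those rates.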

> write: io=102680KB, bw=622133 B/s, iops=151 , runt=169006msec
> slat (usec): min=2 , max=196 , avg=24.66, stdev=20.35
> 
> **Patched**
> read : io=102864KB, bw=504885 B/s, iops=123 , runt=208627msec
> slat (usec): min=0 , max=120 , avg= 1.65, stdev= 3.46 
> 
> write: io=101936KB, bw=500330 B/s, iops=122 , runt=208627msec
> slat (usec): min=0 , max=131 , avg= 1.85, stdev= 3.27 

So you made ext4 20% slower at random 4k writes with worst case
latencies only improving by about 30%. That, I think, is a
non-starter....

Also, you added a memory allocation in the io submit code. Worst
case latency will still be effectively undefined - what happens to
latencies if you generate memory pressure while the test is running?
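
To make that concrete: a GFP_KERNEL allocation can enter direct
reclaim and sleep for an unbounded time, so the "non-blocking"
property evaporates exactly when the system is loaded. Avoiding that
needs something like the following (illustrative sketch only, the
structure is hypothetical):

	/* hypothetical per-submit batch state */
	batch = kmalloc(sizeof(*batch), GFP_NOWAIT);
	if (!batch)
		return -EAGAIN;	/* or fall back to the synchronous path */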

FWIW, if you are going to change generic code, you need to present
results for other filesystems as well (xfs, btrfs are typical), as
they may not have the same problems as ext4 or react the same way to
your change. The result might simply be "it is 20% slower"....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

