On 30 September 2013 22:47, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
> On Mon, Sep 30, 2013 at 10:30:59PM +0200, Aastha Mehta wrote:
>> On 30 September 2013 22:11, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
>> > On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
>> >> On 29 September 2013 15:12, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
>> >> > On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha Mehta wrote:
>> >> >> Thank you very much for the reply. That clarifies a lot of things.
>> >> >>
>> >> >> I was trying a small test case that opens a file, writes a block of
>> >> >> data, calls fsync and then closes the file. If I understand correctly,
>> >> >> fsync would return only after all in-memory buffers have been
>> >> >> committed to disk. I have added few print statements in the
>> >> >> __extent_writepage function, and I notice that the function gets
>> >> >> called a bit later after fsync returns. It seems that I am not
>> >> >> guaranteed to see the data going to disk by the time fsync returns.
>> >> >>
>> >> >> Am I doing something wrong, or am I looking at the wrong place for
>> >> >> disk write? This happens both with tree logging enabled as well as
>> >> >> with notreelog.
>> >> >>
>> >> >
>> >> > So 3.1 was a long time ago and to be sure it had issues I don't think it was
>> >> > _that_ broken. You are probably better off instrumenting a recent kernel, 3.11
>> >> > or just build btrfs-next from git. But if I were to make a guess I'd say that
>> >> > __extent_writepage was how both data and metadata was written out at the time (I
>> >> > don't think I changed it until 3.2 or something later) so what you are likely
>> >> > seeing is the normal transaction commit after the fsync. In the case of
>> >> > notreelog we are likely starting another transaction and you are seeing that
>> >> > commit (at the time the transaction kthread would start a transaction even if
>> >> > none had been started yet.) Thanks,
>> >> >
>> >> > Josef
>> >>
>> >> Is there any special handling for very small file write, less than 4K? As
>> >> I understand there is an optimization to inline the first extent in a file if
>> >> it is smaller than 4K, does it affect the writeback on fsync as well? I did
>> >> set the max_inline mount option to 0, but even then it seems there is
>> >> some difference in fsync behaviour for writing first extent of less than 4K
>> >> size and writing 4K or more.
>> >>
>> >
>> > Yeah if the file is an inline extent then it will be copied into the log
>> > directly and the log will be written out, no going through the data write path
>> > at all. Max inline == 0 should make it so we don't inline, so if it isn't
>> > honoring that then that may be a bug. Thanks,
>> >
>> > Josef
>>
>> I tried it on 3.12-rc2 release, and it seems there is a bug then.
>> Please find attached logs to confirm.
>> Also, probably on the older release.
>>
>
> Oooh ok I understand, you have your printk's in the wrong place ;).
> do_writepages doesn't necessarily mean you are writing something. If you want
> to see if stuff got written to the disk I'd put a printk at run_delalloc_range
> and have it spit out the range it is writing out since thats what we think is
> actually dirty. Thanks,
>
> Josef

No, but I also placed dump_stack() in the beginning of
__extent_writepage. run_delalloc_range is being called only from
__extent_writepage, if it were to be called, the dump_stack() at the
top of __extent_writepage would have printed as well, no?

Thanks

--
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@xxxxxxxxxxx
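
For reference, the user-space test case described earlier in the thread
(open a file, write a block of data, call fsync, close) could look
roughly like the sketch below. The actual test program was not posted,
so this is only an illustration: the function name write_and_fsync, the
file names, the fill byte and the temporary target directory are all
assumptions. To reproduce the behaviour discussed here, target_dir
would need to point at a btrfs mount, and the two sizes are chosen to
hit the "less than 4K" and "4K or more" cases.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; the thread compares writes below and at 4K


def write_and_fsync(path, nbytes=BLOCK_SIZE):
    """Open path, write nbytes of data, fsync, then close.

    Returns the number of bytes written. Hypothetical helper, not the
    test program used in the thread.
    """
    data = b"a" * nbytes
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < nbytes:
            written += os.write(fd, data[written:])
        # Ask the kernel to flush this file's data and metadata to disk.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    # Placeholder directory; replace with a path on a btrfs mount to reproduce.
    target_dir = tempfile.mkdtemp()
    for size in (1024, BLOCK_SIZE):
        path = os.path.join(target_dir, "fsync_test_%d" % size)
        print(size, write_and_fsync(path, size))
```

Running it once per size while watching the kernel log should make it
easier to line up any printk or dump_stack() output with the point at
which fsync returns for each case.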