Re: Blocked for more than 120 seconds

Hans-Kristian Bakke posted on Mon, 16 Dec 2013 01:06:36 +0100 as
excerpted:

> torrents are really only one thing my storage server gets hammered with.
> It also does a lot more IO-intensive stuff. I actually run enterprise
> storage drives in a Supermicro server for a reason; even if it is my
> home setup, consumer stuff just doesn't cut it with my storage abuse :)
> It runs KVM virtualisation (not on btrfs though) with several VMs,
> including Windows machines, does lots of manipulation of large files,
> offsite backups at 100 mbit/s for days on end, reencoding large amounts
> of audio files, runs lots of web sites, constantly streams blu-rays to
> at least one computer, and chews through enormous amounts of internet
> bandwidth constantly. Last week it consumed ~10TB of internet bandwidth
> alone. I was at about 140 mbit/s average throughput on a 100/100 link
> over a full 7-day week, peaking at 177 mbit/s average over 24 hours, and
> that is not counting the local gigabit traffic for all the video
> remuxing and stuff.
> In other words, all 19 storage drives in that server are driven really
> hard, and it is no wonder that this triggers some subtleties that normal
> users just don't hit.

Wow!  Indeed!

> But since torrenting is clearly the worst offender when it comes to
> fragmentation I can comment on that.
> Using btrfs with partitioning stops me from using the btrfs multidisk
> handling that I ideally need, so that is really not an option.

??  I'm not running near what you're running, but I *AM* running multiple 
independent multi-device btrfs filesystems (raid1 mode) on a single pair 
of partitioned 256 GB (238 GiB) SSDs, just as, pre-btrfs and pre-SSD, I 
ran multiple 4-way md/raid1 volumes on individual partitions on 
4-physical-spindle spinning rust.

Like md/raid, btrfs' multi-device support takes generic block devices.  
It doesn't care whether they're physical devices, partitions on physical 
devices, LVM2 volumes on physical devices, md/raid volumes on physical 
devices, partitions on md-raid on lvm2 on physical devices... you get the 
idea.  As long as you can mkfs.btrfs it, you can run multiple-device 
btrfs on it.
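As a concrete sketch of that (the device names here are hypothetical 
stand-ins, and mkfs.btrfs will of course destroy whatever is on them):

```shell
# Hypothetical layout: sda2 and sdb2 are matching partitions on two
# different disks.  mkfs.btrfs accepts any block devices, partitions
# included; -d and -m select the data and metadata profiles, here
# raid1 so both are mirrored across the two devices.
mkfs.btrfs -L media -d raid1 -m raid1 /dev/sda2 /dev/sdb2

# Mounting any one member device brings up the whole multi-device fs.
mount /dev/sda2 /mnt/media
btrfs filesystem show /mnt/media
```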

In fact, I have that pair of SSDs GPT partitioned up, with 11 independent 
btrfs, 9 of which are btrfs raid1 mode across similar partitions (one 
/var/log, plus working and primary backup for each of root, /home, gentoo 
distro packages tree with sources and binpkgs as well, and a 32-bit chroot 
that's an install image for my netbook) on each device, with the other 
two being /boot and its backup on the other device, my only two non-raid1-
mode btrfs.

So yes, you can definitely run btrfs multi-device on partition block 
devices instead of directly on the physical-device block devices, as I 
know quite well since my setup depends on that! =:^)

> I also
> think that if I were to use partitions (no multidisk), no COW and hence
> no checksumming, I might as well use ext4 which is more optimized for
> that usage scenario. Ideally I could use just a subvol with nodatacow
> and quota for this purpose, but per subvolume nodatacow is not available
> yet as far as I have understood (correct me if I'm wrong).

Well, that would follow if your base assumption, that you couldn't use 
btrfs multi-device on partitions, only on physical devices, were 
correct... but it's not.

Which means you /can/ partition if you like, and then use whatever 
filesystem on those partitions you want, combining multi-device btrfs on 
some of them, with ext4 on md/raid if you want multi-device support for 
it, since unlike btrfs, ext4 doesn't support multi-device natively.
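For the ext4-on-md side of such a mixed layout, a sketch (again with 
hypothetical partition names, and mdadm is destructive to its member 
devices):

```shell
# Build a 2-device md/raid1 from two partitions, then put ext4 on top,
# since ext4 has no native multi-device support of its own.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/vms
```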

You could even throw lvm2 in there, if you like, giving you additional 
sizing and deployment flexibility.  Before btrfs here, I actually used 
reiserfs on lvm2 on mdraid on physical devices, and it worked, but it 
was complex enough that I wasn't confident of my ability to manage it in 
a disaster-recovery scenario.  Also, lvm2 requires userspace and thus an 
initr* to handle root on lvm2, while root on mdraid can be handled 
directly from the kernel commandline with no initr* required, so I kept 
the mdraid and dropped lvm2.

[snipped further discussion along that invalid assumption line]

> I have, until btrfs, normally just made one large array of all storage
> drives matching in performance characteristics, thinking that all the
> data can benefit from the extra IO-performance of the array. This has
> been a good compromise for a limited-budget home setup where ideal
> storage tiering with SSD hybrid SANs and such is not an option. But as I
> am now experiencing with btrfs, COW kind of changes the rules in a
> profound, noticeable, all-the-time way. With COW's inherent
> random-write-to-large-file fragmentation penalty I think there is no
> other way than to separate the different workloads into separate storage
> pools going to different hardware. In my case it would probably mean
> having one storage pool for general storage, one for VMs and one for
> torrenting, as all of those react in their own way to COW and will get
> heavily affected by the other workloads in the worst case if run from
> the same drives with COW.

Luckily, the partitioning thing does work.  Additionally, as mentioned 
you can set NOCOW on directories and have new files in them inherit 
that.  So you have quite a bit more flexibility than you might have 
thought.  Tho of course it's your system and you may well prefer 
administering whole physical devices to dealing with partitions, just as 
I decided lvm2 wasn't appropriate for me, altho many people use it for 
everything.
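The directory-level NOCOW trick can be set up with chattr; the path here 
is hypothetical, it has to be on a mounted btrfs, and the attribute only 
affects files created after it is set:

```shell
mkdir -p /srv/torrents/incoming      # hypothetical download directory
chattr +C /srv/torrents/incoming     # new files created here inherit NOCOW
lsattr -d /srv/torrents/incoming     # the 'C' flag should now be listed
```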

> Your system of a "cache" is actually already implemented logically in my
> setup, in the form of a post-processing script that rtorrent runs on
> completion. It moves completed files into dedicated per-tracker seeding
> folders, and then makes a copy (using cp --reflink=auto on btrfs) of the
> file, processes it if needed (tag clean-up, reencoding, decompressing
> or what not), and then moves it to another "finished" folder. This makes
> it easy to know what the new stuff is, and I can manipulate, rename and
> clean up all the data without messing up the seeds.
> 
> I think that the "finished" folder could still be located on the RAID10
> btrfs volume with COW, as I can use an internal move into the organized
> archive when I am actually sitting at the computer instead of a drive to
> drive copy via the network.

That makes sense.
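The reflink copy step described above can be sketched like this (file 
names are hypothetical; on btrfs the copy shares extents until either 
side is modified, while on other filesystems --reflink=auto quietly 
falls back to an ordinary data copy):

```shell
mkdir -p seeding finished
echo "completed payload" > seeding/file.mkv   # stand-in for a finished torrent
# On btrfs this is a near-instant extent-sharing clone; COW then keeps
# later edits to finished/file.mkv from touching the seeded copy.
cp --reflink=auto seeding/file.mkv finished/file.mkv
cmp seeding/file.mkv finished/file.mkv && echo "identical"
```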

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



