[ogfs-dev] multi-node performance numbers

Here are some performance numbers using the latest CVS (no-pool) code
and external journals, running on a 3-node fibre channel
configuration.  In all tests, node 0 was running memexpd, and all 3 nodes
had the filesystem mounted.  Each node had its own subdirectory in the
filesystem.  The pool driver was used for consistent names and for

There were 4 disks used.  One contained the filesystem (pool0), one
contained the lock tables and cluster information (pool0cidev), and
the other two contained the external journals (journal[0-5]).

/dev/sda        < none >
/dev/sda1       journal0    0     0      0  ogfs_journal      514048
/dev/sda2       journal2    0     0      0  ogfs_journal      514064
/dev/sda3       journal4    0     0      0  ogfs_journal      514064
/dev/sda4       < none >
/dev/sdb        < none >
/dev/sdb1       journal1    0     0      0  ogfs_journal      514048
/dev/sdb2       journal3    0     0      0  ogfs_journal      514064
/dev/sdb3       journal5    0     0      0  ogfs_journal      514064
/dev/sdb4       < none >
/dev/sdc        < none >
/dev/sdc1          pool0    0     0      0  ogfs_data       16064976
/dev/sdc2       < none >
/dev/sdc3       < none >
/dev/sdd        < none >
/dev/sdd1     pool0cidev    0     0      0  ogfs_data         160624
/dev/sdd2       < none >
/dev/sdd3       < none >

The benchmark is simple and has three parts as follows:

	untar:	unpack a tar.gz of the linux kernel.  This part does
		lots of inode creates and file writes.  It is measured
		as the time to unpack a linux kernel tar.gz plus the
		time for a sync to complete.

	tar:	tar the new tree from the target filesystem to /dev/null.
		This part does lots of reads, some cached.  It is the
		time to tar the entire tree to /dev/null plus the time
		for a sync to complete.

	rm:	remove all the created files.  This part does lots of
		metadata updates.  It is the time to perform 'rm -rf' of
		the tree plus the time for a sync to complete.
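The three parts above can be sketched as a small timing harness.  This is
only an illustration, not the script used for these runs: it assumes
Python's tarfile and shutil modules as stand-ins for the tar and rm
commands, and the names timed, run_benchmark, tarball and workdir are
hypothetical.

```python
import os
import shutil
import tarfile
import time


def timed(step):
    """Run step(), then sync, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    step()
    os.sync()  # each measurement includes the time for a sync to complete
    return time.perf_counter() - start


def run_benchmark(tarball, workdir):
    """Return {'untar': s, 'tar': s, 'rm': s} for one pass of the benchmark."""
    tree = os.path.join(workdir, "tree")  # hypothetical target directory

    def untar():
        # Lots of inode creates and file writes.
        with tarfile.open(tarball, "r:gz") as archive:
            archive.extractall(tree)

    def tar():
        # Lots of reads, some cached; the archive is discarded.
        with open(os.devnull, "wb") as sink:
            with tarfile.open(fileobj=sink, mode="w|") as archive:
                archive.add(tree, arcname="tree")

    def rm():
        # Lots of metadata updates.
        shutil.rmtree(tree)

    return {"untar": timed(untar), "tar": timed(tar), "rm": timed(rm)}
```

Calling run_benchmark("linux.tar.gz", "/ogfs/node0") on each node (both
paths are placeholders) would give one row of numbers per node.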

All measurement results are real (wall-clock) time in seconds.  All delta
results are a simple division of the measured time by the baseline time,
where 1.0x is equal and > 1x took longer.
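As a worked example of the delta column, here is the division applied to
the Test 1 figures reported in this message, with node0 as the baseline.
The function name delta is just an illustrative choice.

```python
def delta(measured, baseline):
    """Ratio of a measured time to its baseline; > 1.0 means it took longer."""
    return measured / baseline


# Test 1 times in seconds: node0 is the baseline, node1 is the measured run.
node0 = {"untar": 84.2, "tar": 5.6, "rm": 52.4}
node1 = {"untar": 88.3, "tar": 7.5, "rm": 49.7}

for part in ("untar", "tar", "rm"):
    print(f"{part}\t{delta(node1[part], node0[part]):.2f}x")
# prints 1.05x, 1.34x and 0.95x for untar, tar and rm
```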

Test 1:	compare times with memexpd on local node and remote node
	(Note: memexpd on node0)

		node0	node1	delta
		-----	-----	-----
	untar	84.2	88.3	1.05x
	tar	 5.6	 7.5	1.34x
	rm	52.4	49.7	0.95x

	Conclusion: The effect of a remote lock server is most noticeable
	on the reads (tar): because most of the data was cached, lock
	traffic makes up a larger share of the time.  The writes dominate
	untar, so local vs. remote lock server has only a small impact
	there.  Curious that removes with a remote lock server are
	slightly faster.

Test 2:	run benchmark on two machines simultaneously and compare with
	one-machine results.  Each node was given a separate directory
	under the filesystem root, as /ogfs/node0, /ogfs/node1, etc.
	(Note: one disk for filesystem, one disk each for journals.)
	(Note: memexpd on node0)

		node0	delta		node1	delta
		-----	-----		------	-----
	untar	170.6	2.03x		111.1	1.26x
	tar	  6.1	1.09x		10.3	1.37x
	rm	 56.5	1.08x		53.9	1.08x

	Conclusion: There is metadata contention, but no journal
	contention for this test.  The untar on node0 may have
	been affected by memexpd traffic.  There is some scaling
	across machines, twice the work in less than twice the

Test 3:	run two copies of the benchmark on the same machine and
	compare with two-machine results.
	(Note: separate subdirectory for each copy /ogfs/{copy1,copy2})

		copy1	delta		copy2	delta
		-----	-----		-----	-----
	untar	263.3	 1.54x		250.1	 1.47x
	tar	123.4	20.23x		121.1	19.86x
	rm	282.2	 5.00x		281.5	 4.98x

	Conclusion: The results here are different than expected.
	I thought that the Test 2 slowdown could be attributed to the
	extra seeking on the filesystem disk or to the metadata
	contention.  There is metadata contention as in Test 2, but
	there is also journal contention which may account for the
	horrible performance.  The good news is that Test 2 shows a
	benefit for using multiple nodes as compared to Test 3!

Test 4:	run benchmark on three machines simultaneously and compare
	with one-machine results.

	This test did not complete in five attempts, due to lack of
	stability with memexpd.

Test 5:	compare one-machine ogfs (node0 Test 1) results with ext3
	(Note: ext3 was also configured with an external journal.)
	(Note: the faster column differs from the previous deltas:
	 here > 1x means ext3 was that many times faster than ogfs.)

		ext3	faster
		-----	------
	untar	33.3	 2.5x
	tar	 1.2	 4.6x
	rm	 1.1	48.5x

	Conclusion: OpenGFS is considerably slower than ext3 for single
	node operation.  This is not news, but interesting.

-- Joe DiMartino <joe@osdl.org>

Opengfs-devel mailing list