dbench(1): Corega RAID10 via eSATA

I tried running dbench(1), a disk I/O benchmark tool (which apparently was originally aimed at NFS and the like).
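
For reference, the invocation is just the target directory via -D plus the number of simulated clients as the last argument. A minimal sketch (the -t time-limit flag is from dbench 4.0's option list; /mnt/target is a placeholder):

$ dbench -D /mnt/target 4         # 4 clients, default 600-second run
$ dbench -D /mnt/target -t 60 4   # same, but stop after 60 seconds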

The conclusion is the thoroughly boring "Ext4 is (for everyday use, anyway) so good it's funny"...

Hardware details are at id:flalin:20100717:1279360949.

$ uname -a
Linux zbox 2.6.38-2-amd64 #1 SMP Sun May 8 13:51:57 UTC 2011 x86_64 GNU/Linux
$ sed -e '/^$/,$d' < /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 28
model name      : Intel(R) Atom(TM) CPU D510   @ 1.66GHz
stepping        : 10
cpu MHz         : 1662.565
cache size      : 512 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl tm2 ssse3 cx16 xtpr pdcm movbe lahf_lm dts
bogomips        : 3325.13
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
$ free -m
             total       used       free     shared    buffers     cached
Mem:          3967       3376        590          0        637       1680
-/+ buffers/cache:       1058       2909
Swap:            0          0          0

Btrfs

$ LANG=C aptitude show btrfs-tools|grep Version
Version: 0.19+20100601-3
$ cd /mnt/btrfs
$ dbench -D . 4
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 120 secs
3 of 4 processes prepared for launch   0 sec
4 of 4 processes prepared for launch   0 sec
releasing clients
   4      3291   142.68 MB/sec  warmup   1 sec  latency 149.822 ms
   4      7950   120.71 MB/sec  warmup   2 sec  latency 26.372 ms
(snip)
   4   1388060    39.73 MB/sec  execute 598 sec  latency 475.540 ms
   4   1389106    39.70 MB/sec  execute 599 sec  latency 423.655 ms
   4  cleanup 600 sec
   0  cleanup 600 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX     759475     0.086    26.728
 Close         557949     0.007    10.547
 Rename         32144     0.765   638.607
 Unlink        153315     0.293   327.566
 Deltree           24    19.819    49.056
 Mkdir             12     0.014     0.032
 Qpathinfo     688254     0.042    16.386
 Qfileinfo     120778     0.005     4.781
 Qfsinfo       126187     0.027    12.179
 Sfileinfo      61836     0.061     9.810
 Find          266126     0.160    18.310
 WriteX        379503     0.152  2862.354
 ReadX        1190095     0.015    22.645
 LockX           2472     0.011     2.559
 UnlockX         2472     0.007     0.272
 Flush          53235    38.303  1108.550

Throughput 39.7034 MB/sec  4 clients  4 procs  max_latency=2862.366 ms

What's with the crazy speeds right at the start: cache? Something to do with extents? Or did the "real application access pattern" dbench emulates just happen to include something favorable? (Compare the runs and you'll see the "Count" column differs every time.)
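
If I wanted to rule the page cache in or out, dropping it right before the run should flatten that initial spike. A sketch using the standard kernel knob (nothing dbench-specific; needs root):

# sync                                # write out dirty pages first
# echo 3 > /proc/sys/vm/drop_caches   # drop page cache, dentries and inodes
# dbench -D /mnt/btrfs 4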

Nilfs2

$ LANG=C aptitude show nilfs-tools|grep Version
Version: 2.0.23-1
$ cd /mnt/nilfs
$ dbench -D . 4
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 120 secs
3 of 4 processes prepared for launch   0 sec
4 of 4 processes prepared for launch   0 sec
releasing clients
   4       182    29.22 MB/sec  warmup   1 sec  latency 517.189 ms
   4       411    29.19 MB/sec  warmup   2 sec  latency 404.415 ms
(snip)
   4   1494449    44.95 MB/sec  execute 598 sec  latency 43.430 ms
   4   1496041    44.92 MB/sec  execute 599 sec  latency 155.153 ms
   4  cleanup 600 sec
   0  cleanup 600 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX     857431     0.449   298.429
 Close         629970     0.007    12.059
 Rename         36311     1.256   312.670
 Unlink        173093     1.614   312.721
 Deltree           24   150.773   464.184
 Mkdir             12     0.013     0.019
 Qpathinfo     777237     0.023    12.085
 Qfileinfo     136326     0.005     2.268
 Qfsinfo       142503     0.013     4.175
 Sfileinfo      69873     0.880   287.039
 Find          300546     0.061    12.115
 WriteX        428040     1.413   321.132
 ReadX        1344349     0.016    16.566
 LockX           2794     0.011     1.127
 UnlockX         2794     0.009     1.754
 Flush          60114    14.604   320.315

Throughput 44.9204 MB/sec  4 clients  4 procs  max_latency=321.147 ms

Huh, it actually beats Btrfs by a good 10%, even though in the "AvgLat" column it only looks better on "Qpathinfo", "Qfsinfo", and "Find". (Perhaps the log-structured design, which turns writes sequential, helps the throughput figure?)

Ext4 + dir_index,extent,filetype,uninit_bg

# dumpe2fs -h /dev/sdb2
dumpe2fs 1.41.12 (17-May-2010)
Filesystem volume name:   Corega_Ext4
Last mounted on:          /usr/local
Filesystem UUID:          3c730b0a-a7d1-4680-a258-ebe9d129047d
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              32776192
Block count:              131074334
Reserved block count:     1310743
Free blocks:              128934491
Free inodes:              32774652
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      992
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Sat May 28 00:11:06 2011
Last mount time:          Sat May 28 12:17:03 2011
Last write time:          Sat May 28 13:05:56 2011
Mount count:              4
Maximum mount count:      27
Last checked:             Sat May 28 00:11:06 2011
Check interval:           15552000 (6 months)
Next check after:         Thu Nov 24 00:11:06 2011
Lifetime writes:          8 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      808ebdf7-59ca-4d42-8b57-e4af025aff6c
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x0001122e
Journal start:            27414

$ cd /usr/local/bench
$ dbench -D . 4
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 120 secs
3 of 4 processes prepared for launch   0 sec
4 of 4 processes prepared for launch   0 sec
releasing clients
   4      7644   237.07 MB/sec  warmup   1 sec  latency 31.833 ms
   4     15036   196.46 MB/sec  warmup   2 sec  latency 48.780 ms
(snip)
   4   2920057    85.57 MB/sec  execute 598 sec  latency 386.633 ms
   4   2923755    85.56 MB/sec  execute 599 sec  latency 191.606 ms
   4  cleanup 600 sec
   0  cleanup 600 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    1635590     0.175   298.657
 Close        1201445     0.008    10.411
 Rename         69259     0.647   447.047
 Unlink        330299     0.455   579.998
 Deltree           40    24.523   148.663
 Mkdir             20     0.013     0.017
 Qpathinfo    1482507     0.029    16.158
 Qfileinfo     259747     0.005    10.889
 Qfsinfo       271824     0.015    11.699
 Sfileinfo     133263     0.741   497.927
 Find          573166     0.097    17.267
 WriteX        815223     0.292   579.552
 ReadX        2563537     0.016    18.196
 LockX           5324     0.011     0.787
 UnlockX         5324     0.008     0.326
 Flush         114647    11.134  1653.168

Throughput 85.5629 MB/sec  4 clients  4 procs  max_latency=1653.190 ms

Performance that says "fine, I'll just use this...". What even is this. With the countless options, I can't tell which ones are on by default. Should I just create one with no options and run dumpe2fs(8) on it? And the defaults shift as versions evolve, too.
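
For what it's worth, two ways I know of to check without committing to a filesystem: the distribution defaults live in /etc/mke2fs.conf, and mke2fs has a dry-run flag. A sketch (reusing /dev/sdb2 from above):

$ cat /etc/mke2fs.conf          # [defaults] / [fs_types] hold the default feature sets
# mke2fs -n -t ext4 /dev/sdb2   # -n: print what would be done without writing anything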

ZFS-FUSE

# zfs list
NAME          USED  AVAIL  REFER  MOUNTPOINT
nkPool        141K  98.4G    24K  /mnt/zfs
nkPool/zfs0    21K  98.4G    21K  /mnt/zfs/zfs0
nkPool/zfs1    21K  98.4G    21K  /mnt/zfs/zfs1
$ LANG=C aptitude show zfs-fuse|grep Version
Version: 0.6.9-1+b1
$ cd /mnt/zfs/zfs0
$ dbench -D . 4
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 120 secs
3 of 4 processes prepared for launch   0 sec
4 of 4 processes prepared for launch   0 sec
releasing clients
   4       435    61.09 MB/sec  warmup   1 sec  latency 574.269 ms
   4       664    42.31 MB/sec  warmup   2 sec  latency 850.737 ms
(snip)
   4    732311    22.70 MB/sec  execute 598 sec  latency 340.347 ms
   4    732911    22.69 MB/sec  execute 599 sec  latency 293.503 ms
   4  cleanup 600 sec
   0  cleanup 600 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX     434247     0.677   992.711
 Close         318964     0.024     6.158
 Rename         18399     1.303    10.969
 Unlink         87708     0.413    15.392
 Deltree            8    54.977   115.550
 Mkdir              4     0.030     0.038
 Qpathinfo     393680     0.382    17.070
 Qfileinfo      68903     0.091     5.801
 Qfsinfo        72191     0.150    12.025
 Sfileinfo      35399     0.393     8.233
 Find          152189     1.653    30.637
 WriteX        216014     0.472  1002.589
 ReadX         681049     0.023    10.055
 LockX           1416     0.015     0.588
 UnlockX         1416     0.009     1.435
 Flush          30447    46.959  1304.704

Throughput 22.6866 MB/sec  4 clients  4 procs  max_latency=1304.724 ms

And that's a wrap, ZFS-FUSE. Still, its snapshot machinery is superb, so I'd love to put it to work somewhere like a CPU-bound CI server.
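
For the record, the snapshot workflow I have in mind is this cheap (standard zfs(8) subcommands; the snapshot name is made up):

# zfs snapshot nkPool/zfs0@before-build   # copy-on-write, near-instant
# zfs list -t snapshot                    # costs almost no space until data diverges
# zfs rollback nkPool/zfs0@before-build   # discard everything written since the snapshot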

Bonus: Ext3 on Samsung SSD

# dumpe2fs -h /dev/sda6
dumpe2fs 1.41.12 (17-May-2010)
Filesystem volume name:   zboxHome
Last mounted on:          <not available>
Filesystem UUID:          a3b5e1c8-df04-46cb-90ca-3890ccf22575
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              3334144
Block count:              13335552
Reserved block count:     666777
Free blocks:              11583306
Free inodes:              3265927
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1020
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Filesystem created:       Mon Jul 19 17:38:53 2010
Last mount time:          Sat May 28 12:17:03 2011
Last write time:          Sat May 28 12:17:03 2011
Mount count:              27
Maximum mount count:      35
Last checked:             Sun Jan 16 12:53:14 2011
Check interval:           15552000 (6 months)
Next check after:         Fri Jul 15 12:53:14 2011
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       548942
Default directory hash:   half_md4
Directory Hash Seed:      18cef424-b9d3-424e-b754-38f301969f49
Journal backup:           inode blocks
Journal features:         journal_incompat_revoke
Journal size:             128M
Journal length:           32768
Journal sequence:         0x000bb7eb
Journal start:            12983

$ cd ~
$ dbench -D . 4
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load '/usr/share/dbench/client.txt' and minimum warmup 120 secs
3 of 4 processes prepared for launch   0 sec
4 of 4 processes prepared for launch   0 sec
releasing clients
   4       664    84.63 MB/sec  warmup   1 sec  latency 707.158 ms
   4      2345    55.78 MB/sec  warmup   2 sec  latency 1333.045 ms
(snip)
   4   2388598    70.74 MB/sec  execute 598 sec  latency 70.759 ms
   4   2391996    70.74 MB/sec  execute 599 sec  latency 110.458 ms
   4  cleanup 600 sec
   0  cleanup 600 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    1349973     0.057   254.163
 Close         991662     0.007    11.557
 Rename         57160     0.137    55.222
 Unlink        272605     0.361   254.454
 Deltree           40    15.689    89.557
 Mkdir             20     0.010     0.014
 Qpathinfo    1223459     0.027    14.627
 Qfileinfo     214589     0.004     5.709
 Qfsinfo       224372     0.013    16.075
 Sfileinfo     109952     0.069    82.219
 Find          473072     0.089    17.703
 WriteX        674105     0.334   206.080
 ReadX        2116045     0.015    16.855
 LockX           4396     0.010     1.992
 UnlockX         4396     0.008     2.586
 Flush          94605    18.488   759.176

Throughput 70.7404 MB/sec  4 clients  4 procs  max_latency=759.203 ms

Running an I/O benchmark on an SSD is seriously bad for the heart... even if they do say SLC can take it.
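
At least the wear is observable: SMART exposes the vendor attributes. A sketch with smartctl(8) (attribute names vary by vendor, so the grep pattern is a guess):

# smartctl -A /dev/sda | grep -i wear   # e.g. a wear-leveling count on many SSDs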

Miscellaneous notes

Still, how good is maximum latency as a statistic, really...? I want the 5th, 50th (median), and 95th percentiles. Percentiles are space-inefficient to compute, so everybody finds them a pain and skips them...
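
For a one-off analysis the space cost hardly matters, though. Assuming a hypothetical latencies.txt with one latency sample per line, exact percentiles are a sort away:

$ sort -n latencies.txt | awk '{ v[NR] = $1 } END {
    print "p05:", v[int(NR*0.05)+1],
          "p50:", v[int(NR*0.50)+1],
          "p95:", v[int(NR*0.95)+1] }'
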
Seen everywhere except Nilfs2: the rate from the early part of the run falls by several tens of percent by the end. What is that? A characteristic of the HDDs?
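
If it is the platters' zoned recording (outer tracks transfer faster), raw sequential reads near the start vs. the far end of the disk should show it. A quick probe (read-only, so safe on a mounted disk; the skip offset assumes a device of roughly 500 GB):

# dd if=/dev/sdb of=/dev/null bs=1M count=512                # outer tracks
# dd if=/dev/sdb of=/dev/null bs=1M count=512 skip=450000    # inner tracks
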
Also, a while back (when everyone was saying "Ext3 (before dir_index landed in the 2.6 kernel) is useless"), XFS, JFS, and ReiserFS had their moment; I won't go so far as to call them dead, but I don't see much of an advantage anymore, so whatever.
Maybe I'll give Bonnie++ a spin too.
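
Presumably something like this (real bonnie++ flags, but the parameters are guesses; -s should be at least twice RAM, hence 8 GiB for this 4 GB box):

$ bonnie++ -d /mnt/btrfs -s 8192 -n 128   # -s size in MiB, -n small-file count in units of 1024
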
The above results are specific to my box, so... if you want more general numbers, check the benchmark sites. For example: