Speed up RAID5/RAID6 write speed

I'm currently upgrading from Ubuntu 9.04 to Ubuntu 12.04, which is a long process because each upgrade advances only one release at a time. This is the price you pay for skipping all the updates of the last few years. It's even slower since I moved my root filesystem to a RAID6.

The update step currently running, to Ubuntu 11.10 Oneiric, is even slower than the previous ones. Running strace on the processes showed a write speed of about 4 kB per second (I didn't measure it, but it writes 4 kB blocks and it looks like one block per second).

But there's a little trick to speed up the RAID write speed - at the cost of memory: The magic file /sys/block/md0/md/stripe_cache_size contains the size of the stripe cache, usually 256 (which is the number of memory pages per disk used for caching). I increased this value to the maximum of 32768 - and my RAID write speed went up a lot.

But this speedup costs memory. My 4-disk RAID6 on a 64-bit Ubuntu with 32768 pages per disk eats up 4 (disks) * 4 kB (page size) * 32768 = 512 MB of memory. I have plenty of it (4 GB in a server which ran on 1 GB for years without problems), so I don't mind reserving that much for the RAID. Setting the stripe cache size too high, however, would simply eat up all of your server's memory and kill your server.
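To check the memory cost before touching anything, the formula above can be wrapped in a small shell function. This is only a sketch: stripe_cache_mib is a made-up helper name, and the 4 kB page size is an assumed default that can be overridden with a third argument.

```shell
#!/bin/sh
# Estimate the stripe cache memory in MiB: disks * page size * stripe_cache_size.
# stripe_cache_mib is a hypothetical helper, not part of mdadm or the kernel.
stripe_cache_mib() {
    disks=$1
    entries=$2
    page_size=${3:-4096}    # bytes per page; assumed to be 4 kB unless given
    echo $(( disks * page_size * entries / 1024 / 1024 ))
}

stripe_cache_mib 4 32768    # the 4-disk RAID6 at the maximum: prints 512
stripe_cache_mib 3 8192     # a 3-disk RAID5 at 8192: prints 96
```

If the page size on your machine is not 4 kB, pass the value reported by getconf PAGESIZE as the third argument.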

1. Get the RAID device

$ cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid1] [linear] [multipath] [raid0] [raid10]
md125 : active raid1 sde1[1] sdd1[2] sda1[0]
1959808 blocks [3/3] [UUU]

md126 : active raid5 sda3[0] sdc3[2] sdb3[1]
3982384768 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md127 : active raid6 sde2[3] sda2[0] sdc2[2] sdb2[1]
308656640 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>

The RAID device name is md followed by a number, and the RAID level is shown two words after it (right after "active"). My md126 is a RAID5 while md127 is a RAID6. The stripe_cache_size file only exists for RAID5 and RAID6, not for the RAID1 known as md125 on my server.
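If you don't want to pick the arrays out by eye, something like the following should do it. It is a rough sketch that assumes the /proc/mdstat layout shown above (device name, a colon, then the status and the level); raid56_devices is a made-up function name, and the optional file argument is only there so it can be tried on a saved copy of the output.

```shell
#!/bin/sh
# Print the names of all RAID5/RAID6 arrays listed in an mdstat file.
# raid56_devices is a hypothetical helper; it reads /proc/mdstat by default.
raid56_devices() {
    # Device lines look like "md126 : active raid5 sda3[0] ...".
    # The "Personalities" line is skipped because it does not start with "md".
    awk '$1 ~ /^md/ && $2 == ":" && / raid[56] / { print $1 }' "${1:-/proc/mdstat}"
}

# Only query the live system if the file is actually there.
if [ -r /proc/mdstat ]; then
    raid56_devices
fi
```

On my server this would list md126 and md127, but not the RAID1 md125.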

2. Look at the current value

$ cat /sys/block/md126/md/stripe_cache_size

My RAID5 currently has a stripe cache size of 256 pages, which is the default.

3. Increase the value

$ echo 1024 | sudo tee /sys/block/md126/md/stripe_cache_size

That's all. You may repeat step 2 to see if the new value is being used by the kernel.

Notice: This is not a permanent setting. It has to be set after each reboot.

4. Make it permanent

If your system is still running and you still have enough memory left (and only if both conditions apply!), add a startup file to set the stripe cache size automatically on each reboot.

I created a new executable script in /etc/rc2.d (which is a quick-and-dirty patch, there are probably many better ways to do this):

#!/bin/sh
# Raise the stripe cache size of both RAID5/6 arrays (runs as root at boot)
echo "Setting RAID5/6 stripe cache size..."
echo 8192 >/sys/block/md126/md/stripe_cache_size
echo 8192 >/sys/block/md127/md/stripe_cache_size

The script will be run as root and doesn't need the sudo tee workaround. It will increase the stripe cache size for both of my RAIDs but may still fail if the system decides to reassign the RAID device numbers.
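One way around the renumbering problem might be to skip the hard-coded names and loop over every md device that actually has a stripe_cache_size file. The following is a sketch of that idea, not what I'm running: set_stripe_cache is a made-up name, and the optional second argument (the sysfs root) exists only so the function can be tried on a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Set the stripe cache size on every array that exposes the sysfs file.
# set_stripe_cache is a hypothetical helper; RAID1 and other levels are
# skipped automatically because they have no stripe_cache_size file.
set_stripe_cache() {
    size=$1
    sysfs=${2:-/sys}
    for f in "$sysfs"/block/md*/md/stripe_cache_size; do
        [ -w "$f" ] || continue    # glob matched nothing, or file not writable
        echo "$size" > "$f"
        echo "$f = $(cat "$f")"    # read the value back as a quick check
    done
}
```

Calling set_stripe_cache 8192 as root from the startup script would then apply the same value to all RAID5/6 arrays, whatever numbers they were given. Keep in mind that the memory cost adds up over all of them.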
