
I have an HPE ProLiant DL360 Gen9 server; its relevant specs are:

  • CPU: 2× Intel Xeon E5-2687W v3 @ 3.10GHz, 25MB L3 cache, 10 cores each
  • RAM: 8x 32GB PC4-17000 DDR4 2133MHz CAS-15 1.2V SDRAM DIMM (256 GB total)

(full server specs here)

The server is running CentOS 7.2 with kernel 3.10.0-327.36.3.el7.x86_64.

I mounted a tmpfs ramdisk on the server using the following entry in /etc/fstab:

tmpfs  /ramdisk  tmpfs  noauto,user  0 0
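
With noauto set, the ramdisk is not mounted at boot, so I mount it by hand; that step looks something like the following (shown for completeness, not copied verbatim from my session):

mount /ramdisk
df -h /ramdisk   # confirm it is mounted and check its size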

To test writing to this ramdisk, I then ran the following command:

time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=4k count=30000000 && sync"

It reported that it wrote 122,880,000,000 bytes in 58.857s, which works out to a write speed of about 1991 MiB/sec.

Considering that the rated write speed of this memory is 17GB/sec (according to this description of memory data rates), I am surprised by the considerably lower rate when writing to my tmpfs ramdisk. Can anyone explain the disparity, and suggest a faster way to write to a file in memory?

Thanks.

UPDATE

I disabled vm.swappiness, but that yielded no benefit (1712 MiB/sec).
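
For reference, the swappiness change was along the lines of the following (the exact command is assumed rather than copied from my history):

sysctl -w vm.swappiness=0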

I tried increasing the block size as well (bs=256k count=468750), but again, it had little effect (2087 MiB/sec).
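
The larger-block run was of the form (reconstructed from the parameters above, same test file as before):

time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=256k count=468750 && sync"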


1 Answer


There's more going on than just putting data in RAM when you're using an in-memory filesystem. You still have to handle the data structures associated with the file, including tracking where in memory all the allocations for it are. Writing this information takes time too (in particular, for the testing you're doing, your file size is being updated on every write, which immediately doubles the number of places data is changing in memory).

Also, allocating memory is extremely slow. In fact, it's one of the slowest things you can do on most systems that doesn't involve I/O, with the only significantly slower things being creating a new thread or process. Tools like ramspeed pre-allocate all the memory they will use right when they start up, so they can test the actual memory performance. In comparison, tmpfs has no idea how big a file you are going to create, so it has to allocate everything on demand, and does so in chunks no bigger than the dd block size (I think it caps out at 64k, but I'm not sure). Because of this, every block carries the overhead of allocating the memory to store it in.
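
One rough way to separate the allocation cost from the raw copy cost is to pre-allocate the whole file once and then overwrite it in place, so dd never asks tmpfs for new pages while the clock is running. This is just a sketch (it assumes fallocate works on your tmpfs mount and reuses the paths from your question), not something I've measured on your hardware:

# Allocate all 120 GB of tmpfs pages up front, outside the timed run.
fallocate -l 120G /ramdisk/120GB_testfile

# conv=notrunc stops dd from truncating the pre-allocated file back to zero,
# so every write lands in a page that already exists and the file size never changes.
time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=256k count=468750 conv=notrunc && sync"

If that run is noticeably faster, the difference is roughly the per-block allocation and bookkeeping overhead described above.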