Make Your Steem Server Last Longer With Memory Compression

On 13 January 2018, the RPC node at steemd.privex.io went down because the Steem daemon (steemd) exhausted the 256GB of RAM on that node. The Graphene in-memory database just kept getting bigger.

To get their RPC node back online, @someguy123 wanted to upgrade to a server with 512GB of RAM, but their provider told them that it would take 10 business days, and that's not including the time it would take to get the node set up!

My team (@dutch and I) suggested a fix to @someguy123: RAM compression with zram.

The fix worked: despite having exhausted its 256GB of RAM, the @privex RPC node came back up after less than 34 hours of downtime.

Unfortunately, the only way to prevent this ever-inflating memory usage from becoming unmaintainable would be to overhaul how Graphene accesses what it needs from the blockchain. There are already mitigations in place, like the LOW_MEMORY_NODE compile-time option or disabling unneeded plugins, but as the blockchain grows, so will memory usage.

I've got some good news, though:

  • zram will delay the inevitable. Our testing so far has shown that the life of a Steem witness node with a fixed amount of RAM can be extended by months (or even longer; it's hard to gauge for sure) with zram.
  • It's easy to set up zram.
  • If you're running Ubuntu 14.04 or newer, it's even easier to set up zram.
  • If you're on Debian 7 or newer, you can also use the Ubuntu instructions with some extra steps.

We are now recommending the use of zram as a new best practice for all new and existing Steem nodes.


Witness @gridcoin.science Uses zram

Our witness @gridcoin.science, intentionally configured with just 16GiB of RAM, is currently making use of zram:

https://i.imgur.com/DgzzAPs.png

Above, you can see that the steemd memory-mapped file is 21GiB in size, but zram has compressed some of it:

https://i.imgur.com/OR8iNrD.png

Thanks to zram, we're able to run a witness below the commonly accepted minimum RAM requirement.
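
If you want to check how well zram is compressing on your own node, one option is to compare the DATA and COMPR columns that zramctl reports. Here's a rough sketch of that calculation. The function name zram_ratio is made up for this post, and the sample numbers are invented for illustration rather than taken from our witness.

```shell
# zram_ratio: read lines of "DATA COMPR" (both in bytes) on stdin and
# print the overall compression ratio with two decimals.
# The function name is hypothetical; it is not part of zram or util-linux.
zram_ratio() {
    awk '
        { data += $1; compr += $2 }
        END {
            if (compr > 0) printf "%.2f\n", data / compr
            else print "n/a"
        }
    '
}

# Example with made-up numbers: 21GiB of data held in 9GiB of RAM.
printf '%s\n' '22548578304 9663676416' | zram_ratio
```

On a live system, and assuming your util-linux build of zramctl supports these options, you could feed it real values with something like zramctl --bytes --raw --noheadings --output DATA,COMPR | zram_ratio.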

When either the CPU struggles to keep up with zram swapping or when zram swap space runs low, we plan to fail over to the backup witness briefly, increase the RAM of the primary witness, catch up the blockchain, and resume operations from the primary witness.

Currently, there's barely any CPU load, so we expect that zram will last us a while.
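
To know when zram swap space is running low, it helps to keep an eye on how full it is. The sketch below is one way to do that, assuming a Linux system where /proc/swaps lists each swap device with its Size and Used columns in KiB. The function name zram_swap_used_pct is hypothetical and the sample data is made up.

```shell
# zram_swap_used_pct [FILE]: print how full zram swap is, as a whole-number
# percentage summed over every /dev/zram* entry. FILE is in /proc/swaps
# format (Size and Used in KiB) and defaults to /proc/swaps; "-" reads stdin.
# The function name is made up for this example.
zram_swap_used_pct() {
    awk '
        $1 ~ /^\/dev\/zram/ { size += $3; used += $4 }
        END {
            if (size > 0) printf "%d\n", used * 100 / size
            else print 0
        }
    ' "${1:-/proc/swaps}"
}

# Demo on sample data: two half-full zram devices plus a disk swap file,
# which is ignored. This prints 50.
zram_swap_used_pct - <<'EOF'
Filename    Type       Size     Used     Priority
/dev/zram0  partition  2097148  1048574  5
/dev/zram1  partition  2097148  1048574  5
/swapfile   file       4194300  0        -2
EOF
```

Called without arguments, it reads the live /proc/swaps. From a cron job, you could compare the result against a threshold of your own choosing and alert yourself when it's time to plan a failover.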


Steem Daemon with zram on Ubuntu or Debian

Ubuntu makes it dead simple to set up zram.

Debian 7 only: You need to enable the backports repository in /etc/apt/sources.list:

deb http://ftp.debian.org/debian wheezy-backports main

All Debian releases: Manually download and install the zram-config package version 0.5 from Ubuntu:

sudo apt update
wget 'http://archive.ubuntu.com/ubuntu/pool/universe/z/zram-config/zram-config_0.5_all.deb'
sudo dpkg -i zram-config_0.5_all.deb
sudo apt install -f
rm -v zram-config_0.5_all.deb

Debian users: skip steps 1 and 2 below, which are for Ubuntu, and go directly to step #3.

  1. If you haven't already, enable the "universe" repository:

    sudo add-apt-repository "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc) universe"
    
  2. Install the zram-config package:

    sudo apt update
    sudo apt install zram-config
    
  3. By default, zram-config sets up zram swap half the size of your RAM, but our testing revealed that the steemd in-memory database has a zram lzo compression ratio of greater than 2×, which means you can comfortably double the default zram swap size.

    Set the calculated zram swap capacity to be equal to that of RAM:

    Ubuntu 14.04 only:

    sudo sed -i 's|/ 2 /|/ 1 /|g' /etc/init/zram-config.conf
    

    zram-config 0.5 (all other releases as of September 2018):

    sudo sed -i 's|/ 2 /|/ 1 /|g' /usr/bin/init-zram-swapping
    
  4. Start up zram-config:

    Ubuntu 16.04 and newer or Debian 8 and newer:

    sudo systemctl restart zram-config
    

    All releases:

    sudo service zram-config restart
    

    You should now see zram swap:

    $ swapon --show
    NAME       TYPE      SIZE USED PRIO
    /dev/zram0 partition   2G   0B    5
    /dev/zram1 partition   2G   0B    5
    /dev/zram2 partition   2G   0B    5
    /dev/zram3 partition   2G   0B    5
    /dev/zram4 partition   2G   0B    5
    /dev/zram5 partition   2G   0B    5
    /dev/zram6 partition   2G   0B    5
    /dev/zram7 partition   2G   0B    5
    
    $ zramctl
    NAME       ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
    /dev/zram0 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram1 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram2 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram3 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram4 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram5 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram6 lzo             2G   4K   81B   12K       1 [SWAP]
    /dev/zram7 lzo             2G   4K   81B   12K       1 [SWAP]
    
  5. Optional, but highly recommended:

    If you do not already have regular disk swap (either a swap file or a swap partition), create one and set it to enable on boot:

    This sets up a 4GiB swap file (bs=1M count=4096):

    sudo dd if=/dev/zero of=/swapfile bs=1M count=4096
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
    

    Extra swap will keep steemd running longer, even if you run out of zram swap. However, you are at elevated risk of missing blocks while steemd is using disk swap, because disk swap is much slower than zram swap.

  6. If you are not already storing the steemd memory-mapped file in a tmpfs (ramdisk) mount:

    In witness_node_data_dir/config.ini, set shared-file-dir to a tmpfs mount (/dev/shm by default):

    shared-file-dir = /dev/shm
    
  7. In the same config.ini, set shared-file-size to something sane. As of January 2018, the default for witnesses is 54G (54GiB). Anything below 22G (22GiB) will fail for witnesses, because the steemd in-memory database is about to reach that size.

    We suggest that you use double your RAM size plus however much disk swap you have minus 1GiB for other things that may be running in RAM. If you have 16GiB of RAM and 4GiB of disk swap, set shared-file-size = 35G (16GiB × 2 + 4GiB - 1GiB = 35GiB).

    Regardless of how big you set the file size, steemd will only use as much space as it needs.

  8. Remount the /dev/shm tmpfs so that it can hold the entire shared_memory.bin.

    If your shared-file-size = 35G, consider setting the tmpfs size to 36352M ((35GiB + 0.5GiB buffer) × 1024 = 36352MiB):

    sudo mount -o remount,size=36352M /dev/shm
    
  9. If you have the files shared_memory.bin and shared_memory.meta already, copy them over to /dev/shm so that you don't have to replay the blockchain.

  10. Start the Steem daemon:

    • steemd if you copied the files in the previous step
    • steemd --replay-blockchain if you need to replay the blockchain
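
The sizing arithmetic from steps 7 and 8 is easy to get wrong by hand, so here's a small sketch that does it for you. The function name steem_sizes is made up for this post, and it only handles whole GiB values. It applies the same rule of thumb as above: double your RAM plus disk swap minus 1GiB for shared-file-size, then add a 0.5GiB buffer for the tmpfs.

```shell
# steem_sizes RAM_GIB DISK_SWAP_GIB
# Prints the suggested shared-file-size (in G) and tmpfs size (in M)
# using the rule of thumb from steps 7 and 8:
#   shared-file-size = 2 * RAM + disk swap - 1 GiB
#   tmpfs size       = (shared-file-size + 0.5 GiB) * 1024 MiB
# The function name is hypothetical.
steem_sizes() {
    ram_gib=$1
    swap_gib=$2
    shared_gib=$(( ram_gib * 2 + swap_gib - 1 ))
    tmpfs_mib=$(( shared_gib * 1024 + 512 ))
    echo "shared-file-size = ${shared_gib}G"
    echo "tmpfs size = ${tmpfs_mib}M"
}

# 16GiB of RAM and 4GiB of disk swap, as in the example above.
steem_sizes 16 4
```

For 16GiB of RAM and 4GiB of disk swap, this prints shared-file-size = 35G and tmpfs size = 36352M, matching the numbers worked out by hand in the steps.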

Steem Daemon with zram on Other Linux Distros

These instructions should be fairly portable across Linux distros, as long as you install the util-linux package, which contains /sbin/zramctl.

  • Debian/Ubuntu: sudo apt install util-linux
  • Fedora: sudo dnf install util-linux
  • RHEL/CentOS: sudo yum install util-linux
  • Arch Linux: sudo pacman -S util-linux
  • Gentoo: sudo emerge util-linux

  1. If zram doesn't show up in lsmod | grep zram, run sudo modprobe zram.

    If you get a message that starts with modprobe: FATAL: Module zram not found, then you'll need to boot up with a kernel that has zram (standard with Linux 3.14 and newer).

  2. Run zramctl -f to confirm that /dev/zram0 is the first zram device available.

    If it's not /dev/zram0, that means you already started up zram somewhere else. This guide recommends that zram be used exclusively for steemd's memory-mapped file and assumes that /dev/zram0 is the device you choose to use.

  3. Determine how much space to allocate to the zram device. Just use however much RAM you have:

    $ totalmem=$(LC_ALL=C free -b | grep -e "^Mem:" | sed -e 's/^Mem: *//' -e 's/  *.*//')
    $ echo "$totalmem"
    16742518784
    
  4. Create a zram device with the size you determined in the previous step:

    $ sudo zramctl -f -s "$totalmem"
    /dev/zram0
    
  5. Format the new zram device as swap:

    $ sudo mkswap /dev/zram0
    Setting up swapspace version 1, size = 15.6 GiB (16742514688 bytes)
    no label, UUID=632f5bc3-d5cf-4983-a5ba-bcbcfe9dd238
    
  6. Enable the new swap device:

    $ sudo swapon /dev/zram0
    
  7. Go to step #5 of the Ubuntu/Debian instructions above.
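
If you'd rather not parse the output of free in step 3, a possible alternative is to read the MemTotal line of /proc/meminfo, which reports total RAM in kB. This is only a sketch: the function name mem_total_bytes is made up, and it assumes a Linux system with the usual /proc/meminfo layout.

```shell
# mem_total_bytes [FILE]: print total RAM in bytes, taken from the
# MemTotal line (reported in kB) of FILE. FILE defaults to /proc/meminfo;
# "-" reads stdin. The function name is hypothetical.
mem_total_bytes() {
    awk '$1 == "MemTotal:" { printf "%.0f\n", $2 * 1024; exit }' \
        "${1:-/proc/meminfo}"
}

# Demo on a sample line; this prints 16742518784, the same value
# shown in step 3.
printf '%s\n' 'MemTotal:       16350116 kB' | mem_total_bytes -
```

With that in place, step 4 could be written as sudo zramctl -f -s "$(mem_total_bytes)".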


Conclusion

I want to contribute to alleviating the operational growing pains of the Steemit platform. Growing RAM usage, which increases costs of running Steem nodes, continues to be a nagging problem. Collectively, that's a lot of RAM. I hope this zram tutorial helps to squeeze out more value from the hardware available while being transparent to the software.

Perhaps if using zram becomes standard operating practice, we can have more reliable witnesses to support the long-term endurance of Steem (and by extension, Graphene).

Reducing RAM usage isn't all, though. The witness @gridcoin.science is at the forefront of all the improvements I have worked on for witness operations. For an overview of what @dutch and I have already done differently with @gridcoin.science, see our announcement post.

To support this witness, visit https://steemit.com/~witnesses and add gridcoin.science to the box at the bottom of the page, click vote, and authorize using your Active Key.

We want to continue innovating and sharing our findings. Please let me or @dutch know if this tutorial was helpful and what other topics you'd like us to explore.
