CLUSTERING MAGENTO REDIS CACHING

SENTINEL, TWEMPROXY, TWEMPROXY_AGENT AND KEEPALIVED

Redis clustering using sentinel is easy to set up. Adding twemproxy gives you a highly scalable redis cluster, automatic failover and a number of other benefits. This arrangement can also remove redis as a single point of failure.

Several lightweight applications will be used together to cluster Redis and to allow for high availability, stability and scalability. Getting all the parts working together can be tricky, so take a look at the implementation diagram and then we will go over the details.

Redis Scaling Network Diagram

 

Incoming redis cache requests go through a keepalived virtual IP to the twemproxy server. The twemproxy server maintains the list of redis shard servers and knows which ones are currently acting as master or slave. Twemproxy passes each redis request to the proper server and the response is returned to Magento.

Sentinel maintains the clusters of Redis servers across the shards. The twemproxy_agent subscribes to sentinel's status updates and feeds that information to twemproxy. The diagram shows 3 redis shards, each living on an individual server (these can be web servers). Additional redis shard servers can be added easily.

A LIST OF DAEMONS:

keepalived:

  • Virtual IP failover endpoint
  • Passes traffic to twemproxy
  • A hardware load balancer can handle this instead of keepalived
  • Can also load balance the sentinel servers

twemproxy:

  • Shards Redis across clusters
  • Tracks which server is master/slave for each shard/cluster (kept up to date by twemproxy_agent)
  • Highly configurable to allow for many shards/clusters
  • Delegates Redis connections to the proper redis shard
  • The code is also known as "nutcracker"

twemproxy_agent:

  • Ties twemproxy to Redis Sentinel
  • Keeps twemproxy highly available when a Redis master fails
  • Runs on node.js and is very lightweight

sentinel:

  • Discovery service for Redis servers
  • Part of the Redis Core
  • Built in monitoring, notification & autofailover
  • Auto-delegates Master/Slaves and provides info to clients
  • Automatically avoids problems with master going down
  • Avoids issues of bringing a master back online

Magento with Sentinel:

  • Magento connects to Sentinel and discovers the proper IPs for masters/slaves
  • Credis is used, which supports Sentinel
  • A cluster instance should be configured to wrap each Magento cache (object, FPC, session)
  • Session persistence uses the Redis AOF method; check the fsync policy on the system (once per second is best)
  • Pretty much transparent to Magento
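As a rough illustration only, here is a minimal fragment for the <global> node of a Magento 1 app/etc/local.xml that points the cache backends at the twemproxy endpoints used later in this guide. The redis-proxy hostname and ports 6379/6380/6381 are this example's values, the Cm_Cache_Backend_Redis and Cm_RedisSession modules are assumed to be installed, and a Sentinel-aware Credis setup would be configured differently:

<!-- Sketch only: cache backends pointed at the twemproxy endpoints from this guide -->
<cache>
  <backend>Cm_Cache_Backend_Redis</backend>
  <backend_options>
    <server>redis-proxy</server>  <!-- keepalived VIP in front of twemproxy -->
    <port>6379</port>             <!-- object cache pool -->
    <database>0</database>
  </backend_options>
</cache>
<full_page_cache>
  <backend>Cm_Cache_Backend_Redis</backend>
  <backend_options>
    <server>redis-proxy</server>
    <port>6380</port>             <!-- full page cache pool -->
    <database>0</database>
  </backend_options>
</full_page_cache>
<session_save>db</session_save>
<redis_session>                   <!-- Cm_RedisSession; back it with AOF persistence -->
  <host>redis-proxy</host>
  <port>6381</port>
  <db>0</db>
</redis_session>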

 

How to set up and configure?

The example configuration uses four servers: one server is the proxy master and the other three are redis cluster servers. I will describe the configuration on each box. You begin with the Redis clusters. After confirming your clusters work properly you wrap them in Sentinel. Finally you move on to twemproxy and its agent.

Having a good naming convention across the servers is very important; several parts of this rely on automated name-based clustering/configuration. The redis-sentinel master_name is the key to this and is set according to the 3 magento caches: cluster_fpc, cluster_obj and cluster_sess.

These are reused by twemproxy and its agent and will all need to match properly.

Another concern is redis ports: with 3 or more redis instances running per box, carefully considering how the ports are used is important. In the examples I use the following ports:

Redis Cluster Servers

  • 12000 - cluster_obj
  • 12001 - cluster_fpc
  • 12002 - cluster_sess
  • 26379 - redis-sentinel

Proxy Server

  • 6379 - twemproxy_obj
  • 6380 - twemproxy_fpc
  • 6381 - twemproxy_sess
  • 22222 - twemproxy stats

The commands and installation in this guide are based on Ubuntu 12.x and may not be complete. You will need to do some experimenting and some script writing by the time this is done, if you're going to use it in production.

 

Redis Shard Server

Each server will have several copies of redis-server running; these will be sharded across servers. Additionally, redis-sentinel can be set up on each server to cover the high availability concerns.

Basic Overview

  1. Install Redis 2.8 from source
  2. Configure clusters of redis-servers on each shard
  3. Configure Sentinel to maintain clusters of redis servers
  4. Confirm this is working before moving on to the twemproxy set up
    • Sentinel failover must be working correctly
    • Redis cluster is working with get/set between shards

Naming Convention for redis master_name along with ports

hostnames are based on the shard names below

               redis-shard1      redis-shard2      redis-shard3
cluster_obj    master on 12000   slave on 12000    slave on 12000
cluster_fpc    slave on 12001    master on 12001   slave on 12001
cluster_sess   slave on 12002    slave on 12002    master on 12002

Each column is a shard server; each row is a master_name, showing the initial role and port of that redis-server on each shard.

 

Add hostname entries for each of the servers so they can be referred to by name, such as redis-shard1. This is required to make the configuration work properly. In a production environment this would be automated.
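For example, the /etc/hosts entries on every box might look like this (the IP addresses are placeholders for your own network):

# example /etc/hosts entries, identical on every server
10.0.0.10   redis-proxy    # keepalived VIP / twemproxy
10.0.0.11   redis-shard1
10.0.0.12   redis-shard2
10.0.0.13   redis-shard3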

When setting up redis you can start with just one set of the cluster (such as fpc): keep it simple, get it working properly against all three shards, and then move on. Sentinel initial configuration and testing should be done on redis-shard1. Once you have confirmed everything is working, add the other two sentinel servers for redundancy. Sentinel clusters easily, overlapping across the shard servers; use keepalived to load balance against it.

 

Building redis 2.8 from source

Follow the directions on http://redis.io/download. You will want to confirm the default install works with default settings. Disable the provided service and stop the server before proceeding with the configuration. The install directions on redis.io also do not explain that you can run "make install", which installs everything the way you would expect.
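A condensed sketch of that build (the version number is only an example; use the latest 2.8 release):

# download, build and install redis 2.8.x from source
wget http://download.redis.io/releases/redis-2.8.19.tar.gz
tar xzf redis-2.8.19.tar.gz
cd redis-2.8.19
make
sudo make install         # installs redis-server, redis-cli and redis-sentinel to /usr/local/bin
sudo mkdir -p /etc/redis  # home for the per-instance config files below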

Configuring redis-server clusters on the shards

Redis config: redis-shard1 (obj master), with notes for setting up the other redis servers on this box. The slaveof hostname is based on our naming convention above, which I added to the hosts file. You would (un)comment the relevant sections for each redis.conf file.


## Default configuration options 
daemonize yes 
pidfile "/var/run/redis.pid" # use a unique pidfile per instance, e.g. redis.obj.pid 
logfile "/var/log/redis.log" # likewise, change this for each redis cache type 
loglevel notice 
databases 1 
save 900 1 
save 300 10 
save 60 10000 
rdbcompression yes 
dbfilename "dump.rdb" 
appendonly no 

# OBJ Master (for redis.obj.conf) 
port 12000 

# FPC Slave (for redis.fpc.conf) 
#port 12001
#slaveof redis-shard2:12001 

# SESS Slave (for redis.sess.conf) 
#port 12002 
#slaveof redis-shard3:12002 
# Persistence should be set up properly for sessions 

These will be saved as /etc/redis/redis.obj.conf, /etc/redis/redis.fpc.conf, and /etc/redis/redis.sess.conf, with the proper section (un)commented.

On each shard you will need to set up the files properly, reflecting the hostnames/IPs for each redis cluster. Pay careful attention that each shard server gets configured with all three of its redis servers, with the proper ports and master/slave roles (this can be tedious).

During testing you can use a shell command to start redis using your configuration: redis-server /etc/redis/redis.obj.conf

This is useful while testing the initial configurations; once everything is set up properly you can add proper init.d scripts on each server (a minimal sketch of one such script follows).
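A minimal sketch of such a script, one per redis instance (a hypothetical /etc/init.d/redis-obj; a production script should also handle pidfiles, status and LSB headers):

#!/bin/sh
# minimal init script sketch for the obj instance on port 12000
case "$1" in
  start)   redis-server /etc/redis/redis.obj.conf ;;
  stop)    redis-cli -p 12000 shutdown ;;
  restart) $0 stop; sleep 1; $0 start ;;
  *)       echo "Usage: $0 {start|stop|restart}"; exit 1 ;;
esac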

Again, make sure all of the redis clusters work properly across the shards with your manual configuration. Once that is confirmed, move on to adding sentinel, which will monitor the clusters and maintain master delegation.
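A quick way to confirm this is to write through a master and read from the slaves on the other shards, for example for cluster_obj (master on redis-shard1):

# write on the cluster_obj master
redis-cli -h redis-shard1 -p 12000 set testkey hello
# the slaves on the other shards should return the same value
redis-cli -h redis-shard2 -p 12000 get testkey
redis-cli -h redis-shard3 -p 12000 get testkey
# double-check the roles
redis-cli -h redis-shard1 -p 12000 info replication | grep role
redis-cli -h redis-shard2 -p 12000 info replication | grep role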

Adding Sentinel

The next step is easy if you have set up the naming on the redis clusters correctly. You will already have redis-sentinel installed on the redis shard servers; it ships with Redis 2.8. You will have to place a configuration file at /etc/redis-sentinel.conf:


sentinel monitor cluster_obj redis-shard1 12000 1
sentinel known-slave cluster_obj redis-shard2 12000
sentinel known-slave cluster_obj redis-shard3 12000
sentinel down-after-milliseconds cluster_obj 200
sentinel failover-timeout cluster_obj 1000
sentinel monitor cluster_fpc redis-shard2 12001 1
sentinel known-slave cluster_fpc redis-shard1 12001
sentinel known-slave cluster_fpc redis-shard3 12001
sentinel down-after-milliseconds cluster_fpc 200
sentinel failover-timeout cluster_fpc 1000
sentinel monitor cluster_sess redis-shard3 12002 1
sentinel known-slave cluster_sess redis-shard1 12002
sentinel known-slave cluster_sess redis-shard2 12002
sentinel down-after-milliseconds cluster_sess 200
sentinel failover-timeout cluster_sess 1000

Sentinel is aware of all three of the clusters across all of the shards. Test this much the same way you tested redis, by running: redis-sentinel /etc/redis-sentinel.conf 
This will start up the sentinel and you should see it connect to each of the masters and slaves. It may take some tinkering to get this right; take a look at the related links in the links and thanks section if you need help debugging.
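A few useful checks once the sentinel is running (these are standard SENTINEL commands against port 26379):

# show everything this sentinel is monitoring
redis-cli -h redis-shard1 -p 26379 sentinel masters
# ask where the cluster_obj master currently lives
redis-cli -h redis-shard1 -p 26379 sentinel get-master-addr-by-name cluster_obj
# force a failover of cluster_obj to prove delegation works
redis-cli -h redis-shard1 -p 26379 sentinel failover cluster_obj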


Proxy Redis Using twemproxy

So now you have set up your redis cluster with sentinel and you're ready to wrap it in twemproxy. Twemproxy keeps track of which server is master/slave and sends incoming redis read/write requests to the proper server. Twemproxy becomes out of date when sentinel changes a master, and this is where twemproxy_agent steps in: it updates the twemproxy configuration and restarts the proxy server.

  1. Build nutcracker-0.4.0 (twemproxy) from source.
  2. Configure and Start
  3. Confirm it can connect to Redis Nodes

Build twemproxy from source

Source for twemproxy: https://github.com/twitter/twemproxy 
Follow the build directions there; you should not need to set debug flags. Once installed you will need to create a nutcracker.yml in /etc/nutcracker/; the example pools follow the build sketch below.
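A rough sketch of the build, assuming the standard autotools toolchain (autoconf, automake, libtool) is installed:

# build and install twemproxy/nutcracker from source
git clone https://github.com/twitter/twemproxy.git
cd twemproxy
autoreconf -fvi
./configure
make
sudo make install
sudo mkdir -p /etc/nutcracker   # where nutcracker.yml will live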

twem1: 
  listen: "redis-proxy:6379" 
  hash: fnv1a_64 
  hash_tag: "{}" 
  distribution: ketama 
  redis: true 
  preconnect: true 
  servers: 
    - "redis-shard1:12000:1 redis-obj" 
    - "redis-shard2:12000:1 redis-obj" 
    - "redis-shard3:12000:1 redis-obj" 

twem2: 
  listen: "redis-proxy:6380" 
  hash: fnv1a_64 
  hash_tag: "{}" 
  distribution: ketama 
  redis: true 
  preconnect: true 
  servers:
    - "redis-shard1:12001:1 redis-fpc" 
    - "redis-shard2:12001:1 redis-fpc" 
    - "redis-shard3:12001:1 redis-fpc" 

twem3: 
  listen: "redis-proxy:6381" 
  hash: fnv1a_64 
  hash_tag: "{}" 
  distribution: ketama 
  redis: true 
  preconnect: true 
  servers: 
    - "redis-shard1:12002:1 redis-sess" 
    - "redis-shard2:12002:1 redis-sess" 
    - "redis-shard3:12002:1 redis-sess"

The listen address and port will be the externally visible Redis endpoint for that cluster. In our example I used redis-proxy as the hostname. 
Note: for twemproxy_agent to work properly you have to use the twem# pool-name syntax (i.e. twem1).

To start nutcracker, use the command: nutcracker -d -c /etc/nutcracker/nutcracker.yml -o /var/log/nutcracker.log (omit the -d if you do not want it to daemonize).

To test against this, you can use: redis-cli -h redis-proxy -p 6379 
This should connect and allow you to use set/get and many other commands.

Be aware that twemproxy does not support all commands; a good example is INFO, which will not work. You can find a full list of the supported commands here: https://github.com/twitter/twemproxy/blob/master/notes/redis.md

Installing twemproxy_agent

Install nodejs and add required node modules using the following:


sudo apt-get install python-software-properties python g++ make 
sudo add-apt-repository ppa:chris-lea/node.js 
sudo apt-get update 
sudo apt-get install nodejs 
npm install cli js-yaml redis underscore async 

Source code for twemproxy_agent: https://github.com/Stono/redis-twemproxy-agent

twemproxy_agent's configuration is simple: run the command with the proper arguments. It reads your twemproxy configuration and connects to sentinel to keep track of changes as they happen.

Start this up using the following command from the lib/ directory in the package:

node cli.js -h redis-shard1 -p 26379 -f /etc/nutcracker/nutcracker.yml -c /etc/init.d/nutcracker.sh -l /var/log/twemproxy.log

info about the startup command:


-h : hostname of your sentinel server (can be put behind keepalived or a load balancer) 
-p : Port of your sentinel server 
-f : Point this at your nutcracker/twemproxy conf 
-c : Shell script to killall nutcracker and restart it 
-l : Log file where it outputs all of its info 

Take a look at https://github.com/Stono/redis-twemproxy-agent for more info about how it works. You will need to fill in some blanks here; the shell script to restart nutcracker is a bit raw and is provided below.

#!/bin/sh
killall nutcracker
sleep 1
nutcracker -d -c /etc/nutcracker/nutcracker.yml -o /var/log/nutcracker.log

Either way, when you get nutcracker and the agent working together you should be able to drop any master redis and it will automatically switch over. With these settings, it happens about 100ms after the failure is detected. When you add the dropped redis back, it will automatically rejoin as a slave.
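One way to run that drill (tail the agent log in one terminal; in another, drop a master and keep talking to the proxy):

# simulate losing the cluster_obj master
redis-cli -h redis-shard1 -p 12000 debug sleep 30    # or: shutdown nosave
# watch sentinel promote a slave and the agent rewrite/restart nutcracker
tail -f /var/log/twemproxy.log
# the obj pool behind the proxy should keep answering
redis-cli -h redis-proxy -p 6379 set failover_test ok
redis-cli -h redis-proxy -p 6379 get failover_test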


Failover with keepalived or a Hardware Load Balancer

You can add another layer on top of twemproxy; in our example we referred to this as keepalived, which can virtualize the IP and allow for failover and some other concerns. It is out of scope for this document to explain the setup of N nodes of twemproxy. This graphic helps describe the architecture of adding that layer: 
 

Network Diagram for Redis Sentinel

from: http://www.jambr.co.uk/Article/redis-twemproxy-agent

Using keepalived to load balance the sentinel servers would also reduce the single points of failure in this setup to nearly zero. If you add a second twemproxy server you have a complete system.
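As a starting point, a minimal keepalived.conf for the virtual IP in front of twemproxy might look like the sketch below. The interface name, router id, priority and the VIP itself are example values only, and a real setup should add a health check (for instance a script that verifies nutcracker is still listening):

# /etc/keepalived/keepalived.conf - minimal sketch, example values only
vrrp_instance redis_proxy_vip {
    state MASTER              # use BACKUP on the standby twemproxy node
    interface eth0
    virtual_router_id 51
    priority 100              # lower this on the standby node
    advert_int 1
    virtual_ipaddress {
        10.0.0.10             # the redis-proxy VIP used throughout this guide
    }
}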

 

Conclusion

There is a lot of ground to cover on this setup and many things have been glossed over in this write-up. I suggest starting with a simple setup and building up the complexity. Once you have everything in place this works like a dream.

Currently this post is still something of a draft; here is a small list of things that need to be added:

  • keepalived examples and settings
  • Adding and removing shards
  • debugging and more details
  • Add magento config section

 

Thanks / Credits

Thijs Feryn: https://joind.in/talk/view/11867 (Special Thanks!) 
Keepalived: http://www.keepalived.org/ 
TWemproxy: https://github.com/twitter/twemproxy 
TWemproxy Agent: https://github.com/Stono/redis-twemproxy-agent 
Redis Persistence: http://redis.io/topics/persistence 
Sentinel: http://redis.io/topics/sentinel 
Magento Credis: https://github.com/colinmollenhour/credis 
Clustering Redis Blog Post: https://blog.recurly.com/2014/05/clustering-redis-maximize-uptime-scale 
Robofirm: http://www.robofirm.com/