Internet-Accessible ElastiCache Server Behind Twemproxy, Using NLB and ASG

Lucas Melogno
October 26, 2023

Summary

In this post, we dive into setting up an internet-accessible ElastiCache server using AWS services. You will learn how to work around the “VPC-isolated” limitation of the ElastiCache service while still taking advantage of its robust capabilities. This tutorial requires a basic understanding of AWS and is ideal for those aiming to use Redis as more than just a cache. Redis is not only a high-performance in-memory cache but also a data structure store supporting lists, sets, hashes, streams, and more. It offers features like persistence, pub/sub, and clustering, making it versatile for a variety of applications beyond caching.

Overview

Redis is an open-source, in-memory key-value database designed for low-latency read/write workloads. It is typically used as a cache, but that is not its only application: it can also serve as a NoSQL database.

If you came across this post, you probably want to run a Redis server, and you would rather not deal with the routine maintenance, security measures, and updates that owning such an asset implies. This is where cloud providers come in handy, so AWS’s managed service ElastiCache probably looks like a good choice.

Goal

By design, ElastiCache clusters can only be reached from resources inside the VPC they run in. In this solution, the Redis database will be accessible from anywhere on the internet.

This, of course, has implications that must be taken into consideration before adopting this solution in your workload.

The goal of this post is to illustrate what such an architecture looks like, as well as to provide CloudFormation templates so you can test it yourself. These templates can be found in this GitHub repository.

Considerations

Solution

To work around this access limitation, an internet-exposed proxy layer can be placed in front of the cluster. This comes with its own challenges and limitations, but the idea is to use a Redis proxy. Several Redis proxy options are freely available: RedisLabs/redis-cluster-proxy, Twitter/twemproxy, and Nginx, among others.

Twitter’s (or now 𝕏’s) twemproxy will be used, as suggested in this Trivago tech post, because it can maintain persistent connections to the Redis database. This is especially useful in contexts where write activity is as high as read activity, and it helps reduce latency by cutting down on connection opening and closing.

These proxy instances will not be directly exposed to the internet; instead, they will sit behind an AWS Network Load Balancer (NLB).

The load balancer plays a central security role: it terminates TLS connections and allows the twemproxy instances to live in a private subnet. Offloading TLS to the NLB avoids decrypting traffic on the twemproxy instances themselves, which would degrade their performance. AWS states that the NLB scales “infinitely”; of course, this is priced accordingly. Communication between twemproxy and ElastiCache happens over plain TCP inside the VPC. The twemproxy instances are launched by an Auto Scaling group (ASG), which provides both fault tolerance and high availability by scaling horizontally. The ASG lets you specify how many instances should be running at any given time and how far it can scale in or out.

As for the ElastiCache settings, this architecture allows for both Cluster Mode enabled and disabled (a comparison between them can be found here), since the twemproxy configuration accepts any number of Redis nodes and offers several consistent-hashing sharding options. Within the scope of this article, the ElastiCache cluster will run with Cluster Mode disabled.
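As a point of reference, a twemproxy (nutcracker) configuration for this kind of setup could look like the sketch below. The pool name and node endpoints are placeholders; the relevant parts are that the servers list can hold any number of ElastiCache node endpoints and that the distribution key selects the sharding scheme (ketama for consistent hashing).

```yaml
# Hypothetical nutcracker.yml: a single pool listening on 6379,
# sharding keys across two ElastiCache nodes with consistent hashing.
redis_pool:
  listen: 0.0.0.0:6379
  redis: true
  hash: fnv1a_64
  distribution: ketama
  timeout: 400
  auto_eject_hosts: false
  server_retry_timeout: 2000
  servers:
    - node-001.example.0001.use1.cache.amazonaws.com:6379:1
    - node-002.example.0001.use1.cache.amazonaws.com:6379:1
```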

CloudFormation Stacks

The resources will be split into three different CloudFormation (CFN) stacks, separating the VPC, ElastiCache, and NLB/ASG resources.

💡 You should always use Infrastructure as Code (IaC) for your cloud computing projects.

Prerequisites

In order to follow this tutorial you need, at a minimum, an AWS account with permissions to create the resources involved, and an ACM certificate for a domain you own.

It’s important to note that having an ACM certificate implies that you own a domain name, which you will need to specify during the creation of the NLB/ASG stack. The SSL/TLS certificate is needed for TLS termination; otherwise you would have to use plain TCP communication over the internet, which is dangerous and not recommended.

Tutorial

Setting up the network

We first need to deploy the required network resources to our AWS account. Upload the VPC stack template in CloudFormation and specify the IP ranges for the VPC and for the private and public subnets.
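If you prefer the command line to the console, the same template can be deployed with the AWS CLI. The stack name, template file, and parameter keys below are placeholders; check the template in the repository for the actual parameter names.

```bash
# Placeholder names: adjust the template file and parameter keys
# to match the VPC template from the repository.
aws cloudformation deploy \
  --stack-name redis-proxy-vpc \
  --template-file vpc.yaml \
  --parameter-overrides \
      VpcCidr=10.0.0.0/16 \
      PublicSubnetCidr=10.0.1.0/24 \
      PrivateSubnetCidr=10.0.2.0/24
```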

This template will create a NAT Gateway, which is necessary to provide internet access to instances without public IPs. This resource is billed by the hour, so remember to delete it after testing.

Creating the ElastiCache cluster

To create the ElastiCache cluster we will use a CFN stack that is already configured for Redis 7.0. You can modify this stack to change some of the cluster configuration. For now, we will specify a total of 2 nodes of a small instance size and use the private subnets we just created.

Cluster provisioning is rather slow, so be patient while AWS sets up your database. Once it’s done, we need to gather information from the resources that were created: at a minimum, the endpoint address and port of each Redis node.
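One way to look these endpoints up programmatically, sketched here with the aws-sdk-elasticache gem (the region and cluster id are placeholders for your own values):

```ruby
# Lists the address and port of every node in the cluster so they can
# be copied into the NLB/ASG stack parameters (and the twemproxy config).
require "aws-sdk-elasticache"

client = Aws::ElastiCache::Client.new(region: "us-east-1")
resp = client.describe_cache_clusters(
  cache_cluster_id: "my-redis-cluster", # placeholder cluster id
  show_cache_node_info: true
)

resp.cache_clusters.each do |cluster|
  cluster.cache_nodes.each do |node|
    puts "#{node.endpoint.address}:#{node.endpoint.port}"
  end
end
```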

Once you have noted these parameters, you are ready to deploy the NLB/ASG stack. These values could have been exported as outputs of the stacks that create these resources, but this was not done for the sake of simplicity.

Launching the Redis proxy

Finally, we need to deploy the twemproxy stack using the parameters gathered in the previous step. This template will launch a Network Load Balancer, reachable at the subdomain name you specify, which terminates TLS connections and forwards traffic to the instances in the Auto Scaling group. These instances are exact replicas of a Launch Template, a kind of blueprint that defines how an EC2 instance is configured and started. There will be as many instances as stated in the DesiredInstances parameter.

Once the stack is launched, we will be able to access the Redis database using the server name we defined. The following is an example of a Redis client initialized in Ruby to connect through TLS.
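A minimal sketch using the redis-rb gem; the hostname is a placeholder for the subdomain you pointed at the NLB, and the port must match the NLB's TLS listener:

```ruby
require "redis"

# TLS is terminated at the NLB, so the client connects with ssl: true.
redis = Redis.new(
  host: "redis.example.com", # placeholder subdomain pointing at the NLB
  port: 6379,
  ssl: true
)

redis.set("greeting", "hello through twemproxy")
puts redis.get("greeting")
```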


Performance

We built a simple Ruby script to test connection performance, using an EC2 instance located in a different VPC in the same region. This simulates a connection going over the internet. It achieved an average latency of ~0.26 ms.
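A rough sketch of that kind of measurement, again with the redis-rb gem and a placeholder hostname:

```ruby
require "redis"
require "benchmark"

# Averages the round-trip time of a batch of SET commands sent over TLS.
redis = Redis.new(host: "redis.example.com", port: 6379, ssl: true)
iterations = 1_000

elapsed = Benchmark.realtime do
  iterations.times { |i| redis.set("bench:#{i}", "x") }
end

puts format("average round trip: %.3f ms", (elapsed / iterations) * 1000)
```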

Pricing

The following considerations per service must be taken into account to calculate the operating costs of this solution:

- Network Load Balancer: billed per hour, plus per NLCU of processed traffic.
- EC2 (twemproxy instances): billed per instance hour, scaling with the size of the ASG.
- ElastiCache: billed per node hour, depending on node type and count.
- NAT Gateway: billed per hour, plus per GB of data processed.
- Data transfer out to the internet: billed per GB.

During the making of this solution, we had a daily spend of roughly 4 USD, of which about 1.5 USD corresponded to ElastiCache. This accounts for about 100 USD per month in fixed costs. Of course, depending on your needs and capacity, the spend will scale accordingly. A more detailed estimate can be calculated using the AWS Pricing Calculator.

Conclusion

In this exploration, we’ve demonstrated how to bridge the inherent limitations of AWS’ ElastiCache service to make a Redis server accessible over the internet. While ElastiCache’s VPC-bound nature provides a robust layer of isolation, certain applications and scenarios necessitate broader accessibility. By utilizing AWS services judiciously, in conjunction with twemproxy, we’ve successfully crafted a solution that maintains the security and performance characteristics vital to Redis operations.

To those aiming to leverage the capabilities of Redis beyond its traditional use-cases, this tutorial serves as a starting point, illustrating the potential and flexibility of combining cloud-native services with open-source solutions. As always, the evolution of technology warrants continuous learning and adaptation to maximize the benefits of such integrations.
