You are probably thinking: what is DAX? Why do we need it? Isn’t Amazon DynamoDB fast enough on its own? This blog explains what DAX is, why and when it should be used, its advantages, and its limitations.
DAX is an AWS service that works as an add-on for DynamoDB. Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for Amazon DynamoDB that delivers up to a 10 times performance improvement, from milliseconds to microseconds, even at millions of requests per second.
DAX is designed to run within an Amazon Virtual Private Cloud (Amazon VPC) environment. Amazon VPC defines a virtual network that closely resembles a traditional data center. With a VPC, you have control over its IP address range, subnets, routing tables, network gateways, and security settings. You can launch a DAX cluster in your virtual network and control access to the cluster by using Amazon VPC security groups.
Creating a DAX Cluster
1. To create a DAX cluster, open your AWS account console, search for DynamoDB, and click on DAX Clusters. Then click on Create Cluster.
2. Enter a cluster name and description, select the node family and node type, specify the number of nodes, and click Next.
3. Select or create a subnet group and a security group. When creating the security group, make sure to add an inbound rule for port 9111 for an encrypted DAX cluster or port 8111 for an unencrypted DAX cluster.
4. Grant your DAX cluster permission to access DynamoDB by assigning it an IAM role, and enable encryption if you want.
5. Verify with advanced settings and review all steps before creating the cluster.
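The console steps above can also be expressed as a script. The following is only a minimal sketch, assuming the boto3 `dax` client and its `create_cluster` call; the helper names (`security_group_port`, `build_cluster_request`, `create_cluster`) and every example value (role ARN, subnet group, security group ID, node type) are placeholders for illustration and are not taken from this walkthrough. It models only encryption in transit, which is the setting that decides the port.

```python
def security_group_port(encrypted_in_transit):
    """Return the port the security group must allow (step 3 above)."""
    return 9111 if encrypted_in_transit else 8111


def build_cluster_request(name, node_type, node_count, role_arn,
                          subnet_group, security_group_ids,
                          encrypted_in_transit=False, description=None):
    """Assemble keyword arguments for boto3's DAX create_cluster call."""
    request = {
        "ClusterName": name,
        "NodeType": node_type,
        "ReplicationFactor": node_count,  # total number of nodes in the cluster
        "IamRoleArn": role_arn,           # role that lets DAX access the tables
        "SubnetGroupName": subnet_group,
        "SecurityGroupIds": list(security_group_ids),
        "ClusterEndpointEncryptionType": "TLS" if encrypted_in_transit else "NONE",
    }
    if description:
        request["Description"] = description
    return request


def create_cluster(request):
    """Send the request to AWS. Needs boto3, credentials, and a region."""
    import boto3  # imported lazily so the helpers above run without boto3
    return boto3.client("dax").create_cluster(**request)


if __name__ == "__main__":
    # Every value below is a placeholder, not taken from the walkthrough.
    request = build_cluster_request(
        name="my-cluster",
        node_type="dax.r5.large",
        node_count=3,
        role_arn="arn:aws:iam::123456789012:role/ExampleDaxRole",
        subnet_group="example-subnet-group",
        security_group_ids=["sg-0123456789abcdef0"],
        encrypted_in_transit=True,
    )
    print(request)
    print("open port", security_group_port(True))
```

Building the request separately from sending it makes it easy to review the settings first, much like the review step in the console. Calling `create_cluster(request)` is what would actually create (and bill for) the cluster.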
How does DAX Work?
1. To test DAX, we will create a DynamoDB table and fetch data from it directly, then fetch the same data through the DAX cluster and compare the execution times. First, create an EC2 instance and give it an IAM role with DAX full access and DynamoDB full access. Make sure the instance uses the same VPC and security group as your cluster.
2. Now connect to your EC2 instance over SSH and run the following commands.
- pip install amazon-dax-client
- wget http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/samples/TryDax.zip
- unzip TryDax.zip
- python 01-create-table.py
- python 02-write-data.py
- python 03-getitem-test.py
- python 04-query-test.py
- python 05-scan-test.py
3. The commands above install the DAX client, download the sample code, create a DynamoDB table, write some test data into it, and finally fetch data with get-item, query, and scan operations. The screenshots below show the execution times of these operations. For now, the commands fetch data directly from the database.
4. As you can see, we got an average execution time for each operation. Now we will run the same tests through the DAX client. To get the DAX cluster endpoint, run the first command below, or copy the endpoint from the DAX cluster dashboard. Then pass the endpoint to each test script.
- aws dax describe-clusters --query "Clusters[*].ClusterDiscoveryEndpoint"
- python 03-getitem-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
- python 04-query-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
- python 05-scan-test.py dax://my-cluster.l6fzcv.dax-clusters.us-east-1.amazonaws.com
5. The following screenshots show the output after running the above commands.
6. Comparing the elapsed times for the database and for DAX, we can conclude that DAX is much faster and can improve application response times from milliseconds to microseconds. If required, we can now delete the table by running the command below.
- python 06-delete-table.py
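The comparison in the steps above boils down to timing the same reads against two different clients. The following is a rough sketch of that idea, not the code from TryDax.zip: `average_latency_ms` and `make_table_fetch` are hypothetical helpers, and the wiring shown in the comments (table name, key attribute names, and the DAX client constructor) is an assumption that should be checked against the downloaded samples.

```python
import time


def average_latency_ms(fetch, keys, iterations=5):
    """Call fetch(key) for every key, `iterations` times, and return
    the mean elapsed time per iteration in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        for key in keys:
            fetch(key)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1000


def make_table_fetch(table):
    """Wrap a boto3-style Table object so its get_item call can be timed."""
    def fetch(key):
        return table.get_item(Key=key)
    return fetch


# Possible wiring on the EC2 instance (all names below are assumptions):
#
#   import boto3, amazondax
#   keys = [{"partition_key": 1, "sort_key": n} for n in range(1, 11)]
#   direct = boto3.resource("dynamodb").Table("TryDaxTable")
#   print(average_latency_ms(make_table_fetch(direct), keys))
#
#   dax = amazondax.AmazonDaxClient.resource(endpoint_url="dax://...")
#   print(average_latency_ms(make_table_fetch(dax.Table("TryDaxTable")), keys))
```

Because the harness only needs an object with a `get_item` method, the same function measures both paths, so any difference in the numbers comes from the client and not from the measuring code. The first pass through DAX will include cache misses, so running several iterations gives a fairer picture.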
When should we use it?
- Consistent/Burst Traffic: When incoming traffic repeatedly reads the same set of primary/secondary keys and follows a consistent pattern.
- Faster Response: When you need responses in microseconds rather than milliseconds.
- Eventual Consistency: When your application can tolerate data that is not immediately up to date. Changes in the DynamoDB table might take some time to be reflected in the DAX cluster. This delay is usually very short, but it may become noticeable under heavy traffic.
- Read Intensive: DAX is a cache, and caches mainly benefit read-intensive workloads; it is not meant for write-intensive workloads. It works best for read-intensive applications with heavy traffic that can be spread across multiple DAX nodes.
- Save Cost on DynamoDB RCUs: DAX can also let you reduce the provisioned read capacity of your DynamoDB tables. Because data is cached by DAX, fewer read requests reach your tables; they are served from the in-memory cache instead. Lower provisioned read capacity in turn reduces your overall cost.
- Hot Key Data Retrieval: When the same items or queries are read repeatedly, they can be served from the cache until the TTL expires (the default TTL is 5 minutes), and the TTL can be increased if required.
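Several of the points above (eventual consistency, fewer reads reaching the table, and the TTL) follow from how a read-through cache behaves. The toy model below is only an illustration of that behavior, not DAX's implementation: `ReadThroughCache` is a hypothetical class, the "table" is a plain dictionary, and the clock is injectable so that expiry can be simulated without waiting.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a per-item TTL (not DAX itself)."""

    def __init__(self, backend_read, ttl_seconds=300, clock=time.monotonic):
        self._backend_read = backend_read   # function that reads the "table"
        self._ttl = ttl_seconds             # 300 s mirrors the 5 minute default
        self._clock = clock
        self._items = {}                    # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0                     # each miss costs one backend read

    def get(self, key):
        now = self._clock()
        cached = self._items.get(key)
        if cached is not None and cached[1] > now:
            self.hits += 1
            return cached[0]                # served from memory, may be stale
        self.misses += 1
        value = self._backend_read(key)
        self._items[key] = (value, now + self._ttl)
        return value


if __name__ == "__main__":
    table = {"user#1": "v1"}                # stand-in for a DynamoDB table
    now = [0.0]                             # fake clock, in seconds
    cache = ReadThroughCache(table.get, ttl_seconds=300, clock=lambda: now[0])

    cache.get("user#1")                     # first read goes to the table
    table["user#1"] = "v2"                  # write that bypasses the cache
    stale = cache.get("user#1")             # still the old value
    now[0] = 301                            # move past the TTL
    fresh = cache.get("user#1")             # re-read from the table
    print(stale, fresh, cache.hits, cache.misses)
```

In this model only misses touch the table, which is the intuition behind the RCU savings, and a longer TTL trades fresher data for more hits on hot keys.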
As we can see from the points above, DAX scales across multiple nodes, provides extreme performance, and saves cost on RCUs under heavy read load.
Written by – Mohammed Shahid Adoni