In Amazon Web Services (AWS), you generally spread your nodes over multiple availability zones for high availability. Unfortunately, not every instance type is available in every availability zone, and it is hard to know in advance which zones support a given type.
If you are provisioning a single EC2 instance, or provisioning EC2 instances in an Auto Scaling Group (ASG) that uses a single zone, you will notice right away if you chose an incompatible zone for your instance type, as it just won’t work.
It can be more insidious when you have an ASG with multiple zones though. For example, our large-scale Airflow service runs in Kubernetes, and the main ASG spans 3 zones. Today, we ran out of IPs in two zones and realized that the third was not even being utilized. When hunting down why, we found this message on the “activity” page for the ASG.
Launching a new EC2 instance. Status Reason: Your requested instance type (r5.2xlarge) is not supported in your requested Availability Zone (us-east-1e). Please retry your request by not specifying an Availability Zone or choosing us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f. Launching EC2 instance failed.
This is a very helpful message, but it’s unfortunate that we had to do the wrong thing in order to get the supported zones list.
Getting the Correct Zones in Advance
You can use this AWS CLI (V2) command to check the list of zones supported for an instance type in advance.
% aws ec2 describe-instance-type-offerings --location-type availability-zone --filters="Name=instance-type,Values=r5.2xlarge" --region us-east-1 --output table
| DescribeInstanceTypeOfferings |
|| InstanceTypeOfferings ||
|| InstanceType | Location | LocationType ||
|| r5.2xlarge | us-east-1f | availability-zone ||
|| r5.2xlarge | us-east-1c | availability-zone ||
|| r5.2xlarge | us-east-1b | availability-zone ||
|| r5.2xlarge | us-east-1d | availability-zone ||
|| r5.2xlarge | us-east-1a | availability-zone ||
You can find some information on this from AWS at this link.
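The same check can be scripted. Below is a minimal Python sketch, assuming the boto3 library and configured credentials. `fetch_offerings` wraps boto3's `describe_instance_type_offerings` call, which corresponds to the CLI command above. The helper names `supported_zones` and `unsupported_zones` are hypothetical, and the sample response simply mirrors the table output shown earlier.

```python
def fetch_offerings(instance_type, region):
    """Ask EC2 which zones offer instance_type (needs boto3 and credentials)."""
    import boto3  # imported lazily so the pure helpers below work without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )


def supported_zones(response):
    """Return the sorted zone names from a DescribeInstanceTypeOfferings response."""
    return sorted(o["Location"] for o in response["InstanceTypeOfferings"])


def unsupported_zones(asg_zones, response):
    """Return the zones an ASG is configured for that do not offer the type."""
    offered = set(supported_zones(response))
    return sorted(z for z in asg_zones if z not in offered)


# Sample response mirroring the table output above (no AWS call needed).
sample = {
    "InstanceTypeOfferings": [
        {"InstanceType": "r5.2xlarge", "LocationType": "availability-zone", "Location": z}
        for z in ["us-east-1f", "us-east-1c", "us-east-1b", "us-east-1d", "us-east-1a"]
    ]
}

# Zones we intended to use for the ASG; us-east-1e is the one from the error.
print(unsupported_zones(["us-east-1a", "us-east-1b", "us-east-1e"], sample))
```

Running a check like this before creating the ASG would flag us-east-1e without having to wait for a failed launch.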
We use AWS EKS (v1.16) Kubernetes for our auto scaling Presto deployments, and we front it with an nginx ingress leveraging a network load balancer.
We found that, once we started auto scaling, we started getting remote disconnect errors from clients fairly frequently. This was pretty hard to explain because we had actually gone to great lengths to make sure Presto itself was gracefully terminating in a way that would not damage live queries.
Where is the Issue?
The root cause of this issue is that:
- We use ingress.
- Ingress uses a cloud load balancer.
- The cloud load balancer talks to the nginx ingress controller as a NodePort service.
- This means the LB will route traffic through any random node in the cluster.
- So, we gracefully terminate Presto, but the NodePort service on the node that is scaling down may still be routing traffic to another node (e.g. the coordinator in this case).
It turns out that there really is no good way to fix this in EKS at this point in time. We originally hit this bug: https://github.com/kubernetes/autoscaler/issues/1907, and when we tried the workaround of using externalTrafficPolicy = Local, we hit this other bug: https://github.com/kubernetes/cloud-provider-aws/issues/87.
Other solutions are being developed now and will allow you to exclude certain nodes from the LB config using labels/etc, but they are not ready yet.
What is a Workaround?
Unfortunately, we did not solve this purely using the NGINX ingress. We found that we had to schedule the ingress services on some non-auto-scaling core nodes, and then we added them to the load balancer specifically (actually, to a separate LB we created and manage with terraform). This way, ingress always comes into nodes that do not auto scale, and those nodes route to the other services in a reliable way using the CNI black magic. It’s not a feel-good solution, but it remains stable during auto scaling of the rest of the cluster, so it works until a real k8s/AWS solution is developed.
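To make the failure mode and the workaround concrete, here is a toy Python model. It is only an illustration under simplifying assumptions, not a description of how kube-proxy or the load balancer is implemented, and the node names are hypothetical.

```python
def broken_entry_nodes(lb_targets, draining_nodes):
    """Return the load balancer targets that are being scaled down.

    Toy model: with a NodePort service, a connection that enters through a
    node is relayed from there to the backend pod, so it breaks when that
    entry node terminates, even if the backend pod lives on a healthy node.
    """
    return sorted(set(lb_targets) & set(draining_nodes))


# Hypothetical node names, for illustration only.
core_nodes = ["core-1", "core-2"]            # never auto scaled
worker_nodes = ["worker-1", "worker-2", "worker-3"]
draining = ["worker-2"]                      # picked by the autoscaler for removal

# Default setup: every node in the cluster is a load balancer target.
print(broken_entry_nodes(core_nodes + worker_nodes, draining))

# Workaround: only the non-auto-scaling core nodes are targets.
print(broken_entry_nodes(core_nodes, draining))
```

In the first case a draining worker is still an entry point for client traffic. In the second case no entry point ever scales down, which is why the workaround stays stable.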
Due to increased query sizes on our Presto clusters (causing aggregation failures), I’m in the middle of evaluating a move from 16-core, 64GB RAM general purpose EC2 machines (m4.4xlarge) to 64-core, 256GB RAM general purpose machines (a 4x increase in power/RAM).
Here is the list of m4 and m5 models for 16-core/64GB and 64-core/256GB specs. Below, we’ll see how they compare to each other and what the best option is.
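|| Model | vCPU | Memory (GB) | Instance Storage (GB) | Price ||
|| m4.4xlarge | 16 | 64 | EBS only | $0.80 per Hour ||
|| m4.16xlarge | 64 | 256 | EBS only | $3.20 per Hour ||
|| m5.4xlarge | 16 | 64 | EBS only | $0.768 per Hour ||
|| m5.16xlarge | 64 | 256 | EBS only | $3.072 per Hour ||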
EC2 uses the EC2 Compute Unit (ECU) term to describe CPU resources for each instance size where one ECU provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
There are a few good things to notice here:
- For m4.4xlarge to m4.16xlarge, we get 4x the resources for exactly 4x the cost ($0.80 x 4 = $3.20). The one exception is that we get less than 4x the ECUs (so technically less than 4x the processing power). So, compute scales roughly linearly within a model, it seems.
- Pretty much the exact same situation holds true for the m5 models; going from 4xlarge to 16xlarge is exactly a 4x increase in cost and resources, except for ECUs, which are a little less than 4x.
- The m5 models have more ECUs than their m4 counterparts and they also cost less, so they are a better deal both performance-wise and cost-wise.
So, we’ll go with m5.16xlarge instances, which cost $3.072 an hour. This comes out to about $2,211 a month (720 hours in a 30-day month).
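The arithmetic behind these numbers can be checked with a short Python sketch. It assumes a 30-day month of on-demand usage and uses only the hourly prices quoted above; `monthly_cost` is a hypothetical helper name.

```python
from decimal import Decimal

HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, running 24 hours a day

# Hourly on-demand prices quoted in this post.
PRICES = {
    "m4.4xlarge": Decimal("0.80"),
    "m4.16xlarge": Decimal("3.20"),
    "m5.4xlarge": Decimal("0.768"),
    "m5.16xlarge": Decimal("3.072"),
}


def monthly_cost(hourly_price):
    """Return the cost of running one instance for a 30-day month."""
    return Decimal(hourly_price) * HOURS_PER_MONTH


# Cost scaling within a family: 16xlarge versus 4xlarge.
print(PRICES["m4.16xlarge"] / PRICES["m4.4xlarge"])
print(PRICES["m5.16xlarge"] / PRICES["m5.4xlarge"])

# Monthly cost of the chosen instance, and the saving relative to m4.16xlarge.
print(monthly_cost(PRICES["m5.16xlarge"]))
print(1 - PRICES["m5.16xlarge"] / PRICES["m4.16xlarge"])
```

Both ratios come out to exactly 4, the m5.16xlarge costs $2,211.84 for 720 hours, and it is 4% cheaper per hour than the m4.16xlarge.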
This is just a quick note for anyone facing this issue.
A few of us lost about a day debugging what we originally thought was a Terraform issue. While we were creating an auto scaling group (ASG), we were getting “Invalid details specified: You are not authorized to use launch template…”.
It turned out that the same error was presented in the AWS console when we tried to create the ASG there.
After some substantial debugging, it turned out that Terraform had been allowed to create a launch template with an AMI (Amazon Machine Image) that did not exist in that account. We had used the AMI ID from our non-prod account in our prod account, but an AMI ID only resolves in an account that owns the image or has been given access to it, and a copied AMI gets a new ID, so it wasn’t working.
It took us a while to get to this point in our debugging because, frankly, we were astounded that the error message was so misleading. We spent a very long time trying to figure out everything that could trigger a permissions error on the template itself, not realizing that a missing resource used within the template would make the whole template present that error.
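A pre-flight check along these lines might have saved us the day. This is a minimal Python sketch: `ami_is_visible` is a hypothetical helper that expects a client behaving like `boto3.client("ec2")`, and it assumes that an unknown ID either raises an error whose code starts with `InvalidAMIID` or returns an empty image list. `FakeEc2` and the AMI IDs are made up so the example runs without AWS access.

```python
def ami_is_visible(ec2_client, ami_id):
    """Return True if the account and region behind ec2_client can see ami_id."""
    try:
        response = ec2_client.describe_images(ImageIds=[ami_id])
    except Exception as err:
        # botocore's ClientError carries the AWS error code in err.response.
        code = getattr(err, "response", {}).get("Error", {}).get("Code", "")
        if code.startswith("InvalidAMIID"):
            return False
        raise  # anything else (credentials, throttling, ...) is a real failure
    return bool(response.get("Images"))


class FakeEc2:
    """Stand-in client for illustration; it knows one hypothetical AMI."""

    def describe_images(self, ImageIds):
        known = {"ami-0123456789abcdef0"}
        return {"Images": [{"ImageId": i} for i in ImageIds if i in known]}


print(ami_is_visible(FakeEc2(), "ami-0123456789abcdef0"))
print(ami_is_visible(FakeEc2(), "ami-0fedcba9876543210"))
```

Against a real account you would pass `boto3.client("ec2")` created with the prod credentials and check the AMI ID before applying the launch template.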
In AWS, you can generally extend the root (or other) volume of any of your EC2 instances without downtime. The steps vary slightly by OS, file system type, etc., though.
On a rather default-configured AWS instance running the main marketplace CentOS 7 image, I had to follow these steps.
- Find/modify the volume on the “Volumes” page under the EC2 service in the AWS console.
- Wait for it to get into the “optimizing” state (visible in the volume listing).
- Run: sudo file -s /dev/xvd*
  - If you’re in my situation, this will output a couple of lines like this:
    - /dev/xvda: x86 boot sector; partition 1: ID=0x83, active, starthead 32, startsector 2048, 134215647 sectors, code offset 0x63
    - /dev/xvda1: SGI XFS filesystem data (blksz 4096, inosz 512, v2 dirs)
  - The important part is the XFS; that is the file system type.
- Run: lsblk
  - Again, in my situation the output looked like this:
    - NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    - xvda 202:0 0 64G 0 disk
    - └─xvda1 202:1 0 64G 0 part /
  - This basically says that the data is in one partition under xvda. Note: mine said 32G to start. I had already increased it to 64G and am just going back through the process to document it.
- Run: sudo growpart /dev/xvda 1
  - This grows partition #1 of /dev/xvda to take up the remaining space.
- Run: sudo xfs_growfs -d /
  - This grows the XFS file system mounted at / to fill the available space in the partition.
- After this, you can just run “df -h” to see the increased file system size.
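Since the last step depends on the file system type, the decision can be sketched in Python. `grow_commands` is a hypothetical helper that only builds the command strings from the `file -s` output and does not run anything. The XFS branch is the one from this post. The ext2/3/4 branch using `resize2fs` goes beyond my setup and assumes xvd-style device naming, where the partition is the disk name plus its number.

```python
def grow_commands(file_output, disk="/dev/xvda", partition=1, mount_point="/"):
    """Return the shell commands to grow a partition and then its file system.

    file_output is the `sudo file -s` line for the partition to be grown.
    """
    commands = [f"sudo growpart {disk} {partition}"]
    if "XFS" in file_output:
        # XFS is grown through its mount point.
        commands.append(f"sudo xfs_growfs -d {mount_point}")
    elif any(fs in file_output for fs in ("ext2", "ext3", "ext4")):
        # Assumption beyond this post: ext file systems are grown with
        # resize2fs on the partition device (xvd-style naming assumed).
        commands.append(f"sudo resize2fs {disk}{partition}")
    else:
        raise ValueError(f"unrecognized file system in: {file_output!r}")
    return commands


sample = "/dev/xvda1: SGI XFS filesystem data (blksz 4096, inosz 512, v2 dirs)"
for command in grow_commands(sample):
    print(command)
```

For the sample line from my instance, this prints the same two commands used in the steps above.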
Note: your volume may take hours to get out of the “optimizing” state, but it can still be used immediately.
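If you would rather script the wait than watch the console, the state can be read through the API. This is a minimal sketch, assuming boto3 and configured credentials. `fetch_modification` wraps boto3's `describe_volumes_modifications` call, and `can_extend_filesystem` is a hypothetical helper encoding the rule above: the new size is usable once the state is “optimizing” or “completed”.

```python
def fetch_modification(volume_id, region):
    """Return the latest modification record for a volume (needs boto3)."""
    import boto3  # imported lazily so the pure helper below works without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_volumes_modifications(VolumeIds=[volume_id])
    return response["VolumesModifications"][0]


def can_extend_filesystem(modification):
    """Return True once the resized volume is ready for growpart and friends."""
    return modification["ModificationState"] in ("optimizing", "completed")


# Sample records shaped like the API response, for illustration only.
print(can_extend_filesystem({"ModificationState": "modifying"}))
print(can_extend_filesystem({"ModificationState": "optimizing"}))
```

A small loop that calls `fetch_modification` every few seconds until `can_extend_filesystem` returns True would replace the manual wait.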
You can view the raw AWS instructions here in case any of this doesn’t line up for you when you go to modify your instance: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-modify-volume.html.