Using Athena From DBeaver with your IAM Role / Profile

I just spent about 30 minutes working out how to connect DBeaver to Athena using my normal AWS credentials file / default credentials.

Thankfully, I stumbled across this GitHub issue comment and it worked like a charm: https://github.com/dbeaver/dbeaver/issues/3918#issuecomment-511484596.

Here are the relevant notes (slightly modified for easier understanding):

  1. Do your normal AWS login process to refresh your credentials (in our case, we use okta + gimme_aws_creds for this).
  2. Go to driver properties on your DBeaver Athena connection and set:
    • AwsCredentialsProviderClass to com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider
    • AwsCredentialsProviderArguments equal to the name of the profile you want to use (see ~/.aws/config to see which profiles you have) – we use “default”.
  3. Test Connection and it should work.
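
If you are not sure which profile names you can pass in AwsCredentialsProviderArguments, a small helper along these lines may help. It is only a sketch: list_aws_profiles is a hypothetical function name, and it assumes the usual layout where ~/.aws/config uses [default] and [profile name] section headers. It is written for bash or POSIX sh. With AWS CLI v2 you can also just run `aws configure list-profiles`.

```shell
# Hypothetical helper: print the profile names found in an AWS config file.
# Defaults to ~/.aws/config; pass another path as the first argument.
list_aws_profiles() {
  config_file="${1:-$HOME/.aws/config}"
  awk '
    /^\[.*\][ \t]*$/ {
      name = $0
      sub(/^\[/, "", name)            # drop the opening bracket
      sub(/\][ \t]*$/, "", name)      # drop the closing bracket
      # Skip sections that are not profiles.
      if (name ~ /^(sso-session|services)[ \t]/) next
      sub(/^profile[ \t]+/, "", name) # "[profile dev]" -> "dev"
      print name
    }' "$config_file"
}

# Usage:
#   list_aws_profiles                      # reads ~/.aws/config
#   list_aws_profiles ~/.aws/credentials   # headers there are plain [name]
```

Whatever name it prints is the value to put in AwsCredentialsProviderArguments.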

Manually Load Docker Image – Avoid Rate Limiting

You can manually load a Docker image onto a server when you need to. This is useful for getting around Docker Hub rate limiting in an urgent situation.

# On your laptop.
docker pull busybox:1.34.1
docker save busybox:1.34.1 > busybox-1-34-1.tar
aws s3 cp busybox-1-34-1.tar s3://your-s3-bucket/busybox-1-34-1.tar

# On remote node.
aws s3 cp s3://your-s3-bucket/busybox-1-34-1.tar /tmp/busybox-1-34-1.tar
docker load -i /tmp/busybox-1-34-1.tar

You can use anything that both your local host and the target host have access to. I just used S3 as it was most convenient in my case. We have SSH disabled on our production nodes; otherwise you could just copy the file across over SSH.
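
It is easy to mistype the tarball name somewhere between the save, upload, download, and load steps. One way to avoid that is to derive the file name from the image reference once and reuse it. This is a minimal sketch for bash or POSIX sh; image_tar_name is a hypothetical helper, not a Docker feature.

```shell
# Hypothetical helper: turn an image reference into a safe tarball name
# by replacing ':', '.', and '/' with '-'.
image_tar_name() {
  printf '%s.tar\n' "$(printf '%s\n' "$1" | sed 's|[:./]|-|g')"
}

IMAGE="busybox:1.34.1"
TAR="$(image_tar_name "$IMAGE")"
echo "$TAR"   # busybox-1-34-1.tar

# On your laptop (same steps as above, with the derived name):
#   docker pull "$IMAGE"
#   docker save "$IMAGE" > "$TAR"
#   aws s3 cp "$TAR" "s3://your-s3-bucket/$TAR"
```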

PrestoSQL / Presto UI – Get stats programmatically via API

If you’re having trouble getting the /ui/api/stats info programmatically, you can use this script. It’s ill-advised, as they may change those APIs at any time; but since some of the UI stats are better/more correct than the Prometheus stats, you may need them as we did.

% COOKIE_VALUE=$(curl --location --request POST 'https://some.company.com/ui/login' \
    --data-urlencode 'username=john.humphreys' \
    --data-urlencode 'password=<password>' \
    --cookie-jar - --output /dev/null --silent | awk '{print $7}' | tail -n 1)

% curl 'https://some.company.com/ui/api/stats' -H "Cookie: Presto-UI-Token=$COOKIE_VALUE" | jq --color-output .

{
  "runningQueries": 8,
  "blockedQueries": 0,
  "queuedQueries": 0,
  "activeCoordinators": 1,
  "activeWorkers": 35,
  "runningDrivers": 3957,
  "totalAvailableProcessors": 2450,
  "reservedMemory": 2770000473,
  "totalInputRows": 1133212564136,
  "totalInputBytes": 10872687401451,
  "totalCpuTimeSecs": 777021
}
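
The login step above takes the seventh field of the last line that curl writes with --cookie-jar. That works as long as the token happens to be the last cookie. A slightly more defensive sketch picks the cookie by name instead. It assumes curl's tab-separated Netscape cookie-jar format, where the sixth field is the cookie name and the seventh is its value. cookie_value is a hypothetical helper name.

```shell
# Hypothetical helper: read a Netscape-format cookie jar on stdin and print
# the value of the named cookie (default: Presto-UI-Token).
cookie_value() {
  awk -F '\t' -v name="${1:-Presto-UI-Token}" '
    $6 == name { value = $7 }             # remember the latest match
    END { if (value != "") print value }'
}

# Usage (same login request as above):
#   COOKIE_VALUE=$(curl --location --request POST 'https://some.company.com/ui/login' \
#       --data-urlencode 'username=john.humphreys' \
#       --data-urlencode 'password=<password>' \
#       --cookie-jar - --output /dev/null --silent | cookie_value)
```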

Mac – Make the Dock Wait Longer Before Appearing

If the dock at the bottom of your Mac is getting in your way when you try to do quick actions, like using a horizontal scroll-bar in a full-screen app, then you can use this CLI setting to bump up the delay to a few seconds.

I find 3 seconds is enough to get most things in that area of the screen done, but is also short enough that using the Dock on purpose isn’t too painful.

defaults write com.apple.Dock autohide-delay -float 3; killall Dock

I found this on Super User after digging around for a while: https://superuser.com/a/406571.

Listing Supported Availability Zones (AZs) for Instance Types in AWS

Availability Zones

In Amazon Web Services (AWS), you generally spread your nodes over multiple availability zones for high availability. Unfortunately, not every instance type is available in every availability zone, and it is hard to know in advance which zones support a given type.

Error Types

If you are provisioning a single EC2 instance, or an Auto Scaling Group (ASG) in a single zone, you will notice right away if you chose an incompatible zone for your instance type, as it just won’t work.

It can be more insidious when you have an ASG with multiple zones, though. For example, our large-scale Airflow service runs in Kubernetes, and the main ASG spans 3 zones. Today, we ran out of IPs in two zones and realized that the third was not being utilized at all. When hunting down why, we found this message on the “activity” page for the ASG.

Launching a new EC2 instance. Status Reason: Your requested instance type (r5.2xlarge) is not supported in your requested Availability Zone (us-east-1e). Please retry your request by not specifying an Availability Zone or choosing us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1f. Launching EC2 instance failed.

This is a very helpful message, but it’s unfortunate that we had to do the wrong thing in order to get the supported zones list.

Getting the Correct Zones in Advance

You can use this AWS CLI (V2) command to check the list of zones supported for an instance type in advance.

% aws ec2 describe-instance-type-offerings --location-type availability-zone --filters="Name=instance-type,Values=r5.2xlarge" --region us-east-1 --output table
-------------------------------------------------------
|            DescribeInstanceTypeOfferings            |
+-----------------------------------------------------+
||               InstanceTypeOfferings               ||
|+--------------+--------------+---------------------+|
|| InstanceType |  Location    |    LocationType     ||
|+--------------+--------------+---------------------+|
||  r5.2xlarge  |  us-east-1f  |  availability-zone  ||
||  r5.2xlarge  |  us-east-1c  |  availability-zone  ||
||  r5.2xlarge  |  us-east-1b  |  availability-zone  ||
||  r5.2xlarge  |  us-east-1d  |  availability-zone  ||
||  r5.2xlarge  |  us-east-1a  |  availability-zone  ||
|+--------------+--------------+---------------------+|
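
To catch the multi-zone ASG problem before it bites, you could compare the zones you plan to use against the supported list. The sketch below is written for bash or POSIX sh, since it relies on word splitting that zsh does not do by default. unsupported_zones is a hypothetical helper, and the configured zone list is typed in by hand as an assumption. The --query and --output text options are standard AWS CLI features that reduce the output to zone names.

```shell
# Hypothetical helper: print each configured zone that is missing from the
# supported list. Both arguments are whitespace-separated zone names.
unsupported_zones() {
  configured="$1"
  # Normalize tabs/newlines to spaces and pad so whole names can be matched.
  supported=" $(printf '%s' "$2" | tr '\t\n' '  ') "
  for zone in $configured; do
    case "$supported" in
      *" $zone "*) ;;                  # supported, nothing to report
      *) printf '%s\n' "$zone" ;;      # not offered in this zone
    esac
  done
}

# Usage:
#   SUPPORTED=$(aws ec2 describe-instance-type-offerings \
#       --location-type availability-zone \
#       --filters "Name=instance-type,Values=r5.2xlarge" \
#       --region us-east-1 \
#       --query 'InstanceTypeOfferings[].Location' --output text)
#   unsupported_zones "us-east-1a us-east-1b us-east-1e" "$SUPPORTED"
```

With the offerings shown in the table above, that call would report us-east-1e, the zone from the ASG error message.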

Sources

You can find some information on this from AWS at this link.