Understanding and Checking/Analyzing Your DockerHub Rate Limit

We’ve been hitting docker rate limiting pretty hard lately in our EKS clusters. Here are some interesting things we learned:

  • The anonymous request rate limit for DockerHub is 100 requests per IP address per hour.
  • If you are in a private IP space and have internet gateways, you are probably being rate limited on the IPs of the gateways.
  • So, if you have 600 servers going through 6 gateways, you have 600 requests, not 60,000 (obviously this is a massive difference).
  • In kubernetes, you should specify an image tag (which is not mandatory) and pull-if-not-present in order to ensure you pull images less frequently.

If you need to observe your servers and how they are acting with the rate limit, you can refer here -> https://www.docker.com/blog/checking-your-current-docker-pull-rate-limits-and-status/.

For anonymous requests, basically just run:

TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)

curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest 2>&1 | grep RateLimit

And you will get output like this, showing the rate limit (100) and how many you have left (100 for me as I haven’t pulled recently).

RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 100;w=21600

Minikube ImagePullBackOff – Local Docker Image

Background Context

Earlier today I was beginning to port our Presto cluster into kubernetes.  So, the first thing I did was containerize Presto and try to run it in a kubernetes deployment in minikube (locally).

I’m fairly new to minikube.  So far, I’m running it with vm-driver=None so that it uses the local docker instance rather than a virtualbox VM/etc.

The Error

So, I got my docker image building well and tested it within docker itself.  That all worked great.  Then I wrote my kubernetes deployment and ran it using the image… but unfortunately, it came up with the pod saying Error: ImagePullBackOff.

I went down a rabbit hole for a while after this because many posts talk about how to enable your minikube to have access to your local docker repo.   But when you’re running vm-driver=None, you are literally running in your local docker – so it should already have access to everything.

The actual error is: “Failed to pull image “qaas-presto:latest”: rpc error: code = Unknown desc = Error response from daemon: pull access denied for qaas-presto, repository does not exist or may require ‘docker login’: denied: requested access to the resource is denied”.

So, the issue is that it’s trying to do a pull and it can’t find the image… but it shouldn’t need to pull because the image already exists locally as it was built locally.

Workaround

I found the workaround in this github entry: https://github.com/kubernetes/minikube/issues/2575. Basically, in your deployment/pod spec/whatever, you just set:

imagePullPolicy: Never

This makes it avoid trying to pull the image, so it never fails to find it.  It just assumes it is present, which it is, and it uses it and moves on.  You may not necessarily want to deploy your config to production with these settings, but you can always template them out with helm or something, so it’s a viable workaround.

 

 

Install PGAdmin Server With Docker

You can get PGAdmin 4 running in server mode with docker very easily.  Using this command will set up the server, set it to always restart in response to reboots or errors, and it will ensure that its data (users, config) is persisted between container runs.

docker pull dpage/pgadmin4

docker run -p 80:80 \
    --name pgadmin \
    --restart always \
    -v "/opt/pgadmin4/data:/var/lib/pgadmin" \
    -v "/opt/pgadmin4/config/servers.json:/servers.json" \
    -e "PGADMIN_DEFAULT_EMAIL=user@domain.com" \
    -e "PGADMIN_DEFAULT_PASSWORD=SuperSecret" \
    -d dpage/pgadmin4

You can run this command afterward to see the details and confirm the restart policy as well if you like:

docker inspect pgadmin

 

JupyterHub or JupyterLab – Back Up All User Docker Container Files

If, like me, you deployed the JupyerHub docker spawner without persistent volumes and then ended up with tons of users having tons of content to lose, this may help you.

This short bash script will list all containers and then copy out their contents and zip it up to save space.

#!/bin/bash

mkdir -p ~/notebook-backup
cd ~/notebook-backup

CONTAINERS=`docker container ls | grep -v PORTS | awk '{print $14}'`
for NAME in ${CONTAINERS}
do
  echo "Copying out files for ${NAME}; this may take a minute."
  docker cp ${NAME}:/home/notebook ./${NAME}

  echo "Zipping files for ${NAME}."
  tar -zcvf ${NAME}.tar.gz ${NAME}

  echo "Removing source files for ${NAME}."
  rm -rf ./${NAME}
done

 

Terraform on Docker – Run Using Current Directory as Volume

Quick Tip

You can use the following command to run a terraform apply using the current directory as the volume. This is great if you, say, do a git checkout of your repository and want to just run the terraform files from the checkout folder.

docker run -it -v $(pwd):/workpace -w /workpace hashicorp/terraform:light apply