Understanding and Checking Your Docker Hub Rate Limit

We’ve been hitting Docker Hub rate limiting pretty hard lately in our EKS clusters. Here are some interesting things we learned:

  • The anonymous request rate limit for Docker Hub is 100 pulls per IP address per 6-hour window (the w=21600 in the headers below is that window in seconds).
  • If you are in a private IP space and egress through a small number of gateways, you are probably being rate limited on the public IPs of those gateways, not per server.
  • So, if you have 600 servers going through 6 gateways, your budget is 6 × 100 = 600 pulls per window, not 600 × 100 = 60,000 (obviously a massive difference).
  • In Kubernetes, you should specify an explicit image tag (tags are not mandatory, and omitting one implies :latest, which defaults the pull policy to Always) and set imagePullPolicy: IfNotPresent so that you pull images less frequently (see the audit one-liner after this list).
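
As a quick audit of that last point, this one-liner (just a sketch; it only assumes you have kubectl access to the cluster) prints the image and effective pull policy for every container, so you can spot :latest tags and Always policies:

# Image and imagePullPolicy for every container in every pod, all namespaces.
# Pods using ":latest" or policy "Always" are the ones most likely to hammer Docker Hub.
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\t"}{.spec.containers[*].imagePullPolicy}{"\n"}{end}'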

If you want to see how your servers are tracking against the rate limit, you can refer here -> https://www.docker.com/blog/checking-your-current-docker-pull-rate-limits-and-status/.

For anonymous requests, basically just run:

TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)

curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest 2>&1 | grep -i ratelimit

And you will get output like this, showing the rate limit (100), how many pulls you have left (100 for me, as I haven’t pulled recently), and the window in seconds (w=21600, i.e. 6 hours).

RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 100;w=21600
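
If you pull as an authenticated user instead (which, at the time of writing, raises the limit to 200 pulls per 6 hours for free accounts and ties it to your account rather than the shared gateway IP), the same check works. This is a sketch with placeholder credentials, following that same Docker blog post:

# Same check, but the token is tied to your Docker Hub account rather than your IP.
TOKEN=$(curl --user 'your-username:your-password' "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)

curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest 2>&1 | grep -i ratelimit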

Debugging a Spring Boot Multi-Stage Docker Build – JAR Extraction Not Working

Containerized Spring Boot – Multi-Stage Build

I was following this good tutorial on deploying Spring Boot apps to Kubernetes:

https://spring.io/guides/gs/spring-boot-kubernetes/

Their Dockerfile looks like this:

FROM openjdk:8-jdk-alpine AS builder
WORKDIR target/dependency
ARG APPJAR=target/*.jar
COPY ${APPJAR} app.jar
RUN jar -xf ./app.jar

FROM openjdk:8-jre-alpine
VOLUME /tmp
ARG DEPENDENCY=target/dependency
COPY --from=builder ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY --from=builder ${DEPENDENCY}/META-INF /app/META-INF
COPY --from=builder ${DEPENDENCY}/BOOT-INF/classes /app
ENTRYPOINT ["java","-cp","app:app/lib/*","com.example.demo.DemoApplication"]

So, it is using a multi-stage Docker build.  The reasoning for this seems to be described well in this other article: https://spring.io/blog/2018/11/08/spring-boot-in-a-container.  Basically:

A Spring Boot fat jar naturally has “layers” because of the way that the jar itself is packaged. If we unpack it first, it is already divided into external and internal dependencies. To take advantage of this in the Docker build, we need to unpack the jar first, which is exactly what the builder stage in the Dockerfile above does (sticking with Maven, but the Gradle version is pretty similar).

There are now 3 layers (dependencies, META-INF, and application classes), with all the application resources in the later 2 layers. If the application dependencies don’t change, then the first layer (from BOOT-INF/lib) will not change, so the build will be faster, and so will the startup of the container at runtime, as long as the base layers are already cached.
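
If you want to see that layout for yourself outside of Docker, you can unpack the fat jar locally; a quick sketch, run from the project root after a normal Maven package:

# Unpack the fat jar into target/dependency, mirroring what the builder stage does,
# then confirm the directories the later COPY steps rely on are present.
mkdir -p target/dependency
(cd target/dependency; jar -xf ../*.jar)
ls target/dependency    # expect BOOT-INF, META-INF, and org (the Spring Boot loader classes)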

Build Failing

In my case, I was getting an error like this:

Step 9/12 : COPY --from=0 ${DEPENDENCY}/BOOT-INF/lib /app/lib
COPY failed: stat /var/lib/docker/overlay2/72b76f414c0b527f00df9b17f1e87afb4109fa6448a1466655aeaa5e9e358e27/merged/target/dependency/BOOT-INF/lib: no such file or directory

This was a little confusing.  At first I assumed that the overlay file system path didn’t exist.  When I tried to go there (to the 72… folder), it was in fact not present.  This turned out to be a red herring, though.

The real problem is that it is failing to copy the BOOT-INF/ content.  So, on a hunch, I assumed the 72… folder path was ephemeral.  If that is true, then it did exist at some point during the build, and the path simply did not have any BOOT-INF content in it.

So, the next step was to debug the intermediate stage of the multi-stage build.  In my Dockerfile, the first line is:

FROM openjdk:8-jdk-alpine AS builder

So, the name of the intermediate build stage is “builder”.  We can stop the build at that stage and interrogate the resulting image by doing this:

docker build . --target=builder
Sending build context to Docker daemon   17.2MB
Step 1/5 : FROM openjdk:8-jdk-alpine AS builder
 ---> a3562aa0b991
Step 2/5 : WORKDIR target/dependency
 ---> Using cache
 ---> 1dbeb7fad4c0
Step 3/5 : ARG APPJAR=target/*.jar
 ---> Using cache
 ---> 67ba4a4a1863
Step 4/5 : COPY ${APPJAR} app.jar
 ---> Using cache
 ---> d9da25b3bc23
Step 5/5 : RUN jar -xf ./app.jar
 ---> Using cache
 ---> b513f7190731
Successfully built b513f7190731

And then interactively running in the built image (b513f7190731):

docker run -it b513f7190731

In my case, I could clearly see that basically no files existed in the target/dependency folder.  There was only the JAR itself, which had never been extracted by the “jar -xf ./app.jar” command.

Debugging the JAR Extraction

I tried running the jar extraction command manually myself, and it gave no error and did absolutely nothing.  After that, I went to make sure the JAR was valid.  Its size looked fine, but when I opened it with “vi”, I could see that the file started with a bash script rather than a normal JAR/zip header.
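
A quicker way to confirm this than opening the file in vi is just to look at the first bytes of the jar; a small sketch (run it wherever the jar is, e.g. inside the builder stage):

# A normal jar begins with the zip magic bytes "PK"; a Spring Boot "fully executable"
# jar begins with an embedded shell script (#!/bin/bash ...) instead, which is why
# "jar -xf" was silently doing nothing.
head -c 100 app.jar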

This is because my Spring Boot project had the spring-boot-maven-plugin in its POM with <executable>true</executable> in its configuration; this embeds a launch script at the front of the JAR so that it can run directly as a Linux init.d service.

So, my first move was to remove the plugin altogether.  This did not help: after that, the JAR did extract, but it had no BOOT-INF folder.  With the plugin removed, Spring Boot was no longer repackaging the fat JAR with its dependencies.  So, I had to put the plugin back in like this:

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
</plugin>

With the executable = true configuration removed.  This builds the fat JAR without the embedded bash launch script, which allows the “jar -xf ./app.jar” command to properly extract the contents, including the BOOT-INF dependencies.
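
After making that change, a quick sanity check before re-running the Docker build (a sketch; it assumes the Maven wrapper is present, but plain mvn works the same way):

# Rebuild the fat jar, then confirm it actually contains BOOT-INF entries this time.
./mvnw clean package
jar -tf target/*.jar | grep BOOT-INF | head -n 5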

This was not a problem in the tutorial because it relies on a project generated from Spring Initializr, rather than an existing project you made yourself that may carry configuration for other use cases.

Summary

I hope this helped you learn about putting spring-boot in docker, and that it shed some light on how to debug a multi-stage docker build!  I knew a fair bit about docker and spring-boot going into this, but combining them revealed some process differences I had taken for granted over the years.  So, it was a good experience.

Minikube ImagePullBackOff – Local Docker Image

Background Context

Earlier today I was beginning to port our Presto cluster into kubernetes.  So, the first thing I did was containerize Presto and try to run it in a kubernetes deployment in minikube (locally).

I’m fairly new to minikube.  So far, I’m running it with --vm-driver=none so that it uses the local Docker instance rather than a VirtualBox VM, etc.

The Error

So, I got my Docker image building well and tested it within Docker itself.  That all worked great.  Then I wrote my Kubernetes deployment and ran it using the image… but unfortunately, the pod came up with Error: ImagePullBackOff.

I went down a rabbit hole for a while after this, because many posts talk about how to give minikube access to your local Docker images.  But when you’re running with --vm-driver=none, you are literally running against your local Docker daemon – so it should already have access to everything.

The actual error is: “Failed to pull image “qaas-presto:latest”: rpc error: code = Unknown desc = Error response from daemon: pull access denied for qaas-presto, repository does not exist or may require ‘docker login’: denied: requested access to the resource is denied”.

So, the issue is that it is trying to do a pull and cannot find the image in any registry… but it should not need to pull at all, because the image already exists locally (it was built locally).  The pull happens because the tag is :latest, which makes Kubernetes default the imagePullPolicy to Always.
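
Since --vm-driver=none shares the host’s Docker daemon, you can confirm the image really is there with plain Docker commands (qaas-presto is my image name; substitute your own):

# With the none driver, minikube and your shell talk to the same Docker daemon,
# so the locally built image should show up here.
docker images | grep qaas-presto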

Workaround

I found the workaround in this GitHub issue: https://github.com/kubernetes/minikube/issues/2575. Basically, in your deployment/pod spec, you just set:

imagePullPolicy: Never

This stops it from trying to pull the image at all, so it never fails to find it.  It just assumes the image is present (which it is), uses it, and moves on.  You may not want to ship this setting to production as-is, but you can always template it out with Helm or something, so it’s a viable workaround.
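
If the deployment is already applied, you can also flip the policy in place instead of editing the YAML; a rough sketch, assuming a hypothetical Deployment named qaas-presto with the image in its first container:

# Set the first container's imagePullPolicy to Never, then wait for the rollout.
kubectl patch deployment qaas-presto --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/containers/0/imagePullPolicy","value":"Never"}]'
kubectl rollout status deployment/qaas-presto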


Install PGAdmin Server With Docker

You can get pgAdmin 4 running in server mode with Docker very easily.  This command will set up the server, configure it to always restart in response to reboots or errors, and ensure that its data (users, config) is persisted between container runs.

docker pull dpage/pgadmin4

docker run -p 80:80 \
    --name pgadmin \
    --restart always \
    -v "/opt/pgadmin4/data:/var/lib/pgadmin" \
    -v "/opt/pgadmin4/config/servers.json:/servers.json" \
    -e "PGADMIN_DEFAULT_EMAIL=user@domain.com" \
    -e "PGADMIN_DEFAULT_PASSWORD=SuperSecret" \
    -d dpage/pgadmin4

You can run this command afterward to see the details and confirm the restart policy as well if you like:

docker inspect pgadmin
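
If you only care about the restart policy, you can pull just that field out with Docker’s Go-template format flag; for example:

# Prints just the restart policy, e.g. "always".
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' pgadmin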


JupyterHub or JupyterLab – Back Up All User Docker Container Files

If, like me, you deployed the JupyterHub Docker spawner without persistent volumes and then ended up with tons of users having tons of content to lose, this may help you.

This short bash script lists all containers, copies out their notebook contents, and zips them up to save space.

#!/bin/bash

mkdir -p ~/notebook-backup
cd ~/notebook-backup

# List the names of all running containers (more robust than parsing the column output of `docker container ls`).
CONTAINERS=$(docker container ls --format '{{.Names}}')
for NAME in ${CONTAINERS}
do
  echo "Copying out files for ${NAME}; this may take a minute."
  docker cp ${NAME}:/home/notebook ./${NAME}

  echo "Zipping files for ${NAME}."
  tar -zcvf ${NAME}.tar.gz ${NAME}

  echo "Removing source files for ${NAME}."
  rm -rf ./${NAME}
done
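
Restoring is roughly the reverse; a sketch for a single user, where NAME is whichever archive you are restoring and the target container already exists with the same /home/notebook layout:

# Restore one user's backup into their (re-created) container.
NAME=some-user-notebook              # hypothetical container name
tar -zxvf ${NAME}.tar.gz
docker cp ./${NAME}/. ${NAME}:/home/notebook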