⚠ Kubernetes Ingress Best Practice ⚠

Generally, when you set up an ingress controller in a k8s cluster, it is exposed as a NodePort service across all of your nodes, and the associated load balancer distributes traffic across every node in the cluster (round robin or a similar algorithm). This applies to NGINX, ALB, and other ingress controllers.

We have seen this cause various problems at scale over the years. Two good examples are…

(1) An ALB ingress registers every node as a target for every rule, so you can quickly hit the AWS Service Quota for targets per load balancer (1,000 by default) as the number of rules times the number of nodes grows. For example, 20 rules across 60 nodes is already 1,200 targets.

(2) If your cluster auto-scales a lot, there is a non-trivial chance your LB will route traffic through a node that scales down while requests are in flight to your actual ingress pod. The network path is User -> LB -> Any Cluster Node -> Ingress Pod -> Service -> Target Pod. In the past we saw this drop traffic and, for example, interrupt a user communicating with Trino. We hit it in various cases, but proper node draining may avoid it, so I can’t confirm whether it is still an issue at this point.

In any case, it is a *very* good idea to use your ingress controller’s configuration to target, via labels, a specific set of nodes that is finite and does not scale much. We tend to create a “main” node group for this, holding the ingress controllers, CoreDNS, the Kubernetes dashboard, and similar components. A minimal sketch of pinning a controller this way follows.
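
For example, the community ingress-nginx Helm chart exposes a controller.nodeSelector value you can point at labeled nodes. This is just a sketch; the label key/value (node-group=main) and the release/namespace names are assumptions for illustration.

# Label the nodes that should carry ingress traffic (example label).
kubectl label nodes <node-name> node-group=main

# Install (or upgrade) ingress-nginx pinned to those nodes via nodeSelector.
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.nodeSelector.node-group=main

The same idea applies to other controllers; the point is that the set of nodes the load balancer registers stays small and stable.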

Migrate DockerHub Images to GitLab: Script

#!/bin/bash
set -euo pipefail

CRED="<your-rw-deploy-token-for-the-gitlab-project>"

# Change these for the target image / group.
DOCKERHUB_IMAGE="openjdk:11-jre-slim"
GITLAB_REGISTRY="registry.gitlab.com/your/project/path"

# Calculate the GitLab image name.
GITLAB_IMAGE="$GITLAB_REGISTRY/$DOCKERHUB_IMAGE"

# Pull the image from DockerHub.
docker pull "$DOCKERHUB_IMAGE"

# Tag the image with the GitLab Container Registry path.
docker tag "$DOCKERHUB_IMAGE" "$GITLAB_IMAGE"

# Push the image to the GitLab Container Registry.
# Piping the token avoids leaving it in your shell history / process list.
echo "$CRED" | docker login registry.gitlab.com -u unused --password-stdin
docker push "$GITLAB_IMAGE"

You can generate deploy tokens (R/W) in your project settings, and a group-level token will let this operate across multiple projects in the group. If you have more than one image to migrate, the same steps can be wrapped in a loop, as sketched below.
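
This is just a sketch of that loop; the image list is an example, so replace it with whatever you actually need to move.

#!/bin/bash
set -euo pipefail

CRED="<your-rw-deploy-token-for-the-gitlab-project>"
GITLAB_REGISTRY="registry.gitlab.com/your/project/path"

# Example list; replace with the images you actually need to migrate.
IMAGES=("openjdk:11-jre-slim" "python:3.11-slim" "nginx:1.25")

echo "$CRED" | docker login registry.gitlab.com -u unused --password-stdin

for IMAGE in "${IMAGES[@]}"; do
  docker pull "$IMAGE"
  docker tag "$IMAGE" "$GITLAB_REGISTRY/$IMAGE"
  docker push "$GITLAB_REGISTRY/$IMAGE"
done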

Sorting S3 Buckets by Size

It can be fairly hard to rank your S3 buckets by size, especially with Intelligent-Tiering enabled. Here is a concise script that uses CloudWatch metrics to find the size of every bucket in your account and print the top 10 in sorted order.

import boto3
import pandas as pd
from datetime import datetime, timedelta
import logging

# Configure logging
logging.basicConfig(format='%(asctime)s %(levelname)s: %(message)s', level=logging.INFO)

# Connect to CloudWatch
cloudwatch = boto3.client('cloudwatch')

# Connect to S3
s3 = boto3.resource('s3')

# Define a function to get the BucketSizeBytes metric data for a given bucket and storage type
def get_metric_data(bucket, storage_type):
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/S3',
        MetricName='BucketSizeBytes',
        Dimensions=[
            {'Name': 'BucketName', 'Value': bucket},
            {'Name': 'StorageType', 'Value': storage_type}
        ],
        StartTime=datetime.utcnow() - timedelta(days=3),
        EndTime=datetime.utcnow(),
        Period=86400,
        Statistics=['Maximum']
    )
    datapoints = response['Datapoints']
    if datapoints:
        return max([datapoint['Maximum'] for datapoint in datapoints])
    else:
        return 0

# Log before pulling the list of bucket names
logging.info("Getting list of bucket names...")

# Get all buckets in the account
buckets = [bucket.name for bucket in s3.buckets.all()]

# Storage types to sum: Standard plus the Intelligent-Tiering access tiers.
# (Other classes, e.g. Standard-IA or Glacier, have their own StorageType
# values and are not included here.)
storage_types = [
    'StandardStorage',
    'IntelligentTieringIAStorage',
    'IntelligentTieringFAStorage',
    'IntelligentTieringAIAStorage',
]

# Pull the metric for each storage type and sum up the size of every bucket.
bucket_sizes = {}
for bucket in buckets:
    logging.info(f"Working on bucket: {bucket}...")
    bucket_sizes[bucket] = sum(get_metric_data(bucket, st) for st in storage_types)

# Convert the results to a Pandas dataframe and display without truncation
df = pd.DataFrame.from_dict(bucket_sizes, orient='index', columns=['Size (Bytes)'])
df['Size (TBs)'] = df['Size (Bytes)'] / (1024 ** 4)
df = df[['Size (TBs)']].sort_values(by='Size (TBs)', ascending=False).head(10)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.width', None)
pd.set_option('display.float_format', '{:.2f}'.format)
print(df)
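
To run it, you just need boto3 and pandas installed and AWS credentials configured; the file name below is only an example. Keep in mind that CloudWatch publishes these S3 storage metrics in each bucket’s region, so the region your client points at determines which buckets report data.

pip install boto3 pandas
python s3_bucket_sizes.py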

Kubectl – View Pods Per Node in Kubernetes

You can use this command to view how many pods are on each node in Kubernetes using just kubectl.

kubectl get pods -A -o=custom-columns=NODE:.spec.nodeName --no-headers | sort | uniq -c | sort -n

In our case, we have a limit of 25 pods per node, so DaemonSets fail to roll out to nodes that already hold 25 pods, which makes this command very handy. You can check the pod limit a node actually advertises with the command below.
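
The allocatable pod count lives on the node object itself:

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods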

It can also be helpful when decommissioning nodes, as you track pods draining off of them.
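
And if you want to see exactly which pods sit on one particular node (the node name is a placeholder), a field selector does it:

kubectl get pods -A --field-selector spec.nodeName=<node-name>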

Querying LDAP From Python the Easy Way

Historically, using LDAP in Python could be fairly painful because you had to install python-ldap, which can be hard to build depending on your environment. For example, getting it installed in a Jupyter notebook where I work proved impossible without changing the notebook’s underlying Docker image.

Most search results will still lead you to python-ldap, but these days you can and should use ldap3 instead. That library is pure Python and has no awkward OS-level dependencies, so it “just works” and is much lighter.

Here is an example of how to log in with a service account and query a user by email.

import ldap3

# Parameters up top.
SERVICE_ACCOUNT = "<user>"
SERVICE_ACCOUNT_PASSWORD = "<password>"
LDAP_URI = "ldaps://<your-ldap-dns>:636"

search_base = 'DC=foo,DC=bar'
search_filter = '(&(mail=john.doe@somecompany.com))'
attrs = ["*"]

server = ldap3.Server(LDAP_URI)
with ldap3.Connection(server, auto_bind=True, user=SERVICE_ACCOUNT, password=SERVICE_ACCOUNT_PASSWORD) as conn:
    conn.search(search_base, search_filter, attributes=attrs)
    print(len(conn.entries))
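
Each match is then available on conn.entries; continuing from the example above (the attribute names you get back depend on your directory’s schema):

# After conn.search(...), each result is available on conn.entries.
for entry in conn.entries:
    print(entry.entry_dn)                  # Distinguished name of the match.
    print(entry.entry_attributes_as_dict)  # All returned attributes as a dict.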

To run this, you just need a quick pip install, as shown below. I recommend using the latest version; I only pinned one here as a reminder that pinning versions is smart in most Python projects, since version drift causes many production issues.

pip install ldap3==2.9.1