⚠ Kubernetes Ingress Best Practice ⚠

Generally, when you set up an ingress controller in a k8s cluster, it is exposed as a NodePort Service on every node, and the associated load balancer sends traffic across all nodes in your cluster (round robin or similar). This happens with NGINX, ALB, etc.
We have seen this cause various problems at scale over the years. Two good examples are…
(1) An ALB ingress registers every node as a target for every rule, so you quickly hit the AWS Service Quota for targets per load balancer (1,000 by default) as # of rules × # of nodes grows (e.g. 40 rules × 30 nodes is already 1,200 targets).
(2) If your cluster auto-scales a lot, there is a non-trivial chance the LB will route a request through a node that scales down before the request reaches your actual ingress pod. The network path is User -> LB -> any cluster node -> ingress pod -> Service -> target pod. In the past we have seen this drop traffic and, e.g., interrupt a user communicating with Trino. We saw it in various cases, but proper node draining would likely avoid it, so I can’t confirm whether it is still an issue at this point.
In any case, it is a *very* good idea to use your ingress controller’s config to target, via labels, a specific, finite set of nodes that doesn’t scale much. We tend to create a “main” node group for this, holding the ingress controllers, CoreDNS, the Kubernetes dashboard, and similar components; a sketch of that setup is below.
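As a minimal sketch of that setup, assuming the ingress-nginx Helm chart and a hypothetical node-group=main label (plus an optional dedicated taint) on the “main” node group — the label, taint, and values here are illustrative, not a drop-in config:

```yaml
# values.yaml for the ingress-nginx Helm chart (sketch).
# Assumes the "main" node group's nodes carry the label node-group=main and,
# optionally, the taint dedicated=main:NoSchedule to keep other workloads off.
controller:
  # Schedule the controller pods only onto the finite "main" node group
  nodeSelector:
    node-group: main
  # Tolerate the dedicated taint if the node group is reserved for system workloads
  tolerations:
    - key: dedicated
      operator: Equal
      value: main
      effect: NoSchedule
```

The same idea applies to CoreDNS, the dashboard, etc. via their own nodeSelector/tolerations. For ALB instance targets, the AWS Load Balancer Controller’s alb.ingress.kubernetes.io/target-node-labels annotation can likewise restrict which nodes get registered with the load balancer.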