Terraform on Docker – Run Using Current Directory as Volume

Quick Tip

You can use the following command to run a terraform apply using the current directory as the volume. This is handy if, say, you do a git checkout of your repository and want to run the Terraform files straight from the checkout folder.

docker run -it -v $(pwd):/workspace -w /workspace hashicorp/terraform:light apply
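
If the checkout has not been initialized yet, the same volume mount works for the other subcommands too. Here is a hedged sketch, assuming the same image and mount point as above and that your provider credentials are exported on the host (the AWS variable names shown are just one example):

# Initialize providers/modules in the mounted checkout, then review the plan.
docker run -it -v $(pwd):/workspace -w /workspace hashicorp/terraform:light init
docker run -it -v $(pwd):/workspace -w /workspace \
  -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
  hashicorp/terraform:light plan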


Shut Down All Docker Containers Based on Internal Analysis – JupyterHub Example

I manage a few decent-sized JupyterHub environments based on DockerSpawner.  Each frequently has more than 50 users, sometimes many more… and recently, one of the servers ran out of memory.

I have some read-only notebooks inside the user containers… so I figured that if a user only had those read-only notebooks, I could shut down their Docker containers.  They weren’t doing any work that could be lost.

So, I wrote this script to:

  1. List all docker containers.
  2. Get their names.
  3. Exec a bash command in them.
  4. Shut them down based on the result.

I hope it helps you with a similar docker-related issue! 🙂

# Pull the container name column out of `docker container ls`.
# (The awk field index depends on your ls output format; adjust it if needed.)
CONTAINERS=`docker container ls | awk '{print $14}'`
for NAME in ${CONTAINERS}
do
  #echo ${NAME}
  # Count the notebooks in the container's working directory, ignoring the
  # .ipynb_checkpoints folder.
  COUNT=`docker exec ${NAME} ls -a | grep '\.ipynb' | grep -v checkpoints | wc -l`
  # Only the read-only notebook is present, so there is no user work to lose.
  if [[ $COUNT = 1 ]];
  then
    echo "Stopping $NAME with COUNT = $COUNT."
    docker container stop $NAME;
  fi
done
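
If the awk field index does not line up with your output (the column count shifts with image names and port mappings), one alternative for the first line of the script is docker’s built-in formatting, which prints just the container names:

CONTAINERS=`docker container ls --format '{{.Names}}'`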

AWS + Terraform + Auto Scale Group + User Data Bash Script on Startup to Customize Image

User Data – On Startup

If you want to customize your VM image on its first start-up, you may want to use “user data”.  You can think of this as a script that runs right after boot-up the very first time.  With some extra cloud-init configuration it can also be made to run on every reboot.
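
For the run-on-every-boot case, cloud-init (which processes EC2 user data on most Linux AMIs) can be told to re-run its user-script step on each boot. The snippet below is just a sketch of the cloud-config directive involved; in practice it is combined with your shell script in a MIME multipart user data payload, which I am not showing here:

#cloud-config
# Ask cloud-init to run the user-data script on every boot, not only the first.
cloud_final_modules:
- [scripts-user, always]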

Why would you need this?  Well, in my case, I was spawning up a Presto cluster.  I generally do this in a special HA way… but even if you did it the simple way, you would have 1 coordinator and N workers, and the N workers would have to point at your 1 coordinator.

So, there are 2 interesting things here:

  1. The coordinator and workers are identical barring some slightly different configuration in one file.
  2. The workers need to know about the coordinator in order to use it.

So, for both of these cases, we’d like to run a script on start-up!
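
For example, the user data script on a worker could write the coordinator’s address into the worker’s Presto configuration. This is only a hedged sketch: the config path and port are made up for illustration, and COORDINATOR_DNS stands in for a value the script would receive as a parameter.

#!/bin/bash
# Hypothetical fragment of a worker's user data script: mark this node as a
# worker and point service discovery at the coordinator.
cat > /etc/presto/config.properties <<EOF
coordinator=false
http-server.http.port=8080
discovery.uri=http://COORDINATOR_DNS:8080
EOF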

The Terraform Code

When you want to create an auto-scale-group, you have to start by creating a launch template: https://www.terraform.io/docs/providers/aws/r/launch_template.html.

You can use that template to spawn up multiple auto-scale groups when it is done.  The launch template itself holds the user data though.  So, you are best off trying to make your user data script generic enough that it can work for all your cases.  It can be a bash file and can use variables, so this isn’t too hard.
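
To make the relationship concrete, here is a minimal hedged sketch of a launch template being consumed by an auto-scale group. The resource names, AMI variable, instance type, subnet variable, and sizes are all made up for illustration.

resource "aws_launch_template" "worker" {
  name_prefix   = "presto-worker-"
  image_id      = var.ami_id
  instance_type = "m5.xlarge"
  # user_data goes here; see below for how to populate it.
}

resource "aws_autoscaling_group" "workers" {
  min_size            = 1
  max_size            = 5
  desired_capacity    = 3
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = aws_launch_template.worker.id
    version = "$Latest"
  }
}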

If you do need multiple separate user data scripts, you’ll have to use separate launch templates, which is not the end of the world either.

The launch template in the link above is very complete, so all I’m going to show you is how to pass a bash script that takes parameters to the user data.

Basically replace:

user_data = "${base64encode(...)}"

In their example with something like this:

user_data = base64encode(templatefile("${path.module}/worker-script.sh", {
  coordinator_lb  = "${aws_lb.coordinator.dns_name}",
  hive_thrift_csv = "${var.hive_thrift_csv}"
}))

Assuming your worker-script has content like this:

#!/bin/bash
echo "Hello World" > /tmp/test-output.txt

and you have the hive_thrift_csv variable defined in your variables file like this:

variable "hive_thrift_csv" {
type = "string"
default = "thrift://ip-addr-1:9083,thrift://ip-addr-2:9083"
}

you should be good. Note that the first variable definition, coordinator_lb = “${aws_lb.coordinator.dns_name}”, is a reference to the DNS name of a load balancer created in another part of my Terraform config. I left it in as it’s a good example of a more complex, separately-defined variable.
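
One last detail: because templatefile interpolates ${…} references when it renders the file, worker-script.sh can use the values from that map directly. A hedged sketch (the output path is just for illustration):

#!/bin/bash
# Hypothetical use of the variables Terraform passes into the template:
echo "coordinator lb: ${coordinator_lb}" > /tmp/presto-startup.txt
echo "hive metastores: ${hive_thrift_csv}" >> /tmp/presto-startup.txt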