Convert CA Certificate to JKS File (e.g. For Presto)

Many applications require JKS files to enable TLS (Transport Layer Security).  In case you are not sure what a JKS file is, the next section explains what one is and shows how to make a self-signed one.

Converting a CA Certificate to a JKS File

To convert the files a CA provides you into a JKS file, you can run the following commands, lightly modified from another article I followed.

# Combine the OS trust bundle and the intermediate CA certificate
# into a single PEM chain file.
cat /etc/ssl/certs/ca-bundle.crt IntermediateCA.crt > ca-certs.pem

# Bundle the server certificate, its private key, and the CA chain into
# a PKCS12 file.  The -name must match the domain on the certificate.
openssl pkcs12 -export -in ssl_certificate.crt -inkey app.key -chain -CAfile ca-certs.pem -name "*.app.company.com" -out app.p12

# Import the PKCS12 file into a new Java keystore (JKS).
keytool -importkeystore -deststorepass Password123! -destkeystore app.jks -srckeystore app.p12 -srcstoretype PKCS12

Note that the domain must match the one specified in the certificate.  Assuming these 3 commands work, you should have a proper JKS file when done.

Given the certificate is from a CA, clients should not need a copy of the JKS file to talk to servers that are using it.   For example, if my Presto server uses this JKS file, JDBC clients on other hosts can talk to it over SSL even though they do not have a copy of the file themselves.
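For example, assuming a hypothetical Presto server at server1.app.company.com with HTTPS enabled on port 8443, a client on another host can connect without any local truststore flags, because the server's certificate chains up to a well-known CA:

# Hypothetical host and port; no truststore options needed for a CA-signed cert.
presto --server https://server1.app.company.com:8443 --catalog hive --schema default

The same goes for JDBC; a connection string like jdbc:presto://server1.app.company.com:8443?SSL=true is enough.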

NOTE: The JKS file will only work properly when used against the correct domain.  E.g. if you have a load balancer at https://load-balancer.app2.company.com pointing at the server using your JKS file, which was issued for https://server1.app.company.com, it will not work.  You have to make a CNAME so your load balancer actually appears under app.company.com (what’s in the cert) and not app2.company.com.
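To sketch what that looks like with the hypothetical names above, you would add a CNAME and then confirm the certificate matches the new name:

# Hypothetical record: make the balancer reachable under the cert's domain.
#   load-balancer.app.company.com.  CNAME  load-balancer.app2.company.com.

# Check that the alias resolves.
dig +short CNAME load-balancer.app.company.com

# Confirm the certificate subject matches *.app.company.com.
openssl s_client -connect load-balancer.app.company.com:443 \
    -servername load-balancer.app.company.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject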

Creating Java Key Stores (JKS), Setting Validity Periods, and Analyzing JKS for Expiry Dates

What is a JKS File?

Taken from: https://en.wikipedia.org/wiki/Java_KeyStore

A Java KeyStore (JKS) is a repository of security certificates – either authorization certificates or public key certificates – plus corresponding private keys, used for instance in SSL encryption.

Basically, applications like Presto use JKS files to enable them to do Transport Layer Security (TLS).

How Do You Create One?

As an example, Presto, a common big-data query tool, uses JKS for both secure internal and external communication.  At this link https://prestosql.io/docs/current/security/internal-communication.html they show you how to create one.

Here is an excerpt.  Note that you must replace *.example.com with the domain of the host that will serve the application using the JKS file, or it will not work.  Certificates are domain specific.

keytool -genkeypair -alias example.com -keyalg RSA \
    -keystore keystore.jks
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  *.example.com
What is the name of your organizational unit?
  [Unknown]:
What is the name of your organization?
  [Unknown]:
What is the name of your City or Locality?
  [Unknown]:
What is the name of your State or Province?
  [Unknown]:
What is the two-letter country code for this unit?
  [Unknown]:
Is CN=*.example.com, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
  [no]:  yes

Enter key password for <example.com>
        (RETURN if same as keystore password):
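If you would rather script this than answer the prompts interactively, keytool accepts the distinguished name and passwords as flags.  A minimal sketch with placeholder values:

# Non-interactive version; replace the domain and passwords with your own.
keytool -genkeypair -alias example.com -keyalg RSA -keysize 2048 \
    -keystore keystore.jks \
    -storepass changeit -keypass changeit \
    -dname "CN=*.example.com, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=US"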

Pitfall! – 90-Day Expiry Date – Change It?

Now… here’s a fun pitfall.  Certificates are made with an expiry date and have to be reissued periodically (for security reasons).  The default validity period for certificates generated with keytool is 90 days.

https://docs.oracle.com/javase/tutorial/security/toolsign/step3.html

This certificate will be valid for 90 days, the default validity period if you don’t specify a -validity option.

This is fine on a big managed service with lots of attention.  But if you’re just TLS-securing an internal app that not many people see, you will probably forget to rotate it or neglect to set up appropriate automation.  Then, when it expires, things will break.

Now… for security reasons you generally shouldn’t set too high a value for certificate expiry time.  But for example purposes, here is how you would set it to 10 years.

# 3650 days is roughly 10 years.
keytool -genkeypair -alias example.com -keyalg RSA \
    -keystore keystore.jks -validity 3650

Determine Expiry Date

If you have a JKS file that you are using for your application and you are not sure when it expires, here’s a command that you can use:

keytool -list -v -keystore keystore.jks

This will output different things based on where your keystore came from.  E.g. you will probably see more interesting output from a real SSL cert than from a self-signed one like the one we created above.

In any case, you will clearly see a line in the large output from this command that says something like this:

Valid from: Fri Jul 12 01:27:21 UTC 2019 until: Thu Oct 10 01:27:21 UTC 2019

Note that in this case, it is valid for just three months.  Be careful when looking at this output because you may find multiple expiry dates for different components of the JKS file.  You need to make sure you read the right one.  Though, chances are that the certificate for your domain will be the one that expires earliest anyway.
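If you just want the validity lines (say, for a cron job that warns before expiry), you can filter the output; a small sketch, assuming the store password is changeit:

# Print only the validity line(s) for each entry in the keystore.
keytool -list -v -keystore keystore.jks -storepass changeit | grep -i 'valid from'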


AWS Packer CentOS 7 Example – Get AMI ID

I was very surprised to see how incredibly hard it is to determine an AMI ID in AWS for use with Packer.

I generally use CentOS 7 marketplace images for my servers; e.g. CentOS 7 (x86_64) – with Updates HVM.  There is nowhere in the AWS UI or the linked CentOS product page to actually find the AMI ID in a given region (and it does change per region).

I came across this Stack Overflow post, which was a life-saver though.  Basically, for us-east-1 as an example, you can run this command using the AWS CLI (yeah, you actually have to use the CLI – that’s how wrong this is).

# Note: both filter expressions must go in a single --filters option;
# passing --filters twice makes the second silently replace the first.
aws ec2 describe-images \
      --owners aws-marketplace \
      --filters Name=product-code,Values=aw0evgkw8e5c1q413zgy5pjce \
                "Name=name,Values=CentOS Linux 7*" \
      --query 'Images[*].[CreationDate,Name,ImageId]' \
      --region us-east-1 \
      --output table \
  | sort -r

And you get output like this:

|  2019-01-30T23:40:58.000Z|  CentOS Linux 7 x86_64 HVM EBS ENA 1901_01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-05713873c6794f575.4  |  ami-02eac2c0129f6376b  |
|  2018-06-13T15:53:24.000Z|  CentOS Linux 7 x86_64 HVM EBS ENA 1805_01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-77ec9308.4           |  ami-9887c6e7           |
|  2018-05-17T08:59:21.000Z|  CentOS Linux 7 x86_64 HVM EBS ENA 1804_2-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-55a2322a.4            |  ami-d5bf2caa           |
|  2018-04-04T00:06:30.000Z|  CentOS Linux 7 x86_64 HVM EBS ENA 1803_01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-8274d6ff.4           |  ami-b81dbfc5           |
|  2017-12-05T14:46:53.000Z|  CentOS Linux 7 x86_64 HVM EBS 1708_11.01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-95096eef.4            |  ami-02e98f78           |

The top row will be the newest image and probably the one you want (at least in my case).
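If you just want the newest AMI ID by itself (e.g. to feed straight into Packer or Terraform), you can let the query do the sorting; a sketch using the same product code as above:

# Sort by creation date and take the last (newest) image ID.
aws ec2 describe-images \
      --owners aws-marketplace \
      --filters Name=product-code,Values=aw0evgkw8e5c1q413zgy5pjce \
      --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
      --region us-east-1 \
      --output text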

I hope that saves you some precious googling time; it took me a while to find it since AWS’s less-than-admirable documentation on the subject shows up first.

Terraform on Docker – Run Using Current Directory as Volume

Quick Tip

You can use the following command to run a terraform apply using the current directory as the volume. This is great if you, say, do a git checkout of your repository and want to just run the terraform files from the checkout folder.

docker run -it -v $(pwd):/workspace -w /workspace hashicorp/terraform:light apply
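You’ll usually want to run init first, and pass cloud credentials into the container.  A sketch assuming your AWS credentials are already exported in the host shell:

# Initialize providers and modules in the mounted checkout first.
docker run -it -v $(pwd):/workspace -w /workspace hashicorp/terraform:light init

# -e VAR with no value copies the variable from the host environment.
docker run -it -v $(pwd):/workspace -w /workspace \
    -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
    hashicorp/terraform:light plan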


Shut Down All Docker Containers Based on Internal Analysis – JupyterHub Example

I manage a few decent-sized JupyterHub environments based on the Docker spawner.  Each frequently has more than 50 users, sometimes many more… and recently, one of the servers ran out of memory.

I have some read-only notebooks inside the user containers… so I figured that if a user only had those read-only notebooks, I could shut down their Docker containers.  They weren’t doing any work that could be lost.

So, I wrote this script to:

  1. List all docker containers.
  2. Get their names.
  3. Exec a bash command in them.
  4. Shut them down based on the result.

I hope it helps you with a similar docker-related issue! 🙂

# List the names of all running containers (more robust than parsing
# column positions out of the table output with awk).
CONTAINERS=`docker container ls --format '{{.Names}}'`
for NAME in ${CONTAINERS}
do
  # Count the notebook files in the container, ignoring the
  # .ipynb_checkpoints directory.
  COUNT=`docker exec ${NAME} ls -a | grep '\.ipynb' | grep -v checkpoints | wc -l`
  # If only the one read-only notebook is present, the user has no work
  # in progress, so the container is safe to stop.
  if [[ $COUNT -eq 1 ]];
  then
    echo "Stopping $NAME with COUNT = $COUNT."
    docker container stop $NAME;
  fi
done

AWS + Terraform + Auto Scale Group + User Data Bash Script on Startup to Customize Image

User Data – On Startup

If you want to customize your VM image on its first start-up, you may want to use “user data”.  You can basically think of this as a script that will be run right after boot-up the very first time.  Apparently you can also make it run on every reboot (with extra configuration).
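As an aside, two commands that are handy when debugging user data on the instance itself (standard cloud-init and EC2 metadata locations on a stock CentOS image):

# See what your user data script printed during first boot.
sudo cat /var/log/cloud-init-output.log

# Fetch the raw user data the instance was launched with.
curl http://169.254.169.254/latest/user-data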

Why would you need this?  Well, in my case, I was spawning up a Presto cluster.  I generally do this in a special HA way… but even if you did it the simple way, you would have 1 coordinator and N workers, and the N workers would have to point at your 1 coordinator.

So, there are 2 interesting things here:

  1. The coordinator and workers are identical barring some slightly different configuration in one file.
  2. The workers need to know about the coordinator in order to use it.

So, for both of these cases, we’d like to run a script on start-up!

The Terraform Code

When you want to create an auto-scale-group, you have to start by creating a launch template: https://www.terraform.io/docs/providers/aws/r/launch_template.html.

You can use that template to spawn up multiple auto-scale groups when it is done.  The launch template itself holds the user data, though.  So, you are best off trying to make your user data script generic enough that it can work for all your cases.  It can be a bash file and can use variables, so this isn’t too hard.

If you do need multiple separate user data scripts you’ll have to use separate launch templates, which is not the end of the world either.

The launch template in the link above is very complete, so all I’m going to show you is how to pass a bash script that takes parameters to the user data.

Basically replace:

user_data = "${base64encode(...)}"

In their example with something like this:

user_data = base64encode(templatefile("${path.module}/worker-script.sh", {coordinator_lb = "${aws_lb.coordinator.dns_name}", hive_thrift_csv = "${var.hive_thrift_csv}"}))

Assuming your worker-script has content like this:

#!/bin/bash
echo "Hello World" > /tmp/test-output.txt

and you have the hive_thrift_csv variable defined in your variables file like this:

variable "hive_thrift_csv" {
type = "string"
default = "thrift://ip-addr-1:9083,thrift://ip-addr-2:9083"
}

you should be good. Note that the first variable definition, coordinator_lb = “${aws_lb.coordinator.dns_name}”, is a reference to the DNS name of a load balancer created in another part of my terraform config. I left it in as it’s a good example of a more complex, separate variable.
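One last gotcha: templatefile() interpolates every ${...} it finds in the script, so a runtime shell variable written with braces must be escaped as $${...}.  A minimal sketch mixing both kinds, reusing the coordinator_lb variable from above (the Presto file paths are illustrative):

#!/bin/bash
# ${coordinator_lb} is filled in by Terraform when the template renders.
echo "discovery.uri=https://${coordinator_lb}:8443" >> /etc/presto/config.properties

# $${HOSTNAME} renders as ${HOSTNAME}, so the shell expands it at boot time.
echo "node.id=$${HOSTNAME}" >> /etc/presto/node.properties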