Presto – Internal TLS + Password Login; Removing Private Key from JKS File

Overview

For various reasons, you may have to secure a Presto cluster with TLS, both internally and externally.  This is pretty straightforward if you follow the Presto documentation, until you also want to combine it with an LDAP or custom password login mechanism.  Once you have internal TLS, external TLS, and LDAP, you have to adjust some extra settings and manipulate your JKS files to get everything working.

Internal TLS Settings

For secure internal communication, you should refer to the Presto documentation right here: https://prestosql.io/docs/current/security/internal-communication.html.  It will walk you through the configuration settings that enable HTTPS, disable HTTP, and set key stores for TLS.
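As a rough sketch of where that documentation leads, here is a shell snippet that writes the relevant part of config.properties.  The property names are the ones I recall from the linked page, so check them against your Presto version.  The paths, port, host name, and password are made-up placeholders, and the temp directory just stands in for Presto’s etc/ directory.

```shell
# Hypothetical location: a temp dir standing in for Presto's etc/ directory.
ETC_DIR=$(mktemp -d)

# Write only the TLS-related fragment; a real config.properties has more in it.
cat > "$ETC_DIR/config.properties" <<'EOF'
# Turn off plain HTTP and serve HTTPS only (placeholder port and path)
http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.port=8443
http-server.https.keystore.path=/etc/presto/keystore.jks
http-server.https.keystore.key=keystore_password

# Require HTTPS between nodes, pointing at the same keystore
internal-communication.https.required=true
internal-communication.https.keystore.path=/etc/presto/keystore.jks
internal-communication.https.keystore.key=keystore_password

# Nodes must reach the coordinator over HTTPS (placeholder host)
discovery.uri=https://coordinator.example.com:8443
EOF

cat "$ETC_DIR/config.properties"
```

Note that the same keystore path shows up under both the http-server.https and internal-communication.https properties.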

Part of the instructions has you generate a JKS file (Java Key Store) with a command like this:

keytool -genkeypair -alias example.com -keyalg RSA -keystore keystore.jks
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  *.example.com (Your site name should go here).

This will get your internal TLS working fine.

Adding External TLS

It would be quite pointless to secure the inside of a cluster if you didn’t also secure the connections from the clients using it.  Fortunately, you already set all of the external TLS properties when you were doing the internal security.  E.g. notice that the properties listed for the LDAP login plugin (which requires external SSL) here: https://prestosql.io/docs/current/security/ldap.html are already covered in the doc we referred to for internal TLS here: https://prestosql.io/docs/current/security/internal-communication.html.

Initially, I figured that I could configure a different JKS file for internal and external communication, but it turns out that this does not work, so don’t try it.  There is some information on that right here.  You need to use the same JKS file in all keystore configurations on the Presto servers.  So, don’t bother trying to tune the properties you already set while doing internal TLS; just keep them.

Given that internal and external communication need the same keystore, a naive second try may be to give clients the same JKS file that you use for internal TLS… but that’s a bad idea for two reasons:

  1. You’re giving away your private key and compromising security.
  2. If you go on to add password login via LDAP or a custom password authenticator, clients that hold the private key can authenticate with the certificate instead and bypass the password check.

So, what you really need to do to allow clients to use TLS safely is use the same JKS file for all the server-side properties, but give clients a copy of that JKS file with the private key removed for use with JDBC/etc.

You can remove the private key from the JKS you made with the internal TLS instructions like this:

keytool -export -alias example.com -file sample.der -keystore keystore.jks
openssl x509 -inform der -in sample.der -out sample.crt
keytool -importcert -file sample.crt -keystore .keystore

The generated .keystore file can be used in JDBC or other connections by referring to it with the SSLTrustStorePath and SSLTrustStorePassword properties.  As it doesn’t have the private key, it will work for SSL, but it will not work as a login mechanism.  So, if you set up password login, clients will have to use it (which is what you want).  You can find JDBC documentation here: https://prestosql.io/docs/current/installation/jdbc.html.
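For example, a JDBC URL using those properties might be assembled like this.  The host, port, catalog, path, and password are hypothetical placeholders.  SSL, SSLTrustStorePath, and SSLTrustStorePassword are the parameter names from the JDBC documentation linked above.

```shell
# Hypothetical connection details; replace with your own.
HOST='presto.example.com'
PORT=8443
CATALOG='hive'
TRUSTSTORE='/path/to/.keystore'
TRUSTSTORE_PASSWORD='truststore_password'

# Build the URL. Values with special characters would need URL-encoding first.
JDBC_URL="jdbc:presto://${HOST}:${PORT}/${CATALOG}?SSL=true&SSLTrustStorePath=${TRUSTSTORE}&SSLTrustStorePassword=${TRUSTSTORE_PASSWORD}"

echo "$JDBC_URL"
```

The user name and password for the login itself are supplied separately by the client, since the trust store only covers the SSL side.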

Password Logins

You can do username and password login with LDAP out of the box using the documentation I linked earlier.  Alternatively, you can use the custom password plugin documentation I wrote a month ago here: https://coding-stream-of-consciousness.com/2019/06/18/presto-custom-password-authentication-plugin-internal/ to do your own.

In either case, while combining internal TLS and password login, you will have to modify this property:

http-server.authentication.type=PASSWORD

to say this:

http-server.authentication.type=CERTIFICATE,PASSWORD

You need this because you have to set the PASSWORD type to make password logins work… but that forces all traffic to require a password.  Internal nodes doing TLS will start asking each other for passwords and failing since they can’t do that.  So, you add CERTIFICATE to allow them to authenticate to each other using their JKS files.

This is why you had to strip the private key out of the file you gave to the clients.  If they had it and used it as a key store, they could have authenticated to the coordinator with the JKS file instead of a user name/password.  But just having the trust store with the public keys allows SSL to work while not allowing it to be used as the CERTIFICATE login mechanism.

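If you want to double-check the file before handing it to clients, one option is to look at the entry types that keytool -list prints.  This is only a sketch: it assumes English keytool output, where entries holding a private key are labelled PrivateKeyEntry and certificate-only entries are labelled trustedCertEntry.  The function name is my own.

```shell
# Reads `keytool -list` output on stdin.
# Succeeds (exit 0) only if no entry in the listing holds a private key.
has_no_private_key() {
  ! grep -q 'PrivateKeyEntry'
}

# Example with two hand-written listing lines (not real keytool runs):
echo 'mykey, Jul 12, 2019, trustedCertEntry,' | has_no_private_key \
  && echo 'client file: no private key found'
echo 'example.com, Jul 12, 2019, PrivateKeyEntry,' | has_no_private_key \
  || echo 'server file: contains a private key, do not share'

# Real use would look something like:
#   keytool -list -keystore .keystore | has_no_private_key && echo 'ok to share'
```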
I hope this helps you get it working! I spent longer on this than I would like to admit :).

Note: There is some good related conversation here: https://groups.google.com/forum/#!topic/presto-users/R_byjHcIS8A and here: https://groups.google.com/forum/#!topic/presto-users/TYdvs5kGYE8.  These are the Google Groups threads that helped me get this working.

 

 

Convert CA Certificate to JKS File (e.g. For Presto)

Many applications require JKS files to enable TLS (Transport Layer Security).  In case you are not sure what a JKS file is, you can read about it and see how to make a self-signed one right here.

Converting a CA Certificate to a JKS File

To convert the files a CA provides you into a JKS file you can do the following, which is lightly modified from this other article I followed.

cat /etc/ssl/certs/ca-bundle.crt IntermediateCA.crt > ca-certs.pem

openssl pkcs12 -export -in ssl_certificate.crt -inkey app.key -chain -CAfile ca-certs.pem -name "*.app.company.com" -out app.p12

keytool -importkeystore -deststorepass Password123! -destkeystore app.jks -srckeystore app.p12 -srcstoretype PKCS12

Note that the domain must match the one specified in the certificate.  Assuming these 3 commands work, you should have a proper JKS file when done.
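If the openssl pkcs12 step complains, one thing worth ruling out is a private key that does not belong to the certificate.  Here is a small sketch of a pre-check that compares the public key inside the certificate with the one derived from the private key.  The function name is made up, it assumes PEM files, and an encrypted key will make openssl prompt for its passphrase.

```shell
# Succeeds if the certificate in $1 and the private key in $2 share a public key.
key_matches_cert() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout) || return 1
  [ "$cert_pub" = "$key_pub" ]
}

# Example with the file names used above:
#   key_matches_cert ssl_certificate.crt app.key && echo 'key and certificate match'
```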

Given the certificate is from a CA, clients should not need a copy of the JKS file to talk to servers that are using it.   For example, if my Presto server uses this JKS file, JDBC clients on other hosts can talk to it over SSL even though they do not have a copy of the file themselves.

NOTE: The JKS file will only work properly when clients reach the server under the domain in its certificate.  E.g. if you have a load balancer at https://load-balancer.app2.company.com pointing at a server whose JKS file is for https://server1.app.company.com, it will not work.  You have to make a CNAME so your load balancer actually looks like it is also under app.company.com (what’s in the cert) and not app2.company.com.
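To make the matching rule concrete, here is a small shell sketch of how a wildcard name such as *.app.company.com is usually matched: the * stands for exactly one left-most label.  This is a simplification written for illustration (real TLS clients also look at subject alternative names and compare case-insensitively), and the function name is made up.

```shell
# Succeeds if host name $1 matches certificate name $2.
# A leading "*." in the certificate name matches exactly one label.
cert_name_matches() {
  host=$1
  pattern=$2
  case "$pattern" in
    '*.'*)
      suffix=${pattern#'*'}          # e.g. ".app.company.com"
      label=${host%"$suffix"}        # what is left in front of the suffix
      [ "$label" != "$host" ] || return 1   # host does not end with the suffix
      [ -n "$label" ] || return 1           # nothing in front of the suffix
      case "$label" in
        *.*) return 1 ;;                    # more than one label in front
      esac
      return 0
      ;;
    *)
      [ "$host" = "$pattern" ]       # no wildcard: names must be identical
      ;;
  esac
}

cert_name_matches server1.app.company.com '*.app.company.com' \
  && echo 'server1: match'
cert_name_matches load-balancer.app2.company.com '*.app.company.com' \
  || echo 'load-balancer: no match'
```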

Creating Java Key Stores (JKS), Setting Validity Periods, and Analyzing JKS for Expiry Dates

What is a JKS File?

Taken from: https://en.wikipedia.org/wiki/Java_KeyStore

A Java KeyStore (JKS) is a repository of security certificates – either authorization certificates or public key certificates – plus corresponding private keys, used for instance in SSL encryption.

Basically, applications like Presto use JKS files to enable them to do Transport Layer Security (TLS).

How Do You Create One?

As an example, Presto, a common big-data query tool, uses JKS for both secure internal and external communication.  At this link https://prestosql.io/docs/current/security/internal-communication.html they show you how to create one.

Here is an excerpt.  Note that you must replace *.example.com with the domain on which you will host the service that uses the JKS file, or it will not work.  Certificates are domain specific.

keytool -genkeypair -alias example.com -keyalg RSA \
    -keystore keystore.jks
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  *.example.com
What is the name of your organizational unit?
  [Unknown]:
What is the name of your organization?
  [Unknown]:
What is the name of your City or Locality?
  [Unknown]:
What is the name of your State or Province?
  [Unknown]:
What is the two-letter country code for this unit?
  [Unknown]:
Is CN=*.example.com, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
  [no]:  yes

Enter key password for <presto>
        (RETURN if same as keystore password):

Pitfall! – 90 Day Expiry Date – Change It?

Now… here’s a fun pitfall.  Certificates are made with an expiry date and have to be reissued periodically (for security reasons).  The default validity period for a certificate generated by keytool is 90 days.

https://docs.oracle.com/javase/tutorial/security/toolsign/step3.html

This certificate will be valid for 90 days, the default validity period if you don’t specify a -validity option.

This is fine on a big managed service with lots of attention.  But if you’re just TLS securing an internal app not many people see, you will probably forget to rotate it or neglect to set up appropriate automation.  Then, when it expires, things will break.

Now… for security reasons you generally shouldn’t set too high a value for certificate expiry time.  But for example purposes, here is how you would set it to 10 years.

keytool -genkeypair -alias example.com -keyalg RSA \
    -keystore keystore.jks -validity 3650
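If you would rather script this than answer the prompts, a non-interactive variant might look like the sketch below.  The temp directory, the password, and the 365-day validity are placeholders I picked for illustration, and -dname supplies the answer to the “first and last name” question.  Depending on your JDK version, the resulting file may be stored in PKCS12 format rather than the older JKS format unless you say otherwise.

```shell
# Hypothetical values for illustration; pick your own.
KEYSTORE_DIR=$(mktemp -d)
KEYSTORE="$KEYSTORE_DIR/keystore.jks"
STORE_PASS='changeit'
VALIDITY_DAYS=365

if command -v keytool >/dev/null 2>&1; then
  # Same command as above, with the prompts answered on the command line.
  keytool -genkeypair -alias example.com -keyalg RSA \
      -dname 'CN=*.example.com' \
      -validity "$VALIDITY_DAYS" \
      -keystore "$KEYSTORE" \
      -storepass "$STORE_PASS" -keypass "$STORE_PASS"

  # Show the validity window that was actually written.
  keytool -list -v -keystore "$KEYSTORE" -storepass "$STORE_PASS" | grep 'Valid from'
else
  echo 'keytool not found; it ships with the JDK' >&2
fi
```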

Determine Expiry Date

If you have a JKS file that you are using for your application and you are not sure when it expires, here’s a command that you can use:

keytool -list -v -keystore keystore.jks

This will output different things based on where your key store came from.  E.g. you will probably see more interesting output from a real SSL cert than you will from a self-signed one created like we did above.

In any case, you will clearly see a line in the large output from this command that says something like this:

Valid from: Fri Jul 12 01:27:21 UTC 2019 until: Thu Oct 10 01:27:21 UTC 2019

Note that in this case, it is just valid for 3 months.  Be careful when looking at this output because you may find multiple expiry dates in the output for different components of the JKS file.  You need to make sure you read the right one.  Though, chances are that the one on your domain will be the one that expires earliest anyway.
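Since the output is large, it can help to pull out just the expiry dates.  Here is a minimal sketch that assumes the English “Valid from: … until: …” line format shown above.  The function name is made up.

```shell
# Print only the "until" (expiry) part of each "Valid from ... until ..." line
# read from stdin, one date per certificate found in the output.
expiry_dates() {
  sed -n 's/^.*Valid from: .* until: \(.*\)$/\1/p'
}

# Example using the line shown above:
echo 'Valid from: Fri Jul 12 01:27:21 UTC 2019 until: Thu Oct 10 01:27:21 UTC 2019' | expiry_dates
# prints: Thu Oct 10 01:27:21 UTC 2019

# Real use would look something like:
#   keytool -list -v -keystore keystore.jks | expiry_dates
```

If the keystore holds a chain, this prints one date per certificate, so you still need to work out which one belongs to your domain.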

 

Hive + Presto + Ranger Version Hell

My Use Case

I was trying to test out Apache Ranger in order to give Presto column-level security over Hive data.  Presto itself doesn’t seem to support Ranger yet, though some GitHub entries suggest it will soon.  Ranger can integrate with Hive, though, so that when Presto queries Hive, the security can work fine (apparently).

Conflicting Versions

I started off by deploying a version of Hive I’ve worked with before: 2.3.5, the latest 2.x version (I avoided 3.x).  After that, I deployed Presto 0.220, also the latest version.

This was all working great, so I moved on to Ranger.  This is when I found out that the Ranger docs specifically say that it only works with Hive version 1.2.0:

Apache Ranger version 0.5.x is compatible with only the component versions mentioned below

HIVE 1.2.0 https://hive.apache.org/downloads.html

That came from this link: https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5.0+Installation.

Alternative Options

I have a fairly stringent need for the security Ranger provides.  So, I was willing to use a 1.x version of Hive, depending on what the feature loss was.  After all, quite a few big providers seem to use 1.x.

Unfortunately, the next thing I noticed was that Presto says: “The Hive connector supports Apache Hadoop 2.x and derivative distributions including Cloudera CDH 5 and Hortonworks Data Platform (HDP).”

That is coming from its latest documentation: https://prestodb.github.io/docs/current/connector/hive.html.

I’m not particularly excited to start digging through old versions of Presto as well.

Next Steps

I’m going to try to stick with Hive 2.x for now and a modern version of Presto.  So, my options are:

  1. Research Ranger more and see if it can actually work with Hive 2.x.  Various vendors seem to use Ranger and Hive/Presto together; so I’m curious to see how.  Maybe the documentation on Ranger is just out of date (I know, being hopeful).
  2. Look at Ranger alternatives like Apache Sentry and see if they support Hive 2.x.  Apparently Ranger is beating out Sentry in features, usage, and future support… so I’m not excited about using Sentry.  But if it works, I can always migrate back to Ranger once its support grows for either Hive or Presto.

Update

I started digging through JIRA and the mailing lists and found that Ranger appears to have had work done on it as early as 2017 for supporting Hive 2.3.2.  Here’s a link: https://issues.apache.org/jira/browse/RANGER-1927.

So, I’m going to give installing Ranger a shot with Hive 2.3.5 and see if it works.  If not, I’ll try with 2.3.2 and/or seek community help.  Hopefully I’ll come back and update this afterward with some good news :).