Presto – Internal TLS + Password Login; Removing Private Key from JKS File

Overview

For various reasons, you may have to secure a Presto cluster with TLS, both internally and externally.  This is fairly straightforward if you follow the Presto documentation, until you also want to combine it with an LDAP or custom password login mechanism.  Once you have internal TLS, external TLS, and LDAP together, you have to tweak some extra settings and manipulate your JKS files to get everything working.

Internal TLS Settings

For secure internal communication, refer to the Presto documentation here: https://prestosql.io/docs/current/security/internal-communication.html.  It walks you through the configuration settings that enable HTTPS, disable HTTP, and set up key stores for TLS.

Part of the instructions have you generate a JKS file (Java Key Store) with a command like this:

keytool -genkeypair -alias example.com -keyalg RSA -keystore keystore.jks
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  *.example.com (Your site name should go here).

This will get your internal TLS working fine.
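For reference, the related config.properties entries end up looking roughly like the sketch below.  The property names come from the linked internal-communication doc, so double-check them against the version you run; the host, port, paths, and passwords here are placeholders.

http-server.http.enabled=false
http-server.https.enabled=true
http-server.https.port=8443
http-server.https.keystore.path=/etc/presto/keystore.jks
http-server.https.keystore.key=your-keystore-password
internal-communication.https.required=true
internal-communication.https.keystore.path=/etc/presto/keystore.jks
internal-communication.https.keystore.key=your-keystore-password
discovery.uri=https://coordinator.example.com:8443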

Adding External TLS

It would be quite pointless to secure the inside of a cluster if you didn't secure the connections from the clients using it.  As it happens, you already set all of the external TLS properties while doing the internal security.  For example, notice that the properties listed in the LDAP login documentation (which requires external SSL) here: https://prestosql.io/docs/current/security/ldap.html are already referenced in the internal TLS doc we just followed: https://prestosql.io/docs/current/security/internal-communication.html.

Initially, I figured that I could configure a different JKS file for internal and external communication, but it turns out that this does not work, so don't try it.  (There is some related discussion in the Google groups threads linked at the end of this post.)  You need to use the same JKS file in all keystore configurations on the Presto servers.  So, don't bother trying to change the properties you already set while doing internal TLS; just keep them.

Given that internal and external communication need the same keystore, a naive second attempt might be to give clients the same JKS file that you use for internal TLS… but that's a bad idea for two reasons:

  1. You’re giving away your private key and compromising security.
  2. If you go on to add password login by LDAP or a custom password authenticator, clients holding the private key can authenticate with the certificate and bypass it entirely.

So, what you really need to do to allow clients to use TLS safely is use the same JKS file for all the server-side properties, but give clients a copy of that JKS file with the private key removed for use with JDBC/etc.

You can remove the private key from the JKS you made with the internal TLS instructions like this:

keytool -export -alias example.com -file sample.der -keystore keystore.jks
openssl x509 -inform der -in sample.der -out sample.crt
keytool -importcert -file sample.crt -keystore .keystore

The generated .keystore file can be used in JDBC or other connections by referring to it with the SSLTrustStorePath and SSLTrustStorePassword properties.  As it doesn't contain the private key, it will work for SSL, but it will not work as a login mechanism.  So, if you set up password login, clients will have to use it (which is what you want).  You can find the JDBC documentation here: https://prestosql.io/docs/current/installation/jdbc.html.
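As a concrete (hypothetical) example, a JDBC URL for a client using that stripped trust store looks roughly like the line below; the host, port, catalog/schema, and trust store location are placeholders for your environment.

jdbc:presto://coordinator.example.com:8443/hive/default?SSL=true&SSLTrustStorePath=/etc/presto/.keystore&SSLTrustStorePassword=your-truststore-password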

Password Logins

You can do user-name and password login with LDAP out of the box using the documentation I linked earlier.  Alternatively, you can use the custom password plugin documentation I wrote a month ago here: https://coding-stream-of-consciousness.com/2019/06/18/presto-custom-password-authentication-plugin-internal/ to do your own.
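For the LDAP route, the coordinator's password-authenticator.properties ends up looking roughly like this sketch (property names are from the linked LDAP doc; the URL and bind pattern are placeholders for your directory):

password-authenticator.name=ldap
ldap.url=ldaps://ldap-server.example.com:636
ldap.user-bind-pattern=uid=${USER},ou=people,dc=example,dc=com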

In either case, while combining internal TLS and password login, you will have to modify this property:

http-server.authentication.type=PASSWORD

to say this:

http-server.authentication.type=CERTIFICATE,PASSWORD

You need this because the PASSWORD type is what makes password logins work… but on its own it forces all traffic to authenticate with a password.  The internal nodes talking to each other over TLS would start being asked for passwords and failing, since they have no way to answer.  Adding CERTIFICATE lets them authenticate to each other using their JKS files instead.

This is why you had to strip the private key out of the file you gave to the clients.  If they had it and used it as a key store, they could have authenticated to the coordinator with the JKS file instead of a user name/password.  Having only the trust store with the public certificates lets SSL work while not serving as a CERTIFICATE login credential.
I hope this helps you get it working! I spent longer on this than I would like to admit :).
Note: There is some good related conversation here: https://groups.google.com/forum/#!topic/presto-users/R_byjHcIS8A and here: https://groups.google.com/forum/#!topic/presto-users/TYdvs5kGYE8.  These are the google groups that helped me get this working.


Building Apache Ranger

I was not particularly thrilled to see that I had to build Ranger myself to get the various binaries needed to install it.

Anyway, the first thing I did was download a “release”.  There is surprisingly little information on what a “release” is or how to use it.  But, given that all the installation documentation seems to ask for artifacts named like “ranger-%version-number%-admin.tar.gz” and I didn’t see any .gz files, I assumed it was more of a bundled source-code release that had to be built.

Note: I am referring to the documentation here: https://ranger.apache.org/quick_start_guide.html, and the release I used is here: http://mirrors.sonic.net/apache/ranger/1.2.0/apache-ranger-1.2.0.tar.gz.

Docker Build Script

My initial thought was to do the build using the convenient-sounding “build_ranger_using_docker.sh” script, which is in the root directory.  So, I installed Docker quickly, did a docker login, and ran it.  It failed! (On CentOS 7, for the record.)

It tried to download a version of Maven which no longer exists on the Maven download site.  If you switch to a slightly newer version, the script breaks anyway because the newer release artifacts are laid out a little differently.  So, I reverted to Maven 3.3.9, which required changes to multiple lines of the script.

After that, it went through to the end and failed on the last step with “gosu: not found”.  There had also been some scary red text higher up about “no ultimately trusted keys found” related to installing gosu.

I tried various ways of fixing this and they all failed (on CentOS 7.x)… but to be honest, I didn’t invest much time in reading up on gosu or why the various proposed solutions were failing.

Build with Maven

Giving up on Docker and building directly with Maven also failed, on both CentOS and my Windows 10 box, with similar Python errors about half way through, despite one machine being on Python 2 and the other on Python 3.  So, building straight from source wasn’t great either.

Success

I decided to go back to the Docker build.  This time, I removed some of the Maven validations and used a newer version of Maven (which I’m confident doesn’t matter much), but I also removed the gosu install and its usage from the final build commands.

This finally worked.  Note that my copy is hacky and doesn’t bother using the “builder” account to do the build.  But it worked and built the artifacts, so I’m happy enough for my own purposes.  If it were a long-running web app or something, I’d go work out the bugs in the Docker container/gosu/etc., but that’s not required for a build utility.

After this, you see a nice listing of tar.gz files in the ./target folder like so:

antrun ranger-1.2.0-kafka-plugin.zip ranger-1.2.0-sqoop-plugin.zip
archive-tmp ranger-1.2.0-kms.tar.gz ranger-1.2.0-src.tar.gz
maven-shared-archive-resources ranger-1.2.0-kms.zip ranger-1.2.0-src.zip
ranger-1.2.0-admin.tar.gz ranger-1.2.0-knox-plugin.tar.gz ranger-1.2.0-storm-plugin.tar.gz
ranger-1.2.0-admin.zip ranger-1.2.0-knox-plugin.zip ranger-1.2.0-storm-plugin.zip
ranger-1.2.0-atlas-plugin.tar.gz ranger-1.2.0-kylin-plugin.tar.gz ranger-1.2.0-tagsync.tar.gz
ranger-1.2.0-atlas-plugin.zip ranger-1.2.0-kylin-plugin.zip ranger-1.2.0-tagsync.zip
ranger-1.2.0-hbase-plugin.tar.gz ranger-1.2.0-migration-util.tar.gz ranger-1.2.0-usersync.tar.gz
ranger-1.2.0-hbase-plugin.zip ranger-1.2.0-migration-util.zip ranger-1.2.0-usersync.zip
ranger-1.2.0-hdfs-plugin.tar.gz ranger-1.2.0-ranger-tools.tar.gz ranger-1.2.0-yarn-plugin.tar.gz
ranger-1.2.0-hdfs-plugin.zip ranger-1.2.0-ranger-tools.zip ranger-1.2.0-yarn-plugin.zip
ranger-1.2.0-hive-plugin.tar.gz ranger-1.2.0-solr-plugin.tar.gz rat.txt
ranger-1.2.0-hive-plugin.zip ranger-1.2.0-solr-plugin.zip version
ranger-1.2.0-kafka-plugin.tar.gz ranger-1.2.0-sqoop-plugin.tar.gz

Here is my final docker build script, with the Dockerfile inlined as a heredoc.  Note that you should read up on gosu/etc. before using it, and I take no responsibility for any security issues; use the official script if you can :).

#!/bin/bash

# Default build command to run inside the container if none is passed in.
default_command="mvn -DskipTests=true clean compile package install assembly:assembly"

build_image=0
if [ "$1" = "-build_image" ]; then
    build_image=1
    shift
fi

params=$*
if [ $# -eq 0 ]; then
    params=$default_command
fi

image_name="ranger_dev"
remote_home=
container_name="--name ranger_build"

if [ ! -d security-admin ]; then
    echo "ERROR: Run the script from root folder of source. e.g. $HOME/git/ranger"
    exit 1
fi

# Rebuild the image if it is missing or if -build_image was passed.
images=`docker images | cut -f 1 -d " "`
[[ $images =~ $image_name ]] && found_image=1 || build_image=1

if [ $build_image -eq 1 ]; then
    echo "Creating image $image_name ..."
    docker rmi -f $image_name

    # The Dockerfile is fed to docker build as a heredoc.  The earlier instructions
    # (base image, JDK/Maven install, and the creation of /scripts and /tools) are
    # omitted here; the gosu/builder-account steps are what I removed.
    docker build -t $image_name - <<Dockerfile
# (earlier Dockerfile instructions omitted)
RUN echo 'set -x; if [ "\$1" = "mvn" ]; then usermod -u \$(stat -c "%u" pom.xml) bash -c '"'"'ln -sf /.m2 \$HOME'"'"'; exec "\$@"; fi; exec "\$@" ' >> /scripts/mvn.sh

RUN chmod -R 777 /scripts
RUN chmod -R 777 /tools

ENTRYPOINT ["/scripts/mvn.sh"]
Dockerfile

fi

src_folder=`pwd`

# Mount the source tree and your local .m2 cache into the container and run the build.
LOCAL_M2="$HOME/.m2"
mkdir -p $LOCAL_M2
set -x
docker run --rm -v "${src_folder}:/ranger" -w "/ranger" -v "${LOCAL_M2}:${remote_home}/.m2" $container_name $image_name $params
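To use it, I just ran the script from the root of the extracted release, roughly like this (assuming Docker is installed, you are logged in, and you saved the modified script over the original build_ranger_using_docker.sh):

cd apache-ranger-1.2.0
chmod +x build_ranger_using_docker.sh
./build_ranger_using_docker.sh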


Hive 3 Standalone Metastore + Presto

Hive 3.0 Standalone Metastore – Why?

Hive version 3.0 allows you to download a standalone metastore.  This is cool because it does not require you to deploy hadoop and/or run the rest of Hive’s fairly large deployment.  This makes a lot of sense because many tools that use hive for schema management do not actually care about Hive’s query engine.

For example, Presto is a clustered query engine in its own right; it has no interest in using hadoop/map-reduce to execute a query on hive data; it just wants to view and manage hive’s metadata through its thrift metastore interface.  Similarly, Apache Spark loves to work with hive, but it actually goes directly to the underlying database for performance reasons and works against that.  So, it also does not need hive’s query engine.
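As a concrete example of that thrift-only relationship, a Presto hive catalog file just points at the metastore.  A minimal sketch (the host is a placeholder; 9083 is the usual metastore thrift port):

connector.name=hive-hadoop2
hive.metastore.uri=thrift://metastore-host.example.com:9083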

Can/Should We Use It?

Unfortunately, Presto currently only supports Hive 2.x.  From its own documentation: “The Hive connector supports Apache Hadoop 2.x and derivative distributions including Cloudera CDH 5 and Hortonworks Data Platform (HDP).”

If you read online though, you will find that it does seem to work… but with limited features.  If you look at this Google groups thread for example: https://groups.google.com/forum/#!topic/presto-users/iAeEecsnS9I, you will see:

“We have tested Presto 0.203e with Hive 3.0 Metastore, and it works fine. We tested it by running TPC-DS queries, and Presto completed all 99 queries.”

But lower down, you will see:

“However, Presto is not able to read Hive managed (transactional) tables in Hive 3.x…”

“Yes, this is a known limitation.”

Unfortunately, transactional ACID v2 tables are the default for managed tables in Hive 3.x.  So, basically all managed tables will not work, even though external tables will.  It might be okay if you only use external tables… but in our case we let people use Spark however they like, and they likely create many managed tables.  So, this rules out the Hive 3.0 standalone metastore for us.

I’m going to see if Hive 2.0 can be run without the hive server and hadoop next.

Side Note – SchemaTool

I would just like to make a side note that while I did manage to run the Hive standalone metastore without installing Hadoop, I did have to install (but not run) Hadoop in order to use the schematool provided with Hive for creating the Hive RDBMS schema.  This is due to library dependencies.
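For reference, once the (unused) Hadoop install satisfied the dependencies, the schema creation itself was just a schematool run roughly like the line below, from the extracted metastore directory.  The dbType is whatever RDBMS backs your metastore; mysql here is only an example.

bin/schematool -dbType mysql -initSchema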

There is also a “create the schema on first run” config option you can use instead, but it is not recommended for production; just keep that in mind.


SCP/SSH With Different Private Key

If you need to use SSH or SCP with a different private key file, just specify it with -i.  For example, to copy logs from a remote server using a specific private key file and user, do the following:

scp -i C:\Users\[your-user]\.ssh\pk_file [user]@[ip-addr]:/path/logs/* .

The -i flag works the same regardless of OS; this particular example copies files from a remote Linux server to a Windows machine, assuming you keep your private keys in your user’s .ssh directory.
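The same flag works for plain SSH; for example (user, host, and key path are the same placeholders as above):

ssh -i C:\Users\[your-user]\.ssh\pk_file [user]@[ip-addr]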

Java – Regular/Scheduled Task, One Run at a Time

This will be a very short post, and I’m mostly writing it just so it sticks in my head better.

Common Use Cases

There are many times when you may find that you need to regularly run a task in Java.  Here are a few common examples:

  • You have a cache you need to refresh every X minutes to power a dashboard or something similar.
  • You need to prune old files from a file system once an hour.
  • You need to regularly update stats counters for monitoring.

Coding Options

There are a lot of ways to do this, but the recommended approach is to use a scheduled executor.  Now… this part is easy to remember, but what is sometimes hard to remember is that you have two options when scheduling a task.  I often find myself picking the wrong one as it pops up in IntelliSense and I forget there are two options.

  1. Run the task every X seconds/minutes/etc no matter what.
  2. Run the task every X seconds/minutes/etc *after* the previous task completed.

These two things can be very different.  If you have a task that only takes a couple of seconds, it probably doesn’t matter much.  But if you have a task that takes 2 minutes and you’re running it every 1 minute, then with option 1 you will always be running at least 2 copies of the task, whereas with option 2 you’ll just be running one copy at a time with a minute of buffer in between each task.

For both options, you can create the scheduled executor service the same way:

ScheduledExecutorService se = Executors.newSingleThreadScheduledExecutor();

But for option #1 (run every interval regardless of previous tasks), you would use this function:

se.scheduleAtFixedRate(this::refreshCache, 10, 120, TimeUnit.SECONDS);

And for option #2 (start counting after the previous task completes), you would use this function:

se.scheduleWithFixedDelay(this::refreshCache, 10, 120, TimeUnit.SECONDS);


Extending an LVM Volume (e.g. /opt) in CentOS 7

What Does LVM Mean?

Taking a description from The Geek Diary:

The Logical Volume Manager (LVM) introduces an extra layer between the physical disks and the file system allowing file systems to:

  • Be resized and moved easily and online without requiring a system-wide outage.
  • Use discontinuous space on disk.
  • Have meaningful names to volumes, rather than the usual cryptic device names.
  • Span multiple physical disks.

Extending an LVM Volume (e.g. /opt):

Run “vgs” to display information on the available volume groups. This will tell you if you have “free” space that you can allocate to one of the existing logical volumes. In our case, we have 30 GB free.

$> vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  rootvg   1   7   0 wz--n- <63.00g <30.00g

Run “lvs” to display the logical volumes on your system and their sizes. Find the one you want to extend.

$> lvs
  LV     VG     Attr       LSize
  homelv rootvg -wi-ao----  1.00g
  optlv  rootvg -wi-ao----  2.00g
  rootlv rootvg -wi-ao----  8.00g
  swaplv rootvg -wi-ao----  2.00g
  tmplv  rootvg -wi-ao----  2.00g
  usrlv  rootvg -wi-ao---- 10.00g
  varlv  rootvg -wi-ao----  8.00g

Extend the logical volume using “lvextend”. In our case, I’m moving /opt from 2g to 5g.

$> lvextend -L 5g rootvg/optlv

You can display the logical volumes again if you like; “lvs” will now show the new 5.00g size, but “df -h” will still report the old ~2.0g size because the file system itself has not been resized yet.

Use df -hT to show what kind of file system you are using for the volume you resized. This determines the command you need to run next.

$> df -hT
Filesystem                Type      ...
/dev/mapper/rootvg-rootlv ext4      ...
devtmpfs                  devtmpfs  ...
tmpfs                     tmpfs     ...
tmpfs                     tmpfs     ...
tmpfs                     tmpfs     ...
/dev/mapper/rootvg-usrlv  ext4      ...
/dev/sda1                 ext4      ...
/dev/mapper/rootvg-optlv  ext4      ...
/dev/mapper/rootvg-tmplv  ext4      ...
/dev/mapper/rootvg-varlv  ext4      ...
/dev/mapper/rootvg-homelv ext4      ...
/dev/sdb1                 ext4      ...
tmpfs                     tmpfs     ...

If it is ext4, you can use the following command to grow the file system into the extended volume. If it is not, you will have to find the appropriate command for that file system (see the XFS example after the command).

$> resize2fs /dev/mapper/rootvg-optlv
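For example, if df -hT had shown xfs instead of ext4, the equivalent step would be xfs_growfs, run against the mount point (shown here for /opt as an example):

$> xfs_growfs /opt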

Now you should see the extended volume size in “lvs” or “df -h”; and you’re done!

$> df -h
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootlv  7.8G   76M  7.3G   2% /
devtmpfs                   3.9G     0  3.9G   0% /dev
tmpfs                      3.9G  4.0K  3.9G   1% /dev/shm
tmpfs                      3.9G  130M  3.8G   4% /run
tmpfs                      3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/rootvg-usrlv   9.8G  2.6G  6.7G  29% /usr
/dev/sda1                  976M  119M  790M  14% /boot
/dev/mapper/rootvg-optlv   4.9G  1.9G  2.9G  40% /opt
/dev/mapper/rootvg-tmplv   2.0G   11M  1.8G   1% /tmp
/dev/mapper/rootvg-varlv   7.8G  3.2G  4.2G  44% /var
/dev/mapper/rootvg-homelv  976M   49M  861M   6% /home
/dev/sdb1                   16G   45M   15G   1% /mnt/resource
tmpfs                      797M     0  797M   0% /run/user/1000
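Side note: lvextend can also grow the file system in the same step with its -r (--resizefs) flag, which would collapse the two commands above into one; for example:

$> lvextend -r -L 5g rootvg/optlv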

Azure CLI Get Scale Set Private IP Addresses

Getting Scale Set Private IPs is Hard

I have found that it is impressively difficult to get the private IP addresses of Azure scale set instances in almost every tool.

For example, if you go and create a scale set in Terraform, even Terraform will not provide you the addresses or a way to look them up to act upon them in future steps.  Similarly, you cannot easily list the addresses in Ansible.

You can make dynamic inventories in Ansible based on scripts though.  So, in order to make an ansible playbook target the nodes in a recently created scale set dynamically, I decided to use a dynamic inventory created by the Azure CLI.

Azure CLI Command

Here is an Azure CLI command (version 2.0.58) which directly lists the private IP addresses of scale set nodes.  I hope it helps you as it has helped me.  It took a while to build it out from the docs, but it’s pretty simple now that it’s done.

az vmss nic list --resource-group YourRgName \
--vmss-name YourVmssName \
--query "[].ipConfigurations[].privateIpAddress"

The output will look similar to this, though I have changed the IP addresses to fake ones as an example.

[
  "123.123.123.123",
  "123.123.123.124"
]
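Since I mentioned Ansible dynamic inventories above, here is a minimal sketch of wrapping that command as a script-based inventory.  The group name vmss_nodes and the hard-coded resource group/VMSS names are my own placeholders, and a real script would likely handle --host more carefully.

#!/bin/bash
# Minimal Ansible dynamic inventory sketch around the az vmss command above.
if [ "$1" = "--list" ]; then
  # --output json returns the IPs as a JSON array of strings.
  ips=$(az vmss nic list --resource-group YourRgName \
    --vmss-name YourVmssName \
    --query "[].ipConfigurations[].privateIpAddress" --output json)
  echo "{\"vmss_nodes\": {\"hosts\": ${ips}}, \"_meta\": {\"hostvars\": {}}}"
else
  # Ansible may call the script with --host <name>; we have no per-host vars.
  echo "{}"
fi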