Linux Neatly Manage SSH Sessions Like MRemoteNG on Windows

SSH technically works the same anywhere, but on Windows without special tooling, using SSH can be cumbersome as the CLI is not very… elegant?  Generally, on Windows, people use tools like putty or MRemoteNG (also putty based) to make things neater.

If you’re on Linux, you can use SSH out of the box very cleanly by leveraging the config file to create named hosts.

For example, this verbose command:

ssh -i ~/.ssh/prod-key.pem centos@

Can be replaced by the following command – and the nice “pgadmin-prod” name even auto-completes.  So, if you have 100 servers for different things, you can still find them easily.

ssh pgadmin-prod

This works if you add the following block into your ~/.ssh/config file.

Host pgadmin-prod
User centos
IdentityFile ~/.ssh/prod-key.pem


  • Host = the user-friendly name I want to use for this connection.
  • User = the user I normally have to log in as.
  • HostName = the IP address of the target host.
  • IdentityFile = the path to the private key used for the connection.

You can have as many named hosts as you like and, again, their names will auto complete after the ssh command – so they’re very easy to use.

This gets even more effective if you combine it with a screen/session management app like tmux or screen.  So, I recommend you look into those too if you don’t use them already (tmux is a little more modern if you’re a new to both).

I hope this helps you be a little more efficient, and thanks for reading.

Spring Boot @Autowired Abstract/Base Class

Properly using (and not overusing) abstract classes is vital to having a cleanly organized and DRY application.  Developers tend to get confused when trying to inject services into abstract or base classes in Spring Boot though.

Setter injection actually works fine in base classes; but as noted here, you should  make sure to use the final keyword on the setter to prevent it from being overridden.

public abstract class AbstractSchemaService {
    private StringShorteningService shortener;

    public final void setStringShorteningService(StringShorteningService shortener) {
        this.shortener = shortener;

Presto – Internal TLS + Password Login; Removing Private Key from JKS File


For various reasons, you may have to secure a Presto cluster with TLS, both internally and externally.  This is pretty straight forward following Presto documentation, until you want to also combine that with an LDAP or custom password login mechanism.  Once you have internal TLS, external TLS, and LDAP, you have to play with some extra settings and manipulate your JKS files to get things done.

Internal TLS Settings

For secure internal communication, you should refer to the presto documentation right here:  It will walk you through various configuration settings that enable HTTPS, disable HTTP, and set key stores for TLS.

Part of the instructions have you generate a JKS file (Java Key Store) with a command like this:

keytool -genkeypair -alias -keyalg RSA -keystore keystore.jks
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  * (Your site name should go here).

This will get your internal TLS working fine.

Adding External TLS

It would be quite pointless to secure the inside of a cluster if you didn’t secure the connections to the clients using it.  So, you’ve actually set all of the external TLS properties already when you were doing the internal security.  E.g. notice that the properties listed in the LDAP login plugin (which requires external SSL) here: are already referenced in the doc we referred to for internal TLS here

Initially, I figured that I could configure a different JKS file for internal and external communication; but it turns out that this does not work; so don’t try it.   There is some information on that right hereYou need to use the same JKS file in all keystore configurations on the Presto servers.  So, don’t bother trying to tune the properties you already set while doing internal TLS; just keep them.

Given internal and external communication needs the same keystore, a naive second try may be to give clients the same JKS file that you use for internal TLS… but that’s a bad idea for two reasons:

  1. You’re giving away your private key and compromising security.
  2. If you go on to add password-login by LDAP or a custom password authenticator, the private key certificate will bypass it if the clients have it.

So, what you really need to do to allow clients to use TLS safely is use the same JKS file for all the server-side properties, but give clients a copy of that JKS file with the private key removed for use with JDBC/etc.

You can remove the private key from the JKS you made with the internal TLS instructions like this:

keytool -export -alias -file sample.der -keystore keystore.jks
openssl x509 -inform der -in sample.der -out sample.crt
keytool -importcert -file sample.crt -keystore .keystore
The generated .keystore file can be used in JDBC or other connections by referring to it with the SSLTrustStorePath and SSLTrustStorePassword properties.  As it doesn’t have the private key, it will work for SSL, but it will not work as a login mechanism.  So, if you set up password login, clients will have to use it (which is what you want).  You can find JDBC documentation here:

Password Logins

You can do user-name and password login with LDAP out of the box using the documentation I linked earlier.  Alternatively, you can use the custom password plugin documentation I wrote a month ago here: to do your own.

In either case, while combining internal TLS and password login, you will have to modify this property:

to say this:
You need this because you have to set the PASSWORD type to make password logins work… but that forces all traffic to require a password.  Internal nodes doing TLS will start asking each other for passwords and failing since they can’t do that.  So, you add CERTIFICATE to allow them to authenticate to each other using their JKS files.
This is why you had to strip the private key out of the file you gave to the clients.  If they had it and used it as a key store, they could have authenticated to the coordinator with the JKS file instead of a user name/password.  But just having the trust store with the public keys allows SSL to work while not allowing it to be used as the CERTIFICATE login mechanism.
I hope this helps you get it working! I spent longer on this than I would like to admit :).
Note: There is some good related conversation here:!topic/presto-users/R_byjHcIS8A and here:!topic/presto-users/TYdvs5kGYE8.  These are the google groups that helped me get this working.



Building Apache Ranger

I was not particularly thrilled to see that I have to build ranger myself to get the various binaries needed for it.

Anyway, the first thing I did was download a “release”.  There is surprisingly little information on what a “release” is or how to use it.  But, given that all installation documentation seems to ask for artifacts named like ranger-%version-number%-admin.tar.gz” and I didn’t see any gz files, I assumed it was more like a bundled source code release that had to be built.

Note: referring to documentation here: and he release I used is here:

Docker Build Script

My initial thought was to do the build using the convenient sounding “” script which is in the root directory.  So, I installed docker quickly, did a docker login, and ran it.  It failed! (on Centos 7 for the record).

It tried to download a version of maven which doesn’t exist on the maven site currently.  If you switch to a slightly newer one the script breaks due to the maven release artifacts being a little different too.  So, I reverted to 3.3.9 which required changes to multiple lines.

After that, it went through to the end and failed on the last step with “gosu: not found”.  There had been some scary red text higher up about “no ultimately trusted keys found” related to installing gosu.

I tried various ways of fixing this and they all failed (on Centos 7.x)… but to be honest, I didn’t invest my own time in reading up on gosu or why the various proposed solutions were failing.

Build with Maven

Giving up and building with Maven failed on Centos and my Windows 10 box with similar python errors about half way through despite one being on Python 2 and one being on Python 3.  So, building straight from source wasn’t great either.


I decided to go back to the docker build.  This time, I removed some of the maven validations, used a newer version of maven (which I’m confident doesn’t matter much).  But I also removed the gosu install and usage from the final build commands.

This finally worked.  Note that my copy is hacky and doesn’t bother using the “builder” account to do the build.  But it worked at least and built the artifacts.  So, I’m happy enough for my own purposes.  If it was a long-running web app or something, I’d go work out the bugs in the docker container/gosu/etc – but that’s not required for a build utility.

After this, you see a nice listing of tar.gz files in the ./target folder like so:

archive-tmp ranger-1.2.0-kms.tar.gz ranger-1.2.0-src.tar.gz
ranger-1.2.0-admin.tar.gz ranger-1.2.0-knox-plugin.tar.gz ranger-1.2.0-storm-plugin.tar.gz
ranger-1.2.0-atlas-plugin.tar.gz ranger-1.2.0-kylin-plugin.tar.gz ranger-1.2.0-tagsync.tar.gz
ranger-1.2.0-hbase-plugin.tar.gz ranger-1.2.0-migration-util.tar.gz ranger-1.2.0-usersync.tar.gz
ranger-1.2.0-hdfs-plugin.tar.gz ranger-1.2.0-ranger-tools.tar.gz ranger-1.2.0-yarn-plugin.tar.gz
ranger-1.2.0-hive-plugin.tar.gz ranger-1.2.0-solr-plugin.tar.gz rat.txt version
ranger-1.2.0-kafka-plugin.tar.gz ranger-1.2.0-sqoop-plugin.tar.gz

Here was my final docker file.  Note that you should read up on gosu/etc before using it and I take no responsibility for any security issues; you should use the official one – if you can :).

default_command="mvn -DskipTests=true clean compile package install assembly:assembly"
if [ "$1" = "-build_image" ]; then

if [ $# -eq 0 ]; then

container_name="--name ranger_build"

if [ ! -d security-admin ]; then
echo "ERROR: Run the script from root folder of source. e.g. $HOME/git/ranger"
exit 1

images=`docker images | cut -f 1 -d " "`
[[ $images =~ $image_name ]] && found_image=1 || build_image=1

if [ $build_image -eq 1 ]; then
echo "Creating image $image_name ..."
docker rmi -f $image_name

docker build -t $image_name - < /scripts/ RUN echo 'set -x; if [ "\$1" = "mvn" ]; then usermod -u \$(stat -c "%u" pom.xml) bash -c '"'"'ln -sf /.m2 \$HOME'"'"'; exec "\$@"; fi; exec "\$@" ' >> /scripts/

RUN chmod -R 777 /scripts
RUN chmod -R 777 /tools

ENTRYPOINT ["/scripts/"]



mkdir -p $LOCAL_M2
set -x
docker run --rm -v "${src_folder}:/ranger" -w "/ranger" -v "${LOCAL_M2}:${remote_home}/.m2" $container_name $image_name $params


Hive 3 Standalone Metastore + Presto

Hive 3.0 Standalone Metastore – Why?

Hive version 3.0 allows you to download a standalone metastore.  This is cool because it does not require you to deploy hadoop and/or run the rest of Hive’s fairly large deployment.  This makes a lot of sense because many tools that use hive for schema management do not actually care about Hive’s query engine.

For example, Presto is a clustered query engine in its own right; it has no interest in using hadoop/map-reduce to execute a query on hive data; it just wants to view and manage hive’s metadata through its thrift metastore interface.  Similarly, Apache Spark loves to work with hive, but it actually goes directly to the underlying database for performance reasons and works against that.  So, it also does not need hive’s query engine.

Can/Should We Use It?

Unfortunately, Presto only currently supports Hive 2.X.  From it’s own documentation: “The Hive connector supports Apache Hadoop 2.x and derivative distributions including Cloudera CDH 5 and Hortonworks Data Platform (HDP).”

If you read online though, you will find that it does seem to work… but with limited features.  If you look at this git entry for example:!topic/presto-users/iAeEecsnS9I, you will see:

“We have tested Presto 0.203e with Hive 3.0 Metastore, and it works fine. We tested it by running TPC-DS queries, and Presto completed all 99 queries.”

But lower down, you will see:

However, Presto is not able to read Hive managed (transactional tables) in Hive 3.x…

Yes, this is a known limitation.

Unfortunately, transactional ACID v2 tables are the default for Hive 3.x.  So, basically all managed tables will not work in Hive 3.x even though external tables will work.  So, it might be okay to use it if you only do external tables… but in our case we let people use Spark however they like and they likely create many managed tables.  So, this rules out using Hive 3.0 with the standalone metastore for us.

I’m going to see if Hive 2.0 can be run without the hive server and hadoop next.

Site Note – SchemaTool

I would just like to make a side-note that while I did manage to run the Hive Standalone Metastore without installing hadoop, I did have to install (but not run) hadoop in order to use the schematool provided with hive for creating the hive RDMBS schema.  This is due to library dependencies.

There is a “create on first run” config you can do instead of this as well but they don’t recommend using it in production; so just keep that in mind.

Useful Links

SCP/SSH With Different Private Key

If you need to use SSH or SCP with a different private key file, just specify it with -i.  For example, to copy logs from a remote server using a specific private key file and user, do the following:

scp -i C:\Users\[your-user]\.ssh\pk_file [user]@[ip-addr]:/path/logs/* .

This -i will work regardless of OS, but the example is SSHing to a Linux server from a Windows server assuming you store your private keys in your user .ssh directory.

Java – Regular/Scheduled Task, One Run at a Time

This will be a very short post, and I’m mostly writing it just so it sticks in my head better.

Common Use Cases

There are many times when you may find that you need to regularly run a task in Java.  Here are a few common examples:

  • You have a cache you need to refresh every X minutes to power a dashboard or something similar.
  • You need to prune old files from a file system once an hour.
  • You need to regularly update stats counters for monitoring.

Coding Options

There are a lot of ways to do this, but the recommended approach would be to use a scheduled executor.  Now… this part is easy to remember, but what is sometimes hard to remember is that you have two options when scheduling a task.  I often find myself picking the wrong one as it pops up in Intelli-sense and I forget there are 2 options.

  1. Run the task every X seconds/minutes/etc no matter what.
  2. Run the task every X seconds/minutes/etc *after* the previous task completed.

These two things can be very different.  If you have a task that only takes a couple of seconds, it probably doesn’t matter much.  But if you have a task that takes 2 minutes and you’re running it every 1 minute, then with option 1 you will always be running at least 2 copies of the task, whereas with option 2 you’ll just be running one copy at a time with a minute of buffer in between each task.

For both options, you can create the scheduled executor service the same way:

ScheduledExecutorService se = Executors.newSingleThreadScheduledExecutor();

But for option #1 (run every interval regardless of previous tasks), you would use this function:

se.scheduleAtFixedRate(this::refreshCache, 10, 120, TimeUnit.SECONDS);

And for option #2 (start counting after previous task completes), you would use this function.

se.scheduleWithFixedDelay(this::refreshCache, 10, 120, TimeUnit.SECONDS);