Querying LDAP From Python the Easy Way

Posted on November 2, 2022 by John Humphreys

Historically, using LDAP in Python could be fairly painful because you had to install python-ldap, which can be hard depending on your environment. For example, getting it installed in a Jupyter notebook where I work proved impossible without changing the underlying Docker image for the notebook.

Most search results will still lead you to python-ldap, but now you can and should use ldap3 instead. This library is pure Python and does not have any awkward OS dependencies. So, it “just works” and is much lighter.

Here is an example of how to log in with a service account and look up a user by email.

import ldap3

# Put in params up top.
SERVICE_ACCOUNT="<user>"
SERVICE_ACCOUNT_PASSWORD="<password>"
LDAP_URI="ldaps://<your-ldap-dns>:636"

search_base = 'DC=foo,DC=bar'
search_filter = '(&(mail=john.doe@somecompany.com))'
attrs = ["*"]

server = ldap3.Server(LDAP_URI)
with ldap3.Connection(server, auto_bind=True, user=SERVICE_ACCOUNT, password=SERVICE_ACCOUNT_PASSWORD) as conn:
    conn.search(search_base, search_filter, attributes=attrs)
    print(len(conn.entries))

To run this, you just have to do a quick pip install as shown below. I recommend you use the latest version, but I pinned it here as a reminder that pinning versions is smart in most Python projects. Version drift causes many production issues.

pip install ldap3==2.9.1
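The example above hard-codes the email address in the search filter. If the address ever comes from user input, it should be escaped first so that characters like `*` or `(` are treated as literal text rather than filter syntax. Below is a minimal standard-library sketch of that escaping, following the RFC 4515 rules as I understand them. The helper names `escape_filter_value` and `mail_filter` are hypothetical and not part of ldap3. As far as I know ldap3 also ships its own helper for this (`ldap3.utils.conv.escape_filter_chars`), so check its docs before rolling your own.

```python
# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape one value so it is treated as literal text in an LDAP filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def mail_filter(email: str) -> str:
    """Build a filter matching a single mail attribute (hypothetical helper)."""
    return f"(mail={escape_filter_value(email)})"


print(mail_filter("john.doe@somecompany.com"))
print(mail_filter("*)(uid=*"))
```

The result of `mail_filter(...)` can be passed as the `search_filter` argument in the example above.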

Using Presto/Trino CLI w/ TLS & Passwords Enabled

Posted on September 28, 2022 by John Humphreys

Not much of a post here, just recording a good way to run queries non-interactively with the Presto CLI.

./presto-cli-350-executable.jar \
      --server https://your.cluster.dns:443 \
      --catalog hive \
      --schema default \
      --client-request-timeout "10s" \
      --user "john.humphreys" \
      --password \
      --execute "select * from some_db.some_table limit 10"

The --password flag makes the CLI prompt for a password. If you need to supply it without typing, you can pipe it into the command with the yes command in Linux. I also suspect there is an environment variable you can pre-set for that, but I didn’t dig in to double check.
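If you want to drive the same command from Python, here is a rough sketch. It only builds the argument list and environment; `run_query` is defined but not called. The environment variable name is an assumption: my understanding is that the Presto CLI reads `PRESTO_PASSWORD` (and the Trino CLI reads `TRINO_PASSWORD`) when `--password` is given, but verify that against the docs for your version. The function names are hypothetical.

```python
import os
import subprocess


def build_presto_command(jar, server, user, query, catalog="hive", schema="default"):
    """Return the argument list for a non-interactive CLI run."""
    return [
        jar,
        "--server", server,
        "--catalog", catalog,
        "--schema", schema,
        "--client-request-timeout", "10s",
        "--user", user,
        "--password",
        "--execute", query,
    ]


def build_env(password, var_name="PRESTO_PASSWORD"):
    """Copy the current environment and add the password variable.

    ASSUMPTION: the CLI reads var_name instead of prompting for a password.
    """
    env = dict(os.environ)
    env[var_name] = password
    return env


def run_query(cmd, env):
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout


cmd = build_presto_command(
    "./presto-cli-350-executable.jar",
    "https://your.cluster.dns:443",
    "john.humphreys",
    "select * from some_db.some_table limit 10",
)
print(cmd)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with the SQL text.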

Python – Find IPs for DNS Name

Posted on January 13, 2022 by John Humphreys

We were recently trying to find all the IPs we needed to open in a firewall from an Apache proxy. So, we had to resolve a huge number of DNS records to IPs (and relevant ports) programmatically.

I found this very elegant way of getting all the IPs for a DNS name. I hope you find it useful!

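import socket

# Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
net_info = socket.getaddrinfo("stackoverflow.com", None)
# A set drops the duplicates returned for each socket type.
ip_list = {info[4][0] for info in net_info}
print(ip_list)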

Output:

{'151.101.129.69', '151.101.1.69', '151.101.193.69', '151.101.65.69'}
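For the firewall use case, you usually have many names and a port for each. Here is a small sketch that extends the same idea to a list of (hostname, port) pairs and returns an empty list for names that fail to resolve instead of crashing. The function names and the sample targets are hypothetical.

```python
import socket


def resolve_ips(hostname, port=None):
    """Return the sorted, de-duplicated IPs for one name, or [] if lookup fails."""
    try:
        # proto=IPPROTO_TCP avoids one result per socket type for the same address.
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def resolve_many(targets):
    """Map each (hostname, port) pair to its list of IPs."""
    return {(host, port): resolve_ips(host, port) for host, port in targets}


# Hypothetical targets; swap in your real DNS names and ports.
# A numeric address simply resolves to itself.
targets = [("127.0.0.1", 443), ("localhost", 80)]
for (host, port), ips in resolve_many(targets).items():
    print(host, port, ips)
```

Keep in mind that DNS answers can change over time, so a list generated this way is a snapshot.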

Using Athena From DBeaver with your IAM Role / Profile

Posted on November 16, 2021 by John Humphreys

I just spent about 30 minutes working out how to connect DBeaver to Athena using my normal AWS credentials file / default credentials.

Thankfully I stumbled across this GitHub comment and it worked like a charm: https://github.com/dbeaver/dbeaver/issues/3918#issuecomment-511484596.

Here are the relevant notes (slightly modified for easier understanding):

1. Do your normal AWS login process to refresh your credentials (in our case, we use okta + gimme_aws_creds for this).
2. Go to driver properties on your DBeaver Athena connection and set:
- AwsCredentialsProviderClass to com.simba.athena.amazonaws.auth.profile.ProfileCredentialsProvider
- AwsCredentialsProviderArguments equal to the name of the profile you want to use (see ~/.aws/config to see which profiles you have) – we use “default”.
3. Test Connection and it should work.
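If you are not sure which profile names you have, here is a quick sketch that lists them. It assumes the usual ~/.aws/config layout, where the default profile is a `[default]` section and every other profile is a `[profile NAME]` section. The sample text and the function name are made up for illustration.

```python
import configparser

# Made-up sample in the usual ~/.aws/config layout.
SAMPLE_CONFIG = """
[default]
region = us-east-1

[profile analytics]
region = us-west-2
"""


def list_profiles(config_text):
    """Return the profile names found in AWS config file text."""
    # interpolation=None so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    names = []
    for section in parser.sections():
        if section == "default":
            names.append("default")
        elif section.startswith("profile "):
            names.append(section[len("profile "):].strip())
    return names


print(list_profiles(SAMPLE_CONFIG))
```

To run it against your real file, pass in the contents of ~/.aws/config, for example `list_profiles((pathlib.Path.home() / ".aws" / "config").read_text())` after importing `pathlib`.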

Manually Load Docker Image – Avoid Rate Limiting

Posted on November 3, 2021 by John Humphreys

You can manually load a Docker image onto a server when you need to. This is useful to get around Docker Hub rate limiting in an urgent situation.

# On your laptop.
docker pull busybox:1.34.1
docker save busybox:1.34.1 > busybox-1-34-1.tar
aws s3 cp busybox-1-34-1.tar s3://your-s3-bucket/busybox-1-34-1.tar

# On remote node.
aws s3 cp s3://your-s3-bucket/busybox-1-34-1.tar /tmp/busybox-1-34-1.tar
docker load -i /tmp/busybox-1-34-1.tar

You can use anything that both your local host and the target host have access to. I just used s3 as it was most convenient in my case. We have SSH disabled on our production nodes; otherwise you could just copy the file across over SSH too.
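The fiddly part of this is keeping the image tag, tarball name, and S3 path consistent on both machines. Here is a small sketch that derives all the commands from one image string, so there is only one place to get it wrong. It only builds the command lists and does not run anything. The function names and bucket name are hypothetical, and it uses `docker save -o` instead of shell redirection so each command can be passed as a plain argument list.

```python
def tarball_name(image):
    """Turn an image reference like busybox:1.34.1 into busybox-1-34-1.tar."""
    return image.replace("/", "-").replace(":", "-").replace(".", "-") + ".tar"


def transfer_commands(image, bucket):
    """Return (laptop_commands, remote_commands) as lists of argument lists."""
    tar = tarball_name(image)
    s3_uri = f"s3://{bucket}/{tar}"
    laptop = [
        ["docker", "pull", image],
        ["docker", "save", "-o", tar, image],
        ["aws", "s3", "cp", tar, s3_uri],
    ]
    remote = [
        ["aws", "s3", "cp", s3_uri, f"/tmp/{tar}"],
        ["docker", "load", "-i", f"/tmp/{tar}"],
    ]
    return laptop, remote


laptop, remote = transfer_commands("busybox:1.34.1", "your-s3-bucket")
for cmd in laptop + remote:
    print(" ".join(cmd))
```

Each inner list can be handed to `subprocess.run(cmd, check=True)` on the matching machine.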

Coding Stream of Consciousness

by John Humphreys – Random code from my life.