My VI Cheat Sheet

Posted on November 27, 2018 by John Humphreys

For years, I’ve been somewhat avoiding learning any advanced features of VIM. I have always predominantly relied on desktop editors for anything complex and just use VI to do basic text modification.

Anyway, I’m finally trying to change that. So, I’ll start forcing myself to do things in VIM and will record the keys here over time. I’m just starting with one command though; so it’ll be a while before this is useful! 🙂

My Cheat Sheet

Remember, generally you want to press “esc” before doing these.

Search Forward & Backwards
- Forward = /search-term
- Backward = ?search-term
Show or Hide Line Numbers
- :set number
- :set nonumber
Edit Multiple Lines (e.g. Block Comment Lines 10-20 With #)
- :10,20s/^/#/
Clear Highlight After Search
- Run :noh (short for :nohlsearch) to clear the current highlight. A cruder trick is to search for something that won’t exist, and it will clear. For example:
  - /blahfwoeaf

Logging in Python 3 (Like Java Log4J/Logback)

Posted on November 26, 2018 by John Humphreys

What is Proper Logging?

Having a proper logger is essential for any production application. In the Java world, almost every framework automatically pulls in Logback or Log4J, and libraries tend to use SLF4J in order to be logger agnostic and to wire up to these loggers. So, I set out to see how to do similar logging in Python.

While it can get fancier, I think the following things are essential when setting up a logger; so they were what I was looking for:

- It should be externally configured from a file that your operations team can change.
- It should write to a file automatically, not just console.
- It should roll the file it writes to at a regular size (date or time rolling on top of that can be beneficial too; but the size restriction ensures you won’t fill up your disk with a ton of logs and break your applications).
- It should keep a history of a few previous rolled files to aid debugging.
- It should use a format that specifies both the time of the logs and the class that logged them.

On top of these, obviously we must be able to log at different levels and filter out which logs go to the file easily. This way, when we have issues, operations can jack up the logging level and figure out what is going wrong as needed.

How Do We Do it in Python 3?

It turns out that Python actually has a strong logging library built into its core distribution. The only extra library I had to add to use it was PyYAML, and even that could have been avoided (Python supports JSON out of the box and that could be used instead, but people seem to prefer YAML configuration in the community).

In the place where your app starts up, write the following code. Note that you have to install the PyYAML module yourself. Also, because the file is opened with a relative path, this expects “logging.yaml” to be in the directory you launch the app from, which is usually the same directory as the startup code (change that if you like though). We’ll show the “logging.yaml” content lower.

import logging
import logging.config
import yaml

# Initialize the logger once as the application starts up.
with open("logging.yaml", 'rt') as f:
    config = yaml.safe_load(f.read())
logging.config.dictConfig(config)

# Get an instance of the logger and use it to write a log!
# Note: Do this AFTER the config is loaded above or it won't use the config.
logger = logging.getLogger(__name__)
logger.info("Configured the logger!")

Then, when you want to use the logger in other modules, simply do this:

import logging
logger = logging.getLogger(__name__)  # named per module; inherits the root config
logger.info("Using the logger from another module.")

Of course, you only need to import logging and create the logger once at the top of each module, not every time you write a log.
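To see why a per-module logger picks up the central configuration, here is a minimal sketch. It assumes an in-memory stream in place of the log file so it runs anywhere, and "myapp.db" is a hypothetical module name standing in for whatever __name__ would be in a real module.

```python
import io
import logging

# Stand-in for the file handler: capture output in memory so the
# sketch runs without touching disk.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))

# Configure the root logger once, as the startup code would.
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(handler)

# In a real module this would be logging.getLogger(__name__);
# "myapp.db" is a hypothetical module name.
module_logger = logging.getLogger("myapp.db")
module_logger.info("Using the logger from another module.")
module_logger.debug("Not written: the root logger is at INFO.")

captured = stream.getvalue()
print(captured)
```

The module logger has no handler or level of its own, so its records travel up to the root logger, which applies the level and writes them out.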

This code uses “logging.yaml” which contains the following settings. Note that:

- It defines a formatter with the time, logger name (normally the module name), level name, and the logging message.
- It defines a rotating file handler which writes to my.log and rolls the file at 10MB, keeping 5 historic copies. The handler is set up to use our simple format from above.
- The “root” logger writes to our handler and only lets messages at INFO or above through.
- The handler is set to DEBUG, so if the root logger is switched to DEBUG during an investigation, the handler will let those messages through to the log file.

Here is the “logging.yaml” example file:

---
version: 1
disable_existing_loggers: False

# Define format of output logs (named 'simple').
formatters:
    simple:
        format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

handlers:

    # Create rotating file handler using 'simple' format.
    file_handler:
        class: logging.handlers.RotatingFileHandler
        level: DEBUG
        formatter: simple
        filename: my.log
        maxBytes: 10485760 # 10MB
        backupCount: 5
        encoding: utf8

root:

    level: INFO
    handlers: [file_handler]
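As mentioned earlier, the same setup can be done with JSON and no PyYAML. The sketch below is one way that might look, mirroring the settings described above (handler at DEBUG, root at INFO). The configure_logging helper, the "demo" logger name, and the temporary log path are assumptions made for illustration only.

```python
import json
import logging
import logging.config
import os
import tempfile

# The settings described above, written as JSON (standard library only).
# The filename is filled in at runtime so the sketch can use a temp directory.
CONFIG_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "simple": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
  },
  "handlers": {
    "file_handler": {
      "class": "logging.handlers.RotatingFileHandler",
      "level": "DEBUG",
      "formatter": "simple",
      "filename": "PLACEHOLDER",
      "maxBytes": 10485760,
      "backupCount": 5,
      "encoding": "utf8"
    }
  },
  "root": {"level": "INFO", "handlers": ["file_handler"]}
}
"""


def configure_logging(log_path):
    """Hypothetical helper: load the JSON config and point it at log_path."""
    config = json.loads(CONFIG_JSON)
    config["handlers"]["file_handler"]["filename"] = log_path
    logging.config.dictConfig(config)


log_path = os.path.join(tempfile.mkdtemp(), "my.log")
configure_logging(log_path)

logger = logging.getLogger("demo")
logger.debug("dropped while root is INFO")
logger.info("kept")

# Operations switches the root logger to DEBUG during an investigation.
logging.getLogger().setLevel(logging.DEBUG)
logger.debug("kept after switching to DEBUG")

logging.shutdown()  # flush and close the handler before reading the file
with open(log_path, encoding="utf8") as f:
    lines = f.read().splitlines()
print(lines)
```

Only the root level had to change for DEBUG messages to reach the file, because the handler was already permissive enough to accept them.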

References

The code and YAML for this was adapted from this very good blog which I recommend reading: https://fangpenlin.com/posts/2012/08/26/good-logging-practice-in-python/.

Python Dictionary Comprehension For Multi Level Caching

Posted on November 21, 2018 by John Humphreys

What’s the Use Case?

I was coding a multi-level cache in Python and came across dictionary comprehensions. It turns out they are very useful for this! So, it’s a nice example to teach the feature.

Let’s say our cache is like a database schema layout:

database_1
1. table_1
  1. column_1 – type 1
  2. column_2 – type 2
2. table_2
  1. column_1 – type 1
  2. column_2 – type 2
database_2
1. table_1
  1. column_1 – type 1

What’s a Dictionary Comprehension?

A dictionary comprehension basically lets you create a dictionary out of an expression. So, you can essentially say “for each value in this list create a key for my dictionary where the value is X if some condition is true”.

A couple things to note:

- You can technically provide any number of input iterables. So, if you want to form a dictionary from multiple sources, it can work; but I’m not going to get into that; google elsewhere!
- You can provide any number of “if” clauses to prune the results down. You could achieve the same with a single combined condition, but using one clause per condition is neater to write.

A quick example:

>>> some_list = [1,3,8,12]
>>> {key: key * key for key in some_list if key * key % 2 == 0}
{8: 64, 12: 144}

So, here we can see that we took in a list of 4 numbers and made a dictionary out of the number and its square, but only kept the results where the square was even.

The first part “key: key * key” is really just a key: value pair. The key is on the left, and the value (here key * key) can be any expression you want on the right. You can call “key” anything you like in the “for key” section. The “in some_list” is the source collection where our input comes from; again, you can have multiple of these. Finally, the “if key * key % 2 == 0” is a filter condition which, again, you can have multiple of.
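For the two variations mentioned in passing (several input iterables and several “if” clauses), a small sketch may help. The names rows, cols, products, and filtered are made up for this illustration.

```python
# Two input iterables: every (row, col) pair becomes a key.
rows = [1, 2, 3]
cols = [10, 20]
products = {(r, c): r * c for r in rows for c in cols}

# Two "if" clauses: both must hold for an entry to be kept.
some_list = [1, 3, 8, 12]
filtered = {key: key * key for key in some_list
            if key * key % 2 == 0
            if key > 8}

print(products)
print(filtered)
```

The “for” clauses nest left to right like ordinary nested loops, and stacked “if” clauses behave as if they were joined with “and”.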

Why is it Useful For Multi-Level Caching?

In our database example, we must first query the list of databases, then query the list of tables for any database we care about, then query the list of columns for any table we care about.

We don’t want to cache things we don’t need.

So, first off, it would be nice to prime the cache with just the database names and empty table values like so. If the cache is already populated, we just return its top level keys which are the database names:

if cache is None:
    cache = {database: None for database in get_database()}
return list(cache.keys())

Now, what about when the user goes to list the tables in a database?

if cache[database_name] is None:
    cache[database_name] = {table: None for table in get_tables(database_name)}
return list(cache[database_name].keys())

Finally, what about when the user goes to list the columns in a table?

if cache[database_name][table_name] is None:
    cache[database_name][table_name] = get_columns(database_name, table_name)
return cache[database_name][table_name]

So, we can see here that it was trivial to use dictionary comprehensions to turn a list into a dictionary with empty values as a utility while building the multi level cache out – which is very cool.

This might not have been the best way to build a cache – but it seems pretty simple and efficient to me. Building classes around things is usually a better approach though admittedly :).
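For what the class-based version might look like, here is a rough sketch built from the three snippets above. The SchemaCache name, the injected fetch_* callables, and the fake SCHEMA data are all assumptions for illustration; real code would run actual queries instead.

```python
class SchemaCache:
    """Lazily caches databases -> tables -> columns.

    The three fetch callables are hypothetical stand-ins for real queries.
    """

    def __init__(self, fetch_databases, fetch_tables, fetch_columns):
        self._fetch_databases = fetch_databases
        self._fetch_tables = fetch_tables
        self._fetch_columns = fetch_columns
        self._cache = None  # None means "nothing loaded yet"

    def databases(self):
        if self._cache is None:
            self._cache = {db: None for db in self._fetch_databases()}
        return list(self._cache.keys())

    def tables(self, database_name):
        self.databases()  # make sure the top level is primed
        if self._cache[database_name] is None:
            self._cache[database_name] = {
                table: None for table in self._fetch_tables(database_name)
            }
        return list(self._cache[database_name].keys())

    def columns(self, database_name, table_name):
        self.tables(database_name)  # make sure the table level is primed
        if self._cache[database_name][table_name] is None:
            self._cache[database_name][table_name] = self._fetch_columns(
                database_name, table_name
            )
        return self._cache[database_name][table_name]


# Fake backend using the example layout; it records every call it receives.
SCHEMA = {
    "database_1": {
        "table_1": {"column_1": "type 1", "column_2": "type 2"},
        "table_2": {"column_1": "type 1", "column_2": "type 2"},
    },
    "database_2": {"table_1": {"column_1": "type 1"}},
}
calls = []


def fetch_databases():
    calls.append("databases")
    return list(SCHEMA)


def fetch_tables(db):
    calls.append(("tables", db))
    return list(SCHEMA[db])


def fetch_columns(db, table):
    calls.append(("columns", db, table))
    return SCHEMA[db][table]


cache = SchemaCache(fetch_databases, fetch_tables, fetch_columns)
first = cache.columns("database_1", "table_2")
second = cache.columns("database_1", "table_2")  # served from the cache
print(first, calls)
```

The second lookup triggers no further fetches, and nothing is ever loaded for database_2 or table_1 because nobody asked for them.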

The Python yield keyword explained

Posted on November 19, 2018 by John Humphreys

I don’t usually re-blog posts, but this person’s post is a wonderful explanation of what yield does in python, and I definitely recommend reading through it.

Python Tips

Hi there folks. Again welcome to yet another useful tutorial. This is again a stackoverflow answer. This one is related to the Python yield keyword. It explains you what yield, generators and iterables are. So without wasting any time lets continue with the answer.

To understand what yield does,


Postgres Schema Creation

Posted on November 19, 2018 by John Humphreys

Historically, I have not worked with Postgres much. So, when I had to start using it, one of my first questions was how to create a schema, and how to use it for my new tables and such.

Creating a schema is exactly what you expect:

create schema myschema;

But using it is not quite what I expected. Of course, you can do the standard thing when you’re managing your objects and use schema.object notation like this:

create table myschema.mytable (x int);

But what if you just want:

create table mytable (x int);

to go into myschema by default? To do this in Postgres, you have to add the schema to your search path. By default your search path is "$user", public, which in practice means the public schema unless a schema named after your user exists; you can view it like this:

SHOW search_path;

You can actually set it to one or more schemas. The first schema in the list that contains the named table is the one your query takes it from. The first schema in the list is also the default one when you create new objects. So, if you did this:

SET search_path TO myschema;
create table mytable (x int);

Then your table would in fact be created in the “myschema” schema properly.
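The lookup rule can be pictured with a toy model in Python. This is only an illustration of the rule as described above, not how Postgres implements it; resolve_table, create_table, and the catalog dictionary are invented for the sketch.

```python
def resolve_table(search_path, catalog, table_name):
    """Return the first schema on the search path that contains table_name."""
    for schema in search_path:
        if table_name in catalog.get(schema, set()):
            return schema
    return None


def create_table(search_path, catalog, table_name):
    """Create an unqualified table in the first schema on the search path."""
    target = search_path[0]
    catalog.setdefault(target, set()).add(table_name)
    return target


# Toy catalog: schema name -> set of table names it holds.
catalog = {"public": {"mytable"}, "myschema": set()}
search_path = ["myschema", "public"]

before = resolve_table(search_path, catalog, "mytable")      # only public has it
created_in = create_table(search_path, catalog, "mytable")   # lands in myschema
after = resolve_table(search_path, catalog, "mytable")       # myschema now wins
print(before, created_in, after)
```

With myschema first on the path, a new unqualified table lands there, and from then on it shadows the table of the same name in public.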

Coding Stream of Consciousness

by John Humphreys – Random code from my life.

Monthly Archives: November 2018