Python Dictionary Comprehension For Multi-Level Caching

What’s the Use Case?

I was coding a multi-level cache in Python and came across dictionary comprehensions.  It turns out they are very useful for this! So, it’s a nice example to teach the feature.

Let’s say our cache is like a database schema layout:

  1. database_1
    1. table_1
      1. column_1 – type 1
      2. column_2 – type 2
    2. table_2
      1. column_1 – type 1
      2. column_2 – type 2
  2. database_2
    1. table_1
      1. column_1 – type 1

What’s a Dictionary Comprehension?

A dictionary comprehension lets you build a dictionary from an expression evaluated over an iterable.  So, you can essentially say “for each item in this list, add a key to my dictionary whose value is X, but only if some condition is true”.

A couple things to note:

  • You can technically provide any number of input iterables; each one is another “for” clause, and they behave like nested loops. So, if you want to form a dictionary from multiple sources, it can work; but I’m not going to get into that in detail; google elsewhere!
  • You can provide any number of “if” clauses to prune the results down.  You could achieve the same thing with a single “if” joined by “and”, but using one clause for each condition is neater to write.

A quick example:

>>> some_list = [1,3,8,12]
>>> {key: key * key for key in some_list if key * key % 2 == 0}
{8: 64, 12: 144}

So, here we can see that we took in a list of 4 numbers and made a dictionary out of the number and its square, but only kept the results where the square was even.

The first part “key: key * key” is really just a key : value pair.  The key is on the left, and the value (here key * key) on the right could be any expression you want.  You can call “key” anything you like in the “for key” section.  The “in some_list” is the source collection where our input comes from; again, you can have multiple of these.  Finally, the “if key * key % 2 == 0” is a filter condition which, again, you can have multiple of.
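For the curious, here is a rough sketch of what multiple “for” and “if” clauses look like together. The names `databases`, `tables`, and `pairs` are made up for illustration; the sample data just mirrors the schema layout from earlier.

```python
# Hypothetical sample data, purely for illustration.
databases = ["database_1", "database_2"]
tables = ["table_1", "table_2", "tmp_table"]

# Two "for" clauses act like nested loops (the first is the outer loop).
# Two "if" clauses are combined as if joined with "and".
pairs = {
    (database, table): None
    for database in databases
    for table in tables
    if not table.startswith("tmp_")
    if database != "database_2" or table == "table_1"
}

for key in pairs:
    print(key)
```

This keeps both real tables for database_1, only table_1 for database_2, and drops anything starting with “tmp_”.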

Why is it Useful For Multi-Level Caching?

In our database example, we must first query the list of databases, then query the list of tables for any database we care about, then query the list of columns for any table we care about.

We don’t want to cache things we don’t need.

So, first off, it would be nice to prime the cache with just the database names and empty table values like so. If the cache is already populated, we just return its top level keys which are the database names:

if cache is None:
    cache = {database: None for database in get_database()}
return list(cache.keys())
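As an aside, when every value is the same placeholder, the built-in `dict.fromkeys` should give the same result as the comprehension. A quick check, with a hard-coded list (`database_names`, an assumption) standing in for whatever `get_database()` returns:

```python
# Stand-in for the result of get_database(); a real call would hit the server.
database_names = ["database_1", "database_2"]

via_comprehension = {database: None for database in database_names}
via_fromkeys = dict.fromkeys(database_names)  # the value defaults to None

print(via_comprehension == via_fromkeys)
```

The comprehension is still the more flexible of the two, since the value can depend on the key and you can filter as you go.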

Now, what about when the user goes to list the tables in a database?

if cache[database_name] is None:
    cache[database_name] = {table: None for table in get_tables(database_name)}
return list(cache[database_name].keys())

Finally, what about when the user goes to list the columns in a table?

if cache[database_name][table_name] is None:
    cache[database_name][table_name] = get_columns(database_name, table_name)
return cache[database_name][table_name]

So, we can see here that it was trivial to use dictionary comprehensions to turn a list into a dictionary with empty (None) values as a utility while building the multi-level cache out, which is very cool.

This might not have been the best way to build a cache, but it seems pretty simple and efficient to me. Admittedly, building classes around things is usually a better approach :).
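For what it's worth, here is one way the class-based version might look. This is only a sketch under assumptions: `SchemaCache`, the three injected fetcher functions, and the fake `SCHEMA` backend are all hypothetical names invented for the example, not part of any real library.

```python
class SchemaCache:
    """Lazily caches database -> table -> column listings."""

    def __init__(self, get_databases, get_tables, get_columns):
        # The fetchers are injected; they stand in for real queries.
        self._get_databases = get_databases
        self._get_tables = get_tables
        self._get_columns = get_columns
        self._cache = None  # None means "nothing fetched yet"

    def databases(self):
        if self._cache is None:
            self._cache = {db: None for db in self._get_databases()}
        return list(self._cache)

    def tables(self, database_name):
        self.databases()  # make sure the top level is primed
        if self._cache[database_name] is None:
            self._cache[database_name] = {
                table: None for table in self._get_tables(database_name)
            }
        return list(self._cache[database_name])

    def columns(self, database_name, table_name):
        self.tables(database_name)  # make sure the table level is primed
        if self._cache[database_name][table_name] is None:
            self._cache[database_name][table_name] = self._get_columns(
                database_name, table_name
            )
        return self._cache[database_name][table_name]


# Fake backend that records every call, so we can see what was fetched.
SCHEMA = {
    "database_1": {
        "table_1": {"column_1": "type 1", "column_2": "type 2"},
        "table_2": {"column_1": "type 1", "column_2": "type 2"},
    },
    "database_2": {"table_1": {"column_1": "type 1"}},
}
calls = []


def fake_get_databases():
    calls.append("databases")
    return list(SCHEMA)


def fake_get_tables(database_name):
    calls.append(("tables", database_name))
    return list(SCHEMA[database_name])


def fake_get_columns(database_name, table_name):
    calls.append(("columns", database_name, table_name))
    return dict(SCHEMA[database_name][table_name])


cache = SchemaCache(fake_get_databases, fake_get_tables, fake_get_columns)
cache.columns("database_1", "table_2")
cache.columns("database_1", "table_2")  # second call is served from the cache
print(calls)
```

Each level is fetched at most once and only when asked for, so database_2 and table_1 are never queried here. An unknown database or table name simply raises a KeyError in this sketch; a real version would probably want friendlier handling.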