Python PIP Install Local Module While Developing

I’m definitely still in the early stages of learning module/package building and deployment for Python.  So, take this with a grain of salt…

But I ran into a case where I wanted to develop a package locally in PyCharm while actively using it in another project (a Jupyter notebook, actually).  It turns out there’s a pretty cool way to do this.

Module Preparation

The first thing I had to do was prepare the package so that it was deployable using the standard Python distribution style.

In my case, I just made a directory for my package (lower-case name, underscore separators).  Inside the directory, I created the module code along with a setup.py file so that pip could install it.

Here’s an example.  Ignore everything that I didn’t mention; all of that is auto-generated by PyCharm and not relevant.  In fact, it probably would have been better to create a sub-directory in this project for the package; but I just considered the top level directory the package directory for now.

[Screenshot: example package layout in PyCharm]

Module Installation

Once you have your module set up like this, you can jump into your command line (assuming you have pip installed) and run this command, tailored to your package directory’s location:

λ pip install -e C:\dev\python\jupyter_audit_query_tools
Obtaining file:///C:/dev/python/jupyter_audit_query_tools
Installing collected packages: PostgresQueryRunner
Running setup.py develop for PostgresQueryRunner
Successfully installed PostgresQueryRunner

You’ll also be able to see the package mapped to that directory when you list the installed packages with pip:

λ pip list | grep postgres
postgres-query-runner 1.1.0 c:\dev\python\jupyter_audit_query_tools

Module Usage

After this, you should be able to import and use the package / modules in your interpreter or notebook.  You can change the code in the package and it will update in the places you’re using it assuming you re-import the package.  So, in Jupyter, this would mean clicking the restart-kernel/re-run button.
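
If restarting the kernel feels heavy-handed, there are lighter options: in a plain interpreter, importlib.reload re-imports a single module in place, and Jupyter also has an autoreload extension (“%load_ext autoreload” then “%autoreload 2”) that re-imports edited modules automatically.  A sketch of the reload approach, using the standard json module as a stand-in for your own editable-installed package:

```python
import importlib
import json  # stand-in: swap in your own editable-installed module

# ...edit the package source on disk, then pull the changes in:
json = importlib.reload(json)
print(json.dumps({"reloaded": True}))  # prints {"reloaded": true}
```

One caveat: reload only refreshes that one module, not names you pulled out of it with “from x import y”, which is why the kernel restart is the safe default.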

Python Dependency Management and Virtual Environments (vs Maven or NPM)

Historically, I’ve mostly used Python for automating small tasks that would otherwise have been bash scripts.  So, terms like the following were alien to me, and I didn’t really know how to manage dependencies properly in Python:

  • pip
  • freezing
  • virtual environment

The main languages I’ve used in recent memory are Java and JavaScript.  They both have a dependency manager, so I expected Python to have one too.  In Java, people generally use Maven.  In JavaScript, they generally use NPM (or Yarn).  Either way, you make a file, note down the modules you require and their versions (if you’re smart), and then run “mvn install” or “npm install” to go get all the stuff you need.

Maven is also a build system, so it’s more like NPM + WebPack in JavaScript; but nonetheless, they work similarly from a dependency management perspective.

Moving on to Python, I’ve learned the following:

Python’s version of Maven or NPM, and why it’s different:

  • pip is Python’s version of NPM or Maven.
  • However, it installs packages globally for a given Python installation, not on a per-project basis.
  • So, if I had two projects with conflicting dependencies, I could have issues because… well… everything is global.
  • In a lot of cases, people install Python modules as they need them, by adding “pip install” commands to their release notes or running them ad hoc when they’re on a server and need a new library.
  • This is clearly not a “production-ready” approach though.

Working on a per-project level:

  • Virtual environments are a bolt-on that lets you run Python in properly isolated environments.
  • You can install the tooling globally by running “pip install virtualenv”.
  • After this, for each project, you can create your own virtual environment with “virtualenv <env-name>”.  You can also specify the target Python version you want to use if you have multiple installed.
  • That command creates a folder named after the environment, with several sub-folders inside.  You activate the environment by sourcing the “activate” script in its bin folder on Linux, or running the activate.bat script in its Scripts folder on Windows.
  • Once the environment is activated, your shell prompt changes to show you’re within it.  Now if you run “pip list”, you’ll notice that you only have a few basic packages (pip and setuptools, for instance); you are shielded from all of your global system ones.
  • You can run pip installs and Python code here until your project works great (but only while the virtual environment is active).
  • Note that you should not necessarily keep your Python code inside the virtual environment folder itself.  This is similar to how you would not keep your Java code inside your local Maven repository or your JavaScript code inside node_modules.  I haven’t had experience either way with this, but I’ve seen it generally recommended in nearly all documentation I’ve come across.
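
The steps above, condensed into a runnable sketch.  I’m using the built-in venv module here, which newer Python versions ship with; “virtualenv myenv” behaves the same way once virtualenv is installed.  Linux/macOS paths shown, and “myenv” is just an example name:

```shell
# Create an isolated environment (or: virtualenv myenv)
python3 -m venv myenv

# Activate it; on Windows, run myenv\Scripts\activate.bat instead
source myenv/bin/activate

pip list        # only the base packages show up now
deactivate      # drop back to the global environment
rm -rf myenv    # clean up the demo environment
```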

Freezing your dependencies:

  • When you’re happy with it, you can run “pip freeze -l > requirements.txt” to generate a file that locks down your dependencies (the -l means local packages only, not global ones; you should run it from inside your virtual environment).
  • Then you can install these dependencies in other places (e.g. on a prod server, with automation) by running “pip install -r requirements.txt”.  This makes it quite similar to installing a JavaScript application with “npm install”, which gets its dependencies from the package.json file.
  • Again, if you were running multiple projects on the server, you might want to do this inside a virtual environment to keep things isolated/clean.
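
The freeze/restore cycle from the bullets above, as a sketch (run from inside an activated virtual environment; requirements.txt is just the conventional file name):

```shell
# Capture the environment's local packages, pinned to exact versions
pip freeze -l > requirements.txt

# Later, on another machine or in a fresh virtual environment:
pip install -r requirements.txt
```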

I probably still have a lot to learn, and I’m sure this gets more complex; I’ve used enough languages to know that it takes time to fully learn these things.  But I feel more comfortable with the idea of Python in production now that I can see how you can isolate projects and install specific dependencies from a target file.