Python Dependency Management and Virtual Environments (vs Maven or NPM).

Historically, I’ve mostly used Python for automating small tasks that would otherwise have been bash scripts. So, terms like the following were alien to me, and I didn’t really know how to manage dependencies properly in Python.

  • pip
  • freezing
  • virtual environment

The main languages I’ve used in recent memory are Java and JavaScript.  They each have a dependency manager, so I expected Python to have one too.  In Java, people generally use Maven.  In JavaScript, they generally use NPM (or Yarn).  Either way, you create a file, note down the modules you require and their versions (if you’re smart), and then run “mvn install” or “npm install” to go fetch all the stuff you need.

Maven is also a build system, so it’s more like NPM + webpack in JavaScript; but nonetheless, they work similarly from a dependency management perspective.

Moving on to Python, I’ve learned the following:

Python’s version of Maven or NPM, and why it’s different:

  • pip is Python’s version of NPM or Maven.
  • However, it installs packages globally for the Python version, not on a per-project basis.
  • So, if I had two projects with conflicting dependencies, I could have issues because… well… everything is global.
  • In a lot of cases, people install Python modules as they need them, by adding “pip install” lines to their release notes or running it ad hoc when they’re hacking on a server and need a new library.
  • This is clearly not a “production-ready” approach though.
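The global behaviour described above is easy to see for yourself. A minimal sketch (the exact package list will differ from machine to machine):

```shell
# pip belongs to a specific Python interpreter and installs into that
# interpreter's global site-packages by default, shared by every project.
python3 -m pip --version   # shows which Python this pip installs for
python3 -m pip list        # one global package list for all projects
```

Using “python3 -m pip” rather than a bare “pip” makes it explicit which interpreter you’re installing for, which matters once several Pythons live on one machine.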

Working on a per-project level:

  • Virtual environments are a bolt-on that lets you run Python in properly isolated, per-project environments.
  • You can install the module for working with virtual environments globally by running “pip install virtualenv”.
  • After this, for each project, you can create your own virtual environment with “virtualenv <env-name>”.  You can also specify the target Python version you want to use if you have multiple installed, etc.
  • You activate a virtual environment by sourcing the “activate” bash script (Linux/macOS) or running the “activate.bat” batch script (Windows).  The previous command will have created a folder named after the environment with several sub-folders; the activate scripts live in the “bin” folder on Linux/macOS and the “Scripts” folder on Windows.
  • Once the environment is activated, your shell prompt changes to show you’re inside it.  Now if you run “pip list”, you’ll notice you only have three basic packages (pip, setuptools, and wheel); you are shielded from all of your global system ones.
  • You can run pip installs and Python code here until your project works great (but only while you’re in the virtual environment).
  • Note that you generally should not keep your Python code inside the virtual environment folder itself.  This is similar to how you shouldn’t keep your Java code inside your local Maven repository or your JavaScript code inside the node_modules directory.  I haven’t had experience either way with this, but I’ve seen it recommended in nearly all the documentation I’ve come across.
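The steps above, end to end, look roughly like this (“myenv” is a placeholder name; this assumes virtualenv has already been installed globally):

```shell
# Per-project isolation sketch using virtualenv.
pip install virtualenv        # one-time global install of the tool itself
virtualenv myenv              # creates ./myenv (bin/ on Linux, Scripts\ on Windows)
source myenv/bin/activate     # Windows: myenv\Scripts\activate.bat
pip list                      # only the basic bootstrap packages now
deactivate                    # drop back to the global environment
```

Note that recent Pythons also ship a built-in “python3 -m venv myenv” that works the same way for this purpose, without the global “pip install virtualenv” step.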

Freezing your dependencies:

  • When you’re happy with it, you can run “pip freeze -l > requirements.txt” to generate a file that locks down your dependencies (the -l means local packages only, not global ones, and you should run it from inside your virtual environment).
  • Then you can install these in other places (e.g. on a prod server, with automation) by running “pip install -r requirements.txt”.  This is quite similar to installing a JavaScript application with “npm install” (which gets dependencies from the package.json file).
  • Again, if you’re running multiple projects on the same server, you’d want to do this inside a virtual environment to keep things isolated and clean.
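The freeze/restore cycle above can be sketched as follows (run from inside an activated virtual environment; requirements.txt lands wherever you run the command):

```shell
# Lock the environment's current packages to exact versions...
pip freeze -l > requirements.txt   # -l / --local: skip globally inherited packages
cat requirements.txt               # pinned "package==version" lines
# ...then later, in a fresh environment or on a prod server:
pip install -r requirements.txt
```

Checking requirements.txt into version control alongside the project is the usual convention, so any machine can rebuild the same dependency set.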

I probably still have a lot to learn, and I’m sure this gets more complex; I’ve used enough languages to know that it takes time to fully learn these things.  But I feel more comfortable with the idea of Python in production now that I can see how to isolate projects and install specific dependencies from a target file.