Python Loop Index Variable Scope

While crash-studying Python for a new job, I found out that this code is actually not an error!

for i in [1, 2, 3]:
    pass  # Do nothing.
print(i)

It blew my mind that this code actually prints 3. It turns out Python keeps the loop variable around after the loop exits; it is not scoped to the loop body, so it still holds the last value it was assigned.
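To make the behavior above concrete, here is a small sketch. The helper name last_loop_value is my own, purely for illustration. It also shows an edge case: if the iterable is empty, the loop variable is never assigned at all, so reading it afterwards raises an error instead of returning a stale value.

```python
def last_loop_value(items):
    """Return the loop variable's value after the loop, or None if the loop never ran."""
    for item in items:
        pass  # the body does nothing; we only care about the name binding
    try:
        return item  # still bound to the last element that was iterated
    except NameError:  # UnboundLocalError is a subclass of NameError
        return None  # empty iterable: the name was never assigned


print(last_loop_value([1, 2, 3]))  # 3
print(last_loop_value([]))         # None
```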

I found this in the Python documentation, but it is described much better in this blog post: https://eli.thegreenplace.net/2015/the-scope-of-index-variables-in-pythons-for-loops/.

I highly recommend reading that link, as it has lots of good info (thanks to Eli Bendersky). But in case you’re lazy, here’s a historical anecdote quoted from it that I particularly liked:

“Why this is so

I actually asked Guido van Rossum about this behavior and he was gracious enough to reply with some historical background (thanks Guido!). The motivation is keeping Python’s simple approach to names and scopes without resorting to hacks (such as deleting all the values defined in the loop after it’s done – think about the complications with exceptions, etc.) or more complex scoping rules.

In Python, the scoping rules are fairly simple and elegant: a block is either a module, a function body or a class body. Within a function body, names are visible from the point of their definition to the end of the block (including nested blocks such as nested functions). That’s for local names, of course; global names (and other nonlocal names) have slightly different rules, but that’s not pertinent to our discussion.

The important point here is: the innermost possible scope is a function body. Not a for loop body. Not a with block body. Python does not have nested lexical scopes below the level of a function, unlike some other languages (C and its progeny, for example).

So if you just go about implementing Python, this behavior is what you’ll likely to end with. Here’s another enlightening snippet:

for i in range(4):
    d = i * 2
print(d)

Would it surprise you to find out that d is visible and accessible after the for loop is finished? No, this is just the way Python works. So why would the index variable be treated any differently?

By the way, the index variables of list comprehensions are also leaked to the enclosing scope. Or, to be precise, were leaked, before Python 3 came along.”

And for those like me who didn’t know, Guido van Rossum is the author of the Python programming language.

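Oh, and by the way, according to the Python documentation here, you can build a list without leaving a loop variable behind by using map with a lambda, since the lambda’s parameter exists only inside the lambda: https://docs.python.org/3.6/tutorial/datastructures.html.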

For example:

squares = list(map(lambda x: x**2, range(10)))
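As a quick check of that claim, the sketch below builds the list both ways the tutorial mentions (map with a lambda, and a list comprehension, which in Python 3 does not leak either) and then looks at which local names exist afterwards. The function name squares_without_leak is made up for this example.

```python
def squares_without_leak():
    """Build the squares two ways and report which local names exist afterwards."""
    squares = list(map(lambda x: x**2, range(10)))
    same_squares = [y**2 for y in range(10)]  # Python 3: y stays inside the comprehension
    visible_names = sorted(locals())          # neither x nor y shows up here
    return squares, same_squares, visible_names


squares, same_squares, visible_names = squares_without_leak()
print(squares == same_squares)  # True
print(visible_names)            # ['same_squares', 'squares']
```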

Python Dependency Management and Virtual Environments (vs Maven or NPM)

Historically, I’ve mostly used Python for automating small tasks that would otherwise have been bash scripts. So terms like the following were alien to me, and I didn’t really know how to manage dependencies properly in Python.

  • pip
  • freezing
  • virtual environment

The main languages I’ve used in recent memory were Java and JavaScript. They both have a dependency manager, so I expected Python to have one. In Java, people generally use Maven. In JavaScript, they generally use NPM (or Yarn). Either way, you make a file, note down the modules you require and their versions (if you’re smart), and then run “mvn install” or “npm install” to go get all the stuff you need.

Maven is also a build system, so it’s more like NPM + webpack in JavaScript; nonetheless, they work similarly from a dependency management perspective.

Moving on to Python, I’ve learned the following:

Python’s version of Maven or NPM, and why it’s different:

  • pip is Python’s version of NPM or Maven.
  • However, by default it installs packages globally for a given Python installation, not on a per-project basis.
  • So, if I had 2 projects with conflicting dependencies, I could have issues because… well… everything is global.
  • In a lot of cases, people install Python modules as they need them, either by adding a “pip install” step to their release notes or by running it by hand when they’re hacking on a server and need a new library.
  • This is clearly not a “production-ready” solution though.

Working on a per-project level:

  • Virtual environments are a bolt-on that allows you to properly run python in isolated environments.
  • You can install the module for working with virtual environments globally by running “pip install virtualenv”.
  • After this, for each project, you can create your own virtual environment with “virtualenv <env-name>”. If you have multiple Python versions installed, you can also specify which one the environment should use.
  • You activate a virtual environment by sourcing the “activate” script in its bin folder on Linux, or by running the “activate.bat” script in its Scripts folder on Windows. The prior command will have created a folder with the environment name containing several sub-folders, one of which holds these scripts.
  • Once the environment is activated, your shell prompt will change to show you’re within it. Now if you run “pip list”, you’ll notice that you only have a handful of basic packages (such as pip and setuptools); you are shielded from all of your global system ones.
  • You can run pip installs and python code here until your project works great (but only while you’re in the virtual environment).
  • Note that you should not necessarily keep your python code in your virtual environment.  This is probably similar to how you should not keep your Java code inside your maven directory or your JavaScript code inside your NPM directory.  I haven’t had experience either way with this, but I’ve seen it generally recommended in nearly all documentation I’ve come across.
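The steps above use the virtualenv package. As a rough sketch of what gets created, the snippet below uses the venv module from the standard library instead (available since Python 3.3, and not what the steps above install), so it runs without installing anything. The helper name create_env and the environment name demo-env are placeholders of mine, and with_pip=False is only there to keep the example quick and offline; a real project environment would want pip.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(parent_dir, name="demo-env"):
    """Create a bare virtual environment and return (env_dir, scripts_dir)."""
    env_dir = Path(parent_dir) / name
    # with_pip=False skips bootstrapping pip, which keeps this sketch fast.
    venv.create(env_dir, with_pip=False)
    # The activate scripts live in "Scripts" on Windows and "bin" elsewhere.
    scripts_dir = env_dir / ("Scripts" if sys.platform == "win32" else "bin")
    return env_dir, scripts_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir, scripts_dir = create_env(tmp)
    print((env_dir / "pyvenv.cfg").exists())              # the environment's config file
    print(sorted(p.name for p in scripts_dir.iterdir()))  # activate scripts, python, ...
```

Your own code lives outside env_dir; the environment folder only holds the interpreter and the installed packages.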

Freezing your dependencies:

  • When you’re happy with it, you can do “pip freeze -l > requirements.txt” in order to generate a file that locks down your dependencies (the -l means just local ones, not global – and you should do it from your virtual environment).
  • Then you can install these in other places (e.g. on a prod server with automation) by doing “pip install -r requirements.txt”.  This makes it quite similar to installing a JavaScript application with npm install (which would get dependencies from the package.json file).
  • Again, if you were running multiple projects on the server, you might want to do this in a virtual environment to keep things isolated/clean.
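To show what “locking down” looks like, pip freeze writes one “name==version” line per installed package. Here is a minimal sketch that reads such lines back; the package names and versions in SAMPLE_REQUIREMENTS are illustrative placeholders rather than output from a real project, and parse_pins is a hypothetical helper, not part of pip.

```python
SAMPLE_REQUIREMENTS = """\
# illustrative pins only; not taken from a real project
requests==2.31.0
flask==3.0.0
"""


def parse_pins(text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        name, sep, version = line.partition("==")
        if sep:  # ignore blank lines and anything that is not an exact pin
            pins[name.strip()] = version.strip()
    return pins


print(parse_pins(SAMPLE_REQUIREMENTS))
# {'requests': '2.31.0', 'flask': '3.0.0'}
```

Because every line names an exact version, “pip install -r requirements.txt” should reproduce the same set of packages on another machine.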

I probably have a lot to learn still, and I’m sure this gets more complex; I’ve used enough languages to know that it takes time to fully learn these things. But I feel more comfortable with the idea of Python in production now that I can see how you can isolate projects and install specific dependencies from a target file.

Python Multiple Assignment

The Python tutorial uses this Fibonacci example (https://docs.python.org/3.6/tutorial/introduction.html):

>>> a, b = 0, 1
>>> while b < 1000:
...     print(b, end=',')
...     a, b = b, a+b
...
1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,

I’m quite an experienced programmer, and I found this difficult to digest even though it’s rather simple. The multiple assignment lines threw me off a bit.

In Java and similar languages, you would do this:

int x = 5, y = 7, z = 2;

to assign multiple values, so each variable sits right next to its value. In Python this is not the case.

In Python, you list all the target variables first, then all the values. So, a clearer example would be:

x, y, z = 5, 7, 2

This would provide the same assignments as the Java example does. This seems quirky but maybe I’m just too used to C-style languages :).

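So, in the initial Python example, ‘a’ starts as 0 and ‘b’ starts as 1. On every cycle the whole right-hand side is evaluated first, using the old values, and only then are the results assigned: ‘a’ becomes the old ‘b’ and ‘b’ becomes the old ‘a’ + ‘b’, which makes sense.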
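The part that tripped me up is easiest to see side by side. In this small sketch (the function names are mine), the first function is the one-line form from the tutorial and the second is the naive line-by-line translation, which gives a different answer because ‘a’ has already been overwritten by the time ‘b’ is computed.

```python
def step_simultaneous(a, b):
    """One Fibonacci step using multiple assignment."""
    a, b = b, a + b  # right-hand side is evaluated first, from the old a and b
    return a, b


def step_sequential(a, b):
    """The naive line-by-line translation, which is NOT equivalent."""
    a = b      # a is overwritten here...
    b = a + b  # ...so this adds the new a, not the old one
    return a, b


print(step_simultaneous(0, 1))  # (1, 1)
print(step_sequential(0, 1))    # (1, 2)
```

The same rule is why “a, b = b, a” swaps two variables without a temporary.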