I have been having an embarrassingly hard time getting a handle on package imports in Python. I’ll get something working, only to have it break inexplicably when I make what seems to be an incidental change. Tests will run in one directory but not another, then inadvertently start working, only to stop a few days or minutes later. I’ve tried to be methodical in investigating what changes lead to what behavior, but it’s been difficult. To put this to rest, I’m going to investigate and record all the behavior I can isolate regarding package imports, and hopefully make some sense of what’s going on.
First, let’s start with a project I’m calling ‘backend’. Here’s the file structure:
backend/
    backend/
        __init__.py
        analyzer.py
        tests/
            __init__.py
            test_analyzer.py
PYTHONPATH is pointing to the directory containing the backend package, but not to the backend package itself (which contains
Let’s open up a terminal and play around:
Very cool. Now let’s cd into the backend package and see if anything changes:
Let’s see what happens when we remove the path to the project from our
And into Python, from
Ah. Now we’re getting somewhere. So you can import a package that is within your current working directory without having that package in your PYTHONPATH, as a local import. From anywhere else, you’ll need your PYTHONPATH to point to it.
To double-check, let’s cd all the way to / and try to import:
So, a package must be contained in a directory on your PYTHONPATH to be able to import it from anywhere other than the directory immediately above it.
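To check this outside of one particular setup, here is a minimal sketch. It assumes nothing about the real project: demo_pkg and the project directory are made-up names created in a temporary directory, and each import is attempted in a fresh interpreter (python -c) so that the working directory and PYTHONPATH are the only things that vary.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def can_import(name, cwd, pythonpath=None):
    """Return True if `import name` succeeds in a fresh interpreter started in cwd."""
    # Start from the current environment, minus the variables that affect sys.path.
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    if pythonpath is not None:
        env["PYTHONPATH"] = str(pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", f"import {name}"],
        cwd=str(cwd), env=env, capture_output=True,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"    # hypothetical directory containing the package
    package = project / "demo_pkg"     # hypothetical package
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")

    results = {
        "from project dir, no PYTHONPATH": can_import("demo_pkg", project),
        "from elsewhere, no PYTHONPATH": can_import("demo_pkg", tmp),
        "from elsewhere, PYTHONPATH set": can_import("demo_pkg", tmp, project),
    }

for label, ok in results.items():
    print(f"{label}: {'imports' if ok else 'ImportError'}")
```

The first case works because python -c (like the interactive prompt) puts the current directory at the front of sys.path; PYTHONPATH entries are added right after it.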
To verify, let’s try changing our
Here, we’ve pointed it to the package itself, not to the containing directory. Let’s try importing it from
Makes sense. But what happens if we cd up one directory above
Strange. I would have expected this import to have failed, but it imported backend as though it were local. Let’s go up one more directory:
And now it fails, as it should. My suspicion is that, since the paragon directory contained the backend directory which contained the backend package, Python was able to look into the similarly-named directories. Let’s try renaming the outer backend directory to backend1 and see what happens.
Ok, so that makes sense. (Note that renaming the directory to backend1 meant that the PYTHONPATH was no longer valid, so the failure shows that the local import wasn’t working.)
We can verify this by playing a bit more with the
Note that we’re doing an absolute import, not a relative import, because the name of the directory and the package are no longer the same. And now changing the directory name back to
It goes back to importing locally. Now I understand the convention of naming directories after the packages they contain.
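One possible explanation for the surprise import from the directory above, assuming Python 3.3 or later: a directory without an __init__.py can itself be imported as a namespace package, so import backend from paragon may have picked up the outer backend directory rather than the real package inside it. This is a guess, not something verified against the session above. The sketch below only shows the mechanism, using a made-up outer_demo name for both directories.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    outer = Path(tmp) / "outer_demo"   # hypothetical outer directory, no __init__.py
    inner = outer / "outer_demo"       # hypothetical real package, same name
    inner.mkdir(parents=True)
    (inner / "__init__.py").write_text("IS_REAL_PACKAGE = True\n")

    # Simulate standing in the directory *above* the outer directory.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod = importlib.import_module("outer_demo")
        # Namespace packages have no file of their own.
        is_namespace = getattr(mod, "__file__", None) is None
        # The marker lives in the inner package's __init__.py.
        has_marker = hasattr(mod, "IS_REAL_PACKAGE")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("outer_demo", None)

print("imported a namespace package:", is_namespace)
print("got the real package's contents:", has_marker)
```

If this is what happened, the import "succeeds" but the module you get back is an empty shell around the outer directory, which would also explain why renaming that directory made the import fail.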
Now, let’s look at importing modules from within a package.
For a long time, I assumed that if you imported a package, you could automatically access all of the modules within the package. It took an uncomfortably long time debugging test errors before I finally realized that this wasn’t the case.
Let’s play around and try importing the analyzer module inside the
Alright. So it seems that we have to explicitly import submodules inside a package. Once a module is imported, though, we can use all of the functions that module defines.
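Here is a small self-contained sketch of that behavior. The demo_backend package and its analyze function are stand-ins created in a temporary directory, not the real project; the package's __init__.py is empty, which is the case where submodules are not pulled in automatically.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_backend"   # hypothetical package
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "analyzer.py").write_text(
        "def analyze():\n    return 'analyzed'\n"   # hypothetical function
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module("demo_backend")
        # Importing the package alone does not load its submodules.
        visible_before = hasattr(pkg, "analyzer")

        # Importing the submodule explicitly attaches it to the package.
        importlib.import_module("demo_backend.analyzer")
        visible_after = hasattr(pkg, "analyzer")
        analysis = pkg.analyzer.analyze()
    finally:
        sys.path.remove(tmp)
        for name in ("demo_backend", "demo_backend.analyzer"):
            sys.modules.pop(name, None)

print(visible_before)  # False
print(visible_after)   # True
print(analysis)        # analyzed
```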
What if we don’t want to import things one-by-one? Can we use the from module import * form on a package?
Doesn’t seem like it. But what about this?
Since analyzer is a module, we can import all of its functions without importing the module name itself into the namespace. Good to know.
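The difference between the two star imports can be sketched like this. As before, star_demo and analyze are made-up names, the package's __init__.py is empty and defines no __all__, and exec is used only so that each star import lands in its own dictionary that we can inspect afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "star_demo"      # hypothetical package
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "analyzer.py").write_text(
        "def analyze():\n    return 'analyzed'\n"   # hypothetical function
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Star import on the package: nothing useful arrives.
        package_star = {}
        exec("from star_demo import *", package_star)

        # Star import on the module: its functions arrive, the module name does not.
        module_star = {}
        exec("from star_demo.analyzer import *", module_star)
    finally:
        sys.path.remove(tmp)
        for name in ("star_demo", "star_demo.analyzer"):
            sys.modules.pop(name, None)

print("analyzer" in package_star)  # False
print("analyze" in module_star)    # True
print("analyzer" in module_star)   # False
```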
Now that that’s a bit clearer, let’s take a look at running tests.
For reference, these are the import statements in the test file:
synth.py contain tools for generating synthetic data and mocks; feel free to ignore those for now.
First, running the test file as a simple Python script:
Ok, that worked out. Now, though, we’ll try running the test using the pytest framework:
What is this? This is the bug that has been haunting me. Usually I just delete files and change paths at random until something starts to work. This time, I decided to delete the __init__.py from the tests/ directory. Why? I have no idea. YOLO. I ran the tests again and got this:
Well… something changed at least. Investigating the error, I notice that it suggests I delete the __pycache__ folders that have been popping up in my projects. I’ve set my IPython interpreter not to generate these kinds of files, but I’ve been dropping into vanilla Python from time to time, so I guess that’s where these came from. I go ahead and delete all of these folders from the project, and try running the test again:
OH COME ON. Really? This isn’t the first time that __pycache__ folders have caused me some pain. But this is good, this is progress. Let’s run the whole shebang:
Alright! Let’s try an experiment: moving the tests/ directory one level up, so it’s a sibling directory to the backend package, rather than a child. Typing this command: mv backend/tests . gives us this:
backend/
    backend/
        __init__.py
        analyzer.py
    tests/
        test_analyzer.py
Error! But the same one as before. Delete tests/__pycache__ and try again. SUCCESS!! Let’s commit.
Sigh. As with most things programming, #itsalwaysusererror. Mind your PYTHONPATH, attend to naming conventions, clear out your cached files, and you’ll have a long and happy life.
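Since clearing out __pycache__ came up more than once, here is a small helper for doing it in one go. It is a sketch rather than what was used in this post (the folders above were deleted by hand): clear_pycache is a made-up name, and the directory layout it runs against is a throwaway copy built in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def clear_pycache(root):
    """Delete every __pycache__ directory under root and return the removed paths."""
    removed = []
    # sorted() materializes the matches before anything is deleted.
    for cache_dir in sorted(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    # Throwaway layout with two stale cache folders.
    for sub in ("backend/__pycache__", "tests/__pycache__"):
        cache = Path(tmp) / sub
        cache.mkdir(parents=True)
        (cache / "stale.pyc").write_bytes(b"")

    removed = clear_pycache(tmp)
    remaining = list(Path(tmp).rglob("__pycache__"))

print(len(removed), "cache folders removed;", len(remaining), "remaining")
```

Point it at the project root before re-running the tests after moving or renaming test files.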
BONUS: Testing a Non-Package
Let’s say you’re working on a project but don’t want to add it to your PYTHONPATH. It’s a work-in-progress, no one else should be able to import it, what have you. Can you still import those modules to test them?
It seems like it.
Let’s consider another project, a webapp, with the following structure:
webapp/ webapp/ __init__.py views.py tests/ test_integration.py
Note that this directory is not in my PYTHONPATH:*
*While writing this post, I read some articles advocating for keeping trailing slashes in your PATH variables. I thought it was a good idea, so I’ve changed my PYTHONPATH accordingly.
Let’s try running
Very cool. But what happens if we change directories?
Hmm. Can’t find the package. Here are the import statements at the top of the test:
It seems like running py.test from the top of the project means that the test can look for local packages from that location. Running the tests from inside the test directory means that the package has to be imported via
Well… that makes sense! In the end, it all makes sense.
Further reading: another pretty good article on imports.