Once a project grows beyond a trivial size, you'll want to split the program code into different files, with each file holding procedures and the like for one part of the project. You may even want to spread the files across different directories for more levels of grouping.
That's well and good, but how does Python find those files?
That is what this post will answer.
To give some context for this discussion, below is a picture of the organisation of files in the cipher-tools repository on GitHub. I have one directory for the ciphers, one for the various helpers (such as language models and text prettification), and one for tests.
├── 2017
│   ├── 1a.ciphertext
│   └── 2017-challenge1.ipynb
├── caesar-break.ods
├── caesar_break_parameter_trials.csv
├── caesar_break_parameter_trials.ipynb
├── caesar_break_parameter_trials.py
├── cipher
│   ├── affine.py
│   ├── caesar.py
│   └── keyword_cipher.py
├── logger.py
├── main.py
├── run_tests
├── support
│   ├── count_1l.txt
│   ├── count_1w.txt
│   ├── count_2l.txt
│   ├── count_2w.txt
│   ├── count_3l.txt
│   ├── count_big.txt
│   ├── language_models.py
│   ├── lettercount.py
│   ├── norms.py
│   ├── segment.py
│   ├── shakespeare.txt
│   ├── sherlock-holmes.txt
│   ├── text_prettify.py
│   ├── utilities.py
│   ├── war-and-peace.txt
│   └── words.txt
└── test
    ├── test_affine.py
    └── test_doctests.py
import is how Python loads code from an additional file into the current session. When Python is asked to import some_module, it looks for a file called some_module.py. Where it looks is controlled by a built-in variable called sys.path. This is a list of directories that Python searches, in order, for modules. In an interactive session the first item is the empty string, which makes Python look in the current directory (when running a script, the first item is instead the directory containing that script). The other directories are other places on the computer where installed modules are supposed to live.
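To see the search path on your own machine, something like the following minimal sketch should do. The entries it prints will differ from one computer and Python installation to the next.

```python
import sys

# sys.path is an ordinary list of strings, searched from first to last.
for position, directory in enumerate(sys.path):
    # repr() makes an empty-string entry visible as ''
    print(position, repr(directory))
```

Because it is an ordinary list, a program can also change it while running, which is what the rest of this post relies on.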
So, when Python is asked to import some_module, it first looks in the current directory for some_module.py, then (on my computer) for /usr/local/lib/python3.6/dist-packages/some_module.py, and so on.
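If you want to check which file a given import would pick up, without actually loading it, the standard library's importlib.util.find_spec can report it. This is only a sketch: colorsys is used as an arbitrary standard-library module, and no_such_module_hopefully is a made-up name assumed not to exist on your machine.

```python
import importlib.util

# find_spec() searches for the module the same way import does, but only
# reports what it found instead of loading the module.
spec = importlib.util.find_spec("colorsys")
print(spec.origin)  # path of the colorsys.py file that import would load

# A top-level name that cannot be found gives None rather than an error.
missing = importlib.util.find_spec("no_such_module_hopefully")
print(missing)
```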
If the import statement looks like import cipher.caesar, Python knows that the file caesar.py will be in a directory called cipher. It will first look for a subdirectory of the current directory called cipher and the file caesar.py within it (i.e. ./cipher/caesar.py), then for /usr/local/lib/python3.6/dist-packages/cipher/caesar.py, and so on.
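The dotted form can be tried out safely in a throwaway directory. The sketch below builds a tiny project in a temporary directory and imports from it; the package name toy_cipher and its shift function are invented for the illustration and are not part of cipher-tools.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_root:
    # Build <project_root>/toy_cipher/caesar.py (hypothetical names).
    package_dir = Path(project_root) / "toy_cipher"
    package_dir.mkdir()
    (package_dir / "caesar.py").write_text(
        "def shift(letter, by):\n"
        "    return chr((ord(letter) - ord('a') + by) % 26 + ord('a'))\n"
    )

    # Put the project root first on the search path, so that Python
    # looks there for a directory called toy_cipher.
    sys.path.insert(0, project_root)
    importlib.invalidate_caches()
    try:
        import toy_cipher.caesar
        loaded_from = Path(toy_cipher.caesar.__file__)
        shifted = toy_cipher.caesar.shift("a", 3)
    finally:
        # Leave sys.path as we found it.
        sys.path.remove(project_root)

print(loaded_from.parent.name, loaded_from.name)
print(shifted)
```

The file that was loaded is caesar.py inside the toy_cipher directory, which matches the directory-then-file lookup described above.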
That works well if you're always loading self-built modules from the root directory of your project. But, in the example directory tree above, I keep each year's National Cipher Challenge files in a separate directory of the project. That means I need to persuade Python to first go up a directory before searching for the cipher tool files I've written.
This little bit of magic does that:
import os, sys, inspect, collections

currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
sys.path.insert(0,parentdir)

from cipher.caesar import *
from cipher.affine import *
from support.utilities import *
from support.text_prettify import *
from support.language_models import *
Line 3 finds the directory containing the program that is running.
Line 4 finds the parent of that directory.
Line 5 inserts that parent directory as the first item of sys.path.
That means that, when Python looks for files to import in lines 7–11, each import first looks in the parent directory, then the current directory, and then the rest of the directories which were already there in sys.path.
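The same idea can also be written with pathlib. What follows is a sketch of an alternative, not the code used in cipher-tools: the helper name add_parent_to_path and the example path are made up, and it assumes you can hand it the path of the running file (in a script, that would be __file__).

```python
import sys
from pathlib import Path


def add_parent_to_path(file_path):
    """Put the parent of file_path's directory first on sys.path."""
    # resolve() makes the path absolute; .parent is the file's directory.
    current_dir = Path(file_path).resolve().parent
    parent_dir = str(current_dir.parent)
    # Avoid piling up duplicates if this runs more than once.
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    return parent_dir


# In a script you would pass __file__; a made-up path is used here so the
# example also runs in an interactive session, where __file__ is undefined.
added = add_parent_to_path("/home/me/cipher-tools/2017/solve.py")
print(added)
```

Wrapping the steps in a function keeps the three lines of path juggling in one place, at the cost of needing that function to be importable before the search path has been fixed, so in practice it may have to be pasted at the top of each file.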
And that's how to organise your code.
There's not much "code" for this article, but you can see an example of this organisation in the cipher-tools repository on GitHub.
On my computer,
sys.path includes directories such as