Organising a Python module

    For any project of non-trivial size, you'll want to split the program code into different files, with each file holding the procedures and the like for one part of the project. You'll often want to split the files across different directories too, for more levels of grouping.

    That's well and good, but how does Python find those files?

    That is what this post will answer.

    To give some context for this discussion, below is the layout of files in the cipher-tools repository on GitHub. I've one directory for the ciphers, one for the various helpers (such as language models and text prettification), and one for tests.

    ├── 2017
    │   ├── 1a.ciphertext
    │   └── 2017-challenge1.ipynb
    ├── caesar-break.ods
    ├── caesar_break_parameter_trials.csv
    ├── caesar_break_parameter_trials.ipynb
    ├── cipher
    │   ├──
    │   ├──
    │   └──
    ├── run_tests
    ├── support
    │   ├── count_1l.txt
    │   ├── count_1w.txt
    │   ├── count_2l.txt
    │   ├── count_2w.txt
    │   ├── count_3l.txt
    │   ├── count_big.txt
    │   ├──
    │   ├──
    │   ├──
    │   ├──
    │   ├── shakespeare.txt
    │   ├── sherlock-holmes.txt
    │   ├──
    │   ├──
    │   ├── war-and-peace.txt
    │   └── words.txt
    ├── test
    └── test

    import is how Python loads code from an additional file into the current session. When Python is asked to

    import some_module

    it looks for a file called some_module.py. Where it looks is controlled by sys.path, a variable in the standard sys module. This is a list of directories that Python searches for modules, in order. In an interactive session the first item is the empty string, which makes Python look in the current directory (when running a script, the first item is the script's own directory instead). The remaining directories are the places on the computer where installed modules are supposed to live[1].

    So, when Python is asked to import some_module, it first looks in the current directory, then (on my computer) in /usr/lib/python3.6/, then in /usr/local/lib/python3.6/dist-packages/, and so on, stopping at the first match.
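
    To see the search order on your own machine, a minimal sketch like the one below may help. It prints each entry of sys.path in the order Python consults it, then asks importlib.util.find_spec where a module would be loaded from. The choice of json as the module to look up is only an illustration; the directories printed will differ from mine.

```python
import importlib.util
import sys

# Show the search path in the order Python consults it.
for position, directory in enumerate(sys.path):
    # An empty string, if present, stands for the current directory.
    print(position, repr(directory))

# find_spec reports where a top-level module would be loaded from,
# without running an import statement for it.
spec = importlib.util.find_spec("json")
print(spec.origin)
```

    The path printed on the last line should sit inside one of the directories listed above it.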

    If the import statement looks like import cipher.caesar, Python knows that the file will be in a directory called cipher. It first looks for a subdirectory of the current directory called cipher and for the module's file within it (i.e. inside ./cipher/), then inside /usr/lib/python3.6/cipher/, then /usr/local/lib/python3.6/dist-packages/cipher/, and so on.
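
    The dotted lookup can be tried out safely in a throwaway directory. The sketch below builds a small package in a temporary location, puts that location on sys.path, and imports a module from inside the package. The names demo_cipher, shift and shift_letter are hypothetical, made up for this example; they are not part of cipher-tools.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project containing a package directory and one module.
# demo_cipher and shift are hypothetical names chosen for this sketch.
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "demo_cipher"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "shift.py").write_text(
    "def shift_letter(letter, by):\n"
    "    return chr((ord(letter) - ord('a') + by) % 26 + ord('a'))\n"
)

# Put the project root on the search path so Python can find the package.
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()

import demo_cipher.shift

# __file__ shows which directory the module was actually loaded from.
print(demo_cipher.shift.__file__)
print(demo_cipher.shift.shift_letter("a", 3))  # prints d
```

    The first line printed ends in demo_cipher/shift.py under the temporary directory, which is the "directory named after the package, file named after the module" pattern described above.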

    That works well if you're always loading self-built modules from the root directory of your project. But, in the example directory tree above, I keep each year's National Cipher Challenge files in a separate directory of the project. That means I need to persuade Python to first go up a directory before searching for the cipher tool files I've written.

    This little bit of magic does that:

    import os, sys, inspect, collections
    # Directory containing the code that is currently running
    currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
    # One level up: the root of the project
    parentdir = os.path.dirname(currentdir)
    # Search the project root before anything else
    sys.path.insert(0, parentdir)

    from cipher.caesar import *
    from cipher.affine import *
    from support.utilities import *
    from support.text_prettify import *
    from support.language_models import *

    The currentdir line finds the directory of the program that's running.

    The parentdir line finds its parent directory.

    The sys.path.insert call puts that parent directory at the front of sys.path.

    That means that, when Python looks for files to import in the from ... import lines, each import first looks in the parent directory, then the current directory, and then the rest of the directories which were already there in sys.path.
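
    To check the effect without touching a real project, here is a self-contained sketch of the same idea. It is not the code from cipher-tools: the helper add_parent_to_path and the demo_support package are hypothetical names invented for this example, and it uses pathlib rather than inspect to work out the parent directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def add_parent_to_path(working_dir):
    """Insert the parent of working_dir at the front of sys.path and return it."""
    parent = str(Path(working_dir).resolve().parent)
    if parent not in sys.path:
        sys.path.insert(0, parent)
    return parent


# Hypothetical layout: <root>/demo_support/greet.py and <root>/2017/
root = Path(tempfile.mkdtemp())
(root / "demo_support").mkdir()
(root / "demo_support" / "__init__.py").write_text("")
(root / "demo_support" / "greet.py").write_text(
    "def greet():\n"
    "    return 'found via the parent directory'\n"
)
year_dir = root / "2017"
year_dir.mkdir()

# Pretend the notebook lives in <root>/2017 and make its parent importable.
parent = add_parent_to_path(year_dir)
importlib.invalidate_caches()

from demo_support.greet import greet

print(greet())
```

    The membership check in the helper keeps sys.path from growing if the same cell is run repeatedly, which happens a lot in notebooks.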

    And that's how to organise your code.


    There's not much "code" for this article, but you can see an example of this organisation in the cipher-tools repository on GitHub.



    1. On my computer, sys.path includes directories such as /usr/lib/python3.6, /usr/local/lib/python3.6/dist-packages, /usr/lib/python3/dist-packages, and /usr/lib/python3.6/dist-packages. ↩︎

    Neil Smith
