import – complex or complicated?

In Python, life is really easy when all your .py files are in one directory. The moment you want to organize your code into folders there’s a wall of challenges you have to climb. I believe this is an issue that can be alleviated with one small fix.

Here’s a comparison of how a developer shares code across a project in C/C++ and Python:

Forms
  C/C++:
    #include <from_env_dirs_first>
    #include "from_local_dir_first"
    #include "abs_or_rel_file_system_path"
  Python:
    import module
    import module as alias
    from module import var
    from module import *
    from ..package_relative_path import module
    from package.absolute_path import module
    try:
        import one_thing
    except ImportError:
        import another as one_thing

Namespacing
  C/C++: Public toilet – everything included is global.
  Python: “Namespaces are one honking great idea -- let’s do more of those!” Seriously, module encapsulation is fantastic.

Helpful extra knowledge
  C/C++:
    Makefiles/vcproj configurations of paths
    #ifdef
  Python:
    sys.path
    __all__
    __path__

Mandatory extra knowledge (“Gotchas”)
  C/C++:
    #pragma once (or the equivalent #ifdef)
    certain things aren’t allowed in .h files
    please don’t use absolute paths
  Python:
    __init__.py
    syntax for intra-package imports
    modules intended for use as the main module of a Python application must always use absolute imports.

Now this isn’t an exhaustive list as I want to discuss just a small subset from the above table. Also note that I didn’t go into “ctypes”, “#pragma comment(lib…)”, etc. as we’re talking about sharing code, not binaries.

In the six years of keyboard tapping I’ve done in C and Python, I was never once confused about how to access code across directories in C. Python, on the other hand, has caught me out quite a few times, and I always need to re-RTFM. And I consider myself far more interested and fluent in Python than in C/C++. This may just be a problem with my head, but I’d like to vent either way.

Blah, blah, what’s the problem?

Skip this section if you’ve already had experience with said problem; I’m sure it’s as painful to read as it was to write.

Python has a really elegant solution for one-folder mode: “import x” gives you what you expect, either from the standard library (sys.path, etc.) or from your local directory. If you have “os.py” in that local directory, you shadow the standard “import os”. Once you mix directories in, Python is suddenly afraid of shadowing, and you can’t import things from a folder named “os” unless it has an “__init__.py”. So shadowing is allowed in one case and not in the other. If you want to access modules from the outside (dot dot and beyond), you have to be in a package, use sys.path or os.chdir, or maybe implement file-system imports on your own.
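The file-versus-folder asymmetry can be checked with a small experiment. This is only a sketch under a few assumptions: Python 3.7 or later, and the standard module "colorsys" standing in for "os" (the helper name loaded_from_workdir and the readme.txt filler file are made up for the illustration). On Python 3.3 and later a folder without an __init__.py is treated as a namespace package, which has the lowest priority, so it still does not shadow the standard module.

```python
import os
import subprocess
import sys
import tempfile

# The script we run in each scenario: it reports which file was imported.
MAIN = "import colorsys\nprint(colorsys.__file__)\n"


def loaded_from_workdir(layout):
    """Create `layout` in a fresh directory, run main.py there, and report
    whether 'import colorsys' picked up a file from that directory."""
    files = dict(layout)
    files["main.py"] = MAIN
    with tempfile.TemporaryDirectory() as workdir:
        for rel_path, source in files.items():
            path = os.path.join(workdir, *rel_path.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write(source)
        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the script's directory on sys.path
        result = subprocess.run(
            [sys.executable, os.path.join(workdir, "main.py")],
            env=env, capture_output=True, text=True, check=True,
        )
        loaded = os.path.realpath(result.stdout.strip())
        return loaded.startswith(os.path.realpath(workdir) + os.sep)


shadow_by_file = loaded_from_workdir({"colorsys.py": ""})
shadow_by_bare_folder = loaded_from_workdir({"colorsys/readme.txt": ""})
shadow_by_package = loaded_from_workdir({"colorsys/__init__.py": ""})
```

With a default interpreter configuration, the local file and the folder with an __init__.py should both win over the standard library, while the bare folder should not.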

Personally, I find myself doing this design pattern a lot:

  1. The App directory
    1. main_app_entry.py
    2. framework
      1. general_useful_things.py
      2. more_frameworkey_goodness.py
    3. components
      1. this_solves_a_problem.py
      2. another_tool.py

I usually have an “if __name__ == '__main__':” block in my modules, and there I keep some sort of test, utility function, or a train of code-thought not yet organized.

How can another_tool.py access general_useful_things.py? First things first – __init__.py everywhere! After trying a few ways to do the import – here are a few results.

So what’s needed for another_tool to import general_useful_things:

  • “from framework import general_useful_things” works in another_tool.py as long as we only run main_app_entry.py; it does not work if we run another_tool.py directly. Does this mean __name__ == “__main__” is a useless feature I should ignore?
  • Here’s the rest of the list of failed attempts:
    #from app.framework import general_useful_things
    #from .app.framework import general_useful_things
    #from ..framework import general_useful_things
    #from .framework import general_useful_things
    #from . import framework
    #from .. import framework
  • And this little recipe works in most cases:
    import os
    import sys
    SRC_DIR = os.path.dirname(os.path.abspath(__file__))
    sys.path.append(os.path.join(SRC_DIR, '..', 'framework'))
    import general_useful_things
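
The first result in the list can be reproduced with a throwaway copy of the directory layout. The following is a rough sketch, assuming Python 3.7 or later; the file contents (an ANSWER constant and a print) are invented for the demonstration, and the helper names build_app and run are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Minimal stand-ins for the files in the example layout (contents are made up).
FILES = {
    "main_app_entry.py": "from components import another_tool\n",
    "framework/__init__.py": "",
    "framework/general_useful_things.py": "ANSWER = 42\n",
    "components/__init__.py": "",
    "components/another_tool.py": (
        "from framework import general_useful_things\n"
        "print(general_useful_things.ANSWER)\n"
    ),
}


def build_app(root):
    """Write the example directory structure under root."""
    for rel_path, source in FILES.items():
        path = os.path.join(root, *rel_path.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(source)


def run(args, cwd):
    """Run the interpreter with args from cwd, ignoring path-related env vars."""
    env = dict(os.environ)
    env.pop("PYTHONSAFEPATH", None)
    env.pop("PYTHONPATH", None)
    return subprocess.run(
        [sys.executable] + args, cwd=cwd, env=env,
        capture_output=True, text=True,
    )


with tempfile.TemporaryDirectory() as app_dir:
    build_app(app_dir)
    # sys.path[0] is the app directory, so "framework" is found.
    via_entry = run(["main_app_entry.py"], app_dir)
    # sys.path[0] is components/, so "framework" is not found.
    direct = run([os.path.join("components", "another_tool.py")], app_dir)
    # With -m, the current directory is put on sys.path instead.
    as_module = run(["-m", "components.another_tool"], app_dir)
```

The last call hints at one commonly used workaround: starting the module with "python -m components.another_tool" from the app directory keeps the "if __name__ == '__main__':" block usable, at the price of having to remember yet another rule.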

If you want to tinker around with that example directory structure here you go: http://dl.dropbox.com/u/440522/importing%20is%20hard.zip

Python doesn’t have file-system imports

To summarize my rant – Python has this mantra that your import lines should be concise, and thus a complex searching import mechanism was built to avoid filesystem-path-like imports. The price we pay for that searching import mechanism is that you really need to learn its implicit kinks, and even then it’s not that fun to use.

The theoretical ideal solution

“import x” always imports from sys.path etc. If you want to import something local, you use “import ./local_dir_module”; the forward slash signals to the parser and to the developer that a file-system import is taking place. “local_dir_module.py” needs to be in the current folder for the above example to work. Just in case it isn’t clear, the module “local_dir_module” will be accessed as usual, without the “.py”, dots or slashes. The import statement is the only place where slashes are allowed, and the result of the import is a module in the importing module’s namespace.

That’s as explicit, simple, concise and useful as it gets.
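
Until such syntax exists, the idea can be approximated with the standard importlib machinery. What follows is a minimal sketch, not a proposal for the real thing: import_path is a hypothetical helper, and the file and folder names come from the example above or are made up. It skips sys.path entirely and does not register the module in sys.modules.

```python
import importlib.util
import os
import tempfile


def import_path(relative_path, base_dir):
    """Load base_dir/relative_path + '.py' and return it as a module object.

    Hypothetical helper: it never consults sys.path and does not cache the
    result in sys.modules, so every call executes the file again.
    """
    file_path = os.path.normpath(os.path.join(base_dir, relative_path + ".py"))
    module_name = os.path.basename(relative_path)
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration in a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "components"))
    with open(os.path.join(root, "local_dir_module.py"), "w") as f:
        f.write("GREETING = 'hello from the file system'\n")

    # Pretend the importing file lives in root/components.
    here = os.path.join(root, "components")
    local_dir_module = import_path("../local_dir_module", here)
    greeting = local_dir_module.GREETING
```

In a real module, base_dir would typically be os.path.dirname(os.path.abspath(__file__)), which is the part the proposed syntax would make implicit.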

The practical solution

I don’t mind if “import x” still works as it does today; the main point is that you could now do things like “import ../../that_module_from_far_away”. So you can actually keep Python 100% backwards compatible and still add this feature.

Concerning the backslash/forward slash debate – I’m a Windows guy and I don’t mind using the forward slash for Python; Windows doesn’t mind it either (“/” only fails in a few specific scenarios like cmd autocomplete). Another fun fact is that you can avoid littering your app with __init__.py if it isn’t going to be accessed through that big old search-import-package mechanism.

I realize this whole fiasco might raise the question of absolute path imports; in my opinion these shouldn’t be allowed. Absolute includes in C/C++ destroy portability, impose annoying folder structure constraints, and are ever so tempting at late hours when you don’t really want to calculate the number of “..” needed. For the special cases that might still need this, the instrumentation existing in Python and e.g. import_file are enough.

The good things about __init__.py

Many packages use __init__.py as a way to organize their APIs to the outside world. Your package folder can have tons of scripts, and only what you included in __init__.py is exposed when your folder is imported directly (e.g. json in the standard library). So don’t take this as an attack on __init__.py; it’s just that the import mechanism seems incomplete in my eyes. Just to be a bit specific – package maintainers don’t need to do stuff like “import os as _os” to avoid littering their module namespace when they use __init__.py as their API, and that’s a nice thing to have.
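
As a small illustration of that point, here is a sketch of a package whose __init__.py re-exports a single function while the implementation module imports os freely. The package name demo_api_pkg, the module _impl and the function area are all made up for the example.

```python
import importlib
import os
import sys
import tempfile

PACKAGE = "demo_api_pkg"  # hypothetical package name
FILES = {
    # The implementation module can import whatever it likes.
    "_impl.py": "import os\n\ndef area(width, height):\n    return width * height\n",
    # __init__.py decides what the outside world sees.
    "__init__.py": "from ._impl import area\n\n__all__ = ['area']\n",
}

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, PACKAGE))
    for name, source in FILES.items():
        with open(os.path.join(root, PACKAGE, name), "w") as f:
            f.write(source)

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module(PACKAGE)
    finally:
        sys.path.remove(root)

# Names in the package namespace that do not start with an underscore.
public_names = sorted(n for n in vars(pkg) if not n.startswith("_"))
```

The os module ends up in the namespace of _impl only, so the package itself exposes nothing but area, with no need for the “import os as _os” trick.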

Also, I’d like to hear other justifications as I’m sure more than a few exist.

The drawbacks of slashes and file-system-imports

  1. From a compatibility viewpoint, old packages aren’t affected, as we’re introducing the “forward slash” in whatever future Python version. Whoever uses this feature won’t be compatible with older Python versions.
  2. Windows users and *nix users might argue over whether or not to allow backslashes, I think it’s not that important. Though the internet has forward slashes, so that makes it 2 platforms against 1.
  3. It’s uglier (though today’s relative imports are just as ugly and harder to learn).
  4. People might ask for absolute imports.
  5. Dividing the community and its packages into “file-system-importers” and “package-search-importers”.
  6. *reserved for complaints in the comments*

Summary

I’ve tried to do packages the existing Python way and I think we can do better. The __init__.py based search mechanism works great for whatever is in sys.path, though I believe its pains outweigh its gains for organizing code. Here’s to hoping there’s a chance for relative-file-system imports in standard Python.

References

http://www.python.org/dev/peps/pep-0328/

http://en.cppreference.com/w/cpp/preprocessor/include

http://effbot.org/zone/import-confusion.htm – January 07, 1999 – “There are Many Ways to Import a Module” – “The import and from-import statements are a constant cause of serious confusion for newcomers to Python”

http://stackoverflow.com/questions/448271/what-is-init-py-for

http://stackoverflow.com/questions/1260792/python-import-a-file-from-a-subdirectory 

http://docs.python.org/tutorial/modules.html

http://www.slideshare.net/stuartmitchell/python-relative-imports-just-let-me-use-the-file-system-please

One thought on “import – complex or complicated?”

  1. To summarize my rant – python has this mantra that your import lines should be concise and thus a complex searching import mechanism was built to avoid filesystem-path-like imports.

    That’s actually beautiful and allows for an abstract view on the import system – ideally. A module path `a.b.c …` could be represented in various ways, not just as FS paths, which remains certainly the single most important use case. As the FP crowd would immediately notice, import is monadic and the presence of an `__init__.py` file in some directory allows for import with a side effect. A customization of those side effects on user level gives us “import-hooks”. It’s possible to well integrate everything.

    Same with relative imports. The initial ‘root’ is always the `__main__` module. You could start your module which becomes `__main__` from an FS of an arbitrary OS, a URL, a database record or whatever ( assuming an FS by default ). The representation of the module-path in some environment may depend on how the main module is called on the CLI. This determines the concrete meaning of the dot operator including that of leading dots. The notion of a ‘root’ also means that relative import is the default semantics.
    The `root` might be changed though, which means we jump from one system of relative imports to another one, eventually changing the representation function ( e.g. when importing from zip-files ). That’s the function of sys.path. It contains data to lookup for a new root, which is not the initial __main__.

    Note that there is no explicit notion of a “package”. A package is only a point from which other modules can be reached using the dot-operator + an optional __init__.py module which is imported as a side-effect, not a sentinel for package-being as if this was a relevant ontological category living on the same footing as that of a module.

    Unfortunately import rather looks like a bunch of hacks accumulated over the years and the weird relative import semantics you complain about ( you are not alone ) is only a symptom. Another weirdness is that a module object may be created n-times ( with n>1 ) depending on the import path. This is due to a simpler-than-necessary ( simplistic ) caching mechanism.
