Best way to set sys.path for "hot library" development in Python?

Question 1

I have my Python source structured as follows:

+-branchname/
 +-dst/
 +-src/
 | +-library/
 | | +-cleese/
 | | | +-test/
 | | | | +-__init__.py
 | | | | +-test_cleese.py
 | | | +-__init__.py
 | | | +-cleese.py
 | | +-palin/
 | | +-test/
 | | | +-__init__.py 
 | | | +-test_palin.py 
 | | +-__init__.py
 | | +-palin.py
 | +-productline/
 | +-circus/
 | | +-test/
 | | | +-__init__.py
 | | | +-test_circus.py
 | | +-__init__.py
 | | +-circus.py
 | +-grail/
 | +-test/
 | | +-__init__.py
 | | +-test_grail.py
 | +-__init__.py
 | +-grail.py
 +-branch_root_marker

Entry points are in circus/ and grail/, as well as (possibly) each of the test/ directories, depending on how testing is implemented.

I have multiple copies of this source tree present on local storage at any one point in time (corresponding to various maintenance and feature branches), so I cannot set PYTHONPATH in my shell without some pain. (I would need to remember to change it each time I switched to work on a different branch, and I am very forgetful)

Instead, I have some logic that walks up the file tree, starting at the "current" file location, moving from leaf towards root, looking for branch_root_marker. Once the root directory of the current working copy is found, it adds library/ and productline/ to sys.path. I call this function from each entry-point in the system.

"""Add working copy (branch) to sys.path"""
import os
import sys
def _setpath():
 """Add working copy (branch) to sys.path"""
 markerfile = "branch_root_marker"
 path = ""
 if ("__file__" in locals() or globals()) and __file__ != "__main__":
 path = __file__
 elif sys.argv and sys.argv[0] and not sys.argv[0] == "-c":
 path = sys.argv[0]
 path = os.path.realpath(os.path.expandvars(path))
 while os.path.exists(path):
 if os.path.exists(os.path.join(path, markerfile)):
 break
 else:
 path = os.path.dirname(path)
 errormsg = " ".join(("Could not find", markerfile))
 assert os.path.exists(path) and path != os.path.dirname(path), errormsg
 path = os.path.join(path, "src")
 (_, subdir_list, _) = os.walk(path).next()
 for subdir in subdir_list:
 if subdir.startswith("."):
 continue
 subdir_path = os.path.join(path, subdir)
 if subdir_path in sys.path:
 continue
 sys.path.append(subdir_path)
_setpath()

Currently, I need to keep a separate but identical copy of this function in each entry point. Even though it is quite a short function, I am quite distressed by how fragrantly the DRY principle is being violated by this approach, and would love to find a way to keep the sys.path modification logic in one place. Any ideas?

Note:- One thing that springs to mind is to install the sys.path modifying logic into a common location that is always on PYTHONPATH. Whilst this is not a terrible idea, it means introducing an installation step that needs to be carried out each time I move to a fresh environment; another thing to remember (or, more likely, forget), so I would like to avoid this if at all possible.

Question 2

Why do you need multiple branches on disk at the same time? Why not use one directory and just change branches when you need to work on another branch. That would allow you to use PYTHONPATH

Question 3

Because the internal structure of the source tree might change, because I want to do diffs to compare files between branches, and because I want to be able to actually work on multiple branches concurrently. (I might have a script running in the background on one branch while I edit code on another branch).

Question 4

Consider using site and .pth files: stackoverflow.com/questions/700375/…

Question 5

@vartec - Thanks for your response, but I am not sure how putting a .pth file in the site-packages directory would help - As far as I can see, it would suffer from the same problems as setting PYTHONPATH. (Unless I am not understanding something)

Question 6

Have you tried using virtualenv? There are a number of tools/scripts available that make the workflow simpler, especially when using multiple branches.

Question 7

OK, I started to craft a solution that involved PEP 302 import hooks. Way too complicated.

I think you answered your own question with:

this "gateway" script could implement all manner of bootstrapping functionality - for example, ensuring that virtualenv is set up correctly and activated for the current branch.

if you are already using virtualenv, put the logic you have in the "~/venv/or-whatever-path/bin/activate" script that is already there. I'd suggest listing of possible entry points and prompting for a response (list with numbers and a raw_input() statement will do) if you are not specifying one.

I'd also put the name of your branch in to $PS1 so that is shows up in the tab name or window title of your terminal. I so something similar and change the color of the prompt so I don't do the right thing in the wrong place.

Question 8

You have to use virtualenv. It create an isolated python environment. It is also recommended when you need different versions of third party packages depending of the version of your product.

Question 9

I already use virtualenv, but I have only experimented with the most basic features (Currently I use the same env for all my branches). I am not yet sure how I am going to integrate virtualenv with my development process.

Question 10

You need to create a virtualenv per branch. You can also install your product (python setup.py install) in this venv.

Question 11

My current thinking is still tending towards the single entry-point solution:- this "gateway" script could implement all manner of bootstrapping functionality - for example, ensuring that virtualenv is set up correctly and activated for the current branch. (The goal is to have zero additional steps to remember when switching branches).

Question 12

Actually, I am investigating using buildout instead of virtualenv: I am attracted by the idea of explicit configuration files (that can be version controlled).

Question 13

My workflow is somehow similar to yours: I have multiple branches checked out at different directories. I tried creating a virtualenv for each branch, but it's too much hassle.

I ended up setting PYTHONPATH relative to $PWD since I only have runnable code at one directory level: PYTHONPATH=$PWD/.. python code-to-run.py. This uses current dir's source tree, and, after it, virtualenv's and system modules.

This does not work under Windows, obviously.

Another way that I tried was a 'self-modifying' .pth file; it included

import sys, os; sys.path.insert(0. os.path.abspath('..'))

You might use more complex logic if you want.

Question 14

Where do you put your .pth files? I tried to use them a few months ago (When I was a complete Python newbie) with no apparent effect, so it is possible (likely) that I did not grasp some fundamental concept. :-)

Question 15

@WilliamPayne: a .pth file goes wherever a normal module would. It is interpreted as a pointer to a module, a kind of Python-specific glorified symlink. You can put such a file in site-packages.

Question 16

So as I understand it, the .pth file can go into site-packages, but can actually be a python script that determines the location of the current working directory, and uses that to set sys.path. Hmmm. that might actually work.

Question 17

... although I am still a bit nervous about what might happen in the context of different testing frameworks / continuous integration servers.

Question 18

I occurs to me that I could limit the number of entry points in the system by creating a run.py script that lives in the root of the working copy, so instead of calling "python circus.py" one would instead call "python run.py circus". This would force the entry point to always be the root directory of the working copy, which would solve the problem. (Although it seems somewhat hackish).

Question 19

Does your revision control allows you to run a script on sync/clone/update? If so you can configure appropriate files in each directory of your source tree so that the root directory is referenced explicitly.

For example if you were using mercurial then this might help, from http://mercurial.selenic.com/wiki/TipsAndTricks:

Avoid merging autogenerated (binary) files (PDF)
Creating pdfs on the fly

This assumes that you always want to have the PDFs you can use, but that you don't need to track them - only their contents (and those are defined in the tex files).

For this you add an update hook which crates the pdf whenever you update to a revision.

Edit your .hg/hgrc to include the hooks section with an update hook:

[hooks] update.create_pdfs = latex your_tex_file.tex To make this still a bit easier, you can use a versioned script which creates all pdf. that way you can just call the script and don't need to worry about editing the .hg/hgrc when you add text files or change the call.

I use a python script for platform compatibility:

parse_latex.py:

Toggle line numbers 1 #!/usr/bin/env python 2 from subprocess import call 3 for i in ("file1.tex", "file2.tex"): 4 call(["latex", i]) .hg/hgrc:

[hooks] update.create = ./parse_latex.py

Question 20

I am using mercurial, so I have a range of hooks that I can use - this looks like a really good solution.

Question 21

Ok, so I went with the gateway script. At the moment it only has minimal functionality, no cacheing of paths, no virtualenv control etc.. but it works and will form the basis for something that will grow over time.

import os
import sys
import fnmatch
OK = 0
ERROR = 1
_EXCLUDE_FROM_SYSPATH = [".svn", ".hg", ".", ".."]
# -----------------------------------------------------------------------------
def _root_path_of_repo_branch():
 """Returns working copy root path."""
 branch_root_marker = "branch_root_marker"
 path = ""
 if ("__file__" in locals() or globals()) and "__main__" != __file__:
 path = __file__
 elif sys.argv and sys.argv[0] and not sys.argv[0] == "-c":
 path = sys.argv[0]
 path = os.path.realpath(os.path.expandvars(path))
 while os.path.exists(path):
 if os.path.exists(os.path.join(path, branch_root_marker)):
 break
 else:
 path = os.path.dirname(path)
 assert os.path.exists(path), " ".join("Could not find", branch_root_marker)
 return path
# -----------------------------------------------------------------------------
def _pythondirs_in_branch(root_path_of_branch):
 """Find pythondirs in the specified branch."""
 srcpath = os.path.join(root_path_of_branch, "src")
 srctree = os.walk(srcpath)
 pythondirs = list()
 for (dirpath, dirnames, filenames) in srctree:
 for name in _EXCLUDE_FROM_SYSPATH:
 if name in dirnames:
 dirnames.remove(name)
 if any((fnmatch.fnmatch(name, "*.py") for name in filenames)):
 pythondirs.append(dirpath)
 return pythondirs
# -----------------------------------------------------------------------------
def _add_dirs_to_syspath(dirlist):
 """Adds specified directories to the sys path."""
 for dirname in dirlist:
 sys.path.append(dirname)
# -----------------------------------------------------------------------------
def _add_pythondirs_in_branch_to_syspath():
 """Adds pythondirs in the current branch to the sys path"""
 branch_path = _root_path_of_repo_branch()
 pythondirs = _pythondirs_in_branch(branch_path)
 _add_dirs_to_syspath(pythondirs)
# -----------------------------------------------------------------------------
def main(argv=None):
 """Read config & start app."""
 if argv is None:
 argv = sys.argv[1:]
 command_name = argv[0]
 if "test" in command_name:
 pass
 _add_pythondirs_in_branch_to_syspath()
 try:
 command = __import__(command_name)
 except ImportError, err:
 print "IMPORT ERROR : " + err.message
 return ERROR
 return command.main(argv)
if __name__ == "__main__":
 sys.exit(main())

Question 22

According to the python packaging user guide: packaging.python.org you should install your library modules in development mode. pip install -e <path>

Phil Cooper Phil Cooper 2501 silver badge5 bronze badges · Accepted Answer · 2012-05-15 22:33:46Z

OK, I started to craft a solution that involved PEP 302 import hooks. Way too complicated.

I think you answered your own question with:

this "gateway" script could implement all manner of bootstrapping functionality - for example, ensuring that virtualenv is set up correctly and activated for the current branch.

if you are already using virtualenv, put the logic you have in the "~/venv/or-whatever-path/bin/activate" script that is already there. I'd suggest listing of possible entry points and prompting for a response (list with numbers and a raw_input() statement will do) if you are not specifying one.

I'd also put the name of your branch in to $PS1 so that is shows up in the tab name or window title of your terminal. I so something similar and change the color of the prompt so I don't do the right thing in the wrong place.

Stack Exchange Network

Best way to set sys.path for "hot library" development in Python?

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Best way to set sys.path for "hot library" development in Python?

5 Answers 5

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions