I have a script 'preprocessing.py' containing the function for text preprocessing:
def preprocess():
    #...some code here
    with open('stopwords.txt') as sw:
        for line in sw.readlines():
            stop_words.add(something)
    #...some more code that doesn't matter
    return stop_words
Now I want to use this function in another Python script. So, I do the following:
import sys
sys.path.insert(0, '/path/to/first/script')
from preprocessing import preprocess
x = preprocess(my_text)
Finally, I end up with the issue:
IOError: [Errno 2] No such file or directory: 'stopwords.txt'
The problem is surely that the 'stopwords.txt' file is located next to the first script, not the second.
Is there any way to specify the path to this file without making any changes to the script 'preprocessing.py'?
Thank you.
4 Answers
Since you seem to be running on a *nix-like system, why not use that marvellous environment to glue your stuff together?
cat stopwords.txt | python preprocess.py | python process.py
Of course, your scripts should then read only from standard input and write only to standard output. See! Remove code and get functionality for free!
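A minimal sketch of what such a filter might look like, assuming the stop words arrive one per line. The names `read_stop_words` and `main` are hypothetical and not taken from the original scripts; the demo feeds an in-memory stream so it runs without a shell.

```python
import io
import sys


def read_stop_words(stream):
    """Return the set of non-empty, stripped lines from a text stream."""
    return {line.strip() for line in stream if line.strip()}


def main(in_stream=sys.stdin, out_stream=sys.stdout):
    # Write the words one per line so the next program in the pipeline can read them.
    for word in sorted(read_stop_words(in_stream)):
        out_stream.write(word + "\n")


# Demonstration with in-memory streams; in a real pipeline, call main() with no arguments.
demo_out = io.StringIO()
main(io.StringIO("the\nand\n\nthe\n"), demo_out)
```

With this shape the file name never appears in the Python code at all; the shell decides where the data comes from.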
The simplest, and possibly most sensible, way is to pass in the full path to the file:
def preprocess(filename):
    #...some code here
    with open(filename) as sw:
        for line in sw.readlines():
            stop_words.add(something)
    #...some more code that doesn't matter
    return stop_words
Then you can call it appropriately.
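For instance, the caller could build the full path once and pass it in. This is only a sketch: the `preprocess` body below is a simplified stand-in that assumes one stop word per line (the original elides how words are added), and a temporary directory stands in for the real location of the first script.

```python
import os
import tempfile


def preprocess(filename):
    # Simplified stand-in for the function in the answer: one stop word per line.
    stop_words = set()
    with open(filename) as sw:
        for line in sw:
            word = line.strip()
            if word:
                stop_words.add(word)
    return stop_words


# Hypothetical location of the first script; a temp dir is used so the sketch runs anywhere.
script_dir = tempfile.mkdtemp()
stopwords_path = os.path.join(script_dir, "stopwords.txt")
with open(stopwords_path, "w") as f:
    f.write("a\nthe\n")

# The caller, not preprocess(), decides where the file lives.
stop_words = preprocess(stopwords_path)
```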
3 Comments
preprocessing.py if the code was separated into multiple directories; assuming it was a project with a fixed structure. We perhaps made opposite assumptions in that aspect.
Looks like you can put
import os
os.chdir('path/to/first/script')
in your second script. Please try.
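Since os.chdir() changes the working directory for the whole process, one possible refinement is to switch only for the duration of the call and then switch back. A sketch, where the `working_directory` helper is hypothetical and a temporary directory stands in for the first script's directory:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current working directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


# Demo: a temp directory stands in for the directory of the first script.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "stopwords.txt"), "w") as f:
    f.write("the\n")

before = os.getcwd()
with working_directory(demo_dir):
    # A relative open() now resolves against demo_dir, as it would inside preprocess().
    with open("stopwords.txt") as sw:
        words = {line.strip() for line in sw}
after = os.getcwd()
```

This leaves 'preprocessing.py' untouched, which is what the question asks for, but any other relative paths used inside the `with` block are affected too.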
import os
def preprocess():
    #...some code here
    # get path in same dir
    path = os.path.splitext(__file__)
    # join them with file name
    file_id = os.path.join(path, "stopwords.txt")
    with open(file_id) as sw:
        for line in sw.readlines():
            stop_words.add(something)
    #...some more code that doesn't matter
    return stop_words
1 Comment
1- splitext() seems incorrect. You might mean os.path.dirname(os.path.abspath(__file__)) instead.
2- Consider pkgutil.get_data() or pkg_resources (setuptools) instead of manually locating files; e.g., the data may be in a zip archive.
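The first point of this comment can be sketched as follows. The helper name `resource_path` is hypothetical; it takes the module's file path as an argument so the example runs on its own, whereas inside 'preprocessing.py' one would pass `__file__`.

```python
import os


def resource_path(module_file, name):
    """Path to `name` in the same directory as `module_file`."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), name)


# splitext() splits off the extension and returns a (root, ext) tuple,
# which is why its result cannot be passed to os.path.join() as a directory.
root, ext = os.path.splitext("preprocessing.py")

# Inside preprocessing.py this would be resource_path(__file__, "stopwords.txt").
example = resource_path(os.path.join("some", "dir", "preprocessing.py"), "stopwords.txt")
```

The resulting path is absolute and does not depend on the directory the importing script is run from.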
with open('stopwords.txt') as sw:
preprocess. Get the directory with os.path.dirname(os.path.realpath(__file__)) and use that to find stopwords.txt.
setup.py (or using cookiecutter package), run pip install -e .). sys.path.insert() with/without the hardcoded path should be avoided.
2- Q: how to access resources (files) that are located relative to the code. A: pkgutil.get_data(), pkg_resources, appdirs