The newer version of this question is located here: Cyther: The Cross Platform Cython/Python Compiler (Take 2)
I am currently writing a Python library (soon to be published)that automatically compiles a Cython/Python file to pure -O3
C, without the user having to do anything other than:
C:\Project_dir\> cyther example.pyx
C:\Project_dir\> cyther example.py
from cyther import main
main('example.pyx')
My 'auto-compiler' is named Cyther, and it organizes and polymorphesizes Cython and gcc together in an attempt to make a platform independent, stable, and easy to use compiler.
Cyther was specifically designed to combat the vcvarsall.bat not found
error on windows, but still work on every other system with exactly the same performance.
I need to make sure that it is completely cross-platform.
I've tried my best, but alas, I may have slipped up here and there. I am also not very familiar with the oddities of other operating systems. I was thinking about using a Virtual Box to run a few other operating systems to test Cyther on, but I can only get so far by trial and error.
The issue: Cyther makes a few assumptions about your system that I'm not sure how to handle:
- Your environment path variable is named 'PATH'
- Cython, Python and gcc are all '.exe'
- libpythonXY.a exists in your libs directory
- mingw32 is up to date with the latest Python release (Windows)
- 'cyther' is callable from the command line
- Any gcc compiled C program will work on Windows
__author__ = 'Nicholas C. Pandolfi'
license = '''
Copyright (c) 2016 Nicholas C Pandolfi ALL RIGHTS RESERVED (MIT)
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall
be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
'''
import os
import sys
import subprocess
import argparse
import platform
import errno
import time
#The system constants
ver = str(sys.version_info.major) + str(sys.version_info.minor)
home = sys.exec_prefix
libs = os.path.join(home, 'libs')
include = os.path.join(home, 'include')
a_lib = 'python' + ver
is_windows = platform.platform().split('-')[0] == 'Windows' # This line and the one below it are delicate
default_output_extension = '.pyd' if is_windows else '.so'
INTERVAL = .25
assumptions = """
Assumptions cyther makes about your system:
1) When you put the -z flag to print this very string:
you must also have the filename specified or else it will return an error
2) Your environment path variable is named 'PATH'
3) Cython, Python and gcc are all .exe
4) mingw32 is up to date with the latest python release (windows)
5) 'cyther' is callable from the command line
6) Any gcc compiled c program will work on windows
"""
class CytherError(Exception):
def __init__(self, message):
Exception.__init__(self, message)
self.message = 'CytherError: {}'.format(repr(message))
#Holds each file's attributes
class AttributeHolder:
pass
#My best shot at a raceless make directory function
def raceless_mkdir(directory):
if not os.path.exists(directory):
try:
os.makedirs(directory)
except OSError as error:
if error.errno != errno.EEXIST:
raise
#gets all files in a specific directory
def get_files(d):
files = []
for file in os.listdir(d):
if os.path.isfile(os.path.join(d, file)):
files.append(file)
return files
#find the program 'program' in the system path
def which(program):
def is_exe(filename):
return os.path.isfile(filename) and os.access(filename, os.X_OK)
filename = os.path.split(program)[0]
if filename:
if is_exe(program):
return program
else:
for path in os.environ["PATH"].split(os.pathsep):
path = path.strip('"')
exe_file = os.path.join(path, program)
if is_exe(exe_file):
return exe_file
return None
#For each file, this constructs all build names and necessary variables to pass along with it to be made
def process_files(args):
to_process = []
for filename in args.filenames:
file = AttributeHolder()
if os.path.exists(filename) and (filename not in os.listdir(os.getcwd())):
file.file_path = filename
elif os.path.exists(os.path.join(os.getcwd(), filename)):
file.file_path = os.path.join(os.getcwd(), filename)
else:
raise CytherError("The file '{}' does not exist".format(filename))
file.file_base_name = os.path.splitext(os.path.basename(file.file_path))[0]
file.no_extension, file.extension = os.path.splitext(file.file_path)
if file.extension not in ('.pyx', '.py'):
raise CytherError("The file '{}' is not a designated cython file".format(file.file_path))
base_path = os.path.dirname(file.file_path)
local_build = args.local
if not local_build:
cache_name = os.path.join(base_path, '__cythercache__')
raceless_mkdir(cache_name)
file.c_name = os.path.join(cache_name, file.file_base_name) + '.c'
else:
file.c_name = file.no_extension + '.c'
if not args.output_name:
file.output_name = file.no_extension + default_output_extension
else:
file.output_name = args.output_name
if not os.path.exists(os.path.dirname(file.output_name)):
raise CytherError('The directory specified to write the output file in does not exist')
file.stamp_if_error = 0
to_process.append(file)
return to_process
#Figures out whether the source code has a corresponding compile
def should_compile(file):
if os.path.exists(file.output_name):
source_time = os.path.getmtime(file.file_path)
output_time = os.path.getmtime(file.output_name)
if source_time > output_time:
return True
else:
return True
return False
def core(args, file):
#Recieves the pass off args
cython_args = args.cython_args if args.cython_args else []
cython_pass_off = []
gcc_args = args.gcc_args if args.gcc_args else []
gcc_pass_off = []
#This command and the rest contruct the arguments to use for compiling given a preset
preset = args.preset
if not preset:
preset = 'ninja'#Default
if preset == 'ninja':
cython_command = ['cython', '-a', '-p', '-o', file.c_name, file.file_path]
gcc_command = ['gcc', '-shared', '-w', '-O3', '-I', include, '-L', libs, '-o', file.output_name, file.c_name,
'-l', a_lib]
elif preset == 'beast':
cython_command = ['cython', '-a', '-l', '-p', '-o', file.c_name, file.file_path]
gcc_command = ['gcc', '-shared', '-Wall', '-O3', '-I', include, '-L', libs, '-o', file.output_name,
file.c_name, '-l', a_lib]
elif preset == 'minimal':
cython_command = ['cython', '-o', file.c_name, file.file_path]
gcc_command = ['gcc', '-shared', '-I', include, '-L', libs, '-o', file.output_name, file.c_name, '-l', a_lib]
else:
raise CytherError("The format '{}' is not supported".format(preset))
#Process cython and gcc pass off args
for offset, process in enumerate((cython_args, gcc_args)):
for arg in process:
if arg[0] == '_':
if arg[1] == '_':
arg = '--' + arg[2:]
else:
arg = '-' + arg[1:]
if offset:
gcc_pass_off.append(arg)
else:
cython_pass_off.append(arg)
#This filters the args to give to cython
if cython_pass_off:
for item in cython_pass_off:
if item[0] == '-':
if item not in cython_command:
cython_command.append(item)
#This filters the args to give to gcc
if gcc_pass_off:
for item in gcc_pass_off:
if item[0] == '-':
if item not in gcc_command:
gcc_command.append(item)
if not args.skip:
#Figure out if the selected software is installed
python_found = which('python.exe')
cython_found = which('cython.exe')
gcc_found = which('gcc.exe')
#Raise errors if software is not installed
if not python_found:
raise CytherError("Python is not able to be called, please add it to the system's path")
if not cython_found:
try:
import cython
raise CytherError("Cython exists and is able to be imported by Python, " + \
"however it is not in the system path. Please add it.")
except ImportError:
raise CytherError("Cython is unable to be imported, and is probably not installed")
if not gcc_found:
raise CytherError("gcc is not able to be called, please add it to the system's path")
print(' '.join(cython_command).strip())
print(' '.join(gcc_command).strip())
#cythonize
cython_error = subprocess.call(cython_command)
string = str(cython_error)
if cython_error:
if args.watch:
print(string)
return -42
else:
raise CytherError(string)
#gcc'ize'
gcc_error = subprocess.call(gcc_command)
string = str(cython_error)
if gcc_error:
if args.watch:
print(string)
return -42
else:
raise CytherError(string)
return 1
def main(args):
numfiles = len(args.filenames)
interval = INTERVAL / numfiles
if args.assumptions:
print(assumptions)
return 0
if type(args) == str:
args = parser.parse_args(args.split(' '))
elif type(args) == argparse.Namespace:
pass#dont do anything
else:
raise CytherError("Args must be a instance of str or argparse.Namespace, not '{}'".format(str(type(args))))
files = process_files(args)
if not args.timestamp and args.watch:
args.timestamp = True
while True:
for file in files:
if args.timestamp:
if should_compile(file) and os.path.getmtime(file.file_path) > file.stamp_if_error:
if args.watch:
if len(args.filenames) > 1:
print("Compiling the file '{}'".format(file.file_path))
else:
print('Compiling the file')
else:
pass#dont print anything, its a single file
print('')
ret = core(args, file)
if ret == -42:
file.stamp_if_error = time.time()
if args.watch:
print('\n...\n')
else:
if not args.watch:
if len(args.filenames) > 1:
print("Skipping the file '{}'".format(file.file_path))
else:
print('Skipping compilation')
else:
pass#dont print anything, its a single file
continue
if not args.watch:
break
else:
time.sleep(interval)
help_filenames = 'The Cython source file'
help_preset = 'The preset options for using cython and gcc (ninja, verbose, beast)'
help_timestamp = 'If this flag is provided, cyther will not compile files that have a modified time before that of your compiled .pyd or .so files'
help_output = 'Change the name of the output file, default is basename plus .pyd'
help_assumptions = 'Print the list of assumptions cyther makes about your system before running'
help_skip = 'Skip the checking procedures that make sure that the right software is installed on your system (saves significant time if everything has already been checked, but not recommended)'
help_local = 'When not flagged, builds in __omicache__, when flagged, it builds locally in the same directory'
help_watch = 'When given, cyther will watch the directory with the \'t\' option implied and compile, when necessary, the files given'
help_cython = "Arguments to pass to Cython (use '_' or '__' instead of '-' or '--'"
help_gcc = "Arguments to pass to gcc (use '_' or '__' instead of '-' or '--'"
#All the command line processing statements
parser = argparse.ArgumentParser(description = 'Auto compile and build .pyx files in place.', usage = 'cyther [options] input_file')
parser.add_argument('filenames', action = 'store', nargs = '+', type = str, help = help_filenames)
parser.add_argument('-p', '--preset', action = 'store', type = str, default = '', dest = 'preset', help = help_preset)
parser.add_argument('-t', '--timestamp', action = 'store_true', default = False, dest = 'timestamp', help = help_timestamp)
parser.add_argument('-o', '--output', action = 'store', dest = 'output_name', type = str, help = help_output)
parser.add_argument('-z', '--assumptions', action = 'store_true', default = False, dest = 'assumptions', help = help_assumptions)
parser.add_argument('-skip', action = 'store_true', default = False, help = help_skip)
parser.add_argument('-l', '--local', action = 'store_true', dest = 'local', default = False, help = help_local)
parser.add_argument('-w', '--watch', action = 'store_true', dest = 'watch', default = False, help = help_watch)
parser.add_argument('-cython', action = 'store', nargs = '+', dest = 'cython_args', help = help_cython)
parser.add_argument('-gcc', action = 'store', nargs = '+', dest = 'gcc_args', help = help_gcc)
command_line_args = parser.parse_args()
#print the assumptions in the help text
if __name__ == '__main__':
main(command_line_args)
Cyther has been published on pypi.
-
2\$\begingroup\$ Please avoid editing your question with updated code. That invalidates the answers. Feel free to link to your new work, or even make a new question if you want a second review, but please avoid adding the new code into the question. Read What to do when I receive answers in the help centre for more information. \$\endgroup\$Quill– Quill2016年03月18日 01:42:36 +00:00Commented Mar 18, 2016 at 1:42
-
\$\begingroup\$ Thats why I put two copies of the code, a old and a new \$\endgroup\$Nick Pandolfi– Nick Pandolfi2016年03月18日 01:43:12 +00:00Commented Mar 18, 2016 at 1:43
-
\$\begingroup\$ Two copies or not, the rule still applies. \$\endgroup\$Quill– Quill2016年03月18日 01:43:50 +00:00Commented Mar 18, 2016 at 1:43
-
\$\begingroup\$ Ok, sorry, I didn't know that. Thanks though! \$\endgroup\$Nick Pandolfi– Nick Pandolfi2016年03月18日 01:45:01 +00:00Commented Mar 18, 2016 at 1:45
2 Answers 2
I will give a quick overview of some aspects that I found at a first glance. Unfortunately, I could not spare the time to do an in-depth analysis of the cross-plattform capabilites and/or thorough testing of your program. As I find your idea very interesting, I maybe check back later with more feedback.
Edit: Updated requirements for proper import and added some thoughts on the given assumptions.
Magic values
On the first sight, I was able to spot some magic values (like -42
), which maybe should be replaced with a CONSTANT
with a nice name (GCC_ERROR
?).
Module level constants
The values of assumptions
and the helper_*
are constants in this module. It is common practice in Python to write such module level constants in ALL_UPPERCASE_LETTERS
.
Missing docstrings
The next thing that caught my eye was a sincere lack of class/method/function docstrings. As the code (maybe) will be available to the public, assume someone will mess around with it. You may help others a lot if you tell them a little bit about each function and maybe the inputs it expects. (削除) The Python bible (削除ここまで) PEP257 is a good reference on this topic.
In addition some comments in the code may help to split longer functions into logical groups, which greatly simplifies rewiews and reworks of the code.
Redefinition of built-ins
With a little help from Spyder's synthax highlighting, I was able to find two redifintions of Python built-ins: license
and file
(talking from a Python 2.7 point of view, maybe that has changed). Refering to a SO answer, __license__
would be the way to go on this, whereas an alternative for file
is left as an exercise to the dear reader.
Space around keyword arguments
Following the infamous PEP8, there should be no spaces around keyword arguments. As an example
parser = argparse.ArgumentParser(description = '...', usage = '...')
would end up as
parser = argparse.ArgumentParser(description='...', usage='...')
Typechecking
From my experience it is more common to check the type of an object with isintance
instead of type
. With that in mind type(args) == str
becomes isinstance(args, str)
, for example.
Miscellaneous
One could move the whole parser initialization into the if __name__ == 'main':
block. If you want to import your program, you will have to move command_line_args = parser.parse_args()
to that block. Otherwise the parser will be started on import, which will cause an error because the script is called with no arguments.
The assumptions
1) Your environment path variable is named 'PATH'
All Linux and Windows systems I have worked on had an enviroment variable called PATH
. It is an educated guess, but from my point of view that assumptions is valid.
2) Cython, Python and gcc are all '.exe'
This may be a bit problematic. Linux systems do not have .exe
files usually. I can start the Python interpreter on Linux and Windows by typing python
into the console if the executable is in PATH
, but that may be caused by my setup. Maybe that can be exploited to check the pressence of Python and co.
3) libpythonXY.a exists in your libs directory
Linux seems to prefer .so
as file extension for libpython
(see this to the Ubuntu package repository for example)
4) mingw32 is up to date with the latest python release (windows)
I use Python library collections (Python(x,y) and/or Anaconda), of which at least one (to the best of my knowledge) comes bundled with its own version of MinGW. There is at least one more MinGW on my system. Following Murphy's Law, that may (and probably will) cause some hiccup, depending on which MinGW installation comes first in PATH
.
5) 'cyther' is callable from the command line
You will have to take care the cyther
is executable then. SO and the internet will give you a lot of advice how this can reached.
6) Any gcc compiled c program will work on windows
My experience says: "Not gonna happen!". Sorry. But I wish you the best of luck thay "any" will be "almost any".
This isn't really organized nor does it have any kind of theme - I just started from the top and went to the bottom, commenting as I went. I also agree with everything listed in Alex Vorndran's answer and didn't repeat anything he said (I think). I will say that this seems to be somewhat far off from actually being highly portable.
is_windows
You could probably make this less fragile if you wrote something like
is_windows = platform.platform().strip().lower().startswith('windows')
Then you're a little less susceptible to format changes IMO. It would probably be safe to lose the .strip()
call as well.
CytherError
Prefer super
to Exception.__init__
super(Exception, self).__init__(message)
It is also super generic - I'm not sure that it actually helps except to identify that the error is coming from Cyther, which I think they should have figured out on their own. Either make more specific errors that inherit this, or just use the standard library exceptions.
AttributeHolder
If you aren't actually going to give this any values or control what attributes it will have, just use a dictionary instead.
raceless_mkdir
I think you've increased the likelihood of race conditions by adding the os.path.exists
check. Just try to make the directory, then move on. I also question that your program should be making directories (this always makes me nervous) that might already exist - either figure out something that you know only yours will ever make (prefix it somehow) or error out if they don't have the directory themselves.
get_files
Prefer a list/generator comprehension here, or just a generator
def get_files(d):
return [file_ for file_ in os.listdir(d)
if os.path.isfile(os.path.join(d, file_))]
or
def get_files(d):
for file_ in os.listdir(d):
if os.path.isfile(os.path.join(d, file_)):
yield file_
raise StopIteration
which
Why is is_exe
a closure? I don't see any benefit there. I also don't like that you return None - raising an exception seems more logical here.
process_files
This could use a lot of whitespace to help with readability.
if file.extension not in ('.pyx', '.py')
- You might be surprised at how often people use non-standard extensions, and even more surprised at how often they have decent reasons for it. You might want to make this the default, but make it configurable/extensible. It should probably also be a named constant.
You don't need to assign to a local variable local_build
- just use args.local
I think this would be much better as a generator - less of a memory footprint, it won't hang forever building up a potentially large list, etc.
should_compile
There is no indication as to why these files should or should not compile - is it just checking if they're out of date? If so, then the name should reflect that.
core
Anytime I see something named core
(or similar) that isn't just a few lines I cringe - you're doing too much here. Break it up into smaller functions for readability and maintainability.
You also have lots of magic values here - what, if anything, do they signify? This should probably be documented somewhere.
Is there a use-case for people who want to use your program but not gcc? You don't support that atm.
main
You have some redundancy in your loop.
if not args.watch:
if len(args.filenames) > 1:
print("Skipping the file '{}'".format(file.file_path))
else:
print('Skipping compilation')
else:
pass#dont print anything, its a single file
continue
First of all, leave some space before a comment so it can be read. Secondly, just leave this as
if not args.watch:
if len(args.filenames) > 1:
print("Skipping the file '{}'".format(file.file_path))
else:
print('Skipping compilation')
argparse
I really don't like -long_name
flags - prefer -l
and --long_name
for consistency with most people's expectations, or just use long_name
as a subparser.
-
\$\begingroup\$ Did you see my new question here? I love the suggestions, they just are a little out-dated. codereview.stackexchange.com/questions/123999/… \$\endgroup\$Nick Pandolfi– Nick Pandolfi2016年03月28日 03:43:56 +00:00Commented Mar 28, 2016 at 3:43
-
\$\begingroup\$ So sorry for the mix up!! \$\endgroup\$Nick Pandolfi– Nick Pandolfi2016年03月28日 03:45:06 +00:00Commented Mar 28, 2016 at 3:45