Disclaimers:
- This question is reposted from SO upon SO user's suggestion to put it here since there is no specific code in question.
- This question is a subset of my larger theme of Fortran modernization.
- Useful versions of this question have already been asked (1, 2, 3), and there are blog posts on the topic. Although they are helpful, some of them are 5-10 years old, so I am asking in a similar way to learn what best practice is now in the context of my situation.
Background
Our current code base is largely Fortran (400K LOC, most F90/some F77/maybe some newer versions), dispersed amongst some 10 separate dev teams (organizationally) in (somewhat) standalone modules.
We are exploring the best practice idea of taking one of those modules and writing it in a faster-to-develop language (Python), and wrapping and extending bottlenecks in Fortran.
However, since the module is currently written entirely in Fortran and interacts with other Fortran modules and a main Fortran loop, we need to keep the module's interface to the main Fortran loop the same.
So the incremental change approach for this Fortran module – and other similarly architected Fortran modules – is to keep the interfaces the same and change the internals.
In other words, the shell or signature of the Fortran subroutines or functions would be unchanged, but the functionality inside the function would be changed.
This functionality inside the function is what we want to rewrite in Python.
From discussions online, and my intuition (whatever that's worth), it seems ill-advised to embed Python like this; the usual advice is to do the reverse and extend Python with Fortran (for bottlenecks, as needed).
However, this embedding Python approach seems the only logical, smallest, atomic step forward to isolate changes to the larger Fortran system.
It appears that there are some options for this embedding Python problem.
The most future-proof and stable way is via ISO_C_BINDING and using the Python C API (in theory...).
There are also stable solutions via Cython and maybe ctypes that are reliable and well maintained.
There are more dependency-heavy approaches like cffi and forpy, which introduce complexity in exchange for not having to write the additional C interface code.
We are also aware of simple system calls of Python scripts from Fortran, but these seem too disk write/read heavy; we would really like to do minimal disk reads/writes and keep the array passing in memory (hopefully just pointer locations passed, not array copies).
Another constraint is that this Fortran-Python interface should be able to handle most common datatypes and structures, including multi-dimensional arrays, or at least have a way to decompose complex custom datatypes to some collection of simpler datatypes like strings and arrays.
Question:
Given the above description, what are the current best-practice ways to call Python from Fortran with minimal additional C code (so a package like cffi would be preferred over the Python C API approach)?
- We are using Fortran 2003 or 2008; I am not sure whether all Fortran 2018 features are implemented in our Intel Fortran compiler.
Edit (2020-04-16):
To specify this question further, I am rephrasing the question to be specific to writing as little C overhead code as possible.
- To the downvoter: come on! This question is exactly the opposite of something which "needs more focus." Well-thought, well written, detailed, showing research and asking one, precise thing. What else do you need from a question? – Arseni Mourzenko, Apr 6, 2020 at 22:57
- Despite its age, the answers in stackoverflow.com/questions/17075418/… still seem to be highly relevant. – Bart van Ingen Schenau, Apr 7, 2020 at 9:40
- My advice would be to create a C-callable library from your Fortran code, and call it from Python with its C FFI. You've put a lot of time and effort into that Fortran code; it probably makes more sense to preserve it in its present form. If you were translating it to another language, C would be the most logical choice, not Python. Also, everything this person says. – Robert Harvey, Apr 16, 2020 at 17:09
1 Answer
Integrating Python with a Fortran codebase that has this much time and effort invested in it ought to prioritise interoperability and maintainability without sacrificing performance. Embedding Python directly into Fortran, though technically feasible, typically introduces substantial overhead, complexity, and long-term maintainability issues. I think the best approach here is to create a C-compatible interface to the existing Fortran code (using ISO_C_BINDING from Fortran 2003) and invoke the compiled routines from Python.
Given your stated preference for minimal additional C code, practical options for this integration include Python's ctypes or cffi libraries. Both are widely adopted and reliable. However, cffi generally offers higher-level abstractions and less boilerplate than ctypes, which makes it particularly suitable for complex interfaces or when aiming to minimise explicit C interface code. ctypes, although simpler (it ships with the standard library), tends to be more verbose and lower-level, requiring explicit pointer management, particularly with multidimensional arrays or complex data structures. In practice, runtime performance differences between ctypes and cffi are often negligible compared to the efficiency gains from compiled Fortran routines; thus, cffi's simplicity and maintainability advantages usually outweigh minor overhead concerns.
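To make the "explicit pointer management" point concrete, here is a minimal ctypes-only sketch of preparing a two-dimensional array for a Fortran routine. Fortran stores arrays in column-major order, so a row-oriented Python structure has to be flattened column by column before its pointer is handed over. The helper name `to_fortran_buffer` is hypothetical, and the sketch copies the data for clarity; it is one possible way to do this, not the only one.

```python
import ctypes


def to_fortran_buffer(rows):
    """Copy a list of equal-length rows into a column-major C double buffer.

    Hypothetical helper: returns (buffer, nrow, ncol). The buffer has the
    memory layout of a Fortran array declared as real(c_double) :: a(nrow, ncol).
    """
    nrow, ncol = len(rows), len(rows[0])
    buf = (ctypes.c_double * (nrow * ncol))()
    for j in range(ncol):
        for i in range(nrow):
            # Element (i, j) lives at offset i + j * nrow in column-major order
            buf[i + j * nrow] = rows[i][j]
    return buf, nrow, ncol


matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
buf, nrow, ncol = to_fortran_buffer(matrix)

# A double* view of the same memory, as a bind(C) routine would receive it
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))
print(list(buf))  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

If NumPy is available, `numpy.asfortranarray` produces the same layout without a hand-written loop.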
Consider an illustrative example of a Fortran subroutine using ISO_C_BINDING:
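    subroutine my_subroutine(arg1, arg2) bind(C, name="my_subroutine")
        use iso_c_binding, only: c_int, c_double
        implicit none
        integer(c_int), value :: arg1          ! passed by value (C int)
        real(c_double), intent(inout) :: arg2  ! passed by reference (C double *)
        ! Implementation
    end subroutine my_subroutine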
After compilation into a shared library, interfacing directly from Python using cffi would look like:
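    from cffi import FFI

    ffi = FFI()
    # Declare the C signature that the bind(C) subroutine exposes
    ffi.cdef("void my_subroutine(int arg1, double *arg2);")
    # Load the shared library in ABI mode (no C compiler needed at this step)
    C = ffi.dlopen("./my_fortran_lib.so")
    # Allocate one C double initialised to 3.14; read it back as arg2[0]
    arg2 = ffi.new("double *", 3.14)
    C.my_subroutine(10, arg2)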
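For comparison, the same call can be sketched with ctypes from the standard library. This is a minimal sketch, not a definitive implementation: `load_my_subroutine` shows how the real routine might be bound, while `stand_in_subroutine` is a hypothetical Python callback with the same C signature, so that the example runs without a compiled library. Its behaviour (scaling `arg2` by `arg1`) is an assumption made purely for illustration.

```python
import ctypes

# C prototype exposed by the bind(C) subroutine:
#   void my_subroutine(int arg1, double *arg2)
DOUBLE_P = ctypes.POINTER(ctypes.c_double)
PROTOTYPE = ctypes.CFUNCTYPE(None, ctypes.c_int, DOUBLE_P)


def load_my_subroutine(lib_path):
    """Bind the routine from a real shared library (path supplied by the caller)."""
    lib = ctypes.CDLL(lib_path)
    func = lib.my_subroutine
    func.argtypes = [ctypes.c_int, DOUBLE_P]
    func.restype = None
    return func


@PROTOTYPE
def stand_in_subroutine(arg1, arg2):
    # Assumed behaviour, for illustration only: scale arg2 in place by arg1
    arg2[0] = arg2[0] * arg1


def call_subroutine(func, n, x):
    """Call func(n, &value) and return the updated value."""
    value = ctypes.c_double(x)
    func(n, ctypes.byref(value))
    return value.value


result = call_subroutine(stand_in_subroutine, 4, 2.5)
print(result)  # 10.0
```

With a compiled library in place, `call_subroutine(load_my_subroutine("./my_fortran_lib.so"), 10, 3.14)` would mirror the cffi call above. The extra `argtypes` and `restype` bookkeeping is the kind of boilerplate that cffi's `cdef` string replaces.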
This approach preserves the robustness and computational efficiency of the existing Fortran codebase while enabling manageable integration with Python's data handling and analysis ecosystem. Adopting cffi aligns well with contemporary best practice because of its simplicity, minimal C overhead, and stability. Notably, cffi is developed by the PyPy team and is the recommended way to call C from PyPy, and SciPy accepts cffi function pointers in scipy.LowLevelCallable, which underscores its suitability and reliability.
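On the question's requirement that arrays stay in memory and only pointers cross the language boundary, the following ctypes sketch shows one way to share a Python buffer with a compiled routine without copying. The routine `scale_in_place` is a hypothetical stand-in written as a Python callback with a C signature so that the example runs on its own; a real bind(C) Fortran routine taking `(int n, double *x)` would be called the same way.

```python
import ctypes
from array import array

DOUBLE_P = ctypes.POINTER(ctypes.c_double)


@ctypes.CFUNCTYPE(None, ctypes.c_int, DOUBLE_P)
def scale_in_place(n, x):
    # Stand-in for a compiled routine: doubles n values through the raw pointer
    for i in range(n):
        x[i] = x[i] * 2.0


data = array("d", [1.0, 2.0, 3.0])

# from_buffer shares memory with `data`; no elements are copied
view = (ctypes.c_double * len(data)).from_buffer(data)
scale_in_place(len(data), view)

print(data.tolist())  # [2.0, 4.0, 6.0]
```

Because `view` and `data` refer to the same memory, the change made through the pointer is visible in the original array. NumPy arrays expose the same buffer protocol, so the pattern carries over to them.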
Another, more recent alternative is fmodpy, which automates the generation of Python wrappers around existing Fortran code with minimal manual interface specification. It analyses the Fortran source to produce Python interfaces, further reducing boilerplate and manual effort. It is most useful if you prefer a fully automated approach that minimises both manual code modifications and explicit handling of intermediate C interfaces while preserving computational efficiency. I have tried it in a couple of projects but ended up reverting to ctypes or cffi, which I was more familiar with, so I can't really recommend it from experience. I would still advise you to look into it for this or similar projects.
Summing up, embedding Python directly into Fortran is best avoided. Instead, prefer ISO_C_BINDING coupled with cffi for a relatively simple, maintainable, and efficient solution.