BLD,BUG: Add charset-normalizer to improve compatibility with non-ascii environments. #266

Open

BeiyanYunyi wants to merge 2 commits into NCAR:develop from BeiyanYunyi:fix-build

@BeiyanYunyi BeiyanYunyi commented Apr 16, 2025

On a system whose LANG environment variable selects a non-English locale, gfortran can emit non-ASCII text in its preprocessor output. My working environment is Linux with LANG=zh_CN.UTF-8, where

gfortran -E ompgen.F90 -o omp.f90 -cpp

will output:

# 1 "ompgen.F90"
# 1 "<built-in>"
# 1 "<命令行>"
# 1 "ompgen.F90"
!... other code

instead of:

# 1 "ompgen.F90"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "ompgen.F90"

The Chinese characters on line 3 cause the project build to fail:

 Traceback (most recent call last):
   File "/home/BeiyanYunyi/.cache/uv/builds-v0/.tmpP5ioKB/lib/python3.11/site-packages/numpy/f2py/crackfortran.py", line 391, in readfortrancode
     l = fin.readline()
         ^^^^^^^^^^^^^^
   File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/fileinput.py", line 292, in readline
     line = self._readline()
            ^^^^^^^^^^^^^^^^
   File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/fileinput.py", line 372, in _readline
     return self._readline()
            ^^^^^^^^^^^^^^^^
   File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/encodings/ascii.py", line 26, in decode
     return codecs.ascii_decode(input, self.errors)[0]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 71: ordinal not in range(128)
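
The failure can be illustrated without gfortran or f2py. The following is a minimal sketch, not f2py's actual code: it takes the three preprocessor lines quoted above as sample input (the real file is longer, so byte offsets will differ), encodes them as UTF-8, and tries to decode the bytes with the ascii and utf-8 codecs. The helper name try_decode is made up for this illustration.

```python
# Sample preprocessor output under LANG=zh_CN.UTF-8, copied from the report
# above and encoded the way it would be stored on disk (UTF-8).
sample = '# 1 "ompgen.F90"\n# 1 "<built-in>"\n# 1 "<命令行>"\n'
raw = sample.encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Hypothetical helper: return decoded text, or a description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        bad_byte = data[exc.start]
        return f"failed: {exc.reason} (byte 0x{bad_byte:02x} at offset {exc.start})"


ascii_result = try_decode(raw, "ascii")
utf8_result = try_decode(raw, "utf-8")

print(ascii_result)           # the ascii codec rejects the first byte of "命"
print(utf8_result == sample)  # the utf-8 codec round-trips the text
```

The first byte of "命" in UTF-8 is 0xe5, which matches the byte named in the traceback.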

To reproduce the bug, simply run this command in the repo (POSIX environment):

LANG=zh_CN.UTF-8 pip install

As numpy.f2py suggests, installing the charset_normalizer package will likely help f2py determine the input file encoding correctly. After adding charset-normalizer to build-system.requires, the encoding is inferred correctly and I was able to build this package.
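
For readers unfamiliar with build-system.requires, the change has roughly the following shape in pyproject.toml. This is a sketch only: the project's existing build requirements are not shown in this thread, so they are represented by a placeholder comment rather than guessed.

```toml
[build-system]
requires = [
    # ... existing build requirements stay unchanged (not shown in this thread) ...
    "charset-normalizer",  # lets f2py detect the encoding of preprocessed sources
]
```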

@BeiyanYunyi BLD,BUG: Add charset-normalizer to improve compability with non-ascii environment (c7dfeb3)

@kafitzgerald kafitzgerald (Collaborator) commented Apr 17, 2025

Thanks for the PR!

I'll take a look at this tomorrow.

@BeiyanYunyi Merge branch 'NCAR:develop' into fix-build (f94dd02)
