I'm getting lots of warnings like this in Python:
DeprecationWarning: invalid escape sequence \A
orcid_regex = '\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z'
DeprecationWarning: invalid escape sequence \/
AUTH_TOKEN_PATH_PATTERN = '^\/api\/groups'
DeprecationWarning: invalid escape sequence \
"""
DeprecationWarning: invalid escape sequence \.
DOI_PATTERN = re.compile('(https?://(dx\.)?doi\.org/)?10\.[0-9]{4,}[.0-9]*/.*')
<unknown>:20: DeprecationWarning: invalid escape sequence \(
<unknown>:21: DeprecationWarning: invalid escape sequence \(
What do they mean? And how can I fix them?
In Python 3.12+ the warning was changed from a DeprecationWarning to a SyntaxWarning (changelog):
SyntaxWarning: invalid escape sequence '\A'
2 Answers
\ is the escape character in Python string literals.
For example if you want to put a tab character in a string you may use:
>>> print("foo \t bar")
foo 	 bar
If you want to put a literal \ in a string you may use \\:
>>> print("foo \\ bar")
foo \ bar
Or you may use a "raw string":
>>> print(r"foo \ bar")
foo \ bar
You can't just go putting backslashes in string literals whenever you want one. A backslash is only allowed when part of one of the valid escape sequences, and it will cause a DeprecationWarning (< 3.12) or a SyntaxWarning (3.12+) otherwise. For example \A isn't a valid escape sequence:
$ python3.6 -Wd -c '"\A"'
<string>:1: DeprecationWarning: invalid escape sequence \A
$ python3.12 -c '"\A"'
<string>:1: SyntaxWarning: invalid escape sequence '\A'
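The same check can be done from inside Python instead of the shell. This is a sketch, assuming a Python version where the invalid escape is still a warning (DeprecationWarning before 3.12, SyntaxWarning from 3.12 on); the helper name escape_warning_categories is made up for illustration.

```python
import warnings

def escape_warning_categories(literal_source):
    """Compile one string literal and return the names of the warning categories emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        compile(literal_source, "<string>", "eval")
    return [w.category.__name__ for w in caught]

# The arguments are raw strings so that the *source text* being compiled
# contains exactly the backslashes shown.
print(escape_warning_categories(r'"\A"'))    # DeprecationWarning or SyntaxWarning, depending on version
print(escape_warning_categories(r'"\\A"'))   # []
print(escape_warning_categories(r'r"\A"'))   # []
```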
If your backslash sequence does accidentally match one of Python's escape sequences, but you didn't mean it to, that's even worse because the data is just corrupted without any error or warning.
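To illustrate the silent corruption case, here is a small sketch using \b, which is a valid Python escape (backspace) but means "word boundary" in a regular expression. The sample text is an arbitrary choice.

```python
import re

plain = "\bcat\b"    # \b is a valid escape here: the backspace character (0x08)
raw = r"\bcat\b"     # backslash + b reaches the regex engine: word boundary

print(len(plain), len(raw))                  # 5 7
print(re.search(plain, "a cat sat"))         # None (looks for literal backspaces)
print(re.search(raw, "a cat sat").group())   # cat
```

No warning is emitted for the first pattern, it just never matches.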
So you should always use raw strings or \\.
It's important to remember that a string literal is still a string literal even if that string is intended to be used as a regular expression. Python's regular expression syntax supports many special sequences that begin with \. For example \A matches the start of a string. But \A is not valid in a Python string literal! This is invalid:
my_regex = "\Afoo"
Instead you should do this:
my_regex = r"\Afoo"
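Applied to the patterns from the question, the fix might look like the sketch below. The sample inputs are assumptions chosen only to exercise the patterns. Note that / is not special in Python regular expressions, so \/ can simply be written as /.

```python
import re

# Raw strings: every backslash reaches the regex engine unchanged.
ORCID_REGEX = re.compile(r'\A[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]\Z')
AUTH_TOKEN_PATH_PATTERN = re.compile(r'^/api/groups')  # no need to escape "/"
DOI_PATTERN = re.compile(r'(https?://(dx\.)?doi\.org/)?10\.[0-9]{4,}[.0-9]*/.*')

# Sample inputs below are illustrative, not taken from the question.
print(bool(ORCID_REGEX.match("0000-0002-1825-0097")))          # True
print(bool(AUTH_TOKEN_PATH_PATTERN.match("/api/groups/1")))    # True
print(bool(DOI_PATTERN.match("https://doi.org/10.1000/182")))  # True
```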
Docstrings are another one to remember: docstrings are string literals too, and invalid \ sequences are invalid in docstrings too! Use r"""raw strings""" for docstrings if they must contain \.
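A minimal sketch of a raw docstring, assuming a made-up helper function collapse_whitespace whose documentation needs to mention a regex:

```python
import re

def collapse_whitespace(text):
    r"""Replace every run matched by the regex \s+ with a single space."""
    # The r prefix above keeps \s in the docstring from being an invalid escape.
    return re.sub(r"\s+", " ", text).strip()

print(collapse_whitespace.__doc__)
print(collapse_whitespace("a \t b\n c"))  # a b c
```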
6 Comments
r"{}".format(my_variable)
"\\." or r"\." instead of "\.". Took me a while to figure out. Your answer helped. Thanks.
r'{}'.format(my_variable) and '{}'.format(my_variable) are exactly the same thing; the difference between them accomplishes no benefit, because {} contains no characters with parsing that's different between raw and conventional interpretation. What's in my_variable is irrelevant, because format() is only called after '{}' or r'{}' is parsed, and they both parse to the exact same thing.
r only do special manipulation for "treat backslashes as literal characters" 2. parse order is implied by that here we actually call one method docs.python.org/3/library/stdtypes.html#str.format for the same class instance compared by == stackoverflow.com/a/1227325/21294350.

For convenience, you can use the following method to automatically add r to docstrings:
- write a script using e.g. libcst to parse the source code, modify it to add r at appropriate places, then write it back
- run the script on your code
The advantage of using a CST (concrete syntax tree) over an AST is that formatting details, for example the number of parentheses or the whitespace, are preserved.
You can write the script manually, or GPT o3-mini is capable of writing it for you. I successfully used the following prompt on https://duck.ai to write a script:
write Python program using libcst to automatically add r to docstrings that would raise syntax warning (invalid escape sequence)
Because posting content generated by generative AI tools is not allowed, you can run the prompt above on the AI tool yourself. For convenience, I put the response I got at https://gist.github.com/user202729/78846233ae50f298cd1d20a8f79cf86e , although you should be able to get the same by following the instructions outlined above.
Example usage:
[]$ cat a.py
def f():
    """
    hello \s world
    """
    print("\d")
[]$ python code-slightly-modified.py a.py
[]$ cat a.py
def f():
    r"""
    hello \s world
    """
    print("\d")
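In the output above, print("\d") is left untouched because it is not a docstring. To list whatever still triggers the warning after such a run, something like the following sketch could help. It is not the libcst script from the gist: it uses only the standard library tokenize module, it only reports and never rewrites, and the function name find_invalid_escapes is made up for illustration. On Python 3.12+ f-strings are tokenized differently and are skipped by this check.

```python
import io
import tokenize
import warnings

def find_invalid_escapes(source):
    """Return (line_number, literal) for each string token that triggers the warning."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue  # only plain string/bytes literals are checked
        with warnings.catch_warnings():
            warnings.simplefilter("error")  # turn the warning into an exception
            try:
                compile(tok.string, "<literal>", "eval")
            except (DeprecationWarning, SyntaxWarning, SyntaxError):
                hits.append((tok.start[0], tok.string))
    return hits

# Same content as a.py after the docstring fix (kept in a raw string here).
SAMPLE = r'''def f():
    r"""
    hello \s world
    """
    print("\d")
'''

print(find_invalid_escapes(SAMPLE))  # [(5, '"\\d"')]
```

Each string token is compiled on its own, so the reported line number is where the literal starts in the original file.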
My answer adds value on top of the AI output: pasting the question verbatim and asking for an automatic fix isn't particularly helpful, since the AI then suggests regex solutions, which are needless to say less robust than libcst. See:
1 Comment
\\ which is entirely valid, it is still converted to raw string.
'''' I'm stupid: r'$\Delta$' ''' in your code. The checker doesn't see $\Delta$ is latex, it doesn't see r prevents escaping anything, and it doesn't see anyway this is a comment enclosed in '''.