Skip to main content
Stack Overflow
  1. About
  2. For Teams

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

Required fields*

Required fields*

Python regex and Unicode [duplicate]

I am currently trying to figure out how to use Unicode in a regex in Python.

The regex I want to get to work is the following:

r"([A-ZÜÖÄß]+\s)+"

This should include all occurences of multiple capitalized words, that may or may not have Umlauts in them. Funnily enouth it will do nearly what I wanted, but it still ignores Umlauts.

For example, in FUßBALL AND MORE only BALL AND MORE should be detected.

I already tried to simply use the Unicode representations (Ü becomes \u00DC etc.), as it was advised in another thread, but that does not work too. Instead I might try to use the "regex" library instead of "re", but I kindoff want to know what I am doing wrong right now.

If you are able to enlighten me, please feel free to do so.

Answer*

Draft saved
Draft discarded
Cancel
0

lang-py

AltStyle によって変換されたページ (->オリジナル) /