16 questions
1
vote
2
answers
116
views
Convert Full width numbers into Normal numbers in python
I have data in an Excel file (only 1 column) where several Japanese characters are followed by fullwidth numbers. I want to convert these numbers into normal numbers.
いつもありがとう890ございます
...
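A minimal sketch of one possible approach to the question above, assuming the cell values are already loaded as Python 3 `str` objects (reading the Excel file is left out). The names `FULLWIDTH_DIGITS` and `to_ascii_digits` are hypothetical, and the sample string spells the fullwidth digits 8, 9, 0 as explicit escapes.

```python
import unicodedata

# Fullwidth digits occupy U+FF10..U+FF19; map each to its ASCII counterpart.
FULLWIDTH_DIGITS = {0xFF10 + i: ord("0") + i for i in range(10)}

def to_ascii_digits(text):
    """Replace only fullwidth digits, leaving every other character untouched."""
    return text.translate(FULLWIDTH_DIGITS)

sample = "いつもありがとう\uff18\uff19\uff10ございます"
converted = to_ascii_digits(sample)

# NFKC normalization also folds fullwidth digits to ASCII, but it rewrites
# other compatibility characters too (for example halfwidth katakana).
converted_nfkc = unicodedata.normalize("NFKC", sample)
```

The translation table touches digits only, which may be preferable when the rest of the text should stay byte-for-byte identical; NFKC is shorter but broader in effect.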
0
votes
1
answer
89
views
More efficient way to replace special chars with their unicode name in pandas df
I have a large pandas dataframe and would like to perform a thorough text cleaning on it. For this, I have crafted the below code that evaluates if a character is either an emoji, number, Roman number,...
0
votes
2
answers
868
views
Capture output including control characters of subprocess
I have the following simple program to run a subprocess and tee its output to both stdout and some buffer
import subprocess
import sys
import time
import unicodedata
p = subprocess.Popen(
"...
0
votes
1
answer
745
views
Convert check mark in Python
I have a dataframe which has, in a certain column, a check mark (unicode: '\u2714'). I have been trying to replace it with the following command:
import unicodedata
df['Column'].str.replace(...
1
vote
0
answers
44
views
UnicodeEncodeError printing Hangul characters in the terminal [duplicate]
This application runs on a mac only and I'm stuck with Python 2.
I have an input string '한글' which when decoded through an online unicode converter shows as \u1112\u1161\u11ab\u1100\u1173\u11af
...
0
votes
1
answer
1k
views
Understanding unistr of unicodedata.normalize()
Wikipedia basically says the following for the four values of unistr.
- NFC (Normalization Form Canonical Composition)
- Characters are decomposed
- then recomposed by canonical equivalence.
-...
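A small sketch that may help illustrate the normalization forms listed above, assuming Python 3. Note that `unicodedata.normalize(form, unistr)` takes the form name first and the string to normalize (`unistr`) second. The variable names are illustrative only.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE

# NFC: decompose, then recompose by canonical equivalence.
nfc = unicodedata.normalize("NFC", decomposed)

# NFD: canonical decomposition only.
nfd = unicodedata.normalize("NFD", composed)

# Compatibility forms (NFKC/NFKD) additionally fold characters such as
# ligatures, which the canonical forms leave alone.
ligature = "\ufb01"  # LATIN SMALL LIGATURE FI
nfc_ligature = unicodedata.normalize("NFC", ligature)
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
```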
3
votes
1
answer
429
views
Determine if a unicode character exists in a unicode subset
I'd like to find a way to determine if a Unicode character exists in a standardized subset of Unicode characters, specifically Latin basic and Latin-1. I am using Python 2 and the unicodedata module ...
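One simple way to test membership, sketched here for Python 3 rather than the Python 2 mentioned in the question, is to compare code points against the block boundaries: Basic Latin is U+0000 to U+007F and Latin-1 Supplement is U+0080 to U+00FF. The function names are hypothetical, and the sketch assumes each argument is a single character.

```python
def in_basic_latin(ch):
    """True if the single character lies in the Basic Latin block."""
    return ord(ch) <= 0x7F

def in_latin1(ch):
    """True if the single character lies in Basic Latin or Latin-1 Supplement."""
    return ord(ch) <= 0xFF

checks = {
    "a": (in_basic_latin("a"), in_latin1("a")),
    "\u00e9": (in_basic_latin("\u00e9"), in_latin1("\u00e9")),  # é
    "\u011b": (in_basic_latin("\u011b"), in_latin1("\u011b")),  # ě
}
```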
2
votes
1
answer
1k
views
What are the differences between the modules unicode and unicodedata?
I have a large dataset with over 2 million rows of textual data. Now I want to remove the accents from the strings.
In the link below, two different modules are described to remove the accents:
...
-1
votes
1
answer
497
views
C++ implementation of python unicodedata library
New user here, please be gentle.
We are looking to implement a piece of Python code in C++, but it involves an intricate Unicode library called unicodedata, in particular this function
...
-1
votes
1
answer
74
views
how to return values from map function on dataframe
I am trying to return values from map function but instead it gives me the memory address. I tried using list, but then it gives me an error stating str object doesn't have an attribute decode. Is ...
3
votes
1
answer
2k
views
Python convert this utf8 string to latin1
I have this UTF-8 string:
s = "Naděždaüäö"
Which I'd like to convert to a UTF-8 string which can be encoded in "latin-1" without throwing an exception. I'd like to do so by replacing every character ...
0
votes
3
answers
963
views
How to remove every possible accents from a column in python
I am new to Python. I have a data frame with a column named 'Name'. The column contains different types of accents. I am trying to remove those accents. For example, rubén => ruben, zuñiga => zuniga, ...
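A common technique for this kind of task, sketched below under the assumption of Python 3 and plain `str` values, is to decompose each string with NFD and drop the combining marks. The helper name `strip_accents` is hypothetical; with pandas it could presumably be applied to the column via `Series.map`.

```python
import unicodedata

def strip_accents(text):
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

names = ["rubén", "zuñiga"]
cleaned = [strip_accents(name) for name in names]
```

Letters without a canonical decomposition (for example 'ø' or 'ß') pass through unchanged, so this does not cover every possible accent-like character.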
1
vote
2
answers
1k
views
Remove special characters from string such as smileys but keep German special characters
I know how to remove unwanted characters in a string, like smileys etc. However, some languages like German have special characters, too.
This is my current code:
import unicodedata
string = "süß 😆😋...
2
votes
1
answer
2k
views
Get a list of all Greek unicode characters
I would like to know how to obtain a list of all Greek characters (upper and lowercase letters). I know how to find specific characters (unicodedata.lookup(name)), but I want all upper and lowercase ...
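One way this could be approached, sketched under the assumption of Python 3, is to scan the Greek and Coptic block (U+0370 to U+03FF) and the Greek Extended block (U+1F00 to U+1FFF) and keep the characters whose Unicode name marks them as Greek letters. The function name `greek_letters` is hypothetical, and the name-based filter also keeps accented and archaic letters.

```python
import unicodedata

def greek_letters():
    """Collect characters whose Unicode name identifies them as Greek letters."""
    letters = []
    for start, end in ((0x0370, 0x03FF), (0x1F00, 0x1FFF)):
        for code_point in range(start, end + 1):
            ch = chr(code_point)
            # Unassigned code points have no name; fall back to "".
            name = unicodedata.name(ch, "")
            if name.startswith("GREEK") and "LETTER" in name:
                letters.append(ch)
    return letters

letters = greek_letters()
upper = [ch for ch in letters if ch.isupper()]
lower = [ch for ch in letters if ch.islower()]
```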
3
votes
1
answer
1k
views
What is the difference between unicodedata.digit and unicodedata.numeric?
From unicodedata doc:
unicodedata.digit(chr[, default]) Returns the digit value assigned to
the character chr as integer. If no such value is defined, default is
returned, or, if not given, ...
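A short sketch, assuming Python 3, of characters on which the functions disagree. SUPERSCRIPT TWO has a digit value but no decimal value, and VULGAR FRACTION ONE HALF has a numeric value but no digit value. The variable names are illustrative only.

```python
import unicodedata

superscript_two = "\u00b2"  # SUPERSCRIPT TWO
one_half = "\u00bd"         # VULGAR FRACTION ONE HALF

# digit() returns an int for characters that carry a digit value.
digit_of_superscript = unicodedata.digit(superscript_two)

# decimal() is stricter; passing a default avoids the ValueError.
decimal_of_superscript = unicodedata.decimal(superscript_two, None)

# numeric() returns a float and also covers fractions.
numeric_of_half = unicodedata.numeric(one_half)
digit_of_half = unicodedata.digit(one_half, None)

# For an ordinary ASCII digit both are defined; only the return type differs.
digit_of_five = unicodedata.digit("5")
numeric_of_five = unicodedata.numeric("5")
```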