Timeline for answer to Replace non-ASCII characters with a single space by Alvaro Fuentes
Current License: CC BY-SA 4.0
Post Revisions
14 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| S May 6, 2023 at 23:07 | history | suggested | do-me | CC BY-SA 4.0 | Add Python 3 version as unicode is undefined |
| May 6, 2023 at 12:07 | review | Suggested edits | |||
| S May 6, 2023 at 23:07 | |||||
| Sep 21, 2021 at 13:46 | history | edited | mit | CC BY-SA 4.0 | this is python 2, add note |
| Dec 30, 2020 at 14:07 | review | Suggested edits | |||
| Dec 30, 2020 at 17:57 | |||||
| Dec 26, 2020 at 11:24 | comment | added | rjurney | This works for Python3 - if you use unidecode(text). I got some quotation marks from funny unicode characters during a crawl this way. | |
| Feb 22, 2018 at 17:58 | history | edited | idbrii | CC BY-SA 3.0 | link to module to be clear it's not standard |
| Jan 25, 2017 at 10:16 | comment | added | Igor Savinkin | @AlvaroFuentes, how to handle/rewrite your wonderful code for Python 3 since this? Error: NameError: global name 'unicode' is not defined | |
| Dec 14, 2016 at 20:58 | comment | added | user5359531 | Does not seem to work with UTF-16 encoded text strings | |
| Nov 7, 2016 at 18:44 | comment | added | Do Not Track Me | There have been some security vulnerabilities with stuff like this in the past. Just be careful how you implement this! | |
| Feb 24, 2016 at 19:13 | comment | added | Alvaro Fuentes | Yes, I know this does not work for this question, but I landed here trying to solve that problem, so I thought I’d just share my solution to my own problem, which I think is very common for people as @dotancohen who deal with non-ascii characters all the time. | |
| Feb 20, 2016 at 20:16 | comment | added | dotancohen | Thank you, this is a good answer. It doesn't work for the purpose of this question because most of the data that I'm dealing with does not have an ASCII-like representation. Such as דותן. However, in the general sense this is great, thank you! | |
| Feb 18, 2016 at 21:15 | comment | added | jxramos | interesting suggestion, but it assumes the user wishes non ascii to become what the rules for unidecode are. This however poses a follow up question to the asker about why they insist on spaces, to perhaps replace with another character? | |
| Feb 18, 2016 at 21:10 | review | Late answers | |||
| Feb 18, 2016 at 21:15 | |||||
| Feb 18, 2016 at 20:50 | history | answered | Alvaro Fuentes | CC BY-SA 3.0 |
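
Several comments above ask how to run this under Python 3, where the name `unicode` no longer exists. The following is only a minimal sketch, not the answer's own code: in Python 3 `str` is already Unicode text, so only `bytes` input needs decoding. The helper names `to_text` and `replace_non_ascii` are hypothetical, and the regex-based replacement addresses the question's literal request (one space per non-ASCII character) rather than the transliteration that `unidecode` performs.

```python
import re

# Matches any single character outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def to_text(value, encoding="utf-8"):
    """Return value as a Python 3 str, decoding bytes if needed.

    Hypothetical helper: it stands in for the Python 2 `unicode(...)`
    call, which raises NameError under Python 3.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def replace_non_ascii(value, replacement=" "):
    """Replace each non-ASCII character with `replacement`."""
    return NON_ASCII.sub(replacement, to_text(value))


# Assumption: if the third-party unidecode package is installed,
# transliteration instead of replacement would be
#     from unidecode import unidecode
#     unidecode(to_text(value))

if __name__ == "__main__":
    print(replace_non_ascii("na\u00efve caf\u00e9"))
```

For UTF-16 input, as raised in one comment, passing `encoding="utf-16"` to `to_text` decodes the bytes before any replacement happens.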