
I am cleaning the text in a column of my pandas DataFrame.

This is a tokenized value from my description column before punctuation is removed:

['dedicated', 'to', 'support', 'the', 'fast-paced', 'technology', 
'lifestyle', 'needs', 'of', 'today', 'â€TM', 's', 'modern', 'society', 
'.', 'gadget', 'mix', 'have', 'the', 'benefit', 'of', '“', 
'efficient', 'life', 'â€', 'tied', 'to', 'the', 'products', 'and', 
'services', 'they', 'provide', '.']

This is what the same value looks like after I apply the code below:

['dedicated', 'to', 'support', 'the', 'fast-paced', 'technology', 
'lifestyle', 'needs', 'of', 'today', 'â€TM', 's', 'modern', 'society', 
'gadget', 'mix', 'have', 'the', 'benefit', 'of', '“', 'efficient', 
'life', 'â€', 'tied', 'to', 'the', 'products', 'and', 'services', 
'they', 'provide']

This is my code:

# removing punctuation
import string
punc = string.punctuation
update_mall['Cleansed_description'] = update_mall['Cleansed_description'].apply(lambda x: [word for word in x if word not in punc])
update_mall.head(105)

This code removes punctuation, except in tokens such as "fast-paced", "..." and "restaurant/catering".

In addition, after punctuation removal and lower-casing, a word like Asia's becomes 'asia' and 's.

I was told that this code only checks whether an entire token is punctuation, instead of checking each token for punctuation characters inside it.
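
To illustrate the point above: `word not in punc` is a substring test against the whole `string.punctuation` string, so a token is dropped only when it appears verbatim inside that string. Here is a minimal sketch with made-up sample tokens. The names `strip_table` and `strip_punctuation` are hypothetical, and deleting punctuation inside tokens (so 'fast-paced' becomes 'fastpaced') is just one possible choice, not necessarily the desired behaviour.

```python
import string

punc = string.punctuation

# Membership on a string is a substring test, so only tokens that occur
# verbatim inside string.punctuation are filtered out.
tokens = ["fast-paced", ".", "...", "restaurant/catering", "asia", "'s"]
kept = [tok for tok in tokens if tok not in punc]

# Character-level alternative (an assumption about the desired result):
# delete punctuation characters inside every token, then drop empty tokens.
strip_table = str.maketrans("", "", punc)


def strip_punctuation(words):
    """Return the tokens with punctuation characters removed and empties dropped."""
    cleaned = (w.translate(strip_table) for w in words)
    return [w for w in cleaned if w]


stripped = strip_punctuation(tokens)
print(kept)
print(stripped)
```

On the DataFrame this could be applied as `update_mall['Cleansed_description'].apply(strip_punctuation)`, assuming every cell holds a list of tokens.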

asked Feb 8, 2023 at 12:02

1 Answer

Can you try the code below, which uses a regex?

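import re
# lower-case each token and replace every character that is not a word character, digit or whitespace with a space
update_mall['Cleansed_description'] = update_mall['Cleansed_description'].apply(lambda x: [re.sub(r'[^\w\d\s]', ' ', word.lower()) for word in x])
update_mall.head(105)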
il_raffa
answered Feb 8, 2023 at 12:38

4 Comments

Hi, I tried it and got an error: 'list' object has no attribute 'lower'
I have updated the code; can you try it again?
Hi, yes, your code worked. Thank you. One more thing: for a string with weird characters like â€TM it returns ''. Is there a way to return nothing at all instead of ''?
I think those are not special characters; they are some other characters.
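
As a follow-up to the last two comments, here is a hedged sketch of one way to avoid the leftover '' entries: substitute with an empty string and then filter out tokens that end up empty. It assumes the descriptions are English, so anything outside a-z and 0-9 can be discarded; `NON_ALNUM` and `clean_tokens` are hypothetical names. The â€TM sequence looks like a UTF-8 right single quote that was decoded with a single-byte encoding such as cp1252 (assuming the original three characters were â, € and ™), which the last two lines demonstrate.

```python
import re

# Assumption: English text, so every character outside a-z and 0-9 is noise.
NON_ALNUM = re.compile(r"[^a-z0-9]+")


def clean_tokens(words):
    """Lower-case each token, delete non-alphanumeric characters, drop empties."""
    cleaned = (NON_ALNUM.sub("", w.lower()) for w in words)
    return [w for w in cleaned if w]


mojibake = "\u00e2\u20ac\u2122"  # the three characters â, €, ™
sample = ["Today", mojibake, "s", "modern", "society", ".", "Fast-paced"]
result = clean_tokens(sample)
print(result)

# Reversing the suspected mis-decoding recovers a right single quote (U+2019).
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired == "\u2019")
```

With the column from the question this would be `update_mall['Cleansed_description'].apply(clean_tokens)`. If the mis-decoding guess is right, reading the source file with the correct encoding in the first place would avoid the debris altogether.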
