1

I was wondering if it's possible to use list comprehension in the following case, or if it should be left as a for loop.

temp = []
for value in my_dataframe[my_col]:
    match = my_regex.search(value)
    if match:
        temp.append(value.replace(match.group(1), ''))
    else:
        temp.append(value)

I believe I can do it with the if/else section, but the 'match' line throws me off. This is close but not exactly it.

temp = [value.replace(match.group(1),'') if (match) else value for 
 value in my_dataframe[my_col] if my_regex.search(value)]
asked Jan 16, 2018 at 15:48
7
  • value.replace(match.group(1)),'' - shouldn't the outer ) be after the 2nd '? Commented Jan 16, 2018 at 15:51
  • If you would do it with a list comprehension, you would not define match. Commented Jan 16, 2018 at 15:52
  • yep. typo fixed Commented Jan 16, 2018 at 15:55
  • Yes you can, but whether I'd do it depends on the regex (which you're not showing). Commented Jan 16, 2018 at 15:56
  • This could work, but it's inefficient because I can't assign a value to match. temp = [value.replace(my_regex.search(value).group(1),'') if my_regex.search(value) else value for value in my_dataframe[my_col]] Commented Jan 16, 2018 at 16:01

3 Answers

2

Single-statement approach:

result = [
    value.replace(match.group(1), '') if match else value
    for value, match in (
        (value, my_regex.search(value))
        for value in my_dataframe[my_col])]

Functional approach - python 2:

data = my_dataframe[my_col]
gen = zip(data, map(my_regex.search, data))
fix = lambda (v, m): v.replace(m.group(1), '') if m else v
result = map(fix, gen)

Functional approach - python 3:

from itertools import starmap
data = my_dataframe[my_col]
gen = zip(data, map(my_regex.search, data))
fix = lambda v, m: v.replace(m.group(1), '') if m else v
result = list(starmap(fix, gen))

Pragmatic approach:

def fix_string(value):
    match = my_regex.search(value)
    return value.replace(match.group(1), '') if match else value

result = [fix_string(value) for value in my_dataframe[my_col]]
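
As a quick sanity check, the sketch below runs the original loop next to three of the approaches above on made-up data. The regex and the sample list are assumptions (the question shows neither), and a plain list stands in for my_dataframe[my_col]. Note that str.replace removes every occurrence of the captured text, not just the matched one.

```python
import re
from itertools import starmap

# Hypothetical stand-ins: the question shows neither the regex nor the data.
my_regex = re.compile(r"\((\w+)\)")
values = ["alpha (x) box", "beta", "gamma (yy) end"]

# The original for loop from the question.
loop_result = []
for value in values:
    match = my_regex.search(value)
    if match:
        loop_result.append(value.replace(match.group(1), ''))
    else:
        loop_result.append(value)

# Pragmatic approach: a helper function plus a simple comprehension.
def fix_string(value):
    match = my_regex.search(value)
    return value.replace(match.group(1), '') if match else value

pragmatic = [fix_string(value) for value in values]

# Single-statement approach: an inner generator yields (value, match) pairs.
single_statement = [
    value.replace(match.group(1), '') if match else value
    for value, match in ((value, my_regex.search(value)) for value in values)
]

# Functional approach (Python 3): starmap unpacks each (value, match) pair.
functional = list(starmap(
    lambda v, m: v.replace(m.group(1), '') if m else v,
    zip(values, map(my_regex.search, values)),
))

print(loop_result)
```

All four lists come out identical here, so the choice between them is mostly about readability.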
answered Jan 16, 2018 at 16:46

1

This is actually a good example of a list comprehension that performs worse than its corresponding for-loop and is (far) less readable.

If you wanted to do it, this would be the way:

temp = [value.replace(my_regex.search(value).group(1),'') if my_regex.search(value) else value for value in my_dataframe[my_col]]
# my_regex.search(value) appears twice in the expression above

Note that there is no place for us to define match inside the comprehension, so we have to call my_regex.search(value) twice. This is of course inefficient.

As a result, stick to the for-loop!

answered Jan 16, 2018 at 15:57

1 Comment

You can easily define match inside the comprehension: ... for match in [my_regex.search(value)] .... Or you can have a nested list comprehension, the inner one producing value+match pairs (might want to use a generator expression instead).
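
A minimal sketch of the binding trick from this comment, assuming a made-up regex and a plain list in place of my_dataframe[my_col]. The second variant uses an assignment expression, which is not mentioned in the thread and needs Python 3.8 or later.

```python
import re

# Hypothetical stand-ins: the question shows neither the regex nor the data.
my_regex = re.compile(r"\((\w+)\)")
values = ["alpha (x) box", "beta", "gamma (yy) end"]

# Iterating over a one-element list binds match once per value,
# so the regex search runs only once for each string.
bound = [
    value.replace(match.group(1), '') if match else value
    for value in values
    for match in [my_regex.search(value)]
]

# Python 3.8+ only: the condition of the conditional expression is
# evaluated first, so match is already bound when the replace runs.
walrus = [
    value.replace(match.group(1), '') if (match := my_regex.search(value)) else value
    for value in values
]

print(bound)
print(walrus)
```

Both variants avoid the second call to my_regex.search(value); whether they read better than the for loop is a judgment call.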
0

Use a regular expression with a sub-group. The pattern looks for any word followed by whitespace, then the sub-group: one character followed by "he" (for example "the" or "she"), whitespace, and a word containing "el" with at least one character on either side. After the sub-group come more whitespace and another word. The sub-group is repeated once.

import re

paragraph="""either the well was very deep, or she fell very slowly, for she had
plenty of time as she went down to look about her and to wonder what was
going to happen next. first, she tried to look down and make out what
she was coming to, but it was too dark to see anything; then she
looked at the sides of the well, and noticed that they were filled with
cupboards and book-shelves; here and there she saw maps and pictures
hung upon pegs. she took down a jar from one of the shelves as
she passed; it was labelled 'orange marmalade', but to her great
disappointment it was empty: she did not like to drop the jar for fear
of killing somebody, so managed to put it into one of the cupboards as
she fell past it."""
sentences=paragraph.split(".")
# raw string so the backslashes reach the regex engine unchanged
pattern=r"\w+\s+((\whe)\s+(\w+el\w+)){1}\s+\w+"
temp=[]
for sentence in sentences:
    # findall returns one tuple of groups per match; item[0] is the outer sub-group
    result=re.findall(pattern,sentence)
    for item in result:
        temp.append("".join(item[0]).replace(' ',''))
print(temp)

output:

['thewell', 'shefell', 'theshelves', 'shefell']
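
Since the question is about list comprehensions, the nested loop above could also be written as one. This is only a sketch: the sample text is shortened to a single sentence as an assumption, and item[0] is used directly because it is already a string, so the "".join call is not needed.

```python
import re

# Shortened sample text (an assumption, to keep the sketch small).
paragraph = "either the well was very deep, or she fell very slowly"
pattern = r"\w+\s+((\whe)\s+(\w+el\w+)){1}\s+\w+"

# The outer loop walks the sentences, the inner loop walks the matches;
# item[0] is the text captured by the outer sub-group, e.g. "the well".
temp = [
    item[0].replace(' ', '')
    for sentence in paragraph.split(".")
    for item in re.findall(pattern, sentence)
]
print(temp)
```

With this shortened text the list holds 'thewell' and 'shefell'.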
answered Jun 25, 2021 at 20:06
