String cut by a separator with the separator in the respective list element in Python

Question 1

Python 3.6.5

Is there any better solution than this one? Particularly the last line. I don't like it.

import re
s = "my_separator first thing my_separator second thing"
data = re.split("(my_separator )", s)[1:]
data = [even+odd for i, odd in enumerate(data) for j, even in enumerate(data) if i%2==1 and j%2==0 and i==j+1]

Question 2

No need for regex at all. str.split accepts a string separator.

Question 3

@hjpotter92 It can, but I don't know how to call it so that it keeps the separator; it removes the separator from the results. If you enclose the re.split expression in a capturing group (()), the separators are kept in the results, but as separate list elements.
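
To make the difference concrete, here is a small sketch using the string from the question. The variable names plain and grouped are only for illustration.

```python
import re

s = "my_separator first thing my_separator second thing"

# Without a group, the separator is dropped from the result.
plain = re.split("my_separator ", s)
# With a capturing group, each separator is kept as its own list element.
grouped = re.split("(my_separator )", s)

print(plain)    # ['', 'first thing ', 'second thing']
print(grouped)  # ['', 'my_separator ', 'first thing ', 'my_separator ', 'second thing']
```

The leading empty string is the text before the first separator, which is why the question slices the result with [1:].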

Question 4

You can exploit zip and iterators to allow you to pair things together:

data = [a + b for a, b in zip(*[iter(data)]*2)]
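
The idiom above is compact, so here is a spelled-out sketch of what it does, assuming data is the re.split result from the question after the [1:] slice. The names it, pairs and joined are illustrative only.

```python
data = ["my_separator ", "first thing ", "my_separator ", "second thing"]

# [iter(data)] * 2 is a list holding the SAME iterator twice, so
# zip(*[iter(data)] * 2) is equivalent to zip(it, it): on every step
# zip pulls two consecutive items from the one shared iterator.
it = iter(data)
pairs = list(zip(it, it))
joined = [a + b for a, b in pairs]

print(pairs)   # [('my_separator ', 'first thing '), ('my_separator ', 'second thing')]
print(joined)  # ['my_separator first thing ', 'my_separator second thing']
```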

You could use just re, and change the separator with a look ahead assertion.

data = re.split(" (?=my_separator)", s)

You can use str.split, and just add the separator back:

sep = 'my_separator '
data = s.split(sep)[1:]
data = [sep + i for i in data]

Or in a single expression:

data = [sep + i for i in s.split(sep)[1:]]

Question 5

Didn't know about itertools recipes, thanks a lot! I consider the lookahead most elegant as I won't have to process it then.

Question 6

The lookahead is not as general as I thought, because split has to consume something and that something has to precede the separator :-( Pairwise it is!

Question 7

pairwise cuts it as (1,2),(2,3),(3,4),..., my code does (1,2),(3,4),....
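
The difference between the two pairings can be sketched as follows. pairwise follows the well-known itertools recipe; grouped_in_twos is a hypothetical name for the non-overlapping variant.

```python
from itertools import tee

def pairwise(iterable):
    """Overlapping pairs: (1, 2), (2, 3), (3, 4), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)

def grouped_in_twos(iterable):
    """Non-overlapping pairs: (1, 2), (3, 4), ..."""
    it = iter(iterable)
    return zip(it, it)  # both arguments are the same iterator

print(list(pairwise([1, 2, 3, 4])))         # [(1, 2), (2, 3), (3, 4)]
print(list(grouped_in_twos([1, 2, 3, 4])))  # [(1, 2), (3, 4)]
```

Only the non-overlapping form joins each separator with the text that follows it exactly once.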

Question 8

@VaNa My bad, yes, I've fixed that

Question 9

@VaNa: The lookahead works fine, just not in Python. :-/ In Ruby: "my_separator first thing my_separator second thing".split(/(?=my_separator)/). Done!
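
For reference, the limitation applies to the Python 3.6.5 used in the question. The sketch below assumes Python 3.7 or later, where re.split can split on empty matches, so a pure lookahead behaves like the Ruby call. The name parts is illustrative.

```python
import re

s = "my_separator first thing my_separator second thing"

# Assumes Python 3.7+: the zero-width lookahead splits before every
# separator without consuming anything. The match at position 0
# produces a leading empty string, which the filter drops.
parts = [part for part in re.split(r"(?=my_separator)", s) if part]

print(parts)  # ['my_separator first thing ', 'my_separator second thing']
```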

Question 10

As already commented, use the str.split() version itself:

SEPARATOR = "my_separator "
s = "my_separator first thing my_separator second thing"
data = [SEPARATOR + part for part in s.split(SEPARATOR) if part]

Question 11

You meant it like this! You won! :-D

Question 12

Note that it doesn't work if 'my_separator' isn't present in the string.
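
A short sketch of that failure mode and one possible guard. split_keeping_separator is a hypothetical helper, and leaving any text before the first separator unprefixed is an assumption about the desired behaviour, not something stated in the thread.

```python
SEPARATOR = "my_separator "

def split_keeping_separator(s, sep=SEPARATOR):
    """Hypothetical helper: prefix only the parts that follow a separator."""
    head, *rest = s.split(sep)
    pieces = [sep + part for part in rest if part]
    # Text before the first separator (the whole string if the separator
    # never occurs) is returned as-is, without a prefix.
    return ([head] if head else []) + pieces

no_sep = "no separator here"
# The comprehension from the answer prepends a separator that was never there:
naive = [SEPARATOR + part for part in no_sep.split(SEPARATOR) if part]
print(naive)                            # ['my_separator no separator here']
print(split_keeping_separator(no_sep))  # ['no separator here']
```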

Question 13

hjpotter92's answer is great for fixed separator strings. If the separators vary and one wants to join each of them with the text that follows it, one can use the following two approaches, neither of which requires closures:

1 Generator function

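import re

def split_with_separator1(s, sep):
    tokens = iter(re.split(sep, s))  # sep needs a capturing group
    next(tokens)  # skip the text before the first separator
    while True:
        try:
            yield next(tokens) + next(tokens)
        except StopIteration:  # must not leak out of a generator (PEP 479)
            return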

The expression inside the loop works because the Python language guarantees left-to-right evaluation (unlike many other languages, e.g. C), so the separator is always fetched before the text that follows it.

2 Interleaved slices and binary map

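import operator
import re

def split_with_separator2(s, sep):
    # sep needs a capturing group so that re.split keeps the separators
    tokens = re.split(sep, s)
    # odd indices hold the separators, even indices (from 2) the text after each
    return map(operator.add, tokens[1::2], tokens[2::2])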

Of course one can slice with itertools.islice instead if one doesn't want to create two ephemeral token list copies.
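
A minimal sketch of that islice variant, under the same assumption that sep contains a capturing group. The name split_with_separator3 is made up here to match the numbering above.

```python
import operator
import re
from itertools import islice

def split_with_separator3(s, sep):
    # sep is assumed to contain a capturing group, as in the versions above.
    tokens = re.split(sep, s)
    # islice walks the one token list lazily instead of copying slices of it.
    separators = islice(tokens, 1, None, 2)  # items 1, 3, 5, ...
    texts = islice(tokens, 2, None, 2)       # items 2, 4, 6, ...
    return map(operator.add, separators, texts)

s = "my_separator first thing my_separator second thing"
result = list(split_with_separator3(s, "(my_separator )"))
print(result)  # ['my_separator first thing ', 'my_separator second thing']
```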

Question 14

Your last line, "repaired":

import re
s = "my_separator first thing my_separator second thing"
data = re.split("(my_separator )", s)[1:]
data = [data[i]+data[i+1] for i in range(0, len(data), 2)]

Question 15

Although this one does not seem to be that elegant, it is correct and better than mine. Thanks!

Accepted answer (the zip, lookahead and str.split suggestions above) by Peilonrayz ♦, answered 2018-05-15 11:28:37Z, edited May 15, 2018 at 15:15.


The five comments on the accepted answer (Questions 5 to 9 above) were posted on May 15, 2018 by VaNa (14:31, 14:46 and 15:15), Peilonrayz ♦ (15:19) and Eric Duminil (16:07).
