12

How to split a string at positions before a character?

  • split a string before 'a'
  • input: "fffagggahhh"
  • output: ["fff", "aggg", "ahhh"]

The obvious way doesn't work:

>>> import re
>>> h = re.compile("(?=a)")
>>> h.split("fffagggahhh")
['fffagggahhh']
>>>
asked Nov 4, 2010 at 6:31
7
  • 2
    What do you expect when you split "aaa": ['', 'a', 'a', 'a'] or ['a', 'a', 'a']? Commented Nov 4, 2010 at 7:22
  • "aaa" -> "a", "a", "a" or "", "a", "a", "a" Commented Nov 4, 2010 at 7:55
  • 1
    -1: "aaa" -> ["a", "a", "a"] or ["", "a", "a", "a"]. That's the least helpful thing I've ever seen. Both are right? In that case, no pattern can ever work. Close this question. Commented Nov 4, 2010 at 10:30
  • Either one of them will do. If you have coded in Python before, you would know a simple filter(bool, L) will filter out the empty element. Commented Nov 5, 2010 at 1:57
  • 1
    Did something change in Python? Now your "the obvious way" works flawlessly. Commented Mar 13, 2021 at 1:11
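
On the last comment: the re documentation notes that, since Python 3.7, re.split also splits on patterns that can match an empty string, which would explain why the lookahead from the question now works. A minimal check, assuming Python 3.7 or later (the variable names are only for illustration):

```python
import re

# Assumes Python 3.7+; older versions did not split on empty matches.
parts = re.split(r"(?=a)", "fffagggahhh")
print(parts)  # ['fff', 'aggg', 'ahhh']

# A match at position 0 yields a leading empty string, which can be filtered out.
non_empty = [p for p in re.split(r"(?=a)", "aaa") if p]
print(non_empty)  # ['a', 'a', 'a']
```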

7 Answers

21

OK, this is not exactly the solution you want, but I thought it would be a useful addition to the problem here.

Solution without re:

>>> x = "fffagggahhh"
>>> k = x.split('a')
>>> j = [k[0]] + ['a'+l for l in k[1:]]
>>> j
['fff', 'aggg', 'ahhh']
>>> 
answered Nov 4, 2010 at 6:39

2 Comments

@knitti: Thanks. I understand it is not the re-based solution; I wanted to write it first, before writing the re solution. By the time I finished writing this, the re-based solution had already been posted.
yeah, why use a hammer on a single nail if you've got a nail shooter.
5
>>> import re
>>> rx = re.compile("(?:a|^)[^a]*")
>>> rx.findall("fffagggahhh")
['fff', 'aggg', 'ahhh']
>>> rx.findall("aaa")
['a', 'a', 'a']
>>> rx.findall("fgh")
['fgh']
>>> rx.findall("")
['']
answered Nov 4, 2010 at 6:41

1 Comment

-1 re.findall("(?:^|a)[^a]*", "aaa") produces ['', 'a', 'a']
4
>>> import re
>>> r = re.compile("(a?[^a]+)")
>>> r.findall("fffagggahhh")
['fff', 'aggg', 'ahhh']

EDIT:

This won't correctly handle doubled a's, as in this string (one of the two adjacent a's is silently dropped):

>>> r.findall("fffagggaahhh")
['fff', 'aggg', 'ahhh']

KennyTM's re seems better suited.

answered Nov 4, 2010 at 6:38

2 Comments

I wonder if the OP would want to keep the empty string from the split if it started with an 'a'.
-1 Uncool. Fails on repeated a's ... e.g. "aaa" -> empty list
3
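import re

def split_before(pattern, text):
    """Yield pieces of text, each starting where pattern matches."""
    prev = 0
    for m in re.finditer(pattern, text):
        # Emit everything up to (not including) the current match.
        yield text[prev:m.start()]
        prev = m.start()
    # Emit the remainder, which begins at the last match.
    yield text[prev:]

if __name__ == '__main__':
    print(list(split_before("a", "fffagggahhh")))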

re.split treats the pattern as a delimiter.

>>> print(list(split_before("a", "afffagggahhhaab")))
['', 'afff', 'aggg', 'ahhh', 'a', 'ab']
>>> print(list(split_before("a", "ffaabcaaa")))
['ff', 'a', 'abc', 'a', 'a', 'a']
>>> print(list(split_before("a", "aaaaa")))
['', 'a', 'a', 'a', 'a', 'a']
>>> print(list(split_before("a", "bbbb")))
['bbbb']
>>> print(list(split_before("a", "")))
['']
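
To illustrate the remark that re.split treats the pattern as a delimiter, here is a small sketch (the variable names are mine, not part of the answer). The matched text is either dropped or, with a capturing group, returned as separate items; neither gives the "split before" result directly:

```python
import re

text = "fffagggahhh"

# The matched delimiter is consumed and dropped from the result.
dropped = re.split("a", text)
print(dropped)  # ['fff', 'ggg', 'hhh']

# A capturing group keeps the delimiter, but as separate list items.
kept = re.split("(a)", text)
print(kept)  # ['fff', 'a', 'ggg', 'a', 'hhh']
```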
answered Nov 4, 2010 at 6:50


1

This one works on repeated a's:

>>> re.findall("a[^a]*|^[^a]*", "aaaaa")
['a', 'a', 'a', 'a', 'a']
>>> re.findall("a[^a]*|[^a]+", "ffaabcaaa")
['ff', 'a', 'abc', 'a', 'a', 'a']

Approach: the main chunks you are looking for are an a followed by zero or more non-a characters. That covers every possibility except a run of non-a characters with no leading a, which can happen only at the start of the input string.
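
The same approach can be sketched for an arbitrary single character. The helper name split_before_char is hypothetical, and the sketch assumes ch is exactly one character; it reuses the first pattern shown above, with re.escape so that characters such as "." or "]" are taken literally:

```python
import re

def split_before_char(text, ch):
    """Return chunks of text, each starting at an occurrence of ch.

    Hypothetical helper; assumes ch is a single character.
    """
    c = re.escape(ch)  # make the character safe both bare and inside [...]
    # Either: ch followed by non-ch characters, or a leading run of non-ch characters.
    pattern = "{0}[^{0}]*|^[^{0}]*".format(c)
    return re.findall(pattern, text)

print(split_before_char("fffagggahhh", "a"))  # ['fff', 'aggg', 'ahhh']
print(split_before_char("aaaaa", "a"))        # ['a', 'a', 'a', 'a', 'a']
print(split_before_char("x.y.z", "."))        # ['x', '.y', '.z']
```

Note that an empty input yields [''] with this pattern, because the second alternative can match the empty string at the start.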

answered Nov 4, 2010 at 7:02


-1
>>> foo = "abbcaaaabbbbcaaab"
>>> bar = foo.split("c")
>>> baz = [bar[0]] + ["c"+x for x in bar[1:]]
>>> baz
['abb', 'caaaabbbb', 'caaab']

Due to how slicing works, this will work properly even if there are no occurrences of c in foo.
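
A short sketch of why the no-occurrence case works, wrapped in a hypothetical helper (split_before_plain is not from the answer, and it assumes ch is a non-empty string):

```python
def split_before_plain(text, ch):
    """Split text before each occurrence of ch without using re (hypothetical helper)."""
    pieces = text.split(ch)
    # pieces[1:] is an empty list when ch never occurs, so nothing gets re-prefixed.
    return [pieces[0]] + [ch + p for p in pieces[1:]]

print(split_before_plain("abb", "c"))                # ['abb']
print(split_before_plain("abbcaaaabbbbcaaab", "c"))  # ['abb', 'caaaabbbb', 'caaab']
```

If the string starts with ch, the first element is an empty string, which can be filtered out if it is not wanted.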

answered Nov 4, 2010 at 6:41


-3

split() takes an argument for the character to split on:

>>> "fffagggahhh".split('a')
['fff', 'ggg', 'hhh']
answered Nov 4, 2010 at 6:40

