I have 2 strings:
s7="ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
s8="ONE : TWO : THREE : FOUR 155.5 : FIVE 334.7 : SIX 6.7 : SEVEN 44.5 :"
I'm using the following code to parse them:
def parse(s):
    c=s.count(':')
    if c==8:
        res=""
        res=s.split(' : ')
        res = [item.strip() for item in s.split(':')]
        for index, item in enumerate(res):
            print index, item
    if c==7:
        res=""
        res=s.split(' : ')
        res = [item.strip() for item in s.split(':')]
        for index, item in enumerate(res):
            print index, item
The output I get is:
>>> parse(s7)
0 ONE
1 TWO
2 THREE
3 FOUR
4 FIVE 30.1
5 SIX 288.3
6 SEVEN 1.9
7 EIGHT 45.3
8
>>> parse(s8)
0 ONE
1 TWO
2 THREE
3 FOUR 155.5
4 FIVE 334.7
5 SIX 6.7
6 SEVEN 44.5
7
How can I extract the numerical values from index 4 to 7 in s7 and index 3 to 6 in s8? I need to store these values so I can write them into a database later.
I've tried a couple of things but they aren't working.
Please help.
3 Answers
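Does it always look like that? If so, you can simply do:
s7="ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
total = 0.0  # must be initialized before the loop
for elem in s7.split(' '):
    try:
        print elem
        total += float(elem)
    except ValueError:
        # elem is not a number (e.g. "ONE" or ":"), so skip it
        pass
s7 = total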
>>> s7
365.6
And do the same for s8 as well.
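If the individual values are needed rather than their sum (for example, to write each one to a database later), the same try/except idea can collect them in a list instead. This is only a minimal sketch: the name `extract_floats` is made up for illustration, and it assumes that no label is itself something `float()` accepts (such as "nan" or "inf").

```python
def extract_floats(s):
    """Return every whitespace-separated token of s that parses as a float."""
    values = []
    for token in s.split():
        try:
            values.append(float(token))
        except ValueError:
            # tokens such as "ONE" or ":" are not numbers, so skip them
            continue
    return values

s7 = "ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
s8 = "ONE : TWO : THREE : FOUR 155.5 : FIVE 334.7 : SIX 6.7 : SEVEN 44.5 :"
print(extract_floats(s7))
print(extract_floats(s8))
```

Catching only `ValueError` instead of using a bare `except` keeps unrelated errors visible.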
You could do a list comprehension as well.
Assuming the strings always take the format you provided, this would work:
def parse(s):
    results = [float(x) for x in s.split(' ') if x.count('.') == 1]
    return results
>>> parse(s7)
[30.1, 288.3, 1.9, 45.3]
>>> parse(s8)
[155.5, 334.7, 6.7, 44.5]
This code says:
for every `x` in the split string, which I've split on whitespace,
cast it to a float if x's count of `.` is 1.
Using count() here works because it does not raise an exception when the token contains no `.`; it simply returns 0. If you would rather use exception handling, index() is the method to use, since it raises a ValueError when the substring is not found.
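For comparison, here is a rough sketch of the index() variant mentioned above. The function name `parse_with_index` is hypothetical, and the sketch assumes the same input format as the question.

```python
def parse_with_index(s):
    values = []
    for token in s.split():
        try:
            token.index('.')  # raises ValueError when the token has no '.'
            values.append(float(token))
        except ValueError:
            # also reached if float() rejects the token, e.g. a lone "."
            continue
    return values

s7 = "ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
print(parse_with_index(s7))
```

The count() version is shorter, but this one also survives tokens that contain a `.` without being valid numbers.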
You can use the following regular expression on each string:
[A-Z][ ]+([\d.]+)
For each string, the value you're looking for will be in the first captured group, if not empty. You can see exactly what's going on at www.debuggex.com.
Full code:
import re
s7="ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
def parse(s):
    res = s.split(' : ')
    # raw string, so the backslash in \d reaches the regex engine unchanged
    matches = [re.search(r'[A-Z][ ]+([\d.]+)', x) for x in res]
    return [float(x.group(1)) for x in matches if x is not None]
print(parse(s7))  # prints [30.1, 288.3, 1.9, 45.3]
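Since the values are meant for a database, it may help to keep each label together with its number. The following is a sketch under assumptions, not part of the answer above: `PAIR` and `parse_pairs` are illustrative names, and the pattern assumes labels are runs of capital letters and numbers are unsigned with an optional decimal part. Unlike `[\d.]+`, this number pattern does not match a stray `.` on its own.

```python
import re

# label (capital letters), whitespace, then a number with an optional decimal part
PAIR = re.compile(r'([A-Z]+)\s+(\d+(?:\.\d+)?)')

def parse_pairs(s):
    """Return (label, value) tuples for every label that is followed by a number."""
    return [(label, float(number)) for label, number in PAIR.findall(s)]

s7 = "ONE : TWO : THREE : FOUR : FIVE 30.1 : SIX 288.3 : SEVEN 1.9 : EIGHT 45.3 :"
s8 = "ONE : TWO : THREE : FOUR 155.5 : FIVE 334.7 : SIX 6.7 : SEVEN 44.5 :"
print(parse_pairs(s7))
print(parse_pairs(s8))
```

With two capture groups, `findall` returns one tuple per match, so no prior split on `:` is needed.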
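The first two assignments to `res` in the question's code are redundant, since the list comprehension overwrites them.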