How to read between 2 specific lines in python

Question 1

I have a variable that holds content similar to this:

**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;5
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****

I want to read the lines starting from Main_data1, take only the last column of each, and store those values in a list. Please note that this data is held in a variable; it is not a file.

My Desired Output:

Some_list=[1,2,3,4,5]

I thought of using something like this.

for line in var_a.splitlines():
    if "Main_data1" in line:
        print(line)

But there are more than 200 lines from which I need to read the last column. What would be an efficient way of doing this?

Question 2

Side note: 200 lines is practically nothing.

Question 3

Have you found a solution yet?

Question 4

Check whether the line starts with "Main_data", then split it on the semicolon ; and take the last element with index -1:

some_list = []
for line in var_a.split("\n"):
    if line.startswith("Main_data"):
        some_list.append(int(line.split(";")[-1]))
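For anyone who wants to try this directly, here is a minimal self-contained sketch of the same loop. The sample text is copied from the question; the function name last_column_values and its prefix parameter are made up for illustration and are not part of the answer.

```python
# Sample text copied from the question; in real use var_a already holds it.
var_a = """**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;5
**** SOME JUNK DATA ****"""


def last_column_values(text, prefix="Main_data"):
    """Return the last ';'-separated field of each line starting with prefix, as ints."""
    values = []
    for line in text.splitlines():
        if line.startswith(prefix):
            values.append(int(line.split(";")[-1]))
    return values


some_list = last_column_values(var_a)
print(some_list)  # [1, 2, 3, 4, 5]
```

This assumes every matching line ends in an integer; int() raises ValueError otherwise.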

Question 5

You can use a list comprehension to store the numbers:

my_list = [int(line.strip().split(';')[-1]) for line in my_var.split('\n') if line.startswith('Main_data5')]

Also note that the more Pythonic way is to use the str.startswith() method rather than the in operator, because Main_data5 could also appear in the middle of a line.
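A small sketch of that difference. The second line is a hypothetical junk line invented for this example; nothing like it appears in the question's data.

```python
lines = [
    "Main_data5;aaaaa;bbbbb;cccc;d;e;5",
    "junk mentioning Main_data5;x;y;z",  # hypothetical junk line, not from the question
]

# The in operator matches the prefix anywhere in the line.
with_in = [line for line in lines if "Main_data5" in line]

# str.startswith() matches only at the beginning of the line.
with_startswith = [line for line in lines if line.startswith("Main_data5")]

print(len(with_in))          # 2
print(len(with_startswith))  # 1
```

With the in version, the junk line would also reach int(line.split(';')[-1]) and fail there, since its last field is not a number.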

If a line can start with either of two prefixes, you can combine two startswith conditions with the or operator:

my_list = [int(line.strip().split(';')[-1]) for line in my_var.split('\n') if line.startswith('Main_data5') or line.startswith('Main_data1')]
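As a possible shortening, str.startswith() also accepts a tuple of prefixes, so the two conditions can be collapsed into one call. A minimal sketch, with a shortened sample my_var and a prefixes name that are assumptions for illustration:

```python
# Shortened sample data modelled on the question.
my_var = """**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data5;aaaaa;bbbbb;cccc;d;e;5
**** SOME JUNK DATA ****"""

# startswith() returns True if the line starts with any prefix in the tuple.
prefixes = ("Main_data5", "Main_data1")
my_list = [
    int(line.strip().split(";")[-1])
    for line in my_var.split("\n")
    if line.startswith(prefixes)
]
print(my_list)  # [1, 5]
```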

But if you have more keywords, you can use a regex. For example, to match all the lines that start with Main_data followed by a digit, you can use re.match():

import re
my_list = [int(line.strip().split(';')[-1]) for line in my_var.split('\n') if re.match(r'Main_data\d.*',line)]

Question 6

This is not a file. Change to variable.

Question 7

Hi, thanks a lot! Is there a way I can specify that reading should start at the line that has Main_data1 and stop at the line that has Main_data5?
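None of the answers addresses this follow-up directly. One possible way to do it, sketched under the assumptions that both marker lines exist, that the range should be inclusive, and that every line inside the range ends in an integer. The function name values_between and its parameters are hypothetical.

```python
def values_between(text, start="Main_data1", end="Main_data5"):
    """Collect the last ';' field from the start line through the end line, inclusive."""
    values = []
    reading = False
    for line in text.splitlines():
        if line.startswith(start):
            reading = True  # start marker found, begin collecting
        if reading:
            values.append(int(line.split(";")[-1]))
            if line.startswith(end):
                break  # end marker reached, stop after the first range
    return values


# Sample text copied from the question.
var_a = """**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;5
**** SOME JUNK DATA ****"""

print(values_between(var_a))  # [1, 2, 3, 4, 5]
```

Note that the prefix "Main_data1" would also match a line starting with Main_data10, and that only the first start-to-end range is returned.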

Question 8

my_list = []
for line in my_var.strip().split('\n'):
    if "Main_data1" in line:
        my_list.append(int(line.split(";")[-1]))
    else:
        continue

Or you can use the startswith('match') function, as someone mentioned.

Question 9

My approach uses regex, since it gives more control over the pattern.

File content

**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;523233
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;523233
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ******** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data3;aaa;bbb;ccce;d;e;3
Main_data4;aaaa;bbbb;cc;d;e;4
Main_data5;aaaaa;bbbbb;cccc;d;e;523233
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ******** SOME JUNK DATA ****
**** SOME JUNK DATA ****
**** SOME JUNK DATA ****

Code

import re

pattern = r'Main_data.*(?<=;)([0-9]{1,})'
data = []
# open in text mode so the str pattern can be applied to each line
with open(r"C:\text.txt") as fl:
    for line in fl:
        # capture the digits after the last ';' on lines containing Main_data
        match = re.search(pattern, line, re.IGNORECASE | re.MULTILINE)
        if match:
            data.append(match.group(1))
        else:
            data.append('N')  # placeholder for junk lines
strng = ','.join(data)  # join the list into one string
lsts = re.findall(r'(?<=,)[0-9,]+(?=,)', strng)  # extract the values and exclude 'N'
outpt = [i.split(',') for i in lsts]  # generate the final list
print(outpt)

Output

[['1', '2', '3', '4', '523233'], ['1', '2', '3', '4', '523233'], ['1', '2', '3', '4', '523233']]
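Since the question says the data is in a variable rather than a file, a regex can also be applied to the whole string at once. The following is a sketch of that variation, not the answer's code: the pattern is a stricter assumption (prefix, digits, then a numeric last field), and it returns one flat list of ints instead of one sub-list per block.

```python
import re

# Shortened sample data modelled on the file content above.
text = """**** SOME JUNK DATA ****
Main_data1;a;b;c;dss;e;1
Main_data2;aa;bb;sdc;d;e;2
Main_data5;aaaaa;bbbbb;cccc;d;e;523233
**** SOME JUNK DATA ****"""

# With re.MULTILINE, ^ and $ match at the start and end of each line;
# the capture group grabs the digits after the last ';'.
pattern = re.compile(r"^Main_data\d+;.*;(\d+)$", re.MULTILINE)
numbers = [int(n) for n in pattern.findall(text)]
print(numbers)  # [1, 2, 523233]
```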

Answer 1 (listed above under Question 4) · Assem · 2015-11-08 08:18:25Z

Answer 2 (listed above under Question 5) · Kasravnd · 2015-11-08 08:22:02Z

Answer 3 (listed above under Question 8) · Zuko · 2015-11-08 08:32:51Z

Answer 4 (listed above under Question 9) · Learner · 2015-11-08 10:33:47Z
