Python Parse text

Asked 5 years, 6 months ago

Viewed 269 times

Here is some text I need to parse:

JAVA_OPTS=blablalba
lbalbalba
1. main1:
 aelo1 2020-06-15 11 4422
 sddg2 2020-06-12 19 422
2. main2:
 fdata3 2020-06-15 11 4422
 gcontent4 2020-06-12 19 422
3. main3:
 hxvnt5 2020-06-15 11 4422
 vcfdet6 2020-06-12 19 422

I need to parse only the numbered bullet points, each one up to the next bullet point, find the rows whose 4th column is greater than 1000 and which are older than 12 hours (based on the date/time in the 2nd column), and then send the details by email. I tried parsing with the re library in Python but could not get it to work.

So the expected output is:

 1. main1:
 aelo1 2020-06-15 11 4422
 2. main2:
 fdata3 2020-06-15 11 4422
 3. main3:
 hxvnt5 2020-06-15 11 4422

Is it possible via bash or Python?

python

edited Jun 17, 2020 at 6:45

asked Jun 17, 2020 at 4:09

user13760031

What do you mean "older than 12h"?

– DV82XL, Jun 17, 2020 at 4:16

The "older than 12 hours" requirement needs clarification: do you want to keep the rows with 3rd column values > 12 or ignore them? Also, sharing what you have tried will help others help you.

– Omkar Neogi, Jun 17, 2020 at 4:34

Add the parse tag to your post.

– Akshat Zala, Jun 17, 2020 at 5:55

3 Answers

Here is the regex which you can use to match (I am not sure about 12 hours).

\d+\.\s\S+\s+\S+\s[0-9-]+\s\d+\s[1-9][0-9]{3,}
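For anyone who wants to try the pattern, here is a minimal sketch of running it with re.findall over the sample from the question, assuming the dates are written as YYYY-MM-DD. Two things worth knowing before relying on it: the last part, [1-9][0-9]{3,}, accepts 1000 itself (so it means "at least 1000", not strictly "greater than 1000"), and the pattern only looks at the row directly under each numbered heading.

```python
import re

# Sample input from the question, with dates assumed to be YYYY-MM-DD.
text = """JAVA_OPTS=blablalba
lbalbalba
1. main1:
 aelo1 2020-06-15 11 4422
 sddg2 2020-06-12 19 422
2. main2:
 fdata3 2020-06-15 11 4422
 gcontent4 2020-06-12 19 422
3. main3:
 hxvnt5 2020-06-15 11 4422
 vcfdet6 2020-06-12 19 422
"""

# The pattern from this answer: a numbered heading followed by one row
# whose last column has at least four digits and no leading zero.
pattern = re.compile(r"\d+\.\s\S+\s+\S+\s[0-9-]+\s\d+\s[1-9][0-9]{3,}")

# Each match spans two lines: the heading and the first row below it.
matches = pattern.findall(text)
for match in matches:
    print(match)
```

If a qualifying row can appear second or later under a heading, a line-by-line approach is probably easier to get right than extending the regex.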

answered Jun 17, 2020 at 4:20

Akshay G Bhardwaj

Here is a solution for you:

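def parsing(text):
    # Nothing to do for empty or whitespace-only input.
    if text.strip() == '':
        return ''
    lines = text.split('\n')
    buffer = ''
    for line in lines:
        t = line.strip()
        # Keep blank lines and numbered headings such as "1. main1:".
        if t == '' or t[0] in '0123456789':
            buffer += line + '\n'
        else:
            lst = t.split()
            if len(lst) >= 4:
                # Keep rows whose 2nd column looks like a YYYY-MM-DD date,
                # whose 3rd column is at most 12 and whose 4th column exceeds 1000.
                if (len(lst[1].split('-')) == 3 and int(lst[2]) <= 12 and
                        int(lst[3]) > 1000):
                    buffer += line + '\n'
    return buffer.strip()

# `text` is assumed to hold the input shown in the question.
print(parsing(text))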
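The code above compares the 3rd column against 12 as a plain number. If "older than 12 hours" instead means an age relative to the current time, one possible reading is to combine the 2nd column (date) and the 3rd column (hour of day) into a timestamp. The sketch below follows that assumption; filter_rows, min_value and min_age are names made up for illustration, and the reference time is passed in explicitly so the result is repeatable.

```python
from datetime import datetime, timedelta

# Sample input from the question, with dates assumed to be YYYY-MM-DD.
sample = """JAVA_OPTS=blablalba
lbalbalba
1. main1:
 aelo1 2020-06-15 11 4422
 sddg2 2020-06-12 19 422
2. main2:
 fdata3 2020-06-15 11 4422
 gcontent4 2020-06-12 19 422
3. main3:
 hxvnt5 2020-06-15 11 4422
 vcfdet6 2020-06-12 19 422
"""

def filter_rows(text, now, min_value=1000, min_age=timedelta(hours=12)):
    """Keep numbered headings plus rows above min_value and older than min_age."""
    kept = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        first = parts[0]
        # Numbered heading such as "1. main1:"; kept even if no row qualifies.
        if first.endswith(".") and first[:-1].isdigit():
            kept.append(line.strip())
            continue
        # Data rows are assumed to have exactly four columns.
        if len(parts) != 4:
            continue
        _name, date_str, hour_str, value_str = parts
        try:
            # Assumption: column 2 is the date, column 3 the hour of day.
            stamp = datetime.strptime(f"{date_str} {hour_str}", "%Y-%m-%d %H")
            value = int(value_str)
        except ValueError:
            continue  # not a data row after all
        if value > min_value and now - stamp > min_age:
            kept.append(" " + line.strip())
    return "\n".join(kept)

# Reference time chosen to match when the question was asked.
print(filter_rows(sample, now=datetime(2020, 6, 17, 4, 9)))
```

With that reference time, the three rows with value 4422 are kept under their headings, which lines up with the expected output in the question. In real use you would pass datetime.now() instead.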

edited Jun 17, 2020 at 6:56

answered Jun 17, 2020 at 5:41

Jason Yang

2 Comments

Jason Yang

That's why I asked for more information about the requirements and the situation. Updated.

2020-06-17T06:57:48.417Z

user13760031

Updated the expected output for better understanding.

2020-06-17T07:10:47.18Z

You can use TTP to parse and filter it in one template:

from ttp import ttp
import pprint
data = """
JAVA_OPTS=blablalba
lbalbalba
1. main1:
 aelo1 2020-06-15 11 4001
 sddg2 2020-06-12 19 422
2. main2:
 fdata3 2020-06-16 11 4422
 gcontent4 2020-06-12 19 422
3. main3:
 hxvnt5 2020-06-17 11 4002
 vcfdet6 2020-06-12 19 422
"""
 
template = """
<group contains="value">
1. main1: {{ _start_ }}
 {{ ignore }} {{ date }} {{ hour | lessthan("12") }} {{ value | greaterthan("4000") }}
</group> 
"""
 
parser = ttp(data, template)
parser.parse()
res = parser.result()
pprint.pprint(res)
# prints:
# [[[{'date': '2020-06-15', 'hour': '11', 'value': '4001'},
#    {'date': '2020-06-16', 'hour': '11', 'value': '4422'},
#    {'date': '2020-06-17', 'hour': '11', 'value': '4002'}]]]

You can test templates online here if you'd like.

Disclaimer: I am the author of TTP.

Edit: after parsing, you can post-process the results further to compose an email report or whatever the end result must look like.
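As a rough idea of what that post-processing could look like, here is a minimal sketch that turns rows shaped like the printed result into an email message with the standard library. It only builds the message; build_report and both addresses are placeholders made up for illustration, and the rows are assumed to have been flattened into a single list of dicts first.

```python
from email.message import EmailMessage

# Rows in the shape printed above, assumed flattened to one list of dicts.
rows = [
    {"date": "2020-06-15", "hour": "11", "value": "4001"},
    {"date": "2020-06-16", "hour": "11", "value": "4422"},
    {"date": "2020-06-17", "hour": "11", "value": "4002"},
]

def build_report(rows, sender, recipient):
    """Compose a plain-text report with one line per matching row."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(rows)} matching entries"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"{r['date']} {r['hour']}:00 value={r['value']}" for r in rows]
    msg.set_content("\n".join(lines))
    return msg

# Placeholder addresses; replace with real ones before sending.
msg = build_report(rows, "reports@example.com", "ops@example.com")
print(msg)
```

Sending is a separate step, typically smtplib.SMTP(host).send_message(msg) against whatever mail server you have access to.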

answered Dec 30, 2021 at 12:38

apraksim
