Version 1 - Beginner web scraper for Nagios

Current version changes:

- Moved `NAGIOS_DATA` dictionary to a separate file (and added it to `.gitignore`)
- Used functions with docstrings
- Removed the multiple redundant `print()` statements
- Actually read the PEP 8 standards, and renamed variables to match the requirements

Again, beginner Python programmer. I appreciate the feedback!
```python
import requests
from scraper import NAGIOS_DATA
from bs4 import BeautifulSoup
from requests.auth import HTTPBasicAuth, HTTPDigestAuth


def get_url_response(url, user, password, auth_type):
    """Get the response from a URL.

    Args:
        url (str): Nagios base URL
        user (str): Nagios username
        password (str): Nagios password
        auth_type (str): Nagios auth_type - Basic or Digest

    Returns: Response object
    """
    if auth_type == "Basic":
        return requests.get(url, auth=HTTPBasicAuth(user, password))
    return requests.get(url, auth=HTTPDigestAuth(user, password))


def main():
    """
    Main entry to the program
    """
    # for nagios_entry in ALL_NAGIOS_INFO:
    for url, auth_data in NAGIOS_DATA.items():
        user, password, auth_type = auth_data["user"], auth_data["password"], \
            auth_data["auth_type"]
        full_url = "{}/cgi-bin/status.cgi?host=all".format(url)
        response = get_url_response(full_url, user, password, auth_type)
        if response.status_code == 200:
            html = BeautifulSoup(response.text, "html.parser")
            for i, items in enumerate(html.select('td')):
                if i == 3:
                    hostsAll = items.text.split('\n')
                    hosts_up = hostsAll[12]
                    hosts_down = hostsAll[13]
                    hosts_unreachable = hostsAll[14]
                    hosts_pending = hostsAll[15]
                    hosts_problems = hostsAll[24]
                    hosts_types = hostsAll[25]
                if i == 12:
                    serviceAll = items.text.split('\n')
                    service_ok = serviceAll[13]
                    service_warning = serviceAll[14]
                    service_unknown = serviceAll[15]
                    service_critical = serviceAll[16]
                    service_problems = serviceAll[26]
                    service_types = serviceAll[27]
                # print(i, items.text)  # To get the index and text
            print_stats(
                user, url, hosts_up, hosts_down, hosts_unreachable,
                hosts_pending, hosts_problems, hosts_types, service_ok,
                service_warning, service_unknown, service_critical,
                service_problems, service_types)
            # print("Request returned:\n\n{}".format(html.text))
            # To get the full request


def print_stats(
        user, url, hosts_up, hosts_down, hosts_unreachable, hosts_pending,
        hosts_problems, hosts_types, service_ok, service_warning,
        service_unknown, service_critical, service_problems, service_types):
    print("""{}@{}:
Hosts
Up\tDown\tUnreachable\tPending\tProblems\tTypes
{}\t{}\t{}\t\t{}\t{}\t\t{}
Services
OK\tWarning\tUnknown\tCritical\tProblems\tTypes
{}\t{}\t{}\t{}\t\t{}\t\t{}""".format(
        user, url, hosts_up, hosts_down, hosts_unreachable, hosts_pending,
        hosts_problems, hosts_types, service_ok, service_warning,
        service_unknown, service_critical, service_problems, service_types))


if __name__ == '__main__':
    main()
```
`scraper.py` source:

```python
NAGIOS_DATA = {
    'http://192.168.0.5/nagios': {
        'user': 'nagiosadmin',
        'password': 'PasswordHere1',
        'auth_type': 'Basic'
    },
    'https://www.example.com/nagios': {
        'user': 'exampleuser',
        'password': 'P@ssw0rd2',
        'auth_type': 'Digest'
    },
}
```
1 Answer
There are still a couple of rogue non-PEP 8-compliant variable names: `serviceAll` and `hostsAll`.
This is a minor detail, but to avoid too much nesting I would suggest inverting this condition: `if response.status_code == 200:`. Then you can write it like this:

```python
if response.status_code != 200:
    continue  # or raise an exception
html = BeautifulSoup(response.text, "html.parser")
```

IMO, such code is much easier to read. These kinds of checks are also called guards (https://en.wikipedia.org/wiki/Guard_(computer_science)).
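To see the difference outside the scraping context, here is a minimal self-contained sketch (dummy dict responses standing in for real `requests` objects) comparing the nested and guarded versions of the same check:

```python
def process_nested(response):
    # Nested style: the happy path sits one indentation level deeper.
    if response["status"] == 200:
        return response["body"].upper()
    return None


def process_guarded(response):
    # Guard style: reject the bad case early and keep the happy path flat.
    if response["status"] != 200:
        return None
    return response["body"].upper()


ok = {"status": 200, "body": "services up"}
bad = {"status": 404, "body": ""}
print(process_nested(ok), process_guarded(ok))    # both return "SERVICES UP"
print(process_nested(bad), process_guarded(bad))  # both return None
```

Both functions behave identically; the guard version just reads top-to-bottom without indentation for the common case.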
Instead of iterating through all the `td` tags, I would store them in a list and then extract the necessary elements with an index:

```python
td_elements = list(html.select('td'))
hosts_all = td_elements[3].text.split('\n')
service_all = td_elements[12].text.split('\n')
```
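The effect on the extraction logic can be seen with a plain list standing in for the `<td>` texts (hypothetical values here; in the real script the indices come from the Nagios page layout):

```python
cells = ['nav', 'header', 'summary', 'host table', 'footer']  # stand-ins for td texts

# Original approach: loop over everything just to reach one index.
for i, cell in enumerate(cells):
    if i == 3:
        by_loop = cell

# Suggested approach: index the list directly.
by_index = cells[3]

print(by_loop == by_index)  # True: same element, no loop needed
```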
Next, I would like to focus on the `print_stats` function. It takes way too many parameters and has become tough to work with. I suggest storing all the variables you extract from the HTML in a dictionary, which you can then pass to the `print_stats` function.
```python
extracted_information = {
    'hosts_up': hosts_all[12],
    'hosts_down': hosts_all[13],
    'hosts_unreachable': hosts_all[14],
    'hosts_pending': hosts_all[15],
    'hosts_problems': hosts_all[24],
    'hosts_types': hosts_all[25],
    'service_ok': service_all[13],
    'service_warning': service_all[14],
    'service_unknown': service_all[15],
    'service_critical': service_all[16],
    'service_problems': service_all[26],
    'service_types': service_all[27],
}
```
Then you would call the `print_stats` function like this: `print_stats(user, url, extracted_information)`.
Of course, we now have to rewrite the `print_stats` function itself. The Python `format` method can also take named parameters. For example, `"{param1} and {param2}".format(param1="a", param2="b")` would return the string `"a and b"`. Using this, we can rewrite the template string and pass the "unpacked" `extracted_information` dictionary to the `format` method.
```python
def print_stats(user, url, extracted_information):
    template = """{user}@{url}:
Hosts
Up\tDown\tUnreachable\tPending\tProblems\tTypes
{hosts_up}\t{hosts_down}\t{hosts_unreachable}\t\t{hosts_pending}\t{hosts_problems}\t\t{hosts_types}
Services
OK\tWarning\tUnknown\tCritical\tProblems\tTypes
{service_ok}\t{service_warning}\t{service_unknown}\t{service_critical}\t\t{service_problems}\t\t{service_types}"""
    print(template.format(user=user, url=url, **extracted_information))
```
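As a quick self-contained check of the unpacking behaviour (placeholder values and a shortened template, not the full one above):

```python
extracted_information = {'hosts_up': '5', 'hosts_down': '1'}
template = "{user}@{url}: up={hosts_up} down={hosts_down}"

# ** unpacks the dictionary into keyword arguments for format().
line = template.format(user='nagiosadmin', url='http://example.com/nagios',
                       **extracted_information)
print(line)  # nagiosadmin@http://example.com/nagios: up=5 down=1
```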