This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Commit c43a74c

Update economictimes_scraper.py

1 parent 8bd958d commit c43a74c

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -1,5 +1,5 @@`
`1`	`1`	`from bs4 import BeautifulSoup`
`2`		`-from lxml import etree`
	`2`	`+import defusedxml`
`3`	`3`	`import requests`
`4`	`4`	`import json`
`5`	`5`	`import datetime`
`@@ -38,7 +38,7 @@ def datestr_to_date(datestr):`
`38`	`38`	`## Gets News article metadata from article url`
`39`	`39`	`def fetchNewsArticle(url):`
`40`	`40`	`html = requests.get(url).content`
`41`		`- root = etree.HTML(html)`
	`41`	`+ root = defusedxml.HTML(html)`
`42`	`42`	`x = root.xpath("/html/body//script[@type='application/ld+json']")`
`43`	`43`	`metadata = None ## When Article does not exists (404)`
`44`	`44`	`if (len(x) >= 2):`
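
Note on line 41: as far as I am aware, the `defusedxml` package does not expose a top-level `HTML` function (its wrappers are for XML parsers such as `ElementTree`, `minidom`, and `sax`), so `defusedxml.HTML(html)` would likely raise `AttributeError` at runtime. The sketch below is one possible alternative, not the repository's code: it collects the JSON-LD `<script>` blocks using only the standard library. The names `JsonLdCollector` and `extract_jsonld` are hypothetical, and unlike the original XPath it does not restrict the search to `<body>`.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False   # True while inside a JSON-LD script element
        self._buffer = []         # text chunks of the current script element
        self.blocks = []          # completed script bodies, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_jsonld(html_text):
    """Return the parsed JSON-LD objects found in html_text (a str, not bytes).

    Blocks that are not valid JSON are skipped rather than raising.
    """
    parser = JsonLdCollector()
    parser.feed(html_text)
    parser.close()
    results = []
    for block in parser.blocks:
        try:
            results.append(json.loads(block))
        except json.JSONDecodeError:
            continue
    return results
```

Under these assumptions, `fetchNewsArticle` could pass `requests.get(url).text` to `extract_jsonld` and keep the existing `len(x) >= 2` check on the returned list, which now holds already-decoded objects instead of lxml elements.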