Commit f5f44b1

Add torrent item extraction to the spider.

1 parent 4a392ff commit f5f44b1

Lines changed: 14 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -9,3 +9,17 @@ def parse(self, response):`
`9`	`9`	`         for page_url in response.css('a[title ~= page]::attr(href)').extract():`
`10`	`10`	`             page_url = response.urljoin(page_url)`
`11`	`11`	`             yield scrapy.Request(url=page_url, callback=self.parse)`
	`12`	`+`
	`13`	`+        # extract the torrent items`
	`14`	`+        for tr in response.css('table.lista2t tr.lista2'):`
	`15`	`+            tds = tr.css('td')`
	`16`	`+            link = tds[1].css('a')[0]`
	`17`	`+            yield {`
	`18`	`+                'title' : link.css('::attr(title)').extract_first(),`
	`19`	`+                'url' : response.urljoin(link.css('::attr(href)').extract_first()),`
	`20`	`+                'date' : tds[2].css('::text').extract_first(),`
	`21`	`+                'size' : tds[3].css('::text').extract_first(),`
	`22`	`+                'seeders': int(tds[4].css('::text').extract_first()),`
	`23`	`+                'leechers': int(tds[5].css('::text').extract_first()),`
	`24`	`+                'uploader': tds[7].css('::text').extract_first(),`
	`25`	`+            }`
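
Note on the added lines: `seeders` and `leechers` pass the result of `extract_first()` straight to `int()`. `extract_first()` returns `None` when the selector matches nothing, so a row with a missing or non-numeric cell would raise and abort the callback. A minimal sketch of one way to guard against that follows. `parse_count` is a hypothetical helper, not part of this commit, and the fallback value of `0` and the comma stripping are assumptions about what the spider should do with bad cells.

```python
from typing import Optional


def parse_count(text: Optional[str], default: int = 0) -> int:
    """Convert a scraped cell value to int, falling back to `default`.

    Hypothetical helper for illustration; it is not part of the commit.
    """
    if text is None:
        # extract_first() found no matching text node.
        return default
    # Drop surrounding whitespace and thousands separators such as "1,234".
    cleaned = text.strip().replace(',', '')
    try:
        return int(cleaned)
    except ValueError:
        # Empty or non-numeric cell, e.g. "" or "n/a".
        return default


# Possible use inside the loop from the diff (names as in the commit):
#   'seeders': parse_count(tds[4].css('::text').extract_first()),
#   'leechers': parse_count(tds[5].css('::text').extract_first()),
```

Returning a default keeps the spider running past a malformed row. If such rows should be dropped instead, the helper could return `None` and the loop could skip the item.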
