Commit 36426bd

Merge pull request avinashkranjan#2146 from Mihan786Chistie/askUbuntu
added Ask ubuntu scraper
2 parents dd27396 + 6b327c1 commit 36426bd

3 files changed: +95 -0 lines changed


‎AskUbuntu-Scraper/README.md‎

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
## Ask Ubuntu

### Scrape questions, views, votes, answer counts, and descriptions from the Ask Ubuntu website for a given topic

Create an instance of the `AskUbuntu` class.

```python
questions = AskUbuntu("topic")
```

| Methods           | Details                                                                                          |
| ----------------- | ------------------------------------------------------------------------------------------------ |
| `.getQuestions()` | Returns a JSON string with the title, views, votes, answer count, and description of each question |


**Example**

```python
que = AskUbuntu("github")
scrape = que.getQuestions()

```
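Because `.getQuestions()` returns a JSON string rather than a Python object, callers have to decode it before use. The following is a minimal sketch of that step, not part of the commit: `sample` is a hypothetical payload in the documented shape with invented values, and `answered_titles` is an illustrative helper name. The counts are kept as strings because the scraper stores the page text as-is.

```python
import json

# Hypothetical payload in the documented shape; the values are invented.
sample = json.dumps(
    {
        "questions": [
            {
                "question": "How do I install git?",
                "views": "1k",
                "vote_count": "3",
                "answer_count": "2",
                "description": "I am new to Ubuntu.",
            },
            {
                "question": "Why does git push hang?",
                "views": "12",
                "vote_count": "0",
                "answer_count": "0",
                "description": "It never finishes.",
            },
        ]
    }
)


def answered_titles(json_text):
    """Return the titles of questions whose answer count is not "0"."""
    data = json.loads(json_text)
    # The error payload has only a "message" key, so default to an empty list.
    return [
        q["question"]
        for q in data.get("questions", [])
        if q["answer_count"] != "0"
    ]


print(answered_titles(sample))
```

With a live instance, the same helper would be called as `answered_titles(que.getQuestions())`.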

‎AskUbuntu-Scraper/questions.py‎

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
from bs4 import BeautifulSoup
import requests
import json


class AskUbuntu:
    """
    Create an instance of `AskUbuntu` class.

    ```python
    questions = AskUbuntu("topic")
    ```
    """

    def __init__(self, topic):
        self.topic = topic

    def getQuestions(self):
        """
        Class - `AskUbuntu`
        Example:
        ```
        que = AskUbuntu("github")
        scrape = que.getQuestions()
        ```
        Returns:
        A JSON string whose "questions" list holds one entry per question:
        {
            "question": question title
            "views": view count of question
            "vote_count": vote count of question
            "answer_count": no. of answers to the question
            "description": description of the question
        }
        """
        url = "https://askubuntu.com/questions/tagged/" + self.topic
        try:
            res = requests.get(url)
            soup = BeautifulSoup(res.text, "html.parser")

            questions_data = {"questions": []}

            # Each question on the tag page is rendered as one post summary.
            questions = soup.select(".s-post-summary")
            for que in questions:
                title = que.select_one(".s-link").getText()
                # The stats appear in the order: votes, answers, views.
                stats = que.select(".s-post-summary--stats-item-number")
                vote = stats[0].getText()
                ans = stats[1].getText()
                views = stats[2].getText()
                # Drop non-ASCII characters from the excerpt.
                desc = (
                    que.select_one(".s-post-summary--content-excerpt")
                    .getText()
                    .strip()
                    .encode("ascii", "ignore")
                    .decode()
                )
                questions_data["questions"].append(
                    {
                        "question": title,
                        "views": views,
                        "vote_count": vote,
                        "answer_count": ans,
                        "description": desc,
                    }
                )
            json_data = json.dumps(questions_data)
            return json_data
        except (requests.RequestException, AttributeError, IndexError):
            # Raised on a network failure, or when an expected element or
            # stat is missing from the page (select_one returns None).
            error_message = {"message": "No questions related to the topic found"}
            ejson = json.dumps(error_message)
            return ejson
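The request URL in `getQuestions` is built by plain string concatenation, so a topic containing reserved characters (for example `c++`) reaches the path unescaped. The sketch below shows one way to percent-encode the topic first, using only the standard library. `BASE_URL` and `build_tag_url` are hypothetical names that do not appear in the commit, and it is an assumption here that the site resolves percent-encoded tag names.

```python
from urllib.parse import quote

# Same prefix the scraper uses for tag pages.
BASE_URL = "https://askubuntu.com/questions/tagged/"


def build_tag_url(topic):
    """Return the tag-page URL for a topic, percent-encoding reserved characters."""
    # safe="" also escapes "/", so the topic cannot add extra path segments.
    return BASE_URL + quote(topic.strip(), safe="")


print(build_tag_url("github"))
print(build_tag_url("c++"))
```

Alphanumeric topics such as `github` pass through unchanged, so this would not alter the behaviour shown in the README example.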

‎AskUbuntu-Scraper/requirements.txt‎

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
beautifulsoup4
requests

0 commit comments
