Stack Overflow
2 votes
0 answers
30 views

I have a Dataproc pipeline that I use for web scraping and storing the data in GCP. The task setup is something like this: create_dataproc_cluster = DataprocCreateClusterOperator( task_id='...
0 votes
0 answers
66 views

I have code that scrapes data from a website. I used pandas.read_html() and wrote everything in Dataproc. When I run the code in (Composer &) Airflow, sometimes it runs successfully and sometimes ...
0 votes
1 answer
441 views

In the course I am taking, it is suggested to execute the following code (the idea is to get the 7th table from the Wikipedia page): data = requests.get("https://en.wikipedia.org/wiki/...
1 vote
1 answer
139 views

I am working on a personal project that uses the chessdotcom Public API package. I am currently able to store in a variable the PGN (Portable Game Notation) from the daily puzzle, which ...
0 votes
0 answers
135 views

I'm having an issue with the read_html function from pandas. I'm trying to read a data table on a web page that is built with <div> instead of <td> and <tr>. I'm trying to do it with ...
0 votes
1 answer
89 views

Visual Studio Code not reading html5lib. I am using bs4 in VS Code, along with html5lib, but VS Code indicates that it does not exist (I installed it using the command prompt). import requests ...
0 votes
2 answers
461 views

I'm getting a value error when trying to parse a page with BeautifulSoup and html5lib in Jupyter: import pandas as pd import requests import html5lib url = "https://worldpopulationreview.com/...
0 votes
2 answers
48 views

I am learning BS4. I parsed some divs by class, but I want to get the data inside the div code. ` [<div class="handlebarData theme_is_whitehot" data-enrollment='{"available":{"id":...
8 votes
3 answers
1k views

Suggestions please, thanks :) pip list --outdated --format=freeze gives the following error: ERROR: Exception: Traceback (most recent call last): File "/usr/lib/python3/dist-packages/pip/...
1 vote
1 answer
507 views

I want to use Python to parse HTML markup, and given one of the resultant DOM tree elements, get the start and end offsets of that element within the original, unmodified markup. For example, given ...
0 votes
0 answers
182 views

I'm using Selenium for functional testing of a Django application and thought I'd try html5lib as a way of validating the HTML output. One of the validations is that the page starts with a <!...
0 votes
2 answers
274 views

I want to replace the <h1> tag of an HTML page. But the content of the heading can be HTML (not just a string). I want to insert foo <b>bold</b> bar input: start <h1 class="...
0 votes
1 answer
178 views

How to replace the innerHTML of all tags with html5lib? input: foo <h1>Moonlight</h1> bar Desired output: foo <h1>Sunshine</h1> bar I would like to use html5lib, since it is ...
0 votes
1 answer
313 views

This is my first project with pandas and Selenium, so I may be making a dumb mistake. I've written this function to go through a list of NBA players and scrape their game logs into data frames. It all ...
0 votes
1 answer
64 views

I'm getting this error. Is it a bug or is it a code error? What does it mean? Traceback (most recent call last): File "isc.py", line 8, in <module> import requests, os, sys, bs4 ...
