I tried to scrape the table data from a binary signals website. The data updates periodically and I want to get the new values as they arrive. The problem is that when I scrape the page, the cells come back empty. The data is inside a table tag.
I'm not sure if the page uses something other than HTML, because it updates without reloading. I had to send a browser user agent to get past the security.
When I run it, it returns the correct data, but I have noticed that the signal id increments by 1.
<table class="ui stripe hover dt-center table" id="isosignal-table" style="width:100%"><thead><tr><th></th><th class="no-sort">Current Price</th><th class="no-sort">Direction</th><th class="no-sort">Asset</th><th class="no-sort">Strike Price</th><th class="no-sort">Expiry Time</th></tr></thead><tbody><tr :class="[ signal.direction.toLowerCase() == 'call' ? 'call' : 'put' ]" :id="'signal-' + signal.id" :key="signal.id" ref="signals" v-for="signal in signals"><td style="display: none;" v-text="signal.id"></td><td v-text="signal.current_price"></td><td v-html="showDirection(signal.direction)"></td><td v-text="signal.asset"></td><td v-text="signal.strike_price"></td><td v-text="parseTime(signal.expiry)"></td></tr></tbody></table>
table = soup.table
print(table)
But when I run the whole code it returns this: [] ['', '', '', '', '', '']
from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
url = "https://signals.investingstockonline.com/free-binary-signal-page"
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
page = urlopen(req)
data = page.read()
soup = BeautifulSoup(data, 'html.parser')
table = soup.table
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    if len(row) < 1:
        pass
    print(row)
I thought it would display the whole table but it just displayed empty strings. What could be the problem?
1 Answer
The HTML you've provided has no text content in its cells, so BeautifulSoup is correctly returning empty strings. The v-for and v-text attributes are Vue.js template bindings: on the live website, the text that appears in the table is inserted dynamically by JavaScript, which fetches the data from a server via AJAX after the page loads. In other words, if you perform a plain request, you'll get the skeleton (HTML) but no meat (live data).
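To make the point concrete, here is a minimal sketch that parses an abridged copy of the template row from the question and prints each cell's binding next to its text. It uses the standard library's html.parser instead of BeautifulSoup only so that it runs without extra installs; the CellCollector class and TEMPLATE_ROW name are made up for illustration.

```python
from html.parser import HTMLParser

# Template row abridged from the table in the question (three of the six cells).
TEMPLATE_ROW = (
    '<tr v-for="signal in signals">'
    '<td v-text="signal.current_price"></td>'
    '<td v-text="signal.asset"></td>'
    '<td v-text="signal.strike_price"></td>'
    '</tr>'
)


class CellCollector(HTMLParser):
    """Collect each <td>'s v-text binding and whatever text it contains."""

    def __init__(self):
        super().__init__()
        self.cells = []       # one [binding, text] pair per <td>
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append([dict(attrs).get("v-text"), ""])

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1][1] += data


collector = CellCollector()
collector.feed(TEMPLATE_ROW)
collector.close()

for binding, text in collector.cells:
    # The binding names the field, but the text is empty until JS fills it in.
    print(f"binding={binding!r} text={text!r}")
```

Every cell reports a binding such as signal.asset and an empty text, which is exactly what the scraper in the question sees.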
You can use something like Selenium to extract this information as follows:
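from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4 takes options=; the older chrome_options= keyword was removed.
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://signals.investingstockonline.com/free-binary-signal-page")
# find_elements(By.TAG_NAME, ...) replaces the removed find_elements_by_tag_name.
for tr in driver.find_elements(By.TAG_NAME, "tr"):
    for td in tr.find_elements(By.TAG_NAME, "td"):
        print(td.get_attribute("innerText"))
driver.quit()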
Output (truncated):
EURJPY
126.044
22:00:00
1.50318
EURCAD
1.50332
22:00:00
1.12595
EURUSD
1.12604
22:00:00
0.86732
EURGBP
0.86743
22:00:00
1.29825
GBPUSD
1.29841
22:00:00
145.320
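Since the goal in the question is to pick up rows as they update, one possible follow-up is to group the flat cell values back into rows and keep only the signals whose id has not been seen before. The sketch below assumes every body row yields six cells in the template's order (id, current price, direction, asset, strike price, expiry); the function names and the sample values are invented for illustration and are not taken from the site.

```python
# Column order taken from the <td> bindings in the question's template row.
COLUMNS = ["id", "current_price", "direction", "asset", "strike_price", "expiry"]


def rows_from_cells(cells, columns=COLUMNS):
    """Group a flat list of cell strings into one dict per table row."""
    width = len(columns)
    usable = len(cells) - len(cells) % width  # ignore a trailing partial row
    return [
        dict(zip(columns, cells[i:i + width]))
        for i in range(0, usable, width)
    ]


def new_signals(rows, seen_ids):
    """Return rows whose id is not in seen_ids, then record those ids."""
    fresh = [row for row in rows if row["id"] not in seen_ids]
    seen_ids.update(row["id"] for row in fresh)
    return fresh


# Made-up sample cells standing in for two scraped rows.
sample_cells = [
    "101", "1.10", "CALL", "AAABBB", "1.11", "22:00:00",
    "102", "2.20", "PUT", "CCCDDD", "2.21", "22:05:00",
]

seen = set()
first_pass = new_signals(rows_from_cells(sample_cells), seen)
second_pass = new_signals(rows_from_cells(sample_cells), seen)
print(len(first_pass), len(second_pass))
```

In a polling loop you would collect the td values on each pass, feed them through rows_from_cells, and act only on what new_signals returns, so unchanged rows are not reported twice.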