Table data returning empty values after web scraping

Question

I tried to scrape the table data from a binary signals website. The data updates periodically, and I want to capture it as it updates. The problem is that when I scrape the page, the table cells come back empty. The table uses a regular table tag.

I'm not sure whether it uses something other than HTML, because it updates without reloading. I had to send a browser user agent to get past the site's security.

When I print just the table, it returns the correct markup, though I have noticed that the signal id increments by 1:

table = soup.table
print(table)

<table class="ui stripe hover dt-center table" id="isosignal-table" style="width:100%">
  <thead>
    <tr>
      <th></th>
      <th class="no-sort">Current Price</th>
      <th class="no-sort">Direction</th>
      <th class="no-sort">Asset</th>
      <th class="no-sort">Strike Price</th>
      <th class="no-sort">Expiry Time</th>
    </tr>
  </thead>
  <tbody>
    <tr :class="[ signal.direction.toLowerCase() == 'call' ? 'call' : 'put' ]" :id="'signal-' + signal.id" :key="signal.id" ref="signals" v-for="signal in signals">
      <td style="display: none;" v-text="signal.id"></td>
      <td v-text="signal.current_price"></td>
      <td v-html="showDirection(signal.direction)"></td>
      <td v-text="signal.asset"></td>
      <td v-text="signal.strike_price"></td>
      <td v-text="parseTime(signal.expiry)"></td>
    </tr>
  </tbody>
</table>

But when I run the whole code it returns this:

[]
['', '', '', '', '', '']

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen

url = "https://signals.investingstockonline.com/free-binary-signal-page"
# A browser-like User-Agent is sent to get past the site's security check.
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
page = urlopen(req)
data = page.read()

soup = BeautifulSoup(data, 'html.parser')
table = soup.table
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    if len(row) < 1:
        pass  # no-op: the header row (no <td> cells) is still printed as []
    print(row)

I thought it would display the whole table but it just displayed empty strings. What could be the problem?

Answer

The HTML you've provided contains no text in the table cells, so your scraper is correctly returning empty strings. On the live website, the text in the table is inserted dynamically by JavaScript, which fetches the data from a server via AJAX. The v-for and v-text attributes in your markup are Vue template directives, which are only filled in once that script runs in a browser. In other words, a plain HTTP request gets you the skeleton (HTML) but not the meat (live data).
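To make the point concrete, here is a minimal sketch that parses a trimmed copy of the question's markup without running any JavaScript. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages; the CellCollector class and the shortened row attributes are illustrative assumptions, not part of the original code.

```python
from html.parser import HTMLParser

# Trimmed copy of the table from the question: the <tr> keeps only its
# v-for attribute, and the header cells drop their class attribute.
STATIC_HTML = """
<table id="isosignal-table">
  <thead>
    <tr><th></th><th>Current Price</th><th>Direction</th>
        <th>Asset</th><th>Strike Price</th><th>Expiry Time</th></tr>
  </thead>
  <tbody>
    <tr v-for="signal in signals">
      <td style="display: none;" v-text="signal.id"></td>
      <td v-text="signal.current_price"></td>
      <td v-html="showDirection(signal.direction)"></td>
      <td v-text="signal.asset"></td>
      <td v-text="signal.strike_price"></td>
      <td v-text="parseTime(signal.expiry)"></td>
    </tr>
  </tbody>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td>, grouped by the <tr> it belongs to."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one list of cell texts per <tr>
        self._in_td = False  # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_td = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Only text that sits inside a <td> counts as cell content.
        if self._in_td:
            self.rows[-1][-1] += data


collector = CellCollector()
collector.feed(STATIC_HTML)
for row in collector.rows:
    print(row)
```

The header row has only th cells, so it yields an empty list, and the template row yields six empty strings. That is the same shape as the output reported in the question: the static markup simply has nothing in it to extract.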

You can use something like Selenium to extract this information as follows:

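from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")  # run Chrome without a window

# Selenium 4 style: pass options=..., and locate elements with
# find_elements(By.TAG_NAME, ...) instead of find_elements_by_tag_name().
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://signals.investingstockonline.com/free-binary-signal-page")
for tr in driver.find_elements(By.TAG_NAME, "tr"):
    for td in tr.find_elements(By.TAG_NAME, "td"):
        print(td.get_attribute("innerText"))
driver.quit()  # close the browser when done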

Output (truncated):

EURJPY
126.044
22:00:00
1.50318
EURCAD
1.50332
22:00:00
1.12595
EURUSD
1.12604
22:00:00
0.86732
EURGBP
0.86743
22:00:00
1.29825
GBPUSD
1.29841
22:00:00
145.320
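Since the question asks for the data as it updates, one possible follow-up is to scrape repeatedly and keep only the rows not seen before. The sketch below assumes each row is a list of cell texts whose first cell is the signal id (the hidden first td in the question's markup). The function name new_signal_rows and the sample values are hypothetical, for illustration only.

```python
def new_signal_rows(rows, seen_ids):
    """Return the rows whose id (first cell) is new, and remember those ids."""
    fresh = []
    for row in rows:
        if not row:
            continue  # header rows have no <td> cells, so they come back empty
        signal_id = row[0]
        if signal_id and signal_id not in seen_ids:
            seen_ids.add(signal_id)
            fresh.append(row)
    return fresh


# Hypothetical cell values, not real scraped data.
seen = set()
first_poll = [[], ["101", "1.1000", "CALL", "AAA", "1.0990", "12:00:00"]]
second_poll = first_poll + [["102", "2.2000", "PUT", "BBB", "2.2010", "12:05:00"]]

print(new_signal_rows(first_poll, seen))   # only the row with id 101
print(new_signal_rows(second_poll, seen))  # only the row with id 102
```

Keeping the seen ids in a set makes each membership check cheap, and passing the set in explicitly lets the caller decide how long to remember old signals.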

Comment 1

Thank you ggorlen for your answer. I was skeptical when I saw the rows being added dynamically. Unfortunately, I'm using the Firefox version of Selenium and the code does not run as written. How does it translate?

Comment 2

Thank you @ggorlen, I just changed every chrome to option.

Comment 3

@MarkGacoka if the answer resolved the problem, it's customary to accept the solution.

Comment 4

Ok. I basically changed all the 'Chrome' to 'Firefox' and everything worked out perfectly. Thank you.
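For reference, the Chrome-to-Firefox swap described in the comments might look like the sketch below, written against the Selenium 4 API. The function name scrape_cells_with_firefox, the CSS selector built from the table id in the question, and the explicit wait for non-empty cells are assumptions added for illustration; they are not part of the accepted answer.

```python
URL = "https://signals.investingstockonline.com/free-binary-signal-page"


def scrape_cells_with_firefox(url, timeout=10):
    """Return the innerText of every table cell once live data has loaded."""
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumed selector, based on the table id shown in the question.
    cell_selector = "#isosignal-table tbody td"

    def table_has_data(drv):
        # True once at least one cell contains non-blank text.
        cells = drv.find_elements(By.CSS_SELECTOR, cell_selector)
        return any((c.get_attribute("innerText") or "").strip() for c in cells)

    options = Options()
    options.add_argument("--headless")  # run Firefox without a window

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # The rows are filled in by JavaScript, so poll until data appears.
        WebDriverWait(
            driver, timeout, ignored_exceptions=[StaleElementReferenceException]
        ).until(table_has_data)
        cells = driver.find_elements(By.CSS_SELECTOR, cell_selector)
        return [td.get_attribute("innerText") for td in cells]
    finally:
        driver.quit()  # always close the browser, even on timeout
```

Calling scrape_cells_with_firefox(URL) requires Firefox and a matching driver on the machine. If no data arrives within the timeout, WebDriverWait raises a TimeoutException rather than returning empty cells.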

The accepted answer above was posted by ggorlen on 2019-04-23 at 01:29:01Z.

