How to parse an HTML table with Python and BeautifulSoup and write it to CSV

Asked 12 years, 9 months ago

Viewed 24k times

I am trying to parse an HTML page, fetch the currency values, and write them to a CSV file. I have the following code:

#!/usr/bin/env python
import urllib2
from BeautifulSoup import BeautifulSoup

contenturl = "http://www.bank.gov.ua/control/en/curmetal/detail/currency?period=daily"
soup = BeautifulSoup(urllib2.urlopen(contenturl).read())
table = soup.find('div', attrs={'class': 'content'})
rows = table.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    for td in cols:
        text = td.find(text=True) + ';'
        print text,
    print

The problem is that I do not know how to retrieve only the values for each currency. I tried a regexp such as '^[0-9]{3}' (match text that starts with 3 digits), but it doesn't work.

edited Mar 6, 2013 at 14:50

Martijn Pieters

asked Mar 6, 2013 at 14:50

user2140323

Any reason you are using BeautifulSoup 3 instead of 4? Not that it matters much for your problem, but bs4 offers much better functionality in places.

– Martijn Pieters, commented Mar 6, 2013 at 14:52

Are you trying to get just the values of the "official exchange rates" column?

– jurgenreza, commented Mar 6, 2013 at 15:02

1 Answer

You'd be much better off picking out specific cells in the table. The td cells with the cell_c class contain the data you are interested in, and the last cell in each such row is always the currency exchange rate:

rows = table.findAll('tr')
for tr in rows:
    cols = tr.findAll('td')
    # skip rows without td cells (e.g. header rows), then test the class
    if cols and 'cell_c' in cols[0].get('class', ''):
        # currency row
        digital_code, letter_code, units, name, rate = [c.text for c in cols]
        print digital_code, letter_code, units, name, rate

With the data in separate variables, you can now turn the text to decimal numbers, store them in a database, whatever.
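For the "write to CSV" part of the question, here is a minimal sketch of what that next step might look like. It is written for Python 3 and uses only the standard library, so it starts from the text tuples the loop above produces rather than from the live page. The function name write_rates_csv, the column headers, and the sample values are assumptions made for illustration; the ';' delimiter mirrors the separator used in the question, and '.' is assumed to be the decimal separator in the rate text.

```python
import csv
import io
from decimal import Decimal, InvalidOperation


def write_rates_csv(rows, fileobj):
    """Write (digital_code, letter_code, units, name, rate) text tuples as CSV.

    Rows whose units or rate are not numeric are skipped.
    Returns the number of data rows written.
    """
    writer = csv.writer(fileobj, delimiter=';')
    writer.writerow(['digital_code', 'letter_code', 'units', 'name', 'rate'])
    written = 0
    for digital_code, letter_code, units, name, rate in rows:
        try:
            # Assumes '.' is the decimal separator in the scraped text.
            rate_value = Decimal(rate.strip())
            units_value = int(units.strip())
        except (InvalidOperation, ValueError):
            continue  # not a usable currency row
        writer.writerow([digital_code.strip(), letter_code.strip(),
                         units_value, name.strip(), rate_value])
        written += 1
    return written


# Hypothetical sample values standing in for the scraped cell text.
sample_rows = [
    ('036', 'AUD', '100', 'Australian Dollar', '817.2300'),
    ('978', 'EUR', '100', 'Euro', ' 1041.5000 '),
    ('', '', '', 'not a currency row', 'n/a'),
]

buffer = io.StringIO()
count = write_rates_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, pass a file opened with open('rates.csv', 'w', newline='') (the file name is only an example). Using Decimal rather than float keeps the rates exactly as published.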

edited Mar 6, 2013 at 15:15

answered Mar 6, 2013 at 14:59

Martijn Pieters
