I'm new to Python and BeautifulSoup and still learning. This is probably quite simple, but I'm struggling to find an answer.

I'm trying to scrape '12' from the last line using the 'data-offset' attribute. I can navigate to the last line by searching for class="solr-page-selector-page next full", but I don't know how to get to '12' from there.

<a class="solr-page-selector-page" data-offset="12">2</a>
<a class="solr-page-selector-page" data-offset="24">3</a>
<a class="solr-page-selector-page" data-offset="36">4</a>
<a class="solr-page-selector-page" data-offset="48">5</a>
<a class="solr-page-selector-page next full" data-offset="12">Next</a>

Any help would be greatly appreciated.

Thank you

asked Jan 14, 2016 at 1:44

1 Answer

This will do the trick:

>>> soup.find(class_='solr-page-selector-page next full').get('data-offset')
'12'

Calling get() allows you to access attributes of the selected tag. You can also perform dict style lookups:

>>> soup.find(class_='solr-page-selector-page next full')['data-offset']
'12'

The two methods differ in their behaviour if the attribute does not exist for the tag. get() will return None whereas [] will raise a KeyError exception.
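To see the difference between the two lookups side by side, here is a minimal sketch. It assumes the markup from the question is held in a string (the `html` variable and the `data-missing` attribute name are made up for illustration) and that the built-in `html.parser` backend is used.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; `html` is an illustrative name.
html = """
<a class="solr-page-selector-page" data-offset="12">2</a>
<a class="solr-page-selector-page" data-offset="24">3</a>
<a class="solr-page-selector-page next full" data-offset="12">Next</a>
"""

soup = BeautifulSoup(html, "html.parser")
next_link = soup.find(class_="solr-page-selector-page next full")

# Attribute present: both styles return the value as a string.
offset = next_link.get("data-offset")
same_offset = next_link["data-offset"]

# Attribute absent ("data-missing" is hypothetical): get() returns None ...
missing = next_link.get("data-missing")

# ... while the dict-style lookup raises KeyError.
try:
    next_link["data-missing"]
    raised = False
except KeyError:
    raised = True
```

Note that find() itself returns None when nothing matches, so it may be worth checking `next_link is not None` before reading attributes from it.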

answered Jan 14, 2016 at 1:55

2 Comments

It's not a must that the variable class_ has an underscore after it, right? Was that just done to avoid using the reserved keyword class?
Yes, it must be class_ to avoid clashing with the reserved word class. Alternatively you can pass a dict in attrs: soup.find(attrs={'class':'solr-page-selector-page next full'})
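As a rough illustration of the two spellings mentioned in this comment, the sketch below runs both against the markup from the question (the `html` variable name is an assumption). It also shows a CSS-selector variant via `select_one`, which is not part of the original answer and is included only as an alternative.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; `html` is an illustrative name.
html = """
<a class="solr-page-selector-page" data-offset="48">5</a>
<a class="solr-page-selector-page next full" data-offset="12">Next</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Keyword form: the trailing underscore avoids the reserved word `class`.
by_kwarg = soup.find(class_="solr-page-selector-page next full")

# Dict form: the key is an ordinary string, so no underscore is needed.
by_attrs = soup.find(attrs={"class": "solr-page-selector-page next full"})

# Alternative not in the original answer: a CSS selector, which matches
# the three classes regardless of the order they appear in.
by_css = soup.select_one("a.solr-page-selector-page.next.full")

offsets = [tag["data-offset"] for tag in (by_kwarg, by_attrs, by_css)]
```

The string passed to class_ or attrs is compared against the whole class attribute value, so the classes must appear in the same order as in the markup; the selector form does not have that restriction.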
