Does Python allow variables to be passed into function for dynamic screen scraping?

Laura Creighton lac at openend.se
Sat Nov 28 17:28:25 EST 2015


In a message of Sat, 28 Nov 2015 14:03:10 -0800, ryguy7272 writes:
>I'm looking at this URL.
>https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names
>
>If I hit F12 I can see tags such as these:
><a title=
><a class=
>And so on and so forth. 
>
>I'm wondering if someone can share a script, or a function, that will allow me to pass in variables and download (or simply print) the results. I saw a sample online that I thought would work, and I made a few modifications but now I keep getting a message that says: ValueError: All objects passed were None
>
>Here's the script that I'm playing around with.
>
>import requests
>import pandas as pd
>from bs4 import BeautifulSoup
>
>#Get the relevant webpage set the data up for parsing
>url = "https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names"
>r = requests.get(url)
>soup=BeautifulSoup(r.content,"lxml")
>
>#set up a function to parse the "soup" for each category of information and put it in a DataFrame
>def get_match_info(soup,tag,class_name):
>    info_array=[]
>    for info in soup.find_all('%s'%tag,attrs={'class':'%s'%class_name}):
>        return pd.DataFrame(info_array)
>
>#for each category pass the above function the relevant information i.e. tag names
>tag1 = get_match_info(soup,"td","title")
>tag2 = get_match_info(soup,"td","class")
>
>#Concatenate the DataFrames to present a final table of all the above info 
>match_info = pd.concat([tag1,tag2],ignore_index=False,axis=1)
>
>print match_info
>
>I'd greatly appreciate any help with this.

Post your full error traceback. If you are getting ValueErrors about None,
then probably something you expect to return a match isn't returning one.
But without the full traceback, we cannot help much.
Laura
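
To illustrate the point about None, here is a minimal sketch that uses only the standard library, so it runs without requests, BeautifulSoup, or pandas. It rests on assumptions: `find_all_stub` and `SAMPLE_ELEMENTS` are hypothetical stand-ins for `soup.find_all` and the parsed page, and the `return` in the quoted function is assumed to sit inside the `for` loop (a `for` with no body would be a syntax error rather than a ValueError). Under that assumption the function returns None whenever nothing matches, and `pd.concat` raises "All objects passed were None" when every item in its list is None, which would be consistent with the message reported.

```python
import traceback

# Hypothetical sample data standing in for a parsed page:
# (tag, class, text) triples instead of real BeautifulSoup elements.
SAMPLE_ELEMENTS = [
    ("a", "external", "Boring, Oregon"),
    ("a", "external", "Dull, Scotland"),
    ("td", "note", "see also"),
]


def find_all_stub(tag, class_name):
    """Return the text of every sample element with this tag and class."""
    return [text for t, c, text in SAMPLE_ELEMENTS
            if t == tag and c == class_name]


def get_match_info_buggy(tag, class_name):
    # Assumed shape of the quoted function: the return is the loop body,
    # so with zero matches the loop never runs and None is returned.
    info_array = []
    for info in find_all_stub(tag, class_name):
        return info_array


def get_match_info(tag, class_name):
    # Collect every match, then return after the loop. An empty list
    # (never None) comes back when nothing matches.
    info_array = []
    for info in find_all_stub(tag, class_name):
        info_array.append(info)
    return info_array


def describe(result):
    """Raise a clear error for None so the traceback points at the cause."""
    if result is None:
        raise ValueError("lookup returned None: nothing matched")
    return "%d match(es)" % len(result)


if __name__ == "__main__":
    print(get_match_info("a", "external"))
    print(get_match_info_buggy("td", "title"))
    try:
        describe(get_match_info_buggy("td", "title"))
    except ValueError:
        # The full traceback is what a reply to this thread would need.
        traceback.print_exc()
```

The tag and class name are ordinary function arguments, so passing variables in is not the problem. Whether the real page has any `td` elements with class "title" or "class" is something only the actual run can show, which is why the full traceback matters.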


More information about the Python-list mailing list