Environment: Python 3.4.2 on OS X Mavericks

Hi,

I wanted to try a web scraping example with Python.

So I created a script file named 'html.py' in my project directory.

But when I run it with python3, it fails with the following error.

------------------------------------------- Error Msg -----------------------------------------------

Traceback (most recent call last):
  File "html.py", line 1, in <module>
    from bs4 import BeautifulSoup
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/bs4/builder/__init__.py", line 4, in <module>
    from bs4.element import (
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/bs4/element.py", line 5, in <module>
    from bs4.dammit import EntitySubstitution
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/bs4/dammit.py", line 11, in <module>
    from html.entities import codepoint2name
  File "/Users/tester/Project/Python/html.py", line 1, in <module>
    from bs4 import BeautifulSoup
ImportError: cannot import name 'BeautifulSoup'

However, I installed BeautifulSoup4 with 'sudo pip3 install BeautifulSoup4'.

I also checked that it is installed in the right path.

The strange thing is that when I import BeautifulSoup in the python3 interactive shell from a different directory (i.e. not my project directory), there are no errors.

The error appears only when I run the script in the directory where 'html.py' exists.

So why is this happening?

The error also disappears when I rename the script file (html.py -> test_html.py).

What is wrong with the file name?

Am I not allowed to name my script files after modules?

asked Nov 17, 2014 at 5:59

1 Answer

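html is the name of a standard library module.

Your traceback shows that bs4/dammit.py runs 'from html.entities import codepoint2name', so BeautifulSoup imports this html module. However, Python will (by default) first look for modules in the directory of the script file you are running. So it finds and imports your html.py instead of the standard module. Your html.py then runs 'from bs4 import BeautifulSoup' a second time, while bs4 is still only partly initialized and has not defined BeautifulSoup yet, which is why the import fails with that ImportError.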

From the docs on sys.path:

As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter

You could change the name of your script (good idea) or modify sys.path to change the order of where Python will look for modules (bad idea).
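To make the lookup order concrete, here is a minimal sketch that simulates the situation without touching a real project. It assumes a throwaway temporary directory standing in for the project directory, and the helper name resolve_module is made up for illustration. It asks importlib's PathFinder which file 'import html' would resolve to, once with the script directory placed first (as happens when you run 'python3 html.py') and once with the normal path.

```python
import importlib.machinery
import os
import sys
import tempfile


def resolve_module(name, search_path):
    """Return the file that `import name` would load for this search path, or None."""
    # PathFinder searches only the given entries and ignores sys.modules,
    # so an already-imported html module does not affect the result.
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec.origin if spec else None


with tempfile.TemporaryDirectory() as project_dir:
    # Hypothetical project directory containing a script named html.py.
    script = os.path.join(project_dir, "html.py")
    with open(script, "w") as f:
        f.write("from bs4 import BeautifulSoup\n")

    # Running a script puts its directory at the front of sys.path.
    simulated_path = [project_dir] + sys.path

    shadowed = resolve_module("html", simulated_path)
    normal = resolve_module("html", sys.path)

print("with html.py in the script directory:", shadowed)
print("without it:", normal)
```

With the script directory first, the lookup should resolve to the local html.py; without it, it should resolve to the standard library's html package. Renaming the script removes the collision, which matches what you saw with test_html.py.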

answered Nov 17, 2014 at 6:16

1 Comment

Naming user modules after stdlib modules (such as random.py) is a common beginner problem. The docs have a module index one can check. Python 2.7.8 had 'htmllib', but 3.x has 'html'.
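As a rough alternative to looking the name up in the module index by hand, the check can be sketched in a few lines. This assumes Python 3.10 or later, where sys.stdlib_module_names exists; it is not available on the 3.4.2 from the question, in which case the sketch below simply reports no collisions. The function name shadows_stdlib is made up for illustration.

```python
import os
import sys


def shadows_stdlib(filename):
    """Return True if a script with this file name would shadow a stdlib module."""
    module_name = os.path.splitext(os.path.basename(filename))[0]
    # sys.stdlib_module_names was added in Python 3.10; fall back to an
    # empty set on older interpreters (assumption: no check is possible there).
    stdlib_names = getattr(sys, "stdlib_module_names", frozenset())
    return module_name in stdlib_names


for candidate in ("html.py", "random.py", "test_html.py"):
    print(candidate, shadows_stdlib(candidate))
```

On a recent interpreter this should flag html.py and random.py but not test_html.py.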
